Wikibase

From devsummit

3 statements.

Author Tags Primary Session Secondary Sessions Position Statement
Adam Baso Machine Learning, Mobile, Schema.org, Structured Data, Templates, Wikibase, Wikidata Knowledge as a Service Advancing the Contributor Experience, Research, Analytics, and Machine Learning

Structure Most Things with Schema.org

The future of digital information will likely be brokered by major platform providers such as Google, Apple, Amazon, Microsoft, and international equivalents and social networks. We're thankful they extend our reach, even as we seek to help consumers on the platforms join our movement.

We could help platform providers, their users, and our users solve problems better through adoption of the open standard Schema.org into Wikipedia pages mapped with templates and, ideally, federated and synchronized Wikidata properties.

Benefits:

  1. Wikipedia will have even better presentation and placement in search engines and other data rich experiences.
  2. We provide an opportunity for a more consistent data model for template authors and people/bots filling template values. And the richly defined Schema.org entities provide a good target to reach on all entities represented in the Wikipedia/Wikimedia corpora. Standardization can reduce duplication of effort and inconsistencies.
  3. We introduce an easier vector for mobile contribution, which could include simpler and different data entry, mapping, and modeling.
  4. We can elevate an open standard and push its adoption forward while increasing the movement's standing in the open standards community.
  5. Schema.org compliant data is more easily amenable to machine learning models that cover data structures, the relations between entities, and the dynamics of sociotechnical systems. This could bolster practical applications like vandalism detection, coverage analysis, and much more.
  6. This might provide a means for the education sector to educate students about knowledge creation, and data modeling, and more. It might also afford scientists and other practitioners a further standardized way to model the knowledge in their fields.

What would it take? And can this be done in harmony with the existing {{Template}} system?

This session will discuss the following:

  1. Are we aligned on the benefits, and which ones?
  2. Implementation options.
    1. Can we extend templates so they could be mapped to Schema.org?
      1. Would it be okay to derive the mapping by manual and automated analysis at WMF/WMDE and apply it behind the scenes? Would that be sustainable?
      2. Could we make it easy for template authors to mark up their templates for Schema.org compatibility and have some level of enforcement? Could Schema.org attributes and entity types be autosuggested for template creators?
    2. Is it easy to relate the most existing and proposed Wikidata entity types and properties to existing Schema.org entities and properties?
    3. What would it take to streamline MCR Schema.org data structures or MCR Wikibase property clusters mapped to Schema.org on defined entity types?
    4. Furthermore, if we can do #1 and #2, what's to prevent us from letting templates as is merely be the interface for Schema.org compliant Wikibase entities and properties (e.g., by duck typing / autosynthesis)?
    5. How could we bidirectionally synchronized between Wikipedia and Wikidata with confidence in a way compatible with patroller expectations? And what storage and event processing would be needed? Can the systems be scaled in a way to accommodate arrival of real-time and increasingly fine grained information?
Thomas Pellissier Tanon Analytics, API, Collaboration, JavaScript, Lua, Mobile, Multimedia, Structured Data, User Experience, Wikibase, Wikidata Knowledge as a Service

Title: Content structuration and metadata are keys to fulfil our strategy

Content: The Wikimedia mouvement strategy is making a focus on serving more different kinds of knowledge and sharing them with allies and partners [1]. I believe that the most important ground work for reaching these goals is to focus on the outgoing project of moving MediaWiki from a "wikitext plus media file" collaboration system to a platform allowing people to be able to collaborate on many kind of contents and to organise them in a cohesive way.

Two axes seem important to me to pursue this goal: 1. Support a broader set of different contents, not just wikitext, Wikibase items and Lua/JavaScript/CSS contents but also images, sounds, movies, books, etc., that should bee editable just like wiki text pages in order to allow people to improve them in a collaborative way. 2. Build platforms and tools allowing contributors to create and clean metadata about these contents in order to build together the broadest cohesive set of knowledge ever available and increase its reusability.

Going in these directions would allow us to:

  • Allow sister projects (and possible new ones) to use relevant content structure for their projects instead of a designed-for-everything wiki markup. It should lead to an increase of their reusability and their user-friendliness, just like what the Structured Metadata on Commons project is aiming for.
  • Build powerful APIs to retrieve and edit content just like Wikidata has and so, make working with partners easier.
  • Increase the connections between our contents and their discovery using their metadata
  • Build better tasked-focused mobile viewing and edit interfaces
  • Be more ready for the possible new environment changes like voice-powered interfaces


Some examples of projects we could work on in order to move in this direction:

  • Use the multiple content revision facility to migrate progressively the data that could be structured out of Wikitext on all our projects (like the structured metadata on Commons project is aiming for files)
  • Federate all our structured content into a "Wikimedia Query Service" that would allow to do unified and powerful analytics and to ease the discoverability of all our contents
  • The logical granularity of Wikisource, Wikibooks and Wikiversity contents (and maybe other projects?) is not the wiki page but the set of wiki pages storing a book or a course. MediaWiki should be able to support such use cases by providing a "collection" system allowing us to add metadata and to do operations (renaming, add to watchlist) on sets of wiki pages.
  • Switch projects that stores fairly structured data in wiki text templates (like Wiktionary or Wikispecies) from a Wikitext storage to a structured one. Build on top of them user interfaces to edit their local contents (and maybe also the relevant data from Wikidata) and provide nice displays and APIs to make humans and machine both able to retrieve these contents.
  • ...

[1] https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction#We_will_advance_our_world_through_knowledge.

Adam Shorland Architecture, Open Source, Wikibase Growing the MediaWiki Technical Community Embracing Open Source Software

I believe that the technicaly community should strive to collect and effectively disseminate technical knowledge as per the Wikimedia missions attement.

Ability to grow out technical community can be compared with ones own ability to gain knowledge in technical spaces within the Wikimedia movement. Currently there are many barriers to entry that have been surfaced year after year with some but little movement forward past them.

To scale and ready the community we should push forward and enable the use of emerging trends in technology, such as knowledge retaining Q&A platforms. There are many other organizations and softwares that do this much better than Wikimedia we should learn from them. Looking at Q&A platforms specifically, talk pages have never really been a good place to ask questions and retain knowledge in a searchable way for use in the future. Stackoverfolw, as an example, has proven to be an invaluable resource for people in technical spaces and we can learn from that. MediaWiki is an amazing piece of software, but we should not feel 'boxed in' by it. The Wikimedia foundation is not the MediaWiki foundation, MediaWiki does not have to always just be a wiki page.

Our commitment to Open Source is often something that slows down many actions within the movement, however this is not something that should change as it is integral to what Wikimedia stands for at the core. We should embrace our Open Source commitments and reach out to and engage with organizations using our software more. Wikimedia Germany does this outreach specifically with the Wikibase extension, looking for other users and engaging them to discover how they are using it, why, and how it can be better. The Wikibase extension also specifically the Wikibase Query Service shows us that not everything has to be a wiki page, as the query service disseminates knowledge under a free licence effectively.

I hope that the summit will agree that entry to our technical space, and increasing knowledge persistence within our technical space needs some thought and work, and that we should stay committed to Mediawiki as a software and platform, but that it can look, feel and act different while Wikimedia stays true to its mission.