Session:10: Difference between revisions
No edit summary |
No edit summary |
||
Line 1: | Line 1: | ||
{{Session | {{Session | ||
|title=Knowledge as a Service | |title=Knowledge as a Service | ||
|facilitator=Statement:58 | |facilitator=Statement:58,Statement:34,Statement:4,Statement:54 | ||
}} | }} |
Revision as of 12:48, 14 December 2017
5 primary statements. 4 secondary statements.
Primary
Author | Tags | Primary Session | Secondary Sessions | Position Statement |
---|---|---|---|---|
Adam Baso | Machine Learning, Mobile, Schema.org, Structured Data, Templates, Wikibase, Wikidata | Knowledge as a Service | Advancing the Contributor Experience, Research, Analytics, and Machine Learning |
Structure Most Things with Schema.org
The future of digital information will likely be brokered by major platform providers such as Google, Apple, Amazon, Microsoft, and international equivalents and social networks. We're thankful they extend our reach, even as we seek to help consumers on the platforms join our movement. We could help platform providers, their users, and our users solve problems better through adoption of the open standard Schema.org into Wikipedia pages mapped with templates and, ideally, federated and synchronized Wikidata properties. Benefits:
What would it take? And can this be done in harmony with the existing {{Template}} system? This session will discuss the following:
|
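As a rough illustration of the mapping idea in the statement above, the following minimal sketch turns template parameters into a Schema.org JSON-LD object. The parameter names, the mapping table, and the function `infobox_to_jsonld` are hypothetical and do not come from the statement; only the Schema.org property names (`name`, `birthDate`, `birthPlace`, `jobTitle`, `sameAs`) and the Wikidata entity URI form are existing conventions.

```python
import json

# Hypothetical mapping from infobox template parameters to Schema.org
# Person properties. A real deployment would need a curated mapping per template.
PERSON_INFOBOX_TO_SCHEMA = {
    "name": "name",
    "birth_date": "birthDate",
    "birth_place": "birthPlace",
    "occupation": "jobTitle",
}


def infobox_to_jsonld(params, schema_type, mapping, wikidata_id=None):
    """Build a JSON-LD dict from template parameters (illustrative only)."""
    doc = {"@context": "https://schema.org", "@type": schema_type}
    for param, prop in mapping.items():
        value = params.get(param)
        if value:  # skip parameters that are missing or empty
            doc[prop] = value
    if wikidata_id:
        # Link the page's subject to its Wikidata item.
        doc["sameAs"] = f"http://www.wikidata.org/entity/{wikidata_id}"
    return doc


if __name__ == "__main__":
    params = {"name": "Douglas Adams", "birth_date": "1952-03-11", "occupation": ""}
    doc = infobox_to_jsonld(params, "Person", PERSON_INFOBOX_TO_SCHEMA, wikidata_id="Q42")
    print(json.dumps(doc, indent=2))
```

Keeping the mapping in a plain table, separate from the conversion function, is one way such a scheme could coexist with existing templates: the templates stay unchanged and only the table needs maintenance.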
Marius Hoch | Third Parties, Wikidata | Knowledge as a Service |
Making use of Wikidata's knowledge on Wikimedia projects and beyond
The way our content is consumed will foreseeably change a lot in the next years, as both the demographics of the internet and the devices used by our readers are changing. We should adapt to this by offering our content in ways best suited to many user scenarios. To achieve this, we need to modularize and structure our content so that it can be easily re-interpreted and used in many different ways. Wikidata lets us cater to this trend by providing machine-readable data about any subject, which can be formatted and presented in a wide variety of ways and languages. Wikidata makes it easier to maintain up-to-date information on subjects in many languages, without the burden of manual data maintenance. We should strengthen this by improving the integration of Wikidata with other Wikimedia projects, by providing easy ways to use and profit from Wikidata, especially for small communities, and by making the power of Wikidata more visible. While all Wikimedia communities will profit from this, it can be especially worthwhile for small communities that currently don't have the resources to manage data, such as infoboxes, themselves. Example projects that will help in this area are the 'ArticlePlaceholder', which serves Wikidata data about a subject if there's no article about it, as well as the plans for automatic infoboxes derived solely from Wikidata and other means of using Wikidata data on Wikimedia projects. While both of these projects can have a big community impact, they need to fit in with the current infrastructure. They also pose new scalability and data presentation challenges that need to be addressed.
Furthermore, Wikidata's information should be easy to reuse by third-party projects, to increase visibility and to gain contributions and data donations, making Wikidata the true data hub of the internet. This goal raises longstanding issues with the current Wikimedia dump infrastructure, which is not very flexible and does not provide a machine-readable interface for data consumers. Bringing more individual editors and organizations into Wikidata also poses infrastructure and scalability issues, both in coping with the sheer amount of data and changes and in providing convenient tools for establishing and maintaining data quality. | |
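To make the idea of presenting the same machine-readable data in many languages concrete, here is a minimal sketch of a placeholder-style renderer with language fallback. It assumes entity data shaped like the JSON returned by the Wikidata `wbgetentities` API (`labels` and `descriptions` keyed by language code); the functions `pick_term` and `render_placeholder` and the sample descriptions are hypothetical, and this is not how the ArticlePlaceholder extension is actually implemented.

```python
def pick_term(terms, preferred_languages):
    """Return the first available term value along a language fallback chain."""
    for lang in preferred_languages:
        term = terms.get(lang)
        if term:
            return term["value"]
    return None


def render_placeholder(entity, preferred_languages):
    """Render a one-line summary of an entity for readers of a given language."""
    label = pick_term(entity.get("labels", {}), preferred_languages)
    description = pick_term(entity.get("descriptions", {}), preferred_languages)
    title = label or entity["id"]  # fall back to the item ID if no label exists
    return f"{title}: {description}" if description else title


# Trimmed, illustrative sample in the shape of a wbgetentities result.
# The description strings are made up for this example.
SAMPLE_ENTITY = {
    "id": "Q183",
    "labels": {
        "en": {"language": "en", "value": "Germany"},
        "de": {"language": "de", "value": "Deutschland"},
    },
    "descriptions": {
        "en": {"language": "en", "value": "country in Central Europe"},
        "de": {"language": "de", "value": "Staat in Mitteleuropa"},
    },
}

if __name__ == "__main__":
    # A small wiki without local content could ask for its own language first
    # and fall back to a related or widely known one.
    print(render_placeholder(SAMPLE_ENTITY, ["sw", "de", "en"]))
```

The fallback chain is the part that matters for small communities: a wiki can still show something useful when no label exists in its own language.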
Katherine Maher | Alternative Interfaces, Architecture, Knowledge as a Service, Strategy, User Experience | Knowledge as a Service | Advancing the Contributor Experience |
This proposal focuses on the "Knowledge as a service" part of the strategic direction. When I look at the core of what we do, to some extent I see a model that we've mastered, and that we're making incremental improvements to. My concern is that, while that model is incredible and powerful as a community, the model for the interface and the delivery mechanism for the product the community creates are changing, and for us to continue what we're doing today may or may not prepare us for what the future actually looks like. I think it also limits our ability to unlock all of the tremendous knowledge, unstructured and structured, that exists within our projects. And I also believe that it limits us to certain forms of knowledge and a certain hierarchy of creation in a way that is very inward-looking. Right now much of our information is sitting, unstructured, in a SQL database, rendered through PHP, read through a rendering engine into a browser to read/write in one interface: the browser. While this is amazing for the world of the browser, we're not going to be a browser-based information world for that much longer, any more than anything else. It's not that the browser is going to go away; the browser will be like books: books haven't gone away, radio hasn't gone away, but there will be a transformation to a new interface, and we need to be ready for it. Perhaps we should actually backfill into those older interfaces that we're not currently part of, because people still use those interfaces, and those interfaces are valuable. Essentially this is about taking the Model-View-Controller paradigm to the next level, and also about extending it to participation and to the "write" part of our read-write system. Even if Alexa is serving Wikimedia content outside the browser, there is no mechanism for contributing through Alexa. We need to be planning for an architecture of information and architecture of experiences that is independent of the browser.
How do you get the most value out of the existing content? How do you serve a snippet to someone who just needs a quick answer? How do you serve different layers of sophistication to 8th-graders versus the college graduate, versus the PhD? Can we engage in the knowledge ecosystem and leverage what we have as a platform, and our traffic distribution and awareness, to actually open up greater resources of knowledge? These are some of the topics I would like to see discussed at the Dev Summit. |
Thomas Pellissier Tanon | Analytics, API, Collaboration, JavaScript, Lua, Mobile, Multimedia, Structured Data, User Experience, Wikibase, Wikidata | Knowledge as a Service |
Title: Content structuration and metadata are keys to fulfil our strategy
Content: The Wikimedia movement strategy focuses on serving more kinds of knowledge and sharing them with allies and partners [1]. I believe that the most important groundwork for reaching these goals is to focus on the ongoing project of moving MediaWiki from a "wikitext plus media file" collaboration system to a platform that allows people to collaborate on many kinds of content and to organise them in a cohesive way. Two axes seem important to me in pursuing this goal:
1. Support a broader set of content types: not just wikitext, Wikibase items and Lua/JavaScript/CSS content but also images, sounds, movies, books, etc., which should be editable just like wikitext pages so that people can improve them collaboratively.
2. Build platforms and tools that allow contributors to create and clean metadata about this content, in order to build together the broadest cohesive set of knowledge ever available and increase its reusability.
Going in these directions would allow us to:
| |
Leila Zia | Infrastructure, Knowledge as a Service, Knowledge Equity, Languages, Oral Knowledge, Research, Strategy, Trust | Knowledge as a Service | Research, Analytics, and Machine Learning |
Title: Knowledge is our direction. What's next?
Combined knowledge as a service (KAS) and knowledge equity (KE) is identified as our strategic direction (draft). We have decided to focus on knowledge in a broader sense and beyond just encyclopedic knowledge, create KE, and become the infrastructure that offers KAS. In this position paper, I offer some of my early thoughts on where we should focus our efforts to move in this strategic direction. Given the limits of word-count, I will not go through the details of research methods and techniques that can be used to address each point.
Knowledge
As the central focus of the strategic direction is knowledge, we need to arrive at a unified working definition of knowledge. English Wikipedia defines knowledge as familiarity, awareness, or understanding of someone or something which is acquired through experience or education, by perceiving, discovering, or learning. This definition, however, is not a working definition that can help us decide what new content to include. Research on user behavior, needs, and learning patterns can help us define knowledge.
Knowledge equity
Our goal is to remove structural inequalities that limit our ability to represent knowledge from all people and by all people. To this end, we need to meet our users where they are. Today:
Knowledge as a service
Our goal is to offer KAS: both in terms of the infrastructure that supports it as well as the content of it. To do this, we need to:
|
Secondary
Author | Tags | Primary Session | Secondary Sessions | Position Statement |
---|---|---|---|---|
Guillaume Paumier | Alternative Interfaces, Knowledge as a Service, Knowledge Equity, New Users | Advancing the Contributor Experience | Knowledge as a Service |
The strategic direction that has emerged has two components: "Knowledge as a service" and "Knowledge equity". "Knowledge as a service", which focuses on infrastructure, seems like the one most related to technology. This proposal is about exploring the less obvious intricacies between the two components, and in particular the technology implications of Knowledge Equity. Because Wikimedia is a complex socio-technical system, it's not really possible to separate people from technology when talking about it. A direction of Knowledge Equity invites the contributors of the Wikimedia movement to take a critical look at themselves and assess their biases and privileges. This, in turn, can help identify structural biases that have been reproduced and ingrained in our technical platform. For example, MediaWiki is currently doing a great job at providing a localized interface in many languages. However, beyond language, interaction design and UX patterns seem very specific to Western culture. Similarly, when our strategic direction talks about building strong and diverse communities, this invites us to consider whether the current tools available to contributors enable them to provide an environment where newcomers can experiment, be mentored, and fail safely. Beyond software, little effort has been invested in exploring alternative interfaces beyond the connected browser. Our primary interface for contribution (the web site) may work well for middle-class contributors from Europe and North America, but isn't necessarily what enables people from other backgrounds or geographies to contribute. These are some of the topics I would like to bring up for discussion at the Developer Summit. |
Subramanya Sastry | API, Knowledge as a Service, Schema.org, Structural Semantics, Templates, Wikidata | Advancing the Contributor Experience | Knowledge as a Service |
PROBLEM
To satisfy the 'Knowledge as a service' theme, in addition to providing access to full page content, Wikimedia APIs should provide access to semantic units at:
- an abstract document level (sections, headings, tables, etc.)
- a domain-specific level (infoboxes, geolocation, taxoboxes, etc.)
Wikitext, the core content creation technology on wikis, evolved as a string-processing language where one set of strings is replaced with another set of strings mostly via regular expression matches to yield the output HTML string. There is no notion of document structures here. This lack of structural semantics gets in the way of being able to robustly identify semantic units and developing tools and features that operate on a page structurally at sub-page granularities.
SOLUTION: TRANSPARENT TYPING LAYER OVER WIKITEXT
Types improve abstraction, reasoning, and tooling abilities in programming languages. A transparent typing layer on top of wikitext can provide similar benefits.
A: ENFORCE STRUCTURAL TYPES ON OUTPUT OF WIKITEXT CONSTRUCTS INCLUDING TEMPLATES AND EXTENSIONS
- Specify that all wikitext constructs have an output with type: String, DOM, CSS property, HTML-attribute, or a List of one of those
- Extensions and templates can specify the expected output type. All other core wikitext constructs have the DOM output type.
- Parser enforces the output type of all wikitext constructs. Examples: For DOM types, unclosed tags and misnested tags are fixed up. For String types, HTML tags are escaped, wikitext strings are nowikied. For CSS types, values are sanitized.
Among other benefits, this basic typing mechanism enables MediaWiki to provide an API to extract and edit document fragments without introducing adverse side-effects on the rest of the page.
B: UNIFIED TYPING MECHANISM TO EXPOSE DOMAIN-SPECIFIC SEMANTIC INFORMATION
Editors impose structure in documents through a rich library of templates, policies, and maintenance processes they have developed over the years. If this semantic information (infoboxes, navboxes, sports rankings, railway timetables, etc.) is mapped to a centralized ontology system (Wikidata, Schema.org, something else), the parser can expose this information in HTML, and MediaWiki APIs can expose it in a wiki-neutral way. There are multiple disparate mechanisms today through which template authors specify metadata about templates (TemplateData, TemplateStyles, possibly others?). Instead of creating newer mechanisms for specifying structural output types and semantic information types for templates, it is better to provide a consolidated mechanism that unifies all this template metadata into a single user-defined type declaration. This lets newer applications and capabilities be developed in the future without code changes to the core mechanism.
FEASIBILITY
This typing layer only affects template authors. Editors that use source editing won't see any impact (besides fewer markup errors). Editors that use visual editing might see improved tooling. Even for template authors, this is meant to be an opt-in mechanism with gradual migration over to the new model. The proposal here is a logical extension of what Parsoid does today. Parsoid provides an illusion of structured wikitext and demonstrates what is possible (VE, CX, Linter, Flow among others) by embracing structured semantics. |
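The enforcement step described in part A of the statement above can be sketched roughly as follows. This is a toy model under stated assumptions, not MediaWiki or Parsoid code: the function names (`enforce`, `fix_dom`, `to_string`, `to_css`), the regex-based tag balancing, and the CSS allow-list are all hypothetical simplifications. A real parser would follow the HTML5 tree-building rules for misnested tags, and the nowiki handling, HTML-attribute, and List types are not modeled here.

```python
import html
import re

VOID_TAGS = {"br", "hr", "img"}  # elements that never take a closing tag
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")
SAFE_CSS_VALUE = re.compile(r"^[a-zA-Z0-9#%.,\s-]+$")  # deliberately strict


def fix_dom(markup):
    """DOM type: close unclosed tags, repair misnesting, drop stray closers."""
    out, stack, pos = [], [], 0
    for match in TAG_RE.finditer(markup):
        out.append(markup[pos:match.start()])  # text before this tag
        pos = match.end()
        closing, tag = match.group(1) == "/", match.group(2).lower()
        if not closing:
            out.append(match.group(0))
            if tag not in VOID_TAGS and not match.group(0).endswith("/>"):
                stack.append(tag)
        elif tag in stack:
            # Close everything opened after the matching start tag as well.
            while stack:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == tag:
                    break
        # A closing tag with no matching open tag is silently dropped.
    out.append(markup[pos:])
    out.extend(f"</{tag}>" for tag in reversed(stack))
    return "".join(out)


def to_string(text):
    """String type: escape HTML so the output can only ever be plain text."""
    return html.escape(text)


def to_css(value):
    """CSS type: keep values made of safe characters only, else return ''."""
    value = value.strip()
    return value if SAFE_CSS_VALUE.match(value) else ""


def enforce(output_type, raw):
    """Coerce the raw output of a construct to its declared output type."""
    handlers = {"DOM": fix_dom, "String": to_string, "CSS": to_css}
    return handlers[output_type](raw)


if __name__ == "__main__":
    print(enforce("DOM", "<b>bold <i>both</b> italic"))
    print(enforce("String", "<script>x</script>"))
    print(enforce("CSS", "url(javascript:alert(1))") or "(rejected)")
```

The point of the sketch is the shape of the contract: once every construct's output is coerced to a declared type, a fragment can be extracted or replaced without its markup errors leaking into the rest of the page.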
Raimond Spekking | Lua, OpenStreetMap, Wikidata | Growing the MediaWiki Technical Community | Knowledge as a Service |
tl;dr: Empower more Wikipedians [1] in using Lua, incl. invoking data from Wikidata, and data from the Data namespace on Commons. Over the past 15 years it was relatively easy for every Wikipedian to create templates and, with the ParserFunctions, to write some kind of simple program code. In 2013 we got Lua as a real and powerful programming language. For a short time now it has been possible to invoke data from Wikidata, and for an even shorter time to store and read data in the Data namespace on Wikimedia Commons. These are all good improvements and increase the possibilities for adding valuable data and information to the (often) high-quality content of the German Wikipedia. But these techniques raise the technical knowledge required of Wikipedians. In other words: substantially fewer Wikipedians are able to write Lua modules than to create templates. As an active community member of the German Wikipedia and Wikimedia Commons (and Wikidata) I see a real bottleneck in the fact that we do not have enough Wikipedians with skills in Lua and in invoking data from Wikidata & Co. Checking the Module namespace via RecentChanges for September 2017 on the German Wikipedia: only 13 Wikipedians contributed to Lua code. Often I read on the Village Pump and related pages: "Sorry, but my to-do-list as volunteer is full until end of the year, I do not have the capacity to solve your problem." and so on. This is very frustrating for all sides. Other cool [3] projects, like the integration of maps via Kartographer, are stuck because the German Wikipedia does not have enough people for this work. As a result of this observation, we have to create some help to empower more Wikipedians in using Lua, incl. invoking data from Wikidata, and data from the Data namespace on Commons. In Germany, Austria, and Switzerland it is probably easier than in other countries because we can offer real-life workshops.
In 2018 the team of the "Wikipedia:Lokal K" [2], a real-life support base for Wikipedia & Co, is planning, with the support of WMDE, some workshops for Wikidata. The same could be done to support Lua & Co. As originator of the "Technical Wishlist" [4] I am thinking about adding support for Lua modules & Co to the wishlist. At the summit I would like to discuss these and other solutions for this bottleneck.
[1] to be read as users of all sister projects
[2] https://de.wikipedia.org/wiki/Wikipedia:Lokal_K
[3] My POV as active OpenStreetMap community member
[4] https://meta.wikimedia.org/wiki/Community_Tech/Community_Wishlist_Survey_description Originated by me in September 2013: https://de.wikipedia.org/wiki/Wikipedia:Technische_W%C3%BCnsche/%C3%9Cber_das_Projekt |
Madhumitha Viswanathan | API, Complexity, Documentation, Knowledge as a Service, Tools | Evolving the MediaWiki Architecture | Knowledge as a Service |
We are in the business of democratizing knowledge, and I believe that lowering and removing technical barriers to entry, and creating a culture of inclusion in our technical spaces, is essential to our success. The Knowledge as a Service aspect of our strategic direction focuses on building infrastructure and platforms that help create and share open knowledge. The key to successfully building and scaling such infrastructure, in the context of the Wikimedia technical spaces, is enabling everyone, irrespective of their experience or background, to use and create research, data, and tools on top of our infrastructure. When designing infrastructure and other technical products, we often fail to take into account technical barriers, inessential complexities and social costs that can discourage or prevent people from being able to leverage them. For instance, is it enough to build a dataset and store it in a database, if we do not provide friendly ways for researchers to access and analyze this data? Is it enough to put out a call to contribute to a project, but not provide easy-to-setup development environments to be able to test changes? Is it sufficient to have a state-of-the-art environment to host applications, but not design good, simple processes around gaining access and deploying to them? These conversations are crucial, because we are not building products for technology's sake, but are in fact trying to build a culture where it is easy to use and contribute to our technical projects, whether you are a volunteer who has a few hours to spare or a paid employee; a newcomer or a long-time contributor. We also want our technical communities to be diverse, and these complex systems and processes, and unsaid social constructs around how to interact with our projects, often bias against traditionally underrepresented populations in technology.
I have always worked on or pushed for creating and supporting simple graphical interfaces that provide unified access to data sources, building platforms and processes that let people just create tools/APIs/dashboards and painlessly host them, developing tutorials and good documentation for getting involved in our projects, and codifying friendly and inclusive social norms and promoting a culture of being excellent to each other in our technical spaces. When talking about the future directions of new and existing projects, we should take into account the costs and barriers to access, and who we may be failing to include as a result. I hope to be this voice at the Developer Summit. |