Knowledge as a Service

From devsummit

5 statements.

Author Tags Primary Session Secondary Sessions Position Statement
Katherine Maher Alternative Interfaces, Architecture, Knowledge as a Service, Strategy, User Experience Knowledge as a Service Advancing the Contributor Experience

This proposal focuses on the "Knowledge as a service" part of the strategic direction.

When I look at the core of what we do, to some extent I see a model that we've mastered, and that we're making incremental improvements to. My concern is that, while that model is incredible and powerful as a community, the model for the interface and the delivery mechanism for the product the community creates are changing, and for us to continue what we're doing today may or may not prepare us for what the future actually looks like. I think it also limits our ability to unlock all of the tremendous knowledge, unstructured and structured, that exists within our projects. And I also believe that it limits us to certain forms of knowledge and a certain hierarchy of creation in a way that is very inward-looking.

Right now much of our information is sitting, unstructured, in a SQL database, rendered through PHP, read through a rendering engine into a browser to read/write in one interface: the browser. While this is amazing for the world of the browser, we're not going to be a browser-based information world for that much longer, any more than anything else. It's not that the browser is going to go away, the browser will be like books: books haven't gone away, radio hasn't gone away, but there will be a transformation to a new interface, and we need to be ready for it. Perhaps we should actually backfill into those older interfaces that we're not currently part of, because people still use those interfaces, and those interfaces are valuable.

Essentially this is about taking the Model-View-Controller paradigm to the next level, and also about extending it to participation and to the "write" part of our read-write system. Even if Alexa is serving Wikimedia content outside the browser, there is no mechanism for contributing trough Alexa. We need to be planning for an architecture of information and architecture of experiences that is independent of the browser.

How do you get the most value out of the existing content? How do you serve a snippet to someone who just needs a quick answer? How do you serve different layers of sophistication to 8th-graders versus the college graduate, versus the PhD? Can we engage in the knowledge ecosystem and leverage what we have as a platform, and our traffic distribution and awareness, to actually open up greater resources of knowledge?

These are some of the topics I would like to see discussed at the Dev Summit.

Guillaume Paumier Alternative Interfaces, Knowledge as a Service, Knowledge Equity, New Users Advancing the Contributor Experience Knowledge as a Service

The strategic direction that has emerged has two components: "Knowledge as a service" and "Knowledge equity". "Knowledge as a service", which focuses on infrastructure, seems like the one most related to technology, This proposal is about exploring the less obvious intricacies between the two components, and in particular the technology implications of Knowledge Equity.

As a complex socio-technical system, it's not really possible to separate people from technology when talking about Wikimedia. A direction of Knowledge Equity invites the contributors of the Wikimedia movement to take a critical look at themselves and assess their biases and privileges. This, in turn, can help identify structural biases that have been reproduced and ingrained in our technical platform.

For example, MediaWiki is currently doing a great job at providing a localized interface in many languages. However, beyond language, interaction design and UX patterns seem very specific to Western culture. Similarly, when our strategic direction talks about building strong and diverse communities, this invites us to consider whether the current tools available to contributors enable them to provide an environment where newcomers can experiment, be mentored, and fail safely.

Beyond software, little effort has been invested in exploring alternative interfaces beyond the connected browser. Our primary interface for contribution (the web site) may work well for middle-class contributors from Europe and North America, but isn't necessarily what enables people from other backgrounds or geographies from contributing.

These are some of the topics I would like to bring up for discussion at the Developer Summit.

Subramanya Sastry API, Knowledge as a Service, Schema.org, Structural Semantics, Templates, Wikidata Advancing the Contributor Experience Knowledge as a Service

PROBLEM

To satisfy the 'Knowledge as a service' theme, in addition to providing access to full page content, Wikimedia APIs should provide access to semantic units at: - an abstract document level (sections, headings, tables, etc.) - a domain specific level (infoboxes, geolocation, taxoboxes, etc.)

Wikitext, the core content creation technology on wikis, evolved as a string-processing language where one set of strings is replaced with another set of strings mostly via regular expression matches to yield the output HTML string. There is no notion of document structures here.

This lack of structural semantics gets in the way of being able to robustly identify semantic units and developing tools and features that operate on a page structurally at sub-page granularities.

SOLUTION: TRANSPARENT TYPING LAYER OVER WIKITEXT

Types improve abstraction, reasoning, and tooling abilities in programming languages. A transparent typing layer on top of wikitext can provide similar benefits.

A: ENFORCE STRUCTURAL TYPES ON OUTPUT OF WIKITEXT CONSTRUCTS INCLUDING TEMPLATES AND EXTENSIONS

- Specify that all wikitext constructs have an output with type:

 String, DOM, CSS property, HTML-attribute, or a List of one of those

- Extensions and templates can specify the expected output type. All other

 core wikitext constructs have the DOM output type.

- Parser enforces the output type of all wikitext constructs. Examples:

 For DOM types, unclosed tags and misnested tags are fixed up.
 For String types, HTML tags are escaped, wikitext strings are nowikied.
 For CSS types, values are sanitized.

Among other benefits, this basic typing mechanism enables MediaWiki to provide an API to extract and edit document fragments without introducing adverse side-effects on the rest of the page.

B: UNIFIED TYPING MECHANISM TO EXPOSE DOMAIN-SPECIFIC SEMANTIC INFORMATION

Editors impose structure in documents through a rich library of templates, policies, and maintainance processes they have developed over the years. If this semantic information (infoboxes, navboxes, sports rankings, railway timetables, etc) is mapped to a centralized ontology system (wikidata, schema.org, something else), the parser can expose this information in HTML and via MediaWiki APIs can expose this information in a wiki-neutral way.

There are multiple disparate mechanisms today wherein template authors specify metadata about templates (templatadata, templatestyles, possibly others?)

Instead of creating newer mechanisms for specifying structural output types and semantic information types for templates, it is better to provide a consolidated mechanism that unifies all this template metadata into a single user-defined type declaration. This lets newer applications and capabilities to be developed in the future without code changes to the core mechanism.

FEASIBILITY

This typing layer only affects template authors. Editors that use source editing won't see any impact (besides fewer markup errors). Editors that use visual editing might see improved tooling. Even for template authors, this is meant to be an opt-in mechanism with gradual migration over to the new model.

The proposal here is a logical extension of what Parsoid does today. Parsoid provides an illusion of structured wikitext and demonstrates what is possible (VE, CX, Linter, Flow among others) by embracing structured semantics.

Madhumitha Viswanathan API, Complexity, Documentation, Knowledge as a Service, Tools Evolving the MediaWiki Architecture Knowledge as a Service

We are in the business of democratizing knowledge, and I believe that lowering and removing technical barriers to entry, and creating a culture of inclusion in our technical spaces is essential to our success.'

The Knowledge as a Service aspect of our strategic directions focuses on building infrastructure and platforms that help create and share open knowledge. The key to successfully building and scaling such infrastructure, in the context of the Wikimedia Technical spaces, is enabling everyone, irrespective of their experience or backgrounds to be able to utilize and create research, data, and tools on top of our infrastructure. When designing infrastructure and other technical products, we often fail to take into account technical barriers, inessential complexities and social costs that can discourage or prevent people from being able to leverage them. For instance, is it enough to build a dataset and store it in a database, if we do not provide friendly ways for researchers to access and analyze this data? Is it enough to put out a call to contribute to a project, but not provide easy-to-setup development environments to be able to test changes? Is it sufficient to have a state of the art environment to host applications, but not design good, simple processes around gaining access and deploying to them?

These conversations are crucial, because we are not building products for technology's sake, but are in fact trying to build a culture where it is easy to use and contribute to our technical projects, whether you are a volunteer who has a few hours to spare or a paid employee; a newcomer or a long time contributor. We also want our technical communities to be diverse, and these complex systems and processes, and unsaid social constructs around how to interact with our projects, often bias against traditionally underrepresented populations in technology.

I have always worked on or pushed for creating and supporting simple graphical interfaces that provide unified access to data sources, building platforms and processes that lets people just create tools/APIs/dashboards and be able to painlessly host them, developing tutorials and good documentation for getting involved in our projects, and codifying friendly and inclusive social norms and promoting a culture of being excellent to each other in our technical spaces.

When talking about the future directions of new and existing projects, we should take into account the costs and barriers to access, and who we may be failing to include as a result. I hope to be this voice in the Developer Summit.

Leila Zia Infrastructure, Knowledge as a Service, Knowledge Equity, Languages, Oral Knowledge, Research, Strategy, Trust Knowledge as a Service Research, Analytics, and Machine Learning

Title: Knowledge is our direction. What's next?

Combined knowledge as a service (KAS) and knowledge equity (KE) is identified as our strategic direction (draft). We have decided to focus on knowledge in a broader sense and beyond just encyclopedic knowledge, create KE, and become the infrastructure that offers KAS. In this position paper, I offer some of my early thoughts on where we should focus our efforts to move in this strategic direction. Given the limits of word-count, I will not go through the details of research methods and techniques that can be used to address each point.

Knowledge

As the central focus of the strategic direction is knowledge, we need to arrive at a unified working definition of knowledge. English Wikipedia defines knowledge as familiarity, awareness, or understanding of someone or something which is acquired through experience or education, by perceiving, discovering, or learning. This definition, however, is not a working definition that can help us decide what new content to include.

Research on user behavior, needs, and learning patterns can help us define knowledge.

Knowledge equity

Our goal is to remove structural inequalities that limit our ability to represent knowledge from all people and by all people. To this end, we need to meet our users where they are. Today:

  • language is a barrier to sharing in knowledge. Content should be available to our users in their languages.
  • text-only knowledge is a blocker for gathering knowledge, especially from parts of the world that are already left behind. Our systems should become technologically receptive to accepting and allowing editability of new forms of knowledge (e.g., voice for oral knowledge).
  • limits in proficiency and literacy is a blocker for our users. The content and its presentation will need to become a function of these parameters.

Knowledge as a service

Our goal is to offer KAS: both in terms of the infrastructure that supports it as well as the content of it. To do this, we need to:

  • empower our users to learn, create, and go beyond consuming content: Wikimedia projects' talk and discussion pages are an asset for building systems that can help our users think critically and learn how to deliberate. We need to do research to surface this critical thinking and step by step deliberation to gain insights from it, and share it with others as part of our KAS effort.
  • do research and development on building systems where deliberation and decision making can be possible at scale. Today, there is no such system available but one of the building blocks of KAS is infrastructure for discussion, deliberation, and decision making.
  • empower our users with ways to assess the trustworthiness of the content. Trust and reputation become especially important as we move to new forms of knowledge such as oral knowledge. We should do research to build trust and reputation models for Wikimedia and its users and understand how to surface such metrics as measures of reliability of the knowledge we serve.