Templates

From devsummit

5 statements.

Author Tags Primary Session Secondary Sessions Position Statement
Adam Baso Machine Learning, Mobile, Schema.org, Structured Data, Templates, Wikibase, Wikidata Knowledge as a Service Advancing the Contributor Experience, Research, Analytics, and Machine Learning

Structure Most Things with Schema.org

The future of digital information will likely be brokered by major platform providers such as Google, Apple, Amazon, Microsoft, and international equivalents and social networks. We're thankful they extend our reach, even as we seek to help consumers on the platforms join our movement.

We could help platform providers, their users, and our users solve problems better through adoption of the open standard Schema.org into Wikipedia pages mapped with templates and, ideally, federated and synchronized Wikidata properties.
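As a rough illustration of what "mapped with templates" could mean, the sketch below turns the arguments of one template call into Schema.org JSON-LD. The mapping table, the parameter names, and the function `template_to_jsonld` are hypothetical and do not correspond to any existing MediaWiki feature; only the Schema.org type and property names (`Person`, `name`, `birthDate`, `birthPlace`, `jobTitle`) are taken from the standard.

```python
import json

# Hypothetical mapping from an infobox template's parameter names to
# Schema.org property names. Neither the template nor the mapping comes
# from an existing MediaWiki feature; both are assumptions for illustration.
INFOBOX_PERSON_MAPPING = {
    "schema_type": "Person",
    "params": {
        "name": "name",
        "birth_date": "birthDate",
        "birth_place": "birthPlace",
        "occupation": "jobTitle",
    },
}


def template_to_jsonld(mapping, template_args):
    """Build a JSON-LD dict from template arguments using a parameter mapping.

    Arguments that are empty or have no mapped Schema.org property are skipped.
    """
    doc = {"@context": "https://schema.org", "@type": mapping["schema_type"]}
    for param, value in template_args.items():
        prop = mapping["params"].get(param)
        if prop is not None and value != "":
            doc[prop] = value
    return doc


args = {
    "name": "Ada Lovelace",
    "birth_date": "1815-12-10",
    "image": "Ada_Lovelace.jpg",  # no mapping defined, so it is skipped
}
jsonld = template_to_jsonld(INFOBOX_PERSON_MAPPING, args)
print(json.dumps(jsonld, indent=2))
```

Keeping the mapping as data rather than code is one possible design choice: it could then live next to the template and be edited by template authors or derived by automated analysis.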

Benefits:

  1. Wikipedia will have even better presentation and placement in search engines and other data rich experiences.
  2. We provide an opportunity for a more consistent data model for template authors and people/bots filling template values. And the richly defined Schema.org entities provide a good target to reach on all entities represented in the Wikipedia/Wikimedia corpora. Standardization can reduce duplication of effort and inconsistencies.
  3. We introduce an easier vector for mobile contribution, which could include simpler and different data entry, mapping, and modeling.
  4. We can elevate an open standard and push its adoption forward while increasing the movement's standing in the open standards community.
  5. Schema.org compliant data is more easily amenable to machine learning models that cover data structures, the relations between entities, and the dynamics of sociotechnical systems. This could bolster practical applications like vandalism detection, coverage analysis, and much more.
  6. This might provide a means for the education sector to teach students about knowledge creation, data modeling, and more. It might also afford scientists and other practitioners a further standardized way to model the knowledge in their fields.

What would it take? And can this be done in harmony with the existing {{Template}} system?

This session will discuss the following:

  1. Are we aligned on the benefits, and which ones?
  2. Implementation options.
    1. Can we extend templates so they could be mapped to Schema.org?
      1. Would it be okay to derive the mapping by manual and automated analysis at WMF/WMDE and apply it behind the scenes? Would that be sustainable?
      2. Could we make it easy for template authors to mark up their templates for Schema.org compatibility and have some level of enforcement? Could Schema.org attributes and entity types be autosuggested for template creators?
    2. Is it easy to relate most existing and proposed Wikidata entity types and properties to existing Schema.org entities and properties?
    3. What would it take to streamline MCR Schema.org data structures or MCR Wikibase property clusters mapped to Schema.org on defined entity types?
    4. Furthermore, if we can do #1 and #2, what's to prevent us from letting templates as is merely be the interface for Schema.org compliant Wikibase entities and properties (e.g., by duck typing / autosynthesis)?
    5. How could we synchronize bidirectionally between Wikipedia and Wikidata with confidence, in a way compatible with patroller expectations? And what storage and event processing would be needed? Can the systems be scaled to accommodate the arrival of real-time and increasingly fine-grained information?
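
One way to read the "duck typing / autosynthesis" option in the list above is to guess an entity type from the properties a template call happens to supply. The sketch below is only an assumption about how that might look: the property sets are a heavily abridged subset of Schema.org, and `infer_schema_type` and its `min_score` threshold are hypothetical.

```python
# Hypothetical, heavily abridged property sets for a few Schema.org types.
# Real Schema.org types define many more properties; this subset is an
# assumption made for the example.
SCHEMA_TYPES = {
    "Person": {"name", "birthDate", "birthPlace", "jobTitle"},
    "Book": {"name", "author", "isbn", "numberOfPages"},
    "Place": {"name", "geo", "address"},
}


def infer_schema_type(properties, min_score=0.5):
    """Return the best-matching type name, or None if nothing matches well.

    The score is the share of the supplied properties that the type defines.
    """
    props = set(properties)
    if not props:
        return None
    best_type, best_score = None, 0.0
    for type_name, type_props in SCHEMA_TYPES.items():
        score = len(props & type_props) / len(props)
        if score > best_score:
            best_type, best_score = type_name, score
    return best_type if best_score >= min_score else None


guess = infer_schema_type(["name", "author", "isbn"])
print(guess)  # Book
```

A real system would need a human review step for low-scoring or ambiguous matches; the threshold here only marks where such a step could attach.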
Markus Glaser Code Review, Documentation, Gadgets, Software Development Practices, Templates, Third Parties Supporting Third-Party Use of MediaWiki Growing the MediaWiki Technical Community

MediaWiki needs a professional ecosystem

There is a huge potential for MediaWiki development outside of the Foundation's organized tech world. Thousands of organisations are running MediaWiki on the internet or intranet. They are investing time and money to make it their platform for information sharing, knowledge management or collaborative work. Yet, a lot of the development and design work stays contained on those installations instead of being published and provided to the greater MediaWiki community. I think this is not because of seclusiveness of the authors, but because we make it hard for externals to contribute.

So how can we tap into this potential? I think there are a number of measures we can take. Among others, these are:

  • Support standard ways for code contribution. For example, a lot of developers have a GitHub account and know the GitHub workflow of forking and opening pull requests. However, there is currently no way for them to contribute their code directly; instead, they have to get set up with our Gerrit infrastructure. This is a hurdle many will not take.
  • Maintain extensions as a community. There are a lot of extensions which are not actively maintained by their authors. In order to get them working, you have to wait for the maintainer to +2 your code. Although I have +2 rights, it is not clear under which circumstances I should actually +2 code, nor is there a general review queue for extensions. We can establish a group of volunteers who review changes to extensions on a regular basis.
  • Create a template and gadget repository. A lot of work goes into site customisation using gadgets, templates or on-site-CSS. There are brilliant solutions out there, but we do not have a structured way to centrally collect this content or even curate it.
  • Make it attractive for professional developers and consultants to build their projects on top of MediaWiki, for example by increasing the visibility of highly used extensions on MediaWiki.org, by providing good entry points for technical documentation, or by adding automated quality checks to the extensions.

There are already some initiatives pursuing the general goal of fostering an ecosystem, e.g. MediaWiki Stakeholders or the recently announced Enterprise MediaWiki Consortium. Together with the Foundation, they can encourage MediaWiki maintainers to contribute their ideas and code and be part of the MediaWiki world.

Niharika Kohli Communities, Cross-wiki, Gadgets, Lua, New Users, Templates, Tools Evolving the MediaWiki Architecture Next Steps for Languages and Cross Project Collaboration

Investing in our communities

This position statement captures my thoughts about why and how we should be investing in our communities. There are a lot of ways we could encourage and support them that we currently don't use. Prioritizing tools for our communities is a crucial step for the long-term survival of our projects.

It's fairly common knowledge that a lot of our communities suffer from toxicity. It's incredibly hard for newcomers to edit, to stick around, and to stay engaged in the midst of the existing toxicity in the community. The problem frequently also exists in smaller communities. Just recently, the English Wikipedia community pushed WMF into implementing ACTRIAL and preventing brand new users from being able to create articles on the site. These are signs that all is not well with our communities. If we envision a future with an active, thriving editor community 15 years from now, we have to become more aware of how our communities function and do more to support them than we do today.

The problems also exist on the technical side. Communities without technical resources lose out on gadgets, templates, editing toolbar gadgets, and so on. The editors on these wikis are still forced to do a lot of things the hard way. Non-Wikipedia projects are probably the worst affected. Quite often our software projects cater only to the bigger projects, often just the Wikipedias.

I am sure we can't solve everything, but we can try to help solve at least some of the problems. We can invest in better tools for new users to create articles, to edit, and to experiment with wikitext markup. We can build a better onboarding experience for new users. For example, English Wikipedia currently has the "Article Creation Wizard" (https://en.wikipedia.org/wiki/Wikipedia:Article_wizard), which is outdated, poorly maintained, and often very confusing. We can think about a more standardized solution that would be useful across wikis. We can also try to showcase user contributions in a better way, to build user engagement. Various wikis have been striving to create and sustain "wikiprojects" for a while, with the result that several big Wikipedias have come up with their own homegrown solutions. These are things the Foundation can help build and standardize for all wikis.

For the technical problems, there is a big backlog of projects which are long overdue. Global cross-wiki watchlists and global gadgets, templates, and Lua modules have been requested by the community for many years now. There are many more such projects to be found on Phabricator and in the wishlist survey. These projects can be building blocks in making our communities more sustainable and thriving places. They are big and important enough that they should make it into the product roadmap of teams outside of Community Tech.

Another important thing we should think about is tools. Pageviews analysis, for example, is one of the most important volunteer-maintained tools out there. What happens when it stops being maintained? When is a tool important enough for the Foundation to start thinking about incorporating that functionality into an extension or core? These are all important discussions to be had.

Subramanya Sastry API, Knowledge as a Service, Schema.org, Structural Semantics, Templates, Wikidata Advancing the Contributor Experience Knowledge as a Service

PROBLEM

To satisfy the 'Knowledge as a service' theme, in addition to providing access to full page content, Wikimedia APIs should provide access to semantic units at:

- an abstract document level (sections, headings, tables, etc.)
- a domain-specific level (infoboxes, geolocation, taxoboxes, etc.)

Wikitext, the core content creation technology on wikis, evolved as a string-processing language where one set of strings is replaced with another set of strings mostly via regular expression matches to yield the output HTML string. There is no notion of document structures here.

This lack of structural semantics gets in the way of being able to robustly identify semantic units and developing tools and features that operate on a page structurally at sub-page granularities.

SOLUTION: TRANSPARENT TYPING LAYER OVER WIKITEXT

Types improve abstraction, reasoning, and tooling abilities in programming languages. A transparent typing layer on top of wikitext can provide similar benefits.

A: ENFORCE STRUCTURAL TYPES ON OUTPUT OF WIKITEXT CONSTRUCTS INCLUDING TEMPLATES AND EXTENSIONS

- Specify that all wikitext constructs have an output with type: String, DOM, CSS property, HTML-attribute, or a List of one of those.

- Extensions and templates can specify the expected output type. All other core wikitext constructs have the DOM output type.

- The parser enforces the output type of all wikitext constructs. Examples:

 For DOM types, unclosed tags and misnested tags are fixed up.
 For String types, HTML tags are escaped, wikitext strings are nowikied.
 For CSS types, values are sanitized.

Among other benefits, this basic typing mechanism enables MediaWiki to provide an API to extract and edit document fragments without introducing adverse side-effects on the rest of the page.
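
The enforcement step can be sketched for the two simplest cases, String and CSS. This is a minimal illustration under stated assumptions, not how the MediaWiki parser or Parsoid works: `OutputType`, `enforce_output_type`, and the CSS allowlist are hypothetical, and DOM fix-up and nowiki escaping of wikitext are left out because they need a real HTML and wikitext parser.

```python
import html
import re
from enum import Enum


class OutputType(Enum):
    STRING = "string"
    CSS = "css"


# Deliberately conservative allowlist for CSS values: letters, digits,
# spaces, and a few punctuation characters. This is an assumption for the
# sketch, not MediaWiki's actual CSS sanitizer.
_SAFE_CSS_VALUE = re.compile(r"[A-Za-z0-9 #%.,\-]*")


def enforce_output_type(value, output_type):
    """Coerce a construct's raw output so it conforms to its declared type."""
    if output_type is OutputType.STRING:
        # HTML special characters are escaped so the result is plain text.
        return html.escape(value)
    if output_type is OutputType.CSS:
        # Values outside the allowlist are dropped entirely.
        return value if _SAFE_CSS_VALUE.fullmatch(value) else ""
    raise ValueError(f"unsupported output type: {output_type}")


escaped = enforce_output_type("<b>bold</b>", OutputType.STRING)
safe_css = enforce_output_type("1px solid #ccc", OutputType.CSS)
bad_css = enforce_output_type("url(javascript:alert(1))", OutputType.CSS)
print(escaped)
print(repr(safe_css), repr(bad_css))
```

Because each construct's output is forced into its declared type, a fragment can be extracted or replaced without its markup leaking into the surrounding page, which is the property the API described above relies on.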

B: UNIFIED TYPING MECHANISM TO EXPOSE DOMAIN-SPECIFIC SEMANTIC INFORMATION

Editors impose structure in documents through a rich library of templates, policies, and maintenance processes they have developed over the years. If this semantic information (infoboxes, navboxes, sports rankings, railway timetables, etc.) is mapped to a centralized ontology system (Wikidata, Schema.org, something else), the parser can expose this information in HTML, and MediaWiki APIs can expose it in a wiki-neutral way.

There are multiple disparate mechanisms today with which template authors specify metadata about templates (TemplateData, TemplateStyles, possibly others).

Instead of creating newer mechanisms for specifying structural output types and semantic information types for templates, it is better to provide a consolidated mechanism that unifies all this template metadata into a single user-defined type declaration. This lets newer applications and capabilities be developed in the future without code changes to the core mechanism.
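
To make the idea of a single declaration concrete, here is a minimal sketch of what such a record might hold. Every name in it (`TemplateType`, its fields, `undeclared_semantics`, the example template and style page) is a hypothetical illustration; the proposal does not define a concrete format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TemplateType:
    """Hypothetical consolidated type declaration for one template."""

    name: str
    # Structural output type, e.g. "DOM", "String", or "CSS property".
    output_type: str
    # Parameter name -> human-readable description.
    params: dict = field(default_factory=dict)
    # Ontology type the template instance represents, if any.
    semantic_type: Optional[str] = None
    # Parameter name -> ontology property it carries.
    param_semantics: dict = field(default_factory=dict)
    # Page holding the template's styles, if any.
    styles: Optional[str] = None

    def undeclared_semantics(self):
        """Return semantic mappings that refer to parameters not in params."""
        return sorted(set(self.param_semantics) - set(self.params))


infobox = TemplateType(
    name="Infobox person",
    output_type="DOM",
    params={"name": "Full name", "birth_date": "Date of birth"},
    semantic_type="schema:Person",
    param_semantics={"name": "schema:name", "born": "schema:birthDate"},
    styles="Template:Infobox person/styles.css",
)
print(infobox.undeclared_semantics())  # ['born']
```

With all metadata in one record, consistency checks such as the one above become possible, and a new consumer could read the fields it needs without a new declaration mechanism.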

FEASIBILITY

This typing layer only affects template authors. Editors that use source editing won't see any impact (besides fewer markup errors). Editors that use visual editing might see improved tooling. Even for template authors, this is meant to be an opt-in mechanism with gradual migration over to the new model.

The proposal here is a logical extension of what Parsoid does today. Parsoid provides an illusion of structured wikitext and demonstrates what is possible (VE, CX, Linter, Flow among others) by embracing structured semantics.

Brion Vibber API, Gadgets, JavaScript, Mobile, Security, Templates, Tools Evolving the MediaWiki Architecture

Infrastructure for Open: Safe code sharing in the Wikiverse

Wikipedia has always been a place where people build things, starting with MediaWiki itself... Talk pages were created out of formatting conventions manually followed. Templates and Lua modules grew out of users' need to automate common markup & text blocks. Gadgets came about to let users add new capabilities to their experience.

To scale our users' ability to work, we need to build modern infrastructure and APIs for on-wiki code: templates, gadgets, and custom workflows.

First, gadgets and templates need to be maintainable and sharable in a centralized place; copy-pasting doesn't scale. Integrate with "real developer" tools like git, so complex tools can be edited and archived off-wiki.

Second, they need to be safer and more future-proof. Template & module output is in wikitext, a fragile format; consider separating sanitized "true" templates from the data sources.

JS gadgets can access internal or deprecated APIs that may break ... or hijack a session as malware! We should create narrower APIs and run the gadgets in isolated JS contexts to provide fault isolation -- this would also enable using them in different contexts such as mobile apps, by implementing the same interfaces.

Third, we need to make content "smarter" by giving it the ability to run interactive scripts safely -- a mix of what templates/modules and what gadgets can do. This can be used to make animated widgets for article pages, but more importantly could be used to implement discussion & editing workflows to supplement what you can do with just a talk page and a set of conventions. At a minimum, think of what people do with Google Forms to guide input, and let folks do that on-wiki.

On-wiki tool-building is a "force multiplier" that lets people get more done by organizing themselves. Providing better tools for tool builders will lead to happier, more productive users working for our mission.