Collaboration

From devsummit

8 statements.

Author Tags Primary Session Secondary Sessions Position Statement
Dan Andreescu Analytics, Big Data, Collaboration, Documentation, Open Source, Research, Third Parties Research, Analytics, and Machine Learning Embracing Open Source Software

Our strategic goals include scaling our communities to a truly global level, and expanding our understanding of human knowledge. To do this, in my opinion, we need to have a much better understanding of our communities' actual work. We have tens of thousands of people doing millions of hours of work every month, and nobody knows exactly what is being done, what the definition of "done" is, and how fast or slow the progress is. We are the leaders of the free knowledge movement, and we are mostly blind except for some big picture notions like pageviews and edits. It is my opinion that we need to develop a good understanding of the work being done on the wikis. Very capable people have already spent lots of time trying to do this, but I believe we have largely failed because of technical limitations. This is a big data and big compute problem, and we have not yet approached it as such. A close collaboration between our communities, Analytics, Research, and Audiences teams is needed, as well as the power of the WMF Hadoop cluster. I have had sessions on this topic already, and am excited to finish planning and transition to actual work. There are some very valuable implications of taking on and finishing this work. Most importantly, we will all be able to more objectively talk about frustrations in the community over changes that cause "more work". For example, when we launched Visual Editor there was huge backlash about the amount of work this change implied for our community. But because this was largely based on subjective opinions, emotions got involved and it took years to calm the negative effect of those emotions. This effort would also give us, for the first time, a way to celebrate these millions of hours of work. People could see, share, and take pride in their part of building human knowledge (if they wanted to, privacy is of course one of our top priorities).

I am also interested in expanding our Open Source efforts, and examining changes that we can make to spur more collaboration. My reading of the strategic goals for 2030 is that the WMF will not have enough resources to execute by itself. That's where collaboration will be crucial, and where problems like in-house developed libraries without true Open Source presence will slow us down. We let documentation and third-party user support lag behind because we're busy with other stuff, and that's arguably fine for our scale so far. But this approach will not allow us to grow the way our Strategy is defined.

Erik Bernhardson Analytics, Collaboration, Machine Learning, Open Source, Privacy, Structured Data Research, Analytics, and Machine Learning

Title: Empowering Editors with Machine Learning

Background: Advances in machine learning, powered by open source libraries, is becoming the foundational backbone of technology organizations the world over. Many tedious, time consuming, tasks that previously required 100% human involvement can now be augmented with human in the loop machine learning to empower editors to get more done with the limited time they have available to contribute to the sum of all human knowledge.

Advice: 1) Invest directly in applying known quantity machine learning, such as pre-trained ImageNet classifiers, to add structured data to our multimedia repositories to increase their discoverability. Perhaps via tools that provide editors with lists of appropriate items that they can easily click to add if appropriate to the multimedia.

2) Engage academia to work with Wikimedia data sets and employ developers to move the most promising results from research into production. There is already a significant amount of work being done in academia to test and evaluate machine learning with our data sets, but little to none of that work ever makes it back into Wikimedia sites. With more focus on collaboration we can encourage research that is specifically applicable to deployment goals.

3)Wikimedia has the ability to collect significant amounts of implicit user data via browsing sessions, searches, watchlists, editing histories, etc. that can be used for machine learning purposes. We need to be continuously thoughtful of the privacy implications of how we use this data.

Daniel Kinzler Architecture, Collaboration, Hosting, JavaScript, Mobile, Strategy Evolving the MediaWiki Architecture Research, Analytics, and Machine Learning

I would like to discuss how assumptions drive our day to day work, and how to may sure we properly understand and regularly challenge these assumptions. I'm particularly interested in how technological assumptions shape product decisions, and how product assumptions shape technological decisions. Three major axioms come to mind:

MediaWiki needs to run in a shared hosting environment. This has been an explicit requirement for a long time now, but the baseline product that actually does run in such an environment (LAMP with no root access) is becoming more and more sub-par. We are already struggling to provide a decent mobile browsing experience there, not to mention search or WYSIWYG editing. So we should have a discussion about for how long we want to kep this requirement, what the consequences would be of dropping it, and what alternative platform we should target for the baseline installation of MediaWiki.

Editing has to work with old browsers and without JavaScript. It has long been an explicit requirement that no basic functionality, particularly editing, can require JavaScript to be enabled. However, this causes us to fall behind other sites further and further. With more and more sites requiring JS, it's becoming less and less clear to me that this requirement is still sensible. This is especially true in the light of many developing countries skipping straight from mostly-offline to mobile-only.

The primary medium for knowledge sharing is text. This assumption used to be hard-coded into MediaWiki until the introduction of ContentHandler, and it still seems to be hard coded in the minds of many long term contributors, to the software and to the wikis. I believe that it is high time to invest into exploring other media formats and alternative forms of collaboration. It seems to me like "Beyond Wikitext" is the major technological challenge that has come out of the movement strategy process, and that we should start thinking and talking about it - from the technological side as well as the product side.

Birgit Müller Code Review, Collaboration, Communities, Documentation, Gadgets, Open Source, Refactoring, Tools Growing the MediaWiki Technical Community Evolving the MediaWiki Architecture

Refactoring the Open: First steps to get ready for the next level

Wikimedia's technical environment has grown into a very complex system throughout the past 15 years. Measured in internet years, parts of the software are ancient. When implementing a new feature, refactoring of a piece of the (extended) MediaWiki software is often required first. Following this principle of a.) refactoring and b.) implementation of something new, I suggest to start the discussion of the future technology direction by reflecting (and possibly: refactoring) the current Open Source practices and processes within the Wikimedia context.

A mono perspective won't let us survive (and is less fun, too)

When we talk about "Open Source" within Wikimedia we're not only talking about free licenses and open code repositories. We're talking about global collaboration and the technical contributions of many: Through this, we ensure that the Wikimedia projects stay alive and evolve, that we constantly develop new ideas, that multiple and diverse perspectives shape the development of our infrastructure and tools.

We are great in having ideas, and we are good in trying things out. But we still partly fail at prioritising the problems we know we have and address them accordingly.

I believe that we should

better maintain the Technical Community and find ways to grow by

  • allocating stable code review resources from paid staff for volunteer and 3rd party developers
  • improving the documentation of the code base
  • providing a single entry point that is easy accessible for interested developers
  • building up partnerships with Open Source communities we might share interests in the future with (for example, communities around audio, video or translation technologies)

constantly take diverse perspectives into account by

  • finding better ways to gather and address feedback from smaller language communities and non-Wikipedia sites
  • being less Wikipedia-centric when it comes to research: Not yet existing or emerging communities might not be interested in creating articles, but in contributing data or multimedia content or in building tools to reuse data and multimedia content

build more bridges across local wikis and increase knowledge of local requirements by

  • fostering cross-wiki exchange (example activity: template Hackathon)
  • increasing the knowledge of the requirements that come along with different languages (example activity: multilingual support conference)

Open Source doesn't mean anything is possible - does it?

We have established processes and regulations for contributions to MediaWiki itself. But we lack processes and practices for local developments to ensure both, the freedom and space to experiment for the Technical Community and the stability and reliability of tools for users.

I believe that we should e.g.

  • raise priority for implementing a code review process for JS/CSS pages on Wikimedia sites
  • start thinking about a technical sysop user right
  • make it clear which user scripts/gadgets/tools are maintained, which are stable and which are proofs of concept or prototypes (for example: provide a (central) 'store' of maintained gadgets/tools with different levels: "stable version", "experimental version" ...)

Let's start refactoring.

Thomas Pellissier Tanon Analytics, API, Collaboration, JavaScript, Lua, Mobile, Multimedia, Structured Data, User Experience, Wikibase, Wikidata Knowledge as a Service

Title: Content structuration and metadata are keys to fulfil our strategy

Content: The Wikimedia mouvement strategy is making a focus on serving more different kinds of knowledge and sharing them with allies and partners [1]. I believe that the most important ground work for reaching these goals is to focus on the outgoing project of moving MediaWiki from a "wikitext plus media file" collaboration system to a platform allowing people to be able to collaborate on many kind of contents and to organise them in a cohesive way.

Two axes seem important to me to pursue this goal: 1. Support a broader set of different contents, not just wikitext, Wikibase items and Lua/JavaScript/CSS contents but also images, sounds, movies, books, etc., that should bee editable just like wiki text pages in order to allow people to improve them in a collaborative way. 2. Build platforms and tools allowing contributors to create and clean metadata about these contents in order to build together the broadest cohesive set of knowledge ever available and increase its reusability.

Going in these directions would allow us to:

  • Allow sister projects (and possible new ones) to use relevant content structure for their projects instead of a designed-for-everything wiki markup. It should lead to an increase of their reusability and their user-friendliness, just like what the Structured Metadata on Commons project is aiming for.
  • Build powerful APIs to retrieve and edit content just like Wikidata has and so, make working with partners easier.
  • Increase the connections between our contents and their discovery using their metadata
  • Build better tasked-focused mobile viewing and edit interfaces
  • Be more ready for the possible new environment changes like voice-powered interfaces


Some examples of projects we could work on in order to move in this direction:

  • Use the multiple content revision facility to migrate progressively the data that could be structured out of Wikitext on all our projects (like the structured metadata on Commons project is aiming for files)
  • Federate all our structured content into a "Wikimedia Query Service" that would allow to do unified and powerful analytics and to ease the discoverability of all our contents
  • The logical granularity of Wikisource, Wikibooks and Wikiversity contents (and maybe other projects?) is not the wiki page but the set of wiki pages storing a book or a course. MediaWiki should be able to support such use cases by providing a "collection" system allowing us to add metadata and to do operations (renaming, add to watchlist) on sets of wiki pages.
  • Switch projects that stores fairly structured data in wiki text templates (like Wiktionary or Wikispecies) from a Wikitext storage to a structured one. Build on top of them user interfaces to edit their local contents (and maybe also the relevant data from Wikidata) and provide nice displays and APIs to make humans and machine both able to retrieve these contents.
  • ...

[1] https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction#We_will_advance_our_world_through_knowledge.

Lydia Pintscher Collaboration, Wikidata Next Steps for Languages and Cross Project Collaboration

Breaking Down Barriers To Cross-Project Collaboration


"... a world in which every single human being can freely share in the sum of all knowledge."

Wikipedia has been our flagship for many years now and our main means of achieving our vision. As information consumption and expectations of our readers are changing, Wikipedia needs to adapt. One crucial building block for this adaption is re-using and integrating more content from the other Wikimedia projects and other language versions of Wikipedia. Connecting our projects more is vital for helping especially our smaller communities serve their readers. Surfacing more content from the other Wikimedia projects also gives them a chance to shine, find their audience and do their part in sharing in the sum of all knowledge.

This integration comes with a lot of challenges. Over many years the Wikipedias have lived largely independent from each other and the other Wikimedia projects. This is changing. Sharing and benefiting for example from data on Wikidata means collaborating with people from potentially very different projects, speaking different languages. It brings a perceived loss of local control. Editors see them-self first and foremost as editors of "their" Wikipedia at the moment and often don't perceive this integration as worth the effort - especially on the larger projects.

We need to address this on both the social and technical level if we want to bring our projects closer together and have them benefit from each other's strength and compensate their weaknesses.

We need to think about and find answers to these questions: What can we do in order to bring our projects together more closely? How can we help break down perceived and real barriers for cross-project work? What can we do to make cross-project collaboration easier?

Gergő Tisza Collaboration, Hosting, Incubator, Infrastructure, Innovation Supporting Third-Party Use of MediaWiki

We need a wikiproject incubator to enable innovation

We need infrastructure for cheap creation of new wikis and experimentation with new project types and their different technical and social needs.

Wikipedia was an unlikely bet that paid off extremely well. It does not cover the entirety of human knowledge however - it's limited to long-form, text-based representations of encyclopedic topics on subjects well covered by reliable sources. There have been recurring discussions in the past about covering new types of knowledge (genealogy, fact checking, 3D models, oral history, collaborative video etc. etc.) but they never went anywhere, because creating new projects is a lot of work and the outcome is unpredictable (some things only work in practice), so the WMF was unwilling to take the risk. If we want to make meaningful progress on our vision of curating the sum of *all* knowledge, and truly become the infrastructure of free knowledge, we need to overcome this barrier.

We need infrastructure for low-effort, low-risk experimentation with new projects - a "wiki nursery". A system where new wikis can be created at minimal cost, technical and social experiments can be performed on them without undue risk to other projects, and they can be discarded if they prove unsuccessful. Other organizations have used such mechanisms with great success (Wikia has tens of thousands of wikis; Stack Exchange has nearly two hundred sites).

Such a system has to be somewhat separate from the existing wiki cluster; a closely integrated approach like incubator.wikimedia.org is both too inflexible and too insecure. It needs to have some level of operational and legal maturity (more than what Cloud VPS offers). For risk segregation and eficiency, it would have to be different from our production cluster in a number of ways:

  • single sign-on which does not rely on the local wiki for authentication
  • no shared database and configuration acces
  • ensure even distribution of server resources without constant human supervision
  • flat domain structure (to avoid cookie leaks) with affordable certificate management
  • new management interfaces and corresponding core support to avoid spending developer time on common tasks (wiki creation and destruction, configuration and extension management)

While there is a significant one-time cost to setting up such a system, keeping it running is cheap, and the benefits are well worth it - information on what content types and what policies enable productive collaboration, a space free of the conservativism and change aversion of large established projects where new concepts can prove themselves, an entry point for projects with a less western-centric interpretation of knowledge, lots of small projects which are welcoming to new editors and provide lots of low-hanging fruit for contribution. (Also, the same feature set is needed for paid wiki hosting service, which is one way to create non-WMF income streams into the MediaWiki software ecosystem - a valid strategic goal on its own.)

Volker Accessibility, Collaboration, Documentation, Mobile, Style Guide Advancing the Contributor Experience

Developer resources & collaboration on Wikimedia User Interface Style Guide

Presenting the Wikimedia User Interface Style Guide targeted on developers needs. The style guide is including all its resources and is hosted on GitHub as a test case for design collaboration.

In the ideation of the new style guide, it was emphasized for it to be a successful resource, it has to address both, designers and developers needs.

So far a big part of the overarching and visual style principles alongside one development 'base' layer - WikimediaUI Base variables, already being used in OOUI and Marvin as building block - has been accomplished.

Within the next couple of weeks, there will also be a 'Components' and 'Resources' section where developers are presented design principles 'in action', combined with demos hot-linking. The presentation would be on principles behind and application of the components:

Ideation and design

This is for everyone, internationalization/language

Open to collaboration

Design consistency

Trustworthy yet joyful

Usability and UX best practices and underlying user research,

Responsive design and mobile first principles and

Accessibility measurements

There are clearly topics overlapping with the questions posed on the thematic overview like maintaining and growing technical community, role of open source, scaling, tools for embracing mobility.

At the end of the presentation there is an open questions/feedback slot planned getting ideas on how to extend/improve the style guide even more.