Embracing MediaWiki as Open Source Software

From devsummit
Revision as of 10:05, 27 November 2017 by Ccicalese (talk | contribs)
Related Phabricator Task
Topic Leaders none assigned

4 primary statements. 3 secondary statements.



Primary

Author Tags Primary Session Secondary Sessions Position Statement
Author: Michael Holloway; Tags: Open Source; Primary Session: Embracing Open Source Software

Free Software is Fundamental to Our Mission

Position

MediaWiki is a prominent free software project, and the Wikimedia projects have always run on free and open-source technology, but our relationship to free and open-source software needs clarification. We should formalize that we are committed to making, using, and leading in the development of free software, even when doing so is more difficult or less efficient in delivering user value than adopting closed solutions, as a central part of our educational mission.

Discussion

How does free software relate to the free knowledge movement?

In this movement we are building a body of open knowledge, curated collectively and accessible to all. We develop the software that powers these projects in the open, and we run our backing infrastructure on free and open-source technology.

We choose to do these things not because they are easy, but because they are hard. Existing free software is not always, or even most of the time, practically superior.[1] We work in the open so that others can contribute to and learn from our processes; our work product is educational content in its own right, and in that way directly contributes to our mission.[2] By ensuring that our tools and processes are open, and working through problems with free software projects rather than rejecting them in favor of closed solutions, we empower others everywhere to join us in doing this hard work, or to launch like-minded projects of their own.

It's often tempting to conclude that our users could be better served by adopting closed or proprietary software solutions to our engineering problems, rather than adapting free software to meet our needs or writing our own. This may be true in the short term, but over the long term this contributes to the cloistering of software engineering expertise in closed commercial enterprises.

Our goal is to expand and not restrict the knowledge of software engineering principles and practices, and we are playing a long game.

What could a formal commitment to free software mean in practical terms?

This is intended as an open-ended question for discussion, but here are a few ideas:

  • We should take a leadership role in the development of free software languages and technologies on which we depend (e.g., PHP).
  • Where we develop software for closed platforms (such as the mobile apps), we should promote free alternatives for their distribution channels (e.g., F-Droid [3]) and ensure they can be run without depending on proprietary software.
  • We should encourage and recognize contributions by our engineers in the broader free software community.

[1] https://www.gnu.org/philosophy/when-free-software-isnt-practically-superior.html
[2] https://wikimediafoundation.org/wiki/Mission_statement
[3] https://f-droid.org/

Author: Matanya Moses; Tags: Communities, Open Source; Primary Session: Embracing Open Source Software

Standing on the shoulders of giants

MediaWiki is built on many other open source tools, libraries, packages, and other kinds of software. Our ability to write, run, and use MediaWiki depends on their availability, on upstream support, and on their maintainability. A few examples: Debian, the OS that WMF runs; PHP; and Elasticsearch, our search back end.

In light of recent discussions about migrating from HHVM to Zend PHP as our runtime, I would like to raise the question of what our position is in the open source world with respect to the underlying parts of our stack: whether we choose to be just a user of what upstream produces, or whether we want to actively influence the decisions made while the software is written.

As the well-known phrase says, "decisions are made by those who show up." To influence those decisions, we will need to show up in those communities, take an active part in them, and contribute, in exactly the same way we hope third-party MediaWiki re-users will contribute, discuss, send patches, and show up.

If we choose this path, it has resource implications: time, money, and dedication to involvement in other communities. For instance, sponsoring a PHP developer to work on our needs upstream might be a good investment, but it might also turn out to be a waste.

I would like to have an open discussion about this approach: whether it is desired, feasible, and worth the effort. I think it might affect where our tech stack will be in the years to come, and it makes a significant statement to the outside open source ecosystem.

Thank you

Author: Moriel Schottlender; Tags: Accessibility, Languages, Open Source, Tools; Primary Session: Embracing Open Source Software

The Wikimedia Foundation is a leader in many fields, but in none is that as obvious, and in none is the field otherwise as underserved, as in language and accessibility. We are not just the fifth biggest site online, or one of the biggest open source endeavors available; we are the de facto leaders in technology that commercial companies consider 'edge case' and 'less profitable'. This gives us the advantage of developing tools that don't just help our own audience, but could - and should - serve as a repository that allows everyone online to reach, support, and embrace these audiences with minimal effort.

We have many of the tools available already, for our own users and products, but they are still limited when it comes to sharing and using them outside the movement. And why? Developing our tools to be accessible to outside projects - and to cloud tools, to bots and to other Open Source organizations - is a doable task that is not just worthy in general, it also follows our mission.

What better way to empower 'every single human being [to] freely share in the sum of all knowledge' than to share our own powerful tools with others, allowing everyone to prioritize support for language, accessibility, and right-to-left technologies, and to push these technologies forward?

I suggest we look across our technologies and libraries - from OOjs UI to CSSJanus, ResourceLoader to wfMessage(), and many others - and work to better generalize these to serve our own users better in their projects, bots, and cloud tools - and to place ourselves firmly and officially as the leaders of this technology that we already are.

 Now that my position is known, my direction is unknowable - Heisenberg Uncertainty Principle.

So let's break reality, and figure out both.

Author: Timo Tijhof; Tags: Infrastructure, Open Source, User Experience; Primary Session: Embracing Open Source Software; Secondary Sessions: Evolving the MediaWiki Architecture
  • Embrace open-source and hold our software to the same standards to which we hold other open-source software. This would prevent our software from becoming isolated, hard to maintain, or hard to contribute to.
  • Scale the contributor experience. Ensure our content remains of high quality and value to readers; ultimately to avoid failing our mission. I envision this requires a radical shift in how our application is served, by involving a non-static service capable of scaling to the traffic of our CDN and yet vary responses by user.

Open-source:

We must understand the dangers of producing software that isn't reusable. Such software may be hard to maintain and hard to contribute to, both for future contributors and for our future selves.

"Current needs" only exist to serve our long-term needs. Losing track of long-term needs can make software too specific to a current need, risking a trend of releasing software that is only open-source as a courtesy, for transparency, and without being re-usable.

Reusable software has a defined purpose and serves it well. It tends to be easy to install, well-documented, and easy to contribute to. It can be re-used between different services as well as externally, such as for community tools, cloud services, or other third parties. Having a defined goal also encourages designing APIs in a way that we can agree not to break or change too often, because they are public.

Scaling:

Our current infrastructure is highly optimised for the passive reader who doesn't contribute. We serve a static CDN response to most users. For users who have logged in or made contributions, we bypass these layers for all page loads. As a result, their document load time increases by 5x-10x (e.g. the NavTiming metric responseStart).
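
As a rough illustration of the split described above, here is a minimal sketch of a cache-or-bypass decision. It is not the actual Wikimedia configuration: the `Request` type, the cookie-name suffixes, and the `handle` function are all hypothetical, and the in-memory dictionary only stands in for a CDN cache.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """A simplified page request (hypothetical type, for illustration only)."""
    path: str
    cookies: dict = field(default_factory=dict)


# Assumed cookie-name suffixes marking a user as logged in or as having
# contributed; the names used in production may well differ.
BYPASS_COOKIE_SUFFIXES = ("Session", "Token")


def can_serve_from_cdn(request: Request) -> bool:
    """Return True if a shared, static cached response is acceptable."""
    return not any(
        name.endswith(BYPASS_COOKIE_SUFFIXES) for name in request.cookies
    )


def handle(request: Request, cdn_cache: dict, render) -> str:
    """Serve from the cache when allowed, otherwise render on the backend."""
    if can_serve_from_cdn(request):
        if request.path not in cdn_cache:
            cdn_cache[request.path] = render(request)
        return cdn_cache[request.path]
    # Logged-in users and contributors skip the cache on every page load.
    return render(request)
```

In this sketch an anonymous reader triggers at most one backend render per path, while a request carrying a session-like cookie triggers a render every time, which is where the extra load time comes from.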

In late 2014, Ori mentioned the danger of this in <https://blog.wikimedia.org/2014/12/29/how-we-made-editing-wikipedia-twice-as-fast/>, saying that optimising our backend will "allow us to dissolve the invisible distinction between passive and active users" and "enable microcontribution [features] that draw [in] passive readers".

Banners (CentralNotice) are a good example of our needs being at odds with our infrastructure. We want banners to show as part of the page, and for banners to vary by user, location, plus random variance. Our current infrastructure could only do so by bypassing the CDN on all requests. As such, the current way is entirely client-side and completes well after page load.

In the few cases where we do ask readers for data, it is for statistical purposes or to improve the software. Direct (or indirect) contributions to our content remain limited to complex actions like "edit". Moving our contributor experience to match some of the capabilities and performance of the reader experience would enable us to start accepting micro-contributions that actually produce a change in content (either directly or, e.g., by consensus). It also opens the door to making our web platform work offline (e.g. ServiceWorkers), which further enables high-performance interactions that can be uploaded at a later time.

Secondary

Author Tags Primary Session Secondary Sessions Position Statement
Author: Dan Andreescu; Tags: Analytics, Big Data, Collaboration, Documentation, Open Source, Research, Third Parties; Primary Session: Research, Analytics, and Machine Learning; Secondary Sessions: Embracing Open Source Software

Our strategic goals include scaling our communities to a truly global level, and expanding our understanding of human knowledge. To do this, in my opinion, we need to have a much better understanding of our communities' actual work. We have tens of thousands of people doing millions of hours of work every month, and nobody knows exactly what is being done, what the definition of "done" is, and how fast or slow the progress is. We are the leaders of the free knowledge movement, and we are mostly blind except for some big picture notions like pageviews and edits.

It is my opinion that we need to develop a good understanding of the work being done on the wikis. Very capable people have already spent lots of time trying to do this, but I believe we have largely failed because of technical limitations. This is a big data and big compute problem, and we have not yet approached it as such. A close collaboration between our communities, Analytics, Research, and Audiences teams is needed, as well as the power of the WMF Hadoop cluster. I have had sessions on this topic already, and am excited to finish planning and transition to actual work.

There are some very valuable implications of taking on and finishing this work. Most importantly, we will all be able to talk more objectively about frustrations in the community over changes that cause "more work". For example, when we launched Visual Editor there was huge backlash about the amount of work this change implied for our community. But because this was largely based on subjective opinions, emotions got involved and it took years to calm the negative effect of those emotions. This effort would also give us, for the first time, a way to celebrate these millions of hours of work. People could see, share, and take pride in their part of building human knowledge (if they wanted to; privacy is, of course, one of our top priorities).

I am also interested in expanding our Open Source efforts, and examining changes that we can make to spur more collaboration. My reading of the strategic goals for 2030 is that the WMF will not have enough resources to execute by itself. That's where collaboration will be crucial, and where problems like in-house developed libraries without true Open Source presence will slow us down. We let documentation and third-party user support lag behind because we're busy with other stuff, and that's arguably fine for our scale so far. But this approach will not allow us to grow the way our Strategy is defined.

Author: Matthew Flaschen; Tags: Discussions, Open Source, Strategy; Primary Session: Evolving the MediaWiki Architecture; Secondary Sessions: Embracing Open Source Software

We need to re-evaluate scaling, on both the technical community side and the content side.

On the technical side, too often we think as if we were an isolated organization, rather than a respected leader that many wish to collaborate with. This causes us to ask ourselves the wrong questions and get the wrong answers.

For example, we asked ourselves whether we should limit ourselves to existing open source translation tools, or use proprietary translation services to fill in the gaps. Instead, we should have stayed committed to open source, and asked how we can use our engineering and financial resources to advance open source translation. This is a major problem that no organization can solve on its own. However, we have both the motivation and resources to be a major contributor to the solution.

We also asked whether we should support the proprietary MP4 format, or limit ourselves to weak device support for open formats. Instead, we should be staying committed to open standards, and working to support their uptake among software developers and device manufacturers. We already have significant relationships with wireless carriers that give us a foot in the door with such manufacturers.

By seeking important partnerships, where we are prepared to put in significant effort, we can greatly scale both our own efforts and those of the broader movement.

On the content side, to achieve sustained long-term growth, we need to grow every type of user activity, including writing, editing, discussion, organization, curation, maintenance, workflows, and moderation. We have historically provided good (and improving) support for writing, editing, discussion, and moderation.

However, we have neglected the related processes of organization (e.g. categorization, tagging), maintenance (e.g. tracking articles that need fixes, updating them as they become out of date), curation (e.g. quality images, featured articles), and workflows (used in multiple areas, but particularly supporting organization, maintenance, curation, and moderation).

It is vital that we improve discussion, curation, workflows, and moderation tools. Otherwise, we will be unable to keep up with increasing content and activity as our improvements to writing and editing succeed. We should look at past successes (e.g. the Teahouse) and failures (e.g. Article Feedback) and learn lessons. In both cases, we made a very specific product, which then succeeded or failed. This is not scalable to hundreds of wikis, and it is hard to iterate in response to lessons learned.

Instead, we should focus on platforms, such as workflow systems. In order to keep up with the community, we need to give them the flexibility to constantly use the software according to their needs.

Author: Adam Shorland; Tags: Architecture, Open Source, Wikibase; Primary Session: Growing the MediaWiki Technical Community; Secondary Sessions: Embracing Open Source Software

I believe that the technical community should strive to collect and effectively disseminate technical knowledge, as per the Wikimedia mission statement.

Our ability to grow our technical community can be compared with one's own ability to gain knowledge in technical spaces within the Wikimedia movement. Currently there are many barriers to entry that have been surfaced year after year, with only a little movement past them.

To scale and ready the community, we should push forward and enable the use of emerging trends in technology, such as knowledge-retaining Q&A platforms. Many other organizations and software projects do this much better than Wikimedia, and we should learn from them. Looking at Q&A platforms specifically, talk pages have never really been a good place to ask questions and retain knowledge in a searchable way for future use. Stack Overflow, as an example, has proven to be an invaluable resource for people in technical spaces, and we can learn from that. MediaWiki is an amazing piece of software, but we should not feel 'boxed in' by it. The Wikimedia Foundation is not the MediaWiki Foundation, and MediaWiki does not always have to be just a wiki page.

Our commitment to Open Source often slows down actions within the movement; however, this should not change, as it is integral to what Wikimedia stands for at its core. We should embrace our Open Source commitments and reach out to and engage more with organizations using our software. Wikimedia Germany does this outreach specifically with the Wikibase extension, looking for other users and engaging them to discover how they are using it, why, and how it can be better. The Wikibase extension, and specifically the Wikibase Query Service, also shows us that not everything has to be a wiki page, as the query service disseminates knowledge under a free licence effectively.

I hope that the summit will agree that entry to our technical space, and increasing knowledge persistence within it, need some thought and work, and that we should stay committed to MediaWiki as software and as a platform, but that it can look, feel, and act different while Wikimedia stays true to its mission.

