Technical Debt

From devsummit

3 statements.

Author | Tags | Primary Session | Secondary Sessions | Position Statement
Marko Obrovac | Architecture, Microservices, Refactoring, Technical Debt | Evolving the MediaWiki Architecture

All of the Wikimedia projects have, in technical terms, MediaWiki - the software - at their core. Thanks to this fact, MediaWiki has become a widely deployed system, drawing many volunteer developers. Alas, there is a disparity in scale between the WMF-run install and other, external set-ups, which hinders the speed at which the platform supporting the Wikimedia projects can evolve. On the other hand, a microservices architecture has proven to be a good way of achieving scalability, increasing developer productivity, improving maintainability and reducing technical debt. Gradually moving towards 'de-monolithising' our core infrastructure will enable developers (both WMF staff and volunteers) to start working on all sorts of interesting features, ranging from simple add-ons to full-blown companion sub-systems.

While this transition is (arguably) already happening, everything still gravitates around MediaWiki - the software. Instead of focusing our efforts on compatibility in scale (e.g. one JobQueue system for WMF, another for external installs), we should focus on the products and features that allow the projects to grow, both in terms of the number of projects and the features they offer, and in the number (and diversity) of their users. Microservices can greatly help in achieving this goal, since each install can select the components it wants to run based on the resources at its disposal and its potential reach or scale. Much like the advent of extensions enabled various parties to complete their systems with sought-after functionality, microservices can refocus our technical community on features and components without worrying about scaling them (up and/or down).

If we want our developers to assist the Wikimedia projects and their communities, we need to bring our core infrastructure into the 21st century. Let's not leave the technology behind - it is central to the success of the communities we are trying to enable.

Max Semenik | Technical Debt | Growing the MediaWiki Technical Community

I'm highly interested in having a deep discussion about our technical debt. I've been active on this front myself and I really care how we fare in this respect; however, I see a lack of consensus on tech debt in the development community at large.

We deprecate things and then continue using them. People get irritated when their extensions break due to the slightest core changes, even when the extensions themselves are misusing the core interfaces. We can't really run many types of static analysis against our code base because the sheer number of problems detected would make the signal-to-noise ratio unacceptable. Developing skins for MediaWiki is incredibly painful. Our tests access the database a lot; as a result, they're slow and fragile. These are just a few examples of pain points haunting our code base and exacting their daily toll from everyone working on it.
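As an illustration of the testing pain point, here is a minimal sketch of one common way to keep tests off the database: the code under test depends on a small interface, and the test supplies an in-memory stand-in. Everything named here (`PageStore`, `InMemoryPageStore`, `word_count`) is hypothetical and is not an actual MediaWiki interface; the sketch is in Python purely for brevity.

```python
from __future__ import annotations

from typing import Optional, Protocol


class PageStore(Protocol):
    """Hypothetical storage interface (an assumption, not a MediaWiki API)."""

    def get_text(self, title: str) -> Optional[str]:
        ...


class InMemoryPageStore:
    """Test double that keeps pages in a dict instead of a database."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = dict(pages)

    def get_text(self, title: str) -> Optional[str]:
        return self._pages.get(title)


def word_count(store: PageStore, title: str) -> int:
    """Code under test: it depends only on the interface, not on a database."""
    text = store.get_text(title)
    return 0 if text is None else len(text.split())


def test_word_count() -> None:
    # No database connection is opened anywhere in this test.
    store = InMemoryPageStore({"Main Page": "Welcome to the wiki"})
    assert word_count(store, "Main Page") == 4
    assert word_count(store, "Missing") == 0
```

A test written this way exercises the logic only; a smaller number of separate integration tests would still be needed to cover the real database-backed implementation.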

I would like to have the Tech Debt SIG work in person on addressing these issues. We should define code quality metrics, identify problem areas and create some action items to address them. We should also discuss approaches to handling this without causing too much discontent in the broader developer community.

I believe this would be an important step towards making MediaWiki a better ecosystem and improving our development pace.

Santhosh Thottingal | Languages, Machine Translation, Open Source, Technical Debt, Volunteer Developers | Next Steps for Languages and Cross Project Collaboration

MediaWiki is one of the rare software systems where i18n is done right. This infrastructure needs timely improvement and maintenance. The technology, and the resources for supporting it, are important, as the 2017 movement strategy states: 'We will build the technical infrastructures that enable us to collect free knowledge in all forms and languages.' But most of this infrastructure runs on volunteer capacity, with no official team responsible for it.

1. Open source strategy - MediaWiki language technology is isolated and little known in the general open source ecosystem. It needs proper ownership, maintenance and feature enhancements, as a good open source project would have, so that contributions come from other multilingual projects while we help them with our expertise.

    (a) Our localization file formats and the libraries on top of them are very advanced and support more languages than any other system. But it was only when the libraries were made independent of MediaWiki that other projects started noticing them. We use that independent library (jquery.i18n) for VE, ULS and OOJS-UI, and it is present in MediaWiki core. But it is not actively maintained: issues and pull requests are not addressed because nobody at the Foundation is now in charge of it, apart from volunteer time. There is a lot of demand for a non-jQuery, general-purpose JS version of the library. The code is aged and tech debt is increasing.
    (b) We developed one of the largest repositories of input methods (100+ languages) to support input in various languages, along with an input method library, jquery.ime. This is a critical piece of software for many small wikis. The code is aged and not actively maintained by anybody at the Foundation, except by some in their volunteer time. It has not been updated to take advantage of browser technology updates for IMEs. This is a MediaWiki-independent open source library.
    (c) Universal Language Selector - a language selection and switching mechanism for our large list of languages, which also delivers input methods and fonts - needs ownership and tech debt removal. Navigating between different wikis is done using it, and the team that authored this system no longer exists. This is also a MediaWiki-independent open source library. VE, Translate, ContentTranslation and Wikidata depend on it.
    (d) MediaWiki core i18n features (PHP) have also started showing their age. There were plans to turn some of them into standalone open source libraries, but that has not happened. Nobody is officially responsible for this infrastructure either.

2. The Translate extension - which helps make the MediaWiki interface available in 300 languages, something we are always proud of - is not officially maintained by the Foundation now. The localization happens because of volunteers, including the volunteers maintaining the Translate extension code. Moreover, translatewiki.net, which hosts Translate and is where volunteers do the localization, is also outside Foundation infrastructure.

3. It is time to have machine translation infrastructure within Wikimedia. Content Translation uses machine translation, but it is an isolated product. Translation of content can be used in various contexts for readers. CX tries to provide a service API for MT, and there is a lot of potential in that. Multiple MT services, even proprietary ones, might be needed to cover all the languages. At the same time, our content and translations are important for training new open source MT engines.

4. Wikipedia follows a very traditional approach to typography and layout. The Language team had a limited webfont delivery feature aimed at the missing-font issue, but the code is old and has not received any updates in the last 3 years or so. The Language team planned to abandon that feature due to the maintenance burden, but that has not happened, and no team now owns it. Other than this, a few wikis use common.css hacks to customize their default fonts. The typography refresh attempt by the Reading team was for Latin script. Every script has its own characteristics: font size, preferred font family sequences, line heights, etc. Presenting knowledge in all these language wikis, in 2017 or later, needs serious thought about readability, typography and the general aesthetic of Wikipedia in a language compared to other websites in that language.
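To make the point in item 3 about needing multiple MT services more concrete, here is a minimal sketch of a router that picks a backend per language pair and falls back to the next one on failure. It is not how CX or any existing Wikimedia service works; `MTBackend`, `MTRouter`, the backend names and the fake translators are all hypothetical, and Python is used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# A translator takes (text, source language, target language) and returns text.
Translator = Callable[[str, str, str], str]


@dataclass
class MTBackend:
    """One MT service and the language pairs it claims to cover (hypothetical)."""

    name: str
    pairs: set[tuple[str, str]]
    translate: Translator


class MTRouter:
    """Tries backends in preference order for a given language pair."""

    def __init__(self, backends: list[MTBackend]) -> None:
        self._backends = backends

    def translate(self, text: str, source: str, target: str) -> tuple[str, str]:
        """Return (backend name, translation) from the first backend that works."""
        for backend in self._backends:
            if (source, target) not in backend.pairs:
                continue
            try:
                return backend.name, backend.translate(text, source, target)
            except Exception:
                continue  # this backend failed; try the next one
        raise LookupError(f"no MT backend available for {source} -> {target}")


def _fake(tag: str) -> Translator:
    """Stand-in translator that only labels the text, for demonstration."""
    def translate(text: str, source: str, target: str) -> str:
        return f"{tag}:{target}:{text}"
    return translate


def _broken(text: str, source: str, target: str) -> str:
    raise RuntimeError("backend unavailable")


router = MTRouter([
    MTBackend("open_engine", {("en", "es")}, _fake("open")),
    MTBackend("flaky_engine", {("en", "ml")}, _broken),
    MTBackend("external_service", {("en", "es"), ("en", "ml")}, _fake("ext")),
])
```

With this ordering, an English to Spanish request is served by the first (open) backend, while English to Malayalam falls through the failing backend to the external one. Listing open source engines first is one way to prefer them wherever they cover a pair.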