Synchronization

From devsummit

3 statements.

Author: Mark Hershberger
Tags: Offline Editing, Synchronization, Third Parties
Primary Session: Advancing the Contributor Experience
Secondary Sessions: Supporting Third-Party Use of MediaWiki
Position Statement:

When Marshall McLuhan said "The medium is the message", he was saying that how the message is understood is affected by what is used to present that message.

MediaWiki is a fundamental part of the medium used to present Wikimedia's work (the "message"). Because the medium is an integral part of the message, it requires comparable attention to its availability and accessibility.

For example, effort is made to ensure that people in remote areas have access to selected content through Kiwix, but a very limited effort has been made to incorporate their knowledge into the "sum of all knowledge."

While there are efforts underway that include copying edits into Wikipedia by hand, it should be possible to provide people in remote areas with an editable copy of Wikipedia so that their edits could be incorporated with less intervention.

Improvements in the installation and resource consumption of a simple MediaWiki installation could be made without sacrificing the current PHP-based application, such that someone could, for example, run a current MediaWiki installation on an unrooted Android phone. Work could then be done to automate the synchronization of that MediaWiki with the current Wikipedia content.
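As a rough illustration of what the "pull" half of such automation might start from, the sketch below builds a request to the MediaWiki Action API's list=recentchanges module, asking which main-namespace pages changed after the offline copy last synchronized. The helper name changes_since_url and the idea of storing a last-sync timestamp are assumptions made for this example, not a description of any existing tool; fetching the changed pages and pushing local edits back are left out.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"  # public Action API endpoint

def changes_since_url(last_sync: str, limit: int = 50) -> str:
    """Build a recentchanges query for main-namespace changes after last_sync.

    last_sync is a MediaWiki timestamp such as '20170926000000'.
    Hypothetical helper: a real client would also follow 'continue' tokens.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcstart": last_sync,   # begin at the last sync point...
        "rcdir": "newer",       # ...and walk forward in time
        "rcnamespace": 0,       # articles only
        "rctype": "edit|new",   # page edits and page creations
        "rcprop": "title|ids|sizes|timestamp",
        "rclimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)

url = changes_since_url("20170926000000")
print(url)
```

The request only lists what changed; an offline installation would still need to download the new revisions and decide what to do with pages that were also edited locally.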

This work on MediaWiki could, of course, be used by people other than the WMF who use the tool, which could create a virtuous cycle that would benefit the Foundation.

In fact, deeply incorporating McLuhan's thinking into WMF culture would mean that, while Wikipedia would remain the most visible product of the Foundation, there would be more room to focus on expanding MediaWiki's capabilities beyond what fits into the current focus on GLAM efforts, the website, etc.

Most of the world does not use Wikipedia every day, but many people use something they've learned as a result of reading from or contributing to Wikipedia every day. Making it easier for people to deploy MediaWiki where the potential users do not have the resources of the WMF (for example, in a place that doesn't have a stable Internet connection) could encourage more people to actively embrace Wikimedia's vision of freely sharing knowledge.

Author: Daren Welsh
Tags: Contributors, Mobile, Offline Editing, Synchronization
Primary Session: Advancing the Contributor Experience
Position Statement:

What technologies are necessary for embracing mobility? How and with whom should we partner to create the technologies needed to support the mission?

Multilateral, asynchronous, bidirectional synchronization of wikis: How astronauts taught the world to wiki on the go

The world is certainly transitioning its internet usage from the desk to the mobile device. Let's not limit our focus to mobile devices that always have an internet connection. Let's talk about the millions of travelers stuck in a moving vehicle with nothing better to do than look out their window. I'm talking about passengers on planes, trains, and automobiles. The easiest targets here are the millions of people who fly. As passengers onboard a plane for hours, we're lucky if we have an in-flight movie system. But what if the plane offered a local intranet with a copy of Wikipedia? What if the airlines gave promotions to those who contributed? What if each flight competed with other flights for most contributions? The same approach could be applied to passenger trains, buses, subways, and ferries.

The main limitation here is a technical one. If you have thousands of Wikipedia clones buzzing around, each collecting contributions during their offline time, how do you reconcile the changes with the master database? While tools like Kiwix already offer an offline copy of Wikipedia, there is much work needed to support thousands of wiki clones reconciling changes every few hours. This will require revolutionary branch management and revision conflict handling.
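To make the reconciliation problem concrete, here is a minimal sketch of the simplest possible policy: an offline edit is accepted only if the master copy of the page is still at the revision the clone started from, and everything else is set aside as a conflict. The names OfflineEdit and reconcile and the integer revision numbers are assumptions for illustration; this is not how MediaWiki or Kiwix handle it, and a workable system would try a three-way text merge before declaring a conflict.

```python
from dataclasses import dataclass

@dataclass
class OfflineEdit:
    page: str
    base_rev: int   # revision the clone held when the edit was made
    new_text: str

def reconcile(master_revs: dict[str, int], edits: list[OfflineEdit]):
    """Split offline edits into cleanly applicable ones and conflicts.

    master_revs maps page title -> current revision on the master wiki.
    Edits are processed in order; accepting one advances that page's
    revision, so a later edit from the same base becomes a conflict.
    """
    current = dict(master_revs)  # work on a copy, leave the input untouched
    next_rev = max(current.values(), default=0) + 1
    clean, conflicts = [], []
    for e in edits:
        if current.get(e.page, 0) == e.base_rev:
            current[e.page] = next_rev
            next_rev += 1
            clean.append(e)
        else:
            conflicts.append(e)
    return clean, conflicts

master = {"Moon": 10, "Mars": 7}
edits = [
    OfflineEdit("Moon", 10, "edit from flight A"),
    OfflineEdit("Moon", 10, "edit from flight B"),  # same base, arrives second
    OfflineEdit("Mars", 6, "edit from a stale clone"),
]
clean, conflicts = reconcile(master, edits)
print(len(clean), len(conflicts))
```

With thousands of clones, most of the hard work lies in shrinking the conflict pile automatically, which this sketch does not attempt.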

But if you pull this off, it might kick off the biggest surge in user participation in years.

With whom should you partner to accomplish this? Why not start with NASA? They use MediaWiki to train astronauts and plan for spacewalks. Begin this development by running wiki servers onboard the International Space Station. Get astronauts to contribute to the same wiki used for their training while they are putting all that knowledge to use. Once the NASA wiki synchronization between the ground and the ISS is working, expand this model to Wikipedia. Yes, have a clone of Wikipedia onboard the ISS. Astronauts love to share their experience, their story, and their photos from their 6-month stays aboard the station. These lucky few represent countries from around the world and they have a huge influence on the rest of us on the ground. Once people see astronauts contributing to Wikipedia during their journey, they will want to join the movement on their travels (albeit aboard slightly less cool vehicles).

Author: Brian Wolff
Tags: Censorship, Offline Editing, Synchronization
Primary Session: Advancing the Contributor Experience
Position Statement:

Wikimedia should diversify its distribution methods.

Currently Wikimedia distributes its content almost exclusively using the Internet. However, the Internet is controlled by gatekeepers in the form of governments and ISPs. While historically these entities rarely controlled the flow of information, more recently we have seen an increase in censorship, particularly by governments. Since Wikimedia is distributed almost entirely over the Internet, we are vulnerable to their whims.

The risk of having our distribution lines interfered with is an existential threat to our mission. While at present only a few geographic locations practise such interference, the future is unknowable and does not appear to be heading in a comforting direction.

Furthermore, in the face of such interference, there is very little we can do. TOR is often spoken of as a solution to censorship, but any such on-Internet system will either have to be obscure or rely on secret information (e.g. TOR bridges) to avoid blocking, and thus cannot be used by the public at large. The most effective solution to censorship so far seems to be political pressure, combined with bundling to make censorship decisions as broad as possible. When much content is bundled together, such as entire domains with TLS, or GitHub and the New York Times[1], it can reduce censorship if there is political will to censor a specific part, but not the whole thing. However, political opinion is fickle, and cannot be relied upon.

Thus, we should reduce this risk by diversifying how we distribute our content. Multiple distribution routes means no single point of failure. I see two ways of doing this:

First, by expanding offline versions of Wikimedia. Kiwix already provides an offline version of Wikimedia sites. We need to expand this capability to allow for better updating. Offline apps should be able to update their contents efficiently when users only have intermittent access to the open Internet. More importantly, offline apps should be able to update in a P2P fashion with other apps. In a community with limited access to the open Internet, a single person with an up-to-date version of Wikipedia should be able to easily synchronise their app with other people's apps to spread the knowledge. This could be especially helpful in a scenario where a small number of people have access via methods such as TOR, but such methods are too burdensome for most people.
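The peer-to-peer update step could be as simple as two apps comparing what they hold and each keeping the newer copy of every page. The sketch below assumes each app stores pages as title -> (revision id, text), that revision ids come from the upstream wiki so a higher id always means a newer version, and that the offline copies are read-only; the Store type and sync_peers function are hypothetical names, and transport, compression, and authenticity checks are ignored.

```python
Store = dict[str, tuple[int, str]]  # title -> (upstream revision id, text)

def sync_peers(a: Store, b: Store) -> None:
    """Bring two offline stores to the same state, newest revision wins."""
    # The union of titles is materialised as a new set, so updating
    # both dicts inside the loop is safe.
    for title in a.keys() | b.keys():
        copy_a = a.get(title, (0, ""))  # revision 0 stands for "not held"
        copy_b = b.get(title, (0, ""))
        newest = max(copy_a, copy_b, key=lambda c: c[0])
        a[title] = b[title] = newest

alice = {"Moon": (12, "fresh text"), "Mars": (5, "old text")}
bob = {"Mars": (8, "newer text"), "Sun": (3, "sun text")}
sync_peers(alice, bob)
print(alice == bob)
```

One person with a recent copy therefore updates everyone they meet, and those peers can pass the same revisions on in turn.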

Second, we could experiment with broadcasting recent edits widely. Broadcasting HTML versions of all main namespace pages recently edited on English Wikipedia would only require about 12 KBps [2]. This is not a huge amount of bandwidth. During the Cold War it was common to broadcast propaganda using shortwave radio, which could be listened to across the world. Perhaps we could broadcast everything that is edited across the world in a similar fashion, allowing users to stay up to date regardless of their connectivity. This could be combined with the P2P app, so a few power users could listen in to the RC stream, and then spread the data among their communities.
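To put the 12 KBps figure in perspective, the short calculation below converts it into a daily volume and a line rate. It assumes "KB" means 1024 bytes, which matches the division by 1024 in the query behind footnote [2]; the variable names are only for this example.

```python
RATE_KIB_PER_S = 12            # figure quoted in the text
SECONDS_PER_DAY = 24 * 3600

bytes_per_s = RATE_KIB_PER_S * 1024
bytes_per_day = bytes_per_s * SECONDS_PER_DAY
gib_per_day = bytes_per_day / 1024**3
kbit_per_s = bytes_per_s * 8 / 1000

print(f"{bytes_per_day} bytes/day, about {gib_per_day:.2f} GiB/day")
print(f"about {kbit_per_s:.1f} kbit/s sustained")
```

That is roughly one gigabyte a day, or a sustained rate just under 100 kbit/s, before any protocol or error-correction overhead a real broadcast would add.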

[1] https://en.wikipedia.org/wiki/Censorship_of_GitHub#DDoS_attack

[2] Based on a very rough experiment, ?action=render of a Wikipedia page roughly gzips to the size of the raw wikitext. From there the 12 KBps number is based on the enwiki result of:

SELECT SUM(l) / (1024 * 3600 * 24)
FROM (
    SELECT MAX(rc_new_len) AS l
    FROM recentchanges
    WHERE rc_namespace = 0
      AND rc_timestamp BETWEEN '20170926000000' AND '20170927000000'
      AND rc_type <= 1
    GROUP BY rc_cur_id
) t;