Roan Kattouw

From devsummit
Revision as of 16:23, 8 January 2018 by Rkattouw (talk | contribs)
Tags Architecture, Contributors, Data Center
Primary Session Evolving the MediaWiki Architecture
Secondary Sessions Advancing the Contributor Experience

Users should not be punished for logging in

WMF wikis are slower for logged-in users than for anonymous users, which works against our efforts to get users to contribute. This is a long-standing problem that's hard to solve, but we should have a vision for how we're going to solve it.

WMF has caching data centers in strategic locations around the world (Amsterdam, San Francisco and soon Singapore), which make the wikis faster for users who are not near the primary data center (in Virginia) but are near a caching location. However, this only benefits anonymous users. For logged-in users, every page view contains their user name and other user-specific information in the personal tools area, so logged-in page views are considered uncacheable and are always routed to the primary data center.
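The routing rule described above can be modeled in a few lines. This is only a simplified sketch, not the actual cache configuration: the `Request` type, the `has_session_cookie` field (standing in for "this view carries user-specific information"), and the location strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

PRIMARY = "primary (Virginia)"


@dataclass
class Request:
    method: str               # HTTP method, e.g. "GET" or "POST"
    has_session_cookie: bool  # assumption: marks a user-specific view
    nearest_edge: str         # caching data center closest to the user


def route(request: Request, edge_cache_hit: bool = True) -> str:
    """Return the location that ends up serving the request."""
    # Writes (such as saving an edit) always go to the primary data center.
    if request.method != "GET":
        return PRIMARY
    # User-specific page views are treated as uncacheable.
    if request.has_session_cookie:
        return PRIMARY
    # Anonymous views are served from the edge when the page is cached there.
    if edge_cache_hit:
        return request.nearest_edge
    return PRIMARY


anon = Request("GET", False, "edge (Amsterdam)")
logged_in = Request("GET", True, "edge (Amsterdam)")
print(route(anon))
print(route(logged_in))
```

In this model the same user, at the same location, is moved from the nearby edge to the primary data center purely because the request became user-specific.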

This means that if a new user browses the site for a while, then creates an account because they want to contribute (or makes an anonymous edit), the wiki suddenly becomes slower for them. All users are affected, because uncached requests are slower to serve than cached ones, but users outside North and South America are affected the most, because their traffic now has to cross an ocean that it didn't have to cross before. It's not nice that a new user's 'reward' for creating an account is a slower experience, and it's especially not nice that users in emerging communities are affected the most. If we want to encourage readers to become contributors, slowing the site down as soon as someone contributes is not very helpful.

Some requests will always have to go to the primary data center, such as POST requests saving an edit, and those are always going to be slower for users outside North America. But for logged-in page views this isn't fundamentally necessary, and serving them from the edge caches would speed up the site for logged-in users and reduce the load on the app servers. There are different ways that this could be done, each with its own obstacles. For example, a single-page application for MediaWiki could use a content service to retrieve only the new page's contents when navigating, but this would require modifying or rewriting a lot of code in MediaWiki. Alternatively, Edge Side Includes (ESI) could be used to have Varnish inject cached page contents into a user-specific chrome, but that would require using advanced and partly unproven Varnish features. In both cases, we'd have to reimplement certain rendering preferences using CSS or a post-processing step. It's far from trivial, but let's start talking seriously about how we can address this problem.
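To make the ESI idea concrete, here is a minimal sketch of the composition step, written in Python rather than as Varnish configuration. It assumes the chrome is a small per-user document containing an `<esi:include>` tag, and that page contents are cached once and shared by all users. The `/content/` URL scheme and every function name here are hypothetical.

```python
import re

# Matches tags of the form <esi:include src="/content/Some_Title"/>.
ESI_INCLUDE = re.compile(r'<esi:include src="/content/([^"]+)"\s*/>')

fragment_cache: dict[str, str] = {}  # shared, user-independent page contents
render_count = 0                     # how often the expensive render ran


def render_content(title: str) -> str:
    """Stand-in for the expensive, user-independent page render."""
    global render_count
    render_count += 1
    return f"<main>Rendered text of {title}</main>"


def cached_content(title: str) -> str:
    """Return the shared fragment, rendering it only on a cache miss."""
    if title not in fragment_cache:
        fragment_cache[title] = render_content(title)
    return fragment_cache[title]


def user_chrome(username: str, title: str) -> str:
    """Cheap, user-specific wrapper that defers the contents to an include."""
    return (f"<nav>Logged in as {username}</nav>"
            f'<esi:include src="/content/{title}"/>')


def assemble(chrome: str) -> str:
    """Replace each include tag with the cached fragment it points to."""
    return ESI_INCLUDE.sub(lambda m: cached_content(m.group(1)), chrome)


page_a = assemble(user_chrome("Alice", "Main_Page"))
page_b = assemble(user_chrome("Bob", "Main_Page"))
print(page_a)
print(page_b)
print("content renders:", render_count)
```

Both users receive their own personal tools area, but the page contents are rendered only once. Anything that varies per user inside the contents themselves, such as rendering preferences, would not fit this model without the CSS or post-processing step mentioned above.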