All Graphs broken on Wikimedia wikis (due to security issue T334895)
Open, Unbreak Now! | Public | BUG REPORT

Description

On April 19, 2023 it was identified that the Graph extension, which uses the older Vega 1 & Vega 2 libraries, had a number of security vulnerabilities.

In the interest of the security of our users, the Graph extension was disabled on Wikimedia wikis. WMF teams are working quickly on a plan to respond to these vulnerabilities.

We recommend that any third-party users of the Graph extension also disable it on their wikis.

A configuration change will suppress the exposed raw tags and graph JSON definition to avoid excess disruption to the end-user experience while the extension is disabled. [2] This also provides a tracking category, "Category:Pages with disabled graphs", showing the pages that used to contain graphs. Local administrators can localise the name of the category and its description by editing the [[MediaWiki:Graph-disabled-category]] and [[MediaWiki:Graph-disabled-category-desc]] interface messages on their local wiki.

On Wikimedia projects, graphs created via the extension will remain unavailable. This means that pages that were formerly displaying graphs will now display a small blank area. To help readers understand this situation, communities can now define a brief message that can be displayed to readers in place of each graph until this is resolved. That message can be defined on each wiki at [[MediaWiki:Graph-disabled]] by local administrators.

An example from the English Wikipedia:

Screenshot 2023-04-19 at 00.58.31.png (610×636 px, 69 KB)

ORIGINAL:
Steps to replicate the issue (include links if applicable):

What happens?:

image.png (453×1 px, 59 KB)

Or blank space.

What should have happened instead?:
Graphs should be shown.

Other information (browser name/version, screenshots, etc.):
I know graphs were disabled because of a security issue, but an open issue is also needed so that people understand what's going on.

April 21 update.

Event Timeline

Actually yeah, I am wrong in the comment above, T334940#8791797, regarding why the new tracking category was created. As there is no sensitive information there I will just quote a relevant part of @Lucas_Werkmeister_WMDE's original message from the private task:

@Tgr also suggested that an additional tracking category could be added, whose description page (*-desc message in WikimediaMessages) could link to the public Phabricator task or otherwise explain the broken graph situation in some more detail. (This would also allow tracking how many pages weren’t re-parsed yet since the last time the extension was disabled or enabled.)

@stjn. ^

I would recommend simply updating the Graph extension to the latest Vega 5, and making it use the slower, CSP-safe interpreter instead of the eval-based one.
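
For readers unfamiliar with the interpreter mode mentioned here, a minimal TypeScript sketch follows. It assumes the `ast: true` parse option and the `expr` view option as described in the `vega-interpreter` README. The `VegaModule`, `VegaView`, and `ViewOptions` interfaces and the `createCspSafeView` helper are hypothetical names written for this illustration; they are not part of Vega or of the Graph extension.

```typescript
// Hand-written structural types covering only what this sketch uses.
// They are assumptions for illustration, not Vega's own typings.
interface VegaView {
  runAsync(): Promise<unknown>;
}

interface ViewOptions {
  expr: unknown; // the expression interpreter function
  renderer: string;
  container?: string;
}

interface VegaModule {
  parse(spec: object, config: object | null, options: { ast: boolean }): object;
  View: new (runtime: object, options: ViewOptions) => VegaView;
}

// Build a view that evaluates expressions through the interpreter
// instead of through runtime code generation.
function createCspSafeView(
  vega: VegaModule,
  interpreter: unknown,
  spec: object,
  container: string,
): VegaView {
  // ast: true asks the parser to keep expression ASTs in the runtime
  // dataflow description rather than generated function source.
  const runtime = vega.parse(spec, null, { ast: true });
  // expr: interpreter makes the view walk those ASTs at evaluation time.
  return new vega.View(runtime, { expr: interpreter, renderer: "svg", container });
}
```

In a real page one would pass the imported `vega` module and the `expressionInterpreter` export of `vega-interpreter`, then call `runAsync()` on the returned view. Because the interpreter walks a pre-parsed AST instead of generating functions at runtime, it can run under a Content Security Policy that forbids `unsafe-eval`, at some cost in speed.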

The idea is good, but we will not be able to convert all graphs quickly.

Sure, but between the options of "no graphs and you can't do anything about it" and "broken graphs that anyone can fix themselves", the second seems far better.

Note that since the graph ext now has no backend component, it should be trivial to update just the javascript portion to the latest Vega version.

The idea is good, but we will not be able to convert all graphs quickly.

Perhaps something people can get busy with in Athens 😶‍🌫️

@Tgr also suggested that an additional tracking category could be added, whose description page (*-desc message in WikimediaMessages) could link to the public Phabricator task or otherwise explain the broken graph situation in some more detail. (This would also allow tracking how many pages weren’t re-parsed yet since the last time the extension was disabled or enabled.)

OK. Well, while Lucas's and Tgr's intentions seem good, this is the wrong approach to the situation for two reasons: people do not look at *-desc messages for tracking categories (I doubt anyone but tech people can even find those easily), and untranslated tracking category messages create red-link category entries in the wrong language everywhere until the category is created and the category page is translated. So if these were the only purposes, I think that my decision to translate this new category the same as the old one is still correct.

Dear Person Who Will Be Writing Postmortem for this ticket:

Please consider setting up a site like https://www.wikimediastatus.net/, but for functional features such as this one (and not network traffic).

Currently it usually works like this:

  • someone spots an issue
  • they go to "Wikipedia:Village_pump_(technical)" or a similar place on their wiki and report the problem
  • some technical Wikipedian starts digging and finds the Phabricator task that already exists
  • this technical Wikipedian spreads the knowledge on their wiki

If there were something like wikimediastatus.net, but only for important stuff*, it would make the life of technical users easier. (* - so nothing less than "This Is Very Important, all our alarms are blinking, phones of people in middle of night are calling, Team Leaders know that this is Real Shit" goes there)

That's a good idea. To avoid creating "yet another place" [0] where technical announcements can be made, maybe we should consider a page like https://meta.wikimedia.org/wiki/Tech/News but for live/running announcements? Or perhaps we should be more liberal with considering the use of CentralNotices.

[0] - https://xkcd.com/927/

Or perhaps we should be more liberal with considering the use of CentralNotices.

I thought about setting one up yesterday too, but decided that while this is a very big disruption, it is still a smaller one than something every single Wikimedian should be notified about. I am sure there might be similar situations where it might be a good idea though.

Tech VPs have just been MassMessaged.

I would like to say out loud here something that is important in my opinion. Personally, I learned of a few critical vulnerabilities in Vega 2 a month ago during the creation of ticket T332096, because I needed to check information about versions for that. But since all my previous experience was that nobody cares (e.g. there is no interest in tickets created after T296855, although there was an actual attack case; T182536 has been around for more than five years), I did not even consider creating a ticket until I had tried to reproduce the issues myself when I had free time for this. I would like to understand the position and priorities of the Wikimedia Foundation regarding attacks that are not merely theoretically possible but have actually occurred. I am stunned by the inconsistency between the complete lack of reaction then and such a sharp reaction now.

T182536 has been around for more than five years

Not all vulnerabilities are created equal. T182536 is about some things that are not best practice (in more industry speak, these are like the INFORMATIONAL findings in a pentest). Ideally it would be changed, but they aren't "vulnerabilities" in the sense of things that could actually be used in an attack. Not fixing it, in 5 years, should not be taken to mean nobody cares about security. The things mentioned on that bug are more like the difference between being paranoid vs ultra-paranoid.

[I do not work for the security team anymore. I did when T182536 was written. This is of course just my opinion as I cannot speak for anyone else]

Disabling the extension is an adequate decision. A long time ago I reported an issue in EasyTimeline; nobody disabled that extension and it took a few weeks to fix, but again, that was a long time ago. Today Wikipedia faces state-level attacks (guess the country). Just in February almost every Russian MediaWiki website was attacked (proclaimed as "euro-woodpecker", using wormable payloads with a mix of pre-1.37 stored XSS vulnerabilities and mcrundo to bypass captcha), so nobody wants to find out how much damage malicious actors can do in a few minutes.

Looking at the Graph extension, tbh, it looks like a nightmare in terms of security. Regardless of the method of fixing, I suggest wrapping charts in sandboxed iframes, since after T178356 all officially supported browsers support them.
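
To illustrate the sandboxing suggestion, here is a minimal TypeScript sketch that builds the markup for such a frame. The helper names (`escapeAttribute`, `buildSandboxedGraphFrame`) and the choice of `srcdoc` are assumptions made for illustration only; nothing here reflects how the Graph extension is or will be implemented.

```typescript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build iframe markup that renders `innerHtml` in a sandboxed frame.
// Hypothetical helper: the caller supplies the HTML that draws the chart.
function buildSandboxedGraphFrame(innerHtml: string, width: number, height: number): string {
  const w = Math.max(0, Math.floor(width));
  const h = Math.max(0, Math.floor(height));
  // "allow-scripts" lets the chart code run; leaving out "allow-same-origin"
  // keeps the framed document in an opaque origin, separate from the wiki.
  return (
    '<iframe sandbox="allow-scripts"' +
    ' srcdoc="' + escapeAttribute(innerHtml) + '"' +
    ' width="' + w + '" height="' + h + '"></iframe>'
  );
}
```

Omitting `allow-same-origin` is the important part: with `allow-scripts` alone the framed document runs in an opaque origin, so a malicious graph definition could not reach the wiki's cookies, storage, or DOM.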

Quick update on progress.

Over the last few days engineers have been exploring an approach that will add Vega 5 support to the Graph extension. The goal is to restore as much graph rendering as possible in the shortest timeframe. This aims to address the vulnerabilities found and, most importantly, to restore as much of the extension's previous state as possible. We will also be updating the D3.js library from version 3.5.17 to 7.8.4. The Vega 1 and Vega 2 libraries will be removed from the Graph extension.

In terms of expectations:

  • Initially the Graph extension with Vega 5 may only be supported on modern browsers (approximately 2017 or newer). This is due to some issues with ES5 builds and MediaWiki in the most recent versions of Vega. It is hoped this can be resolved with a build step to restore functionality to MediaWiki's full supported browser stack.
  • A compatibility layer that maps Vega 2 community graphs to Vega 5 aims to allow current Vega 2 syntax to work with Vega 5, but our expectation is that some <graph> syntax might need to be updated in some places.
  • We are building in more sustainable error-handling code, which will load and display an error thumbnail if a graph cannot be displayed. The purpose of this is to allow us to turn some graphs on and get a better sense of which graphs need to be prioritized. When graphs fail to render they will also send an error to our client logs so we can track them and fix them later.
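
The error handling described in the last bullet could look roughly like the following TypeScript sketch. The `renderWithFallback` helper, the `graph-error` class name, and the injected `render` and `logError` callbacks are all hypothetical; the actual extension code may be structured quite differently.

```typescript
// Outcome of one render attempt: either the chart markup or a placeholder.
type RenderResult =
  | { ok: true; html: string }
  | { ok: false; html: string; error: string };

// Try to render a graph. On failure, report the error through `logError`
// and return placeholder markup instead of letting the exception escape.
function renderWithFallback(
  spec: object,
  render: (spec: object) => string,
  logError: (message: string) => void,
): RenderResult {
  try {
    return { ok: true, html: render(spec) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Stand-in for sending the failure to client-side error logging.
    logError(message);
    return {
      ok: false,
      html: '<div class="graph-error">This graph could not be displayed.</div>',
      error: message,
    };
  }
}
```

Keeping the renderer and the logger as parameters means the fallback path can be exercised without a browser, and a failing graph only costs its own placeholder rather than breaking the rest of the page.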

Security will be reviewing the updated Vega 5 and D3.js libraries and the threat model associated with this approach, and release of the update is gated on a successful security review. We want to be as confident as possible that the approach is secure and correct.

We are assembling a small group of engineers from across a number of teams to see how much additional progress we can make on Graph between now and May 5, and will be working iteratively on our approach. We will continue to share updates along the way here on Phabricator. If we hit major blockers (e.g., security or library complexities we can't quickly resolve in the order of days or a couple weeks), we will be sure to share this information whilst we establish next steps. Our hope is that we can avoid difficult tradeoffs where we would need to keep Graph disabled for a long time, but we also need to acknowledge that this is a distinct possibility.

Thank you for the detailed update. A couple of questions, please.
After all this is done, are you going to publish the full explanation of the security breach, or could that still be dangerous?
Also, is the date you mentioned, May 5, the tentative date on which you want to allow use of the Graph extension again?

  1. Most WMF staff are off today and for the weekend, so I can't give an answer about what details we will release. It's not just Wikimedia who use this; there are also third-party users to consider. Ultimately, the release of details regarding this event is a decision for which we will depend on guidance from the security team. But that's a discussion for afterwards.
  2. Ideally (pending things like security reviews) I'm hoping we might have at least some coverage for graphs back in place next week, but we also want to allow a little extra time to ensure we can get back to as much functionality coverage as we had previously. The engineers working on this have been pulled from other areas of work, so the work on this can't be indefinite. May 5 is a timebox so that the engineers can return to their teams. If we get to May 5 and haven't solved the issues, we will review and decide on next steps.

After all this is done, are you going to publish the full explanation of the security breach

Note, the word "breach" has a special meaning in the security industry. While I am not privy to all the internal background chatter on this, I haven't seen any indication that the word applies to this situation. We should probably not describe this situation as such unless we get confirmation that it is an appropriate descriptor. [Just personal opinion based on public info]

Couldn't Graphs status be added to https://www.wikimediastatus.net?
Currently, it says "All Systems Operational".
It could say instead "Most Systems Operational", "Graphs extension disabled" or something.

A single MediaWiki extension is not a "system"... Could you please split this topic into a separate conversation? Thanks!

I see various reports on the Dutch Wikipedia about the pageview graphs missing on the action=info page. The error message doesn't show up there as the link is simply omitted; can something be done about that?

Is there an easy way to just generate the graphs locally as pictures with the data from Wikidata, upload them to Wikimedia Commons, and embed them again as normal pictures? Could this be automated? I am not that proficient in SPARQL.

In T334940#8802210, @C-Kobold wrote:

Is there an easy way to just generate the graphs locally as pictures with the data from Wikidata, upload them to Wikimedia Commons, and embed them again as normal pictures? Could this be automated? I am not that proficient in SPARQL.

Hi, the trouble with that idea is that the generated image might become outdated relative to the data in Wikidata. Moreover, Vega is supposed to give a better end-user experience (tooltips, different graphs, etc.). Apparently, it is a matter of days to get graphs back.

I see various reports on the Dutch Wikipedia about the pageview graphs missing on the action=info page. The error message doesn't show up there as the link is simply omitted; can something be done about that?

It will be fixed when graphs get reenabled.

I think the question was whether something can be done about that until graphs get re-enabled.

Couldn't Graphs status be added to https://www.wikimediastatus.net?
Currently, it says "All Systems Operational".
It could say instead "Most Systems Operational", "Graphs extension disabled" or something.

I've brought this comment to the attention of the cross-team SRE group driving the status page forward. Thanks for the suggestion!

In T334940#8802210, @C-Kobold wrote:

Is there an easy way to just generate the graphs locally as pictures with the data from Wikidata, upload them to Wikimedia Commons, and embed them again as normal pictures? Could this be automated? I am not that proficient in SPARQL.

Hi, the trouble with that idea is that the generated image might become outdated relative to the data in Wikidata. Moreover, Vega is supposed to give a better end-user experience (tooltips, different graphs, etc.). Apparently, it is a matter of days to get graphs back.

The graphs I generated with data from Wikidata don't change that often (only once per year: member count of German political parties); so we could also just make normal images until this issue is resolved. At the moment, in the German Wikipedia, ALL Wikipedia pages about the major German political parties (CDU, SPD, CSU, Bündnis 90/Die Grünen, FDP, Die LINKE) have broken diagrams that were supposed to show the number of members over the years.
This is quite disastrous!

There has been no mention above of the mapping features that Graph is used for. Will the fix restore the maps that are also not showing, which use 'Graph:maps with marks'? The 'en:OSM Location map' template has 5,500 maps, and the template is also translated to 44 other languages. Thanks for the work towards fixing things.

@RobinLeicester I don't think fixing those initially will be possible. The template/lua module will likely need migrating since it's atypical in its usage of vega but it might be feasible to patch something within the extension next week.

@RobinLeicester @C-Kobold please share a URL to the template (or provide the wikitext you are using and I'll make sure to look into these as part of T335325)

@RobinLeicester I can confirm that current status is that these don't work, but you can track progress with how they currently render on https://en.wikipedia.beta.wmflabs.org/wiki/Template:OSM_Location_map and https://en.wikipedia.beta.wmflabs.org/wiki/Template:Map_with_marks and T335325 would be the appropriate ticket to follow.

I added a subscriber: I.

I think this is also related to the Chinese strategy. Let's add him.

@RobinLeicester I can confirm that current status is that these don't work, but you can track progress with how they currently render on https://en.wikipedia.beta.wmflabs.org/wiki/Template:OSM_Location_map and https://en.wikipedia.beta.wmflabs.org/wiki/Template:Map_with_marks and T335325 would be the appropriate ticket to follow.

It is also related to T335048.

There has been no mention above of the mapping features that Graph is used for. Will the fix restore the maps that are also not showing, which use 'Graph:maps with marks'? The 'en:OSM Location map' template has 5,500 maps, and the template is also translated to 44 other languages. Thanks for the work towards fixing things.

In the meantime you can fix this template by moving it to https://www.mediawiki.org/wiki/Extension:Kartographer which does pretty much what your template did without using Graphs.

There is a good implementation of Kartographer at en:template:Maplink, but it can't show any of the text labels or other items that allow a properly customised map rather than just a view into the base map. I will have a think about whether a temporary base map would be better than an apology for the time being. Long-term, it would be a disaster for the maps with a lot of customisations.