Description
What is the issue with the HTML Standard?
XSLT v1.0, which all browsers adhere to, was standardized in 1999. In the meantime, XSLT has evolved to v2.0 and v3.0, adding features, and growing apart from the old version frozen into browsers. This lack of advancement, coupled with the rise of JavaScript libraries and frameworks that offer more flexible and powerful DOM manipulation, has led to a significant decline in the use of client-side XSLT. Its role within the web browser has been largely superseded by JavaScript-based technologies such as JSON+React. The underlying libraries that browsers use to process these transformations (e.g. libxslt in Chromium) are complex, aging C/C++ codebases. This type of code is notoriously susceptible to memory safety vulnerabilities like buffer overflows, which can lead to arbitrary code execution. Because client-side XSLT is now a niche, rarely-used feature, these libraries receive far less maintenance and security scrutiny than core JavaScript engines, yet they represent a direct, potent attack surface for processing untrusted web content. Indeed, XSLT is the source of several recent high-profile security exploits that continue to put browser users at risk.
For these reasons, I’d like to raise the question of whether we should deprecate and remove XSLT from the web standard. Doing so would directly reduce the browser's attack surface for all users, simplify the web platform, and allow engineering resources to be focused on securing the technologies that actually power the modern web, with no practical loss of capability for developers.
Just to be clear, the intention is not to deprecate the usage of XML (without XSLT) in other web platform APIs. Also, a side-note: much of XSLT isn't actually defined in the HTML/DOM standards. See whatwg/dom#181 for example. But we can potentially remove the few places that do mention it.
This question was raised recently in a WHATNOT meeting, but I’d like to have an issue where we can discuss and comment. Thoughts?
Activity
mfreed7 commented on Aug 2, 2025
To see how difficult it would be, I wrote a WASM-based polyfill that attempts to allow existing code to continue functioning, while not using native XSLT features from the browser. The polyfill is located on GitHub or as an npm package:
This contains a functional polyfilled replacement for the XSLTProcessor class, plus a utility function for an easy way to replace XML documents that use XSLT processing instructions. Of course, XSLT would also continue to be executable on the server side if needed.
mfreed7 commented on Aug 2, 2025
@smaug----, @annevk, @emilio, @rniwa for thoughts
emilio commented on Aug 2, 2025
Wrong one, right? But in general I think removing XSLT is worth trying, given the attack surface vs. usage trade-off.
bahrus commented on Aug 2, 2025
This other metric shows less of a decline, and 20x more usage than the metric linked to above. Just saying.
annevk commented on Aug 4, 2025
WebKit is cautiously supportive. We'd probably wait for one implementation to fully remove support, though if there's a known list of origins that participate in a reverse origin trial we could perhaps participate sooner.
It might also make sense to coordinate on a console warning?
keithamus commented on Aug 4, 2025
smaug---- commented on Aug 4, 2025
I think this is definitely worth trying. And yes, hopefully this time all the browsers could add a console warning sooner rather than later. (That might reduce the risk of repeating what happened with Mutation Events, where deprecation was discussed and more or less agreed on in 2011-12, but removal happened in 2025 ;) Only some browsers had the warning.)
fimion commented on Aug 4, 2025
Out of curiosity, what is the game plan for displaying RSS feeds pleasantly when removing this?
Oblomov commented on Aug 4, 2025
As a web developer and user, hard disagree on the proposal.
Just because something can be implemented in JS it doesn't mean that native solutions should be removed. One of the powers of XSLT support in browsers is that it can be used even with JS disabled, and in terms of security for the user, JS is a much bigger attack surface than XSL.
Both the antiquated XSLT version and the safety of the implementation can be addressed by adopting (and contributing to the development of) newer libraries written in safer languages, such as xrust.
quat1024 commented on Aug 5, 2025
Could Chrome ship a package like this instead of using native XSLT code, to address some of the security concerns? (I'm thinking about how Firefox renders PDFs without native code using PDF.js.)
mfreed7 commented on Aug 5, 2025
Gah - sorry. Too many polyfills. I've updated that comment with the correct link.
Awesome, thanks for the comment.
Great news, thanks. I understand the desire to wait for origin trials to expire or otherwise make sure sites aren't broken during the migration process. For Chrome at least, I doubt I can share the list, but I can definitely ask around to see if that's true.
I agree, great idea!
Great, thank you.
Makes sense to me!
This is a great point, and is likely the largest source of current usage. My thought had been that perhaps some version of the polyfill could be used to continue using XSLT to format a feed?
Understood, and thanks for the feedback!
I think the issue with XSLT isn't necessarily the size of the attack surface, it's the lack of attention and usage. I.e. nearly 100% of sites use JS, while 1/10000 of those use XSLT. So all of the engineering energy (rightfully) goes to JS, not XSLT.
Thanks for the pointer to xrust - I hadn't seen that one. Perhaps because it's quite recent. But we'll take a look.
This is definitely something we have been thinking about. However, our current feeling is that since the web has mostly moved on from XSLT, and there are external libraries that have kept current with XSLT 3.0, it would be better to remove 1.0 from browsers, rather than keep an old version around with even more wrappers around it.
daveajones commented on Aug 5, 2025
XSLT is extensively used by podcast hosting companies to beautify their raw feeds. Example:
https://feeds.buzzsprout.com/231452.rss
The Podcast Standards Project (@tomrossi7 @albertobeta @mijustin) would be a good source of engagement on how this would affect their products. I imagine they represent a large percentage of the overall public XSLT usage in the wild.
marypcbuk commented on Aug 5, 2025
Does the BBC still use XSLT to style RSS feeds as human-viewable? That was a major point in the extensive 2013 discussion on this exact same suggestion.
https://feeds.bbci.co.uk/news/rss.xml
EDIT: adding a link to the 2013 discussion, which was still getting pro-XSLT comments after 5 years
https://groups.google.com/a/chromium.org/g/blink-dev/c/zIg2KC7PyH0/m/Ho1tm5mo7qAJ
zcorpan commented on Aug 19, 2025
I analyzed usage of XSLT in httparchive (desktop pages that trigger the XSLProcessingInstruction use counter). This set will only pick up RSS randomly (because the non-root page is one of the linked pages from the root page, and only one non-root page per site is collected).
The number of matches is 357, out of ~23,000,000 pages (~0.001%).
There are a few pages that would be broken by removing XSLT (either some content in an iframe or the entire page or site). Some are education portals or application portals. Then there are a bunch of sitemaps, feeds, or API endpoints where the transform doesn't seem essential. From the pages I manually looked at and assessed impact, about 29% are primarily for humans and would have essential or main functionality broken if XSLT was removed (~104 of the 357 when extrapolating).
https://docs.google.com/spreadsheets/d/1ihvzeRpIcojGxXZ8rj4UQR0M3KY_c-mkkfn3C6Gkptg/edit?usp=sharing
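As a quick sanity check on the figures in the analysis above, the arithmetic can be reproduced in a few lines. This is only a sketch: the three inputs are the numbers quoted in the comment (the page total is itself approximate), and the variable names are illustrative.

```python
# Figures quoted in the comment above; the page total is approximate.
matches = 357                # pages that triggered the XSLT PI use counter
total_pages = 23_000_000     # approximate number of desktop pages crawled
essential_fraction = 0.29    # share of reviewed pages judged to break badly

# Share of crawled pages that use an XSLT processing instruction, in percent.
share_percent = matches / total_pages * 100

# Extrapolated count of pages whose main functionality would break.
estimated_broken = round(matches * essential_fraction)

print(f"Pages using an XSLT PI: {share_percent:.4f}% of the crawl")
print(f"Estimated pages with essential breakage: {estimated_broken}")
```

The share comes out near 0.0016%, the same order of magnitude as the ~0.001% quoted, and the extrapolation matches the ~104 figure.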
zcorpan commented on Aug 19, 2025
@mfreed7
While doing my analysis (see above), I noticed that several of the feeds with XSLT have a helpful message at the top that explains what a feed is and what the user is expected to do with it (copy the URL into their feed reader or podcast player). This seems like something browsers could do (in place of the "this XML has no stylesheet" message for RSS and Atom).
mfreed7 commented on Aug 20, 2025
Wow awesome analysis, thank you. I'm taking some time to look through that list. I've already found a few cases where my browser extension fails, so I'll get those fixed.
+1 to this idea. The only question is what the helpful message would be. I was thinking that the browser could detect an XML page that has an XSLT PI, and direct the user (from the XML source viewer) to a browser extension. Better ideas appreciated.
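To make the "detect an XML page that has an XSLT PI" idea concrete, here is a minimal sketch of that check using Python's standard library. It is not how any browser implements it. The function names, the sample feed, and the set of accepted `type` values are assumptions made for illustration.

```python
import re
from typing import Optional
from xml.dom import minidom
from xml.dom.minidom import Node

# Assumption: treat these pseudo-attribute "type" values as XSLT.
# Real browsers may accept a different set.
XSLT_TYPES = {"text/xsl", "application/xslt+xml"}

# Pseudo-attributes inside a PI look like: name="value" or name='value'
PSEUDO_ATTR = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(data: str) -> dict:
    """Split the PI data string into a dict of pseudo-attributes."""
    return {name: dq or sq for name, dq, sq in PSEUDO_ATTR.findall(data)}


def find_xslt_stylesheet(xml_text: str) -> Optional[str]:
    """Return the href of the first XSLT xml-stylesheet PI in the prolog."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if node.nodeType == Node.ELEMENT_NODE:
            break  # only PIs before the root element are considered
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = parse_pseudo_attributes(node.data)
            if attrs.get("type", "").lower() in XSLT_TYPES:
                return attrs.get("href")
    return None


# Hypothetical feed used only to exercise the function.
feed = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="feed.xsl"?>
<rss version="2.0"><channel><title>Example</title></channel></rss>"""

print(find_xslt_stylesheet(feed))
```

A CSS `xml-stylesheet` PI (`type="text/css"`) is deliberately ignored, since only the XSLT case would need the fallback message.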
P.S. For the other folks on this thread: while I am not the one who hid comments and closed this issue to non-collaborators, I completely agree with those actions. The ad hominem attacks here and elsewhere (e.g. this or this) violate the code of conduct and obscure the technical discussion. I'm just one engineer and I don't have unlimited budget; I'm just trying to do my job. I care a lot about the health of the overall web. I do want to find solutions to real problems, and I want to minimize the pain folks are feeling about this discussion of XSLT removal. But it's important to remember that ordinary users that fall victim to security vulnerabilities also feel pain, and I'm trying to minimize that too. I proposed some solutions to the concrete use cases I heard in this issue. If there are still gaps, I'd like to work on closing them. It's too bad we can't have that discussion here - I'm guessing we can't re-open this issue for outside comments, due to the overall tone of past comments. Either way, from now on, I'll only be responding to technical conversations, and ignoring the rest, for my own sanity.
jakearchibald commented on Aug 30, 2025
@mfreed7 your polyfill requires putting a script in the XML markup. Is this likely to break feed readers that are not expecting that element?
I wonder if we could add a processing instruction to execute script, so it doesn't otherwise appear in the document.
mfreed7 commented on Aug 30, 2025
Interesting question. I tried it with a handful of RSS readers and none of them reported any issues with a modified feed containing the polyfill <script> element. It seems like a <script> hanging right off the root is the least likely to cause problems, since it won't interfere with most XPath queries. But if there's a better place to put the script so that it's less likely to cause reader issues, that'd be good to know.
This is an option, of course. But if just tossing in the <script> doesn't seem to cause issues, then perhaps not necessary?
jakearchibald commented on Aug 30, 2025
Fair. I just wonder if it'd close the feature gap. But if it's not a problem, it doesn't need a solution.
I guess we wouldn't want a new way to execute script, but the xslt processing instruction already can, so that might be a "way in"
zcorpan commented on Sep 1, 2025
I think we shouldn't add a new way to execute scripts. The main benefit would be theoretical purity of avoiding an HTML element in an XML document using a vocabulary that doesn't say you can use HTML elements.
jakearchibald commented on Sep 2, 2025
Yeah, if anything, I was thinking <?xml-stylesheet type="text/javascript" href="transform.js"?>, since xml-stylesheet already results in script execution. But I agree, public feed readers tend to be non-validating, so it's ok to just add a <script>.
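The earlier point that a <script> hanging off the root "won't interfere with most XPath queries" can be illustrated with a small sketch. The feed below is made up, and the placement, namespace, and `polyfill.js` filename of the script element are assumptions rather than the polyfill's documented markup. ElementTree also supports only a subset of XPath, which is enough for the kind of path a simple feed reader might use.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS feed with an XHTML <script> added directly under the root.
feed = """<rss version="2.0">
  <script xmlns="http://www.w3.org/1999/xhtml" src="polyfill.js"></script>
  <channel>
    <title>Example feed</title>
    <item><title>First</title></item>
    <item><title>Second</title></item>
  </channel>
</rss>"""

root = ET.fromstring(feed)

# A name-based path query, as a typical reader might use, skips the script.
titles = [item.findtext("title") for item in root.findall("./channel/item")]
print(titles)

# A position-based lookup is the kind of query that would be affected:
# the first child of the root is now the script element, not <channel>.
first_child = root[0].tag
print(first_child)
```

Name-based paths such as `./channel/item` are unaffected, while a reader that assumes the root's first child is `<channel>` would see the script element instead, which is presumably why the claim is "most" rather than "all" queries.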