Have you seen Tony Chen's (development lead for Xbox One security) description of the Xbox 360 reset glitch hack at [1] and the effect this (and other console exploits of that time) had on Xbox One development?
This is one of my go-to case study videos for the development effort required to architect a computer to resist attackers who have physical access.
Congratulations! I haven't had a reason to mess with it myself, but I've heard it described online as the most secure piece of consumer hardware before or since.
I think you might be mixing up the Xbox 360 with the Xbox One. The former was ultimately compromised in several ways, but the latter's security has held up extremely well for 11 years and counting. The Xbox One and its successor are easily the most secure consoles ever made.
The Xbox 360 was overall a very, very secure device. While we don't know exactly how the folks who discovered the hypervisor syscall handler bug were able to get plaintext, it's theorized that it came from development kit and SDK leaks. With an SDK and dev kit someone could dump boot loaders and the HV.
Otherwise on a retail console you can't do much. The hard drives are not encrypted but all content that can possibly contain code / save data is signed. Save data cannot contain code but introduces scripting engine / save parsing attack surface, but you can't modify it without first dumping keys from a retail console.
To dump keys from a retail console you have to get code exec in the hypervisor. To attack the hypervisor you have to be able to dump the hypervisor to audit it.
To dump the hypervisor you have to be able to read its contents or dump it from flash. The flash is encrypted with a per-console key (and I don't think you can sniff the bus?) and RAM is encrypted.
Realistically, if it weren't for the original syscall handler bug and dev kits getting into researchers' hands, the Xbox 360 may never have been hacked.
Stupid question: is the reason people cannot simply dump the ROM, as they do with, say, routers, that the ROM is encrypted? But if they have the SDK they can decrypt it?
What are the reasons for why Microsoft wanted to lock down consoles to only run signed code? As a games console manufacturer, what are the business reasons for doing so? Thanks
Limiting piracy is the ongoing reason, but there is also the historical reason of the Video game crash of 1983 which led to Nintendo's Seal of Quality.
Essentially as the platform owner, you want to ensure games sold for the platform "just work", and if you have a bunch of third parties running bad software, consumers would lose faith in the platform altogether.
I think it's also worth pointing out that the console makers (and developers) pour a lot more resources into ensuring that the products released for their platform are of a suitable quality than, say, phone app store gatekeepers.
A big draw as well is that people can't (within the economic viability timeframe of the games/console) hack the games on a console, meaning you get a much more predictable online experience than you might on PC.
To add a little more color to this, it wasn't solely to ensure games worked. The lesson of the video game crash was that third party publishers would make knock-off games similar to popular titles and flood the market with them at much lower cost - sometimes as low as $5 vs. $40 for a top title. These games were generally low budget and rushed to market to capitalize on looking like a top-selling title - while being just different enough to (hopefully) avoid trademark infringement.
These games usually "worked" (as in booting up and playing); the issue was more that they were just bad versions of the title they were ripping off, due to having little development time and minimal play testing along with poorer artwork and fewer levels (thus saving ROM memory). The flood of cheap, bad versions of more popular games is credited as the main factor that killed the Atari VCS.
Another big factor was that later console manufacturers charged game publishers a license fee for the proprietary library code required for a console to run a game. This fee could allow manufacturers to sell game consoles at cost or even below cost and recoup the lost profit over time in the per game license fee.
This wasn't always the case in the early days of hardware cartridge systems. Initially, some early console manufacturers didn't charge much more than a game publisher could buy blank cartridges for from a third party. Some other manufacturers chose to generate revenue simply by building more margin into the wholesale price they charged game publishers for blank cartridges. Of course, when console manufacturers started increasing their cartridge profit margin, game publishers were motivated to use third party cartridges - which led to console makers deploying "genuine hardware" checks or, later, disc checks and encryption. Nintendo popularized enforcing their business model both technically and legally (by requiring an IP license). Today, console manufacturer business models rely on 1) Collecting per game license fees, 2) Blocking piracy, 3) Limiting game supply.
Microsoft actually reversed course on this. You can make a one-time purchase to access "developer mode" and then run whatever you want. It's been suggested that this is the reason there's been less interest in hacking the Xbox. Ironically it also means you have more computational freedom on the Xbox than on the iPhone/iPad.
They sell the consoles at a loss, so if you could port your own games to the consoles instead of buying the games that they could take a royalty from then they lose money. It doesn't have to be an effective circumvention to trigger the DMCA making it illegal.
A games console provided a platform where they could more effectively argue that “their” works “““needed””” to be protected so they could farm us (people who want to run their own code on hardware they purchased) for digital-jail technologies which would never otherwise have reason to exist. Then those technologies can metastasize fully-formed over to general-purpose computing in a way that's harder to argue against. They learned with Clipper and Palladium that trying to develop jail tech on PC would be vehemently opposed.
The opposition was pointless though, like everything has TPM/IME/etc. nowadays so we lost that war a while ago. I don't see how consoles helped them win that war though.
One challenge was that while I started working on the Xbox 360 about three years before it would ship, we knew that the custom CPU would not be available until early 2005 (first chips arrived in early February). And there was only supposed to be one hardware spin before final release.
So I had no real hardware to test any of the software I was writing, and no other chips (like the Apple G5 we used as alpha kits) had the custom security hardware or boot sequence like the custom chip would have. But I still needed to provide the first stage boot loader which is stored in ROM inside the CPU weeks before first manufacture.
I ended up writing a simulator of the CPU (instruction level), to make progress on writing the boot code. Obviously my boot code and hypervisor would run perfectly on my simulator since I wrote both!
But IBM also had a hardware accelerated cycle-accurate simulator that I got to use. I was required to boot the entire Xbox 360 kernel in their simulator before I could release the boot ROM. What takes a few seconds on hardware to boot took over 3 hours in simulation. The POST codes would be displayed every so often to let me know that progress was still being made.
The first CPU arrived on a Friday. By Saturday the electrical engineers had flown to Austin to help get the chip on the motherboard and make sure the FSB and other busses were all working. I arrived on Monday evening with a laptop containing the source code to the kernel, and on Tuesday I compiled and flashed various versions, working through the typical bring-up issues. By Wednesday afternoon the kernel was running Quake, including sound output and controller input.
Three years of preparation to make my contribution to hardware bring-up as short as possible, since I would bottleneck everyone else in the development team until the CPU booted the kernel.
Eric Mejdric from IBM called on Friday and said we have the chips, when are you guys getting here?
I took a red eye that night and got to Austin on Saturday morning.
We brought up the board, the IBM debugger, and then got stuck.
I remember calling you on Sunday morning. You had just got a big screen TV for the Super Bowl and had people over, and in between hosting them you dropped us new bits to make progress.
I think Tracy came on Sunday or Monday and with you got the Kernel booted.
OMG Harjit! I saw you in the documentary! You and the entire team are total rockstars! I just cannot fathom ever being in a position to design something that provided so much joy and happy core memories to countless people around the world...you guys did it!
Just the thought of how many people you touched with your work....just amazing! :)
Had a question if you don't mind: Can you talk about the thought process behind the power supply design? It's very large even in the super slim models. Were you following a specific design driven by the hardware architecture or were there other reasons? I always wondered about that.
Actually Tracy never made it to Austin. He was going to fly in later in the week to continue bring-up, but since we were done by Wednesday, we just sent the chips to Redmond and he continued there. He was of course always available on the phone to answer the kernel questions I had.
This is really some blast from the past. Can you please shed more light on the simulator? Is it interpretation or JIT? But then I realize XBOX uses Pentium III, so maybe virtualization? Edit: Sorry, it was XBOX 360 so it's not Pentium.
As someone who recently got interested in emulation and wrote two lc-3 emulators, would really love to learn from the masters.
You sir are an inspiration! I am but a mediocre Angular developer and reading this has me in complete awe! The kind of drive you must have had to get this done... well, I don't know how people manage to do it, but it is so cool to see! :)
I called the simulator Sbox and it was just a simple console app. I didn't implement the GPU, so no graphics, just the hypervisor and kernel and some simple non-graphics apps. I made it so that you could build the Xbox 360 kernel on your Windows machine, then just run sbox.exe and it would automatically find the just-built kernel image targeting the PPC64 and boot it. Then if you typed control-C it would drop into the kernel debugger as a sub process, and you could poke around at the machine state as if it were the real Xbox hardware, showing all the PPC instructions and registers. It was a lot of fun writing it, and quite useful.
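On the interpretation-vs-JIT question: the comment above doesn't say how Sbox worked internally, so the following is only a minimal sketch of the simplest kind of instruction-level simulator, a plain fetch/decode/execute interpreter loop. The four-instruction toy ISA, the tuple encoding, and the `run` function are all made up for illustration; none of it is PPC or taken from Sbox.

```python
def run(program, regs=None, max_steps=10_000):
    """Interpret a toy ISA where each instruction is a tuple like ("addi", rd, rs, imm)."""
    regs = dict(regs or {})
    pc = 0          # index of the next instruction to execute
    steps = 0
    while pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit hit (possible infinite loop)")
        steps += 1
        op, *args = program[pc]                    # fetch + decode
        if op == "li":                             # load immediate
            rd, imm = args
            regs[rd] = imm
            pc += 1
        elif op == "addi":                         # add immediate, 32-bit wraparound
            rd, rs, imm = args
            regs[rd] = (regs.get(rs, 0) + imm) & 0xFFFFFFFF
            pc += 1
        elif op == "bne":                          # branch if registers differ
            ra, rb, target = args
            pc = target if regs.get(ra, 0) != regs.get(rb, 0) else pc + 1
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


# Count r1 up to 5 using a backwards branch.
program = [
    ("li", "r1", 0),
    ("li", "r2", 5),
    ("addi", "r1", "r1", 1),
    ("bne", "r1", "r2", 2),
    ("halt",),
]
final = run(program)
```

Because the whole machine state is just the `regs` dict and `pc`, stopping at any point and inspecting it is trivial, which is roughly the property described above with the drop into the kernel debugger.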
You should also talk about the lwarx/stwcx bug. IIRC - in the first version of the chip there was a bug in one or both of these instructions. Your code booted on SBox but didn't on the hardware. You compared the two and then figured out it was these instructions.
You filed a bug report and then dug into them and used SBox to figure out what must have been going wrong.
The chip supplier came back with a workaround and within five minutes you simulated it on SBox and said it wouldn't work, why, and then said how it should be fixed.
The supplier didn't believe you as yet. And you worked out a workaround so we could be unblocked. Two weeks later they agreed with your fix...
I recall an issue when trying to use lwarx/stwcx on Xbox 360 directly that the compiler (or maybe even the kernel, on program load? it's been a while) raised an error and said to use the Interlocked intrinsics instead -- is that related?
So the PPC instruction set uses lwarx (load word and reserve indexed), and stwcx (store word conditional indexed), along with variations for word size, to implement atomic operations such as interlocked-increment and test-and-set.
So on PPC interlocked-increment is implemented as:
    loop:   lwarx   r4,0,r3     # Load and reserve r4 <- (r3)
            addi    r4,r4,1     # Increment the value
            stwcx.  r4,0,r3     # Store the incremented value if still reserved
            bne-    loop        # Loop and try again if lost reservation
The idea is that the lwarx places a reservation on an address that it wants to update at some later time. It doesn't prevent any other thread or processor from reading or writing to that address, or cause any sort of stall, but if an address being reserved is written to, conditional or otherwise, then the reservation is lost. The stwcx instruction will perform the store to memory and clear the NE flag if the reservation still exists; otherwise it doesn't do the write and sets the NE flag, and software should just try again until it succeeds.
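To make the reservation rules concrete, here is a small software model of them. It is a sketch only: the `ReservationMemory` class, its per-CPU dictionary, and the exact-address matching are assumptions made for illustration (real hardware tracks reservations at a coarser granule), and it models the intended behavior, not the hardware bug discussed in this thread.

```python
class ReservationMemory:
    """Toy model of load-reserve / store-conditional semantics."""

    def __init__(self):
        self.mem = {}
        self.reserved = {}  # cpu id -> address that cpu has reserved

    def lwarx(self, cpu, addr):
        """Load a value and place this cpu's reservation on addr."""
        self.reserved[cpu] = addr
        return self.mem.get(addr, 0)

    def store(self, addr, value):
        """Plain store: any reservation on addr is lost."""
        self.mem[addr] = value
        for cpu, reserved_addr in list(self.reserved.items()):
            if reserved_addr == addr:
                del self.reserved[cpu]

    def stwcx(self, cpu, addr, value):
        """Conditional store: succeeds only if cpu still holds the reservation."""
        if self.reserved.get(cpu) != addr:
            self.reserved.pop(cpu, None)
            return False
        self.store(addr, value)  # also clears every reservation on addr
        return True


def interlocked_increment(m, cpu, addr, interfere=None):
    """Retry loop mirroring the lwarx / addi / stwcx. / bne- sequence."""
    while True:
        value = m.lwarx(cpu, addr)
        if interfere is not None:   # simulate another writer, once
            interfere()
            interfere = None
        if m.stwcx(cpu, addr, value + 1):
            return value + 1


m = ReservationMemory()
m.store(0x100, 0)
# Another writer stores 10 between the load and the conditional store,
# so the first attempt fails and the loop retries on the fresh value.
result = interlocked_increment(m, cpu=0, addr=0x100,
                               interfere=lambda: m.store(0x100, 10))
```

The interfering store doesn't block or stall anything; it just invalidates the reservation, and the retry picks up the new value.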
On the Xbox 360 we provided the compiler which would emit sequences like these for all atomic intrinsics, but developers could also write assembler code directly if they wanted to. We'll get back to this point in a moment.
As the V1 version of the Xbox 360 CPU was being tested by IBM, they discovered an error with the hardware implementation of these two instructions and issued an errata for software to work around it, which we implemented. Unfortunately, after further testing IBM discovered that the errata was insufficient, so issued a second errata, which we also implemented and assumed all was well.
Then the V2 version of the CPU comes out and months go by. But early one morning I get a phone call from IBM letting me know that the latest errata was still insufficient and that the bug is in the final hardware. Further, Microsoft has already started final production of CPU parts, even before testing was fully complete (risk buy), so that they could have sufficient supply for the upcoming November release. I was told that they are stopping manufacturing of additional CPUs, and that I had 48 hours to figure out if there is anything software can do that could work around the hardware issue. They also casually mentioned that otherwise millions of dollars of parts would need to be discarded and a hardware fix implemented, which would take weeks, and then production could resume from scratch.
Bottom line is that, yes, there was a set of software changes that would work around the bug, but it required very specific sequences of instructions, the disabling of interrupts around these sequences, a change to the hypervisor, and updating the compiler to emit the new sequences. To make sure that developers didn't introduce code sequences that use lwarx/stwcx in a way that would expose the bug (via inline assembly, for example), the loader would scan the code and refuse to load code that didn't obey the new rules.
Interesting fact: the hardware bug existed in every version of the Xbox 360 ever shipped. Because software needed to run on any console ever shipped, there was no advantage to ever fixing the bug, since software always needed to work around it anyway.
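For a feel of what the first step of such a loader scan could look like, here is a sketch that walks big-endian 32-bit PowerPC words and reports every reservation instruction it finds. The opcode numbers (primary opcode 31 with extended opcodes 20, 84, 150, 214) come from the PowerPC ISA; everything else, including the function name and the idea of returning offsets, is an assumption. The actual rules the 360 loader enforced aren't described in this thread, and a real scanner would go on to check the instructions surrounding each hit.

```python
import struct

# Extended opcodes under primary opcode 31 for the reservation instructions.
RESERVATION_XO = {20: "lwarx", 84: "ldarx", 150: "stwcx.", 214: "stdcx."}


def find_reservation_insns(code: bytes):
    """Return [(byte_offset, mnemonic)] for reservation instructions in code."""
    if len(code) % 4 != 0:
        raise ValueError("PowerPC instructions are 4 bytes each")
    hits = []
    for offset in range(0, len(code), 4):
        (word,) = struct.unpack_from(">I", code, offset)
        primary = word >> 26              # top 6 bits
        extended = (word >> 1) & 0x3FF    # 10-bit extended opcode field
        if primary == 31 and extended in RESERVATION_XO:
            hits.append((offset, RESERVATION_XO[extended]))
    return hits


# lwarx r4,0,r3 ; addi r4,r4,1 ; stwcx. r4,0,r3 ; nop
code = struct.pack(">4I", 0x7C801828, 0x38840001, 0x7C80192D, 0x60000000)
hits = find_reservation_insns(code)
```

A hypothetical loader policy could then compare the neighborhood of each hit against the approved sequences and refuse the image on any mismatch.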
Thank you so much. This is so awesome to know and learn.
I'm just curious, what are the instructions that replace the lwarx/stwcx "atomic" pair? From my understanding, basically you need to generate a pair of load-reserved/store-conditional instructions, and you have to replace the pair with a series of instructions. But I don't understand why you have to disable interrupts -- is it because multiple instructions were actually used to facilitate the load, and an interrupt may disturb a value stored in a register?
What was the culture like working on this project and back in those days? I’ve always been fascinated by the development of consoles, especially the story of the 360. Any sources you recommend to learn more? I thought the Microsoft documentary on Xbox was the best I’ve found so far.
Was it ever explained why? This was an unanswered question I always wondered about from time to time. They must have done something to remove RGH capability?
Yes that revision is patched to specifically counter RGH. Microsoft disabled the ability to get the precise timing needed from the CPU and also added more filtering/robustness so the system will reset properly instead of getting into the inconsistent state of the old revisions.
I feel like random delays would make the glitch attack harder but it would still be possible given enough attempts. Seems like the bigger issue is that you can glitch the CPU reset line which corrupts the processing rather than having no effect or resetting the CPU.
I assume those are probably very hard to fix since (again, I assume, I'm just a hobbyist in the hardware space) that sort of glitch relies on propagation delays (e.g. a short burst triggering some latches but not others, or triggering the latches in some specific synchrony).
Can anyone confirm if I'm on the right track with my guess?
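A back-of-the-envelope way to see the "harder but still possible given enough attempts" point is to treat each glitch attempt as an independent trial. The model below is an assumption for illustration only: it supposes the attacker must land in a window of `window` cycles while the target adds uniform random jitter over `jitter` cycles, so each attempt succeeds with probability `window / jitter`. The numbers in the usage lines are made up, not measurements from any console.

```python
from fractions import Fraction


def expected_attempts(window, jitter):
    """Mean number of tries for a geometric trial with p = window / jitter."""
    if not 0 < window <= jitter:
        raise ValueError("need 0 < window <= jitter")
    return 1 / Fraction(window, jitter)


def success_within(attempts, window, jitter):
    """Probability of at least one success in the given number of attempts."""
    p = window / jitter
    return 1 - (1 - p) ** attempts


# Illustrative numbers only: a 2-cycle window hidden in 1000 cycles of jitter.
mean_tries = expected_attempts(2, 1000)
chance = success_within(500, 2, 1000)
```

Under this model, jitter multiplies the expected effort rather than closing the hole, which fits the comment above that the more fundamental issue is the glitch corrupting execution at all.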
Having a single developer allows fewer offices with their windows completely covered with newspaper. Plus, there's one person doing everything, which can be a lot better than two people who have different ideas of how to make the system work together.
Nice work! Always fun to see something I wrote long ago reverse engineered. The packet format was indeed inspired by ESP over UDP, and I named it XSP. After system link shipped with the original launch of the console, I also worked on Xbox Live networking, including the client/server interactions and the design and implementation of the front-end Security Gateways that all Xboxes would talk to, first to authenticate themselves to the service, and then to maintain a heartbeat connection to the service (to keep NAT ports open during idle time), and to facilitate NAT traversal.
Nice! You did a great job on the protocol. Probably my only complaint on the XSP side of things is the fact that you have to do relatively complex parsing of the XSP packets before you can get to the point of verifying the signature of the packet. Seems like all of the corner cases were handled well in the implementation on the boxes, but as someone who does auth/cryptography in my day job, it kind of gives me the heebie-jeebies.
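For contrast, here is a sketch of the "authenticate first, parse second" shape that concern points toward. The packet layout (a fixed-size tag at offset 0 followed by the body), the truncated HMAC-SHA256, and the `seal` / `open_packet` names are all hypothetical; this is not the XSP format, whose layout isn't given in this thread.

```python
import hashlib
import hmac
import struct

TAG_LEN = 16      # truncated HMAC-SHA256; an arbitrary choice for this sketch
HEADER = ">IH"    # hypothetical header: 32-bit sequence number, 16-bit payload length
HEADER_LEN = struct.calcsize(HEADER)


def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    """Build tag || header || payload, with the tag covering everything after it."""
    body = struct.pack(HEADER, seq, len(payload)) + payload
    tag = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    return tag + body


def open_packet(key: bytes, packet: bytes):
    """Return (seq, payload), or None if the packet fails any check."""
    # Step 1: authenticate using fixed offsets only; no fields are parsed yet.
    if len(packet) < TAG_LEN + HEADER_LEN:
        return None
    tag, body = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        return None
    # Step 2: only now interpret attacker-controllable fields.
    seq, length = struct.unpack_from(HEADER, body)
    if length != len(body) - HEADER_LEN:
        return None
    return seq, body[HEADER_LEN:]


key = b"\x01" * 16
packet = seal(key, 7, b"hello")
```

The trade-off is the one raised in the reply: a fixed tag position and a tag over the whole body are simple to verify, but leave less room to trim per-packet overhead.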
Do you know if the auth side was carried into deeper parts of the backend? So like, did the SG decorate incoming connections with the auth information as they made their way to the different services? There seemed to be more auth information than I expected in headers on some of those HTTP calls into services like matchmaking.
That's a valid point about complex parsing. I remember being very concerned about adding unnecessary overhead to each packet during encapsulation.
As for the SG, it primarily authenticated the Xbox machine account using Kerberos and then maintained a security association, accepted heartbeats, authenticated and decrypted incoming ESP-UDP packets into IP packets that it forwarded to the backend servers. Responses from the backend would be encrypted, authenticated, and encapsulated before sending back to the Xbox. I don't think the SG had any knowledge of higher level connections running through it, such as TCP or HTTP, so it would not have manipulated HTTP headers as they passed through.
Ok, cool. That's about what I figured at this point. Originally while REing the protocol I thought that it was holistically handling auth at that XSP layer, but then was surprised when a box would identify its XID to matchmaking as well, which should have been stored in the krb ticket to bootstrap that connection.
Thanks so much, I really appreciate your candor here!
The SG had to do a few TCP-level things for NAT purposes like rewriting checksums, and it would sometimes synthesize a RST. No layer 7 processing at all.
There was a low level protocol allowing backends to get some extra metadata about a connection.
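On the checksum rewriting mentioned above: the comment doesn't say how the SG did it, but the standard technique when a NAT-style device changes one 16-bit field is to patch the existing one's-complement checksum instead of re-summing the whole packet. The sketch below shows the full Internet checksum (RFC 1071) and the incremental update (RFC 1624, equation 3); the function names and the sample bytes are just for illustration.

```python
def fold(total: int) -> int:
    """Fold carries back in until the sum fits in 16 bits (one's-complement add)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    return ~fold(total) & 0xFFFF


def incremental_update(old_csum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 eqn 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word."""
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold(total) & 0xFFFF


original = bytes.fromhex("123456789abc")
rewritten = bytes.fromhex("1234c0a89abc")   # middle word changed 0x5678 -> 0xc0a8
old = internet_checksum(original)
patched = incremental_update(old, 0x5678, 0xC0A8)
```

Here `patched` matches what a full recomputation over `rewritten` produces, without touching the unchanged words.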
Note to self: you should have added random delays before and after making the POST code visible on the external pins.