_____                   _                  _____            _____       _ 
  |     |___ _____ ___ _ _| |_ ___ ___ ___   |  _  |___ ___   | __  |___ _| |
  |   --| . |     | . | | |  _| -_|  _|_ -|  |     |  _| -_|  | __ -| .'| . |
  |_____|___|_|_|_|  _|___|_| |___|_| |___|  |__|__|_| |___|  |_____|__,|___|
  a newsletter by |_| j. b. crawford               home archive subscribe rss
COMPUTERS ARE BAD is a newsletter semi-regularly issued directly to your doorstep to enlighten you as to the ways that computers are bad and the many reasons why. While I am not one to stay on topic, the gist of the newsletter is computer history, computer security, and "constructive" technology criticism.

I have an M. S. in information security, more certifications than any human should, and ready access to a keyboard. These are all properties which make me ostensibly qualified to comment on issues of computer technology. When I am not complaining on the internet, I work in engineering for a small company in the healthcare sector. I have a background in security operations and DevOps, but also in things that are actually useful like photocopier repair.

You can read this here, on the information superhighway, but to keep your neighborhood paperboy careening down that superhighway on a bicycle please subscribe. This also contributes enormously to my personal self esteem. There is, however, also an RSS feed for those who really want it. Fax delivery available by request.

--------------------------------------------------------------------------------

>>> 2021-09-08 W X Y Z

Let's return, for a while, to the green-ish-sometimes pastures of GUI systems. To get to one of my favorite parts of the story, delivery of GUIs to terminals over the network, a natural starting point is to discuss an arcane, ancient GUI system that came out of academia and became rather influential.

I am referring of course to the successor of W on V: X.

Volumes could be written about X, and indeed they have been. So I'm not intending to present anything like a thorough history of X, but I do want to address some interesting aspects of X's design and some interesting applications. Before we talk about X history, though, it might be useful to understand the broader landscape of GUI systems on UNIX-like operating systems, because so far we've talked about the DOS/VMS sort of family instead and there are some significant differences.

Operating systems can be broadly categorized as single-user and multi-user. Today, single-user operating systems are mostly associated only with embedded and other very lightweight devices [1]. Back in the 1980s, though, this divide was much more important. Multi-user operating systems were associated with "big iron," machines that were so expensive that you would need to share them for budget reasons. Most personal computers were not expected to handle multiple users, over the network or otherwise, and so the operating systems had no features to support this (no concept of user process contexts, permissions, etc).

Of course, you can likely imagine that the latter situation, single-user operating systems, made the development of GUI software appreciably easier. It wasn't even so much about the number of users, but rather about where they were. Single-user operating systems usually only supported someone working right at the console, and so applications could write directly to the graphics hardware. A windowing system, at the most basic, only really needed to concern itself with getting applications to write to the correct section of the frame buffer.

On a multi-user system, on the other hand, there were multiple terminals connected by some method or other that almost certainly did not allow for direct memory access to the graphics hardware. Further, the system needed to manage what applications belonged on which graphics devices, as well as the basic issue of windowing. This required a more complicated design. In particular, server-client systems were very much in fashion at the time because they had the same general shape as the computer-terminal architecture of the system. This made them easier to reason about and implement.

So, graphics systems written for multi-user systems were often, but not always, server-client. X is no different: the basic architecture of X is that of a server (running within the user context generally) that has access to a graphics device and input devices (that it knows how to use), and one or more clients that want to display graphics. The clients, which are the applications the user is using, tell the server what they want to display. In turn, the server tells the clients what input they have received. The clients never interact directly with the display or input hardware, which allows X to manage multiple access and to provide abstraction.
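As a rough illustration of that division of labor, here is a toy, in-process model. It is not the X protocol, and every name in it (ToyDisplayServer, ToyClient, the string "requests") is made up for the example. The only point is that clients describe output and receive input, while the server alone owns the pretend hardware.

```python
from dataclasses import dataclass, field


@dataclass
class ToyClient:
    """An application: it never touches the display, only talks to the server."""
    name: str
    events: list = field(default_factory=list)

    def handle_event(self, event: str) -> None:
        # Input reaches the client only when the server forwards it.
        self.events.append(event)


class ToyDisplayServer:
    """Owns the (pretend) display and input devices on behalf of all clients."""

    def __init__(self) -> None:
        self.windows = {}     # window id -> owning client
        self.draw_log = []    # stands in for the real frame buffer
        self.focus = None     # window id that currently receives input
        self._next_id = 1

    def create_window(self, client: ToyClient) -> int:
        window_id = self._next_id
        self._next_id += 1
        self.windows[window_id] = client
        if self.focus is None:
            self.focus = window_id  # first window created gets the input focus
        return window_id

    def draw(self, window_id: int, request: str) -> None:
        # Clients say what they want shown; only the server "touches hardware".
        if window_id not in self.windows:
            raise KeyError(f"unknown window {window_id}")
        self.draw_log.append((window_id, request))

    def key_press(self, key: str) -> None:
        # Input arrives at the server, which forwards it to the focused client.
        if self.focus is not None:
            self.windows[self.focus].handle_event(f"key:{key}")
```

Because every draw request and every key press passes through the one server object, that object is the natural place to arbitrate between several clients, which is the "manage multiple access" part of the story.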

While X was neither the first graphics system for multi-user operating systems, nor the first server-client graphics system, it rapidly spread between academic institutions and so became a de facto standard fairly quickly [2]. X's dominance lasts nearly to this day; Wayland has only recently begun to exceed it in popularity. Wayland is based on essentially the same architecture.

X has a number of interesting properties and eccentricities. One of the first interesting things many people discover about X is that its client-server nature actually means something in practice: it is possible for clients to connect to an X server running on a different machine via network sockets. Combined with SSH's ability to tunnel arbitrary network traffic, this means that nearly all Linux systems have a basic "remote application" (and even full remote desktop) capability built in. Everyone is very excited when they first learn this fact, until they give it a try and discover that the X protocol is so hopelessly inefficient and modern applications so complex that X is utterly unusable for most applications over most internet connections.
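For a sense of how a client finds "an X server running on a different machine," clients conventionally read the DISPLAY environment variable, which in its simple form looks like host:display.screen, and a TCP-connected display number N conventionally maps to port 6000 + N. The sketch below parses only that simple form. The function and type names are my own, and it deliberately ignores hostnames containing colons (IPv6 literals and other historical address forms).

```python
import re
from typing import NamedTuple, Optional


class DisplayTarget(NamedTuple):
    host: Optional[str]  # None means "this machine" (usually a local socket)
    display: int
    screen: int
    tcp_port: int        # 6000 + display number, by X11 convention


# Simple form only: optional host, a colon, display number, optional ".screen".
_DISPLAY_RE = re.compile(r"(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?")


def parse_display(value: str) -> DisplayTarget:
    """Split a simple DISPLAY string into its parts (hypothetical helper)."""
    match = _DISPLAY_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a simple DISPLAY string: {value!r}")
    host = match.group("host") or None
    display = int(match.group("display"))
    screen = int(match.group("screen") or 0)
    return DisplayTarget(host, display, screen, 6000 + display)
```

With SSH X forwarding, the remote shell commonly ends up with something like DISPLAY=localhost:10.0, which this helper would read as display 10 on TCP port 6010: a listener that sshd provides and tunnels back to the real X server on your own machine.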

This gets at the first major criticism of X: the protocol that clients use to describe their output to X is very low level. Besides making the X protocol fairly inefficient for conveying typical "buttons and forms" graphics, X's lack of higher-level constructs is a major contributor to the inconsistent look-and-feel and interface of typical Linux systems. A lot of basic functionality that feels cross-cutting, like copy-and-paste, is basically considered a client-side problem (and you can probably see how this leads to the situation where Linux systems commonly have two or three distinct copy and paste buffers).

But, to be fair, X never aimed to be a desktop environment, and that type of functionality was always assumed to occur at a higher level.

One of the earliest prominent pseudo-standards built on top of X was Motif, which was used as a basis for the pseudo-standard Common Desktop Environment presented by many popular UNIX machines. Motif was designed in the late '80s and shows it, but it was popular and both laid groundwork and matched the existing designs (Apple Lisa etc) to an extent that a modern user wouldn't have much trouble using a Motif system.

Motif could have remained a common standard into the modern era, and we can imagine a scenario where Linux had a more consistent look-and-feel because Motif rose to the same level of universality as X. But it didn't. There are a few reasons, but probably the biggest is that Motif was proprietary and not released as open source until well after it had fallen out of popularity. No one outside of the big UNIX vendors wanted to pay the license fees.

There were several other popular GUI toolkits built on top of X in the early days, but I won't spend time discussing them for the same reason I don't care to talk about Gnome and KDE in this post. But rest assured that there is a complex and often frustrating history of different projects coming and going, with few efforts standing the test of time.

Instead of going on about that, I want to dig a bit more into some of the less discussed implications of X's client-server nature. The client and server could exist on different machines, and because of the hardware-independence of the X protocol could also be running different operating systems, architectures, etc. This gave X a rather interesting property: you could use one computer to "look at" X software running on another computer almost regardless of what the two computers were running.

In effect, this provided one of the first forms of remote application delivery. Much like textual terminals had allowed the user to be physically removed from the computer, X allowed the machine that rendered the application and collected input to be physically removed from the actual computational resources running the software. In effect, it created a new kind of terminal: a "dumb" machine that did nothing but run an X server, with all applications running on another machine.

The terminology around this kind of thing can be confusing and is not well agreed upon, but most of the time the term "thin terminal" refers to exactly this: a machine that handles the basic mechanics of graphical output and user input but does not run any application software.

Because of the relatively high complexity of handling graphics outputs, thin terminals tend to be substantially similar to proper "computers," but have very limited local storage (usually only enough for configuration) and local processing and memory capacity that are just enough to handle the display task. They're like really low-end computers, basically, that just run the display server part of the system.

In a way that's not really very interesting, as it's conceptually very similar to the block terminals used by IBM and other mainframes. In practice, though, this GUI foray into terminals took on some very odd forms over the early history of personal computers. The terminal was decidedly obsolete, but was also the hot new thing.

Take, for example, DESQview. DESQview was a text-mode GUI for DOS that I believe I have mentioned before. After DESQview, the same developer, Quarterdeck, released DESQview/X. DESQview/X was just one of several X servers that ran on DOS. X was in many ways a poor fit for DOS, given DOS's single task nature and X's close association with larger machines running more capable operating systems, but it was really motivated by cost-savings. DOS was cheap, and running an X server on DOS allowed you to both more easily port applications written for big expensive computers, and to use a DOS machine as an X server for an application running on another machine. The cheap DOS PC became effectively a hybrid thin terminal that could both run DOS software and "connect to" software running on a more expensive system.

One way to take advantage of this functionality was reduced-cost workstations. For example, at one time years ago the middle school I attended briefly had a computer lab which consisted of workstations (passed down from another better funded middle school) with their disks removed. The machines booted via PXE into a minimal Linux environment. The user was presented with a specialized display manager that connected to a central server over the network to start a desktop environment.

The goal of this scheme was reduced cost. In practice, the system was dog slow [3] and the unfamiliarity of KDE, StarOffice, and the other common applications on the SuSE server was a major barrier to adoption. Like most K-12 schools at the time the middle school was already firmly in the grasp of Apple anyway.

Another interesting aspect of X is the way it relates to the user model. I will constrain myself here to modern Linux systems, because this situation has varied over time. What user does X run as?

On a typical Linux distribution (that still uses X), the init system starts a display manager as root, which then launches an X server to use, still with root privileges. The terminology for desktop things can get confusing, but a display manager is responsible for handling logins and setting up graphical user sessions. It's now common for the display manager to actually run as its own user to contain its capabilities (by switching users after start), so it may use setuid in order to start the X server with root capabilities.

Once a user authenticates to the display manager, a common display manager behavior is to launch the user's desktop environment as the user and then hand it the information necessary to connect to the existing X instance. X runs as root the whole time.

It is completely possible to run X as a non-privileged user. The problem is that X handles all of the hardware abstraction, so it needs direct write access to hardware. For numerous reasons this is typically constrained to root.

You can imagine that the security concerns related to running X as root are significant. There is work to change this situation, such as the kernel mode setting feature, but of course there is a substantial problem of inertia: since X has always run with root privileges, a great many things assume that X has them, so there are a lot of little problems that need solving and this usage is not yet well supported.

This puts X in a pretty weird situation. It is a system service because it needs to interact directly with scarce hardware, but it's also a user application because it is tied to one user session (ultimately by means of password authentication, the so-called X cookie). This is an inherent tension that arises from the nature of graphics cards as devices that expect very low-level interaction. Unfortunately, continuously increasing performance requirements for graphical software make it very difficult to change this situation... as is, many applications use something like mesa to actually bypass the X server and talk directly to the graphics hardware instead.
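The "X cookie" idea is simple enough to sketch. In the common MIT-MAGIC-COOKIE-1 scheme, the server and the user's clients share a random 128-bit value (normally kept in an authority file managed with xauth), and a client that presents the right value is let in. The toy below shows only that shared-secret check. The class and function names are invented, and it leaves out the real wire format and file handling entirely.

```python
import hmac
import secrets


def new_cookie() -> bytes:
    """Generate a random 128-bit shared secret, the size MIT-MAGIC-COOKIE-1 uses."""
    return secrets.token_bytes(16)


class ToyServerAuth:
    """Hypothetical stand-in for the server side of the cookie check."""

    def __init__(self, cookie: bytes) -> None:
        self._cookie = cookie

    def accept(self, presented: bytes) -> bool:
        # Constant-time comparison, so timing does not leak how many bytes matched.
        return hmac.compare_digest(self._cookie, presented)
```

Anyone who can read the cookie can connect to the session, which is why it behaves like a password and why the file that holds it needs to be readable only by the session's user.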

I am avoiding talking about the landscape of remote applications on Windows, because that's a topic that deserves its own post. And of course X is a fertile field for technology stories, and I haven't even gotten into the odd politics of Linux's multiple historic X implementations.

[1] Windows looks and feels like a single-user operating system to the extent that I sometimes have to point out to people that Windows NT releases are fully multi-user. In fact, in some ways Windows NT is "more multi-user" than Linux, since it was developed later on and the multi-user concept is more thoroughly integrated into the product. Eventually I will probably write about some impacts of these differences, but the most obvious is screen locking: on Linux, the screen is "locked" by covering it with a window that won't go away. On Windows, the screen is "locked" by detaching the user session from the local console. This is less prone to bypasses, which perennially appear in Linux screensaver implementations.

[2] The history of X, multi-user operating systems from which UNIX inherited, and network computing systems in general is closely tied to major projects at a small number of prominent American universities. These universities tended to be very ambitious in their efforts to provide a "unified environment" which led to the development of what we might now call a "network environment," in the sense of shared resources across an institution. The fact that this whole concept came out of prominent university CS departments helps to explain why most of the major components are open source but hilariously complex and notoriously hard to support, which is why everyone today just pays for Microsoft Active Directory, including those universities.

[3] Given the project's small budget, I think the server was just under-spec'd to handle 30 sessions of Firefox at once. The irony is, of course, that as computers have sped up web browsers have as well, to the extent that running tens of user sessions of Firefox remains a formidable task today.

Note: several corrections made, mostly minor, thanks to HN users smcameron, yrro, segfaultbuser. One was a larger one: I find the startup process around X to be confusing (you hopefully see why), and indeed I described it wrong. The display manager starts first and is responsible for starting an X server to use, not the other way around (of course, when you use xinit to start X after login, it goes the way I originally said... I didn't even get into that).

--------------------------------------------------------------------------------

>>> 2021-08-26 a permanent solution

The strategic and tactical considerations surrounding nuclear weapons went through several major eras in a matter of a few decades. Today we view the threat of nuclear war primarily through the "triad": the capability to deliver a nuclear attack from land, sea, and air. This would happen primarily through intercontinental ballistic missiles (ICBMs), so-called because they basically launch themselves to the lower end of space before strategically falling towards their targets (ballistic reentry). ICBMs are fast, taking about 30 minutes to arrive across the globe. The result is that we generally expect to have very little warning of a nuclear first strike.

The situation in the early Cold War was quite different. ICBMs, and long-range missiles in general, are complex and took some time to develop. From the end of World War II to roughly the late '60s, the primary method of delivery for nuclear weapons was expected to be by air: bombs, delivered by long-range bombers. The travel time from the Soviet Union would be hours, allowing significant warning and a real opportunity for air defense intervention.

The problem was this: we would have to know the bombers were coming.

Many people seem to assume that the United States has the capability to detect and track all aircraft flying in US airspace. The reality is quite a bit different. The problem of detecting and tracking aircraft is a surprisingly difficult one, and even today our capabilities are limited. Nonetheless, surveillance of airspace is considered a key element of "air sovereignty," or our ability to maintain military and civil control of our airspace.

Let's take a look at the history of the United States' ability to monitor our airspace.

During the 1940s, it was becoming clear that airspace surveillance was an important problem. Although the United States did not then face attacks on the contiguous US (and never would), were the Axis forces to advance to the point of bombing missions on CONUS it would be critical to be able to detect the incoming aircraft. The military invested in a system for Aircraft Control and Warning, or AC&W. Progress was slow: long-range radar was primitive and expensive, and the construction of AC&W stations was not a high-level priority. By the end of 1948, there were only a small number of AC&W stations, they were considered basically experimental, and the ability to integrate data coming from the several stations was very limited. Efforts to expand the air surveillance system routinely failed due to lack of funding.

The Lashup project, launched in '48, made up the first major effort to build an air surveillance system. As the name suggests, Lashup was only intended to be temporary, funded by Congress as a stopgap measure until a more complete system could be designed. Over the next two years, 44 radar stations were built focused around certain strategically important areas. Lashup provided nothing near nationwide coverage, but was expected to detect bombing runs directed towards the most important military targets. Lashup included three stations surrounding my own Albuquerque, due to the importance of the Sandia and Manzano Army Bases and the Z Division of Los Alamos.

Lashup sites used sophisticated radar sets for the time, but perhaps the most important innovation of Lashup was the command and control infrastructure built around it. Lashup stations were connected to the air defense command by dedicated telephone lines, the air defense command was connected to the continental air command by another dedicated telephone line, and ultimately dedicated lines were connected all the way to the White House. This was the first system built to allow a prompt nuclear response by informing the commander in chief of an impending nuclear attack as quickly as possible.

If you've read any of my other material on the Cold War, you might understand that this is the core of my fascination with Cold War defense history: the threat of nuclear attack was, for the most part, the first thing to motivate the development of a nationwide rapid communications system. For the first half of the 20th century there simply wasn't a need to deliver a message from the west coast to the President in minutes, but in the second half of the 20th century there most definitely was.

The fear of a Soviet nuclear strike, and the resulting government funding, was perhaps the largest single motivator of progress in communications and computing technology from the 1940s to the 1980s. Most of the communications technology we now rely on was originally built to meet the threat of a first strike.

We see this clearly in the case of air defense radar. While Lashup nominally had the capability to deliver a prompt warning of nuclear attack, the entire process was rather manual and thus not very reliable. Fortunately, Lashup was temporary, and just as construction of the Lashup sites was complete, work started on its replacement: the Permanent System.

The Permanent System consisted of a large number of radar stations, ultimately over 100. More importantly, though, it consisted of a system of communications and coordination centers intended to quickly confirm and communicate a nuclear threat.

It will help in understanding this system to understand the strategic principle involved. The primary defense against a nuclear attack by bombers was a process called ground-controlled intercept, or GCI. The basic concept of GCI was that radar stations would provide up-to-date position and track information on inbound enemy aircraft, which would be used to vector interceptor aircraft directly towards the threat. The aid of ground equipment was critical to an effective response, as fighter aircraft of the time lacked sophisticated targeting radar and had no good way to search for bombers.

To this end, the Permanent System included Manual Air Defense Control Centers (ADCC) (the "manual" was used to differentiate from automatic centers in the later SAGE system). The ADCCs received information on radar targets from the individual radar sites via telephone, and plotted them with wet erase marker on clear plexiglass maps (perhaps the source of the clear whiteboard trope now ubiquitous in films) in order to correlate multiple tracks. They then reported these summarized formations and tracks to the Air Defense Command, at Ent AFB in Colorado, for use in directing interceptors.

The Permanent System was extended beyond CONUS, although Alaska continued to have a distinct air defense program. The biggest OCONUS extension of the Permanent System was into Canada, with the Pinetree Line (the first of the cross-Canadian early warning radar networks) roughly integrated into the Permanent System. Perhaps most interestingly, the Permanent System also saw an early effort at extension of early warning radar into the ocean. This took the form of the Texas Towers, a set of three awkward offshore radar stations that were later abandoned due to their poor durability against rough seas [1].

Technology was advancing extremely rapidly in the mid-20th century, and by the time the Permanent System reached nearly 200 radar stations it had also become nearly obsolete. For its vast scale, the capabilities of the Permanent System were decidedly limited: it could only detect large aircraft, it performed poorly at low altitudes (often requiring mitigation through "gap filler" stations), and interpretation and correlation of radar data was a manual process, costing precious minutes in the timeline of a nuclear reprisal.

Here in Albuquerque, Kirtland Air Force Base was host to the Kirtland Manual ADCC, activated in 1951. 13 radar stations around New Mexico, eastern Arizona, and western Texas reported to Kirtland AFB. Each of these 13 radar stations was itself a manned Air Force Station including housing and cantonment. The Continental Divide Air Force Station, for example, consisted of some fifty people in remote McKinley County. The station included amenities like a library and gym, housing and a trailer park, and two radars: an early warning radar and a height-finding radar. Finally, a ground-air transmit-receive (GATR) radio site provided a route for communications with interceptors.

Continental Divide AFS was deactivated in 1960. You can still see the remains today, although there is little left other than roads and some foundations.

Like Continental Divide AFS, the Permanent System as a whole failed to make it even a decade. In 1960, it was as obsolete as Lashup, having been replaced not only by improved radar equipment but, more importantly, by a vastly improved communications and correlation system: the Semi-Automatic Ground Environment, or SAGE---by most measures, the first practical networked computer system.

We'll talk about SAGE later, but for now, check out a list of Permanent System sites. There might be one in your area. Pay it a visit some time; in many ways it's the beginning of the computer revolution: a manual data collection network obsoleted in just a few years by the development of the first nationwide computer network.

[1] The Texas Towers were connected to shore via troposcatter radio links, one of my favorite communications technologies and something that will surely get a full post in the future.

--------------------------------------------------------------------------------

>>> 2021-08-16 on voting

The use of electronics to administer elections has been controversial for some time. Since the "hanging chads" of the 2000 election, there's been some degree of public awareness of the use of technology for voting and its possible impacts on the accuracy and integrity of the election. The exact nature of the controversy has been through several generations, though, reflecting both changes in election technology and changes in the political climate.

Voting is a topic of great interest to me. The administration of elections is critical to a functioning democracy, and raises a variety of interesting security and practical challenges. In particular, the introduction of automation into elections presents great opportunity for cost savings and faster reporting, but also a greater risk of intentional and accidental interference in the voting process. Back when I was in school, I focused some of my research on election administration. Today, I continue to research the topic, and have added the practical experience of being a poll worker in two states and for many elections [1].

Given my general propensity to have opinions, it will come as no surprise that this has all left me with strong opinions on the role of computer technology in election administration. But before we get to any of that, I want to talk a bit about the facts of the matter.

The thing that most frustrates me about controversies surrounding electronic voting is the generally very poor public understanding of what electronic voting is. If you follow me on Twitter, you may have seen a thread about this recently, and it's a ramble I go on often. There is a great deal of public misconception about the past, present, and future role of electronics in elections. These misunderstandings constantly taint debate about electronic voting.

In an on-and-off series of posts, I plan to provide an objective technical discussion of election technology, "electronic voting," and security concerns surrounding both. I will largely not be addressing recent "stolen election" conspiracy theories for a variety of reasons, but will undoubtedly touch on them occasionally. At the very least, because I can never turn down an opportunity to talk about J. Hutton Pulitzer, an amazing wacko who has a delightful way of appearing with a huge splash, making a fool of himself, and then disappearing... to pop up again a couple years later in a completely different context.

I will restate that my goal here is to remain largely apolitical (mocking J. Hutton Pulitzer aside), and as a result I will not necessarily respond to any given election fraud or interference claim directly. But I do think anyone interested in or concerned by these theories will find the technical context that I can provide very useful.

Who runs elections?

One of the odd things about the US, compared to other countries, is the general architecture of election administration. In the US, elections are mostly administered by the county clerk, and the election process is defined by state law. Federal law imposes only minimal requirements on election administration, leaving plenty of room for variation between states.

Although election administration is directly performed by the county clerk, for state-level elections (which is basically all the big ones) the secretary of state performs many functions. It's also typical for the secretary of state to provide a great deal of support and policy for the county clerks. So, while county clerks run elections, it's common for them to do so using equipment, software, and methods provided by the state. It's ultimately the responsibility of states to pay for elections, which is probably the greatest single problem with US election integrity, because states are poor.

While it seems a little odd that, say, a presidential election is run by the county clerks, it can also be odd the other way. Entities like municipalities, school districts, higher education districts, flood control districts, all kinds of sub-county entities may also have elected offices and the authority to issue bond and tax measures. These are typically (but not always) administered by the county or counties as well, usually on a contract basis.

What is electronic voting?

Debate around electronic voting tends to focus purely around "voting machines," a broad category that I will define more later. The reality is that voting machines are only a small portion of the overall election apparatus, and are not always the most important part. So before I get into the world of election security theory, I want to talk a bit about the moving parts of an election, and where technology is used.

The general timeline of an election looks like this:

To meet these ends, election administrators use a variety of systems. There's a great deal of mix-and-match between these systems: many vendors offer a "complete solution," but it's still common for election administrators to use products from multiple vendors.

Each of these systems poses various integrity and security concerns. However, election systems can be roughly divided into two categories: tabulating systems and non-tabulating systems.

Tabulating systems, such as tabulators and direct recording electronic (DRE) machines, directly count votes which they record in various formats for later totalization. Tabulating systems tend to be the highest-risk element of an election because they are the key point at which the outcome of an election could be altered by, for example, changing votes.

Non-tabulating systems perform support functions such as design of ballots, registration of voters, and totalizing of tabulated votes. These systems tend to be less security critical because they produce artifacts which are relatively easy to audit after the fact. For example, a fault in ballot design will be fairly obvious and easy to check for. Similarly, totalizing of tabulated votes can fairly easily be repeated using the original output of the tabulators (and tabulators typically output their results in multiple independent formats to facilitate this verification).

This is not to say that tabulating systems are not subject to audit. When a paper form of the voter's selections exists (a ballot or paper audit trail), it's possible to manually recount the paper form in order to verify the correctness of the tabulation. However, this is a much more labor intensive and costly operation than auditing the results of other systems. In the case of DRE systems with no paper audit trail, an audit may not be possible.

We will be discussing all of these systems in more detail in the future.

Why electronic voting

There is one fundamental question about electronic voting that I want to address up front, in this overview. That is: why electronic voting at all?

Most of the fervor around electronic voting has centered around direct recording electronic (DRE) machines that lack a voter verifiable paper audit trail (VVPAT) [2]. These machines, typically touchscreens, record the voter's choices directly to digital media without producing any paper form. As a result, there is typically no acceptable way to audit the tabulation performed by these machines. Software bugs or malicious tampering could result in an incorrect tabulation that could not be readily detected or corrected after the fact.

It's fairly universally accepted that these machines are a bad idea. Basically no one approves of them at this point. So why are they so common?

Well, this is the first major misconception about the nature of electronic voting: DRE machines with no VVPAT are rare. Only ten states still use them, and most of those states only use them in some polling places. Year by year, the number of DRE w/o VVPAT machines in use decreases as they are generally being replaced with other solutions.

The reason is simple: they are extremely unpopular.

So why did anyone ever have DRE machines? And why do we use machines at all instead of paper ballots placed in a simple box?

The answer is the Help America Vote Act of 2002 (HAVA). The HAVA was written with a primary goal of addressing the significant problems that occurred with older mechanical voting systems in the 2000 election, including accessibility problems. Accessibility is its biggest enduring impact: the HAVA requires that all elections offer a voting mechanism which is accessible to individuals with various disabilities including impaired or no vision.

In 2002, there were few options that met this requirement.

The other key ingredient is, as we discussed earlier, the nature of election administration in the US. Elections are not just administered but funded at the state and county level. State budgets for elections have typically been very slim, and in 2002 most states suddenly faced a requirement that they replace their voting systems.

The result was that, in the years shortly after 2002, basically the entire United States replaced its voting systems on a shoestring budget. Many states were forced to go for the cheapest possible option. Because paper handling adds an appreciable amount of complexity, the cheapest option was to do it in software: "paperless," or non-auditable, DRE machines.

To the extent that DRE w/o VVPAT machines are still in use in 2021, we are still struggling with the legacy of the HAVA's good intentions combined with the US's decentralized and tiny budget for the fundamental administration of democracy.

We don't have non-auditable voting systems because someone likes them. We have them because they were all we could afford in 2003, and because we haven't since been able to afford to replace them.

Basically the entire electronic voting landscape revolves around this single issue: there is enormous pressure in the US to perform elections as cheaply as possible, while still meeting sometimes stringent but often lax standards. The driver on selection of election technology is almost never integrity, and seldom speed or efficiency. It is nearly always price.

In upcoming posts, I will be expanding on this with (at least!) the following topics:

[1] I highly recommend that anyone with an interest in election administration step up as a poll worker. You will learn more than you could imagine about the practical considerations around elections.

[2] We will talk more about VVPAT and how it compares to a paper ballot in the future.

--------------------------------------------------------------------------------

>>> 2021-08-03 key systems

programming note update: the ongoing reliability problems with computer.rip have been tracked down to a piece of hardware which is Not My Problem, and so I anxiously await the DC installing a replacement. Hopefully the problem will be resolved shortly.

And now for more about telephones, because I am on vacation in Guadalajara and telephones are decidedly a recreational topic. If you follow me on twitter I am probably about to provide an over-length thread on some Mexican telephone trivia.

Back when I was talking about turrets, I mentioned their relationship to key systems. While largely forgotten today, key systems were an important step in the evolution of business telephone systems and remain influential on business telephony today. Let's talk a bit about key systems, including some particularly notable ones.

But first, it would be helpful to understand the landscape of business telephony systems. I'm writing this from the perspective of today, but I think this overview will be helpful in understanding the context in which the key system was invented and became popular.

Most businesses have a simple problem: they have, say, ten employees, each with a phone, but they do not want to pay the considerable expense of having ten telephone lines in service. It would be much better to have, say, two telephone lines, which were shared among the employees. The first and most obvious solution was the private branch exchange, often abbreviated PBX. In a classic PBX arrangement, one or more outside lines terminate at a small manual exchange (the type with operators that insert plugs to connect lines). The PBX can provide the same services as a telco exchange, including answering incoming calls and directing them to inside lines, but comes at the significant disadvantage of requiring an operator.

Today, it's not unusual for a front-desk receptionist or other similar employee to serve as the de facto telephone operator (usually today called an "attendant" to differentiate from the older position of a dedicated operator), answering incoming calls and directing them appropriately. The design of a manual telephone exchange made this impractical, though, as even small manual exchanges were pretty large and nearly required wearing a headset... wearing a headset and sitting behind a plugboard was not amenable to greeting guests or other typical receptionist tasks, so a dedicated, full-time telephone operator was basically required. This made PBXs very expensive to operate, in addition to the considerable expense of purchasing one.

The solution here seems obvious: the Private Automated Branch Exchange, or PABX. A PABX uses automatic switching rather than manual. Outbound calls can be made by dialing, while inbound calls can be managed by various techniques like DID or an automated attendant. In the case of DID, Direct Inward Dialing, the telephone company assigns a unique telephone number to each employee of a company even though the company does not have that many lines (for practical reasons related to how mechanical switches hunted for available lines, in early cases these numbers usually had to be sequential). When the telco connected a call to the PABX, it used some technique to indicate the number the call had been dialed to originally---early on this was often the delightfully named Revertive Pulsing, where once the PABX "answered" the line the exchange pulse-dialed back to the PABX, often with the last n digits of the called number.
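As a rough sketch of the DID idea: the exchange passes along the last n digits of the called number, and the PABX looks those digits up to pick an inside line. The function names, the extension table, and the phone numbers here are all hypothetical, and the signaling itself (revertive pulsing or otherwise) is reduced to a string.

```python
def signaled_digits(called_number, n):
    """Return the last n digits of the called number, as the exchange would send them."""
    digits = "".join(ch for ch in called_number if ch.isdigit())
    return digits[-n:]

def route_did(called_number, n, extensions):
    """Look up the inside line for a DID call, or None if the number is unassigned."""
    return extensions.get(signaled_digits(called_number, n))

# Hypothetical block of sequential DID numbers mapped to inside lines.
extensions = {
    "4100": "front desk",
    "4101": "accounting",
    "4102": "shipping",
}

destination = route_did("505-555-4101", 4, extensions)
unassigned = route_did("505-555-4199", 4, extensions)
```

The point of the sketch is that the company can have far more DID numbers than outside lines: the number only selects a destination once the call has already arrived on whichever line happened to be free.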

In the case of an automated attendant (AA), the PABX answers and plays an audio recording prompting the caller to enter an extension. It then connects the call appropriately. The AA may optionally provide a menu of usually single-digit options, although this is a bit more complicated to implement and was not as common on early PABXs.

DID and AA are both ubiquitous today. The use of telephone extensions inside of businesses has generally decreased over the years as DID has become easier and cheaper to implement, but AAs remain common for telephone menus, which may straddle the line between a "mere" AA and the more complicated interactive voice response (IVR) system.

Here's the problem, though: in the early days of business telephony, DID and AAs were both very complex to implement. Early PABXs were mechanical, even Strowger (also called step-by-step or SXS), and the introduction of DID significantly complicated the switching matrix. The lack of good, reliable audio playback devices and the lack of universal DTMF signaling made AAs impractical for quite some time.

So, here is the problem: for smaller organizations, which could not justify the expense of employing a telephone operator during business hours, there were few practical options. PABXs were too expensive and too limited, often still requiring a full-time operator to handle incoming calls [1].

The key system was introduced as a compromise. Like a PABX, it does not require an operator. But, a key system is substantially less complex and expensive than a PABX. What's the trick? A key system makes everyone act as the operator.

When I previously mentioned key systems I put it like this: a PABX connects many users to each line. A key system connects many lines to each user.

Let's say again you are a small organization with about ten employees and you want to pay for two lines. When you install a key system, you connect the two outside lines to a Key Service Unit (KSU). The KSU is then connected to each of the ten telephones by a large, multi-pair cable, often a 25-pair cable with an Amphenol type connector. Superficially, it may look like a PABX, but the use of the multi-pair cable is a big hint to what's going on: the KSU only provides very minimal electrical conversions and mostly just acts as a jumper matrix. All of the actual logic is in the telephones, each of which has all of the outside lines connected directly to it.

The "key" in "key system" refers to the "line keys" on each phone. In our notional two-line system, each phone has two buttons labeled "line 1" and "line 2." Whenever a line is in use, the button lights. When a line is ringing, the light flashes and the phone may ring depending on configuration (ringing can usually be enabled/disabled per line to provide a simple concept of "call groups" if the outside lines have different numbers).

To place a call, a user presses a line key that is not lit, which connects their phone directly to that outside line. They then dial normally. To answer a call, the user presses the flashing line key and then picks up the phone. All they really have is a phone that is connected to all of the outside lines; the key system just makes it possible to have many phones connected this way at once.

Of course, early on key systems sprouted additional features. Even the earliest key systems started to offer an "intercom" feature, in which one or more pairs on each phone were connected to an "intercom bridge" in the KSU. This provided a feature that is superficially like a PABX's inside calling: a user can press an intercom key and then dial a number, which causes another phone on the system to ring. When that person answers, they can have a conversation. Of course the simple design of the feature imposes a lot of limitations, and generally only one intercom call can be made for each assigned intercom bridge on the system. This was often only one or two.

You can also see that key systems pose a significant risk of "collisions." Later key systems often included a "privacy" feature that locked out phones from connecting to a line when it was currently in use, so that other users could not eavesdrop on your calls. The feature could similarly prevent someone trying to make an intercom call suddenly being placed in an existing call. Of course these features meant that if all outside lines or all intercom bridges were in use, it was simply not possible to make a call. The line key lights served an important purpose in showing users when a line was available for their use.
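The line key and privacy behavior can be modeled in a few lines. This is only a toy model under my own assumptions (the class and method names are invented, and a real KSU does this with relays and lamps, not software), but it shows why the lamps matter: with privacy enabled, a lit line is simply unavailable to everyone else.

```python
class KeySystem:
    """Toy model of shared outside lines with a privacy lockout."""

    def __init__(self, line_count):
        # Maps each line number to the phone currently using it, or None if idle.
        self.in_use_by = {line: None for line in range(1, line_count + 1)}

    def lamp_lit(self, line):
        """The line key lamp is lit whenever some phone is on the line."""
        return self.in_use_by[line] is not None

    def press_line_key(self, phone, line):
        """Connect phone to line; return False if another phone already holds it."""
        holder = self.in_use_by[line]
        if holder is not None and holder != phone:
            return False  # privacy lockout: no joining someone else's call
        self.in_use_by[line] = phone
        return True

    def hang_up(self, phone):
        """Release every line held by this phone."""
        for line, holder in self.in_use_by.items():
            if holder == phone:
                self.in_use_by[line] = None

    def free_lines(self):
        """Lines whose lamps are dark, i.e. available for a new call."""
        return [line for line, holder in self.in_use_by.items() if holder is None]

# Two outside lines shared by the office.
office = KeySystem(2)
first_call = office.press_line_key("desk 1", 1)   # desk 1 takes line 1
collision = office.press_line_key("desk 2", 1)    # desk 2 is locked out of line 1
second_call = office.press_line_key("desk 2", 2)  # desk 2 takes line 2 instead
lines_left = office.free_lines()                  # nothing left for a third caller
```

Once both lines are held, a third phone cannot make a call at all until someone hangs up, which is the capacity limit described above.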

Perhaps the quintessential key system is the Western Electric 1A and descendants, which were in widespread use for decades around the mid century. Later revisions of the 1A such as the 1A2 supported as many as 29 lines to each phone (this required multiple 25-pair cables per phone!) and advanced (for the time) features such as attended transfer and music on hold.

Key systems were often designed flexibly to reduce cost of installation. For example, outside lines might be allocated to different departments. Most phones would only need to be connected to the lines for their department, but a receptionist might have a "call director" phone that presented all lines so that they could answer calls for multiple departments [2].

My favorite key system, though, is the AT&T Merlin. The Merlin was a late digital key system, introduced in 1983, and so began to blur the line between key system and PABX. Most importantly, though, the Merlin telephone instruments were beautiful. Seriously, look at them. An advertising campaign including product placement in films and television reinforced the aesthetic cachet of the Merlin. The campaign is said to have been so successful that the Merlin instruments became something of a status symbol, and client-contact organizations like law firms would upgrade from 1A2 to Merlin just for the desk decorations. I recall having read once that the Merlin was a key inspiration for the design of the NeXT Cube under Steve Jobs, but I cannot find a source on this now so perhaps I just made it up. I certainly hope it's true!

It might seem that key systems would be an artifact of history today, entirely outmoded by the availability of inexpensive PABX systems. There were a lot of disadvantages to key systems. Besides the issue of users having to manually select lines, and limited logic on ring groups, the large multi-pair cables required to reach each telephone instrument made key systems expensive to install and not amenable to reuse of existing phone cabling in a building.

The funny thing is that sort of the opposite happened. The low-cost PABXs that became readily available in the 1990s were actually more descended from key systems than the earlier electromechanical PABXs. The small business PABX I have in my house, for example, the Comdial DX80, is basically an overgrown key system. Yet it has many of the advantages of an earlier PABX!

Here's the trick: the availability of computer-controlled digital switching and communications allowed for implementing a "key system" using a standard two-pair line to each telephone. Small businesses were usually upgrading from key systems and expected similar behavior. So it just made sense to take a suite of PABX features and shove them into a key system, using digital signaling to simplify the installation of the system.

So the DX80 for example works like this: the KSU communicates with the phones using a digital protocol over a single-pair telephone line. Each telephone instrument can be equipped with a full set of line keys for the KSU's up-to-16 outside lines, but the KSU is also capable of automatically selecting outside lines and automated incoming call routing based on DID or an auto-attendant. Internal calling between phones is managed digitally and is not limited to one or two intercom lines. All this adds up to flexibility: you can use the DX80 as either a key system or a PABX, depending on how you configure it. You can leave automated line selection un-configured and present line keys on the phones, or you can remove the line keys from phones (reallocating them to other uses) and set up fully automatic call handling.

Many organizations ended up doing both!

A lot of '90s to '00s PABXs were like this. They had sort of an identity crisis between key system and PABX where they wanted to present the convenience of a PABX without removing the familiar line keys for direct access to outside lines. Those line keys could be important, after all, as not all businesses had a DID arrangement (or even disconnect supervision) from their telco, so the use of the line keys allowed for connecting the PABX directly to a "normal" telephone line without needing to get the telco to enable additional features.

Today, most business telephone systems are being converted to VoIP, which can provide additional flexibility and features, and basically obsoletes the concept of a key system since the "number of lines" on a VoIP trunk is a largely synthetic concept. Nonetheless, most VoIP systems can be configured for key-system-like behavior if you really want it.

[1] I have omitted from this discussion the Centrex and other forms of telco-operated PABXs. I will probably do a full post on these in the future. For a short time I worked for a large organization which owned a formerly AT&T-operated 5ESS as their PABX and had the pleasure of getting an extensive tour of the system from one of its few remaining on-site technicians. It has since been decommissioned. As a basic hint, when an organization is large enough to have one or more exchange codes to itself (often seen with universities and older large corporations), it's likely that they had an on-site PABX provided by the telco. If an organization had a set of sequential numbers but no on-site switch, they probably used Centrex, which was basically the same arrangement except that the switch was located in a telco office (and often "virtualized" on an existing ESS). Centrex was also popular with organizations that were very large but had multiple facilities, like school districts, since the existing telco exchange office was as convenient a central location as anywhere else. That said, the nature of their close relationship to government meant that school districts often found it convenient to run their own private trunk lines between buildings, and so they may have still used an on-site switch.

[2] The term "call director" is still sometimes used today to refer to phones with an unusually large number of line buttons, often on a device like a "receptionist sidecar". The terminology is confused by "Call Director" also being the name of various PABX products and features.

--------------------------------------------------------------------------------

>>> 2021-07-26 rip those bits to shreds

Programming note: you may have noticed that computer.rip has been up and down lately. My sincere apologies. One of the downsides of having a neo-luddite aversion to the same cloud services you work with professionally all day is that sometimes your "platform as a physical object" (PaaPO) starts exhibiting hardware problems that are tricky to diagnose, and you are not paid to do this so you are averse to spending a lot of your weekend on it. Some messing around and remote hands tickets later the situation seems to have stabilized, and this irritation has given me the impetus to get started on my plans to move this infrastructure back to Albuquerque.

Let's talk a bit about something practical. Since my academic background is in computer security, it's ego-inflating to act like some kind of expert from time to time. Although I have always focused primarily on networking, I also have a strong interest in the security and forensic concerns surrounding file systems and storage devices. Today, we're going to look at storage devices.

It's well known among computing professionals that hard disk drives pose a substantial risk of accidental data exposure. A common scenario is that a workstation or laptop is used by a person to process sensitive information and then discarded as surplus. Later, someone buys it at auction, intercepts it at a recycler, or similar and searches the drive for social security numbers. This kind of thing happens surprisingly frequently, perhaps mostly because the risk is not actually as common knowledge as you would think. I have a side hustle, hobby, and/or addiction of purchasing and refurbishing IT equipment at auction. I routinely purchase items that turn out to have intact storage, including from government agencies.

So, to give some obvious advice: pay attention to old devices. If your organization does not have a policy around device sanitization, it should. Unfortunately the issue is not always simple, and even organizations which require sanitization of all storage devices routinely screw it up. A prominent example is photocopiers: for years, organizations with otherwise good practices were sending photocopiers back to leasing companies or to auction without realizing that most photocopiers these days have nonvolatile storage to which they cache documents. So having a policy isn't really good enough on its own: you need to back it up with someone doing actual research on the devices in question. I have heard of a situation in which a server was "sanitized" and then surplussed with multiple disk drives intact because the person sanitizing it didn't realize that the manufacturer had made the eccentric decision to put additional drive bays on the rear of the chassis!

But that's all sort of beside the point. We all agree that storage devices need to be sanitized before they leave your control... but how?

Opinions on data sanitization tend to fall into two camps. Roughly, those are "an overwrite is good enough" and "the only way to be sure is to nuke it from orbit." Neither of these positions is quite correct, and I will present an unusually academic review here of the current state of storage sanitization, along with my opinionated advice.

The black-marker overwrite

The most obvious way to sanitize a storage device, perhaps after burying it in a hole, is to overwrite the data with something else. It could be ones, it could be zeroes, it could be random data or some kind of systematic pattern. The general concept of overwriting data to destroy it presumably dates back to the genesis of magnetic storage, but for a long time it's been common knowledge that merely overwriting data is not sufficient to prevent recovery.

A useful early illustration of the topic is Venugopal V. Veeravalli's 1987 master's thesis, "Detection of Digital Information from Erased Magnetic Disks." It's exactly what it says on the tin. The paper is mostly formulae by mass, but the key takeaway is that Veeravalli connected a spectrum analyzer to a magnetic read head. They showed that the data from the spectrum analyzer, once subjected to a great deal of math, could be used to reconstruct the original contents of an erased disk to a certain degree of confidence.

This is pretty much exactly the thing everyone was worried about, and various demonstrations of this potential led to Peter Gutmann's influential 1996 paper "Secure Deletion of Data from Magnetic and Solid-State Memory." Gutmann looks at a lot of practical issues in the way storage devices work and, based on consideration of specific patterns that could remain considering different physical arrangements for data storage, proposes the perfect method of data erasure. The Gutmann Method, as it's sometimes called, is a 35-pass scheme of overwriting with both random data and fixed patterns.

The reason for the large number of passes is partially Just To Be Sure, but the fixed pattern overwrites are targeted at specific forms of encoding. The process is longer than strictly needed just because Gutmann believes that a general approach to the problem requires use of multiple erasure methods, one of which ought to be appropriate for the specific device in question. This is to say that Gutmann never really thought 35 passes were necessary. Rather, to put it pithily, he figured eight random passes would do and then multiplied all the encoding schemes together to get 27 passes that ought to even out the encoding-related patterns on the drives of the time.

Another way to make my point is this: Gutmann's paper is actually rather specific to the storage technology of the time, and the time was 1996. So there's no reason to work off of his conclusions today. Fortunately few people do, because a Gutmann wipe takes truly forever.

Another influential "standard" for overwriting for erasure is the "DoD wipe," which refers to 5220.22-M, also known as the National Industrial Security Program Operating Manual, also known as the NISPOM. I can say with a good degree of confidence that every single person who has ever invoked this standard has misunderstood it. It is not a standard, it is not applicable to you, and since 2006 it no longer makes any mention of a 3-pass wipe.

Practical data remanence

The concept of multi-pass overwrites for data sanitization is largely an obsolete one. This is true for several different reasons. Most prominently, the nature of storage devices has changed appreciably. The physical density of data recording has increased significantly. Drive heads now operate on magnetic coils and track dynamically rather than under absolute positioning (reducing error in tracking). And there are of course today many solid-state drives, which repeatedly overwrite data as a matter of normal operating procedure (but at the same time may leave a great deal of data available).

You don't need to take my word on this! Also in 2006, for example, the NIST issued new recommendations on sanitization stating that a single overwrite was sufficient. This may have been closely related to the 2006 NISPOM change. Gutmann himself published a note in 2011 that he no longer believes his famous method to be relevant and assumes a single overwrite to be sufficient.

Much of the discussion of recovery of overwritten data from magnetic media has long concentrated around various types of magnetic microscopes. Much like your elementary school friend whose uncle works for Nintendo, the matter is frequently discussed but seldom demonstrated. Without wanting to go too deep into review of literature and argumentative blog posts, I think it is a fairly safe assertion that recovery of data by means of electron microscopy, force microscopy, magnetic probe microscopy, etc is infeasible for any meaningful quantity of data without enormous resources.

The academic work that has demonstrated recovery of once-overwritten data by these techniques has generally consisted of extensive effort to recover a single bit at a low level of confidence. The error rate makes recovery of even a byte impractical. A useful discussion of this is in the ICISS 2008 conference paper "Overwriting Hard Drive Data: The Great Wiping Controversy," amusingly written in part by a man who would go on to claim (almost certainly falsely) to have invented Bitcoin. It's a strange world out there.

As far as summing up the issue, I enjoy the conclusion of a document written by litigation consultant Fred Cohen:

To date I have found no example of any instance in which digital data recorded on a hard disk drive and subsequently overwritten was recovered from such a drive since 1985... Indeed, there appears to be nobody in the [forensics and security litigation] community that disputes this result with any actual basis and no example of recovery of data from overwritten areas of modern disk drives. The only claims that there might be such a capability are based on notions surrounding possible capabilities in classified environments to which the individuals asserting such claims do not assert they have actual access and about which they claim no actual knowledge.

Recovery of overwritten data by microscopy is, in practice, a scary story to tell in the dark.

The takeaway here is that, for practical purposes, a single overwrite of data on a magnetic platter seems to be quite sufficient to prevent recovery.

It's not all platters

Here's the problem: in practice, remanence on magnetic media is no longer the thing to worry about.

The obvious reason is the extensive use of SSDs and other forms of flash memory in modern workstations and portable devices. The forensic qualities of SSDs are, to put it briefly, tremendously more complicated and more poorly understood than those of HDDs. To even skim the surface of this topic would require its own post (perhaps it will get it one day), but the important thing to know is that SSDs throw out all of the concerns around HDDs and introduce a whole set of new concerns.

The second reason, though, and perhaps a more pervasive one, is that the forensic properties of the magnetic platters themselves are well understood, but those of the rest of the HDD are not.

The fundamental problem in the case of both HDDs and SSDs is that modern storage devices are increasingly complex and rely on significant onboard software in order to manage the physical storage of data. The behavior of that onboard software is not disclosed by the manufacturer and is not well understood by the forensics community. In short, when you send data to an HDD or SSD, we know that it puts that data somewhere but in most cases we really don't know where it puts it. Even in HDDs there can be significant flash caching involved (especially on "fancier" drives). Extensive internal remapping in both HDDs and SSDs means that not all portions of the drive surface (or flash matrix, etc) are even exposed to the host system. In the case of SSDs, especially, large portions of the storage are not.

So that's where we end up in the modern world: storage devices have become so complex that the recovery methods of the 1980s no longer apply. By the same token, storage devices have become so complex that we can no longer confidently make any assertions about their actual behavior with regards to erasure or overwriting. A one-pass overwrite is both good enough at the platter level and clearly not good enough at the device level, because caches, remapping, wear leveling, etc all mean that there is no guarantee that a full overwrite actually overwrites anything important.
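For reference, this is roughly what a one-pass overwrite from the host looks like. It is a minimal sketch that zeroes an ordinary file in place (the function name and the temporary demo file are my own inventions). A real wipe would target the whole device rather than a file, and as discussed above, even then the host can only touch the blocks the drive chooses to expose: caches, remapped sectors, and wear-leveled flash are out of reach of code like this.

```python
import os
import tempfile

def overwrite_once(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file with zeros, in place.

    Returns the number of bytes overwritten. This only reaches the logical
    blocks the host can address; it cannot see spare or remapped areas.
    """
    size = os.path.getsize(path)
    zeros = bytes(chunk_size)
    with open(path, "r+b") as f:  # r+b: write in place without truncating
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the writes to the device
    return size

# Demonstration on a throwaway file with made-up contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sensitive record " * 1000)
    demo_path = tmp.name

bytes_overwritten = overwrite_once(demo_path)
with open(demo_path, "rb") as f:
    contents_after = f.read()
os.remove(demo_path)
```

Reading the file back shows only zeros, which is precisely the problem: the host's view looks clean whether or not the old data still exists somewhere the host cannot address.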

Recommended sanitization methods

Various authorities for technical security recommendations exist in the US, but the major two are the NIST and the NSA.

NIST 800-88, summarized briefly, recommends that sanitization be performed by degaussing, overwriting, physical destruction of the device, or encryption (we will return to this point later). The NIST organizes these methods into three levels, which are to be selected based on risk analysis, and physical destruction is the recommended method for high risk material or material where no method of reliable overwriting or degaussing is known.

NSA PM 9-12 requires sanitization by degaussing, disintegration, or incineration for "hard drives." Hard drives, in this context, are limited to devices with no non-volatile solid state memory. For any device with non-volatile solid state memory, disintegration or incineration is required. Disintegration is performed to a 2mm particle size, and incineration at 670 Celsius or better.

Degaussing, in practice, is surprisingly difficult. Effective degaussing of hard drives tends to require disassembly in order to individually degauss the platters, and so is difficult to perform at scale. Further, degaussing methods tend to be pretty sensitive to the exact way the degaussing is performed, making them hard to verify. The issue is big enough that the NSA requires that degaussing be followed by physical destruction of the drive, but to a lower standard than for disintegration (simple crushing is acceptable). For that reason, disintegration and incineration tend to be more common in government contexts.

It's sort of funny that I tell you all about how multiple overwrite passes are unnecessary but then tell you that accepted standards require that you blend the drive until it resembles a coarse glitter. "Data sanitization is easy," I say, chucking drives into a specialized machine with a 5-figure price tag.

The core of the issue is that the focus on magnetic remanence is missing the point. Research indicates that magnetic remanence is nowhere near the problem it is widely thought to be, and in practice remanence is not the way that data is sneaking out. The problem is not the physics of the platters, it's the complexity of the devices and the lack of reliable host access to the entire storage capacity.

ATA secure erase and self-encryption and who knows what else

The ATA command set, or rather some version of it, provides a low-level secure erase command that, in theory, causes the drive's own firmware to initiate an overwrite of the entire storage surface. This is far preferable to overwriting from the host system, because the drive firmware is aware of the actual physical storage topology and can overwrite parts of the storage that are not normally accessible to the host.

The problem is that drive manufacturers have been found to implement ATA secure erase sloppily, or not at all. There is basically no external means of auditing that a secure erase was performed effectively. For that simple reason, ATA secure erase should not be relied upon.

Another approach is the self-encrypting drive or SED, which transparently encrypts data as it is written. These devices are convenient since simply commanding the drive to throw away the key is sufficient. SED features tend to be better implemented than ATA secure erase because they are only implemented at all on high-end drives that are priced for the extra feature. That said, the external auditing problem still very much exists.

Another option is to encrypt at the host level, and then throw away the key at the host level. This is basically the same as the SED method but since the encryption is performed externally to the drive, the whole thing can be audited externally for assurance. In all reality this is a fine approach to data sanitization and should be implemented whenever possible. If you have ever been on the fence about whether or not to encrypt storage, consider this: if you are effective about encrypting your storage, you won't need to sanitize it later! The mere absence of the key is effective sanitization, as recognized by the NIST.
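The "throw away the key" idea can be sketched in a few lines of Python. This is a toy and nothing more: EncryptedVolume and keystream_xor are hypothetical names, and the cipher is an improvised SHAKE-256 keystream chosen only because it is in the standard library. Real host-level encryption should use a vetted tool, not this. The sketch just shows that once the key is gone, what remains on the drive is noise, no matter how badly the drive handles overwrites.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHAKE-256 keystream. Illustration only."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class EncryptedVolume:
    """Hypothetical host-encrypted volume; the drive only ever sees `stored`."""

    def __init__(self):
        self.key = secrets.token_bytes(32)  # lives on the host, never on the drive
        self.nonce = b""
        self.stored = b""                   # the bytes that actually hit the media

    def write(self, plaintext: bytes) -> None:
        if self.key is None:
            raise RuntimeError("key destroyed; volume is sanitized")
        self.nonce = secrets.token_bytes(16)  # fresh nonce for every write
        self.stored = keystream_xor(self.key, self.nonce, plaintext)

    def read(self) -> bytes:
        if self.key is None:
            raise RuntimeError("key destroyed; data is unrecoverable")
        return keystream_xor(self.key, self.nonce, self.stored)

    def crypto_erase(self) -> None:
        # Sanitization is just forgetting the key. The ciphertext can stay on
        # the drive, in remapped sectors, caches, wherever.
        self.key = None


vol = EncryptedVolume()
vol.write(b"patient records")
print("before erase:", vol.read())

vol.crypto_erase()
print("left on drive:", vol.stored.hex())
```

The design point is that the part you need to destroy shrinks from the whole drive to 32 bytes that the host controls, which is something you can actually audit.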

The problem is that disk encryption features in real devices are inconsistent. Drive encryption may not be available at all, or it may only be partial. This makes encryption difficult to rely on in most practical scenarios.

The bottom line

When you dispose of old electronics, you should perform due diligence to identify all non-volatile storage devices. These storage devices should be physically destroyed prior to disposal.

DIY methods like drilling through platters and hitting things with hammers are not ideal, but should be perfectly sufficient for real scenarios. Recovering data from partially damaged hard drives and SSDs is possible but not easy, and the number of facilities that perform that type of recovery is small. There are lots of ways to achieve this type of significant damage, from low-cost hand-cranked crushing devices to the New Mexican tradition of taking things out to the desert and shooting at them. Await my academic work on the merits of FMJ vs hollow-point for data sanitization. My assumption is that FMJ will be more effective due to penetration in multi-platter drives, but I might be overestimating the hardness of the media, or underestimating the number of rounds I will feel like putting into it.

Ideally, storage devices should be disintegrated, shredded, or incinerated. Unless you are looking forward to making a large quantity of thermite, these methods are difficult without expensive specialized equipment. However, there are plenty of vendors that offer certified storage destruction as a service. Ask your local shredding truck company about their rates for storage devices.

Most conveniently, do what I do: chuck all your old storage devices in a drawer, tell yourself you'll use them for something later, and forget about them. We'll call it long-term retention, or the geologic repository junk drawer.
