Windows and MS-DOS use the control characters CR-LF (carriage return, line feed) for new lines, while Unix uses just LF.

As far as I know, CR-LF made sense for systems controlling a real teletype, which has an actual carriage.
LF alone may make sense for teletypes with automatic carriage return, or simply as a simplification on systems that no longer need the physical interpretation of these characters.

Now I wonder why MS-DOS, a rather recent OS, uses CR-LF, while Unix, one of the OSes that actually controlled teletypes, uses only LF. It seems like it should be the other way round.

This is covered largely in the history section of Wikipedia’s entry on newlines. Basically there are two lineages of operating systems leading to modern-day Windows on the one hand, and Unix-like systems on the other.

Windows descends from MS-DOS (because initially it was implemented on top of DOS), which itself inherits much of its behaviour from CP/M. CP/M inherited its line-endings from DEC systems, which used CR+LF because that’s the character sequence required to move the cursor to the start of the next line on ASR-33 teletypes (among others), which were common teletypes used with DEC systems.

On most teletypes, CR and LF do just what their names imply: carriage return returns the carriage (carrying the paper) to the right, so the hammers or type head are above the left of the page (or equivalently, it returns the type head to the left of the page, depending on which part of the assembly is mobile), and line feed feeds the page one line up.

The order was important: carriage return takes some time to execute, so starting it first meant that the line feed would happen in parallel with the carriage return, and by the time the line feed was processed, the carriage had a decent chance of having finished, so the next character could be processed safely (otherwise it ended up smeared across part of the page as the carriage finished flying back).

Unix was inspired by Multics, whose developers chose LF as the line-ending character, relying on device drivers to translate that to whatever character sequence was required on actual devices. (I recall discussions on this topic where the idea was floated that the Multics developers did this in order to save disk space, but I’m not sure that’s accurate. Another possible reason is that relying on the device driver to handle this meant that each driver could adjust the timing as necessary, without the system having to care about it — CR+LF in particular was chosen partly for timing reasons.)
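
To make that driver-level translation concrete, here is a minimal sketch in C (the function name and buffer handling are mine, not taken from any Multics or Unix source): programs store and write lines ending in a bare LF, and the output layer expands each LF into the CR+LF a physical terminal needs.

    #include <stdio.h>
    #include <string.h>

    /* Illustrative sketch only -- not code from any actual Multics or Unix
     * driver.  Programs write lines that end in a bare LF ('\n'); the output
     * layer expands each LF into the CR+LF sequence a physical terminal needs.
     * `out` must have room for up to 2 * len bytes in the worst case. */
    static size_t expand_lf_to_crlf(const char *in, size_t len, char *out)
    {
        size_t n = 0;
        for (size_t i = 0; i < len; i++) {
            if (in[i] == '\n')
                out[n++] = '\r';      /* insert the CR the device expects */
            out[n++] = in[i];         /* then pass the original byte through */
        }
        return n;
    }

    int main(void)
    {
        const char *line = "hello\nworld\n";   /* stored with LF only */
        char device[64];
        size_t n = expand_lf_to_crlf(line, strlen(line), device);
        fwrite(device, 1, n, stdout);          /* what the terminal would receive */
        return 0;
    }

A real terminal driver would also deal with the timing issue mentioned above, for instance by inserting padding characters after the CR, which this sketch omits.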

Some systems used other conventions (see this table); many systems, including all of Apple’s computers before OS X, used a single CR, and obviously non-ASCII systems had their own line-ending characters (this includes IBM mainframes and 8-bit Atari home computers).

  • Maybe I'm just a punk kid, but the only printing terminals I ever saw attached to DEC systems were LA36 DecWriters. – james large May 3 at 20:25
  • Re, "...for timing reasons." Anybody remember seeing the smear on the paper where the type head on a Teletype machine struck the first character of the next line while the head was still in motion, "returning" from the previous line? – james large May 3 at 20:30
  • "Unix-style systems descend from Multics" - this is not true. Unix was inspired by Multics. Unix actually shipped before Multics and was developed because Multics took too long to develop. – slebetman May 4 at 0:33
  • @jameslarge the LA-36 was introduced too late to influence the CR+LF/LF design decisions (1975). DEC routinely sold Teletype hardware, including the ASR-33, and featured them prominently in their brochures (at least until they started building their own teletypes). – Stephen Kitt May 4 at 9:40
  • @jameslarge I remember when we replaced our ASR 33s with DECwriter LA-36s. I was really impressed at the speed of the LA-36. I remember telling a friend, "This thing is so fast, it can print FASTER THAN YOU CAN READ IT!" – Jay May 4 at 21:35

At the time the PC came out, there were at least five common approaches used by ASCII-based devices and systems:

  1. Devices receiving a CR would advance to the start of the next line, and lines were delineated with just a CR. An LF might behave identically, or might advance to the same spot on the next line, but it wouldn't usually matter because LF codes weren't used much. This approach allowed arbitrary binary graphic data to be included within files to be printed.

  2. Devices receiving an LF would advance to the start of the next line, while receipt of a CR would reset them to the start of the current line. Lines were delineated with LF; CR would generally only be used if necessary to overprint the current line. This approach allowed arbitrary binary graphic data to be included within files to be printed.

  3. Devices receiving a CR would reset to the start of the current line, and devices receiving an LF would either advance to the start of the next line or to the same position on the next line. Lines were delineated with CR+LF--a mode of behavior which was inherently compatible with equipment of types #2 or #3. This approach allowed arbitrary binary graphic data to be included within files to be printed.

  4. Lines were delineated with just CR, but devices of types #2 and #3 would be accommodated by replacing any instances of CR with CR+LF. This approach would be prone to malfunction when printing files containing binary graphics data.

  5. Lines were delineated with just LF, but devices of types #2 and #3 would be accommodated by replacing any instances of LF with CR+LF. This approach would be prone to malfunction when printing files containing binary graphics data.

Approach #4 was used by the Apple II among others; approach #5 was used by Unix. When the PC came out, however, many popular printers, including the Epson MX-80, were configurable to process CR and LF using approach #1 or #3, but not #2, and they also handled bitmap graphics with a command that would take a specified number of bytes as binary pixel data that needed to be sent verbatim even if it contained the bit patterns 00001101 or 00001010. The fact that printers would have problems with #2, #4, or #5 meant that if MS-DOS wanted to be suitable for use with such printers it would need to adopt approach #1 or #3. Of those choices, #1 is slightly more efficient but #3 offers more efficient overprinting.
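
As a rough illustration of why approaches #4 and #5 are prone to malfunction with binary graphics (while #1 and #3 are not), here is a small C sketch. The graphics escape sequence in it is invented for the example, not the actual Epson MX-80 command; the point is only that a filter which rewrites every LF byte cannot distinguish a line ending from pixel data that happens to contain 0x0A.

    #include <stdio.h>

    /* Sketch of why approaches #4/#5 clash with binary graphics: a filter that
     * blindly rewrites every LF byte (0x0A) as CR+LF cannot tell a line ending
     * from a pixel byte that merely happens to be 0x0A.  The "graphics command"
     * below is a made-up stand-in, not the exact Epson MX-80 escape sequence. */
    static void send_with_lf_expansion(const unsigned char *data, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            if (data[i] == 0x0A)
                putchar(0x0D);        /* a bogus CR injected into the pixel data */
            putchar(data[i]);
        }
    }

    int main(void)
    {
        /* Hypothetical print job: "enter graphics mode, 4 pixel bytes follow",
         * then an ordinary LF line ending. */
        unsigned char job[] = { 0x1B, 'G', 4, 0x55, 0x0A, 0xFF, 0x81, '\n' };
        send_with_lf_expansion(job, sizeof job);   /* pixel byte 0x0A is corrupted */
        return 0;
    }

Under approach #1 or #3 the file is sent verbatim, so the same bytes would reach the printer untouched.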

  • This implies MS-DOS was designed around certain printers - any reference for that? Surely the MS-DOS design follows CP/M and so predates those printers – Mark May 4 at 22:12
  • @Mark: Almost any common printer that could handle ASCII would work usably when files delimited with CR+LF were sent to it verbatim. At absolute worst, it would produce double-spaced output. Sending CR+LF wasn't necessarily the most useful format, but it was the closest thing to a universal one. BTW, Adobe PostScript uses the closest thing to a universal receiver: treat either CR or LF as a newline, except that an LF will be ignored if preceded by a non-ignored CR, and a CR will be ignored if preceded by a non-ignored LF (see the sketch after these comments). – supercat May 4 at 22:52
  • ... and Unix ended up using pretty much only PS to talk to printers (as the application-level printing language). That’s not a factor here of course since PS came later. The interesting angle in the DOS/Unix comparison wrt printers is that Unix was specifically built to support a typesetting system... – Stephen Kitt May 5 at 13:16
  • @StephenKitt: In Unix, routing all printer output through a printing utility program made sense. The printing program could run concurrently with other programs, and could ensure that different users' print jobs didn't collide with each other. It does, however, limit the range of usable printer features to those the program knows about. When MS-DOS came out, the personal printer market was very much in a state of flux, and the only cheap and practical way for MS-DOS to treat the printer was as a pipe to which bytes are sent. – supercat May 5 at 16:59
  • @supercat I agree, the small printer market was too much in a state of flux; I just thought the comparison was amusing. It’s also interesting that one of the first TSRs for DOS was a print spooler, PRINT. – Stephen Kitt May 5 at 17:19
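
For completeness, here is a rough sketch of the PostScript-style "universal receiver" rule supercat describes in the comments above, written as I understand it rather than taken from any PostScript interpreter: a lone CR or LF ends a line, while CR+LF or LF+CR count as a single line ending.

    #include <stdio.h>

    /* Sketch of the "universal receiver" newline rule: CR or LF alone ends a
     * line, and CR+LF or LF+CR count as a single line ending.  Illustrative
     * only; not taken from any actual PostScript interpreter. */
    static int count_lines(const char *s)
    {
        int lines = 0;
        char prev = 0;   /* previous byte if it ended a line, otherwise 0 */
        for (; *s; s++) {
            if (*s == '\r' || *s == '\n') {
                if ((prev == '\r' && *s == '\n') || (prev == '\n' && *s == '\r')) {
                    prev = 0;            /* second half of a pair: ignored */
                } else {
                    lines++;             /* a genuine line ending */
                    prev = *s;
                }
            } else {
                prev = 0;
            }
        }
        return lines;
    }

    int main(void)
    {
        printf("%d %d %d\n",
               count_lines("a\r\nb\r\n"),   /* 2: CR+LF pairs */
               count_lines("a\nb\n"),       /* 2: bare LF */
               count_lines("a\r\rb\r"));    /* 3: bare CRs, including a blank line */
        return 0;
    }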
