Why indeed? Wouldn't something like &br;
be more appropriate?
An HTML entity reference is, depending on HTML version, either an SGML entity or an XML entity (HTML inherits entities from the underlying technology). Entities are a way of inserting chunks of content defined elsewhere into the document. All HTML entities are single-character entities, and are hence basically the same as character references (technically they are different to character references, but as there are no multi-character entities defined, the distinction has no impact on HTML). When an HTML processor sees, for example
So it replaces the entity reference with the entity. Now, what would we replace your hypothetical &br; with to cause a line-break to happen? We can't do so with a newline character, or even the lesser-known U+2028 LINE SEPARATOR (which semantically in plain text has the same meaning as What we need is not an entity, but a way to indicate semantically that the rendered content contains a line-break at this point. We also need to not indicate anything else (we can already indicate a line-break by beginning or ending a block element, but that's not what we want). The only reasonable way to do so is to have an element that means exactly that, and so we have the
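The replacement step described above can be illustrated with a minimal sketch using Python's standard html module. The sample strings are made up for illustration, and html.unescape follows HTML5 rules rather than any particular browser, so treat this as an approximation of what an HTML processor does with entity references.

```python
import html

# Each defined entity reference is swapped for the single character it names.
decoded = html.unescape("Fish &amp; chips &lt; &pound;5")

# "&br;" is not a defined entity, so there is nothing to substitute
# and the text is left exactly as written.
undefined = html.unescape("line one&br;line two")

print(decoded)    # Fish & chips < £5
print(undefined)  # line one&br;line two
```

Even if &br; were defined, it could only stand for a character, and no character carries the meaning "line break here" in HTML.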
A tag and a character entity reference exist for different reasons - character entities are stand-ins for certain characters (sometimes required as escape sequences - for example The reason the There is no single character that has this meaning, though See the answers from @John Kugelman and @John Hanna for more detail on this aspect. Not entirely related, there is another reason why a
Character entities are single character escapes, so cannot represent this, again in the HTML 4 spec:
You will see that all the defined character entities map to a single character. A line break/new line cannot be cleanly mapped this way, thus an entity is required instead of a character entity reference. This is why a line break cannot be represented by a character entity reference. Regardless, it is not needed, as simply using the Enter key inserts a line break.
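One way to check the "single character" claim is to inspect the HTML 4 entity table that ships with Python as html.entities.name2codepoint. This is only a sketch that reads the table from the standard library; it is not taken from the spec text quoted in the answer.

```python
from html.entities import name2codepoint

# name2codepoint maps each HTML 4 entity name to one Unicode code point,
# so every defined entity stands for exactly one character.
all_single = all(isinstance(cp, int) for cp in name2codepoint.values())

print(all_single)                   # True
print(chr(name2codepoint["amp"]))   # &
print("br" in name2codepoint)       # False
```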
Entities are stand-ins for other characters or bits of text. In HTML they are used to represent characters that are hard to type (e.g. It couldn't be An entity A
In HTML all line breaks are treated as white space:
White space only separates words, and sequences of white space are collapsed:
This means that line breaks cannot be expressed by plain characters. And although Unicode has special characters that unambiguously separate lines and paragraphs, they are not specified to do so in HTML:
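The white space rules above can be sketched in a few lines of Python. The collapse helper and the HTML_WS pattern are hypothetical names introduced here, and the character set is a simplification of the HTML 4 white space list (the zero-width space is left out), so this is an approximation rather than a renderer's actual algorithm.

```python
import re

# Simplified HTML 4 white space set: space, tab, line feed,
# form feed, carriage return (zero-width space omitted in this sketch).
HTML_WS = re.compile(r"[ \t\n\f\r]+")

def collapse(text: str) -> str:
    """Replace every run of HTML white space with a single space."""
    return HTML_WS.sub(" ", text)

print(collapse("first line\nsecond   line\r\n\tthird"))
# first line second line third
```

A typed newline ends up as an ordinary space, and U+2028 is not in the set at all, so this sketch passes it through untouched instead of treating it as a break.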
That means there is no plain character or sequence of plain characters that is to mark a line break in HTML. And that’s why there is the Now if you want to use
Having this additional entity named br declared, a general-purpose XML or SGML processor will replace every occurrence of the entity reference
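To see what a general-purpose XML processor does with such a declaration, here is a small sketch using Python's xml.etree.ElementTree. The declaration itself is an assumption: the original one is not preserved in this text, so the sketch declares br as standing for an empty br element purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD subset: the entity "br" expands to <br/>.
doc = """<!DOCTYPE p [
  <!ENTITY br "<br/>">
]>
<p>first line&br;second line</p>"""

root = ET.fromstring(doc)

# The parser has already replaced &br; with the declared content,
# so the tree contains a real br element between the two text runs.
print([child.tag for child in root])  # ['br']
print(root.text)                      # first line
print(root[0].tail)                   # second line
```

The line break in the result still comes from an element. The entity only acts as shorthand for writing it.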
Entities are content, tags are structure or layout (very roughly speaking). It seems whoever made the
HTML is a mark-up language - it represents the structure of a document, not how that document should appear visually. Take the The same goes for the
Yes. An HTML entity would be more appropriate, as a break tag cannot contain text and behaves much like a newline. That's just not the way things are, though. Too late. I can't tell you the number of non-XML-compatible HTML documents I've had to deal with because of unclosed break tags...
br tag should never be used. – Yi Jiang Aug 15 '10 at 16:27
pre) – tcooc Aug 15 '10 at 16:31