Newsgroups: comp.std.internat
From: ojarnef@admin.kth.se (Olle Jarnefors)
Subject: The different space characters of ISO 10646/Unicode
Organization: Royal Institute of Technology, Stockholm (Kungl Tekniska Hogskolan)
Date: 08 Mar 1995 19:31:23 MET

Clive D.W. Feather <clive@sco.com> wrote (Fri, 3 Mar 1995 03:52:33 GMT) in article <D4uIrn.M6q@scone.london.sco.com>:

> Well, many years ago I used to work with real lead type (letterpress
> type). With this, an em space was a square bit of type. An en space is
> half a square - it's the normal interword spacing.

> Em quad and en quad should be a filled-in square and half-square, not a
> space, if I recall correctly.

That's interesting. ISO 10646 in no way indicates that 2000 EN QUAD and 2001 EM QUAD should be filled characters. In code table 35 they are displayed in exactly the same way as all the other space characters.

The Unicode book explicitly says that these characters are spaces:

"_Typographical Space Characters_. Spaces all have the semantics of being word-break characters. Other than that, the main difference is in the width of the characters. U+2000 -> U+2006 are standard quad widths used in typography. The figure space is provided for use in some languages as a thousands separator. [This implies that it should be narrower than a digit. A quarter of an em space is preferred in Swedish typography, I've been told. /OJ] The punctuation space is a space defined to be the same width as the period. The thin space and the hair space are successively smaller-width spaces used for narrow word gaps and for justification of type. All the fixed-width space characters are derived from conventional (hot lead) typography. Their functions are mostly replaced by algorithmic kerning and justification in computerized typography."

The TeX system for computerized typography seems to have these kinds of spaces (from my cursory reading of The TeXbook):

\qquad         double quad
\quad          quad
\enskip        half a quad
\enspace       half a quad, non-breaking
\;             thick space, 5/18 of a quad, in mathematical formulas
\>             medium space, 2/9 of a quad, in mathematical formulas
\thinspace     1/6 of a quad, non-breaking
\,             thin space, 1/6 of a quad, in mathematical formulas
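
The fractions in this list can be checked with a little arithmetic. The sketch below assumes that plain TeX's math spaces are 3, 4 and 5 math units wide, with 18 math units to the quad, and it looks only at natural widths, ignoring any stretch or shrink. The dictionary name tex_spaces is just for illustration.

```python
from fractions import Fraction

# Assumption: one math unit (mu) is 1/18 of a quad.
MU = Fraction(1, 18)

# Natural widths in quads; stretch and shrink are ignored.
tex_spaces = {
    r"\qquad": Fraction(2),
    r"\quad": Fraction(1),
    r"\enskip": Fraction(1, 2),
    r"\enspace": Fraction(1, 2),
    r"\;": 5 * MU,            # thick space
    r"\>": 4 * MU,            # medium space
    r"\thinspace": Fraction(1, 6),
    r"\,": 3 * MU,            # thin space
}

for macro, width in tex_spaces.items():
    print(f"{macro:<12}{str(width):>5} quad")
```

Exact fractions make it plain that 4/18 reduces to 2/9 and 3/18 to 1/6, so \, and \thinspace have the same natural width.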

This corresponds pretty well to the 10646/Unicode spaces:

2000 EN QUAD
2001 EM QUAD
2002 EN SPACE              \enskip  \enspace
2003 EM SPACE              \quad
2004 THREE-PER-EM SPACE    approximately \;
2005 FOUR-PER-EM SPACE     approximately \>
2006 SIX-PER-EM SPACE      \thinspace  \,
2007 FIGURE SPACE
2008 PUNCTUATION SPACE
2009 THIN SPACE
200A HAIR SPACE
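
How close "approximately" is can be worked out. The sketch below first prints the character names for the hexadecimal range 2000 to 200A from Python's unicodedata module, then compares widths. It assumes that the nominal Unicode widths can be read off the character names (1/3, 1/4 and 1/6 em) and that the TeX math spaces have natural widths of 5/18, 4/18 and 3/18 quad. The variable names are only illustrative.

```python
import unicodedata
from fractions import Fraction

# The code points are hexadecimal: 2000 through 200A, eleven characters.
names = {cp: unicodedata.name(chr(cp)) for cp in range(0x2000, 0x200B)}
for cp, name in names.items():
    print(f"{cp:04X} {name}")

# (Unicode name, nominal width in ems, TeX macro, natural TeX width in quads)
# Assumption: the nominal widths are taken literally from the names.
comparison = [
    ("THREE-PER-EM SPACE", Fraction(1, 3), r"\;", Fraction(5, 18)),
    ("FOUR-PER-EM SPACE", Fraction(1, 4), r"\>", Fraction(4, 18)),
    ("SIX-PER-EM SPACE", Fraction(1, 6), r"\,", Fraction(3, 18)),
]

differences = {}
for name, uni_width, macro, tex_width in comparison:
    differences[name] = uni_width - tex_width
    print(f"{name:<20} {uni_width} em, {macro} is {tex_width} em, "
          f"difference {differences[name]} em")
```

Under those assumptions \; is 1/18 em narrower than THREE-PER-EM SPACE, \> is 1/36 em narrower than FOUR-PER-EM SPACE, and \, matches SIX-PER-EM SPACE exactly.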

The TeX pairing of \enskip (breaking) with \enspace (non-breaking) suggests, however, another possible distinction between EM QUAD and EN QUAD on the one hand and EM SPACE and EN SPACE on the other. Maybe the former are "breaking" spaces and the latter non-breaking? (All narrower spaces would then also be non-breaking.)
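
One way to test this guess is to look at the decomposition mappings in a present-day Unicode character database. The sketch below uses the data that ships with Python's unicodedata module, which is an assumption on my part about what counts as authoritative here and is not the 1995 text of either standard. It prints the mapping for each space and collects those tagged <noBreak>.

```python
import unicodedata

# Decomposition mapping of each space character, 2000 through 200A.
decompositions = {
    cp: unicodedata.decomposition(chr(cp)) for cp in range(0x2000, 0x200B)
}

for cp, mapping in decompositions.items():
    print(f"{cp:04X} {unicodedata.name(chr(cp)):<20} {mapping}")

# Spaces whose mapping carries the <noBreak> tag.
non_breaking = [
    cp for cp, mapping in decompositions.items()
    if mapping.startswith("<noBreak>")
]
print([f"{cp:04X}" for cp in non_breaking])
```

In that database EN QUAD and EM QUAD map directly to EN SPACE and EM SPACE, and FIGURE SPACE is the only one in the range tagged <noBreak>, so the data does not seem to support a breaking versus non-breaking split between the quads and the spaces.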

--
Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@admin.kth.se>