
Space (punctuation)

A space is a punctuation convention that separates words in some scripts, including Latin, Greek, Cyrillic, and Arabic.

Not all languages use spaces between words; ancient Latin and Greek did not. Spaces were not used to separate words until roughly 600 to 800 AD. (See interword separation for more on the history.) Traditionally, the CJK languages were written without spaces: modern Chinese and Japanese (except when written with little or no kanji) still are, but modern Korean uses spaces.

For use of spaces after full stops, exclamation marks, and question marks, see discussion in the article Full stop.

Spaces and computers

In programming language syntax, spaces are frequently used to explicitly separate tokens. Aside from this use, most modern programming languages ignore spaces and other whitespace characters. Exceptions include Haskell, ABC, and Python, which use the amount of whitespace in indentation to indicate the scope of a block. Algol-derived languages instead mark blocks explicitly: Pascal with the keywords begin and end, C and Perl with braces.
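Both roles of whitespace described above can be observed with Python's standard tokenize module. The following is a minimal sketch rather than a description of how any particular compiler works; the helper name tokens and the sample snippets are illustrative assumptions.

```python
import io
import tokenize


def tokens(source):
    """Return (type name, text) pairs produced by Python's own tokenizer."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


# Extra spaces between tokens do not change the token stream...
tight = tokens("y=1\n")
loose = tokens("y   =   1\n")

# ...but a space that splits a name does: one NAME token becomes two.
one_name = tokens("ab\n")
two_names = tokens("a b\n")

# Leading whitespace is significant: it yields INDENT and DEDENT tokens.
block = tokens("if x:\n    y = 1\nz = 2\n")
indent_tokens = [name for name, _ in block if name in ("INDENT", "DEDENT")]
print(tight == loose, indent_tokens)
```

The tokenizer only splits text into tokens and does not check grammar, which is why the snippet "a b" can be tokenized even though it is not a valid statement.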

In word processors and text editors, if a line on screen is shorter than the width of the screen or window, the empty area to its right usually does not correspond to space characters in the file: there is simply a code indicating that the following text starts on a new line. The file is therefore no larger than it needs to be. If trailing space characters are present, one usually cannot see the difference, although text editors and word processors often have an option to make them visible. The cursor can also move onto a space character, whereas it usually cannot move into the empty area past the end of a line.
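The difference between empty screen area and stored space characters can be made concrete with a short sketch. The function name show_spaces and the sample strings are assumptions for illustration; the stand-in character is the Open Box (U+2423), one of the visible symbols Unicode provides for this purpose.

```python
OPEN_BOX = "\u2423"  # visible stand-in for a space character


def show_spaces(text):
    """Replace every space character with a visible Open Box."""
    return text.replace(" ", OPEN_BOX)


# The same two lines of text, stored without and with trailing spaces.
unpadded = "short line\nnext\n"
padded = "short line" + " " * 70 + "\nnext\n"

# Each trailing space costs one byte in an ASCII-encoded file.
size_unpadded = len(unpadded.encode("ascii"))
size_padded = len(padded.encode("ascii"))

print(size_unpadded, size_padded)
print(show_spaces("a b "))
```

On screen the two versions look alike, but the padded one is 70 bytes larger, one byte for each stored space.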

Spaces and digital typography

In computing, the normal space corresponds to Unicode and ASCII character 32, or U+0020. In HTML and XML, multiple spaces or newline characters collapse into a single space unless they are contained in an HTML element such as pre, the XML attribute xml:space="preserve" is used, or CSS sets the white-space property to pre or pre-wrap (the value pre-line preserves line breaks but still collapses runs of spaces). The non-breaking space entity &nbsp; always gives a non-collapsible space character; it is often used to indent text, though some web authorities discourage using it for that purpose.
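The collapsing behaviour can be approximated in a few lines. This is a simplified sketch, not a browser's actual layout algorithm (which also handles line edges and element boundaries); the name collapse_whitespace is an assumption for illustration.

```python
import re

# ASCII whitespace that HTML treats as collapsible:
# space, tab, line feed, form feed, carriage return.
_COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")

NBSP = "\u00a0"  # the character that the &nbsp; entity stands for


def collapse_whitespace(text):
    """Replace each run of collapsible whitespace with a single space."""
    return _COLLAPSIBLE.sub(" ", text)


collapsed = collapse_whitespace("left   \n   right")
kept = collapse_whitespace("left" + NBSP * 3 + "right")

print(repr(collapsed))
print(repr(kept))
```

The no-break space is deliberately absent from the pattern, so runs of it pass through unchanged, which is why it survives where ordinary spaces would collapse.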

Other kinds of spaces exist for special uses. For example, an em dash can optionally be surrounded by a so-called hair space, Unicode character 8202, or U+200A. This space should be much thinner than a normal space and is seldom used on its own. It can be written in HTML with the numeric character reference &#8202; or &#x200A;. At the time of writing, very few user agents are able to render a hair space correctly: in most cases the result is an unwanted symbol or a question mark on the screen (depending on the font).
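The decimal and hexadecimal forms name the same character, which can be checked with Python's standard html module. This is a small sketch; the variable names are illustrative.

```python
import html

HAIR_SPACE = "\u200a"  # code point 8202 in decimal, 200A in hexadecimal
EM_DASH = "\u2014"

# Both numeric character references decode to the same single character.
decimal_ref = html.unescape("&#8202;")
hex_ref = html.unescape("&#x200A;")

# An em dash set off by hair spaces on both sides.
spaced_dash = "left" + HAIR_SPACE + EM_DASH + HAIR_SPACE + "right"

print(ord(HAIR_SPACE), hex(ord(HAIR_SPACE)))
print(decimal_ref == hex_ref == HAIR_SPACE)
print(len(spaced_dash))
```

Whether the result looks right still depends on the font and renderer used to display it.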

Normal space versus hair space
Normal space | left right
Normal space with em dash | left — right
Hair space with em dash | left — right
No space with em dash | left—right

Unicode defines several space characters for fine typography. Depending on the browser and fonts used to view this table, not all spaces may display properly:

Space characters defined in Unicode
Code | HTML entity | Name | Block | Display | Description
U+0020 | not necessary | Space | Basic Latin | ] [ | Normal space, same as ASCII character 0x20
U+00A0 | &nbsp; or &#xA0; | No-Break Space | Latin-1 Supplement | ] [ | Identical to U+0020, but not a point at which a line may be broken
U+1680 | &#x1680; | Ogham Space Mark | Ogham | ] [ | Used for interword separation in Ogham text. Normally a vertical line in vertical text or a horizontal line in horizontal text, but may also be a blank space in "stemless" fonts. Requires an Ogham font.
U+2002 | &#x2002; | En Space, or Nut | General Punctuation | ] [ | Width of one en (half of one em)
U+2003 | &#x2003; | Em Space, or Mutton | General Punctuation | ] [ | Width of one em
U+2004 | &#x2004; | Three-Per-Em Space, or Thick Space | General Punctuation | ] [ | One third of an em wide
U+2005 | &#x2005; | Four-Per-Em Space, or Mid Space | General Punctuation | ] [ | One fourth of an em wide
U+2006 | &#x2006; | Six-Per-Em Space | General Punctuation | ] [ | One sixth of an em wide
U+2007 | &#x2007; | Figure Space | General Punctuation | ] [ | In fonts with monospaced digits, equal to the width of one digit
U+2008 | &#x2008; | Punctuation Space | General Punctuation | ] [ | As wide as the narrow punctuation in a font
U+2009 | &#x2009; | Thin Space | General Punctuation | ] [ | Typically one fifth (sometimes one sixth) of an em wide
U+200A | &#x200A; | Hair Space | General Punctuation | ] [ | Thinner than a thin space
U+200B | &#x200B; | Zero-Width Space | General Punctuation | ]​[ | Used to indicate word boundaries to text processing systems when using scripts that do not use explicit spacing; normally not a visible separation, but it may expand in passages that are fully justified
U+202F | &#x202F; | Narrow No-Break Space | General Punctuation | ] [ | Similar to U+00A0 No-Break Space, but narrower
U+205F | &#x205F; | Medium Mathematical Space | General Punctuation | ] [ | Used in mathematical formulae
U+3000 | &#x3000; | Ideographic Space | CJK Symbols and Punctuation | ]　[ | As wide as a CJK character cell
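The code points listed in the table above can be inspected programmatically with Python's standard unicodedata module. This is a sketch for exploration only; the list name SPACE_CODE_POINTS and the helper describe are assumptions, and the reported names and categories come from the Unicode data bundled with the Python version in use.

```python
import unicodedata

# Code points taken from the table of Unicode space characters.
SPACE_CODE_POINTS = [
    0x0020, 0x00A0, 0x1680, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
    0x2007, 0x2008, 0x2009, 0x200A, 0x200B, 0x202F, 0x205F, 0x3000,
]


def describe(code_point):
    """Return (label, Unicode name, general category, str.isspace result)."""
    ch = chr(code_point)
    return ("U+%04X" % code_point,
            unicodedata.name(ch),
            unicodedata.category(ch),
            ch.isspace())


rows = [describe(cp) for cp in SPACE_CODE_POINTS]
for label, name, category, is_space in rows:
    print(label, name, category, is_space)
```

In recent Python versions, the Zero-Width Space (U+200B) is reported with the format category Cf rather than the space separator category Zs, and str.isspace returns False for it, so code that tests for whitespace may not treat it like the other entries.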

Unicode also provides some visible characters to stand in for space when necessary in the "Control Pictures" block: the Symbol For Space ␠ (U+2420), the Blank Symbol ␢ (U+2422), and the Open Box ␣ (U+2423).

Last updated: 06-02-2005 13:33:42
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.