9.1 Text terminology
- Character
- A character, as far as an XML document is concerned, is a unit of text identified
by a numeric value defined in the Unicode standard and stored as one or more bytes.
For example, what we call the letter "g" is the character with Unicode value 103.
- Glyph
- A glyph is the visible representation of a character or characters.
A single character can have many different glyphs to represent it; for example, the letter "g" looks different when drawn in two different typefaces.
Multiple characters can reduce to a single glyph; some fonts have separate glyphs for the letter combinations "fl" and "ff" to make their spacing look better (these are called ligatures).
Other times, a single character can be composed of multiple glyphs; a print program might create the character é (which has Unicode value 233) by combining the "e" glyph with a non-spacing accent mark "´".
- Font
- A collection of glyphs representing a certain set of characters.
- Glyph measurements
- All the glyphs in a font will normally have the following characteristics in common:
- baseline - all the glyphs in a font line up on the baseline
- ascent - the distance from the baseline to the top of the character
- descent - the distance from the baseline to the bottom of the character
The total height of the character (the ascent plus the descent) is also called the em-height.
The em-box is a square whose sides are each one em-height long.
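
The terms above can be checked with a short Python sketch. It is only an illustration, not part of any SVG API: it uses the standard `unicodedata` module to show the Unicode values quoted above and to split é into a base letter plus a combining accent. That split happens at the character level, which is analogous to (but not the same as) a print program combining glyphs. The `ascent` and `descent` numbers are hypothetical values chosen for the example, and the em-height is computed following the definition given in this section.

```python
import unicodedata

# A character is identified by its Unicode value.
g_value = ord("g")              # the letter "g"
e_acute_value = ord("\u00e9")   # the precomposed character é

# Canonical decomposition (NFD) splits é into a base letter
# followed by a non-spacing (combining) accent mark.
decomposed = unicodedata.normalize("NFD", "\u00e9")
part_names = [unicodedata.name(ch) for ch in decomposed]

# Canonical composition (NFC) joins the two back into one character.
recomposed = unicodedata.normalize("NFC", decomposed)

# Hypothetical glyph measurements, in arbitrary font units (assumed values).
ascent = 800    # baseline to top of the character
descent = 200   # baseline to bottom of the character
em_height = ascent + descent
em_box = (em_height, em_height)  # a square, as wide as it is tall

print(g_value, e_acute_value)
print(part_names)
print(recomposed == "\u00e9")
print(em_height, em_box)
```

With the assumed measurements, the em-box comes out as a 1000 by 1000 square; real fonts report their own values.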