| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <title>Why do I need a shaping engine?: HarfBuzz Manual</title> |
| <meta name="generator" content="DocBook XSL Stylesheets V1.79.1"> |
| <link rel="home" href="index.html" title="HarfBuzz Manual"> |
| <link rel="up" href="what-is-harfbuzz.html" title="What is HarfBuzz?"> |
| <link rel="prev" href="what-is-harfbuzz.html" title="What is HarfBuzz?"> |
| <link rel="next" href="ch01s03.html" title="What does HarfBuzz do?"> |
| <meta name="generator" content="GTK-Doc V1.25 (XML mode)"> |
| <link rel="stylesheet" href="style.css" type="text/css"> |
| </head> |
| <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> |
| <table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle"> |
| <td width="100%" align="left" class="shortcuts"></td> |
| <td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td> |
| <td><a accesskey="u" href="what-is-harfbuzz.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td> |
| <td><a accesskey="p" href="what-is-harfbuzz.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td> |
| <td><a accesskey="n" href="ch01s03.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td> |
| </tr></table> |
| <div class="section"> |
| <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| <a name="why-do-i-need-a-shaping-engine"></a>Why do I need a shaping engine?</h2></div></div></div> |
| <p> |
| Text shaping is an integral part of preparing text for |
| display. Before a Unicode sequence can be rendered, the |
| codepoints in the sequence must be mapped to the corresponding |
| glyphs provided in the font, and those glyphs must be positioned |
| correctly relative to each other. For many of the scripts |
| supported in Unicode, these steps involve script-specific layout |
| rules, including complex joining, reordering, and positioning |
| behavior. Implementing these rules is the job of the shaping engine. |
| </p> |
| <p> |
| Text shaping is a fairly low-level operation. HarfBuzz is |
| used directly by text-handling libraries like <a class="ulink" href="https://www.pango.org/" target="_top">Pango</a>, as well as by the layout |
| engines in Firefox, LibreOffice, and Chromium. Unless you are |
| <span class="emphasis"><em>writing</em></span> one of these layout engines |
| yourself, you will probably not need to use HarfBuzz: normally, |
| a layout engine, toolkit, or other library will turn text into |
| glyphs for you. |
| </p> |
| <p> |
| However, if you <span class="emphasis"><em>are</em></span> writing a layout engine |
| or graphics library yourself, then you will need to perform text |
| shaping, and this is where HarfBuzz can help you. |
| </p> |
| <p> |
| Here are some specific scenarios where a text-shaping engine |
| like HarfBuzz helps you: |
| </p> |
| <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| <li class="listitem"> |
| <p> |
| OpenType fonts contain a set of glyphs (that is, shapes |
| to represent the letters, numbers, punctuation marks, and |
| all other symbols), which are indexed by a <code class="literal">glyph ID</code>. |
| </p> |
| <p> |
| A particular glyph ID within the font does not necessarily |
| correlate to a predictable Unicode codepoint. For instance, |
| some fonts have the letter "a" as glyph ID 1, but |
| many others do not. In order to retrieve the right glyph |
| from the font to display "a", you need to consult |
| the table inside the font (the <code class="literal">cmap</code> |
| table) that maps Unicode codepoints to glyph IDs. In other |
| words, <span class="emphasis"><em>text shaping turns codepoints into glyph |
| IDs</em></span>. |
| </p> |
| </li> |
| <li class="listitem"> |
| <p> |
| Many OpenType fonts contain ligatures: combinations of |
| characters that are rendered as a single unit. For instance, |
| it is common for the "f, i" letter |
| sequence to appear in print as the single ligature glyph |
| "fi". |
| </p> |
| <p> |
| Whether you should render an "f, i" sequence |
| as <code class="literal">fi</code> or as "fi" does not |
| depend on the input text. Instead, it depends on the whether |
| or not the font includes an "fi" glyph and on the |
| level of ligature application you wish to perform. The font |
| and the amount of ligature application used are under your |
| control. In other words, <span class="emphasis"><em>text shaping involves |
| querying the font's ligature tables and determining what |
| substitutions should be made</em></span>. |
| </p> |
| </li> |
| <li class="listitem"> |
| <p> |
| While ligatures like "fi" are optional typographic |
| refinements, some languages <span class="emphasis"><em>require</em></span> certain |
| substitutions to be made in order to display text correctly. |
| </p> |
| <p> |
| For example, in Tamil, when the letter "TTA" (ட) |
| letter is followed by "U" (உ), the pair |
| must be replaced by the single glyph "டு". The |
| sequence of Unicode characters "டஉ" needs to be |
| substituted with a single "டு" glyph from the |
| font. |
| </p> |
| <p> |
| But "டு" does not have a Unicode codepoint. To |
| find this glyph, you need to consult the table inside |
| the font (the <code class="literal">GSUB</code> table) that contains |
| substitution information. In other words, <span class="emphasis"><em>text shaping |
| chooses the correct glyph for a sequence of characters |
| provided</em></span>. |
| </p> |
| </li> |
| <li class="listitem"> |
| <p> |
| Similarly, each Arabic character has four different variants |
| corresponding to the different positions it might appear in |
| within a sequence. Inside a font, there will be separate |
| glyphs for the initial, medial, final, and isolated forms of |
| each letter, each at a different glyph ID. |
| </p> |
| <p> |
| Unicode only assigns one codepoint per character, so a |
| Unicode string will not tell you which glyph variant to use |
| for each character. To decide, you need to analyze the whole |
| string and determine the appropriate glyph for each character |
| based on its position. In other words, <span class="emphasis"><em>text |
| shaping chooses the correct form of the letter by its |
| position and returns the correct glyph from the font</em></span>. |
| </p> |
| </li> |
| <li class="listitem"> |
| <p> |
| Other languages involve marks and accents that need to be |
| rendered in specific positions relative a base character. For |
| instance, the Moldovan language includes the Cyrillic letter |
| "zhe" (ж) with a breve accent, like so: "ӂ". |
| </p> |
| <p> |
| Some fonts will provide this character as a single |
| zhe-with-breve glyph, but other fonts will not and, instead, |
| will expect the rendering engine to form the character by |
| superimposing the separate "ж" and "˘" |
| glyphs. |
| </p> |
| <p> |
| But exactly where you should draw the breve depends on the |
| height and width of the preceding zhe glyph. To find the |
| right position, you need to consult the table inside |
| the font (the <code class="literal">GPOS</code> table) that contains |
| positioning information. |
| In other words, <span class="emphasis"><em>text shaping tells you whether you |
| have a precomposed glyph within your font or if you need to |
| compose a glyph yourself out of combining marks—and, |
| if so, where to position those marks.</em></span> |
| </p> |
| </li> |
| </ul></div> |
| <p> |
| If tasks like these are something that you need to do, then you |
| need a text shaping engine. You could use Uniscribe if you are |
| writing Windows software; you could use CoreText on macOS; or |
| you could use HarfBuzz. |
| </p> |
| <div class="note"><p> |
| In the rest of this manual, the text will assume that the reader |
| is that implementor of a text-layout engine. |
| </p></div> |
| </div> |
| <div class="footer"> |
| <hr>Generated by GTK-Doc V1.25</div> |
| </body> |
| </html> |