Coding Adventure: Rendering Text
Summary
TLDRIn this Coding Adventures episode, the focus is on the intricacies of rendering text using TrueType font files. The host delves into the complexities of font file formats, exploring the challenges of decoding and rendering glyphs. Through trial and error, various methods are tested, including parsing font tables, handling bezier curves, and implementing anti-aliasing for smoother edges. The video also touches on the patent issues surrounding certain rendering techniques and the host's journey in overcoming various rendering artifacts and issues with overlapping shapes and complex fonts.
Takeaways
- ð The video is about the process of rendering text using font files and the complexities involved in ensuring clear and legible display.
- ð The journey begins with understanding the TrueType font format and the importance of tables like 'glyf' for storing glyph shapes.
- ð Parsing font files involves dealing with endianness issues, byte offsets, and various data structures that define the properties and contours of glyphs.
- ðš Rendering glyphs involves interpreting the font's bytecode instructions and handling the intricacies of bezier curves and line segments.
- ð The importance of handling compound glyphs and understanding the mapping between glyph indices and unicode values is highlighted.
- ð The video emphasizes the challenge of rendering text at low resolutions and the need for techniques like anti-aliasing to improve legibility.
- ð¡ The exploration of different rendering techniques, such as using textures, signed distance fields, and direct bezier rendering, is discussed.
- ð Debugging font rendering issues involves creating 'evil artifact detectors' and refining algorithms to handle edge cases and font peculiarities.
- ð The video showcases the improvement in performance and visual quality through the use of shaders and optimized algorithms.
- ð The script concludes with the acknowledgment of remaining challenges in rendering complex fonts and the need for further exploration and refinement.
- ð The content is a rich resource for understanding the technical aspects of font rendering and the trade-offs involved in different approaches.
Q & A
What is the primary focus of the video?
-The primary focus of the video is to explore the process of rendering text using font files, specifically focusing on the technical aspects of decoding and rendering fonts, and handling various challenges such as anti-aliasing and overlapping shapes.
Which font file format does the video begin with?
-The video begins with the TrueType font file format, which was developed by Apple in the late 1980s.
What is the significance of the 'glyf' table in a font file?
-The 'glyf' table is significant because it stores the shapes of the glyphs, or characters, in the font.
What is the main problem that the video tries to address with low-resolution text rendering?
-The main problem addressed is ensuring that text is displayed or printed clearly at low resolutions, where unlucky scaling could cause parts of a letter to appear twice as thick as others, making the letters difficult to read.
How does the video tackle the endianness issue when reading font tables?
-The video creates a FontReader class in C# that handles the conversion from big endian to little endian, allowing the program to correctly read the font table data.
What is the role of the 'head' and 'maxp' tables in a font file?
-The 'head' table contains metadata about the font, while the 'maxp' table provides information about the maximum number of glyphs and other data useful for font processing.
What is a compound glyph, and how is it handled in the video?
-A compound glyph is a character made up of other glyphs. In the video, compound glyphs are handled by reading in each component glyph and combining their points and contour end indices into a single glyph.
How does the video address the issue of text legibility at small sizes?
-The video acknowledges that legibility at small sizes is a challenge and suggests that delving into the bytecode interpreter aspects of font rendering might be necessary for improvement, though it does not provide a concrete solution within the scope of the video.
What is the basic idea behind the anti-aliasing technique implemented in the video?
-The basic idea behind the anti-aliasing technique is to use a grid of points within each pixel and calculate a coverage value based on how many of those points are inside the glyph, resulting in smoother edges and improved appearance at larger magnifications.
How does the video propose to optimize the rendering performance for complex fonts?
-The video suggests optimizing the rendering performance by splitting each glyph into multiple bands so that each pixel only has to test the bezier segments inside its band, and by constructing tighter fitting meshes tailored for each individual glyph through pre-processing.
Outlines
ð Introduction to Font Rendering
The video begins with the host expressing excitement about exploring the process of rendering text, starting with the need to select a font. The chosen font, JetBrainsMono-Bold, is opened in a text editor to inspect its contents, which are found to be unintelligible. The host decides to refer to a reference manual to understand the TrueType font format, developed by Apple in the 1980s. The complexity of the font file is discussed, including its tables and bytecode instructions. The focus is on the 'glyf' table, which is believed to contain the shapes of the characters. The host encounters an issue with the number of tables read from the font file but resolves it by understanding the endianness of the file format. The video then delves into the structure of the tables and how to interpret them.
ð Understanding the Font Table Structure
The host continues to explore the font tables, focusing on the table directory and the metadata it provides. The discussion includes the identification tags and byte offsets of the tables. The host experiences initial difficulties with reading the tags due to skipping over unnecessary data. After adjusting the code, the host successfully reads the tags and identifies the glyph table entry. The structure of the glyph data is explained, including contours, bounding boxes, and instructions. The host also discusses the concept of 'FWords' and how they relate to the glyph data. The video then demonstrates how to read the glyph data and plot it in the Unity engine, revealing a distorted character glyph. The host refines the code to correctly display the glyphs and expresses eagerness to render all characters in the font.
ðš Bezier Curves and Glyph Transformation
The host introduces Bezier curves as a method to smooth out the blocky appearance of the glyphs. A brief explanation of Bezier curves is provided, including the concept of control points and how they affect the curvature of a line. The host visualizes the curves with a creative lazer-beam analogy. The mathematical aspect of Bezier curves is simplified with a linear interpolation function. The video then shows how to implement Bezier curves in the code to replace the blocky glyphs with smooth curves. The host tests the code with a 'Hello World!' message, but encounters issues with character mapping and spacing. The video ends with the host planning to address these issues in future videos.
ð Debugging and Refining Glyph Rendering
The host addresses the issues from the previous attempt by implementing a character map to correctly display the 'Hello World!' message. The video discusses the various formats of character mapping and focuses on the most common ones. The host writes code to handle these mappings and successfully displays the message. However, the host notices that certain letters are missing and deduces that they are compound glyphs. The video explains the concept of compound glyphs and how they are stored in the font file. The host implements a solution to render compound glyphs and tests it with lower-case letters and diacritics. The video concludes with the host planning to test different fonts and explore more rendering techniques.
ðïž Advanced Rendering Techniques
The host experiments with different rendering techniques for text, starting with a discussion on increasing the line thickness to create solid shapes. The video explores polygon triangulation and the use of a shader to fill in the curves. The host also considers the idea of pre-rendering glyph meshes to a texture file for efficient rendering. The video then discusses the concept of signed distance fields for text rendering and their benefits in scaling. The host concludes that direct rendering from Bezier data is preferred for quality and begins to discuss the challenges of text rendering, including the need for anti-aliasing and handling overlapping shapes.
ð Artifact Detection and Algorithm Refinement
The host creates an 'artifact detector' to identify and quantify rendering issues. The video explains the process of testing each pixel within a glyph and how the results are used to identify artifacts. The host identifies a flickering issue when zooming and panning the text and attempts to understand its cause. The video delves into the problem of double-counting intersections and presents a solution to record and ignore close intersection points. The host also discusses the limitations of the 'closest curve' approach and considers returning to the intersection counting method. The video concludes with the host expressing a desire to improve the rendering algorithm further.
ð Revising the Rendering Approach
The host revisits the intersection counting method, using insights gained from debugging to refine the approach. The video explains the new method for skipping curves and handling curves going in opposite directions. The host addresses issues with overlapping shapes and introduces a method to track the closest distance to the exit of shapes. The video demonstrates the improved rendering with different fonts and discusses the challenges of legibility at small sizes. The host acknowledges the limitations of the current approach and suggests potential optimizations. The video concludes with a test of the latest rendering method on a large text wall and a reflection on the journey of text rendering exploration.
Mindmap
Keywords
ð¡Font Rendering
ð¡TrueType Font (TTF)
ð¡Glyph
ð¡Endianness
ð¡Bytecode Instructions
ð¡Font Directory
ð¡Contours
ð¡Bytecode
ð¡Font Reader Class
ð¡Glyph Table
ð¡Bezier Curves
Highlights
Exploring the complexities of rendering text from font files, specifically focusing on the challenges of interpreting TrueType Font (TTF) files.
Discussing the historical development of TTF files, originally created by Apple in the late 1980s.
The importance of understanding the structure of font files, including mandatory tables like 'glyf' that store glyph shapes.
Addressing endianness issues when reading font files, with a focus on big endian and little endian systems.
Decoding the font directory and table entries, including handling 4-letter tags and byte offsets.
Understanding the concept of 'contours' that make up glyphs and the potential complexity of compound glyphs.
The role of 'FWords' in defining the bounding box of a glyph and the intricacies of handling x and y coordinates.
The challenges of interpreting bytecode instructions and flags within font files for rendering purposes.
The significance of the 'maxp' and 'head' tables in determining the total number of glyphs and glyph locations.
The process of extracting glyph locations from the 'loca' table and the importance of mapping glyphs to Unicode values.
The introduction of Bezier curves for smoothing out the edges of rendered glyphs, enhancing visual appeal.
The exploration of different rendering techniques, such as increasing line thickness versus using polygon triangulation.
The challenges of rendering text at different sizes and the need for anti-aliasing to improve legibility.
The development of a method for detecting and correcting faulty contours in fonts to improve rendering accuracy.
The creation of an 'artifact detector' to systematically identify and address rendering errors.
The implementation of a shader-based approach for efficient, parallel computation of pixel coverage.
The exploration of sub-pixel anti-aliasing as a technique to enhance the smoothness of rendered text.
The final demonstration of rendering a variety of fonts with the developed techniques, showcasing the potential and limitations of the approach.
Transcripts
Hello everyone, and welcome to another episode of Coding Adventures. Today, Iâd like to
try and render some text, since itâs one of those things that I always just take for
granted. To begin with, weâre going to need a font.
And Iâve just grabbed the first one I found on my hard-drive, which happens to be JetBrainsMono-Bold.
So let me open that up in a text editor just to see whatâs inside, and not unexpectedly,
the raw contents is highly un-illuminating to say the least, so weâre going to need
to find some kind of guide for decoding this. This is a ttf, or TrueType font file, which
is a format, Iâm told, developed by Apple in the late 1980s. And so, I had a look on
their developer website, and located this handy reference manual. Hopefully itâs nice
and simple to understand. Okay, Iâve just been skimming through it
a little, and Iâm becoming increasingly baffled and bewildered by all the mysterious
diagrams, the long lists of arcane bytecode instructions, not to mention the strange terminology
such as freedom vectors⊠âploopvaluesâ, and twilight zones⊠What is the twilight
zone? *âIt is the middle ground between light and shadowâ* â Youâre not helping!
Anyway, after slowing down a bit and reading more carefully, I realized that most of this
complexity is focused on a single problem, which is making sure that text is displayed
or printed clearly at low resolutions, where unlucky scaling could cause one part of a
letter to appear twice as thick as another, for example, making the letters difficult
to read. So I donât think itâs actually something
we need to worry about for our little experiment today at least, which is definitely a relief!
Letâs move on then to the next section of the manual, which gives us this list of mandatory
tables that all fonts need to include, and Iâm most excited about this âglyfâ table,
since that sounds like where the shapes are stored. So our first goal is to locate that
table. Aand â this looks promising! âThe Font
Directory is a guide to the contents of the font fileâ. Music to my ears.
So this is the very first block of data weâll encounter in the file, and I donât think
we really care about any of it , except for the number of tables. So weâre going to
want to skip over one 32bit integer, and then read this 16bit integer.
And hereâs some quick C# code for doing exactly that. Iâm printing out the number
of tables here by the way, just to make sure itâs a plausible value, which from the tables
listed in the manual, Iâm thinking would be somewhere between the mandatory 9 and around
50 at a maximum maybe. So letâs run this quickly â and we can
see the number of tables isâŠ. 4352. How am I doing something wrong already?
Ookay, I finally found the answer on this osdev wiki entry â which is that the file
format is big endian â which just means that the individual bytes of each value are
stored back-to-front from what my little-endian computer is expecting.
So Iâve quickly made this simple FontReader class, which letâs us read in a 16bit value,
and have this conversion to little-endian happen behind the scenes, so we donât need
to worry about it every time. Alright, so using that, letâs try running the program
again â and now the number of tables comes out as⊠17. Which is much more believable.
That means we can proceed to the table directory. And what this gives us is a 4-letter identifying
tag for each table â and weâve briefly seen what those look like already â and
then also the tableâs location in the file, in the form of a byte offset.
So in the code Iâve just added this little loop over the total number of tables, and
in there itâs simply reading in this metadata for each of them. Of course Iâve had to
expand our little reader class in the process, with these two new functions for actually
reading a tag, and a 32bit integer. And you can see here my, possibly somewhat clunky
approach to reversing the order of the bytes in there.
So, hopefully this is going to work⊠Letâs try running it, aand these tags look a little
strange. For example we seem to have a tiny head over here with a big question mark mark
next to it. Which does incidentally sum up how Iâm feeling right now. Oh wait! I just
realized that I forgot to skip over all that other stuff in that first table that we didnât
really care about. Let me just do that quickly â I think it was 3 16bit values, so thatâs
6 bytes we want to hop over here. Alright⊠And now⊠Weâve got the tables!
Iâve spotted the glyph table entry over here, which is pointing us to byte location
35436, so thatâs where weâre headed next! Letâs take a quick look at the manual first
though â and it looks like the first bit of information weâll be given is the number
of contours that make up the first glyph in the table. Interestingly, that value can be
negative, which would indicate that this is a âcompoundâ glyph made up of other glyphs,
but letâs worry about that later! We then have a bunch of 16bit âFWordsâ,
which is an interesting name choice, and these tell us the bounding box of the glyph.
Following that comes the actual data such as these x and y coordinates in here, which
Iâm eager to get get my hands on. Although we should take note that these coordinates
are relative to the previous coordinate, so theyâre more like offsets I guess we could
say. They can also be in either an 8bit or 16bit format, which I assume these flags here
are hopefully going to tell us. Then thereâs also an array of instructions,
which I believe is the scary bytecode stuff weâre skipping over, and finally, or I guess
firstly â Iâm not sure why I read this table backwards â we have indices for the
end points of each contour. So if Iâm understanding this correctly,
weâre going to read in a bunch offsets, from which we can construct the actual points,
and then say we get two contour end indices â like 3, and 6 for example âthat just
means that the first contour connects points 0, 1, 2, 3 and back to 0. While the second
contour connects points 4, 5, 6, and back to 4.
That seems easy enough, so the other thing to wrap our heads around is that 8bit flag
value we saw. Although only 6 of the bits are actually used. So weâre going to have
one of these for each point, and according to the guide, bits 1 and 2 tell us whether
the offsets for that point are stored in an unsigned 1 byte format, or a signed 2 byte
format. Then, getting just slightly convoluted here,
bits 4 and 5 have two different meanings, depending on whether the 1 byte, or 2 byte
representation is being used. In the first case, it tells us whether that byte should
be considered positive or negative. And in the second case, thatâs not necessary since
the sign is included in the offset itself, and so it instead tells us whether or not
we should skip the offset. That way, if the offset is zero, it doesnât have to actually
be stored in the file. So we can see thereâs some basic compression happening here.
On that note, letâs take a look at bit 3. If this is on, it tells us to read the next
byte in the file, to find out how many times this flag should be repeated, so that it doesnât
have to waste space actually storing repeated instances of the same flag.
And finally, bit 0 tells us whether a point is on the curve or off the curve â and Iâm
not entirely certain what that means right now, but we can worry about it later â letâs
first just get these points loaded in. By the way, to actually test if a particular
bit in the flag is on or off â Iâm using this tiny function here which simply shifts
all the bits over so that the bit weâre interested in is in the first spot, then masks
it out, meaning all the other bits get turned off, and finally checks if the resulting value
is equal to 1. Alright, and then hereâs a function Iâve
been working on for actually loading one of these simple glyphs. And this does exactly
what we just talked about: it reads in the contourEndIndices, then reads in all of the
flag values â of course checking each of them to see if they should be repeated some
number of times â and finally reading in the x and y coordinates, before simply returning
all of that data. And for reading in the coordinates, I have
other little function here â where each coordinate starts out with the value of the
previous coordinate, and then we grab the flag for this point, and depending on that,
we either read in a single byte for the offset, and add or subtract it based again on what
the flag tells us to do; or we read in a 16bit offset, but only if the flag isnât telling
us to skip this one. Alright, Iâm excited to see what comes out
of this â so back over here Iâve just made a dictionary for mapping table tags to
locations, which we can use to set the readerâs position to the start of the glyph table,
and then simply read in the first glyph, and print out its data.
So letâs what that looks like quickly. Okay, that seems promising⊠but we wonât really
know if itâs nonsense or not until we draw it. So in the Unity engine Iâve just quickly
plotted these points â and I donât know what this is supposed to be, but Iâm sure
it will become clear once we draw in the contours. Okay, itâs looking a little dishevelled,
but I do believe itâs the missing character glyph â and I actually remember now reading
that itâs required to be at the very start of the glyph table, so that makes sense. I
just need to iron out some bugs clearly, so Iâve been fiddling with the code a bit,
and hereâs how itâs working now. We simply loop over all the end indices, and
then create this little window into the array, which allows us to access only the points
in the current contour â since evidently I canât be trusted â and then we simply
draw lines between those, looping back to the first point in the contour to close the
contour at the end. Alright, letâs try it out â and thatâs
looking much better! One glyph is not enough though, I want all
the glyphs! And to get them, we just hop over to the âmaxpâ table, where we can look
up the total number of glyphs in the font. Then we head over to the âheadâ table
and leap over several entries we donât care about right now, to find out whether the locations
of the glyphs are stored in a 2 byte or 4 byte format â and finally we can dive into
the âlocaâ table, from which we can extract the locations of all of the glyphs in the
glyf table. And then we can just read them in.
I must admit, even though parsing a font file must be pretty low on the list of thrilling
things one could do with oneâs day, Iâm quite excited to see all these letters appearing
here. Anyway, letâs zoom in on one of these glyphs,
such as the big B for instance, and we can see that while the glyphs are obviously very
beautiful, they are perhaps, a bit blocky. So I believe our next step should be to bezier
these bad boys. Iâve babbled about beziers a bunch before
on this channel, but briefly â if we have two points, and want to draw a curve between
them â we need at least one extra control point to control how the curve should curve.
Now to visualize this curve, letâs imagine a point starting out at the start point, and
moving at a constant speed towards this intermediate control point. From which, a second point
moves simultaneously towards the end point, in the same amount of time that it takes the
first point to complete itâs journey. Like this.
Thatâs not terribly interesting, but like many things in life, it can be made more exciting
with the addition of lazers. So letâs draw a lazer-beam between our two moving points.
And thatâs a bit better already, but for some added flair, letâs then also leave
a ghost trail of the old beams behind⊠and now weâre really getting somewhere.
So the curve that these beams trace out is our bezier curve. And all we need now is a
way to describe this curve mathematically, which is surprisingly easy to do. We just
need to imagine one last travelling point â this one moving between the first two
travellers, and also completing its journey in the same amount of time. Observing the
path of this third point, we can see how it travels perfectly out our curve.
So hereâs a little function to help calculate the positions of these points. It takes in
both the start position, and the destination, as well as a sort of âtimeâ value, which
can be anywhere between 0 and 1, as measure of the progress of the journey.
The pointâs position at the given time is then calculated as the starting point, plus
the offset that takes us from the start to the ending point, multiplied by the time value.
From this linear interpolation, we can build our Bezier interpolation, which just calculates
these intermediate A and B points like we saw, and then interpolates between those to
get the actual point along the curve at the current time.
So, if we want to draw one of these curves, one way to do it would be like this â simply
chopping the curve up into a bunch of discrete points based on this resolution value, and
then drawing straight lines between them. So just testing this out quickly â we can
see that with a resolution of 1 we of course just have a straight line; 2 gives us just
the faintest idea of a curve; and from there it quickly starts to look pretty smooth.
Anyway, letâs get back to our blocky B â now that we have beziers working â and Iâll
just quickly jump into the code here, and modify it so that every set of 3 points in
the contour now gets drawn as a curve. And letâs take a moment to admire how that
looks. Wait what⊠This is not quite what I had in mind, but
it looks kind of interesting, so letâs try writing some text with it, and we can come
back and fix it in a minute. So Iâve created a string here that says
âHello World!â, and we simply go through each character in that string, and of course
to the computer each of these characters is just a number defined by the unicode standard
â and Iâm using that number as an index into the array of glyphs that weâve loaded
in. We then draw each of these, and add some hardcoded
spacing between them, because I still need to figure out where theyâve hidden the actual
spacing information. Alright, letâs take a look at the result.
Okay the letter spacing is too small, so Iâll just increase that quickly, but unfortunately
the text is still not particularly comprehensible. Clearly the glyphs in the font are not in
the same order as unicode â which makes total sense actually, since different fonts
support different subsets of characters, so thereâs never going to be a 1 to 1 mapping.
Instead, the font file needs to include a character map, to tell us which glyph indices
map to which unicode values. This part seems like a bit of a pain because
there are 9 different formats in which this mapping information can be stored. Although,
I then read to my great relief that many of these are either obsolete, or were designed
to meet anticipated needs which never materialized. And reading a bit further, it seems that 4
and 12 are the most important ones to cover, so Iâm going to get cracking on those.
Okay â hereâs the code Iâve written to handle them. And Iâm not going to bore
you with the details, all itâs doing is looking up, in a slightly convoluted way,
the unicode value that corresponds to each glyph index.
With that mapping information, we can now retrieve the correct glyphs to display our
little Hello World message. I really like how stylish some of these buggy
beziers look. Perhaps this is the font of the future⊠Oh and by the way, if youâd
like to learn more about the beauty of bezier curves, I highly recommend a video called
the âBeauty of Bezier Curvesâ, and itâs sequel too â the continuity of splines.
Right now though, letâs see if we can fix this mistake that Iâve made.
So first of all, we can that here where thereâs supposed to be a straight line, this point
is actually just controlling how the curve bends as it moves towards this next point
here. So it looks like the glyphs arenât made entirely out of beziers as I was kind
of assuming, but rather have some ordinary line segments mixed in.
Also, taking another look at the manual, it shows this example where we have a spline
made up of two consecutive bezier curves. And what itâs pointing out, is that this
shared point between the two curves is exactly at the midpoint between these two control
points. This is usually where youâd want it to be,
simply because if itâs not at the midpoint, we can see that we no longer get a nice smooth
join between the two curves. They become discontinuous. And so what the manual recommends doing for
these points that lie right in the middle, is to save space by simply deleting them.
As it puts it, the existence of that middle point is implied.
So, weâre going to need to infer these implied points when we load the font. And I believe
that this is where that âon curveâ flag that we saw earlier is going to come in handy.
So in this example, points 1 and two would be flagged as âoff the curveâ, whereas
points 0 and 3 would be on the curve, since the curve actually touches them.
What we can do then is say that if we find two consecutive off-curve points, weâll
insert a new on-curve point between them. Also, just to turn everything into beziers
for consistencyâs sake, letâs also say that if we find a straight line segment â which
would be two consecutive on-curve points, then weâll insert an extra control point
in between them. Alright, so Iâve been writing some code
to handle that quickly, which as we just talked about simply checks if we have two consecutive
on or off-curve points, and if so, it inserts a new point in between them. And hereâs
a quick test to see if thatâs working as expected. So these are the on and off curve
points that are actually stored within the font file for our letter B. And then here
are the implied on curve points that weâve now inserted into the contours, as well as
these extra points between the straight line segments, to turn them into valid bezier curves.
With that, our code for drawing these curves finally works as expected. And just for fun,
I made it so that we can grab these points and move them around to edit the glyphs if
we want. Anyway, letâs put this to the test quickly
by drawing all the letters of the English alphabet here, and these all seem to have
come out correct. So letâs then also try the lowercase letters,
and these are looking good too⊠except I notice that we are missing the i and the j.
I guess they must be compound glyphs since we havenât handled those yet, and I think
the idea is just that the âdotâ would be stored as a separate glyph, and then the
i and the j would both reference that, instead of each having to store their own copy of
it. So Iâve just spent some time implementing
compound glyphs over here, and since these are made up of multiple component glyphs,
essentially all we need to do is keep reading in each component glyph, until we receive
word that weâve reached the last one. And all the points and contour end indices from
those, are simply squished together into a single glyph, which is returned at the end
here. Then here is the function for actually reading
in a component glyphs, and this essentially just gives us the index to go and look up
the data of a simple glyph. Actually, Iâm wondering now if a compound glyph can contain
another compound glyph. Iâll need to tweak this later if thatâs the caseâŠ
But anyway the simple glyph weâve just read in can then be moved around, scaled, or even
rotated â if I had implemented that yet that is â to orient it correctly relative
to the other components. And we can see that transformation being applied to the points
over here. So with that bit of code, we can at last read
in our missing i and j. And this has also unlocked a whole world of
diacritics, since of course these all make use of compound glyphs as well, to save having
loads of duplicate data Okay, weâre making good progress I think,
so Iâd like to start trying out some different fonts, just to see if theyâre even working
properly. Letâs write a little test sentence here,
such as the classic one concerning the fox and the lazy dog, or this one I came across
concerning discotheques and jukeboxes. These are good test sentences because of course
they contain all the letters of the alphabet. Which I learned today is called a pangram.
Anyway, this is JetBrainsMono that weâve been using, and Iâm going to try switching
now to maybe⊠Open Sans. Okay, thereâs an issue with the size of
the glyphs clearly â but I found this helpful scaling factor in the head table, which should
take care of that⊠And thatâs looking much better â but obviously the spacing
is a little off. The problem is that this isnât a monospaced
font like we had before, so our hard-coded spacing value is no longer going to cut it.
After some rooting around though, I unearthed this horizontal metrics table, which tells
us how much space we need to leave after each glyph, which is known as the âadvance widthâ.
So hereâs some code for just reading in that information. And if we use those values,
our text now obviously looks a lot more sensible. Okay Iâve just spent a little while refactoring
and cleaning up the project, because things were getting a little out of hand. Letâs
quickly make sure I havenât messed anything up in the process⊠and of course I have.
Alright â I tracked down the issue, so now weâre back on track.
And now that weâre able to write some words and display their outlines, letâs finally
figure out how to properly render the text. The solution is actually surprisingly simple
â we just need to increase the thickness of the lines until they turn into a solid
shape. Alright, thanks for watching! Well, wait a
minute, this is perhaps not the most legible approach, so we should maybe take some time
to explore our options. For example, some years ago I wrote this little
polygon triangulation script, which uses an algorithm called ear-clipping. And one idea
might be to simply feed the points that weâre currently using to draw the outlines into
that, and get out a mesh for the computer to draw. This doesnât seem ideal though
because if we want the letters to be nice and smooth, weâll have to have lots and
lots of little triangles on the screen which the computer might not particularly appreciate.
So a much smarter mesh-based approach is presented in this paper on âResolution Independent
Curve Renderingâ. Basically, we triangulate a mesh of just the sort of straight-line inner
part of the glyph, and then render a single extra triangle for each of the curvy bits.
So to be clear, for each bezier curve on the contour, a triangle is simply being drawn
like this. The trick now is to find some way to display only the section of the triangle
that is inside of the curve, so that we can get a smooth result without having to add
even more triangles. For this, weâre going to need to think about
what the shape of our bezier curve actually is âand just eye-balling it, it looks kind
of parabolic, but letâs have a look at our equation to be sureâŠ
Currently we have this defined as 3 separate linear interpolations, which paints a nice
geometric picture, but itâs a little clunky mathematically. So Iâve quickly just written
out those interpolations as one giant equation, and then simplified it a bit to arrive at
this, which we can of course recognize as a quadratic equation. Itâs just this coefficient
âaâ times t^2, plus the coefficient âbâ times t, plus a constant âcâ. For that
reason, this particular type of Bezier curve weâre working with is called a quadratic
bezier curve, which confirms our parabolic suspicions.
We can appreciate that fact even better if allow the lines to extend out past this 0
to 1 time interval of our bezier segment. Now, it is a little bit fancier than your
standard parabola though in that we can actually rotate it around. And thatâs because our
equation of course is outputting 2 dimensional points, rather than a single value. We could
break it down if we wanted into separate equations for the x and y values of the curve respectively.
And if we were to graph those, they would just be regular old, non-rotated parabolas
like this. So nothing mysterious here â if we move
a point on the x axis, we can see how thatâs reflected in the x graph, and if we move a
point on the y axis, we can see how thatâs reflected in the y graph.
Alright, so I reckon weâre ready now to think about this problem of how to fill in
the region inside of the curve. Thatâs kind of confusing when the curve is all rotated
like this though, so letâs just position the points so that it looks like a perfectly
normal y = x^2 parabola. And one way to do this is to put p1 at coordinates (1/2, 0),
and p0 at coordinates (0, 0), and p2 at coordinates (1,1).
And that indeed has given us this nice simplified case where the y values of the curve are equal
to the square of the x values. So if we want to fill in this region inside
the curve â thatâs very easy in this case because if the points on the curve are where
y is equal to x^2, then obviously inside is just where y is greater than x^2.
To actually draw this region though, weâll need a mesh and a shader.
For the mesh, we can simply construct a triangle like we saw earlier out of the 3 bezier points,
and Iâll give those points texture coordinates corresponding to the positions we put the
points in to get that simple y=x^2 parabola. Then in a shader, we can get the interpolated
x and y texture coordinates for the current pixel, and letâs visualize the x value for
now as red, which looks like this. And hereâs the y value as green. And hereâs them both
together. I actually find it a bit easier to see whatâs going on if we chop the values
up into discrete bands, by multiplying by 10 for example, then taking the integer part
of that, and then dividing by 10. And now we can see our x and y values as a
little grid here, instead of a continuous gradient.
Of course we can move our points around, and we can see how this grid gets transformed
along with the , since again the shader automatically interpolates the texture coordinates to give
us the value for the current pixel. In texture space though, things always just look like
this, since these are the texture coordinates we defined.
So back in the shader, letâs test for y being greater than x^2, and draw the result
of that into the blue channel here. As hoped, that now fills in the curve â and
whatâs more of course, it will also be transformed along as we move our points around. So we
only really had to think about the simplest possible case, and by taking advantage of
the way that shaders work, we get the rest for free.
I thought that was a cool idea, so with that, we can now now go back to our mesh of just
the straight lines of the glyph, then add in those triangles for all the curves, and
then use this clever little idea to shade in just the parts inside the curve.
Hang on, thatâs not quite right. It looks like along the outside edge of the glyph,
where the points go in a clockwise order, everything is correct â but along the inside
here where the points are now going counterclockwise we actually want to fill in the opposite region
of the curve. So in the shader Iâve just quickly added
this little parameter whichâll tell us the direction of the triangle, and we can flip
the fill flag based on that. Then just testing that here â we can see that when the points
are clockwise, we have the inside of the curve being filled, and when counterclockwise, the
outside is now filled. Aaand that seems to have fixed our problem
here. So this is a pretty interesting way to render text I think, but one potential
problem with the approach is that Microsoft has a patent on this research. Skimming through
this, it does seem though that it only covers the more complicated case of cubic bezier
curves â which is when the curves are defined by four points instead of the three that weâre
working with. So quite possibly it is actually okay to use.
But, Iâm anyway curious to experiment with different ideas today, so letâs put this
one to the side for now. What if instead we took our old glyph meshes
made out of ridiculous numbers of triangles, but claim that we donât actually care about
the triangle count because instead of rendering them to the screen, we pre-render all of them
to a texture file. That way we ultimately only need to render a single quad for each
letter, and use a shader that automatically crops out the appropriate part of that texture
atlas to display the letter we want. Thisâd be nice and fast, but it does of course have
some issues with scaling. Weâd need to make our atlas truly gigantic if we want to be
able to display large text for example. Or alternatively, we could try using signed
distance fields. In one of my old videos we took this map of the earth, and wrote a little
function to calculate the distance of every pixel to the shoreline. Which could then be
used for making a simple wave effect. So we could use that same code here to turn
our image of glyphs, into an image of glyph distances.
A simple shader could then sample the distance, and output a colour if that distance is within
some threshold. Which looks like this. This scales better than before, because now
the shader is blending between distances rather than hard on-off values. The outlines are
a little lumpy, but Iâm sure we could improve that with some refinements to the technique.
These texture-based approaches are slightly annoying though, in the sense that itâs
not really practical to support all the characters in a font because then the texture would have
to be too large. So we always have to compromise on some subset, in order to get good quality.
And even still, the edges tend to have some obvious imperfections, at least if youâre
zooming unreasonably close to the text â and we tend to lose our nice crisp corners.
With that said, I have seen some interesting-looking research about using multiple colour channels
to store additional information that can help to increase the quality.
So I would like to experiment with this at some point because itâs definitely an interesting
technique, but for today Iâd prefer to render the text directly from the bezier data, without
any textures comprising the quality in between. So focusing on a single letter for a moment,
letâs imagine a grid of pixels behind it, and our goal is obviously to light up the
pixels that are inside of the letter. Therefore, we need a way to detect whether a point is
inside or not, and happily thereâs a very intuitive way to do this.
Say weâre interested in this point over here. All we need to do is do is pick any
direction, and imagine a ray going out until it hits one of the contours of the shape.
When it hits that contour, it must either be entering or exiting the shape. And if itâs
entering it, then of course itâs bound to exit it at some point. But in this case the
ray doesnât hit any other contours along its journey, which means it must have been
exiting the shape, and so obviously it must have started out inside of it.
Now moving our point of interest back a bit into the âBâ hole here, for want of a
better term, the ray now intersects with two contours along its path. From that we can
deduce it must have first entered the shape, and then exited it, and so clearly the starting
point this time was on the outside The simple rule we can gather from this is
that if the number of intersections is even, weâre outside the glyph, while if its odd,
weâre inside of it. So all we really need to do is figure out
the maths for calculating the intersection of a horizontal ray with a bezier curve.
For example, say we have this curve over here, and we want to figure out where this ray,
at a height of 0.5, intersects with it. Since the ray is horizontal, we only need
to care about the y values of the curve, so weâre basically just asking where this function
is equal to 0.5. Equivalently, we could shift the whole curve down by 0.5, and then ask
where the function is equal to zero. Conveniently, thatâs exactly the question
that that the quadratic formula is there to answer.
So, if we just take our equation for the y component of the bezier curve, and plug the
values in for a, b, and c, weâre going to get back the values of t where the corresponding
y value is 0. In this case, the t values come out as 0.3,
and 1.2. And since values outside of the range 0 to 1 are not on our curve segment, that
means that weâd count 1 intersection in this case.
Alright, so hereâs some code implementing the quadratic formula. And one small detail
here is that if a is zero, we get a division error â this is the case where the curve
is actually a straight line â and so then we can solve it like this instead.
Okay, now Iâve also written this little test just to see that this is working properly
â and all it does is calculate the a b and c values from the positions of the bezier
points, and then it asks what the roots of that equation are â so where itâs equal
to zero, but first the height of our horizontal ray is of course subtracted over here like
we saw earlier, to take its position into account.
Then, assuming a solution exists, we just draw a point at that time value along the
curve to visualize it â using this little function over here. So, letâs see how it
goes! Iâm just going to grab some points and move
them around a bit to see whether the intersection points look correct or not. Itâs maybe not
the most rigorous testing methodology in the world, but we should catch anything obvious
at least. So far so good though. And let me also try changing the height of the ray, and
that seems to be working as well. So using this horizontal intersection test,
we can now implement that simple algorithm we were talking about earlier, where an even
number of intersections means that the point is on the outside of the glyph, while an odd
number means that itâs inside. So Iâve basically just copy pasted our intersection
test here â only after calculating the roots, instead of visualizing them, we now increment
a counter if the root is valid. And by valid I mean it needs to lie between 0 and 1 to
be on the bezier curve, and weâre also only interested in intersections to the right of
the rayâs starting point; since thatâs the arbitrary direction I picked for the ray
to be traveling. Now we just need this tiny function here to
loop over all the curves in the glyph, count the total number of intersections, and return
whether that value is even or odd. Alright â letâs try it out. And, isnât
that beautiful? From here, I guess we basically just need to increase the number of dots!
But hang on a second⊠what are these weird artefacts that keep appearing, and also from
the noises Iâm hearing, I fear my computerâs about to burst into flames.
Okay, no need to panic, letâs just start with the most serious issue first â getting
rid of this ugly line. So letâs take another look at
our visualization. Alright⊠it just needed a good old-fashioned
restart it appears, and weâre back in business. So let me set the grid size to one hundred
here, which is where that issue was occurring â and I believe whatâs happening is just
that thereâs obviously a meeting point of two curves over here, and this row of pixels
just happens to line up precisely for both of them to be counted in this case. So in
the code, Iâll just change the condition for being a valid intersection to discount
the very end of the curve, meaning that where one curve ends and another begins, weâll
only count one of them. Aaand running this again â we can see: problem solved.
With that fixed, letâs now tackle the performance problems, and I want to try simply moving
all of our calculations into a shader so that we can compute loads of pixels in parallel.
Firstly Iâm just trying to get the bounding boxes of each glyph showing up â so basically
thereâs this buffer of instance data, which just contains each glyphâs position and
size at the moment, and that gets sent to the shader. Then weâre requesting that a
quad mesh be drawn for however many glyphs there are in the input string.
So when the shader is processing the vertices of the mesh, itâll be told which copy or
instance of the mesh itâs currently working with, meaning that we can get the data for
that instance, and use it to position each vertex.
Hereâs how itâs looking at the moment. This says âCoding Adventuresâ, by the
way, in case you canât tell. To improve the readability, weâll need to
get the rest of the data over to the shader. So Iâve been working on this little function
here which creates a big list of all the bezier points that we need, along with a list of
integers for storing some metadata about each unique glyph in the text. In particular, at
what index it can find its bezier data, the number of contours it has, and also the number
of points in each of those contours. Obviously we also need to record the index at which
each glyphâs metadata begins, so that the glyphs can be told where to find it.
Not very exciting stuff, but it needs to be done. And all of this ultimately gets packed
up and sent to the shader, to live in these buffers over here.
Another slightly tedious thing Iâve been doing is translating all our C# code into
shader language for things like the quadratic formula, intersection counting, and the point
inside glyph test. Okay with that done, letâs try it outâŠ
And itâs working! Well mostly anyway⊠I donât know why the bitcoin logo has showed
up â thatâs just supposed to be a space; and also the i has gone missing, but Iâm
sure theseâll be easy enough to fix. Iâm more concerned about whateverâs happening
over here! So I just want to step though the different
segments of the glyph to see what values weâre getting for our quadratic coefficients â aaand
I should have known. If it isnât my old nemesis floating point imprecisionâŠ.
ââŠas we can see, everything is jiggling like thereâs no tomorrowâ
So computers, tragically, represent numbers with a finite number of bits, meaning that
precision is limited. On this segment for instance, we calculate the coefficient âaâ
by taking p0, adding p2, and subtracting twice p1. Which is precisely zero for the y values.
Except⊠the computer begs to differ. It says itâs extremely close, but not quite,
equal to zero. As a result, when we apply the quadratic formula, weâre dividing by
this tiny, incorrect value, and getting incorrect results. So, we need to get need a little
dirty I guess, and just say that weâll consider our curve to be a straight line if as âaâ
is very close to zero. And now those horrible little lines have disappeared.
By the way, we should check if all of this has actually improved the performance, and
indeed our efforts have taken us from barely 12 frames a second or whatever it was, to
being comfortably in the range of 800 / 900 frames per second range.
Okay, after a few misadventures⊠Iâve managed to fix that missing I and unvisited
bitcoin, and so at last, weâre able to properly render some text to the screen with our shader.
Since weâre doing this directly from the curve data, we can also zoom in pretty much
to our heartâs content, and it remains perfectly smooth. Well I say perfectly smooth, but of
course thereâre still not enough pixels in modern displays to prevent edges from looking
unpleasantly jagged â so weâll definitely need to implement some kind of anti-aliasing
to improve that. I also spotted a weird flicker when I was
zooming in. Iâm struggling to see it again now, but hereâs a freeze frame from the
recording. So Iâve been zooming in and out and panning around trying to see how common
these flickers are, and where theyâre occurring. Thereâs another one! By the V⊠Did you
see that? This is a bit frustrating because it only
happens when the pixels perfectly line up in some way, so itâs hard to capture the
exact moment. Itâs right in the sweet-spot Iâd say of
being rare enough to be a nightmare to debug, but not quite rare enough to just pretend
it doesnât exist. So I really want to have some way of quantifying how many of these
artefacts are occurring, so that I donât have to spent multiple minutes after each
tweak to the code trying desperately to divine whether Iâve made things better or worse.
So, Iâve decided to build an evil artifact detector.
What Iâve done is simply taken a glyph, and written some code to randomly generate
these little test windows. The windows in black are entirely outside of the glyph, while
the windows in white are entirely inside of it.
And Iâve run this across a bunch of different glyphs from several different fonts. The idea
then is that each of these windows gets fed one by one to a compute shader, which looks
at all the texels â which is to say, all the pixels inside the window texture â and
runs our algorithm for each of them to figure out whether theyâre inside the glyph or
not. It then draws the result to the window texture, and also flags the test as a failure
if it doesnât match the expected value. By the way, a small detail here is that Iâm
offsetting the y position by an increasing fraction of a texel as we go from the left
to right edge of the window â because that way weâre not just running the same test
over and over, but rather covering a whole extra range of values, whichâll hopefully
help us catch more artifacts! In total itâs running just over 74000 tests;
which takes about 10 seconds to complete, and our current algorithm is failing 7166
of them, so roughly 10 percent. To help hunt down the issues, Iâve also
set up this little debug view, where we can see a list of the failed cases down the side
here. And each case is defined by 3 numbers. Thereâs the glyph index, which we can set
over here. Then the resolution index, which determines how many pixels are tested inside
of the window. And finally the window index of course tells us which window weâre looking
at. So letâs go to one of the failed cases,
such as 0, 2, 19. And zooming in on the window, we can see this little pixel over here where
it mistakenly thought that it was inside of the glyph. Iâm going to draw in a quick
debug line, so that we can then zoom out and see where exactly the issue is arising. And
it appears to be this troublesome meeting point of two curves again, where the intersection
is being double-counted. I thought I had fixed that earlier when we
told it to ignore time values exactly equal to one; but clearly that was overly optimistic.
It makes sense in theory, but as we saw recently â with every bit of maths we do, numerical
imprecision has an opportunity to creep in â and so we canât exactly trust the values
that weâre getting. So Iâve been working on a dirty little fix
for the double counting, where we simply record the positions of the intersections that occur
on each segment, and then when weâre examining the next segment, we ignore any intersection
points that are extremely close to those previous ones. Alright, letâs try this out⊠And
we get a failure count of over 30000. Well thatâs not great⊠Okay, I see what
went wrong though. Weâre recording the positions over here regardless of whether they were
actually on the curve segment or not. So let me fix that quickly, and then redo the test
â and this timeâŠ. 4943. And with some trial and error just tweaking the value of
the distance threshold, I was able to get the failure rate down to 3647.
Alright, now Iâve been taking a look at some of these remaining failures â such
as this one over here â and in this case, we have a few spots that falsely believe theyâre
outside of the glyph â meaning the intersection count is even when it should be odd. And looking
at this, we can see we have 1, 2, 3 simple intersections, and then we avoid the double
counting over here, giving us a total of 4. The count is supposed to be odd though, so
I guess in this particular case we actually do need to count both of them. I think because
theyâre forming this kind of vertical wedge, so in a sense the ray is both exiting and
entering the glyph at that point, meaning they cancel each other out.
Iâm not sure if Iâm making any sense, but regardless, this whole double-counting
business is becoming annoying, so Iâve been brainstorming a bit how we could approach
the problem slightly differently to get around it.
The new plan is to forget about counting the number of intersections, but rather look for
the closest intersection. The idea is that regular contours are defined in a clockwise
direction, while the contours of holes go anti-clockwise. And this means that if the
closest curve crosses our ray in an upwards direction, we must be outside the glyph, or
inside a hole â but thatâs the same thing really. Whereas if the closest curve crosses
our ray in a downwards direction, then we know weâre inside the glyph.
To determine which direction the curve crosses the ray, we can take our equation and just
apply the power rule from calculus to figure out the gradient, which comes out as 2at plus
b. Calculating that for any value t along our
curve gives us this vector here, which tells us how the x and y positions along the curve
change with respect to t. So over here, the y position of the curve is increasing as t
increases, so the gradient has a positive y value, whereas over here y is decreasing
as t increases, so the gradient has a negative y value. That means we can simply check the
sign of that value to find out whether the curve crosses our ray in an upwards or downwards
direction. So hereâs the new implementation â as
you can see weâre simply searching for the closest point that our ray intersects, then
calculating the gradient of the curve at that point, and saying weâre inside the glyph
if that closest curve crosses the ray in a downwards direction.
Letâs run the test to see how this compares⊠And weâre getting a failure count now of
1126, so not a bad improvement! There are some downsides to this approach
though. For instance â hereâs a font I was experimenting with, and we can see that
the âjâ is looking a little strange! Thatâs due to an error in the font where the contour
has been wound anti-clockwise like a hole. If we were just counting the intersections
it would give the correct result regardless, but this new approach fails catastrophically.
Also, over here the contour contains a slight self-intersection, which again causes an issue.
So thatâs definitely a big weakness to keep in mind, but Iâm going to run with it anyway
since it seems promising in terms of eliminating those pesky artifacts.
With that said, this same case we had before is actually still failing, but now its because
right at the end point these two curves have the same position, so it doesnât know which
one is closer. I think it makes sense then to favour the curve thatâs most on the left,
since our rightwards ray should hit that first. So in the code Iâve just added a quick test
to see if two points are the same, or at least very near to being the same â and then if
thatâs the case, we skip the curve that has itâs âp1â point further to the right.
And running this now, weâve eliminated a few more problems â down to 1024.
Alright here is our next problemâŠ. and Iâm just trying to spot the artifact here. Itâs
quite tiny, but there it is. Okay Iâm not sure whatâs going on in this case â I
donât even know which of these curves the ray is intersecting. So Iâve quickly added
a way to highlight the curves of each contour by their index, and we can see here that the
possibilities are curves 2, 3, or 4. Now shaders can be a little tricky to debug
â but we can get creative and do something like output a colour based on the index of
the closest curve we foundâ so red if index 2, green if index 3, and blue if index 4.
And now we can zoom in on our debug texture again to see which it is. And it must be our
lucky day â two different problems for the price of one! As we can see, from one spot
its hitting curve 2, and from the other its hitting curve 4.
Letâs think first about the ray thatâs hitting curve number 2. Itâs a little odd
that our ray is able to hit that curve, but evidently miss the closer curve over here
â given the fact that these endpoints are at exactly the same height.
I think what might help in at least some situations like this is to test whether all the points
of a curve are below our ray, and simply skip the curve in that case. And same story if
all the points are above the ray. These raw position values are really the only reliable
thing we have â as weâve seen, once we start doing maths with them, all bets are
off â and so by skipping curves early on based on the positions, we can avoid âaccidentallyâ
hitting them due to numerical issues. It would really be wise to use these positions
directly as much as possible â for example at one point I read a bit about the Slug algorithm,
which has this way of classifying different curves based solely on the positions of their
control points. This technique is very much patented though, so Iâm going to just forge
ahead with my own hacky approach instead. Alright, so running the test again with our
curve skipping enabled now, we get a failure count of 882.
And looking at our case from before, we can see that only the intersection with curve
4 remains. This is actually the curve that we want it to intersect with, but since it
completely flattens out over here, we must be getting a gradient of zero, which is ambiguous
as to whether the curve is going up or down. But from the fact that p0 and p1 are in a
straight line here, and p2 is *below* them, I think we could reasonably consider this
curve as going purely downwards. So in general, letâs say the a curve is
decreasing for itâs entire duration if p2 is below p0, and increasing if itâs the
other way around. Although this only holds of course if p1 is somewhere in between them
on the y axis. As soon as p1 becomes the lowest, or highest
point, the curve starts curving in both directions. And this is kind of troublesome actually,
because letâs say we have a ray which should theoretically intersect precisely at this
turning point over here where the y gradient is 0⊠Any tiny imprecision might cause the
program to actually think it intersected a tiny bit behind or ahead of that point, so
we could very easily end up with artifacts. Most fonts Iâve come across so far, seem
to actually avoid using this kind of curve, but if we encounter one, maybe we could simply
split it into two segments, one where itâs going purely upwards with respect to t. And
one where itâs going purely downwards. Doing that should be reasonably straight-forward.
First of all, we can figure out the turning point simply by setting the gradient to zero,
and then solving for t â from which we can of course compute the actual point.
Then say we want to construct this segment here that goes from p0 to the turning point.
Obviously weâre going to place the two endpoints of the new curve at p0 and the turning point,
but the big question is, where does the new p1 point go?
Well, we need the gradient at p0 to still point towards the p1 point of the original
curve, in order to maintain the same shape, which means that our new p1 point is constrained
to be somewhere along this line here. And where exactly, is where it forms a horizontal
line with the turning point, since of course the gradient there, by definition, is 0 on
the y axis. So Iâve been writing some code to do these
calculations for us automatically, and itâs such a great to feeling to figure out the
maths for something, and then just have it work perfectly the first time. Which is why
I was a bit disappointed when this happened. Anyway, it turned out I just forgot some important
parentheses⊠and while weâre here letâs take a quick look at the actual maths, which
is super simple. We just say that if we start at p0 and move along the gradient vector of
the original curve at p0 for some unknown distance, weâre eventually going to hit
a point with a y value equal to the turning point. We can then rearrange that to solve
for the unknown distance, which then letâs us calculate our new p1 point. And the exact
same goes for the other segment of course Aaaand now we can see that this is working.
So this can be done as a pre-processing step on the glyph contours â though like I said,
most the fonts Iâve tried actually keep the p1 points strictly between p0 and p2,
so itâs usually not necessary. Alright, then in the shader, we can now determine
if the curve is going up or down simply based on the relative y positions of p0 and p2.
Hopefully this will improve the accuracy somewhat⊠and running the test on this, we now we have
a failure count of 755. Iâm slightly underwhelmed, but itâs progress nonethless.
Okay, letâs keep going! The next error on our list is this one here, where this ray
is managing somehow to dodge both of the segments at their meeting point. I believe whatâs
happening is that the roots weâre calculating there are just ever so slightly outside of
the valid bounds â due to precision errors of course.
So Iâve simply added in a little fudge factor to catch those cases. And running the test
again, weâre now down to just 404 remaining failures.
Okay, hereâs the next case on the list, and in this one the ray should be hitting
segment six, but after some debugging I discovered its actually just missing it due to precision
issues wouldnât you know â and instead hitting only segment seven. So in our quadratic
formula, missing the curve means that the disciminant is negative â so letâs try
actually allowing it to be a tiny bit negative to catch these cases â and just clamp it
to zero in our actual calculation. Then letâs run the tests once again, and
thatâs actually brought us all the way down to just 10 failed cases. So letâs have a
look at what remains. Alright, our ray is hitting the tip of these
two segments here â and this looks very similar to a problem we dealt with earlier
actually. Remember this one â where the ray was also hitting the tip of two segments,
and we told it that it should prefer the one that had its p1 point further on the left;
the idea being that a ray traveling rightwards should hit that segment first.
But now we have a bit of an edge case here where the p1 point of the segment we want
the ray to hit, is actually not further to the left than the other segment.
So Iâve set up this little test area with different configurations of curves to try
and come up with a better approach, and I think I finally have something thatâll do
the trick. To be clear about what weâre trying to achieve
in case I wasnât making sense earlier â the problem is that if our ray is coming in and
hitting the tip of these two curve segments, weâre not going to get a clear answer as
to which curve it hit first. But intuitively, it should be the blue curve, since if the
ray was just a tiny bit lower down, then it would very clearly be hitting the blur curve
first. So my thinking now is that weâll consider
the gradients of the two curves at our point of intersection, and then imagine kind of
winding in a clockwise direction from the ray, and seeing which gradient vector we run
into first. Whichever curve that gradient belongs to, is the curve we want.
By the way, if the curves originate from above the ray instead of below it like this case
here, the idea still works the same, only we imagine winding in a counterclockwise direction
instead. And hereâs that idea expressed in code.
So⊠running the test again, weâre now at long last down to zero errors, which Iâm
very happy to see. Obviously, scoring perfectly on the test does
not mean itâs perfect â in fact Iâd be shocked if thereâre not still plenty
of edge cases lurking in the shadows, but weâve definitely come a long way in destroying
those artifacts that were quite prevalent before. At least Iâve been zooming and panning
around this text for quite a while, and not yet noticed a single one.
I did come across a more fundamental issue while testing some new fonts though. And this
arises when we have overlapping shapes, like with this particular letter âKâ over here.
Weâre using the closest segment to determine whether weâre inside or outside of the glyph,
but from over here for example, the ray would hit into this as the closest segment, which
incorrectly tells us that weâre outside the glyph. And if we try rendering it, it
looks like this. After some contemplation though, Iâve tweaked
the algorithm slightly to now keep track of the closest distance to the exit of any shape
that weâre inside of, as well to the exit of any hole that we might be inside of. And
then by comparing those distances we can determine whether our point is inside the glyph or not.
So it seems like our rendering is now working okay.
But it definitely is in need of some anti-aliasing as I mentioned before, so that it doesnât
look so unpleasantly jaggedy. Currently, each pixel decides what colour
to display based on whether a single point at its centre is inside of the glyph or not.
So the easiest improvement I can think of would be to go from a single point deciding
the fate of the pixel, to perhaps a grid of 9 points, each contributing a little to the
final colour of the pixel. We could even get fancy and consider the fact
that each pixel is made up of a red, green, and blue component. So over here for example,
instead of just dimming all 3 colours uniformly because only 6 of the 9 points are inside
the glyph â we could say that since red and green are within the glyph, weâll just
light those two up fully, by outputting yellow, and leave blue turned off.
This is a technique called sub-pixel anti-aliasing, and we can see it in action if we take a very
close look at the text in my code editor for example, where along left edge of the h for
instance we can notice how mainly just the green and blue parts of the pixel are lit
up; while along the right edge here it looks like mainly just the red part of the pixel
is lit up. So this is a clever way of essentially tripling the horizontal resolution of the
display. There are some subtleties to this thoughâ
in terms of filtering the result so that the colour fringes arenât overly harsh, and
I guess detecting and handling the possibility of different pixel layouts on different displays.
So Iâm going to actually leave this subpixel idea for future experiments.
But hereâs our super simple version for today, we just loop over a 3x3 grid, and calculate
the different sample points within the pixel. This can be done by the way thanks to this
derivative function, which just tell us how much the position value changes between the
current pixel weâre working on, and its next-door neighbour. In other words, itâs
the size of the pixel in our glyphâs coordinate space.
So weâre just adding one for each of these points that is inside of the glyph, and then
dividing by 9 to get the average of those samples. And now, instead of being all jaggedy,
our edges become nicely smoothed out. This is at a very high magnification of course
just to show the effect better â itâs obviously a lot more subtle in reality, and
while Iâm not sure how well itâll show up in the video, it definitely does make a
big difference to how smooth the text feels. Running our algorithm 9 times for each pixel
feels a bit exorbitant though, so Iâve been experimenting with a little improvement where
we actually only run it three times â along the vertical axis â and sum up now this
horizontal coverage value. Basically, instead of simply returning a bool
for whether the point is inside the glyph or not, Iâve slightly modified our function
to return a coverage value between 0 and 1. This is calculated by adding together the
distance to the nearest curve segment on either side of the input point, both of which have
been clamped to half of the width of the pixel, and then dividing by the width of the pixel.
This means that if the pixel is far away from any edge, then it will get a value of 1, meaning
itâs completely inside the glyph â or inverted to zero if weâre actually outside
of the glyph. And for pixels closer and closer to an edge, that value will get closer and
closer to 0.5, so that we get a smooth transition between 0 and 1 across the edge.
Here that is in action with our test text from earlier. And if we magnify this a bit,
we can see the anti-aliasing is looking pretty decent I think.
Itâs maybe a slightly strange implementation though â I later came across this writeup
which talks about sending out the rays in different directions to estimate what fraction
of this little circle is covered by the glyph, which sounds a bit more flexible and accurate,
so I might use that instead in the future. In any case letâs do a super quick performance
test now. So Iâll paste in the entire 12 thousand word script for this video as the
input, and try running it. Alright, taking a look at the statistics,
it seems to be running happily enough at around 800 frames per second â and I tried comparing
the average frame duration here to that of a blank project, and it seems like the text
is taking roughly 0.2 milliseconds to render, although Iâm honestly not sure how to properly
profile these sorts of things. The performance does fall off a cliff unsurprisingly
though for very complex fonts. Hereâs a random font I found with lots of little wiggly
details in the glyphs â this g for instance contains 440 bezier curves, which has taken
us from roughly 800 frames per second down to about 150.
There are ways we could optimize this though â the obvious idea that comes to mind being
splitting each glyph up into multiple bands, so that each pixel only has to test the bezier
segments inside of the band its in. Additionally, each glyph is currently rendered as a quad
â which is nice and simple, but in a lot of cases can leave quite a bit of empty space
that the shader still needs to crunch the numbers for. With a bit of pre-processing,
we could construct much tighter fitting meshes, tailored specifically for each individual
glyph. Far more challenging than improving the performance
I suspect is improving the legibility of the text at small sizes.
Right now itâs very easy to read â on my screen at least â but if I just zoom
out a little more, it really starts to go downhill quite fast. Let me magnify this view
so we can see the problem better, and take this word here for example. We can see how
some parts of it show up nice and bright, while other parts are much dimmer â because
maybe they fall awkwardly between two pixels for example, so both pixels end up just being
kind of greyish. If I pan the camera slightly, we can see how the legibility of each letter
actually changes as its alignment with the pixels changes.
And thatâs kind of a tough thing to fix! I believe weâd need to delve into that scary
bytecode interpreter stuff that I was so happy to skip in the beginning. So definitely a
problem for some other day. Whatâs still bugging me a bit though is
those downsides I showed earlier to our âclosest curveâ rendering approach, namely failing
if the contour has the wrong winding direction, and being very sensitive to self-intersections.
So I decided to take everything I learned from that artifact hunting process, and basically
start over from scratch with the counting method again. And I wonât bore you with
another blow-by blow account, but armed with a much better understanding of the problem
now, I was able to quickly get it working with zero issues on the test set.
A bunch of the same ideas are still in there, and the main new addition is really just a
small detail in the way that curves are skipped, which actually helped immensely with the double-counting
issues that were plaguing us previously. Basically, when weâre testing if all the
points of a curve are above or below the ray, we need to decide whether we want to include
zero in the test; which means skipping the curve even if one of the points is precisely
at the same height as the ray. If we donât skip those cases we get issues
with double counting at the meeting point of two curves, while if we do skip them, then
we essentially create tiny holes that the ray can sneak through at those meeting points.
Of course we could try to solve that issue by only including zero for the p2 points perhaps,
since then if we imagine a contour like this, a ray that perfectly lines up at the meeting
point will only intersect one of the curves. The problem though is that if the contour
looks like this instead, then suddenly the ray isnât really crossing the contour anymore,
itâs just kind of scraping past it, so we want it to hit either neither of the curves,
or both of them, so that it doesnât affect the result.
I believe it was this kind of issue that drove me to stop using this counting method originally,
but now that weâve processed the curves to ensure they can only go strictly upwards
or strictly downwards â it suddenly doesnât seem like such a big deal anymore. The problem
simply occurs when the curves are going in opposite directions: one up, and one down.
And so the way we can handle this in the code is to just check the direction of the current
curve, and for one direction we include zero only for the p0 points, and for the other
direction we include zero only for the p2 points. And that solves our issue.
So now that weâre using the counting method again, we can see that the âjâ renders
correctly, and the tiny self intersection is no longer causing any visible issues.
I was quite happy about this, but victory was short-lived because I then realized that
this approach actually fails in the case of overlapping shapes again. We can fix this
with a slight tweak â where we say that whenever the ray crosses a downwards curve,
which means that itâs exiting the current shape, we add one to this counter. And otherwise
subtract one if itâs entering the shape. Then, if the ray exited more shapes than it
entered, it must have started out inside of a shape. This is essentially the same idea
as the counting method, but should be able to handle overlaps.
By the way, this can also be extended very nicely to work with our current anti-aliasing
approach, like this. Itâs still just adding or subtracting one like before, unless the
distance to the intersection becomes extremely small â like within the current pixel. And
in that case weâre now adding or subtracting some fractional amount â meaning that in
the end, we again get our nice smooth blend from 0 to 1 across the edge of the glyph again.
This idea comes from that writeup I mentioned earlier when we were first implementing the
anti-aliasing. Anyway, as we can see this little tweak has
indeed resolved the overlapping shape issue. ⊠While unfortunately breaking the problematic
âjâ again, which is quite annoying. On the other hand, the self-intersection is looking
better than ever, so Iâd say we can count this as a half victory at least.
I did try a couple different things to get the wonky j to render correctly, since it
is a recurring issue that Iâve seen in several fonts at this point, but none of my attempts
came particularly close to working. This one perhaps least close of all. Maybe we should
try to automatically detect faulty contours and correct them when importing the font instead,
but I need to stop worrying about these little
details for now, this was just supposed to be a quick experiment after all, and I maybe
got a little lost in the weeds along the way â sorry about that.
Anyhow, after all this, I definitely feel like Iâve gained a much greater appreciation
for some of the intricacies of rendering text. Here is one last test with our latest approach
on a big wall of text with several different fonts, and I think Iâm happy enough with
how itâs looking to call it a day for now at least.
So, to end things off, Iâve just been having some fun in the vertex shader here, making
the letters wave and spin around to say farewell, and thanks for watching
I hope youâve enjoyed following this little journey into the surprisingly deep rabbit
hole of rendering text, and if you have any thoughts as to how the current implementation
could be improved, Iâd love to hear them. Alright, until next time â cheers!
5.0 / 5 (0 votes)