
NervLand DevLog #23: Adding support for text rendering

Trying yet another approach to keep track of my devlogs: this time I'm focusing on specific points in dedicated blog articles, to cut things down into more manageable pieces. Let's see if this works better for me ;-).

Youtube video for this article available at:

Implement the WGPU native text overlay example:

  • Build the freetype library on windows clang/msvc/emcc and linux: OK
    • With support of zlib/png: OK
    • With support of bzip2: A pain to build on windows, discarding for now.
    • With support of harfbuzz: OK
    • With support for brotli: OK
  • Implementing initial support classes `Font`,`FontManager`, `WGPUFont`
  • Reading infos on the freetype function `FT_Request_Size`
  • Simple font texture rendering result:

Reached a working state with all characters aligned correctly on baseline 👍!

And added horizontal line to show the baseline:

Added support to draw boxes around each character for debugging:

Now that text rendering is starting to work, I realize that what I need to do to support dynamic text is really to manipulate the WGPUTextRenderer class directly: the “texts” should just be simple structs inside this class.

Moreover, the WGPUTextRenderer should be a buffer provider (e.g. what we have called UBOProvider so far, but this class should be renamed to BufferProvider as it should support both Uniform and Storage buffers).

Refactoring BufferProvider::update() method to BufferProvider::update_buffers() (for better clarity)
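As a rough standalone sketch of what this renamed interface might look like (the class and method names mirror the ones mentioned here, but the exact shape is my assumption, not the actual NervLand code):

```cpp
#include <cstdint>

// Hypothetical sketch of the renamed BufferProvider interface: it
// should cover both Uniform and Storage buffers, with the update
// entry point named update_buffers() for clarity.
class BufferProvider {
  public:
    virtual ~BufferProvider() = default;

    // Flush any dirty CPU-side data into the GPU buffers owned by
    // this provider:
    virtual void update_buffers() = 0;
};

// Minimal concrete provider, just to illustrate the call pattern:
class CountingProvider : public BufferProvider {
  public:
    void update_buffers() override { ++updates; }
    std::uint32_t updates{0};
};
```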

Now considering this piece of code:

    // Setup the camera:
    auto camera = WGPUCamera::create({window.get(), 8.0});
    eng->add_ubo_provider(SID("camera"), camera);

  • ⇒ I think we could just as well add an engine function WGPUEngine::create_camera() that will create and register the camera with the provided SID. And similarly, we could provide the WGPUEngine::create_text_renderer() method.
  • Okay, so we now have this entry to create the camera:
        // Setup the camera:
        auto camera = eng->create_camera(SID("camera"), {window.get(), 8.0});
    
  • Now removed the need to create WGPUText objects, instead we add a TextDesc structure directly in the WGPUTextRenderer as follows:
        // Add this text to a textRenderer:
        auto trdr = eng->create_text_renderer("texts"_sid, {});
    
        trdr->add_text({"Hello manu and patglqiy:-)!"});
    

Next critical step I think I need to take now is to add the possibility to change text values dynamically. Once we have added a “text slot” in a TextRenderer, we should be able to update the text value with a simple function call such as:

rdr->set_text(0, "This is the new text");

To help with this, I think the add_text method should return the index of the newly created text slot, if this is not the case already. Oh, actually it's already the case, good 😅.
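For reference, a minimal standalone version of such an add_text() returning the slot index could look like this (TextDesc is simplified to just the string here; the real class holds more state, so this is only an illustrative sketch):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the real TextDesc struct:
struct TextDesc {
    std::string text;
};

struct TextRendererSketch {
    std::vector<TextDesc> _texts;

    // add_text() returns the index of the newly created text slot,
    // so the caller can later update it with set_text(index, ...):
    auto add_text(const TextDesc& desc) -> std::uint32_t {
        _texts.push_back(desc);
        return static_cast<std::uint32_t>(_texts.size() - 1);
    }
};
```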

I first wrote a minimal version of the function:

void WGPUTextRenderer::set_text(U32 slotId, const String& text) {
    // get the corresponding text slot:
    NVCHK(slotId < _texts.size(), "Out of range text slot index {}", slotId);

    _texts[slotId].text = text;

    NO_IMPL("TODO: need to trigger update here.");
}

But of course for this to work I had to fix the NO_IMPL macro:

#define NO_IMPL(msg)                                                           \
    {                                                                          \
        auto msg_str__ = format_msg(msg);                                      \
        THROW_MSG("[NO_IMPL] ({}:{}) {}", __FILE__, __LINE__,                  \
                  msg_str__.c_str());                                          \
    }

Which also meant introducing the format_msg() function used above:

template <typename... Args>
inline auto format_msg(fmt::format_string<Args...> msg_format, Args&&... args)
    -> String {
    auto out = fmt::memory_buffer();
    fmt::format_to(std::back_inserter(out), msg_format,
                   std::forward<Args>(args)...);
    return {out.data(), out.size()};
}

But now I realize that before I can implement the set_text() function, I first need to provide the indirect draw call implementation, to ensure I can change the number of characters to render without rebuilding a pipeline each time. Let's see, how do I do that 🤔?

Checking documentation from this page: https://www.w3.org/TR/webgpu/#dom-gpurendercommandsmixin-drawindirect

⇒ So I need a buffer, and it must be created with “INDIRECT” usage support. I create that buffer in the WGPUTextRenderer constructor for now:

    // Add at least one Draw args entry, corresponding to one font texture:
    _drawArgs.push_back({});

    // Create the indirect buffer:
    _drawArgBuffer = eng->create_buffer(_drawArgs, BufferUsage::Indirect);
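For reference, the WebGPU spec page linked above defines the drawIndirect() argument block as four tightly packed 32-bit unsigned integers. A matching DrawArgs struct (the name is the one used later in this article; the field layout follows the spec) would be:

```cpp
#include <cstdint>

// Layout of one drawIndirect() argument block, per the WebGPU spec:
// four tightly packed 32-bit unsigned integers.
struct DrawArgs {
    std::uint32_t vertexCount{0};
    std::uint32_t instanceCount{0};
    std::uint32_t firstVertex{0};
    std::uint32_t firstInstance{0};
};

// The GPU reads these values verbatim, so the struct must have no
// padding:
static_assert(sizeof(DrawArgs) == 16, "DrawArgs must be tightly packed");
```

Note that drawIndexedIndirect() uses a five-u32 block instead (indexCount, instanceCount, firstIndex, baseVertex, firstInstance).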

I then add a new helper struct for indirect draw call definitions:

struct IndirectDrawCall {
    wgpu::Buffer buffer;
    U64 indirect_offset{0};
};

And we deal with this new struct in the RenderNode class:

    auto add(const IndirectDrawCall& call) -> WGPURenderNode& {
        // Add a new drawable with this draw call:
        auto obj = create_ref_object<Drawable>();
        obj->set_indirect_buffer(call.buffer, call.indirect_offset);

        // Assign the current groups to this drawable:
        obj->set_bind_refs(_currentBindGroups);

        _drawables.push_back(obj);
        return *this;
    }

Note: Eventually I extended the IndirectDrawCall struct a bit to also include an “indexed” boolean flag, as we need something like that to select between a DrawIndirect or a DrawIndexedIndirect call:

static void record_draw_ops(WGPURenderBundleBuilder& bld,
                            RefPtr<Drawable>& drawable) {
    if (drawable->indirect_buffer != nullptr) {
        if (drawable->is_indirect_indexed) {
            bld.DrawIndexedIndirect(drawable->indirect_buffer,
                                    drawable->indirect_offset);
        } else {
            bld.DrawIndirect(drawable->indirect_buffer,
                             drawable->indirect_offset);
        }
        return;
    }

    if (drawable->ibo.buffer != nullptr) {
        // Draw indexed:
        bld.DrawIndexed(drawable->ibo.indexCount, drawable->instancesCount);
    } else {
        // No indices so we just draw the vertex count:
        bld.Draw(drawable->ibo.indexCount, drawable->instancesCount);
    }
}

Then just updating “locally” how we add the drawable to the TextRenderer RenderNode to use this indirect buffer:

    // Update the indirect draw call data:
    _drawArgs[0].vertexCount = 6;
    _drawArgs[0].instanceCount = nletters;

    // Write the data to the buffer:
    eng->write_buffer(_drawArgBuffer, 0, _drawArgs.data(),
                      sizeof(DrawArgs) * _drawArgs.size());

    // Next we add the IndirectDrawCall to the render node here:
    _renderNode->add(IndirectDrawCall{_drawArgBuffer, 0, false}, "camera",
                     BindLinearSampler, tex2d(ftex.texture), makeROSto(gdescs));

And this still works! Great 🥳! (i.e. I get the same “Hello manu” message as shown above.)

Now that we have indirect draw calls, it's time to clean up the render node setup in the TextRenderer a bit:

  • Each time a new text slot is added, we should check which font is associated with that text slot,
  • For each distinct font we should add an indirect draw call, and bindings pointing to that font texture,
  • We can then assume that all calls to “add_text” are done *before* we write the render node to a bundle,
  • And any call to set_text should just update the storage buffer containing each character's data, and the indirect buffer containing the count of characters.

⇒ Let's call this a plan and try to implement it this way. […crunching some code now…] Alright, this actually worked without too much trouble 🙂!

I have re-designed a good share of the TextRenderer implementation:

  • The constructor will only create the indirect draw argument buffer,
  • Then when adding new text slots, we will initialize FontContext structs as needed (one context per font):
    auto WGPUTextRenderer::get_or_create_font_context(const RefPtr<WGPUFont>& font)
        -> FontContext& {
    
        // Check if we already have the context for this font:
        for (auto& fctx : _fontContexts) {
            if (fctx.font == font) {
                return fctx;
            }
        }
    
        // We should create a new context:
        auto* eng = WGPUEngine::instance();
        _fontContexts.push_back({.font = font});
        auto& fctx = _fontContexts.back();
        fctx.gdescs.resize(_maxNumGlyphs);
        fctx.glyphBuffer = eng->create_buffer(
            nullptr, _maxNumGlyphs * sizeof(GlyphDesc), BufferUsage::Storage);
    
        NVCHK(_fontContexts.size() <= _maxNumFonts, "Too many font contexts: {}",
              _fontContexts.size());
        return fctx;
    }
  • Next, whenever I modify a text value I simply mark the corresponding font context as dirty,
  • And the actual update of the buffers is done for each dirty font context in the overridden update_buffers() method (since the TextRenderer is a BufferProvider), using this helper:
    auto WGPUTextRenderer::update_font_context(FontContext& fctx) const -> U32 {
    
        // Get a pointer to the start of the glyph data:
        GlyphDesc* gptr = fctx.gdescs.data();
    
        U32 nchars = 0;
    
        // Get a font texture:
        auto* font = fctx.font.get();
        NVCHK(font != nullptr, "Invalid font");
    
        auto& ftex = font->get_texture(_fontResolution);
    
        // Get the texture width and height to be able to convert the glyph
        // location info to normalized uv coords:
        auto tex_width = (F32)ftex.texture.GetWidth();
        auto tex_height = (F32)ftex.texture.GetHeight();
    
        // Update each font text:
        for (auto& desc : fctx.texts) {
            // The number of instances should be the number of letters we have:
            auto cpoints = string_to_codepoints(desc.text);
            U32 nletters = cpoints.size();
            NVCHK(nletters > 0, "Cannot draw text with 0 characters for now.");
    
            nchars += nletters;
    
            // logDEBUG("Rendering text with {} characters.", nletters);
    
            // Also keep track of the current x offset for each glyph:
            F32 xoffset = 0.0;
            F32 spacing = 4.0F;
    
            for (auto& cp : cpoints) {
                // find the glyph info for this codepoint:
                const auto& ginfos = ftex.get_glyph_infos(cp);
                // logDEBUG("Glyph top: {}", ginfos.top);
    
                // create a Vec4 of normalized UV coords: (top_left_u, top_left_v,
                // width, height)
                Vec4f rect(
                    (F32)ginfos.xpos / tex_width, (F32)ginfos.ypos / tex_height,
                    (F32)ginfos.width / tex_width, (F32)ginfos.height / tex_height);
                // logDEBUG("Computed rect: {}", rect);
    
                // Also, the glyph top value here measures the distance between the
                // top of the glyph bitmap and the baseline of the text. So this can
                // be used to compute the relative placement of each character in a
                // string.
                F32 baseline_top_ratio = (F32)ginfos.top / (F32)ginfos.height;
    
                *gptr++ = {rect, {xoffset, baseline_top_ratio, 0.0F, 0.0F}};
    
                // Move the xoffset to the next character position (in normalized
                // coords)
                xoffset += ((F32)ginfos.width + spacing) / tex_width;
                if (ginfos.width == 0) {
                    // Add some additional space in case we wrote a space character:
                    // TODO: compute the mean character width to use it here for the
                    // space character ? Or retrieve this info somehow from the
                    // font.
                    xoffset += 3 * spacing / tex_width;
                }
            }
        }
    
        // Write the glyph data:
        WGPUEngine::instance()->write_buffer(
            fctx.glyphBuffer, 0, fctx.gdescs.data(), nchars * sizeof(GlyphDesc));
    
        // Should mark this font context as not dirty anymore:
        fctx.dirty = false;
    
        // Return the number of chars rendered with this font:
        return nchars;
    }
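Putting these bullets together, the overridden update_buffers() essentially reduces to a dirty-flag loop. Here is a standalone sketch of that pattern (the real method operates on the WGPUTextRenderer members and GPU buffers; the types below are simplified stand-ins I made up for illustration):

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the real FontContext struct:
struct FontContext {
    bool dirty{false};
    std::uint32_t nchars{0}; // stand-in for the actual glyph data
};

// Stand-in for WGPUTextRenderer::update_font_context(): rebuilds
// the glyph buffer of one font context and clears its dirty flag.
static auto update_font_context(FontContext& fctx) -> std::uint32_t {
    fctx.dirty = false;
    return fctx.nchars;
}

// The dirty-flag loop at the heart of update_buffers(): only the
// contexts touched by set_text()/add_text() get rewritten.
static auto update_buffers(std::vector<FontContext>& contexts)
    -> std::uint32_t {
    std::uint32_t total = 0;
    for (auto& fctx : contexts) {
        if (fctx.dirty) {
            total += update_font_context(fctx);
        }
    }
    // The real method would then refresh the indirect draw args
    // with the updated character counts.
    return total;
}
```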

With this in place I can now start updating text slot content dynamically, and we get the updated display:

void ExampleApp::render_frame() {

    // Update the text display:
    static U32 count = 0;
    count++;

    trdr->set_text(0, format_string("Fpack: %d", count / 10), 0);

    eng->update_ubo_providers();

    eng->build_commands().execute_render_pass(0).submit();
    eng->present();
}

⇒ Pretty cool 👍😆!

And now, it's finally time to start thinking about a conceptually more interesting point (from my perspective), which is: “how to position the text slots?”

This is interesting for me because this question feels like a completely new field of grass where I've never been before, and at this particular moment, I'm free 😆… I'm free to “think as I want” and just take any path I want, because there is no constraint (yet, lol).

So, let's think about this then.

First, there is the question of “where” we are going to render the text: technically each glyph is a simple quad. So generally speaking we could consider rendering them just as 3D objects, anywhere in the scene, but could this make sense 🤔?

I think yes, it could make sense to render text in 3D in some cases: for instance when you want some kind of billboard text to show on top of the head of a character, or to display numbers like hit points, object cost, etc. in the world itself. So we have to take this possibility into account.

But there is also the alternate case where we want to just draw the text “on screen”, I mean, projected on screen, and to draw that we of course need an ortho projection matrix.
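For that screen-projected path, the core of the ortho mapping is simple enough to write out by hand: map pixel coordinates in [0, w] × [0, h] to clip space in [-1, 1]. A minimal sketch (this is the generic formula, not the actual NervLand camera code):

```cpp
#include <array>

// Map (x, y) in [0, w] x [0, h] pixel coordinates to clip space
// [-1, 1] x [-1, 1], with the origin at the bottom-left corner.
// This is just the 2D core of a standard ortho projection.
static auto pixel_to_clip(float x, float y, float w, float h)
    -> std::array<float, 2> {
    return {2.0F * x / w - 1.0F, 2.0F * y / h - 1.0F};
}
```

e.g. the screen center (960, 540) on a 1920×1080 projection maps to clip coordinates (0, 0).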

Yet in both cases, we will give our text object a position value as a Vec3f. And in addition to that we will also specify the text anchor point.

For the anchor point we need to support all combinations between [left, center, right] and [top, center, baseline, bottom].

⇒ Let's try a first draft implementation of that!

I started with adding support to compute some infos on a given font texture which I think will be necessary down the line:

static void compute_glyph_details(WGPUFont::FontTexture& ftex) {
    auto& infos = ftex.infos;

    NVCHK(!infos.empty(), "compute_glyph_details: No glyphs provided.");
    F32 mean_width = 0.0F;

    // To compute the "baseline ratio", we need to first check each glyph and
    // figure out what is the largest "top" value, and also compute the largest
    // "bottom" value. Knowing that top is the distance from baseline to the top
    // of the bitmap.
    I32 max_top = 0;
    I32 max_bottom = 0;
    for (const auto& it : infos) {
        mean_width += (F32)it.width;
        NVCHK(it.top >= 0, "Invalid top value.");
        max_top = maximum(max_top, it.top);
        // Compute the bottom value:
        I32 bottom = (I32)it.height - it.top;

        // Note that the "bottom" value can actually be negative, in the case of
        // an "apostrophe" glyph for instance: the bitmap height will be less
        // than the top value.
        max_bottom = maximum(max_bottom, bottom);
    }

    // Now, on any given line we need to be able to place max_top pixels above
    // the baseline and max_bottom pixels below the baseline, which give us the
    // effective max height of the line (in glyph texture pixels here)
    ftex.line_height = max_top + max_bottom;

    // And what we consider as the "line_base" in the Font texture is the
    // measure of the distance from the baseline position to the top of the
    // line, which is exactly this "max_top" value:
    ftex.line_base = max_top;

    ftex.mean_width = mean_width / (F32)infos.size();
    logDEBUG("Computed font details:\n - line_height={}\n - line_base={}\n - "
             "mean_width={}",
             ftex.line_height, ftex.line_base, ftex.mean_width);
}

So now we have those line_height, line_base and mean_width values available to us (and by the way, I think I can now use the “mean_width” as an appropriate replacement for the width of the space glyph, so fixing that…)

When placing a text on screen, we want to control its size by specifying the text height (in projection pixels) or width, or even both. For now, let's only consider the case where we control the text height. We could request for instance to draw a text of height 20 pixels on a projection with size 1920×1080.

This size information should be provided as part of the construction infos for the text slot.
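The scaling itself is then straightforward: with the line_height computed above (in font texture pixels) and the requested text height (in projection pixels), every glyph dimension gets multiplied by the same ratio. A tiny sketch of that assumption:

```cpp
// Scale factor mapping font texture pixels to projection pixels,
// driven by the requested text height: every glyph width/height
// and every x offset gets multiplied by this value.
static auto compute_glyph_scale(float line_height, float requested_height)
    -> float {
    return requested_height / line_height;
}
```

e.g. a font texture with a 40-pixel line_height rendered at a requested height of 20 projection pixels halves every glyph dimension.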

[crunching… crunching… crunching…]

And here we go! My first correct text display using the projection coordinate system:

This is all nice, but in the log I then noticed I could sometimes get “NaN” values for the destination rect of some characters 😅:

2023-07-21 11:27:40.403315 [DEBUG] Using text size: Vec2f(1276.55,     50)
2023-07-21 11:27:40.403333 [DEBUG] Baseline pos: 10.784313
2023-07-21 11:27:40.403336 [DEBUG] Dest glyph 0 rect: Vec4f(     0, 10.7843, 63.4874, 33.3333)
2023-07-21 11:27:40.403339 [DEBUG] Dest glyph 1 rect: Vec4f(72.5571, 10.7843, 49.883, 25.4902)
2023-07-21 11:27:40.403341 [DEBUG] Dest glyph 2 rect: Vec4f(131.51, 10.7843, 11.337, 36.2745)
2023-07-21 11:27:40.403343 [DEBUG] Dest glyph 3 rect: Vec4f(151.916, 10.7843, 11.337, 36.2745)
2023-07-21 11:27:40.403346 [DEBUG] Dest glyph 4 rect: Vec4f(172.323, 10.7843, 54.4178, 25.4902)
2023-07-21 11:27:40.403350 [DEBUG] Dest glyph 5 rect: Vec4f(235.811, -nan  ,      0,      0)
2023-07-21 11:27:40.403352 [DEBUG] Dest glyph 6 rect: Vec4f(267.554, 10.7843, 81.6267, 25.4902)
2023-07-21 11:27:40.403355 [DEBUG] Dest glyph 7 rect: Vec4f(358.251, 10.7843, 49.883, 25.4902)
2023-07-21 11:27:40.403357 [DEBUG] Dest glyph 8 rect: Vec4f(417.203, 10.7843, 47.6156, 25.4902)
2023-07-21 11:27:40.403359 [DEBUG] Dest glyph 9 rect: Vec4f(473.888, 10.7843, 47.6156, 25.4902)
2023-07-21 11:27:40.403362 [DEBUG] Dest glyph 10 rect: Vec4f(530.574, -nan  ,      0,      0)
2023-07-21 11:27:40.403364 [DEBUG] Dest glyph 11 rect: Vec4f(562.317, 10.7843, 49.883, 25.4902)
2023-07-21 11:27:40.403366 [DEBUG] Dest glyph 12 rect: Vec4f(621.27, 10.7843, 47.6156, 25.4902)
2023-07-21 11:27:40.403369 [DEBUG] Dest glyph 13 rect: Vec4f(677.955, 10.7843, 49.883, 35.2941)
2023-07-21 11:27:40.403371 [DEBUG] Dest glyph 14 rect: Vec4f(736.908, -nan  ,      0,      0)
2023-07-21 11:27:40.403373 [DEBUG] Dest glyph 15 rect: Vec4f(768.652, 0.980391, 49.883, 35.2941)
2023-07-21 11:27:40.403376 [DEBUG] Dest glyph 16 rect: Vec4f(827.604, 10.7843, 49.883, 25.4902)
2023-07-21 11:27:40.403378 [DEBUG] Dest glyph 17 rect: Vec4f(886.557, 10.7843, 34.0111, 31.3725)
2023-07-21 11:27:40.403381 [DEBUG] Dest glyph 18 rect: Vec4f(929.638, 0.980391, 49.883, 35.2941)
2023-07-21 11:27:40.403383 [DEBUG] Dest glyph 19 rect: Vec4f(988.59, 10.7843, 11.337, 36.2745)
2023-07-21 11:27:40.403385 [DEBUG] Dest glyph 20 rect: Vec4f(  1009, 0.980391, 49.883, 35.2941)
2023-07-21 11:27:40.403388 [DEBUG] Dest glyph 21 rect: Vec4f(1067.95, 10.7843, 13.6045, 34.3137)
2023-07-21 11:27:40.403390 [DEBUG] Dest glyph 22 rect: Vec4f(1090.62, 0.980391, 52.1504, 35.2941)
2023-07-21 11:27:40.403392 [DEBUG] Dest glyph 23 rect: Vec4f(1151.84, 10.7843, 13.6045, 25.4902)
2023-07-21 11:27:40.403395 [DEBUG] Dest glyph 24 rect: Vec4f(1174.52, 22.549, 29.4763, 3.92157)
2023-07-21 11:27:40.403397 [DEBUG] Dest glyph 25 rect: Vec4f(1213.06,      0, 31.7437, 49.0196)
2023-07-21 11:27:40.403399 [DEBUG] Dest glyph 26 rect: Vec4f(1253.88, 10.7843, 13.6045, 33.3333)

⇒ Well, looking carefully into this, it seems that the “nan” values are produced for the space characters! So that's probably not a big deal, let's quickly fix that. OK, done.
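For the record, the culprit is almost certainly the top / height ratio computed earlier: a space glyph has a zero-height bitmap, so that division is 0/0. A guard like this (a sketch, not necessarily the exact fix used in the code) is enough:

```cpp
// A space glyph has a zero-height bitmap, so (F32)top / (F32)height
// would evaluate 0/0, i.e. NaN. Guard the division:
static auto compute_baseline_top_ratio(int top, int height) -> float {
    return height > 0
               ? static_cast<float>(top) / static_cast<float>(height)
               : 0.0F;
}
```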

Now that we have some initial proper control on the text destination area position and size, let's update the system further to actually use the position provided for each text slot.

But first, to be able to test properly, let's also add a simple point drawing call to show a point at the center of the screen:

Arrggh… I just learned that it's not supported to control the point size with WGSL (see this page for details), so for our usage here, let's rather display a horizontal and a vertical line crossing at the center of the clip space:

    {
        // Draw a simple horizontal line crossing the center of the screen:
        WGPURenderNode rnode(0, "tests/hline");
        rnode.set_primitive_state({.topology = PrimitiveTopology::LineList})
            .add(DrawCall{2}, "camera")
            .add_bundle_to(rpass1);
    }

    {
        // Draw a simple vertical line crossing the center of the screen:
        WGPURenderNode rnode(0, "tests/vline");
        rnode.set_primitive_state({.topology = PrimitiveTopology::LineList})
            .add(DrawCall{2}, "camera")
            .add_bundle_to(rpass1);
    }

And this will produce this display:

And now, let's try to set the position of the text to get it drawn at the center of the screen:

    // Add this text to a textRenderer:
    trdr = eng->create_text_renderer("texts"_sid, {});
    trdr->add_text({.text = "Hello manu and patglqiy:-)!",
                    .position = {1920.0 * 0.5, 1080 * 0.5},
                    .size = {-1.0F, 50.0F}});

And of course, we also need to use this text position as an offset when preparing the glyph destination rects:

            // Finally we can update the destination rect (all in projection
            // coordinates):
            gptr2->dest.set(xpos + tpos.x(), ypos + tpos.y(), width, height);
            logDEBUG("Dest glyph {} rect: {}", i, gptr2->dest);

Bingo! The result is exactly as I was expecting:

So far we have only assumed an anchor point at the bottom-left corner of the text area; we'll work on that in the next section.

In the example above we have only been using a bottom-left anchor point for our text, now let's add support for all of them.

Introducing the TextAnchor enumeration:

enum TextAnchor {
    ANCHOR_BOTTOM_LEFT,
    ANCHOR_BOTTOM_CENTER,
    ANCHOR_BOTTOM_RIGHT,
    ANCHOR_BASE_LEFT,
    ANCHOR_BASE_CENTER,
    ANCHOR_BASE_RIGHT,
    ANCHOR_CENTER_LEFT,
    ANCHOR_CENTER_CENTER,
    ANCHOR_CENTER_RIGHT,
    ANCHOR_TOP_LEFT,
    ANCHOR_TOP_CENTER,
    ANCHOR_TOP_RIGHT,
};

Next I added a dedicated function to compute the offset to apply to the text position for a given anchor point:

static auto compute_text_anchor_offset(Vec2f tsize, TextAnchor anchor,
                                       F32 baseline_pos) -> Vec2f {
    // Compute the anchor offset to apply from the bottom left corner of a given
    // text area:
    // Note that the baseline pos is given from the bottom of the destination
    // area already.
    switch (anchor) {
    case ANCHOR_BOTTOM_LEFT:
        return {0.0F, 0.0F};
    case ANCHOR_BOTTOM_CENTER:
        return {-tsize.x() * 0.5F, 0.0F};
    case ANCHOR_BOTTOM_RIGHT:
        return {-tsize.x(), 0.0F};
    case ANCHOR_BASE_LEFT:
        return {0.0F, -baseline_pos};
    case ANCHOR_BASE_CENTER:
        return {-tsize.x() * 0.5F, -baseline_pos};
    case ANCHOR_BASE_RIGHT:
        return {-tsize.x(), -baseline_pos};
    case ANCHOR_CENTER_LEFT:
        return {0.0F, -tsize.y() * 0.5F};
    case ANCHOR_CENTER_CENTER:
        return {-tsize.x() * 0.5F, -tsize.y() * 0.5F};
    case ANCHOR_CENTER_RIGHT:
        return {-tsize.x(), -tsize.y() * 0.5F};
    case ANCHOR_TOP_LEFT:
        return {0.0F, -tsize.y()};
    case ANCHOR_TOP_CENTER:
        return {-tsize.x() * 0.5F, -tsize.y()};
    case ANCHOR_TOP_RIGHT:
        return {-tsize.x(), -tsize.y()};
    }
    // All anchors are covered above; this fallback just silences
    // "control reaches end of non-void function" warnings:
    return {0.0F, 0.0F};
}

And now we can start testing all the anchor points, for instance, bottom-center:

    trdr->add_text({.text = "Hello manu and patglqiy:-)!",
                    .position = {1920.0 * 0.5, 1080 * 0.5},
                    .size = {-1.0F, 50.0F},
                    .anchor = ANCHOR_BOTTOM_CENTER});

Or center-right:

Or top-center:

⇒ Alright! It seems that everything is working just as expected here, all good 👍😁!

Last modified: 2023/08/03 11:48