CEF Direct offscreen rendering to Direct3D Surfaces

I'm currently working on a project that makes intensive use of CEF to render imagery overlays, and I finally reached the point where it became very interesting to render offscreen content with CEF while avoiding the copy of the generated texture from GPU to CPU and then back to GPU to inject it into another 3D application.

Until very recently, I was just waiting for the official bug report to be handled by someone from the official CEF team, or at least someone more familiar with this giant project. But this wasn't moving fast enough for me, so I finally decided to get my hands dirty and try to fix this myself.

The whole idea is based on the fact that CEF uses the ANGLE library on Windows to convert the GLES commands into DirectX commands, and the ANGLE library supports interactions with other DirectX devices/resources out of the box.

From there, I had to spend a long time reading the CEF sources and documentation to understand how it actually works and where exactly I could inject the changes I needed. The most complex aspect of this project comes from the fact that you basically have two separate “parts”:

  • On one side you have “your client”, which is the part of CEF that runs in your own software process space, where you build your “Browser”, perform different requests, and get feedback.
  • On the other side you have the CEF “GPU service”, which runs in a separate process and is responsible for doing all the rendering requested by the “client” part.

Those two parts communicate through commands sent from the client to the service via shared memory (with optional results sent back), using a fairly complex command buffer system.
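To make this command mechanism a bit more concrete, here is a minimal sketch of how a client could serialize a small fixed-size command carrying a handle value into a byte buffer, and how the service side could decode it. This is not Chromium's actual command buffer format: the struct `CopyToSharedHandleCmd`, the id `kCopyToSharedHandleId` and the functions `EncodeCmd` / `DecodeCmd` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size command: NOT the real Chromium command layout.
struct CopyToSharedHandleCmd {
  uint32_t command_id;    // identifies the command on the service side
  uint32_t texture_id;    // client-side texture id to copy from
  uint64_t handle_value;  // shared handle, passed as a plain integer value
};

// Arbitrary id chosen for this sketch.
constexpr uint32_t kCopyToSharedHandleId = 42;

// Client side: append the command bytes to the (shared) buffer.
void EncodeCmd(std::vector<uint8_t>& buffer, uint32_t texture_id,
               uint64_t handle_value) {
  const CopyToSharedHandleCmd cmd{kCopyToSharedHandleId, texture_id,
                                  handle_value};
  const uint8_t* bytes = reinterpret_cast<const uint8_t*>(&cmd);
  buffer.insert(buffer.end(), bytes, bytes + sizeof(cmd));
}

// Service side: read one command back from the given offset.
// Returns false if the buffer is too short or the command id does not match.
bool DecodeCmd(const std::vector<uint8_t>& buffer, std::size_t offset,
               CopyToSharedHandleCmd* out) {
  if (offset > buffer.size() ||
      buffer.size() - offset < sizeof(CopyToSharedHandleCmd)) {
    return false;
  }
  std::memcpy(out, buffer.data() + offset, sizeof(*out));
  return out->command_id == kCopyToSharedHandleId;
}
```

The only point of the sketch is that the handle crosses the process boundary as a plain value inside a command, which is also what the new command described in this post has to achieve.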

So the starting point was my own software process, where I can create DirectX surfaces with a shared handle. With ANGLE, I can then render onto such a shared handle from a GL ES context. But to do so, I needed a way to pass the shared handle value to the GPU service side, and thus needed to create a new command in the CEF internals.

  • I started from the current CefRenderHandler class, which is used in the current offscreen rendering pipeline to retrieve the generated image from some CPU memory location (via its OnPaint(…) method).
  • From there I figured out that an instance of this class was used in the CefRenderWidgetHostViewOSR class, and more precisely in the helper class CefCopyFrameGenerator.
  • So I extended CefRenderHandler with additional methods that can be overridden to specify that we want to use a shared handle:
      // Check if shared handle should be used with this render handler.
      virtual bool UseSharedHandle() { return false; }
      // Return the shared handle for this renderhandler. If no shared handle
      // is available then null is returned.
      virtual void* GetSharedHandle() { return nullptr; }
Note that one has to be very careful when changing the CEF public API like that: those function prototypes are then used to generate a bunch of auto-generated bindings, so you have to respect a specific syntax (note the comment format for instance, which should be exactly like that :-) ).
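To illustrate how a client might use these two methods, here is a minimal sketch. It does not depend on CEF: `RenderHandlerSketch` is a stand-in that only reproduces the two new virtual methods shown above, and `SharedSurfaceHandler` is a hypothetical client class that returns a handle obtained elsewhere (for instance when the Direct3D surface is created).

```cpp
// Stand-in for the patched CefRenderHandler: only the two new methods are
// reproduced here so that the sketch compiles without CEF.
class RenderHandlerSketch {
 public:
  virtual ~RenderHandlerSketch() = default;

  // Check if shared handle should be used with this render handler.
  virtual bool UseSharedHandle() { return false; }

  // Return the shared handle for this render handler, or null if none.
  virtual void* GetSharedHandle() { return nullptr; }
};

// Hypothetical client-side handler exposing a shared handle created elsewhere.
class SharedSurfaceHandler : public RenderHandlerSketch {
 public:
  explicit SharedSurfaceHandler(void* handle) : handle_(handle) {}

  // Only request the direct path when a handle is actually available.
  bool UseSharedHandle() override { return handle_ != nullptr; }
  void* GetSharedHandle() override { return handle_; }

 private:
  void* handle_;  // not owned by this class
};
```

A handler that does not override anything keeps the regular CPU path, since the default implementations return false and nullptr.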
  • Then I ended up updating the function CefCopyFrameGenerator::PrepareTextureCopyOutputResult to support 2 different processing pipelines: the regular one using a copy in CPU memory with a Skia bitmap, or the new direct one rendering onto a provided Direct3D surface if a valid shared handle is provided by the render handler:
      void PrepareTextureCopyOutputResult(
          const gfx::Rect& damage_rect,
          std::unique_ptr<cc::CopyOutputResult> result) {
        base::ScopedClosureRunner scoped_callback_runner(
            base::Bind(/* ... */,
                       weak_ptr_factory_.GetWeakPtr(), damage_rect));

        const gfx::Size& result_size = result->size();
        DEBUG_MSG2("Texture size is: "<<result_size.width()<<"x"<<result_size.height());

        // Shared code: we need the gl_helper in both rendering paths:
        content::ImageTransportFactory* factory =
            content::ImageTransportFactory::GetInstance();
        viz::GLHelper* gl_helper = factory->GetGLHelper();
        if (!gl_helper) {
          DEBUG_MSG("NervCEF: ERROR: Invalid gl_helper");
          return;
        }

        // Select the appropriate rendering path depending on the shared handle being provided or not:
        CefRefPtr<CefRenderHandler> handler = view_->browser_impl()->GetClient()->GetRenderHandler();
        if(handler.get() && handler->UseSharedHandle()) {
          // Direct rendering on shared handle path:
          // Prepare the texture mail box:
          viz::TextureMailbox texture_mailbox;
          std::unique_ptr<cc::SingleReleaseCallback> release_callback;
          result->TakeTexture(&texture_mailbox, &release_callback);
          if (!texture_mailbox.IsTexture())
            return;

          void* handle = handler->GetSharedHandle();
          if(handle == nullptr) {
            DEBUG_MSG("NervCEF: ignoring null shared handle.");
            return;
          }

          // Here we call another function to copy the texture to the shared handle:
          DEBUG_MSG2("NervCEF: using shared handle: "<<(const void*)handle);
          gl_helper->NervCopyMailboxToSharedHandle(texture_mailbox.mailbox(), texture_mailbox.sync_token(), handle);
        } else {
          // Regular on CPU rendering path:
          // Allocate the Skia bitmap:
          SkIRect bitmap_size;
          if (bitmap_)
            bitmap_->getBounds(&bitmap_size);

          if (!bitmap_ || bitmap_size.width() != result_size.width() ||
              bitmap_size.height() != result_size.height()) {
            // Create a new bitmap if the size has changed.
            bitmap_.reset(new SkBitmap);
            bitmap_->allocN32Pixels(result_size.width(), result_size.height(), true);
            if (bitmap_->drawsNothing())
              return;
          }

          // Retrieve the pixel buffer:
          uint8_t* pixels = static_cast<uint8_t*>(bitmap_->getPixels());

          // Prepare the texture mail box:
          viz::TextureMailbox texture_mailbox;
          std::unique_ptr<cc::SingleReleaseCallback> release_callback;
          result->TakeTexture(&texture_mailbox, &release_callback);
          if (!texture_mailbox.IsTexture())
            return;

          // Read the texture back into the bitmap pixels:
          gl_helper->CropScaleReadbackAndCleanMailbox(
              texture_mailbox.mailbox(), texture_mailbox.sync_token(), result_size,
              gfx::Rect(result_size), result_size, pixels, kN32_SkColorType,
              base::Bind(/* ... */,
                  weak_ptr_factory_.GetWeakPtr(), base::Passed(&release_callback),
                  damage_rect, base::Passed(&bitmap_)),
              /* ... */);
        }
      }
  • As shown in the code snippet above, the shared handle pipeline relies on a new command created specifically to send the shared handle to the GPU service and to request a copy of a given GL texture_id onto that shared handle surface: NervCopyMailboxToSharedHandle. Again, building a new command in CEF/chromium is not very clear the first time you do it: you have to provide some kind of prototype (in “gpu/command_buffer/cmd_buffer_functions.txt”), then update some metadata for your new function in “gpu/command_buffer/”, then actually call this script to create all the auto-generated content required to make the new command available, and finally provide some custom implementation in different locations (depending on how your new command behaves, I believe).
  • For this new function, the main implementation is done in GLES2DecoderImpl::HandleNervCopyTextureToSharedHandle(…) which is now a pretty big function so I will not copy it here. But this is where all the real magic happens, because: you are now in the GPU service process, you have the value of the shared handle, you have the texture_id you want to use as source texture, and you have the GLES context available sharing this texture_id content with you :-)
  • So, from there, the idea is to perform the following:
    1. Create an EGL context sharing content with the GLES2Decoder default context, so that we still have access to the texture data in our new context while avoiding messing too much with the current rendering state of the default context. (Actually, I'm not absolutely sure this is really needed: maybe we could use the default context directly to do the rendering?)
    2. Set up a pbuffer with the shared handle (as described on the ANGLE interop page).
    3. Set up the resources required to render a screen-aligned quad: a program that simply copies the input texture onto the render surface, a vertex buffer to draw 2 triangles, and an index buffer.
    4. Then, each time our new command is executed, we carefully replace the current context with our own context, copy the provided texture onto our pbuffer, and then restore the default current context just how it was:
      // Get the current read and draw surfaces:
      EGLSurface drawSurface = eglGetCurrentSurface(EGL_DRAW);
      EGLSurface readSurface = eglGetCurrentSurface(EGL_READ);

      // Assign our surface as current draw/read target:
      EGLSurface ourSurface = gSurfaces[handle];
      if(eglMakeCurrent(egl_display, ourSurface, ourSurface, nvContext) != EGL_TRUE) {
        NV_LOG("GLES2DecoderImpl: ERROR: eglMakeCurrent failed with: "<< ui::GetLastEGLErrorString());
        // Try to restore the previous surfaces:
        eglMakeCurrent(display, drawSurface, readSurface, curContext);
        return error::kNoError;
      }

      // init the context if necessary:
      if(program == 0) {
        NV_LOG2("GLES2DecoderImpl: Initializing context resources.")
        nvInitContext();
      }

      // Convert the client texture_id to our service texture_id:
      uint32_t service_tex_id=0;
      if(GetServiceTextureId(texture_id, &service_tex_id)) {
        NV_LOG2("GLES2DecoderImpl: Drawing from service texture_id: "<<service_tex_id);
        nvDraw(service_tex_id);
      }
      else {
        NV_LOG("GLES2DecoderImpl: ERROR: Cannot retrieve service texture for client id: "<<texture_id);
      }

      // float v1 = 1.0f * (rand()/(float)RAND_MAX);
      // float v2 = 1.0f * (rand()/(float)RAND_MAX);
      // NV_LOG2("GLES2DecoderImpl: Clearing shared handle surface with color: ("<<v1<<","<<v2<<",0.0)");
      // First we try to just display some fixed color (red):
      // glViewport(0, 0, 1920, 1080);
      // glClearColor(v1, v2, 0.0f, 1.0f);
      // glClear(GL_COLOR_BUFFER_BIT);

      // Now that we are done, we restore the previous current context/surfaces:
      eglMakeCurrent(display, drawSurface, readSurface, curContext);
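The "switch, draw, restore" pattern above has to restore the previous binding on every exit path, including the early return on failure. One way to make this harder to get wrong is a small scope guard. The following is only a sketch of that idea and not part of the patch: `ScopedRestore`, `FakeBinding` and `DrawWithOwnContext` are hypothetical names, and the EGL state is replaced by a plain struct so that the example compiles without any EGL headers. In the real code the closure would call eglMakeCurrent with the saved surfaces and context.

```cpp
#include <functional>
#include <utility>

// Generic scope guard: runs the stored closure when leaving the scope.
class ScopedRestore {
 public:
  explicit ScopedRestore(std::function<void()> restore)
      : restore_(std::move(restore)) {}
  ~ScopedRestore() {
    if (restore_) restore_();
  }
  ScopedRestore(const ScopedRestore&) = delete;
  ScopedRestore& operator=(const ScopedRestore&) = delete;

 private:
  std::function<void()> restore_;
};

// Stand-in for the "current context/surface" state normally held by EGL.
struct FakeBinding {
  int context;
  int surface;
};

// Switch to our own context/surface, do the work, and always restore the
// previous binding, even when returning early.
bool DrawWithOwnContext(FakeBinding& current, int ourContext, int ourSurface,
                        bool failEarly) {
  const FakeBinding previous = current;
  ScopedRestore restore([&current, previous] { current = previous; });

  current = FakeBinding{ourContext, ourSurface};
  if (failEarly) {
    return false;  // the guard still restores the previous binding
  }
  // ... the actual drawing would happen here ...
  return true;
}
```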
  • Initialization is done in the nvInitContext() function (only once). Then the custom nvDraw() function uses all the resources we allocated to copy the input texture onto the current render surface (i.e. our pbuffer, which is the shared DirectX surface), as shown below:
    void nvDraw(uint32_t texture)
    {
      // NV_CHECK(eglReleaseTexImage(winDisplay, pbuffer, EGL_BACK_BUFFER), "Cannot release pbuffer tex image.");
      // NV_CHECK(eglMakeCurrent(winDisplay, pbuffer, pbuffer, winContext), "Cannot make EGL surface current.");
      // float v1 = 1.0f * (rand()/(float)RAND_MAX);
      // float v2 = 1.0f * (rand()/(float)RAND_MAX);
      // NV_LOG2("GLES2DecoderImpl: Clearing shared handle surface with color: ("<<v1<<","<<v2<<",0.0)");
      // // elapsed += 0.01;
      // // glClearColor((sin(elapsed) + 1.0f) * 0.5f, 0.0f, 0.0f, 1.0f);
      // glClearColor(v1, v2, 0.0f, 1.0f);
      // glClear(GL_COLOR_BUFFER_BIT);
      // NV_CHECK(eglBindTexImage(winDisplay, pbuffer, EGL_BACK_BUFFER), "Cannot release pbuffer tex image.");
      // NV_CHECK(eglMakeCurrent(winDisplay, winSurface, winSurface, winContext), "Cannot make EGL surface current.");

      // Set the viewport
      glViewport(0, 0, pWidth, pHeight);

      // Random clear color, only used for debugging:
      float v1 = 1.0f * (rand()/(float)RAND_MAX);
      float v2 = 1.0f * (rand()/(float)RAND_MAX);

      // Ensure we have a proper color mask set:
      glColorMask (GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
      glClearColor (v1, v2, 0.0f, 1.0f);
      // glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

      // Clear the color buffer
      // glClearColor(1.0f, 0.5f, 1.0f, 0.0f);
      glClear(GL_COLOR_BUFFER_BIT);

      // Remember the currently bound buffers so that we can restore them:
      GLint curBuffer = 0;
      glGetIntegerv(GL_ARRAY_BUFFER_BINDING, &curBuffer);
      // if(curBuffer == 0) {
      //   NV_LOG2("Current buffer is 0")
      // }
      GLint curIdxBuffer = 0;
      glGetIntegerv(GL_ELEMENT_ARRAY_BUFFER_BINDING, &curIdxBuffer);
      // if(curIdxBuffer == 0) {
      //   NV_LOG2("Current index buffer is 0")
      // }

      // We bind the vertex buffer:
      glBindBuffer(GL_ARRAY_BUFFER, nvVertBuffer);
      // We bind the index buffer:
      glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, nvIdxBuffer);

      // Use the program object
      glUseProgram(program);

      // Load the vertex position
      // glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), vVertices);
      glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), NULL);
      // Load the texture coordinate
      // glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), &vVertices[3]);
      glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), reinterpret_cast<const void*>(3*sizeof(GLfloat)));
      glEnableVertexAttribArray(0);
      glEnableVertexAttribArray(1);

      // Bind the texture
      // glBindTexture(GL_TEXTURE_2D, nvTexture);
      glActiveTexture(GL_TEXTURE0);
      glBindTexture(GL_TEXTURE_2D, texture);

      // Set the sampler texture unit to 0
      // logDEBUG("Using sample uniform location: " << sloc);
      glUniform1i(texLoc, 0);
      glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);

      // Then we restore the bound buffer:
      glBindBuffer(GL_ARRAY_BUFFER, curBuffer);
      glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, curIdxBuffer);

      // Also unbind our texture here:
      glBindTexture(GL_TEXTURE_2D, 0);
      // eglSwapBuffers(winDisplay, winSurface);
      // ValidateRect(hwnd, NULL);
    }
  • As a side note, when done with this copy operation, we delete the texture_id (or at least the client reference on the real texture), since this seems to be what was done anyway with the regular CPU rendering pipeline:
      // When done we should delete our source texture directly:
      NV_LOG2("GLES2DecoderImpl: Deleting texture with client id="<<texture_id<<" (service_id="<<service_tex_id<<")");
      DeleteTexturesHelper(1, &texture_id);
      NV_LOG2("GLES2DecoderImpl: Done copying to shared handle surface.");
In the regular rendering pipeline, we rather call gl_->DeleteTextures(1, &mailbox_texture); on the client side, but I believe this will eventually just execute the equivalent of DeleteTexturesHelper(1, &texture_id); on the service side, so I thought I could save a command here.
As you can see from the code snippets above, the code is currently still full of commented tests, aggressive checks, debug outputs, unoptimized sections, etc. So this is still a very early implementation and will probably require some refactoring before it can be used in a general way by the CEF/chromium community :-)
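For reference, the screen-aligned quad mentioned in step 3 can be described with very little data. The sketch below shows one possible layout that matches the attribute setup used in nvDraw() (a stride of 5 floats per vertex, texture coordinates starting after 3 floats, and 6 indices of type GL_UNSIGNED_SHORT). The actual values used in the patch are not shown in this post, so the coordinates, the vertex order and the names `kQuadVertices` / `kQuadIndices` are assumptions, and the orientation of the texture coordinates may need to be flipped depending on the target surface.

```cpp
#include <cstddef>
#include <cstdint>

// One vertex = position (x, y, z) followed by texture coordinates (u, v).
constexpr std::size_t kFloatsPerVertex = 5;
constexpr std::size_t kStrideBytes = kFloatsPerVertex * sizeof(float);
constexpr std::size_t kTexCoordOffsetBytes = 3 * sizeof(float);

// Full-screen quad in normalized device coordinates (assumed values).
constexpr float kQuadVertices[4 * kFloatsPerVertex] = {
    -1.0f,  1.0f, 0.0f,  0.0f, 1.0f,  // top left
    -1.0f, -1.0f, 0.0f,  0.0f, 0.0f,  // bottom left
     1.0f, -1.0f, 0.0f,  1.0f, 0.0f,  // bottom right
     1.0f,  1.0f, 0.0f,  1.0f, 1.0f,  // top right
};

// Two triangles sharing one diagonal: 6 indices stored as 16-bit values.
constexpr std::uint16_t kQuadIndices[6] = {0, 1, 2, 0, 2, 3};
```

These two arrays are what would be uploaded once into the vertex and index buffers during initialization, so that each draw only has to bind them.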

During my initial tests, I was modifying the CEF/chromium sources directly, and then building CEF following the official instructions. But I quickly realized this would become a nightmare to maintain except if I were to keep it all under git control which seems even more frightening when you consider the size of this project!

So I took a different path: I built a separate collection of the CEF files I needed to modify, along with a set of script functions used to:

  1. checkout a given branch of CEF (branch 3163 by default):
    # Update of the CEF sources to a given branch
    nv_cef_update() {
      local branch="3163"
      local bdir=`cygpath -w "$_cef_build_dir"`
      # echo "using build dir: $bdir"
      cd "$_cef_build_dir/chromium_git"
      nv_cef_call_python ../automate/ --download-dir=$bdir/chromium_git --depot-tools-dir=$bdir/depot_tools --no-distrib --no-build --branch=$branch
      cd - > /dev/null
    }
  2. Override all the base files from CEF with the updated versions I stored separately, then generate the project files. This script also takes care of re-generating all the auto-generated files based on the modifications we just injected in the code:
    # create the cef projects:
    nv_cef_create_project() {
      # Copy all the updated files:
      cp -Rf "$_cef_patch_dir/src" "$_cef_build_dir/chromium_git/chromium/"
      # Once we are done copying the files ensure we call the translator tool:
      cd "$_cef_build_dir/chromium_git/chromium/src/cef/tools"
      nv_cef_call_python --root-dir ..
      cd - > /dev/null
      # Update the command buffer functions:
      cd "$_cef_build_dir/chromium_git/chromium/src"
      nv_cef_call_python "gpu\\command_buffer\\"
      cd - > /dev/null
      # Create the project files:
      cd "$_cef_build_dir/chromium_git/chromium/src/cef"
      nv_cef_call_python tools/
      cd - > /dev/null
    }
  3. Then another script to perform the actual build:
    # build the cef library:
    nv_cef_build() {
      local btype=${1:-Release}
      cd "$_cef_build_dir/chromium_git/chromium/src"
      nv_cef_call_ninja -C "out\\${btype}_GN_x64" cef
      echo "Done building CEF"
      cd - > /dev/null
    }
  4. And finally another script to generate the distrib folder:
    # Make CEF distrib:
    nv_cef_make_distrib() {
      #  cf.
      local vsdir=`nv_get_visualstudio_dir`
      vsdir=`nv_to_win_path $vsdir`
      export CEF_VCVARS="$vsdir\\VC\\bin\\amd64\\vcvars64.bat"
      cd "$_cef_build_dir/chromium_git/chromium/src/cef/tools"
      nv_cef_call_python --output-dir ../binary_distrib/ --ninja-build --x64-build --allow-partial
      echo "Done packaging CEF"
      cd - > /dev/null
    }
Note that I'm executing all the scripts above from a Cygwin environment.

To be able to use those scripts, one should only need to provide 2 folder locations:

  1. The location where CEF should be built (ie. containing the official CEF sources)
  2. The location where the patched files are (which will be the folder containing the script file itself if you use my package below)

Those locations can be configured at the beginning of the script file I'm providing below:

# CEF Build dir: this is the root folder where CEF is built:

# CEF patch dir: this is the folder containing all the updated files required to build
# our patched version of CEF (with support for the Direct rendering to Direct3D)

This CEF patch currently uses two debug logging mechanisms that I created specifically for my investigations (I don't know how to use the CEF logging system properly, and in fact I didn't even want to learn that part ;-) ).

  • On one side, in the “client” process, we can make use of a “Log handler”:
    #include <sstream>
    #include <string>

    namespace cef {

    struct CefLogHandler {
      CefLogHandler() {}
      virtual ~CefLogHandler() {}

      // Log a message:
      virtual void log(const std::string& msg) = 0;
    };

    // Function used to set our log handler:
    void setLogHandler(CefLogHandler* handler);

    // Function used to actually log a message:
    void handleLogMessage(unsigned int level, const std::string& msg);

    }

    // Define a DEBUG MESSAGE macro:
    #define DEBUG_MSG(msg) { \
        std::ostringstream os; \
        os.precision(9); \
        os << std::fixed << msg; \
        cef::handleLogMessage(1, os.str()); \
      }

    #define DEBUG_MSG2(msg) { \
        std::ostringstream os; \
        os.precision(9); \
        os << std::fixed << msg; \
        cef::handleLogMessage(2, os.str()); \
      }

This means that in my software I can assign a CefLogHandler instance with an overridden log() method to receive log messages originating from CEF directly in my own logging system, which makes it all more consistent from my point of view.
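As an illustration of that usage, here is a minimal self-contained sketch. Since the patched CEF headers are not available here, the interface is reproduced in a separate `cef_sketch` namespace with placeholder implementations of setLogHandler() and handleLogMessage(); the real implementations in the patch may differ, in particular regarding level filtering. `VectorLogHandler` is a hypothetical application-side handler that simply stores the messages.

```cpp
#include <string>
#include <vector>

namespace cef_sketch {

// Same interface as the log handler described above.
struct CefLogHandler {
  virtual ~CefLogHandler() {}
  virtual void log(const std::string& msg) = 0;
};

// Placeholder storage for the currently installed handler (assumption).
static CefLogHandler* g_handler = nullptr;

void setLogHandler(CefLogHandler* handler) { g_handler = handler; }

// Placeholder implementation: forwards every message, ignoring the level.
void handleLogMessage(unsigned int /*level*/, const std::string& msg) {
  if (g_handler) g_handler->log(msg);
}

}  // namespace cef_sketch

// Application-side handler: a real application would forward the message to
// its own logging system instead of storing it in a vector.
class VectorLogHandler : public cef_sketch::CefLogHandler {
 public:
  void log(const std::string& msg) override { messages.push_back(msg); }

  std::vector<std::string> messages;
};
```

The handler is installed once at startup with setLogHandler(), and must stay alive for as long as it is registered.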

  • On the other side, this approach was not possible in the “GPU service process” (it would require building yet more complex commands to send log data to the client process, and I certainly didn't want to go that way). So instead, there is another simple logging system on that side, which currently logs everything into a file called “cef_nv.log” (see the nv_logger.{cc, h} files):
    #include <sstream>
    #include <string>

    namespace nv {

    // Helper logging function used to log debug outputs to a file.
    void nvLOG(unsigned int level, const std::string& msg);

    }

    // Define a DEBUG MESSAGE macro:
    #define NV_LOG(msg) { \
        std::ostringstream os; \
        os.precision(9); \
        os << std::fixed << msg; \
        nv::nvLOG(1, os.str()); \
      }

    #define NV_LOG2(msg) { \
        std::ostringstream os; \
        os.precision(9); \
        os << std::fixed << msg; \
        nv::nvLOG(2, os.str()); \
      }

    #define NV_LOG3(msg) { \
        std::ostringstream os; \
        os.precision(9); \
        os << std::fixed << msg; \
        nv::nvLOG(3, os.str()); \
      }

Both of those logging systems have a verbosity controlled by the environment variable “NV_CEF_LOG_LEVEL”, which is expected to take one of 3 values (the default if not defined is 0):

  1. NV_CEF_LOG_LEVEL=0 ⇒ No log output at all
  2. NV_CEF_LOG_LEVEL=1 ⇒ Minimal log outputs (mainly errors if any)
  3. NV_CEF_LOG_LEVEL=2 ⇒ Maximum output level (errors and infos)
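The level check itself can be sketched in a few lines. The code below is not taken from the patch: `ParseLogLevel`, `ReadLogLevelFromEnv` and `ShouldLog` are hypothetical helpers, and the rule "a message is written when its level is lower than or equal to the configured level" is an assumption that is consistent with the three values listed above.

```cpp
#include <cstdlib>

// Parse a level string; a missing, empty or invalid value falls back to 0,
// which means no log output at all.
unsigned int ParseLogLevel(const char* value) {
  if (value == nullptr || *value == '\0') return 0;
  char* end = nullptr;
  const long parsed = std::strtol(value, &end, 10);
  if (*end != '\0' || parsed < 0) return 0;
  return static_cast<unsigned int>(parsed);
}

// Read the configured verbosity from the environment.
unsigned int ReadLogLevelFromEnv() {
  return ParseLogLevel(std::getenv("NV_CEF_LOG_LEVEL"));
}

// Message levels start at 1, so a configured level of 0 never logs.
bool ShouldLog(unsigned int message_level, unsigned int configured_level) {
  return message_level <= configured_level;
}
```

With this rule, level 1 only lets the error messages through, while level 2 also lets the informational messages through.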

So finally, here is a link to a github repo I just created to hold those patch files: this repo contains the script with the functions described above, and all the modified CEF files in the src/ folder but no real file history (as all my source files are stored on a different [private] repo I have)

Github repo:

⇒ If you have any questions or problems you can post a comment here, open an issue on the github repo, or contact me on linkedin/facebook/google+ or by email at roche.emmanuel (gmail account).

During the work on this project I also took a lot of notes on what I was doing, what was working, what was not, etc: it's a bit messy, but it still contains valuable info if you need to know more. So you can access those notes on this page: CEF direct offscreen rendering to Direct3D notes

I noticed during the initial usage tests of this patch that there seems to be a serious issue in the GL texture release process (textures are simply not released anymore if the requests for new frames come too quickly). I'm currently investigating this issue.

  • blog/2017/1130_cef_direct_copy_to_d3d.txt
  • Last modified: 2021/09/02 13:38
  • by manu