
Direct GPU buffer copy in CEF

Building CEF from sources

Building CEF directly from sources was less complicated than I anticipated: I mainly followed the instructions from the Master Build Quick Start guide, also taking some notes from the official Branches and Building page.

Initial steps:

In the end, I simply created a few helper batch files, placed in the chromium_git folder (note that my build root for CEF is: D:\Projects\CEFBuild):

  1. First the update.bat file, used to retrieve the code updates:
    set CEF_USE_GN=1
    REM set GN_DEFINES=is_win_fastlink=true
    set GN_ARGUMENTS=--ide=vs2015 --sln=cef --filters=//cef/*
    python ..\automate\automate-git.py --download-dir=D:\Projects\CEFBuild\chromium_git --depot-tools-dir=D:\Projects\CEFBuild\depot_tools --no-distrib --no-build --branch=3163
  2. Then the create_cef_project.bat file, used to create the CEF Visual studio/Ninja project files:
    cd D:\Projects\CEFBuild\chromium_git\chromium\src\cef
    
    set CEF_USE_GN=1
    REM set GN_DEFINES=is_win_fastlink=true
    set GN_ARGUMENTS=--ide=vs2015 --sln=cef --filters=//cef/*
    call cef_create_projects.bat
  3. Then the build_cef_release.bat to build the release version of CEF:
    cd D:\Projects\CEFBuild\chromium_git\chromium\src
    ninja -C out\Release_GN_x64 cef
  4. And finally the make_distrib.bat file which will generate the binary_distrib folder inside D:\Projects\CEFBuild\chromium_git\chromium\src\cef:
    REM cf. https://bitbucket.org/chromiumembedded/cef/wiki/BranchesAndBuilding.md
    
    set CEF_VCVARS="D:\Apps\VisualStudio2015_CE\VC\bin\amd64\vcvars64.bat"
    
    cd D:\Projects\CEFBuild\chromium_git\chromium\src\cef\tools
    call make_distrib.bat --ninja-build --x64-build --allow-partial

One thing to note though: by default on Windows the build system will attempt to build both x86 and x64 binaries. Some dependencies were not available on my computer to build the x86 version of the library, so I had to disable it manually (before running create_cef_project.bat, AFAIR) by tweaking the Windows branch in D:\Projects\CEFBuild\chromium_git\chromium\src\cef\tools\gn_args.py to only request the x64 build:

  if platform == 'linux':
    use_sysroot = GetArgValue(args, 'use_sysroot')
    if use_sysroot:
      # Only generate configurations for sysroots that have been installed.
      for cpu in ('x86', 'x64', 'arm'):
        if LinuxSysrootExists(cpu):
          supported_cpus.append(cpu)
        else:
          msg('Not generating %s configuration due to missing sysroot directory'
              % cpu)
    else:
      supported_cpus = ['x64']
  elif platform == 'windows':
    # supported_cpus = ['x86', 'x64']
    supported_cpus = ['x64']
  elif platform == 'macosx':
    supported_cpus = ['x64']
  else:
    raise Exception('Unsupported platform')

In hindsight, there may be a more appropriate way to handle this, for instance by specifying a target_cpu argument somewhere on the command line (to be investigated).

Note: Currently experimenting with CEF branch 3163

Trying to automate header copy in make_distrib

The logic used to copy the headers during the make_distrib step is found in make_distrib.py around line 525:

  # transfer common include files
  transfer_gypi_files(cef_dir, cef_paths2['includes_common'], \
                      'include/', include_dir, options.quiet)

Investigating current Off Screen Rendering (OSR) pipeline

Implement CEF logging capability

We need to be able to retrieve some debug information from the CEF library in an integrated way: we need to be able to assign a log handler inside the library, and then use this handler to output log messages when it is valid.

Can we simply add a new class/method inside the source base ? Let's try that: building the LogHandler class and trying to use it in CefRenderWidgetHostViewOSR:
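As an illustration only (this is not the actual class added to the CEF sources), a handler of this kind could look roughly like the sketch below. Every name in it (LogHandler, SetLogHandler, SKETCH_LOG, MemoryLogHandler) is hypothetical, and thread safety is deliberately ignored:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical interface; the names are illustrative, not part of CEF.
class LogHandler {
 public:
  virtual ~LogHandler() = default;
  virtual void Log(const std::string& message) = 0;
};

// Library-wide handler slot, null until the host application installs one.
// Not thread-safe: a real version would need synchronization.
inline LogHandler*& CurrentLogHandler() {
  static LogHandler* handler = nullptr;
  return handler;
}

inline void SetLogHandler(LogHandler* handler) {
  CurrentLogHandler() = handler;
}

// Formats and forwards a message only when a handler is installed.
#define SKETCH_LOG(expr)                                \
  do {                                                  \
    if (LogHandler* sketch_h = CurrentLogHandler()) {   \
      std::ostringstream sketch_os;                     \
      sketch_os << expr;                                \
      sketch_h->Log(sketch_os.str());                   \
    }                                                   \
  } while (0)

// Example handler that keeps messages in memory.
class MemoryLogHandler : public LogHandler {
 public:
  void Log(const std::string& message) override {
    messages.push_back(message);
  }
  std::vector<std::string> messages;
};
```

With something like this in place, a call such as SKETCH_LOG("CompositingSurface has texture!") inside the library costs almost nothing when no handler is installed, since the message is not even formatted in that case.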

Setup access to LogHandler

Now that we have LogHandler support, we should assign the handler when our software starts. How would we do that ?

So we need to figure out what we are missing here and how to make our function available from the libcef library to the outside world.

⇒ We expect to get one or the other of those 2 debug outputs (or both ?)… well, instead we got a crash :-( but it doesn't seem to be related to what we are changing here:

2017-09-15T10:19:33.735898 [Debug]            Calling CefInitialize...
[0915/111933.735:FATAL:libcef_dll_wrapper.cc(209)] Check failed: false.

Setup access to LogHandler - Second try

Okay, so we are probably not too far from success, BUT we should avoid making changes to the include/capi folder, and thus not include our new exported function in cef_app_capi.h ⇒ This doesn't matter much: let's just define it in a separate file!
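A minimal sketch of what such a separately declared export could look like, assuming a plain C-style entry point that takes a callback. The macro, the function names and the callback signature are all hypothetical; only the general idea (an exported function living outside include/capi) comes from the notes above:

```cpp
#include <string>

#if defined(_WIN32)
#define SKETCH_EXPORT extern "C" __declspec(dllexport)
#else
#define SKETCH_EXPORT extern "C" __attribute__((visibility("default")))
#endif

// C-compatible callback type, so that no C++ class crosses the DLL boundary.
typedef void (*sketch_log_callback_t)(const char* message, void* user_data);

namespace {
sketch_log_callback_t g_callback = nullptr;
void* g_user_data = nullptr;
}  // namespace

// Exported entry point (hypothetical name), meant to be declared in its own
// header instead of one of the generated capi headers.
SKETCH_EXPORT void sketch_set_log_callback(sketch_log_callback_t callback,
                                           void* user_data) {
  g_callback = callback;
  g_user_data = user_data;
}

// Library-internal helper: forwards a message if a callback is installed.
void SketchLog(const std::string& message) {
  if (g_callback)
    g_callback(message.c_str(), g_user_data);
}
```

The host application would then declare the same prototype on its side and call sketch_set_log_callback() once, before or right after initializing the library.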

Great! :-) We now have the expected outputs:

2017-09-17T13:37:32.881686 [Debug]            NervCEF: CompositingSurface has texture!
2017-09-17T13:37:33.056642 [Debug]            NervCEF: CompositingSurface has texture!
2017-09-17T13:37:33.377407 [Debug]            NervCEF: CompositingSurface has texture!

Investigating image copy pipeline

Now that we can produce debug output when needed, let's try to clarify the offscreen image copy pipeline further.

First, let's check what kind of messages we got in the previous run test:

⇒ It all comes down to the method: viz::GLHelper::CropScaleReadbackAndCleanMailbox(…)

Found the following description for this method in components/viz/common/gl_helper.h:

  // Copies the block of pixels specified with |src_subrect| from |src_mailbox|,
  // scales it to |dst_size|, and writes it into |out|.
  // |src_size| is the size of |src_mailbox|. The result is in |out_color_type|
  // format and is potentially flipped vertically to make it a correct image
  // representation.  |callback| is invoked with the copy result when the copy
  // operation has completed.
  // Note that the texture bound to src_mailbox will have the min/mag filter set
  // to GL_LINEAR and wrap_s/t set to CLAMP_TO_EDGE in this call. src_mailbox is
  // assumed to be GL_TEXTURE_2D.

So our parameter “src_mailbox” is assumed to be a GL_TEXTURE_2D ⇒ sounds good for us! Now checking the method implementation in components/viz/common/gl_helper.cc:

void GLHelper::CropScaleReadbackAndCleanMailbox(
    const gpu::Mailbox& src_mailbox,
    const gpu::SyncToken& sync_token,
    const gfx::Size& src_size,
    const gfx::Rect& src_subrect,
    const gfx::Size& dst_size,
    unsigned char* out,
    const SkColorType out_color_type,
    const base::Callback<void(bool)>& callback,
    GLHelper::ScalerQuality quality) {
  GLuint mailbox_texture = ConsumeMailboxToTexture(src_mailbox, sync_token);
  CropScaleReadbackAndCleanTexture(mailbox_texture, src_size, src_subrect,
                                   dst_size, out, out_color_type, callback,
                                   quality);
  gl_->DeleteTextures(1, &mailbox_texture);
}

Checking the implementation of ConsumeMailboxToTexture:

GLuint GLHelper::ConsumeMailboxToTexture(const gpu::Mailbox& mailbox,
                                         const gpu::SyncToken& sync_token) {
  if (mailbox.IsZero())
    return 0;
  if (sync_token.HasData())
    WaitSyncToken(sync_token);
  GLuint texture =
      gl_->CreateAndConsumeTextureCHROMIUM(GL_TEXTURE_2D, mailbox.name);
  return texture;
}

And finally we find for CreateAndConsumeTextureCHROMIUM (inside gpu/command_buffer/client/gles2_implementation.cc):

GLuint GLES2Implementation::CreateAndConsumeTextureCHROMIUM(
    GLenum target, const GLbyte* data) {
  GPU_CLIENT_SINGLE_THREAD_CHECK();
  GPU_CLIENT_LOG("[" << GetLogPrefix() << "] glCreateAndConsumeTextureCHROMIUM("
                     << static_cast<const void*>(data) << ")");
  const Mailbox& mailbox = *reinterpret_cast<const Mailbox*>(data);
  DCHECK(mailbox.Verify()) << "CreateAndConsumeTextureCHROMIUM was passed a "
                              "mailbox that was not generated by "
                              "GenMailboxCHROMIUM.";
  GLuint client_id;
  GetIdHandler(SharedIdNamespaces::kTextures)->MakeIds(this, 0, 1, &client_id);
  helper_->CreateAndConsumeTextureINTERNALImmediate(target,
      client_id, data);
  if (share_group_->bind_generates_resource())
    helper_->CommandBufferHelper::Flush();
  CheckGLError();
  return client_id;
}

So it sounds like every time we request a copy we start by creating a GL texture, then we crop/scale/read back into CPU memory, and finally we destroy the texture.

We continue following the CreateAndConsumeTextureINTERNALImmediate method (found in gles2_cmd_helper_autogen.h):

void CreateAndConsumeTextureINTERNALImmediate(GLenum target,
                                              GLuint texture,
                                              const GLbyte* mailbox) {
  const uint32_t size =
      gles2::cmds::CreateAndConsumeTextureINTERNALImmediate::ComputeSize();
  gles2::cmds::CreateAndConsumeTextureINTERNALImmediate* c =
      GetImmediateCmdSpaceTotalSize<
          gles2::cmds::CreateAndConsumeTextureINTERNALImmediate>(size);
  if (c) {
    c->Init(target, texture, mailbox);
  }
}

And from there, we call this Init method (found in gles2_cmd_format_autogen.h):

  void Init(GLenum _target, GLuint _texture, const GLbyte* _mailbox) {
    SetHeader();
    target = _target;
    texture = _texture;
    memcpy(ImmediateDataAddress(this), _mailbox, ComputeDataSize());
  }

So: we create a GL texture ID (with the call to MakeIds(…)), then we send a command that carries this ID together with a copy of the mailbox data. So, I think when CreateAndConsumeTextureCHROMIUM() returns we should have a valid GL texture ID, which refers to the imagery data we want to retrieve ?
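To make the "immediate" command layout more concrete, here is a simplified, self-contained sketch of the idea: a fixed-size command struct is written into a byte buffer and the payload (here, the mailbox bytes) is copied right after it, which is what the memcpy to ImmediateDataAddress(this) does in Init(). The struct, the constants and the function names below are stand-ins invented for the example, not the Chromium types, and the mailbox size is arbitrary:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Arbitrary payload size for this sketch (not the real mailbox size).
constexpr std::size_t kSketchMailboxSize = 16;
// Made-up command identifier.
constexpr uint32_t kSketchConsumeTextureId = 1;

// Fixed-size part of the hypothetical command.
struct SketchConsumeTextureCmd {
  uint32_t command_id;
  uint32_t target;
  uint32_t texture;  // client-side texture ID
};

// Client side: append the fixed struct, then the payload immediately after.
std::size_t EncodeConsumeTexture(std::vector<uint8_t>& buffer, uint32_t target,
                                 uint32_t texture, const int8_t* mailbox) {
  const SketchConsumeTextureCmd cmd = {kSketchConsumeTextureId, target, texture};
  const std::size_t offset = buffer.size();
  buffer.resize(offset + sizeof(cmd) + kSketchMailboxSize);
  std::memcpy(buffer.data() + offset, &cmd, sizeof(cmd));
  std::memcpy(buffer.data() + offset + sizeof(cmd), mailbox,
              kSketchMailboxSize);
  return offset;
}

// Service side: read the struct back, then the payload that follows it.
bool DecodeConsumeTexture(const std::vector<uint8_t>& buffer,
                          std::size_t offset, SketchConsumeTextureCmd* cmd,
                          int8_t* mailbox_out) {
  if (offset > buffer.size() ||
      buffer.size() - offset < sizeof(*cmd) + kSketchMailboxSize)
    return false;
  std::memcpy(cmd, buffer.data() + offset, sizeof(*cmd));
  if (cmd->command_id != kSketchConsumeTextureId)
    return false;
  std::memcpy(mailbox_out, buffer.data() + offset + sizeof(*cmd),
              kSketchMailboxSize);
  return true;
}
```

Note that only the mailbox identifier travels in the command here, not any pixel data: the receiving side is expected to use it to look up the texture it designates.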

Then the next question is: How can we retrieve this data into our own texture object on the GPU from a different GL or DirectX context ?

Inter-context texture copy

From GL to GL

Reading a few related pages:

⇒ Those links do not seem very optimistic, but then I found the GL_NV_copy_image extension page, which states that:

This extension enables efficient image data transfer between image
objects (i.e. textures and renderbuffers) without the need to bind
the objects or otherwise configure the rendering pipeline.  The
WGL and GLX versions allow copying between images in different
contexts, even if those contexts are in different sharelists or
even on different physical devices.

This sounds more interesting already. But then we have the context specification issue in wglCopyImageSubDataNV and glXCopyImageSubDataNV: we need to be able to provide the source/dest contexts as HGLRC or GLXContext objects.

So let's assume for a moment that this function is available (i.e. we are using a recent NVIDIA GPU), and that we are on Windows (i.e. we are considering the HGLRC contexts): how could we retrieve the source context ?

From GL to DirectX

Found this article which sounds promising: https://github.com/Microsoft/angle/wiki/Interop-with-other-DirectX-code

⇒ Are we using “ANGLE” in CEF ? According to the previous forum post YES. So the rationale here should be applicable: we should be able to:

  1. Create a shared DirectX texture (DX11 is mentioned, but maybe it could also work with DirectX9Ex ?)
  2. Then we pass the shared handle to the CEF engine and use it to create a new EGL surface.
  3. We bind the surface to a GL texture,
  4. And we copy from our source mailbox onto that texture ?

This might all just sound like a plan ;-)

Investigating old Hardware offscreen render support implementation

So what about the first accelerated paint implementation mentioned in this forum post ?

⇒ Not understanding much from this patch. But it seems to rely on the OpenGL API. I don't think we can really apply the logic available here in the new Chromium source code base. So let's instead focus on the DirectX/Angle interactions mentioned above.

Figuring out how the CEF commands work

Now trying to figure out where we create a GLSurfaceEGL class instance… Searching in all .cc files with:

$ grep -rnw --include \*.cc . -e 'GLSurfaceEGL'
./components/exo/wayland/clients/client_base.cc:279:    if (gl::GLSurfaceEGL::HasEGLExtension("EGL_EXT_image_flush_external") ||
./components/exo/wayland/clients/client_base.cc:280:        gl::GLSurfaceEGL::HasEGLExtension("EGL_ARM_implicit_external_sync")) {
./components/exo/wayland/clients/client_base.cc:283:    if (gl::GLSurfaceEGL::HasEGLExtension("EGL_ANDROID_native_fence_sync")) {
./gpu/command_buffer/service/texture_definition.cc:150:  EGLDisplay egl_display = gl::GLSurfaceEGL::GetHardwareDisplay();
./gpu/ipc/service/direct_composition_child_surface_win.cc:45:    : gl::GLSurfaceEGL(),
./gpu/ipc/service/direct_composition_surface_win.cc:97:  if (!gl::GLSurfaceEGL::IsDirectCompositionSupported())
./gpu/ipc/service/direct_composition_surface_win.cc:1005:    : gl::GLSurfaceEGL(),
./gpu/ipc/service/gpu_init.cc:110:      gl::GLSurfaceEGL::IsDirectCompositionSupported() &&
./gpu/ipc/service/image_transport_surface_win.cc:54:    if (gl::GLSurfaceEGL::IsDirectCompositionSupported()) {
# (...) => more lines here.

Server side command handling

On the client side, it now seems relatively clear how the various GL commands are “posted”, but now we need to figure out how those commands are handled on the server side.

⇒ Focusing on the initial question: trying to find where we use the command ID kGenTexturesImmediate ?

So it seems the command ID is not how we retrieve the GenTexturesImmediate command on the server/host side ?

⇒ From there, the GenTexturesHelper(…) method will call glGenTextures():

bool GLES2DecoderImpl::GenTexturesHelper(GLsizei n, const GLuint* client_ids) {
  for (GLsizei ii = 0; ii < n; ++ii) {
    if (GetTexture(client_ids[ii])) {
      return false;
    }
  }
  std::unique_ptr<GLuint[]> service_ids(new GLuint[n]);
  glGenTextures(n, service_ids.get());
  for (GLsizei ii = 0; ii < n; ++ii) {
    CreateTexture(client_ids[ii], service_ids[ii]);
  }
  return true;
}

So this is where we call the gl functions: in the GLES2Decoder. Yet, how do we reach the HandleGenTexturesImmediate method ?

The GLES2Decoder class cannot be instantiated because it is abstract. So we need to figure out what all the possible derived classes are and where they are instantiated:

This all still doesn't tell us where HandleGenTexturesImmediate is effectively called… The method is not available to any public interface so it should be used only “internally” ?

The GLES2Decoder instance is created with a CommandBufferService object.

⇒ Found information on how to add a command from this page

Found the CommandInfo struct in gles2_cmd_decoder.cc:

  // A struct to hold info about each command.
  struct CommandInfo {
    CmdHandler cmd_handler;
    uint8_t arg_flags;   // How to handle the arguments for this command
    uint8_t cmd_flags;   // How to handle this command
    uint16_t arg_count;  // How many arguments are expected for this command.
  };

  // A table of CommandInfo for all the commands.
  static const CommandInfo command_info[kNumCommands - kFirstGLES2Command];

OK, so now we need to figure out where this array is filled, and where we call the cmd_handler function!
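For reference, the general pattern behind such a table is a classic table-driven dispatch: the command ID (minus the first ID) indexes the array, and the stored member-function pointer is invoked on the decoder. The sketch below shows that pattern in isolation. It is not the Chromium code: the class, the two commands, the argument handling and the error codes are invented for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

class SketchDecoder {
 public:
  enum CommandId : uint32_t {
    kFirstCommand = 256,
    kGenTextures = kFirstCommand,
    kDeleteTextures,
    kNumCommands
  };
  enum class Error { kNoError, kUnknownCommand, kInvalidArguments };

  // Looks up the command in the table and calls its handler.
  Error DoCommand(uint32_t command, uint32_t arg_count, const uint32_t* args) {
    if (command < kFirstCommand || command >= kNumCommands)
      return Error::kUnknownCommand;
    const CommandInfo& info = command_info[command - kFirstCommand];
    if (arg_count != info.arg_count)
      return Error::kInvalidArguments;
    return (this->*info.cmd_handler)(args);
  }

  std::vector<uint32_t> live_textures;  // stands in for real texture state

 private:
  using CmdHandler = Error (SketchDecoder::*)(const uint32_t* args);

  struct CommandInfo {
    CmdHandler cmd_handler;
    uint16_t arg_count;  // how many arguments the command expects
  };

  Error HandleGenTextures(const uint32_t* args) {
    if (std::find(live_textures.begin(), live_textures.end(), args[0]) !=
        live_textures.end())
      return Error::kInvalidArguments;  // ID already in use
    live_textures.push_back(args[0]);
    return Error::kNoError;
  }

  Error HandleDeleteTextures(const uint32_t* args) {
    live_textures.erase(
        std::remove(live_textures.begin(), live_textures.end(), args[0]),
        live_textures.end());
    return Error::kNoError;
  }

  // One entry per command, in command ID order.
  static const CommandInfo command_info[kNumCommands - kFirstCommand];
};

const SketchDecoder::CommandInfo SketchDecoder::command_info[] = {
    {&SketchDecoder::HandleGenTextures, 1},
    {&SketchDecoder::HandleDeleteTextures, 1},
};
```

If the real decoder follows the same pattern, the handlers would never be called by name anywhere, which would be consistent with not finding any direct call to HandleGenTexturesImmediate.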

Sharing DirectX texture with ANGLE

As mentioned above, it seems to be possible to have ANGLE write to a shared DirectX texture. So in our case we would need to follow those steps:

1. Create a shared DirectX texture

We should already have the required support functions to do this ?

Then we can provide our own implementation of the SharedRenderSurfaceManager class on CEF start, just like with the LogHandler class above.

Where do we use the getShareHandle() then ?

Then, we need to create an EGL surface based on this handle:

EGLSurface surface = EGL_NO_SURFACE;

EGLint pBufferAttributes[] =
{
    EGL_WIDTH, width,
    EGL_HEIGHT, height,
    EGL_TEXTURE_TARGET, EGL_TEXTURE_2D,
    EGL_TEXTURE_FORMAT, EGL_TEXTURE_RGBA,
    EGL_NONE
};

surface = eglCreatePbufferFromClientBuffer(mEglDisplay, EGL_D3D_TEXTURE_2D_SHARE_HANDLE_ANGLE, sharedHandle, mEglConfig, pBufferAttributes);
if (surface == EGL_NO_SURFACE)
{
    // error handling code
}

⇒ Any chance we could execute the previous code in the CEF context ?

2. Request rendering on shared DX texture

Once we have a valid share handle for our DirectX texture, then each time we have to refresh the offscreen rendered image, we could send a command to the GPUProcess, to request rendering from a given GL texture on our shared DX texture. Something like: copyOnDXSharedSurface(uint glTextureID, void* shareHandle).

3. Resources allocation/deallocation

Before we can copy to our shared DirectX surface, we need to allocate the required resources for it. This can be done directly in the GPU process itself: when we handle the copy, if the required resources for a given share handle are not found, we create them.

But then comes the question of resource deallocation: when we are done rendering with a given browser, we should release the resources allocated in the GPU process, for instance with a command such as releaseDXSharedSurfaceResources(void* shareHandle).
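The allocate-on-first-copy / release-on-demand bookkeeping described in the last two steps can be sketched as a small cache keyed by the share handle. This is only a model of the lifetime logic: SharedSurfaceResources is a placeholder for whatever EGL/GL objects would really be needed, SharedSurfaceCache is a hypothetical helper, and the two free functions reuse the command names proposed above with an extra cache parameter added for the example:

```cpp
#include <cstddef>
#include <map>

// Placeholder for the per-surface objects the GPU process would keep
// (EGL surface, GL texture, ...); here it only records some counters.
struct SharedSurfaceResources {
  unsigned copy_count = 0;
  unsigned last_source_texture = 0;
};

// Hypothetical helper owning the resources, keyed by share handle.
class SharedSurfaceCache {
 public:
  // Returns the resources for this handle, creating them on first use.
  SharedSurfaceResources& Acquire(void* share_handle) {
    auto it = resources_.find(share_handle);
    if (it == resources_.end())
      it = resources_.emplace(share_handle, SharedSurfaceResources()).first;
    return it->second;
  }

  // Destroys the resources for this handle; false if none were allocated.
  bool Release(void* share_handle) {
    return resources_.erase(share_handle) > 0;
  }

  std::size_t size() const { return resources_.size(); }

 private:
  std::map<void*, SharedSurfaceResources> resources_;
};

// Copy request: allocation happens lazily inside Acquire().
void copyOnDXSharedSurface(SharedSurfaceCache& cache, unsigned glTextureID,
                           void* shareHandle) {
  SharedSurfaceResources& res = cache.Acquire(shareHandle);
  res.last_source_texture = glTextureID;
  ++res.copy_count;  // the actual GPU copy would go here
}

// Explicit release, to be sent when a browser is done rendering.
bool releaseDXSharedSurfaceResources(SharedSurfaceCache& cache,
                                     void* shareHandle) {
  return cache.Release(shareHandle);
}
```

One consequence of this design is that a copy issued after a release silently re-allocates the resources, so the release command has to be the last one sent for a given handle.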

4. Logging in GPU Process

During development, we will have to get some outputs from the GPU Process itself to ensure everything is happening as expected. How can we get those outputs ?

First implementation stage

Creating the nvLogger helper class, and trying to use it to log messages to file when creating the GLES command decoder:

Running the target project with this new version of CEF we have a cef_nv.log file generated! But the content is not what we expected:

nvLogger initialized.Destroying nvLogger.

What would be interesting to add at this point is the process ID, thread ID and log time for each message in that file.

⇒ OK, updated the code, and now we get those results:

2017-09-26 13:53:13.473 UTC [20408]{16984}: nvLogger initialized.
2017-09-26 13:53:13.473 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:13.479 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:13.482 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:13.484 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:32.438 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:34.164 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:34.170 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:36.405 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:36.443 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:36.448 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.
2017-09-26 13:53:37.010 UTC [20408]{16984}: Creating GLES2DecoderImpl instance.

So it seems all GLES2Decoder instances are created from the same process ID (expected) and from the same thread, so this is all good for us.
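For reference, a line prefix in the same shape as the log lines above can be produced with standard C++ only. The sketch below is not the nvLogger code: FormatLogLine is a hypothetical helper, and the process and thread IDs are passed in as plain numbers because retrieving them is platform-specific (GetCurrentProcessId()/GetCurrentThreadId() on Windows, for instance). Each record ends with a newline so that consecutive records do not run together:

```cpp
#include <chrono>
#include <cstdio>
#include <ctime>
#include <string>

// Builds "YYYY-MM-DD HH:MM:SS.mmm UTC [pid]{tid}: message\n".
std::string FormatLogLine(std::chrono::system_clock::time_point when,
                          unsigned long pid, unsigned long tid,
                          const std::string& message) {
  using namespace std::chrono;
  const std::time_t secs = system_clock::to_time_t(when);
  const long long ms =
      duration_cast<milliseconds>(when.time_since_epoch()).count() % 1000;

  // Thread-safe conversion to a broken-down UTC time.
  std::tm utc = {};
#if defined(_WIN32)
  gmtime_s(&utc, &secs);
#else
  gmtime_r(&secs, &utc);
#endif

  char stamp[32];
  std::strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", &utc);

  char prefix[96];
  std::snprintf(prefix, sizeof(prefix), "%s.%03d UTC [%lu]{%lu}: ", stamp,
                static_cast<int>(ms), pid, tid);
  return std::string(prefix) + message + "\n";
}
```

A logger would typically call this with std::chrono::system_clock::now() and write the returned string to the file in a single call, which keeps each record intact even if several threads log at once.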

Where to retrieve the rendered frame

error::Error GLES2DecoderPassthroughImpl::HandleNervCopyTextureToSharedHandle(
  uint32_t immediate_data_size,
  const volatile void* cmd_data) {
  const volatile gles2::cmds::NervCopyTextureToSharedHandle& c =
      *static_cast<const volatile gles2::cmds::NervCopyTextureToSharedHandle*>(
          cmd_data);
  GLuint texture_id = c.texture_id;
  void* handle = (void*)(c.shared_handle());
  NV_LOG("GLES2DecoderPassthroughImpl: received shared handle "<<(const void*)handle<<" for source texture_id: "<<texture_id);

  return error::kNoError;
}

void GLES2Implementation::NervCopyTextureToSharedHandle(
    GLuint texture_id, GLuint64 shared_handle) {
  helper_->NervCopyTextureToSharedHandle(texture_id, shared_handle);
}

Getting the copy operation to work

Origin of the Compositor texture

Currently it seems we are generating a new texture each time the compositor has a new output to provide: we need to investigate what is actually the source of this generation and thus see if we can use the root texture directly (if any ?)

Texture memory issues

TODO

References