Some aspects of WebGL optimisation


Today the browser does everything that made us install a bunch of software 10 years ago. It’s really cool, but don’t forget that its possibilities and performance are limited.
Even on your i7 or on your cool quad-core tablet.
Understanding this makes a software engineer responsible for writing code that also runs on machines cheaper than 1.5k EUR.
One of the coolest contemporary web things is WebGL. Let’s talk about optimization in this area.


I’ll briefly describe the case we faced and the conclusions we came to. The constraints we ran into gave us some experience in WebGL optimization.
Imagine a web application:

  • It runs on a Raspberry-Pi-like machine with Linux on board.
  • It shows a grid (a table with ~50 cells of variable sizes) and a details panel.
  • Each cell contains a chunk of text and some status icons.
  • The details panel contains several paragraphs of text and an image (header cells also contain images, such as vendor logos).

Q: Does it look that complicated?
A: Not for your i7 or your cool quad-core tablet. But it is really heavy for that device (taking into account all the background services running).
Especially with animation (a must-have option).
Especially with asynchronous data loading (the grid can be paginated in all four directions, far away from the starting point).


We tried PIXI.js and it was really nice. With less data (20-30 rectangles, images and colors) PIXI shows amazing animation, effects and all the advantages of GL.
However, with our amount of data (mostly thanks to the text) on our device it showed significant delays and low FPS (I’ll describe the technical constraints below).

Hardware WebGL constraints that we met

  • Although WebGL defines constants for 32 texture units (gl.activeTexture(gl.TEXTURE<0..31>);), the number actually available is implementation-dependent, and our device throws exceptions when accessing textures #16+;
  • the maximal texture dimension is 2048 px. Is this enough? Emmm… We wanted to use a framebuffer for horizontal scrolling of a 1050 px wide area (that requires 2 x 1050 px = 2100 px), but the framebuffer’s constraints depend on the texture constraints, so we missed by those damn 52 px 😦
  • gl.drawElements takes 10 times longer than gl.drawArrays with comparable parameters. We don’t know why; it looks like a driver-specific issue.
  • Creating a texture from a regular image takes 50-100 ms; creating a texture from a canvas2D of similar resolution takes up to 2,000 ms! Power-of-two vs. NPOT dimensions did not matter in our case.
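
Limits like these can be read at startup instead of being discovered through exceptions. Here is a minimal sketch, assuming gl is a WebGL 1 context; readGlLimits and fitsInTexture are hypothetical helper names made up for this example, while the gl.getParameter constants are standard WebGL.

```javascript
// Reads the limits that bit us, straight from the driver.
// `readGlLimits` and `fitsInTexture` are hypothetical helper names.
function readGlLimits(gl) {
  return {
    maxTextureSize: gl.getParameter(gl.MAX_TEXTURE_SIZE),
    maxRenderbufferSize: gl.getParameter(gl.MAX_RENDERBUFFER_SIZE),
    maxFragmentTextureUnits: gl.getParameter(gl.MAX_TEXTURE_IMAGE_UNITS),
    maxCombinedTextureUnits: gl.getParameter(gl.MAX_COMBINED_TEXTURE_IMAGE_UNITS)
  };
}

// True if a texture (or texture-backed framebuffer) of this size is allowed.
function fitsInTexture(width, height, limits) {
  return width <= limits.maxTextureSize && height <= limits.maxTextureSize;
}
```

In the browser this would be used as var limits = readGlLimits(canvas.getContext("webgl")); with a limit of 2048, fitsInTexture(2 * 1050, 600, limits) is false (the height of 600 is just an example value).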


Before using comparative adjectives, let me describe the metrics we used.
America needs you
The main KPI is you. It’s about whether you are satisfied with the smoothness of the animation. It’s about the delay between pressing the arrow key and the page starting the animation. It’s about how smooth the frames feel: like Skrillex or like Beethoven.
Of course, we also used less perfect (joking 🙂 ) metrics such as FPS, CPU usage and memory consumption.

WebGL optimisation

Now let’s talk about how we solved the performance issues.

Merge ’em all!

Drawing (here and below I mean “passing to the GPU through GL calls”) one big vertex buffer is more efficient than drawing several chunks:

  • say we have 50 rectangles filled with one color/texture (50 x 6 vertices x 2 coords);
  • gl.drawArrays(gl.TRIANGLES, 0, 50 * 6); takes 1.2-1.5ms/frame,
    in comparison to
  • for (var i = 0; i < 50; i++) { gl.drawArrays(gl.TRIANGLES, i * 6, 6); }
    which takes up to 3ms/frame (with a corresponding visible FPS decrease and animation lags).
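
A rough sketch of the merging itself, assuming every rectangle is described as {x, y, width, height} and that the shader program and its position attribute are already set up. mergeRectangles and drawMerged are hypothetical names for this example, not our production code.

```javascript
// Turns [{x, y, width, height}, ...] into one flat Float32Array:
// 6 vertices (2 triangles) x 2 coords per rectangle.
function mergeRectangles(rects) {
  var vertices = new Float32Array(rects.length * 12);
  rects.forEach(function (r, i) {
    var x0 = r.x, y0 = r.y, x1 = r.x + r.width, y1 = r.y + r.height;
    vertices.set([
      x0, y0, x1, y0, x0, y1, // first triangle
      x0, y1, x1, y0, x1, y1  // second triangle
    ], i * 12);
  });
  return vertices;
}

// Uploads the merged data and draws all rectangles with a single call.
function drawMerged(gl, buffer, rects) {
  var vertices = mergeRectangles(rects);
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.DYNAMIC_DRAW);
  gl.drawArrays(gl.TRIANGLES, 0, vertices.length / 2);
}
```

If the rectangles do not change between frames, the gl.bufferData upload can be done once and only the gl.drawArrays call repeated.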

Use scissor

Like this:
gl.enable(gl.SCISSOR_TEST); gl.scissor(x, y, width, height);
It is more efficient than drawing the out-of-bounds content and hiding it with overlays. Captain Obvious approves: the fewer pixels you draw, the faster it goes 🙂
Stencil should also improve performance, but we haven’t used it due to restrictions of our hardware/drivers.
Using scissor increased the FPS in our case from 26-28 to 32-33 (10-15% of pixels were cut off).
Potential pitfalls here:

  • you have to flip the Y-coordinate for the default render target (gl.scissor counts from the bottom-left corner), but you don’t have to do this when using scissor with framebuffers;
  • make sure you disable scissor (with gl.disable(gl.SCISSOR_TEST); or by setting the maximal possible coordinates) after using it. If you don’t, be ready to spend your weekend debugging the animation.
  • One good practice is:
figures.forEach(function (fig) {
  gl.bindBuffer(...); // along with other preparations
  gl.scissor(fig.clip.x, fig.clip.y, fig.clip.width, fig.clip.height);
  gl.drawArrays(...); // draw the figure while it is clipped
  gl.scissor(0, 0, canvas.width, canvas.height); // reset the clip area
});
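
Both pitfalls can be hidden behind a small helper. This is only a sketch: it assumes the clip rectangle comes in top-left based pixel coordinates (like CSS or canvas2D), and toScissorRect and withScissor are hypothetical names.

```javascript
// Converts a top-left based clip rectangle to gl.scissor arguments,
// which are counted from the bottom-left corner of the render target.
function toScissorRect(clip, targetHeight) {
  return {
    x: clip.x,
    y: targetHeight - (clip.y + clip.height),
    width: clip.width,
    height: clip.height
  };
}

// Runs `draw` with the scissor applied and always switches it off again,
// even if `draw` throws.
function withScissor(gl, clip, targetHeight, draw) {
  var r = toScissorRect(clip, targetHeight);
  gl.enable(gl.SCISSOR_TEST);
  gl.scissor(r.x, r.y, r.width, r.height);
  try {
    draw();
  } finally {
    gl.disable(gl.SCISSOR_TEST);
  }
}
```

Usage could look like withScissor(gl, fig.clip, canvas.height, function () { /* draw calls for fig */ });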


Text is a real pain in the neck. With the HTML approach you wrap the text in a tag or do div.innerHTML = "Problems, GL-officer?"; and it appears! Brutal lumberjack GL guys create a pair of triangles for each letter and place the char texture over ’em, respecting the character coordinates (char atlas). For each letter, Karl!

  • Bear in mind that drawing even this small paragraph of text is really expensive for WebGL.
  • Scaling the text is never an option.
    A font is a vector object, so rasterizing it at the required size gives better quality than scaling a previously rasterized image. Rasterization algorithms for text are not that primitive, so scaling the picture (even halving each dimension) does not give a perfect font appearance.
  • The same principle applies to font variants (oblique, bold, etc.): many fonts use completely different glyphs for their variants, and it’s impossible to transform, for instance, normal into oblique with a simple tilt operation.
  • If you have a few strings of static text, it might be better to create textures with pre-defined text chunks (words, phrases, etc.).
  • If you have a lot of dynamic text, a char atlas comes to the rescue. Create textures for each font-family/size/variant combination. It’s better to create one big texture containing as many chars and covering as many font-family/size/variant combinations as possible.
    That allows using one texture for many text objects (and invoking gl.drawArrays fewer times).
  • Play with different blending options to make char edges smooth. Why “play”? Because it depends a lot on the destination media (for instance, the default brightness settings of TV sets, projectors and mobile devices vary a lot, and that affects the look of the char edges). In our case, text looks best using
    gl.blendFunc(gl.ONE, gl.ONE_MINUS_SRC_ALPHA);
    however, semi-transparent PNGs require
    gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
    Your settings may vary.
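
To make the “pair of triangles for each letter” part more tangible, here is a simplified sketch of laying out one line of text from a char atlas. The atlas format (glyphs with x, y, width, height and advance in pixels) and the buildTextQuads name are assumptions made for this example; a real atlas would be generated, for instance by rendering glyphs to a canvas2D and measuring them.

```javascript
// Builds positions and texture coordinates for one line of text.
// atlas = { width, height, glyphs: { "A": {x, y, width, height, advance}, ... } }
function buildTextQuads(text, atlas, originX, originY) {
  var positions = []; // x, y per vertex, in pixels
  var texCoords = []; // u, v per vertex, normalized to 0..1
  var penX = originX;
  for (var i = 0; i < text.length; i++) {
    var g = atlas.glyphs[text[i]];
    if (!g) { continue; } // char missing from the atlas: skip it
    var x0 = penX, y0 = originY, x1 = penX + g.width, y1 = originY + g.height;
    var u0 = g.x / atlas.width, v0 = g.y / atlas.height;
    var u1 = (g.x + g.width) / atlas.width, v1 = (g.y + g.height) / atlas.height;
    positions.push(x0, y0, x1, y0, x0, y1, x0, y1, x1, y0, x1, y1);
    texCoords.push(u0, v0, u1, v0, u0, v1, u0, v1, u1, v0, u1, v1);
    penX += g.advance;
  }
  return {
    positions: new Float32Array(positions),
    texCoords: new Float32Array(texCoords),
    vertexCount: positions.length / 2
  };
}
```

All strings that share the atlas texture can be appended to the same two arrays and drawn with one gl.drawArrays call.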

Pass the transition logic to the vertex shader

If you need simple animations, this really helps. Say you have an array of 12k vertex coords. Think that’s a lot? That’s only 1000 chars: two regular paragraphs on Wikipedia!
Now you need to move it: you have to update all 12k values.
For every frame of your animation.
In JS.
In the browser.

Bet the GPU does it faster.
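
A minimal sketch of the idea, assuming a plain 2D translation in pixel coordinates. The attribute and uniform names (a_position, u_offset, u_resolution) and the helper functions are made up for this example; the point is that the vertex buffer stays untouched and only one uniform changes per frame.

```javascript
// The vertex shader adds the offset; the vertex buffer is never rewritten.
var VERTEX_SHADER_SOURCE = [
  "attribute vec2 a_position;",
  "uniform vec2 u_offset;     // current translation in pixels",
  "uniform vec2 u_resolution; // canvas size in pixels",
  "void main() {",
  "  vec2 p = (a_position + u_offset) / u_resolution * 2.0 - 1.0;",
  "  gl_Position = vec4(p.x, -p.y, 0.0, 1.0);",
  "}"
].join("\n");

// Linear interpolation between `from` and `to`, clamped to that range.
function transitionOffset(from, to, startMs, durationMs, nowMs) {
  var t = Math.min(Math.max((nowMs - startMs) / durationMs, 0), 1);
  return from + (to - from) * t;
}

// One uniform upload per frame instead of rewriting 12k floats in JS.
function drawFrame(gl, offsetLocation, vertexCount, offsetX, offsetY) {
  gl.uniform2f(offsetLocation, offsetX, offsetY);
  gl.drawArrays(gl.TRIANGLES, 0, vertexCount);
}
```

In a requestAnimationFrame callback this boils down to drawFrame(gl, loc, count, transitionOffset(0, -1050, start, 300, now), 0); where 300 ms and -1050 px are example values.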

Keep your shaders simple

The void main () {} of the vertex shader runs for each vertex; the same function of the fragment shader runs for each pixel! Complex/nested conditions are evil here. Some simple conditions can be emulated with arithmetic operations:

uniform int u_blendAdd; // zero or one
if (u_blendAdd == 1) {
  gl_FragColor = v_color_0 + v_color_1;
} else {
  gl_FragColor = v_color_1;
}

— vs —

uniform float u_blendAdd; // 0.0 or 1.0 (GLSL ES has no implicit int-to-float conversion)
gl_FragColor = v_color_0 * u_blendAdd + v_color_1;

Bear in mind, not all conditions can be emulated via simple operations 🙂

Framebuffers come to rescue

Say you have to scroll a scene containing 1000 rectangles: it means you have to redraw all of them each frame (if you want 60 FPS, drawing one frame should take less than 17ms).
The trick is:

  • draw the scene to a framebuffer before the animation starts;
  • for each frame, draw the area using that framebuffer as the texture.
    That way you pass 6 vertices (for the rectangular area) each frame instead of 6000 vertices.
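
Setting up such a framebuffer could look roughly like this. It is a sketch for WebGL 1 with a color attachment only (no depth buffer); createRenderTarget is a hypothetical helper name, and the size still has to fit the texture limits mentioned earlier.

```javascript
// Creates a texture-backed framebuffer of the given size.
function createRenderTarget(gl, width, height) {
  var texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  // Allocate storage without uploading any pixels.
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.UNSIGNED_BYTE, null);
  // These settings keep NPOT textures usable in WebGL 1.
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

  var framebuffer = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0);
  var complete = gl.checkFramebufferStatus(gl.FRAMEBUFFER) === gl.FRAMEBUFFER_COMPLETE;
  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // back to the canvas
  if (!complete) {
    gl.deleteFramebuffer(framebuffer);
    gl.deleteTexture(texture);
    throw new Error("Framebuffer is incomplete");
  }
  return { framebuffer: framebuffer, texture: texture, width: width, height: height };
}
```

Before the animation you bind target.framebuffer, set the viewport to its size and draw the whole scene once; during the animation you bind target.texture and draw a single textured rectangle per frame.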

These points helped us with making our WebGL code more performant. Hope they help you too.

Fly in the ointment

Life teaches us that there is no silver bullet. So WebGL is not always an option.

  cool: "super",
  singlePage: true,
  autoCreate: "all-the-pages-I-ever-need"

— vs —

  1. WebGL is very low-level, especially for guys used to ng- and $-programming [:trollface:]. You need at least two languages in three files to do something visible with raw WebGL. Frameworks usually come to the rescue.
  2. A WebGL implementation takes more memory than an HTML or canvas2D implementation of a similar prototype. Remember the 12k coords case? That’s it!
    Actually, an HTML implementation might also take a lot of memory: the DOM is expensive enough.

    • A non-obvious consequence: you must keep track of every big object you create so you can clean it up later. So don’t forget any Image or ArrayBuffer, and nullify ’em when you are done.
  3. One of WebGL’s peculiarities is that it’s not so easy to release the GL context after you have finished working with it. Memory leaks are possible here, so debug the “enter-exit” moments carefully.
  4. It’s not always possible to apply the kind of optimization I described above (for instance, merging similar figures together to reduce gl.drawArrays calls).
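
For points 2 and 3, one possible way to avoid forgetting GL objects is to create them through a small registry and free everything in one place. This is just a sketch: createResourceTracker is a hypothetical helper, and WEBGL_lose_context is a standard extension that may or may not be available on a given device.

```javascript
// Remembers GL objects created through it so they can be freed together.
function createResourceTracker(gl) {
  var textures = [], buffers = [], framebuffers = [];
  return {
    texture: function () { var t = gl.createTexture(); textures.push(t); return t; },
    buffer: function () { var b = gl.createBuffer(); buffers.push(b); return b; },
    framebuffer: function () { var f = gl.createFramebuffer(); framebuffers.push(f); return f; },
    // Deletes everything created through the tracker and, where the
    // WEBGL_lose_context extension is available, drops the context too.
    dispose: function () {
      textures.forEach(function (t) { gl.deleteTexture(t); });
      buffers.forEach(function (b) { gl.deleteBuffer(b); });
      framebuffers.forEach(function (f) { gl.deleteFramebuffer(f); });
      textures.length = buffers.length = framebuffers.length = 0;
      var ext = gl.getExtension("WEBGL_lose_context");
      if (ext) { ext.loseContext(); }
    }
  };
}
```

Calling dispose() on the “exit” moment and dropping your own references to Image and ArrayBuffer objects afterwards gives the garbage collector a chance to do the rest.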

Anyway, WebGL is awesome so have fun exploring this area!


One thought on “Some aspects of WebGL optimisation”

  1. As per my colleague Dave:

    I’m not sure about your conclusions at the end though. I doubt Webgl uses more memory than a browser, browser has a lot more overhead it needs to do.
    Also, don’t just clean up big objects, clean up all your objects, because sometimes death by a thousand cuts memory leaks are harder to track down than whopping big ones! 🙂
