Posts Tagged WebGL

WebGL Basics 5 – Full transformation matrix

In this post we look at the transformation matrix with all steps.


The vertices of the 3D scene are stored in static arrays, and then in "buffers", by the Javascript code. In order to render the scene as viewed by an observer located at an arbitrary position, the vertex coordinates must be transformed such that the visible part of the scene falls within the viewable volume (the view "frustum"). The necessary transformations are summarized hereafter:

  • The 3 translations correspond to the 3 degrees of freedom of the observer in pure translation
  • The scaling represents a zoom capability of the observer
  • The 3 rotations correspond to the 3 degrees of freedom in rotation
  • The projection is necessary because the visualization device we use is a 2D monitor
  • The aspect ratio must be corrected due to the default scaling used to fill the WebGL context

The next sections describe these transformations step-by-step starting with a reminder about homogeneous coordinates.

Homogeneous coordinates

A coordinate transformation in 3D space is a function that takes a 3-component vector as input and returns a transformed 3-component vector. In general, such a function could be anything (square, exponentiation, etc.) and would transform the space in an arbitrary way. If the transforming function is chosen to be a single matrix multiplication, several common transformations can be realized: rotation, scaling and symmetry in 3D space can all be implemented as 3×3 matrix multiplications. Furthermore, because matrix multiplication is a linear operation, several successive transformations can be combined into one, the necessary matrix being the product of the matrices of the successive operations. Consequently, because only one matrix operation is necessary per vertex, the method is well suited to massive computations in the GPU.

The only problem with 3×3 matrix multiplications is that some operations necessary to render a 3D scene, namely translations and projections, cannot be realized this way.

Homogeneous coordinates make it possible to concentrate all transformations into one matrix multiplication by adding an extra dimension (not time!), such that:

  • All operations are done on 4-component vectors, using 4×4 matrix multiplications
  • The "homogeneous coordinates" of a point are hence the 4 components of a vector (x,y,z,w), where x, y and z are the coordinates of the point in space, and w (for "weight") is an appended scale factor initially set to 1, used only for internal computation
  • The 3D space coordinates are obtained back by dividing the first 3 components by the 4th one: X=x/w, Y=y/w, Z=z/w
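To make the mechanics concrete, here is a minimal JavaScript sketch (illustrative helper names, not part of the posts' code) of a 4×4 matrix applied to homogeneous coordinates, followed by the division by w:

```javascript
// Multiplies a 4x4 matrix (given as an array of 4 rows) by a 4-component vector
function matVec4(m, v) {
  var r = [0, 0, 0, 0];
  for (var i = 0; i < 4; i++)
    for (var j = 0; j < 4; j++)
      r[i] += m[i][j] * v[j];
  return r;
}

// Converts homogeneous coordinates (x,y,z,w) back to 3D space coordinates
function toCartesian(v) {
  return [v[0] / v[3], v[1] / v[3], v[2] / v[3]];
}

// Identity transformation: the point comes back unchanged
var identity = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]];
var p = toCartesian(matVec4(identity, [2, 3, 4, 1]));   // [2, 3, 4]
```

A scale factor different from 1, e.g. (2,4,6,2), yields the 3D point (1,2,3) after the final division.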

Several areas can be identified in the 4×4 matrix (in wxMaxima: mat:matrix( [Mxx,Mxy,Mxz,Cx], [Myx,Myy,Myz,Cy], [Mzx,Mzy,Mzz,Cz], [Lx,Ly,Lz,W]);):

The coefficients in the matrix can be interpreted in independent groups (provided the other coefficients are set to 0, except the diagonal set to 1):

  • Mij: the 3×3 matrix used for the common linear transformations
  • Ci: after multiplication by a point vector (x,y,z,w), the result is (x + Cx w, y + Cy w, z + Cz w, w), hence these coefficients are used to add constant values i.e. for translations
  • Lj: after multiplication by the point vector, the scale factor becomes a linear combination of the point coordinates (Lx x + Ly y + Lz z + W w), i.e. it allows the coordinates to be divided by a combination of themselves (when the 3D coordinates are transformed back), which is useful for projection.

The next sections illustrate how the space transformations are implemented in 4×4 matrices.

In addition, homogeneous coordinates support operations on points that are infinitely far away. Indeed, if the scale factor is near 0, the final 4D-to-3D conversion (division by the scale factor) results in values nearing infinity. But before this last step, all operations are done on finite values without risk of numerical errors.

Observer position and orientation

We define the observer with the following parameters:

  • (ox,oy,oz): the point looked at
  • (rx,ry,rz): orientation of the observer around the axes
  • d: the distance between the observer and the point looked at


A translation is necessary to follow the movements of the observer. The general form of a translation matrix in homogeneous coordinates is as follows:

Multiplied by the point coordinates given by the column-vector (x,y,z,w), we obtain:

As usual with homogeneous coordinates, the transformation is completed only after the final operation of dividing the first three components by w (which should equal 1 at this stage):

Here we define the translation as a change of the origin (0,0,0) of the model. The new origin is denoted (ox,oy,oz) and corresponds to the point the observer is looking at. The translation matrix is as follows (in wxMaxima: translation:matrix( [1,0,0,-ox], [0,1,0,-oy], [0,0,1,-oz], [0,0,0,1]);):

Negative values are due to the definition we use: the model is translated from the point (ox,oy,oz) to the point (0,0,0), i.e. in the direction opposite to the vector (ox,oy,oz).
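As a small check (an illustrative JavaScript sketch, with hypothetical helper names), the translation matrix applied to the point looked at indeed yields the new origin:

```javascript
// Builds the translation matrix that moves the origin to (ox,oy,oz)
function translationMatrix(ox, oy, oz) {
  return [[1, 0, 0, -ox],
          [0, 1, 0, -oy],
          [0, 0, 1, -oz],
          [0, 0, 0,   1]];
}

// Applies a 4x4 matrix (array of rows) to homogeneous coordinates (x,y,z,w)
function apply(m, v) {
  return m.map(function(row) {
    return row[0]*v[0] + row[1]*v[1] + row[2]*v[2] + row[3]*v[3];
  });
}

// The point looked at, here (1,2,3), becomes the new origin
var t = apply(translationMatrix(1, 2, 3), [1, 2, 3, 1]);   // [0, 0, 0, 1]
```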

Rotation Basics

Basically, a rotation is a transformation of the plane. In the plane formed by the X and Y axes, the rotation around the Z axis makes a point turn around the origin according to a given angle:

Let’s take the point with polar coordinates (r,α). The Cartesian coordinates are given by:

x = r * cos(α)
y = r * sin(α)

In polar coordinates, a rotation around the origin is simply done by changing the angle component. Hence in our example the new point (x’,y’) after rotation by the angle theta is given by:

x´ = r * cos( α + θ )
y´ = r * sin( α + θ )

By applying the angle addition formulas:

x´ = r * ( cos(α) * cos(θ) - sin(α) * sin(θ) )
y´ = r * ( sin(α) * cos(θ) + cos(α) * sin(θ) )

Now we can re-order the terms to make the Cartesian coordinates of the original point appear:

x´ = x * cos(θ) - y * sin(θ)
y´ = x * sin(θ) + y * cos(θ)

This is the standard form of a 2D rotation, given hereafter in matrix form:

There is another way to obtain this formula, using complex numbers. Indeed, a multiplication by a complex number of the form r·e^(iθ) results in a rotation of angle θ and a scaling by factor r. So we rewrite the original point as x+iy, and multiply it by the complex number of angle θ and modulus 1 (written with trigonometric functions) to get the transformed point:

x' + iy' = ( x + iy ) * ( cos(θ) + i sin(θ) )

After development (remember i² = -1) and re-ordering:

x' + iy' = x cos(θ) - y sin(θ) + i ( x sin(θ) + y cos(θ) )

The real and imaginary values must be identified with x’ and y’, and then we get the same formula as with the first method.
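Both derivations can be verified numerically with a short JavaScript sketch (illustrative helpers, not from the on-line demo):

```javascript
// Rotates (x,y) by theta using the trigonometric form
function rotate2D(x, y, theta) {
  return [x * Math.cos(theta) - y * Math.sin(theta),
          x * Math.sin(theta) + y * Math.cos(theta)];
}

// Same rotation via complex multiplication: (x+iy) * (cos(theta) + i sin(theta))
function rotate2DComplex(x, y, theta) {
  var c = Math.cos(theta), s = Math.sin(theta);
  return [x * c - y * s,     // real part
          x * s + y * c];    // imaginary part
}

// A quarter turn sends (1,0) to approximately (0,1)
var p = rotate2D(1, 0, Math.PI / 2);
```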

Rotations of the observer

Rotations are needed due to the orientation of the observer. In our definition, the observer rotates around the point looked at. Taken separately, each rotation happens around one axis, with angles defined such that positive values mean:

  • rx: the observer moves down and looks towards the top
  • ry: the observer moves right and looks towards the left
  • rz: the observer rolls to the right, staying at the same place

In order to simplify the matrices, we set:

  • sx = sin(rx)
  • cx = cos(rx)
  • sy = sin(ry)
  • cy = cos(ry)
  • sz = sin(rz)
  • cz = cos(rz)

Using the results of the previous paragraphs, the rotations around the axes can be represented in homogeneous coordinates as follows, starting with the rotation around Z (in wxMaxima: rotz:matrix( [cz,-sz,0,0], [sz,cz,0,0], [0,0,1,0], [0,0,0,1]);):

The rotations around X and Y are obtained by permutation, i.e. by moving the 2×2 rotation matrix within the 4×4 identity matrix. The rotation matrix around X (in wxMaxima: rotx:matrix( [1,0,0,0], [0,cx,-sx,0], [0,sx,cx,0], [0,0,0,1]);):

Around Y (in wxMaxima: roty:matrix( [cy,0,sy,0], [0,1,0,0], [-sy,0,cy,0], [0,0,0,1]);):

The different rotations must be combined, which is described in a section below.
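The three matrices can be sketched in JavaScript as follows (row-major arrays of rows, matching the wxMaxima output above; the helper names are illustrative):

```javascript
// 4x4 homogeneous rotation matrices around the Z, X and Y axes
function rotZ(rz) { var c = Math.cos(rz), s = Math.sin(rz);
  return [[c,-s,0,0],[s,c,0,0],[0,0,1,0],[0,0,0,1]]; }
function rotX(rx) { var c = Math.cos(rx), s = Math.sin(rx);
  return [[1,0,0,0],[0,c,-s,0],[0,s,c,0],[0,0,0,1]]; }
function rotY(ry) { var c = Math.cos(ry), s = Math.sin(ry);
  return [[c,0,s,0],[0,1,0,0],[-s,0,c,0],[0,0,0,1]]; }

// Applies a matrix to homogeneous coordinates (x,y,z,w)
function apply(m, v) {
  return m.map(function(row) {
    return row[0]*v[0] + row[1]*v[1] + row[2]*v[2] + row[3]*v[3];
  });
}

// A quarter turn around Z sends (1,0,0) to approximately (0,1,0); w stays 1
var p = apply(rotZ(Math.PI / 2), [1, 0, 0, 1]);
```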

Scaling and aspect ratio

Two scaling operations are needed: one for the zooming factor of the observer, the other to correct the X/Y distortion due to the standard rendering mechanism.

The zoom is an isotropic scaling, i.e. the 3 dimensions are scaled equally. Such a transformation is represented by the following matrix (in wxMaxima: scale:matrix( [s,0,0,0], [0,s,0,0], [0,0,s,0], [0,0,0,1]);):

The correction of the aspect ratio is anisotropic: the Z coordinate remains unchanged, and either the X or the Y coordinate must be scaled such that a square is rendered as a square and not as a rectangle. Here we choose to leave the Y coordinate unchanged and to adapt the X coordinate (horizontal compression). Without correction, the point at (1,1) would be rendered at the top-right corner of the viewport (i.e. with x equal to half of the width of the context). Once corrected, the same point must be rendered at a position x' equal to half of the height of the context (hence no distortion). Formally:

x' = context_height/2
x = context_width/2

By dividing the first formula by the second and re-ordering:

x' = x * context_height / context_width

We define the aspect ratio as the quotient of the width by the height (remember the common "16/9" screen dimension), i.e. it will be greater than 1 for a standard screen and equal to 1 for a square view port:

ar = context_width / context_height

With this definition, the matrix used for the aspect ratio correction is (with wxMaxima: aspect_ratio:matrix([1,0,0,0],[0,1/ar,0,0],[0,0,1,0],[0,0,0,1]);):

Combining rotations, scaling and translations

The transformations seen so far can be combined in a basic transformation matrix, i.e. without the perspective projection part. Because the matrix product is not a commutative operation, the order in which matrices are multiplied is important. Some rules can be deduced from the above sections to determine possible combinations:

  • The rotation around the Z axis is best applied after the rotations around X and Y, so that it acts as a final rotation of the rendered plane
  • Scaling and rotations can be combined in any order
  • The translation must be the first operation, as it moves the center of all subsequent operations

We choose the following order: first translations, then rotations, then scaling and aspect ratio.

A very important point to note about the way WebGL (as well as OpenGL) uses arrays: matrices are flattened into linear arrays in "column-major order". This means that data is stored one column after the other, not row by row. Because wxMaxima returns matrices in row-major order, our matrices must be transposed before being passed to any WebGL function. We obtain the desired matrix in wxMaxima, using the previously entered matrices, with the final command transpose(aspect_ratio. scale. rotz. roty. rotx. translation);:
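The transposition while flattening can be sketched as follows (an illustrative JavaScript helper; the on-line demo embeds the transposed formulas directly):

```javascript
// Flattens a row-major 4x4 matrix (array of rows, as printed by wxMaxima)
// into the column-major linear array expected by WebGL
function toColumnMajor(m) {
  var flat = [];
  for (var col = 0; col < 4; col++)
    for (var row = 0; row < 4; row++)
      flat.push(m[row][col]);
  return flat;
}

// For a translation matrix, the translation vector ends up in
// elements 12..14 of the column-major array
var t = toColumnMajor([[1,0,0,5],[0,1,0,6],[0,0,1,7],[0,0,0,1]]);
// t[12] is 5, t[13] is 6, t[14] is 7
```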

This basic transformation matrix is implemented in the Javascript code in addition to the full one (see section below).

Perspective projection

The projection is needed to render a 3D scene onto a 2D screen; it is hence a transformation from the space to a plane. The projection plane is typically chosen perpendicular to the view direction. In our basic transformation matrix, the depth coordinate is simply ignored to render the model (we keep (x,y) from (x,y,z)). This kind of projection, called "orthographic" (it can be shown that it preserves parallelism and relative angles), cannot realistically render a scene with perspective because the depth information is simply lost.

To produce a perspective effect, i.e. such that far objects are rendered smaller than near ones, the depth information must be combined with the other coordinates in the projection computation. The following picture illustrates what happens on the Y coordinates during the projection of a point to the screen in the general case:

Here we look at the positions of the objects in the ZY plane, i.e. we see the observer and what he observes from the side. The observer is located at (0,0) and the point to be rendered (real position) is at (Z,Y). We set the "projection screen" at a distance d from the observer. Very naturally, we find that the observer "sees" the point to be rendered through the projection screen at a position Y' such that (using the Thales theorem, a.k.a. similar triangles):

Y' / d = Y / Z

Or, after re-ordering:

Y' = d * Y / Z

Several things can be seen in this formula:

  • The main operation is a division by Z, which matches the expected behavior (the bigger Z, the smaller the rendered objects)
  • The parameter d has the effect of a zoom factor, and because of this distance d there is always a translation to take into account for correct positioning of the camera. This unwanted scaling and translation make it difficult to predict the final position of rendered points.
  • If the observer is far away from the projection screen, i.e. d and Z are both high and hence the ratio d/Z≈1, the result of the projection Y'≈Y is almost the same as without projection. Consequently, if d is high, the perspective effect is minimized (as in an orthographic projection).
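The formula can be checked with a trivial JavaScript sketch (illustrative helper name):

```javascript
// Projects the Y coordinate of a point at depth z on a screen
// placed at distance d from the observer: Y' = d * Y / Z
function projectY(y, z, d) {
  return d * y / z;
}

// Two objects of the same height, the second twice as far:
// the far one is rendered half as big
var near = projectY(1, 10, 10);   // 1 (point on the screen is invariant)
var far  = projectY(1, 20, 10);   // 0.5
```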

Besides, in order to fit in the homogeneous coordinates concept, the projection must be represented as a matrix multiplication. As an illustration, in the simplest case (no translation, no scaling), a projection is a product by a matrix of the following form (in wxMaxima: pure_projection:matrix( [1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,1,0]);):

Multiplied by the column-vector coordinates of a point (x,y,z,1) (or in wxMaxima point:matrix([x],[y],[z],[1]);), the result is (pure_projection.point;):

The division by Z appears only after the homogeneous coordinates are transformed back into 3D space coordinates (division by the 4th coordinate) :

The main issue with this simple projection matrix is that the Z coordinate equals 1 at the end, and is thus lost. The Z value is necessary to correctly display overlapping faces according to their depths (the nearest faces must be drawn last).

We will hence use a matrix that keeps the Z coordinate and at the same time makes the result more predictable. The goal is to find a projection matrix such that a given set of points remains invariant under the projection, the parameter d (distance between observer and projection screen) controls only the perspective effect without zoom (for which there is an explicit scaling operation, as seen in a section above), and the Z coordinates of all points in the visible interval are scaled to the standard [-1,1] interval.

The perspective projection is formally defined with the following requirements:

  1. The projection screen shall be at a distance d of the observer.
  2. Points in the projection screen at Z=0 shall be invariant in the operation (no scaling, etc)
  3. Points at Z=n (near plane, i.e. points nearer than this plane are not displayed) shall have a coordinate Z’=-1 after projection
  4. Points at Z=f (far plane) shall have a coordinate Z’=+1 after projection

Requirement 1 implies a translation on the Z axis. Because the translated value of the z coordinate must be used by later operations on all other coordinates, the translation is isolated in a dedicated pre-projection matrix (pre_projection:matrix( [1,0,0,0], [0,1,0,0], [0,0,1,d], [0,0,0,1]);):

The multiplication by a point (x,y,z,1) gives (pre_projected_point:pre_projection.point;):

We now look for a matrix containing unknowns that can answer our needs after the first translation on Z, i.e. a matrix including a scaling on X and Y, an additional scaling factor and a projection factor. We assume that the following matrix is usable for this task (in wxMaxima conjectured_projection:matrix( [A,0,0,0], [0,A,0,0], [0,0,B,C], [0,0,1,0]);):

The product of the coordinates of a point (x,y,z,1) by the pre-projection matrix and then by our conjectured matrix gives (conjectured_projected_point: conjectured_projection. pre_projected_point;):

After the last transformation from homogeneous coordinates to space coordinates (division by the 4th component), we get (conjectured_projected_point_3D: matrix( [conjectured_projected_point[1][1]], [conjectured_projected_point[2][1]], [conjectured_projected_point[3][1]])/ conjectured_projected_point[4][1];):

We extract the first and third components (xcoord:conjectured_projected_point_3D[1][1]; and zcoord:conjectured_projected_point_3D[3][1];) and use them to create a system of equations by substituting values given by Requirements 2, 3 and 4 (in wxMaxima equations: [subst(0,z,xcoord)=x, subst(f,z,zcoord)=1, subst(n,z,zcoord)=-1];):

This is a common system of linear equations showing corner conditions if we want to keep the sign of all factors:

  • d>0: if the distance between the eye and the projection screen equals 0, all points are sent to infinity. If it is lower than 0, everything is inverted.
  • n+d>0: if the distance d is smaller than the distance to the near plane, the coordinates of the points between the observer and the near plane are inverted.
  • f+d>0: same as the condition on the near plane.

It can be solved by wxMaxima (solutions: solve(equations,[A,B,C]);):

The results are quite complex and show the obvious corner condition f-n>0, i.e. the near and far planes are placed at increasing Z coordinates (the Z axis points towards the back of the rendering cube).
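The solved factors can be checked numerically with a JavaScript sketch (illustrative helper names; the formulas are the ones used in the Javascript code section of this post):

```javascript
// Projection factors solved from the three requirements:
// A scales X and Y, B and C calibrate the projected Z
function projectionFactors(d, n, f) {
  return {
    A: d,
    B: (n + f + 2*d) / (f - n),
    C: -(d*(2*n + 2*f) + 2*f*n + 2*d*d) / (f - n)
  };
}

// Projected Z after the pre-projection translation z -> z + d,
// i.e. (B*(z+d) + C) / (z+d)
function projectZ(z, d, n, f) {
  var k = projectionFactors(d, n, f);
  return (k.B * (z + d) + k.C) / (z + d);
}

// With d=3, n=-2, f=10: requirement 3 maps Z=n to approx -1,
// requirement 4 maps Z=f to approx +1
var zNear = projectZ(-2, 3, -2, 10);
var zFar  = projectZ(10, 3, -2, 10);
```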

Full transformation matrix

Now that the projection is defined, all operations can be combined, with the projection as the last step. The product and the final transposition are done in wxMaxima with the code full_transformation:transpose(conjectured_projection. pre_projection. aspect_ratio. scale. rotz. roty. rotx. translation);:

Exchange of X and Z

For a better understanding of the projection operation, we add a final matrix that only exchanges the X and Z coordinates (exchange_xz:matrix( [0,0,1,0], [0,1,0,0], [1,0,0,0], [0,0,0,1]);):

This operation must be activated only for didactic purposes. It shows the projection operation from the side. The resulting matrix is given hereafter (full_transformation_exz: transpose(exchange_xz. conjectured_projection. pre_projection. aspect_ratio. scale. rotz. roty. rotx. translation);):

Depending on whether the final XZ exchange is activated, the desired matrix is selected.

Javascript code

We use wxMaxima to output the results in a C-like syntax and insert the formulas into getTransformationMatrix. As the distance to the camera must be greater than the distance to the near plane, we use the transformation matrix without projection for smaller values of d, and a third matrix for a projection including the final XZ exchange:

// Returns a transformation matrix as a flat array with 16 components, given:
// ox, oy, oz: new origin (translation)
// rx, ry, rz: rotation angles (radians)
// s: scaling factor
// d: distance between camera and origin after translation,
//     if d <= -n skips projection completely
// f: z coordinate of far plane (normally positive)
// n: z coordinate of near plane (normally negative)
// ar: aspect ratio of the viewport (e.g. 16/9)
// exz: if true exchanges X and Z coords after projection
function getTransformationMatrix(ox, oy, oz, rx, ry, rz, s, d, f, n, ar, exz)
{
  // Pre-computes trigonometric values
  var cx = Math.cos(rx), sx = Math.sin(rx);
  var cy = Math.cos(ry), sy = Math.sin(ry);
  var cz = Math.cos(rz), sz = Math.sin(rz);

  // Tests if d is too small, hence making perspective projection not possible
  if (d <= -n)
  {
    // Transformation matrix without projection
    return new Float32Array([
      /* ... 16 components omitted here ... */]);
  }

  // Pre-computes values determined with wxMaxima
  var A = d;
  var B = (n+f+2*d)/(f-n);
  var C = -(d*(2*n+2*f)+2*f*n+2*d*d)/(f-n);

  // Tests if X and Z must be exchanged
  if (!exz)
  {
    // Full transformation matrix
    return new Float32Array([
      /* ... 16 components omitted here ... */]);
  }
  else
  {
    // Full transformation matrix with XZ exchange
    return new Float32Array([
      /* ... 16 components omitted here ... */]);
  }
}


As usual, the result can be seen on-line. Additional HTML/Javascript controls were added to let the user play with the parameters of the transformation. It should look like this:

The controls give the ability to decompose the full transformation step-by-step, i.e. we start with the following view:

After the rotation around the X axis:

After the rotation around the Y axis:

After the rotation around the Z axis:

After scaling:

After perspective projection:

We use the XZ exchange to display what happens on the Z coordinates during the perspective projection, i.e. both images show the same scene after projection (with no other transformation), the first from the front, the second from the side:


Nothing really new about WebGL in this post, mainly mathematics:

  • Operations with "homogeneous coordinates" consist of successive 4×4 matrix products, where space coordinates are represented by 4-component column vectors, the fourth component being initialized to 1
  • Operations such as translation, rotation, scaling and projection on a plane are combined in one matrix product
  • After this product, the 3D coordinates are obtained back by dividing the first 3 components by the fourth one
  • Rotation and scaling operations are common three dimensional transformations adapted to homogeneous coordinates by putting the 3×3 transformation matrix into a 4×4 identity matrix
  • The translations cannot be represented by a 3×3 matrix (they are not linear operations) and need a specific 4×4 matrix with the translation vector components in the 4th column
  • The perspective projection is as well represented as a 4×4 matrix through which the 4th component of the homogeneous coordinates is made proportional to the Z component, resulting in a division by Z during the conversion from homogeneous to space coordinates (division by the 4th coordinate).
  • A quite complex calibration of the factors used in the perspective projection is necessary to keep the Z coordinate and scale it to the interval [-1,+1]
  • The parameters chosen for the transformation are: (ox,oy,oz) point looked at, (rx,ry,rz) rotation angles, ar aspect ratio, s scaling, d distance between observer and projection screen, f far plane Z coordinate, n near plane Z coordinate



WebGL Basics 4 – Wireframe 3D object

We continue with a real 3D object.


In the previous post, we rendered a single triangle. In this one, we add the third dimension and create a complex object. But because our fragment shader only allows drawing with a constant color, we won't render faces but only lines. This "wireframe" object is a preparation for the future post about the transformation matrix. In brief, what we need to do:

  • Create a new object composed of a single long line
  • Change the shader code (the current one processes only 2D coordinates)
  • Change the rendering Javascript code

Object definition

The object is given as a sequence of 3D vertices (flat array) in the global variable vertices:

// Vertices of the object
var vertices = new Float32Array([
    /* X Axis */
   -1.0,  0.0,  0.0,
    0.9,  0.0,  0.0,

    /* ... other vertices omitted here ... */

    0.0,  0.0,  0.8,
    0.6,  0.0,  0.8
]);

This kind of definition is extremely constraining, as it is composed of a single long broken line. If you look at the vertices in the code, you will see that the definition is not optimal, i.e. some line segments are defined several times. This is acceptable for an example but would not be suitable for a real application. We will see alternatives below.

3D processing in shader

The change is straightforward. The ppos attribute variable is extended from vec2 to vec3, and ppos.z is used to build the final homogeneous coordinate:

// GLSL ES code to be compiled as vertex shader
'attribute vec3 ppos;'+
'uniform mat4 mvp;'+
'void main(void) {'+
'  gl_Position = mvp * vec4(ppos.x, ppos.y, ppos.z, 1.0);'+
'}';

3D wireframe rendering

Once the shader is modified, the corresponding Javascript code must be adapted, first in the initialization function start(). The vertices are given as a parameter, and the number 3 in vertexAttribPointer (instead of 2) indicates that we now process 3 values per vertex.

  // Puts vertices to buffer and links it to attribute variable 'ppos'
  gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.STATIC_DRAW);
  gl.vertexAttribPointer(vattrib, 3, gl.FLOAT, false, 0, 0);

Only the call to the function drawArrays must be changed in draw() (the function called periodically):

  // Draws the object
  gl.drawArrays(gl.LINE_STRIP, 0, vertices.length/3);

Here we use the gl.LINE_STRIP rendering mode. Several modes are available:

  • LINE_STRIP: the first two vertices delimit the ends of the first segment, and each new vertex defines the end of a new segment starting at the end of the previous one
  • LINE_LOOP: same as LINE_STRIP with an additional segment between the first and last vertices (closing the contiguous set of segments)
  • LINES: pairs of vertices delimit the ends of each individual segment (giving a non-contiguous set of segments)
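The number of segments obtained from n vertices differs between the modes, as a small illustrative helper shows (not part of the demo code):

```javascript
// Number of segments drawn for n vertices in each WebGL line mode
function segmentCount(mode, n) {
  if (mode === 'LINE_STRIP') return Math.max(n - 1, 0);  // chained segments
  if (mode === 'LINE_LOOP')  return n;                   // chained and closed
  if (mode === 'LINES')      return Math.floor(n / 2);   // independent pairs
}

// With 4 vertices: 3 chained segments, 4 in a closed loop, 2 separate ones
var strip = segmentCount('LINE_STRIP', 4);   // 3
var loop  = segmentCount('LINE_LOOP', 4);    // 4
var lines = segmentCount('LINES', 4);        // 2
```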


As usual the result is available on-line. A house with the axes now rotates around the origin:

Please note that:

  • The rotation is only relative to the origin
  • There is an X/Y distortion (the aspect ratio is not corrected)
  • There is no perspective (orthographic projection), so it is impossible to tell which vertices are near and which are far

An interesting effect is also visible if the 3 rotations are activated. One corner of the house is cropped when the corresponding vertex has a Z coordinate outside the interval [-1,+1], limits of the visible cube:


Not many points this time:

  • LINE_STRIP is used in drawArrays to render a wireframe object
  • The object is defined as a sequence of 3D vertices in a flat array
  • The shader processes 3D vertices directly (using the ppos.z component)

The next post will deal with the complete transformation matrix.



WebGL Basics 3 – Rotating triangle

We continue this pure WebGL tutorial with animation.


The previous post was dedicated to all the function calls necessary to render a simple triangle. In this post, we will animate this triangle and make it rotate around the three axes. The web page from the previous post must be changed as follows:

  • Addition of a mechanism to periodically refresh the canvas and manually control the rotation
  • Modification of the shader code to include a matrix-based transformation of vertices
  • Creation of a function used to pre-compute the transformation matrix

The part about the transformation matrix necessitates some basic linear algebra but should be understandable with high-school level mathematics knowledge.

Refactoring the code

First of all, in order to make the program more readable, we move some code around. The code of the shaders is placed at the beginning of the Javascript part:

// GLSL ES code to be compiled as fragment shader
'void main(void) {'+
'  gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);'+
'}';

// GLSL ES code to be compiled as vertex shader
'attribute vec2 ppos;'+
'void main(void) {'+
'  gl_Position = vec4(ppos.x, ppos.y, 0.0, 1.0);'+
'}';

Note that the vertex shader code above is not modified compared to the previous post, just re-arranged. It will be modified in the next section.

The second step is to separate the initialization code from the code called at each refresh. Hence the new method draw() is created and filled with the end of the start() function. The new global flag running activates or deactivates the refresh commands (the control of this flag is described in a section below) and we define the variable program as a global one:

var running = true; // True when the canvas is periodically refreshed
var program; // The program object used in the GL context

// Function called periodically to draw the scene
function draw()
{
  // Tests if canvas should be refreshed
  if(!running || !gl)
    return;

  // Sets clear color to non-transparent dark blue and clears context
  gl.clearColor(0.0, 0.0, 0.5, 1.0);
  gl.clear(gl.COLOR_BUFFER_BIT);

  /* ... matrix computation and uniform update omitted here ... */

  // Draws the object
  gl.drawArrays(gl.TRIANGLES, 0, 3);
}

Periodic Refresh

The draw() function is now prepared. At the end of start(), the standard Javascript function setInterval is used to indicate to the browser that the function draw() must be called periodically:

  // The function draw() will be called every 40 ms
  setInterval('draw();', 40);  

Using setInterval is not a very good solution for animation, because this function is completely independent of the refresh cycles of the screen (no smooth animation). Furthermore, it is called even if the page or tab is not currently visible, and a "best-guess" value for the delay in milliseconds must be set. There is an alternative to setInterval called requestAnimationFrame, but it is still in a draft stage and we won't use it for now (it may be addressed in a future post).

Matrix multiplication in shader

The next change is in the vertex shader. We add a transformation implemented as a multiplication of a 4×4 floats matrix (type "mat4" in GLSL) by the 4D homogeneous coordinates (x, y, z, 1) of each vertex. The matrix is given in the variable named mvp:

// GLSL ES code to be compiled as vertex shader
'attribute vec2 ppos;'+
'uniform mat4 mvp;'+
'void main(void) {'+
'  gl_Position = mvp * vec4(ppos.x, ppos.y, 0.0, 1.0);'+
'}';

The modifier "uniform" means that the value of this variable won’t change during the rendering operations. The product of a matrix by a vector is simply stated using the operator "*".

As a side note, standard operators are used in GLSL for common matrix or vector operations ("+", "-", "*"; of course the vector or matrix sizes must match) and additional functions implement other operations: dot(v1,v2) is the dot product of two vectors (i.e. x1*x2+y1*y2+z1*z2, equal to 0 for perpendicular vectors), cross(v1,v2) the cross product (i.e. a vector perpendicular to v1 and v2, with a length equal to 0 for parallel vectors) and matrixCompMult(m1,m2) the component-wise product of two matrices.
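These GLSL functions are easy to mimic in plain JavaScript for experimentation (an illustrative sketch, not part of the page's code):

```javascript
// Plain JavaScript equivalents of the GLSL dot() and cross() functions
function dot(a, b) {
  return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
}
function cross(a, b) {
  return [a[1]*b[2] - a[2]*b[1],
          a[2]*b[0] - a[0]*b[2],
          a[0]*b[1] - a[1]*b[0]];
}

// Perpendicular vectors give a dot product of 0
var d = dot([1, 0, 0], [0, 1, 0]);        // 0
// Parallel vectors give a zero cross product
var c = cross([1, 2, 3], [2, 4, 6]);      // [0, 0, 0]
```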

Animation control

Note: this section deals more with HTML and DOM than WebGL.

We want to control the animation with the mouse, more precisely we want to toggle the refresh flag "running" and change the increments of the rotation angles. The flag "running" is already defined (see section "Refactoring the code"), and it is toggled directly by a click on the <canvas>:

<canvas id='glcanvas' width=320 height=240 onclick='running = !running;'>
    Your browser may not support HTML5
</canvas>

The rotation angle increments are set by the user with drop down lists (<select> elements with identifiers dx, dy, dz) and the current angles are stored as <div> elements (ax, ay, az). The HTML code for the control of the rotation around the X axis is given hereafter:

<div style='display:inline-block;'>RX:</div>
<div style='display:inline-block; width:1.5em;' id='ax'>0</div>
<div style='display:inline-block;'>
  <select id='dx'>
    <option>-2<option>-1<option>0<option selected="selected">+1<option>+2
  </select>
</div>

In the draw() function, the rotation angles and increments are read, and the current angles get updated (EDIT: thanks to canta, I changed "innerText" to "innerHTML" for better compatibility, e.g. with Firefox):

  // Gets control value angles from HTML page via DOM
  var ax = parseInt(document.getElementById('ax').innerHTML, 10);
  var ay = parseInt(document.getElementById('ay').innerHTML, 10);
  var az = parseInt(document.getElementById('az').innerHTML, 10);
  // Use increments via DOM to update angles (still in degrees)
  ax = (ax + parseInt(document.getElementById('dx').value, 10) + 360) % 360;
  ay = (ay + parseInt(document.getElementById('dy').value, 10) + 360) % 360;
  az = (az + parseInt(document.getElementById('dz').value, 10) + 360) % 360;
  // Update HTML page with new values
  document.getElementById('ax').innerHTML = ax.toString();
  document.getElementById('ay').innerHTML = ay.toString();
  document.getElementById('az').innerHTML = az.toString();

After that, ax, ay and az contain the current angles in degrees, and must be converted:

  // Convert values to radians
  ax *= 2*Math.PI/360; ay *= 2*Math.PI/360; az *= 2*Math.PI/360; 

Combining the rotation matrices

At this point, almost everything is set up:

  • The current rotation angles, in radians (i.e. 0 to 2*PI), are defined (in ax, ay, az)
  • They are incremented for each refresh
  • The refresh is done in draw() using the code of the previous post
  • The shader uses a 4×4 "uniform" matrix that is multiplied with all input vertices
  • This matrix must be defined in the Javascript code

In this section we look at the matrix itself, without a full explanation of the mathematics behind it (a future post will be dedicated to the complete transformation matrix). Hence we admit for now that the rotation around an axis is achieved by the product of a "rotation" matrix with the input coordinates. We will use the Computer Algebra System wxMaxima to pre-compute these matrix operations (the matrix formula won’t change during rendering). We use the variables rx, ry, rz for the angles. A rotation matrix around the Z axis has the following form (in wxMaxima: rotz:matrix([cos(rz),-sin(rz),0,0],[sin(rz),cos(rz),0,0],[0,0,1,0],[0,0,0,1])):

Note that after multiplication of the matrix above by a point with coordinates (x,y,z,1), the Z coordinate remains unchanged (as well as the 4th one), and the X and Y coordinates are rotated in the plane (0,X,Y) (hence it is a rotation around the Z axis):

The rotation matrices around the X and Y axes are obtained by replacing rz by rx or ry, and by permuting the different coordinates. The rotation matrix around the X axis is given hereafter (rotx:matrix([1,0,0,0],[0,cos(rx),-sin(rx),0],[0,sin(rx),cos(rx),0],[0,0,0,1])):

The rotation around the Y axis (roty:matrix([cos(ry),0,sin(ry),0],[0,1,0,0],[-sin(ry),0,cos(ry),0],[0,0,0,1])):

The 3 rotations are combined by multiplying the 3 matrices (here again wxMaxima is of great help!). The order of the matrix multiplications is important, i.e. multiplying rotX.rotY.rotZ is not the same as rotZ.rotY.rotX (matrix product is not "commutative"). In our case, we choose to have the rotation around the Z axis as last operation, hence in wxMaxima, we get the result of the formula rotz.roty.rotx:
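The non-commutativity is easy to verify numerically. The sketch below multiplies 4×4 matrices stored as nested row-major arrays (hand-rolled helpers, not WebGL functions) and shows that swapping two rotations changes the result:

```javascript
// Multiplies two 4x4 matrices given as nested row-major arrays.
function mat4Multiply(a, b) {
  var r = [];
  for (var i = 0; i < 4; i++) {
    r.push([0, 0, 0, 0]);
    for (var j = 0; j < 4; j++)
      for (var k = 0; k < 4; k++)
        r[i][j] += a[i][k] * b[k][j];
  }
  return r;
}

// Rotation around the Z axis (same form as the wxMaxima rotz matrix).
function rotZ(a) {
  return [[Math.cos(a), -Math.sin(a), 0, 0],
          [Math.sin(a),  Math.cos(a), 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1]];
}

// Rotation around the X axis (rotx).
function rotX(a) {
  return [[1, 0, 0, 0],
          [0, Math.cos(a), -Math.sin(a), 0],
          [0, Math.sin(a),  Math.cos(a), 0],
          [0, 0, 0, 1]];
}
```

For rx = rz = PI/2, the element [0][2] of rotZ·rotX is 1 while the same element of rotX·rotZ is 0, so the two products differ.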

The matrix is quite complex now, and it only implements 3 rotations. The good thing is that this computation is necessary only once per refresh. It is implemented in the getTransformationMatrix(rx,ry,rz) function:

// Gets a transformation matrix given the rotation angles
function getTransformationMatrix(rx, ry, rz)
{
  // Pre-computes trigonometric values (mainly for better readability)
  var cx = Math.cos(rx), sx = Math.sin(rx);
  var cy = Math.cos(ry), sy = Math.sin(ry);
  var cz = Math.cos(rz), sz = Math.sin(rz);

  // Returns matrix
  return new Float32Array([cy*cz, (sx*sy*cz-cx*sz), (sx*sz+cx*sy*cz), 0,
                           cy*sz, (sx*sy*sz+cx*cz), (cx*sy*sz-sx*cz), 0,
                           -sy,   sx*cy,            cx*cy,            0,
                           0,     0,                0,                1]);
}

The object returned by this function is a one-dimensional array with 16 components, not a real 4×4 matrix. In the WebGL conventions, all matrices are represented as flat arrays when used from Javascript.
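If you find nested rows more readable, a small helper can flatten them into the expected 16-element array before the upload (a sketch with a name of my choosing; note that whether the GPU reads these 16 values as rows or columns depends on the transpose convention, so keep the layout consistent with what the shader expects):

```javascript
// Flattens a 4x4 matrix given as nested rows into the flat
// 16-element Float32Array layout accepted by uniformMatrix4fv.
function flattenMat4(rows) {
  var flat = new Float32Array(16);
  for (var i = 0; i < 4; i++)
    for (var j = 0; j < 4; j++)
      flat[4 * i + j] = rows[i][j];
  return flat;
}
```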

Transformation matrix set for rendering

The transformation is now defined and must be communicated to the shader. First the address of the mvp matrix in the shader must be determined. The function gl.getUniformLocation(program, variable-name) returns the desired reference. After that the function gl.uniformMatrix4fv(uniform-ref, false, matrix-as-array) sets the value of the matrix:

  // Gets reference on the "uniform" 4x4 matrix transforming coordinates
  var amvp = gl.getUniformLocation(program, 'mvp');
  if(amvp === null)
  {alert('Error during uniform address retrieval');running=false;return;}

  // Creates matrix using rotation angles
  var mat = getTransformationMatrix(ax, ay, az);
  // Sets the model-view-projections matrix in the shader
  gl.uniformMatrix4fv(amvp, false, mat);

As usual, you can have a look at the result online. It should look like this:


The main points regarding animation in WebGL:

  • Functions used for the refresh are put in a new function, which is called periodically through setInterval("code", refresh-period)
  • A "transformation matrix" is set as a "uniform" variable in the shader
  • A reference to such a variable is retrieved by the function gl.getUniformLocation(program, variable-name)
  • The vertex transformation operation in the shader is the product matrix * coordinates
  • Our transformation matrix is built as a product of three 2D rotations using three variable angles (formula found with wxMaxima)
  • Once computed, the matrix is set in the shader through gl.uniformMatrix4fv(uniform-ref, false, matrix-as-array)



WebGL Basics 2 – White 2D triangle

This part is longer than the first one, and I hope that it will remain the longest in this mini tutorial. WebGL is such that even the simplest rendering requires a great number of things to be set up.


The basic mechanism of hardware accelerated 3D rendering engines is that the graphics card achieves the computing intensive operations using specialized hardware and the main processor focuses on the remaining tasks such as managing user interactions. It is the same with WebGL, but with the particularity that the main processor controls everything through Javascript, which is not as quick as a compiled program would be.

The rendering of a static model in 3D must roughly achieve the following steps:

  1. Vertices are taken from a buffer and processed to move the model into the camera reference frame (rotation, translation, scaling) and to transform the 3D coordinates into 2D points on the viewable area (perspective projection).
  2. Faces (delimited by the vertices) are computed, i.e. non-visible areas are eliminated (back faces culling and Z buffering), and extremities of pixel lines belonging to faces are determined (rasterization).
  3. For each pixel covered by a face, the color is computed (each candidate pixel, with its interpolated data, is called a fragment).

In the OpenGL ES 2 scheme, on which WebGL is based, the first and third steps are fully programmable! It has the obvious advantages of flexibility and optimization (only the necessary operations are done) but it makes everything complicated for the newcomer. According to the OpenGL terminology, the "vertex shader" is the programmable part responsible for vertex transformations and the "fragment shader" the one that computes pixel colors. Both are compiled and linked to a "program", which is in turn attached to the WebGL context. The vertices are stored by the Javascript code in a "buffer", from where they are piped into the vertex shader via an "attribute" variable (i.e. input variable of the vertex shader).

Fragment shader compilation

We begin with the HTML page started in the first part of this tutorial and continue in the start() function. The following lines of code show the creation and compilation of the fragment shader:

// Creates fragment shader (returns white color for any position)
var fshader = gl.createShader(gl.FRAGMENT_SHADER);
gl.shaderSource(fshader, 'void main(void) {gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);}');
gl.compileShader(fshader);
if (!gl.getShaderParameter(fshader, gl.COMPILE_STATUS))
{alert("Error during fragment shader compilation:\n" + gl.getShaderInfoLog(fshader)); return;}

The compilation is triggered by the 3 WebGL functions createShader, shaderSource and compileShader. The source code of the shader is passed to shaderSource as a simple character string. The language used for shader programming is called GLSL ES (GL Shader Language for Embedded Systems). It features a C-like syntax with specific data types and predefined variables. In our example, the shader is programmed with the following source code:

void main(void)
{
  gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}

The function main() is called once per fragment, i.e. for each pixel covered by the rendered primitives. Each time main() is called, the color at this position is returned in the predefined variable gl_FragColor. It is given as a vector with 4 components between 0.0 and 1.0: red, green, blue and alpha (transparency). In our example, the shader is simplistic as it produces only a constant color independent of the position. For more elaborate fragment shaders, some geometric data related to the model being rendered is necessary (more about this in a future post).

Vertex shader compilation

The vertex shader is created and compiled with the same functions as for the fragment shader:

// Creates vertex shader (converts 2D point position to coordinates)
var vshader = gl.createShader(gl.VERTEX_SHADER);
gl.shaderSource(vshader, 'attribute vec2 ppos; void main(void) ' +
  '{ gl_Position = vec4(ppos.x, ppos.y, 0.0, 1.0);}');
gl.compileShader(vshader);
if (!gl.getShaderParameter(vshader, gl.COMPILE_STATUS))
{alert('Error during vertex shader compilation:\n' + gl.getShaderInfoLog(vshader)); return;}

The code of the shader is given hereafter:

attribute vec2 ppos;
void main(void)
{
  gl_Position = vec4(ppos.x, ppos.y, 0.0, 1.0);
}

The shader transforms vertices in a very simple way: for each call of main(), the 2D position given as input in ppos is extended with a Z coordinate equal to 0.0 plus an additional component equal to 1.0, and is returned as a 4-components vector in the predefined variable gl_Position. This last component is due to the usage of "homogeneous coordinates" (more about this in a future post). The most important thing to note is that the input variable ppos is marked with the prefix "attribute". Any attribute variable is a link to the Javascript code. We will see below how to pipe vertices into the vertex shader, but before that additional data structures must be set up.
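What the shader computes per vertex can be mimicked in plain Javascript (purely illustrative; the function name is mine): the 2D position is extended to a homogeneous 4-component vector, exactly as vec4(ppos.x, ppos.y, 0.0, 1.0) does:

```javascript
// Mimics the vertex shader: takes a 2D position [x, y] and returns the
// homogeneous 4-component vector [x, y, 0, 1] assigned to gl_Position.
function toClipPosition(ppos) {
  return [ppos[0], ppos[1], 0.0, 1.0];
}
```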

Linking to program and validation

Once the shaders are correctly compiled, they must be linked in a so-called program using the functions createProgram, attachShader and linkProgram. The code is straightforward:

// Creates program and links shaders to it
var program = gl.createProgram();
gl.attachShader(program, fshader);
gl.attachShader(program, vshader);
gl.linkProgram(program);
if (!gl.getProgramParameter(program, gl.LINK_STATUS))
 {alert("Error during program linking:" + gl.getProgramInfoLog(program));return;}

After linking, the program is "validated" and finally "used":

// Validates and uses program in the GL context
gl.validateProgram(program);
if (!gl.getProgramParameter(program, gl.VALIDATE_STATUS))
{alert("Error during program validation:\n" + gl.getProgramInfoLog(program));return;}
gl.useProgram(program);

I assume that there is a good reason to separate the different program/shader operations (attach, link, validate, use), maybe for performance reasons. As soon as several programs and contexts are used in the same page, the need for such steps will certainly appear clearer.

Attribute variable reference

Once the program is set up, it is possible to find the reference to the input "attribute" variable ppos, which is the interface between the Javascript code and the vertex shader code:

// Gets address of the input 'attribute' of the vertex shader
var vattrib = gl.getAttribLocation(program, 'ppos');
if(vattrib == -1)
{alert('Error during attribute address retrieval');return;}
gl.enableVertexAttribArray(vattrib);

Note that the returned reference is a simple integer value (a pointer to some memory area accessible from the vertex shader). This kind of low-level access mechanism, unusual and potentially insecure in a high-level interpreted language like Javascript, is directly inherited from the (compiled) OpenGL API. According to the WebGL specification, the access to non-allowed areas must fail with the error "invalid operation", but in my opinion we can expect some new exploits based on that. In any case, this raw mechanism does not facilitate debugging.

Buffer filled with vertices

The next step is to define the object to be rendered. First, a buffer object is created with createBuffer and "bound" with bindBuffer(gl.ARRAY_BUFFER, buffer). The bind operation sets the buffer as the current one (the next commands will use it without specifying it) and states that its type is array (and not indices).

// Initializes the vertex buffer and sets it as current one
var vbuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, vbuffer);

Once the buffer is there, the object is created. In our example, it is a simple 2D triangle. The xy coordinates are put sequentially in one array that is converted into a Float32Array object (with bufferData), and then the buffer is linked together with the vertex shader via the attribute variable (with vertexAttribPointer):

// Puts vertices to buffer and links it to attribute variable 'ppos'
var vertices = new Float32Array([0.0,0.5,-0.5,-0.5,0.5,-0.5]);
gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.STATIC_DRAW);
gl.vertexAttribPointer(vattrib, 2, gl.FLOAT, false, 0, 0);

The "target" gl.ARRAY_BUFFER is the same as the one used in bindBuffer. In the same command, the "usage" set to gl.STATIC_DRAW indicates that the data in the buffer won’t be changed during the subsequent operations.

The next command vertexAttribPointer(vattrib, 2, gl.FLOAT, false, 0, 0) takes more parameters than one would expect intuitively. These parameters provide some flexibility for the vertex shader to read the values from the buffer:

  • vattrib is the attribute variable (determined using getAttribLocation)
  • 2 is the size, i.e. the number of values to be read from the buffer before the main() function of the shader is called (here we put a 2 as our vertices define a 2D object)
  • gl.FLOAT is the type of the data in the buffer
  • false is the "normalized" flag equal to false for floats (used only for integers to indicate that integer values represent values between -1 and 1)
  • 0 is the "stride" that represents the increment in bytes between two consecutive elements (i.e. tuples of values), if greater than 0
  • 0 (the last parameter) is the "offset" that defines the number of bytes skipped at the beginning of the buffer

In our examples, the values of stride and offset will be set to 0. As a side note, when stride and offset are used, their values must be multiples of the size of the data type considered (here floats).
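Stride and offset become useful with interleaved buffers. As a hypothetical example (not used in this tutorial), a buffer packing [x, y, r, g, b] floats per vertex would be described as follows; the byte sizes are derived from Float32Array.BYTES_PER_ELEMENT:

```javascript
// Byte layout for a hypothetical interleaved buffer [x, y, r, g, b] per vertex.
var FLOAT_SIZE = Float32Array.BYTES_PER_ELEMENT;  // 4 bytes per float
var stride = 5 * FLOAT_SIZE;                      // bytes between two consecutive vertices
var positionOffset = 0;                           // x,y start at the beginning
var colorOffset = 2 * FLOAT_SIZE;                 // r,g,b follow the 2 position floats

// With a WebGL context and two attribute references, these values would be
// passed as (posAttrib and colorAttrib are hypothetical names):
// gl.vertexAttribPointer(posAttrib,   2, gl.FLOAT, false, stride, positionOffset);
// gl.vertexAttribPointer(colorAttrib, 3, gl.FLOAT, false, stride, colorOffset);
```

Both byte values are multiples of the float size, as required by the specification.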

Face rendering

Now that everything is set up, the object must be rendered with drawArrays:

// Draws the object
gl.drawArrays(gl.TRIANGLES, 0, 3);
gl.flush();

The function drawArrays takes the current buffers (the "bound" ones) and the corresponding "enabled" attribute variables as input for the vertex shader. The first parameter of the function gives the "mode", i.e. the way the vertices returned by the vertex shader will be used during the draw operation. The value gl.TRIANGLES means vertices are used 3-by-3 to draw triangles. OpenGL ES and WebGL support other draw modes such as "triangle strips", "triangle fans", points and lines (more about that in a future post).

Finally, the second parameter of drawArrays gives the first element of the array to be drawn, and the third one the number of elements to be used (elements are here pairs of coordinates, due to the size 2 given in vertexAttribPointer). The flush function is called to be sure that the operations are started (not mandatory).

Please have a look at the result online. The canvas part should look like this:

Viewable area between -1 and +1

It is now time to look precisely at the result. The coordinates of the triangle are defined between -1 and +1. But even though the only transformation is adding a Z coordinate equal to 0 to each 2D point (in the vertex shader), the triangle is not rendered two pixels wide but stretched to the size of the canvas. Namely, the top spike of the triangle with a Y coordinate set to +0.5 is rendered at a distance of the top border equal to 1/4 of the complete area. As one can deduce, the default viewable area in WebGL is the cube centered on the origin (0, 0, 0) and with size 2, i.e. with faces parallel to axis pairs and at coordinates +1 or -1. With such a definition, if the canvas is not a square, the image is distorted (i.e. like with an incorrect aspect ratio).
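The mapping between canvas pixels and the default [-1, +1] coordinate range can be written down explicitly (a sketch, with a function name of my choosing):

```javascript
// Converts a canvas pixel (px, py), origin top-left, into the default WebGL
// coordinate system where both axes span [-1, +1] and Y points upwards.
function pixelToClip(px, py, width, height) {
  return [2 * px / width - 1,
          1 - 2 * py / height];
}
```

On our 320×240 canvas, the center pixel (160, 120) maps to (0, 0), and the triangle's top vertex at Y = +0.5 corresponds to pixel row 60, i.e. a quarter of the canvas height below the top edge. Because X and Y are scaled independently, a non-square canvas distorts the image.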


The main points to remember (gl is the GL context object created from the canvas), first about the shader functions:

  • gl.createShader(gl.FRAGMENT_SHADER) returns a fragment shader object
  • gl.createShader(gl.VERTEX_SHADER) returns a vertex shader object
  • gl.shaderSource(shader, source-code-string) associates the source code to a shader object
  • gl.compileShader(shader) compiles the shader
  • In the vertex shader, the input data is piped through a global variable tagged as attribute, and the variable gl_Position receives the output vertex as a 4 components vector (x, y, z, 1.0)
  • In the fragment shader, the variable gl_FragColor receives the output color as a 4 components vector (r, g, b, alpha)

The program functions:

  • gl.createProgram() returns the program object
  • gl.attachShader(program, shader) attaches a shader object
  • gl.linkProgram(program) links everything together
  • gl.validateProgram(program) validates
  • gl.useProgram(program) uses finally the program

The buffer and attribute functions:

  • gl.getAttribLocation(program, attribute-name) returns the reference on the attribute variable
  • gl.enableVertexAttribArray(attrib) enables it
  • gl.createBuffer() returns a new buffer object
  • gl.bindBuffer(gl.ARRAY_BUFFER, buffer) sets it as current buffer
  • gl.bufferData(gl.ARRAY_BUFFER, array, gl.STATIC_DRAW) puts the values contained in the given Float32Array (usually vertex coordinates) in the current buffer
  • gl.vertexAttribPointer(attrib, size, gl.FLOAT, false, 0, 0) links the current buffer (containing floats) and the attribute reference (size is the number of values per element, e.g. 3 for 3D points)

Finally the draw function:

  • gl.drawArrays(gl.TRIANGLES, 0, count); draws the values contained in the current buffer (linked to a valid attribute) as triangles (count is the number of elements to read from the buffer)
  • The default viewable area is defined as a cube centered on (0, 0, 0) with faces at coordinates -1/+1



WebGL Basics 1 – Context in canvas

I’m currently learning WebGL, without prior knowledge of OpenGL or DirectX, and in this series of posts I will track what I discovered so far. This is not the first WebGL tutorial, and obviously not the last one! In my pages, I want to focus on minimizing the information to learn and pointing out possible alternatives. The pre-requisites for the reader are basic knowledge of HTML and linear algebra.


We will start with the simplest WebGL page, i.e. with no animations, not even a rendered surface, just a dark blue WebGL context created in an HTML5 “canvas” element. This part addresses the creation of the canvas, the creation of the “WebGL context” and the commands to clear the background of this context.

HTML5 canvas

The body of the web page is kept minimal. It contains mainly the static <canvas> element (the exact size in pixels must be given explicitly otherwise the default size is 300×150):

<!DOCTYPE html>
<html>
  <head>
    <title>WebGL Basics part 1</title>
    <script type="text/javascript">
      // the start() function detailed below goes here
    </script>
  </head>
  <body onload="start()">
    <h1>WebGL context in canvas</h1>
    <canvas id="glcanvas" width=320 height=240>
      It seems the browser does not support the 'canvas' tag (part of HTML5)
    </canvas>
    <p>After the canvas</p>
  </body>
</html>

The code also shows the fallback mechanism for browsers without HTML5 support (the text between the <canvas> tags is displayed in this case), and the function called after the load of the page. We detail this function in the next paragraph.

Context creation

The start() function, defined in the <head> part of the page and called once by the onload handler, just creates the context and clears it with the background color:

var gl; // GL context

// Function called by onload handler
function start()
{
  // Gets canvas from the HTML page
  var canvas = document.getElementById('glcanvas');

  // Creates GL context
  gl = null;
  try {gl = canvas.getContext('experimental-webgl');}
  catch(e) {alert('Exception caught in getContext: '+e.toString());return;}

  // If no exception but context creation failed, alerts user
  if(!gl) {alert('Unable to create Web GL context');return;}

  // Sets clear color to non-transparent dark blue and clears context
  gl.clearColor(0.0, 0.0, 0.5, 1.0);
  gl.clear(gl.COLOR_BUFFER_BIT);
}

The most important part of this function is the call to "getContext". It is a method of the canvas object, i.e. part of the HTML5 standard and not of the WebGL API. It creates and gives back the actual WebGL object. The context identifier "experimental-webgl" denotes that the WebGL standardization is still on-going (the current version of the WebGL specification is in "final draft" state), and that is why it is important to cleanly catch all possible errors and to inform the user about them (the getContext command may throw an exception or just return a null reference).
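Since the context identifier may change once the specification is final, a defensive variant can try several names in turn (a sketch; it takes the canvas as a parameter so the logic is easy to test in isolation):

```javascript
// Tries a list of known context names and returns the first context
// obtained, or null if none is supported by the browser.
function createGLContext(canvas) {
  var names = ['webgl', 'experimental-webgl'];
  for (var i = 0; i < names.length; i++) {
    try {
      var context = canvas.getContext(names[i]);
      if (context) return context;
    } catch (e) { /* getContext may throw; ignore and try the next name */ }
  }
  return null;
}
```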

Once the context is successfully created (in the variable gl), WebGL-specific functions can be used as methods of gl. In our page, the background color is set to dark blue and the drawable area is cleared.

The complete page is available on-line. You can have a look and try some refresh and resize operations. Please note that the content of the canvas is now buffered and there is no redraw necessary.

This ends the first part of this tutorial.


The main points to remember:

  • A <canvas> tag is the starting point
  • The call to the canvas method getContext("experimental-webgl") returns the WebGL object
  • All API functions (e.g. clear) are methods of this object



WebGL comes

The recent release of Chrome 9 includes a built-in module for WebGL. The result is quite impressive! Have a look at these few examples (you need either Chrome 9 or a nightly build of Firefox or Safari):

Chrome WebGL Experiments

Body Browser

ibiblio examples

In short, WebGL is the web version of OpenGL and provides the following features:

  • Everything runs in the browser with plain text scripts (no compilation necessary)
  • Each graphic is rendered in an HTML5 canvas (seamless integration in the HTML page)
  • Computing-intensive operations such as vertex transformations or texture processing are done by the WebGL module or directly by the GPU (limited interactions with Javascript for higher performance)
  • The GPU is programmable in GLSL (compiling and linking controlled from Javascript)

WebGL will obviously bring a new dimension to the web, but the fact that the GPU can be very simply controlled and programmed makes it possible to deploy almost any kind of compute-intensive application via the web browser. I would not be surprised to see the next SETI@Home based on WebGL. A massively distributed effort to compute e.g. the factorization of large numbers would be an interesting project.
