3D Local Axis Rotations in OpenGL

Here is a Usenet post I wrote on how to perform 3D rotations around the local axes of an object using OpenGL. It is pretty easy to get this stuff wrong (and your mileage may vary when using what I've written), but I tried to make it correct and elucidate the concepts. I fixed it up a bit for HTML.

It seems that you have stumbled into the twisty little passages of 3D math that have tripped up many a 3D programmer, including me. I actually was so upset that it was so hard to learn that, after we had mastered the concepts, a friend and I taught a free 12 week class in 3D engine design that explained the basics of things like 3D math, hierarchical transformations, perspective correct texture mapping, etc. to others. Sure, books hold the knowledge, but they are frequently dense. Most of the information I got off of the web was *completely* wrong and even more difficult to understand. That said, the explanation that follows might be wrong or hard to understand. :) Corrections welcome. Also, this explanation is about the math, and it is left as an exercise to the reader to figure out how to use OpenGL matrix calls to implement this. If you want, you can implement your own matrix math library (not very hard at all) and do all of it yourself.

So, first off: I assume you are going to represent your world in three dimensions. This will make it a lot easier to do the math and will allow you to create the levels you want and make them look nice. Also, you can make good use of the hardware acceleration that cards supporting OpenGL will give you. I'm also going to assume you know how to multiply 4x4 matrices and take the inverse of a rotation matrix. It is 3:30am where I am and I really should be doing something else.... so I'm not going to explain that math now. Also, read up on "homogeneous coordinates", as it is really important and that information can be found anywhere.

Ok, here goes.

First, the notion of "spaces". A "space" is kind of like a coordinate system. Well, it actually *could* be a coordinate system, but not always, though in your case it most likely will be. Also, the system I am describing follows the "camera moves in space" concept. Sure, the math is all the same no matter what you do, but *how* the math is implemented can lead to interesting problems if you accidentally switch between the "camera moves" and "camera stays still" concepts in the middle of the math.....

Definitions:
 Object Space:
  The vertex data of an object, usually set up so that 0,0,0 is
  the center of rotation you desire.

 World Space:
  A space where the vertex data for all objects is
  represented with respect to ONE coordinate system centered
  at 0,0,0.

 View Space:
  Almost like a world space, but what happens is you
  translate the camera to the World Space 0,0,0 position
  and align it to the cardinal axes. All of the objects
  in the world are moved such that they are in their
  relative locations and orientations with respect to the
  camera(which is now at 0,0,0).

 Screen Space:
  This is the result you get when you project the View
  Space Coordinates of the vertex data for each object via
  projection equations. You end up with 2D coordinates that
  map directly to your viewport(the window or drawable
  area, usually rectangular).  Some cleverness needs to
  occur to flip the y axis in the math and translate 0,0
  from the upper left hand corner of the viewport to the
  center of the viewport.

 Rotations:
  Rotation about the X axis by an angle a:
  |1       0        0    0|
  |0  cos(a)  -sin(a)    0|
  |0  sin(a)   cos(a)    0|
  |0       0        0    1|

  Rotation about the Y axis by an angle a:
  | cos(a)  0  sin(a)    0|
  |      0  1       0    0|
  |-sin(a)  0  cos(a)    0|
  |      0  0       0    1|

  Rotation about the Z axis by an angle a:
  |cos(a)  -sin(a)  0   0|
  |sin(a)   cos(a)  0   0|
  |     0        0  1   0|
  |     0        0  0   1|

 Projection Matrix:
  |1   0   0   0|   |           |   |x|   | x' |
  |0  -1   0   0| * | transform | * |y| = |-y' |
  |0   0 1/d   0|   | matrix    |   |z|   |z'/d|
  |0   0   0   1|   |           |   |1|   |  1 |
          
           W
   d = -----------
        2*tan(a/2)

   W = screen width in pixels

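   a = the desired Field of View (normally pi/3 to pi/2 rad;
       it must stay below pi, where tan(a/2) blows up)

   To get the final 2D point, divide x' and -y' by z'/d.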

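 Inverse/Negation of a homogeneous basis:
  Transpose the upper left 3x3 matrix.
  Replace the translation column t with -(transposed 3x3 * t).
  (Just inverting the signs on t is only right when the basis
  has no rotation or no translation in it.)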

ok, now we have that out of the way...

Next, the representation of the matrix data and the objects.

NOTE: I will probably start a holy war with the statements I am about to make, but this way (that of _incremental_ rotations upon a basis) *works*, and the precision errors that arise from this method are not as big as people say they are and are actually, for all intents and purposes, negligible. If you *do* worry about the propagation of errors, then look up the Gram-Schmidt orthonormalization algorithm and apply it to the matrix data for each object every couple of thousand rotations or so. Also, the glRotate() function is a bit misleading in saying that it works on the local axis of an object. From my experience, if you always give the axis of rotation as a cardinal axis, the rotation *fails* to do what you thought the function was going to do. And if anyone disagrees with me, I can provide source code to prove either the misuse of glRotate(), or my gross inability to apply it correctly. :)

(Now glRotate() might work in Euler angles, but I haven't explored that yet....)

Ok, each object has some vertex data and a homogeneous matrix associated with it that I will henceforth reference as a "basis". This basis describes how the object is oriented in space--it is effectively an axis system.

So in C terminology:
struct Object
{
  /* point3d is what you expect, x, y, z stuff */
  struct Point3D *verticies; /* the points of the object, in object space */
  /* MeshDesc has a lot of junk in it, but just think of wireframe for now */
  struct MeshDesc *mesh; /* edge list of how to connect the points */
  /* the basis is never reinitialized, it stays the life of the object */
  float basis[4][4];     /* the orientation matrix representation of axes */
  int numkids;           /* number of children in the child ref frame */
  struct Object **kids;  /* an array of children nodes (hierarchy) */
  struct Object *parent; /* this is null for the root node */
};

There is more stuff that goes in there, but this is the pertinent stuff...

Ok, now on to how to modify these object space points for something useful. This is the method:

If you take the basis and multiply it against the object's points, you end up with the object points rotated to be aligned with the basis. To perform rotations upon the basis (for instance, to rotate the object on its y axis only), create a y rotation matrix R and multiply it by the basis B of the object. The order of the multiplication matters: basis on the left, newly created rotation matrix on the right (B * R). Take the result and make it the new object basis.

Now, this is important: each time you calculate the new vertex points, you must do it from the *original* vertex points of the object.

The rotation matrix is calculated from a small incremental angle, say .04 radians or however much you want it to rotate about that axis. In the next iteration of the 3D algorithm (after rasterization and stuff), if you want to rotate the object again, do the whole process over again with the same small rotation. That is why this algorithm is called *incremental* rotations. Each small rotation affects the basis (which is kept around across rotations) such that the next rotation has a cumulative effect on top of the last one, even though it is the same angle.

ASIDE: One wonders what all of the spaces were for, well, here it is:
 How transformation usually works:
  Object Space ->[0]-> World Space -> View Space -> Screen Space

  [0]- This means that there could be more Object Spaces to
   transform through until you get to World
   Space. This is used for hierarchical transforms,
   like the moon circling the earth circling the
   sun, etc.
------

Ok, example time:
 Given:
  One unit cube in object space. No camera in the system.

 Goal:
  Get it into screen space and rotate it in local axis coords.

 Procedure:
  0. Create Projection matrix.
   Note the -1 in the 1,1 pos. This is sneakiness that
   inverts the Y axis for viewport coords.
  1. Init basis of cube to identity matrix.
  2. Build a rotation matrix about one axis, starting from a
     fresh identity matrix.
  3. Multiply the basis by the new rotation matrix and save.
  4. Multiply Projection matrix by basis and save as J.
  5. Multiply J by vertices and get 2D coords.
  6. Plot 2D coords onto viewport.
  7. loop at 2

  Remember to apply the axial incremental rotations
  individually to the basis. This means that you calculate
  a new rotation matrix (off an identity matrix each time)
  and then apply it to the basis, then do it again on the
  new basis for the next axial incremental rotation.

Ok, there you go (end of part ONE, you could say). To relate this to OpenGL, since this *is* that group, you can set up your projection matrix on the stack first using glViewport() and glFrustum(). Then you can use glMultMatrix() and glLoadMatrix() to set up your basis, along with glLoadIdentity() and glRotate() to make your rotation matrices. Then OpenGL can do most of the transformations for you. The stuff you need to do is the rotation and saving of the basis.

PART II:

[This is just getting longer and longer.......5:30am? What the hell am I doing up at this hour? Oh well.]

Ok, now to talk about hierarchical objects, or reference frames....

Remember those child and parent pointers in struct Object? Well, here they are being used....

Suppose we have a tree ADT that contains some objects within it:

NOTE! The lines in this tree are bidirectional!

    UNIVERSE
       |
      Sun
      / \
     /   \
 Camera   Earth
            \
             Moon

Here is a nice description of a universe complete with camera. This means that the camera can sit a distance away from the sun and watch the moon rotate around the earth while that system is rotating around the Sun. Also, in this example the UNIVERSE is the identity matrix with NULL as its parent. And since I want the camera to be free of the hierarchy of the earth, but not that of the Sun, it sits where it does in the tree. Now, in this bit of explanation, I am going to talk about how to calculate the final basis that you must then apply to the original points of the object in question to stick it into view space, and then screen space. Until the bitter end, I am only multiplying matrices together. In the end, you only multiply the final matrix against the original object vertices in question to get the 2D viewport points.

So now, this is how one would transform the Moon into World Space (which is designated as 'the Moon in respect to the UNIVERSE'):

First, some definitions: 
  Mhc - The Moon's Homogeneous Coordinates 4x4 Matrix
  Ehc - The Earth's Homogeneous Coordinates 4x4 Matrix
  Shc - The Sun's Homogeneous Coordinates 4x4 Matrix
  Chc - The Camera's Homogeneous Coordinates 4x4 Matrix
  Uhc - The Universe's Homogeneous Coordinates 4x4 Matrix (Identity)

So then, the operation to create the transformation matrix |Tmoon| that could be applied to the object's vertices is this:

A = Mhc * Ident
B = Ehc * A
C = Shc * B
Tmoon = Uhc * C

The same thing as above, but written differently:

Tmoon = Uhc * Shc * Ehc * Mhc * Ident
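
In C, the multiply chain above is just a walk up the parent pointers. This is a minimal sketch under my own assumptions: struct Node is a cut-down stand-in for the Object struct (only the basis and the parent pointer), the function names are hypothetical, and matrices are row-major float[4][4] acting on column vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the object struct: just what the chain needs. */
struct Node {
    float basis[4][4];
    struct Node *parent;   /* NULL for the root (the UNIVERSE) */
};

/* out = a * b. `out` may be the same array as `a` or `b`. */
static void mat4_mul(float out[4][4], float a[4][4], float b[4][4])
{
    float t[4][4];
    int i, j, k;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++) {
            t[i][j] = 0.0f;
            for (k = 0; k < 4; k++)
                t[i][j] += a[i][k] * b[k][j];
        }
    memcpy(out, t, sizeof t);
}

/* Start from the identity and left-multiply each ancestor's basis on the
   way up, so for the Moon this computes Uhc * Shc * Ehc * Mhc * Ident. */
void node_world_matrix(struct Node *node, float out[4][4])
{
    float acc[4][4] = {
        {1.0f, 0.0f, 0.0f, 0.0f},
        {0.0f, 1.0f, 0.0f, 0.0f},
        {0.0f, 0.0f, 1.0f, 0.0f},
        {0.0f, 0.0f, 0.0f, 1.0f}
    };

    assert(node != NULL);
    for (; node != NULL; node = node->parent)
        mat4_mul(acc, node->basis, acc);

    memcpy(out, acc, sizeof acc);
}
```

The same function gives you the camera's matrix too, since the camera is just another node hanging off the Sun.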

*** Remember the Object -> [0] -> World -> View -> Screen stuff I talked about? Well, this is an actual viewing of the [0] part I described.

Now if we applied Tmoon to the Moon's actual points, we would get the Moon vertices in relation to the UNIVERSE, which is almost what we want. Remember the camera? Well, we want to transform the Moon's points into the Camera's reference frame, with the Camera centered at the origin looking down the negative Z axis, and here is how to do that:

First, we must calculate the camera's Transformation matrix:

Tcamera = Uhc * Shc * Chc * Ident

Second, we must invert the camera's transformation matrix. This inversion "undoes" the rotation and position of the camera:

       -1
Tcamera  = NegateMatrix(Tcamera) // the inverse of a rotation matrix

Third, compute the REAL Tmoon. This matrix, when applied to the Moon's points, will transform them from object space to camera space, which is effectively view space. The inverse camera matrix goes on the left because it is applied last:

               -1
Tmoon = Tcamera   * Tmoon

NOW you can apply Tmoon to the Moon's original vertices to transform it into view space.

P' = Tmoon * P
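
To make the camera step concrete, here is a hedged C sketch that builds the view matrix and pushes one point through it. Since points are column vectors here (P' = T * P), the inverse camera matrix has to sit on the left of the object's world matrix. The helper names are made up, the matrices are assumed to be row-major float[4][4], and the inverse only works for a basis built from rotations and translations.

```c
#include <assert.h>
#include <string.h>

/* out = a * b. `out` may be the same array as `a` or `b`. */
static void mat4_mul(float out[4][4], float a[4][4], float b[4][4])
{
    float t[4][4];
    int i, j, k;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++) {
            t[i][j] = 0.0f;
            for (k = 0; k < 4; k++)
                t[i][j] += a[i][k] * b[k][j];
        }
    memcpy(out, t, sizeof t);
}

/* Inverse of a rigid transform [R | t] is [R^T | -(R^T * t)]. */
static void mat4_rigid_inverse(float out[4][4], float in[4][4])
{
    float r[4][4];
    int i, j;

    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            r[i][j] = in[j][i];
    for (i = 0; i < 3; i++)
        r[i][3] = -(r[i][0] * in[0][3] + r[i][1] * in[1][3] + r[i][2] * in[2][3]);
    r[3][0] = r[3][1] = r[3][2] = 0.0f;
    r[3][3] = 1.0f;
    memcpy(out, r, sizeof r);
}

/* view = inverse(camera_world) * object_world */
void view_matrix(float out[4][4], float camera_world[4][4],
                 float object_world[4][4])
{
    float inv[4][4];

    assert(camera_world[3][3] == 1.0f);   /* expects a plain rigid basis */
    mat4_rigid_inverse(inv, camera_world);
    mat4_mul(out, inv, object_world);
}

/* q = m * p for one object space point (x, y, z, with an implied w of 1). */
void mat4_apply(float m[4][4], const float p[3], float q[3])
{
    int i;
    for (i = 0; i < 3; i++)
        q[i] = m[i][0] * p[0] + m[i][1] * p[1] + m[i][2] * p[2] + m[i][3];
}
```

A quick mental check: a camera sitting at x = 5 and turned to look back at the origin should see an object at the origin 5 units down its negative Z axis.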

If you would want to go straight to the Screen Space coords, then apply the projection matrix to Tmoon first, then apply that resultant matrix to P to get P', in 2D coords.

Then plot the points and you are done.

You can use OpenGL to calculate a lot of this for you, but you must do the multiply chain yourself.

Also, since you are calculating the inverse camera matrix, you don't need to use gluLookAt() anymore. Or you could still use it: feed it the eye position, a point along the view direction, and the up vector pulled from the camera basis, and have OpenGL do that whole last inversion bit for you. You still need the camera basis around anyway.
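
One practical snag when handing your own matrix to OpenGL: glLoadMatrixf() and glMultMatrixf() take 16 floats in column-major order. If you keep the basis as a row-major float[4][4] like the struct earlier (that layout is my assumption, the struct doesn't say), you need to flip it on the way in. A tiny sketch, with a made-up function name:

```c
#include <assert.h>

/* Hypothetical helper: copy a row-major float[4][4] into the flat,
   column-major float[16] layout that OpenGL's matrix calls expect. */
void mat4_to_gl(float m[4][4], float out[16])
{
    int r, c;

    assert(m && out);
    for (r = 0; r < 4; r++)
        for (c = 0; c < 4; c++)
            out[c * 4 + r] = m[r][c];
}
```

After this, the translation ends up in elements 12, 13 and 14 of the flat array, and the result can be passed straight to glLoadMatrixf() or glMultMatrixf().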

ASIDE:
 If you want the moon or any other object to rotate about its axis,
 then apply the incremental rotations as per part one(at least until
 the projection*basis part), then do the above steps.

 Also, I didn't touch clipping to the viewing frustum at all.
 That is a Pandora's box unto itself.
------

WHEW! That was a lot of typing far too late at night. I'm sorry if this is incoherent, but it is getting late. Feel free to ask any questions and I will be glad to answer them for you, or anyone else. From part II I'm sure it shouldn't be too difficult to see how to apply this concept to a level description and a camera moving in it. I'm sure I've missed stuff and probably made mistakes, but this is the right idea. With this and a couple of books like "Computer Graphics: Principles and Practice" by Foley, van Dam, Feiner, and Hughes, you should be fine.

Good luck.