Sunday, January 15, 2012

Real-time PointCloud Data Visualization on Tegra 2

In this tutorial, I will explain the last piece of the puzzle in creating a high-performance 3D visualization of the Kinect range data on the Tegra 2 platform. Even with the full 3D rendering, we can still achieve approximately 8 fps! Pretty impressive. It is definitely a usable frame rate for real-time interaction.

OpenGL ES 2.0 provides a simple interface for rendering 3D points efficiently. Point cloud data are basically 3D vertices with color information, e.g., (x, y, z, r, g, b). The Kinect camera, however, streams the data as (r, g, b, d). To recover the (x, y, z) coordinates, we simply back-project the depth values using the intrinsic parameters of the camera. We can offload this work to the GPU easily by writing another shader program. See the comments below.
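
For reference, the back-projection itself is just the inverse pinhole model. Below is a minimal CPU-side sketch of the same math that the vertex shader further down performs (the function name is purely illustrative; fx, fy are the focal lengths and cx, cy the principal point of the depth camera from Tutorial #1):

//Back-project a depth pixel (u, v) with depth d (in meters) to a 3D point.
//This mirrors the computation done in the vertex shader shown later.
void back_project(float u, float v, float d,
                  float fx, float fy, float cx, float cy,
                  float &x, float &y, float &z)
{
 x = d * (u - cx) / fx; //horizontal offset from the principal point, scaled by depth
 y = d * (v - cy) / fy; //vertical offset from the principal point, scaled by depth
 z = d;                 //the depth value is the z coordinate directly
}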

To push the data onto the screen, we first load the rgbd data into an array, and then construct a perspective matrix manually (see any OpenGL perspective projection tutorial). We have also added a bounding box to visualize the projection of the data and the bounded area, as shown in the demo video.

void Renderer::displayOverlay2(){
 glViewport(0, 0, screen_width, screen_height);

 double max_range = 5.0; //5 meters = max range?

 //load the depth map
 GLfloat *point_vertices = myVertices;
 float *depth_float_ptr = depth_float_info;
 //overwrite the depth buffer
 for (float j = 0; j < IMAGE_HEIGHT; j++) {
  for (float i = 0; i < IMAGE_WIDTH; i++) {
   float real_depth = *depth_float_ptr; //depth value in meters from the Kinect
   if (real_depth <= 0)
    real_depth = 9999; //invalid depth: push the point far away
   *(point_vertices + 0) = i;          //pixel column (u)
   *(point_vertices + 1) = j;          //pixel row (v)
   *(point_vertices + 2) = real_depth; //depth (z), back-projected in the vertex shader
   point_vertices += 3;
   depth_float_ptr++;
  }
 }

 glEnable(GL_DEPTH_TEST);
 glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);

 glUseProgram(overlayProgram);

 //Comment the lines below to disable rotation
 iXangle = 5.0+user_input_x/5.0;
 //iYangle += 5;
 iZangle = -180;
 iYangle += 1;
 //reset it
 if(iYangle>270){
  iYangle=90;
 }
 //rotate - begin
 rotate_matrix(iXangle, 1.0, 0.0, 0.0, aModelView);
 rotate_matrix(iYangle, 0.0, 1.0, 0.0, aRotate);
 multiply_matrix(aRotate, aModelView, aModelView);
 rotate_matrix(iZangle, 0.0, 0.0, 1.0, aRotate);
 multiply_matrix(aRotate, aModelView, aModelView);

 //translate_matrix(user_input_x,user_input_y,-5,aModelView);
 //Pull the camera back from the geometry

 aModelView[12] = 0;
 aModelView[13] = -0.5;
 aModelView[14] -= 50;

 //use a very small FOV (~5 degrees); an orthographic projection would arguably be a better fit
 perspective_matrix(_PI/36.0, (double)screen_width/(double)screen_height, 1, 300.0, aPerspective);
 multiply_matrix(aPerspective, aModelView, aMVP);


 glUniformMatrix4fv(glGetUniformLocation(overlayProgram, "MVPMat"), (GLsizei)1, GL_FALSE, aMVP);
 // Load the vertex data
 glVertexAttribPointer(glGetAttribLocation(overlayProgram, "vPosition"), 3, GL_FLOAT, GL_FALSE, 0, myVertices);
 glVertexAttribPointer(glGetAttribLocation(overlayProgram, "a_color"), 4, GL_UNSIGNED_BYTE, GL_TRUE, 0, processed_data);

 glEnableVertexAttribArray(glGetAttribLocation(overlayProgram, "vPosition"));
 glEnableVertexAttribArray(glGetAttribLocation(overlayProgram, "a_color"));
 glDrawArrays(GL_POINTS, 0, NUM_VERTICES);

 //this will get transformed by the vertex program
 float cube_depth = 7;
 float cube_width = 640;
 float cube_height = 480;
 float min_front = 0.6;
 GLfloat cubeVertex[]={
   //front
   cube_width, cube_height, 0.0,
   0, cube_height, 0.0,
   0, 0, 0.0,
   cube_width, 0, 0.0,
   cube_width, cube_height, 0.0,
   //right
   cube_width, cube_height, 0.0,
   cube_width, cube_height, cube_depth,
   cube_width, 0, cube_depth,
   cube_width, 0, 0.0,
   cube_width, cube_height, 0.0,
   //left
   0, cube_height, 0.0,
   0, cube_height, cube_depth,
   0, 0, cube_depth,
   0, 0, 0.0,
   0, cube_height, 0.0,
   //up (y = cube_height plane)
   cube_width, cube_height, 0.0,
   0, cube_height, 0.0,
   0, cube_height, cube_depth,
   cube_width, cube_height, cube_depth,
   cube_width, cube_height, 0.0,
   //down (y = 0 plane)
   cube_width, 0, 0.0,
   0, 0, 0.0,
   0, 0, cube_depth,
   cube_width, 0, cube_depth,
   cube_width, 0, 0.0,
   //back
   cube_width, cube_height, cube_depth,
   0, cube_height, cube_depth,
   0, 0, cube_depth,
   cube_width, 0, cube_depth,
   cube_width, cube_height, cube_depth,
 };
 GLfloat cubeColor[]={
   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0,

   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0,

   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0,

   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0,

   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0,

   1.0, 0.0, 0.0, 1.0,
   0.0, 1.0, 0.0, 1.0,
   0.0, 0.0, 1.0, 1.0,
   1.0, 1.0, 0.0, 1.0,
   1.0, 0.0, 1.0, 1.0
 };
 glVertexAttribPointer(glGetAttribLocation(overlayProgram, "a_color"), 4, GL_FLOAT, GL_FALSE, 0, cubeColor);
 glVertexAttribPointer(glGetAttribLocation(overlayProgram, "vPosition"), 3, GL_FLOAT, GL_FALSE, 0, cubeVertex);
 glEnableVertexAttribArray(glGetAttribLocation(overlayProgram, "vPosition"));
 glEnableVertexAttribArray(glGetAttribLocation(overlayProgram, "a_color"));
//draw the bounding box to visualize the boundary.
 glDrawArrays(GL_LINE_STRIP, 0, 5*6); //6 faces

 glDisable(GL_DEPTH_TEST);
}
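
For reference, the perspective_matrix helper used above builds the standard OpenGL frustum projection by hand, since OpenGL ES 2.0 has no gluPerspective. Here is a minimal sketch (column-major, equivalent to what gluPerspective would produce, assuming <math.h> and the GLES2 headers are included); the implementation in the source may differ in details:

//Build a standard perspective projection matrix (column-major, 4x4).
//fovy is the vertical field of view in radians, aspect = width / height.
void perspective_matrix(double fovy, double aspect, double znear, double zfar, GLfloat *matrix)
{
 double f = 1.0 / tan(fovy / 2.0);
 for (int i = 0; i < 16; i++)
  matrix[i] = 0.0f;
 matrix[0]  = (GLfloat)(f / aspect);
 matrix[5]  = (GLfloat)f;
 matrix[10] = (GLfloat)((zfar + znear) / (znear - zfar));
 matrix[11] = -1.0f;
 matrix[14] = (GLfloat)((2.0 * zfar * znear) / (znear - zfar));
}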

Then, on the shader side, we construct the (x, y, z) coordinates on the GPU using the depth data and the camera calibration parameters we found previously (see Tutorial #1).

uniform mat4 MVPMat;    // Model-View-Projection matrix
attribute vec4 vPosition; 
attribute vec4 a_color;
varying vec4 v_color;

const float fx_d = 5.5879981950414015e+02;
const float fy_d = 5.5874227168094478e+02;
const float cx_d = 3.1844162327317980e+02;
const float cy_d = 2.4574257294583529e+02;

void main()
{
    //perform the transformation on GPU
    //transform the position so it will remap back to the real-world coordinate...
    float z = vPosition.z; //depth 
    float x = z * (vPosition.x-cx_d) / fx_d; 
    float y = z * (vPosition.y-cy_d) / fy_d;
    float w = 1.0;
    gl_Position = MVPMat*vec4(x,y,z,w); //perspective transformation
    gl_PointSize = 1.0; //we can make the points bigger or smaller with this
    v_color = a_color;
}

Then, the fragment shader simply handles the rest by setting the color.

precision mediump float;
varying vec4 v_color;
void main()
{   
    gl_FragColor = v_color;
}

I highly recommend downloading the source code and reading it through; it is just too difficult to explain everything in one tutorial. However, I will list a few keywords to make these topics easier to search for: vertex and fragment shaders, perspective transformation, OpenGL ES 2.0, 3D point cloud data, etc.


Thoughts:
The configuration and fine-grained data handling are rather tricky on the GPU! The main problem I had was the lack of APIs for handling the projection matrices (everything is manual!). That inevitably forces you to go back to OpenGL 101 and start the reading exercise again. I will leave the explanation of how OpenGL handles 3D data, and how to construct the projection matrices, as your homework (see related work and references). In this tutorial, I basically show you how to set up the projection matrices, how to write a function that pushes the 3D vertices to the graphics card efficiently, and how to assign a proper color to each of them. With these, you have opened up endless options for rendering 3D data on screen! Most importantly, I found that the finer control gives me many advantages over the older fixed-function OpenGL pipeline. That is just a trade-off, after all.
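
To give a sense of what "manual" means, here is roughly what a helper like multiply_matrix boils down to: a plain column-major 4x4 multiply, written to a temporary so the output may safely alias an input (as in the calls in displayOverlay2 above). The actual helpers in the source are equivalent in spirit, though the details may differ:

//Multiply two 4x4 column-major matrices: result = a * b.
//Accumulate into a temporary first so 'result' may alias 'a' or 'b'.
void multiply_matrix(const GLfloat *a, const GLfloat *b, GLfloat *result)
{
 GLfloat tmp[16];
 for (int col = 0; col < 4; col++) {
  for (int row = 0; row < 4; row++) {
   GLfloat sum = 0.0f;
   for (int k = 0; k < 4; k++)
    sum += a[k * 4 + row] * b[col * 4 + k];
   tmp[col * 4 + row] = sum;
  }
 }
 for (int i = 0; i < 16; i++)
  result[i] = tmp[i];
}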


Performance:

The performance is rather mediocre, mainly because of the 'memory-copy-back' step (i.e., copying the data from the graphics card back to a memory buffer that is usable by the process). Almost 30% of the runtime is spent converting this buffer! What a drag! So far I have found no way to work around this problem, and I will leave it as an open problem for NVIDIA to solve. I believe we could hack the GraphicsBuffer, but that option would hurt the future development of this project due to lack of compatibility with future releases.

OpenCV on Tegra 2

I've also cross-compiled OpenCV 2.3.1 to run on the Tegra 2. The performance was definitely not significantly better than what I have written so far. Maybe OpenCV is just a lazy way to do computer vision on Tegra 2? ;) Check out the source code and you will find sample usage of the library, including Canny edge detection and some other simple image processing.
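
For illustration, the Canny step boils down to something like the following (the buffer names and threshold values here are placeholders, not the exact ones used in the source):

#include <opencv2/opencv.hpp>

//Run Canny edge detection on one 640x480 8-bit grayscale frame.
//The cv::Mat headers wrap the existing buffers without copying them.
void detect_edges(unsigned char *image_data, unsigned char *edge_out)
{
 cv::Mat gray(480, 640, CV_8UC1, image_data);  //wraps the input buffer, no copy
 cv::Mat edges(480, 640, CV_8UC1, edge_out);   //wraps the output buffer
 cv::Mat blurred;
 cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.5); //smooth first to reduce noise
 cv::Canny(blurred, edges, 50.0, 150.0);               //hysteresis thresholds
}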

Sample Videos:

Note: the application now runs at 8 fps at 640x480 with all of the processing enabled.

Donation:
To continue my development on this project, I believe the best way forward is better hardware or donations.

I hope this tutorial helps others get efficient 3D rendering on the Tegra 2 platform, especially those who are interested in using the Kinect camera for input!

Source Code:
svn co https://openvidia.svn.sourceforge.net/svnroot/openvidia/tegra_kinect multitouch
