Wednesday, September 5, 2012

Multithreading on Android 4.0.2 + Tegra 3

Multithreading is one of the most commonly used techniques for accelerating your application. Of course, we assume that your application is parallelizable in the first place, or else you would be looking into the NEON instruction set and other ways to accelerate your code.

Tegra 3 was clearly designed for multitasking, as it comes with 4 (FOUR!) cores, but how do we take advantage of them without spending days and nights debugging?

When we perform multiple tasks in parallel, the first problem we have to handle is synchronization. This is often handled with 'locks'. The details of how to handle race conditions, locks, condition variables, and so on are left for your reading (see below). Additionally, creating a thread for every task can be time consuming, and eventually you may even run out of thread IDs. Thus, we often implement a thread pool: a set of threads (workers) that wait for commands.
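To make the idea concrete, here is a minimal sketch of a thread pool built directly on pthreads: a mutex protects a small task queue, and a condition variable wakes the workers when a task arrives. This is not the cThreadPool code used later in this post; the names `mini_pool`, `mini_pool_init`, `mini_pool_submit`, `mini_pool_shutdown`, and `add_one`, as well as the fixed queue and worker limits, are assumptions made for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP   64  /* assumed fixed queue size, for illustration */
#define MAX_WORKERS 8   /* assumed upper bound on worker threads */

typedef void (*task_fn)(void *);
typedef struct { task_fn fn; void *arg; } task_t;

typedef struct {
    pthread_mutex_t lock;       /* protects every field below */
    pthread_cond_t  not_empty;  /* signalled when a task is queued */
    task_t queue[QUEUE_CAP];    /* ring buffer of pending tasks */
    int head, count;
    int stopping;               /* set once by mini_pool_shutdown */
    pthread_t workers[MAX_WORKERS];
    int n_workers;
} mini_pool;

/* Each worker sleeps on the condition variable until there is work. */
static void *worker_main(void *p)
{
    mini_pool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->count == 0 && !pool->stopping)
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (pool->count == 0) {          /* stopping and queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        task_t t = pool->queue[pool->head];
        pool->head = (pool->head + 1) % QUEUE_CAP;
        pool->count--;
        pthread_mutex_unlock(&pool->lock);
        t.fn(t.arg);                     /* run the task outside the lock */
    }
}

/* Returns 0 on success, -1 on error (call mini_pool_shutdown either way). */
int mini_pool_init(mini_pool *pool, int n_workers)
{
    if (n_workers < 1 || n_workers > MAX_WORKERS) return -1;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    pool->head = pool->count = pool->stopping = 0;
    pool->n_workers = 0;
    for (int i = 0; i < n_workers; i++) {
        if (pthread_create(&pool->workers[i], NULL, worker_main, pool) != 0)
            break;
        pool->n_workers++;
    }
    return pool->n_workers == n_workers ? 0 : -1;
}

/* Returns 0 if queued, -1 if the queue is full or the pool is stopping. */
int mini_pool_submit(mini_pool *pool, task_fn fn, void *arg)
{
    assert(fn != NULL);
    int rc = -1;
    pthread_mutex_lock(&pool->lock);
    if (!pool->stopping && pool->count < QUEUE_CAP) {
        int tail = (pool->head + pool->count) % QUEUE_CAP;
        pool->queue[tail].fn = fn;
        pool->queue[tail].arg = arg;
        pool->count++;
        rc = 0;
        pthread_cond_signal(&pool->not_empty);
    }
    pthread_mutex_unlock(&pool->lock);
    return rc;
}

/* Lets the workers finish all queued tasks, then joins them. */
void mini_pool_shutdown(mini_pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->stopping = 1;
    pthread_cond_broadcast(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->n_workers; i++)
        pthread_join(pool->workers[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->not_empty);
}

/* Example task: adds 1 to the int that arg points to, under its own lock. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
void add_one(void *arg)
{
    pthread_mutex_lock(&counter_lock);
    (*(int *)arg)++;
    pthread_mutex_unlock(&counter_lock);
}
```

A caller would run `mini_pool_init(&pool, 4)` once, call `mini_pool_submit(&pool, add_one, &total)` for each piece of work, and finish with `mini_pool_shutdown(&pool)`. The threads are created only once, which is exactly the cost a pool is meant to avoid paying per task.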

In this tutorial, I will show you how to multithread your application in 10 minutes, for a potential 3x speedup. We will demonstrate how you can easily obtain real-time results by using all the cores of a mobile processor!


Using the cThreadPool
Sometimes simplicity is just gold. I found this small thread pool implementation, cThreadPool (http://sourceforge.net/projects/cthreadpool/files/), and here is how I integrated it into our capture flow:
     
     // Running frame number, used to tag each captured frame.
     static int counter = 0;
     counter++;

     // Package the frame. Once queued, the worker task owns this memory.
     RGBpack *filepack = (RGBpack*)(malloc(sizeof(RGBpack)));
     if(filepack==NULL){
      return false;
     }
     filepack->frame_count=counter;
     filepack->width=IMAGE_WIDTH;
     filepack->height=IMAGE_HEIGHT;
     //filepack->rgb_data = (unsigned char*)malloc(IMAGE_WIDTH*IMAGE_HEIGHT*3*sizeof(unsigned char));
     filepack->depth_data = (unsigned short*)malloc(IMAGE_WIDTH*IMAGE_HEIGHT*sizeof(unsigned short));
     if(filepack->depth_data==NULL){
      free(filepack);
      return false;
     }
     //ni_wrapper->getRGB(filepack->rgb_data);
     ni_wrapper->getDepth(filepack->depth_data);

     // Hand the frame to the pool; on success fast_task frees the memory.
     int ret = threadpool_add_task(pool,fast_task,filepack,1);

     if(ret==-1){
      __android_log_write(ANDROID_LOG_INFO, "THREAD POOL:", "POOL ERROR?\n");
     }
     if(ret==-2){
      __android_log_write(ANDROID_LOG_INFO, "THREAD POOL:", "FAILED to add task, pool full?\n");
     }
     // The task was not queued, so we still own the frame and must free it.
     if(ret==-1 || ret==-2){
      //free(filepack->rgb_data);
      free(filepack->depth_data);
      free(filepack);
     }
As we can see, we package each frame we receive from the Kinect and add it to the task list. The worker threads in the pool automatically pick up tasks whenever they are free, and our job is done. One thing we have to watch out for is memory usage: it does get memory hungry if we queue too many tasks at the same time! Not surprisingly, we can now achieve real-time capture on the Tegra 3! Check it out :)
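One way to keep memory in check is to cap the number of frames that are allocated but not yet processed, and to drop a frame when the cap is reached. The following is only a sketch under assumptions: `frame_pack` is a hypothetical stand-in for `RGBpack`, and `MAX_IN_FLIGHT`, `frame_acquire`, `frame_release`, and `frames_in_flight` are made-up names that are not part of cThreadPool or of the code in the repository.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_IN_FLIGHT 8   /* assumed cap; tune to the memory you can spare */

/* Hypothetical stand-in for the RGBpack structure used above. */
typedef struct {
    int frame_count;
    int width, height;
    unsigned short *depth_data;
} frame_pack;

/* Number of frames allocated but not yet released by a worker. */
static atomic_int in_flight = 0;

/* Allocates a frame, or returns NULL if the cap is reached or malloc fails. */
frame_pack *frame_acquire(int frame_count, int width, int height)
{
    assert(width > 0 && height > 0);

    /* Reserve a slot first, and back out if that put us over the cap. */
    if (atomic_fetch_add(&in_flight, 1) >= MAX_IN_FLIGHT) {
        atomic_fetch_sub(&in_flight, 1);
        return NULL;
    }

    frame_pack *p = malloc(sizeof *p);
    if (p != NULL)
        p->depth_data = malloc((size_t)width * height * sizeof *p->depth_data);
    if (p == NULL || p->depth_data == NULL) {
        free(p);                          /* free(NULL) is a no-op */
        atomic_fetch_sub(&in_flight, 1);
        return NULL;
    }

    p->frame_count = frame_count;
    p->width = width;
    p->height = height;
    return p;
}

/* Frees a frame and gives its slot back. Safe to call from any thread. */
void frame_release(frame_pack *p)
{
    if (p == NULL) return;
    free(p->depth_data);
    free(p);
    atomic_fetch_sub(&in_flight, 1);
}

/* Current number of outstanding frames, e.g. for logging. */
int frames_in_flight(void)
{
    return atomic_load(&in_flight);
}
```

In a capture loop like the one above, `frame_acquire` would take the place of the two `malloc` calls, and `frame_release` would be called at the end of the worker task and on the error paths where the task could not be queued. A `NULL` return then simply means "skip this frame", which trades a dropped frame for a bounded memory footprint.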




Raw PNG outputs from the Kinect:


Excuse my dance moves. Scroll down quickly to see what we have captured in 2 seconds.

Some more, randomly selected from the set:

Coming up: 
........


Source code:

svn co https://openvidia.svn.sourceforge.net/svnroot/openvidia/tegra_kinect multitouch

Note: the source code in the repository may be broken from time to time. Please email me if you run into any problems.


Reading:
https://computing.llnl.gov/tutorials/pthreads/

Linux Tutorial:
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
