WIDGG'S RESEARCH: September 2011

I don't have anything new to show right now, but I've been busy this past week. I'm currently implementing a more dynamic version of my renderer. CUDA is very nice, but when you need to quickly add new modules or make rapid tweaks, it's not the best solution. So, for now, the plan is to have a slower renderer that is easier to change and adapt. Once we're sure which features we really want to keep and we know the full limitations of our system, an implementation in CUDA will be possible.

One of the reasons for changing the renderer is that I needed some C++ features like virtual functions, which are very useful for rapidly designing a new system. I think there are some languages that let you create virtual functions in CUDA or OpenCL by simply adding some hidden pieces of code.

If it doesn't exist, here's how I'd do it:

First, in CUDA, there are no function pointers. But if we take a look at how polymorphism works, it's really just a way to disguise procedural code. Actually, the whole object-oriented programming scheme is just a way to disguise procedural code by hiding some information from the programmer so (s)he doesn't have to worry about it. Many other languages do the same (Java is the best-known example, because it hides a lot of safety features, such as the checks done at run time to prevent buffer overflows and avoid major problems).

So, if we have:

class A
{
private:
int value;
public:
A(void);
virtual ~A(void);
virtual void test(T param1, U param2);
};

Then this piece of code is actually converted into something that looks basically like this:

// prototypes
struct class_A;
void __class_A_constructor(struct class_A *this);
void __class_A_destructor(struct class_A *this);
void __class_A_test(struct class_A *this, T param1, U param2);
struct class_A_virtual_table
{
void (*class_A_destructor)(struct class_A *this);
void (*class_A_test)(struct class_A *this, T param1, U param2);
};

// create a single instance of the virtual table and assign the values for the function pointers
struct class_A_virtual_table class_A_virtual_table_only_instance_needed =
{
  __class_A_destructor,
  __class_A_test
};
struct class_A
{
struct class_A_virtual_table *VT;
int value;
};
void __class_A_constructor(struct class_A *this)
{
  this->value = 0; // or some other default value
  this->VT = &class_A_virtual_table_only_instance_needed;
}
void __class_A_destructor(struct class_A *this)
{
// stuff to destroy
}
void __class_A_test(struct class_A *this, T param1, U param2)
{
// stuff to test
}
// virtual functions: VT is a pointer, so the table is reached with ->
void virtual_class_A_destructor(struct class_A *this)
{
  this->VT->class_A_destructor(this);
}
void virtual_class_A_test(struct class_A *this, T param1, U param2)
{
  this->VT->class_A_test(this, param1, param2);
}

And then, with a subclass B, we have:

class B: public A
{
private:
double value;
public:
B(void);
~B(void);
void test(T param1, U param2); // version of test but for class B
void function_in_B(void);
};

int main(void)
{
B b;
A *a = &b;
a->test(1, 2); // let's say that types T and U are integers...
b.function_in_B();
}

Then, all of this becomes:

struct class_B
{
struct class_A super;
double value;
};
// prototypes for class B
void __class_B_constructor(struct class_B *this);
void __class_B_destructor(struct class_B *this);
void __class_B_test(struct class_B *this, T param1, U param2);
void __class_B_function_in_B(struct class_B *this);
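// virtual table for B, with the virtual functions for B
// (the casts are needed because B's functions take a struct class_B *)
struct class_A_virtual_table class_B_virtual_table_only_instance_needed =
{
  (void (*)(struct class_A *))__class_B_destructor,
  (void (*)(struct class_A *, T, U))__class_B_test
};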
void __class_B_constructor(struct class_B *this)
{
  __class_A_constructor(&(this->super)); // always call the parent constructor first!
  this->super.VT = &class_B_virtual_table_only_instance_needed; // replace the virtual table
  this->value = 0.0; // or some other default value
}
void __class_B_destructor(struct class_B *this)
{
// stuff to destroy in B
__class_A_destructor(&(this->super)); // then, destroy the stuff in the parent
}
void __class_B_test(struct class_B *this, T param1, U param2)
{
// stuff to test, but for B
}
void __class_B_function_in_B(struct class_B *this)
{
// whatever that function does!
}
int main(void)
{
struct class_B b;
  __class_B_constructor(&b);
struct class_A *a = &(b.super); // the address of the parent in struct class_B
  virtual_class_A_test(a, 1, 2);
   __class_B_function_in_B(&b);
  __class_B_destructor(&b);
}

So, as you can see, polymorphism uses function pointers to create the illusion that the right function is called each time.
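To check that the translation above really behaves like a virtual call, here is a minimal sketch in plain C that compiles on its own. It is only an illustration under a few assumptions of mine: T and U are both int, test() returns an int so the caller can see which version ran, the function names (class_A_test_impl and so on) are hypothetical, and the object pointer is called self instead of this so the same file also builds as C++.

```c
/* Virtual-table dispatch written by hand (illustrative sketch). */
struct class_A;

struct class_A_virtual_table
{
    int (*test)(struct class_A *self, int param1, int param2);
};

struct class_A
{
    const struct class_A_virtual_table *VT;
    int value;
};

struct class_B
{
    struct class_A super; /* first member, so a class_B* can be viewed as a class_A* */
    double value;
};

/* A's version of test(): hypothetical behaviour, returns the sum. */
static int class_A_test_impl(struct class_A *self, int param1, int param2)
{
    (void)self;
    return param1 + param2;
}

/* B's version of test(): hypothetical behaviour, returns the product.
   It receives a class_A* and casts it back to the class_B it really is. */
static int class_B_test_impl(struct class_A *self, int param1, int param2)
{
    struct class_B *b = (struct class_B *)self;
    (void)b;
    return param1 * param2;
}

/* One table per class, shared by every instance of that class. */
static const struct class_A_virtual_table class_A_vtable = { class_A_test_impl };
static const struct class_A_virtual_table class_B_vtable = { class_B_test_impl };

void class_A_construct(struct class_A *self)
{
    self->VT = &class_A_vtable;
    self->value = 0;
}

void class_B_construct(struct class_B *self)
{
    class_A_construct(&self->super);  /* parent constructor first */
    self->super.VT = &class_B_vtable; /* then replace the virtual table */
    self->value = 0.0;
}

/* The "virtual" entry point: one indirect call through the table. */
int virtual_class_A_test(struct class_A *self, int param1, int param2)
{
    return self->VT->test(self, param1, param2);
}
```

Calling virtual_class_A_test(&b.super, 3, 4) on a constructed class_B runs B's version, while the same call on a plain class_A runs A's version. I gave B's function a class_A* parameter and cast inside it, which avoids having to cast the function pointers stored in the table.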

But without function pointers, we need to recreate that illusion some other way. One way to do that in CUDA/OpenCL is to give each class a unique ID and have the constructor assign that ID to every instance of the class. So:

enum CLASS_IDS { CLASS_A_ID, CLASS_B_ID, ..., CLASS_X_ID };
struct class_A
{
int ID;
int value;
};
void __class_A_constructor(struct class_A *this)
{
this->ID = CLASS_A_ID;
  this->value = 0; // or some other default value
}
struct class_B
{
struct class_A super;
double value;
};
void __class_B_constructor(struct class_B *this)
{
  __class_A_constructor(&(this->super)); // always call the parent constructor first!
  this->super.ID = CLASS_B_ID; // overwrite the ID (it lives in the parent struct)
  this->value = 0.0; // or some other default value
}

Finally, write the virtual functions like this:

void virtual_class_A_destructor(struct class_A *this)
{
  switch(this->ID)
  {
  case CLASS_A_ID:
    __class_A_destructor(this);
    break;
  case CLASS_B_ID:
    __class_B_destructor((struct class_B*)this);
    break;
  // ...
  case CLASS_X_ID:
    __class_X_destructor((struct class_X*)this);
    break;
  }
}

So, with a switch statement, it's possible to replace the virtual table. A compiler that takes C++ code and converts it to CUDA/OpenCL would therefore be able to do polymorphism with that approach. It's much slower than a function pointer, but the result of the computation would be the same.
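The ID-based dispatch described above can be sketched the same way, again in plain C with no function pointers at all. The assumptions are mine: T and U are int, test() returns an int so the result can be checked, only two classes exist, the function names are hypothetical, and -1 is an arbitrary value returned for an unknown ID.

```c
/* Class-ID dispatch: a switch replaces the virtual table (illustrative sketch). */
enum CLASS_IDS { CLASS_A_ID, CLASS_B_ID };

struct class_A
{
    int ID;
    int value;
};

struct class_B
{
    struct class_A super; /* first member, so the cast in the switch is valid */
    double value;
};

void class_A_construct(struct class_A *self)
{
    self->ID = CLASS_A_ID;
    self->value = 0;
}

void class_B_construct(struct class_B *self)
{
    class_A_construct(&self->super); /* parent constructor first */
    self->super.ID = CLASS_B_ID;     /* then overwrite the ID */
    self->value = 0.0;
}

/* A's version of test(): hypothetical behaviour, returns the sum. */
int class_A_test_impl(struct class_A *self, int param1, int param2)
{
    (void)self;
    return param1 + param2;
}

/* B's version of test(): hypothetical behaviour, returns the product. */
int class_B_test_impl(struct class_B *self, int param1, int param2)
{
    (void)self;
    return param1 * param2;
}

/* The "virtual" entry point: pick the implementation from the stored ID. */
int virtual_class_A_test(struct class_A *self, int param1, int param2)
{
    switch (self->ID)
    {
    case CLASS_A_ID:
        return class_A_test_impl(self, param1, param2);
    case CLASS_B_ID:
        return class_B_test_impl((struct class_B *)self, param1, param2);
    default:
        return -1; /* unknown ID: arbitrary error value for this sketch */
    }
}
```

From the caller's side nothing changes compared to the virtual-table version: virtual_class_A_test(&b.super, 3, 4) still ends up in B's code. The cost is that every new class needs a new case in every virtual function, which is why this is the kind of thing you'd want a compiler to generate rather than write by hand.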

I think such compilers already exist, but I've never used one. They probably use a method similar to the one above to emulate virtual tables.

And for those of you who didn't know how polymorphism works, you now have a better idea of what the compiler has to do to convert classes and their virtual functions into machine code. I hope this post helps a bit with that. Otherwise, there are a couple of resources available on the web that describe the process clearly.

If any of you know of a compiler (free if possible) that converts basic C++ code to CUDA or OpenCL, and that also integrates some of the features of those languages, like synchronization between threads of the same block and atomic operations, I'd really like to see it and eventually use it. It must work on Linux!

2011/09/30
