Nice to hear you got it up and running!

The sensitivity problem comes from the mouse. With the Kinect the cursor moves according to your hand, which is much more intuitive, but I just hacked this version together and didn't tweak the mouse movement.
And it is really fast. The first version with PolyVox was running at about 7 frames per second, and now we've brought it up to 60 frames.
I used a bunch of optimizations. It started with moving some code from C# to C++ (all the set operations are done in C++ and are just called from C#). Then I removed every division and replaced it with a bitshift, which was quite effective and brought us ~15 frames. After that I implemented a hybrid approach for the volume: the voxels are stored in one big PolyVox volume (1.5k x 1.5k x 1.5k), and a grid of 100x100x100 blocks is mapped over the volume for surface extraction. When I set a voxel, I calculate the corresponding small block and mark it as "update needed". That block then gets preprocessed, and the extractor fetches the mesh from the preprocessed blocks, but only considers blocks inside the view frustum (there is a rough sketch of that bookkeeping below).
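To make that last part a bit more concrete, here is a minimal C++ sketch (not the actual project code) of the dirty-block bookkeeping: one big volume with a coarse block grid overlaid, setting a voxel flags its containing block, and the extractor only re-meshes flagged blocks. The names, the exact grid dimensions, and the plain division in blockIndex are my own illustrative choices; the real code would additionally skip blocks outside the view frustum and run PolyVox's surface extractor on just that block's region of the volume.

```cpp
#include <array>

// Illustrative sizes only: a ~1.5k^3 volume overlaid with a coarse grid
// (here 100-voxel blocks, so 15 blocks per axis).
constexpr int kVolumeSize = 1500;
constexpr int kBlockSize  = 100;
constexpr int kGridSize   = kVolumeSize / kBlockSize;

struct BlockGrid {
    // One "update needed" flag per block of the coarse grid.
    std::array<bool, kGridSize * kGridSize * kGridSize> dirty{};

    // Map a voxel coordinate to the index of its containing block.
    static int blockIndex(int x, int y, int z) {
        int bx = x / kBlockSize, by = y / kBlockSize, bz = z / kBlockSize;
        return (bz * kGridSize + by) * kGridSize + bx;
    }

    // Called whenever a voxel is set: flag the block for re-extraction.
    void markVoxelChanged(int x, int y, int z) {
        dirty[blockIndex(x, y, z)] = true;
    }

    // Re-mesh only the flagged blocks; a real implementation would also
    // test each block against the view frustum before extracting.
    template <typename ExtractFn>
    void extractDirtyBlocks(ExtractFn&& extract) {
        for (int i = 0; i < static_cast<int>(dirty.size()); ++i) {
            if (dirty[i]) {
                extract(i);        // e.g. run the surface extractor on this block
                dirty[i] = false;  // block is up to date again
            }
        }
    }
};
```

The point of this layout is that a single voxel edit never triggers re-extraction of the whole 1.5k volume, only of the one small block it falls into.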
Again, feel free to grab the code (the wrapper is included as well) and ask me if something is unclear (the comments are quite messy).
Regards!