Marching Cubes on the GPU in Unity

You may have noticed that there has not been much code posted here lately. After messing about with Unity for a few years I’ve finally decided to start a project of my own. It’s been progressing well, but this week I decided it was time for a break and perhaps time to do something more interesting than Direct Compute tutorials. After thinking about it for a while I thought a GPU version of the marching cubes algorithm could be interesting, and it ties in well with the Direct Compute tutorials. This project generates both the voxels and the marching cubes mesh on the GPU using compute shaders. This does mean that the project is DX11 only, but it will work in Unity Indie as I have stuck to using buffers only, with no render textures. The voxels are generated using a version of the improved Perlin noise done previously, but running in a compute shader.

For practical reasons it may be best to do this sort of thing using multithreading on the CPU, but I thought it would still be useful to have a version running on the GPU. I have set up a few scenes to show some of the approaches that are possible. The included scenes are as follows.

Marching Cubes GPU

This scene generates 3D noise as the voxel values and then uses the marching cubes algorithm to convert the voxels to a mesh. As the mesh is generated using a compute shader, the vertices are written to a compute buffer. Unity’s draw procedural method is then used to render the buffer. For the rendering I am not using a surface shader, as it seems you cannot use a surface shader with draw procedural in Unity. I’m not 100% sure on this. Maybe there is a way, but I have not found any examples of how to do so.
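To make the per-voxel step a little more concrete, here is a minimal CPU-side sketch in plain Python (not the project's compute shader) of the two core operations in marching cubes: building the 8-bit case index from the corner values, and interpolating where the surface crosses an edge. The function names, the corner ordering and the convention that values below the iso level count as "inside" are all assumptions for illustration. The triangle lookup tables are left out.

```python
def cube_case_index(corner_values, iso_level=0.0):
    """Return the 8-bit case index for one cell.

    Bit i is set when corner i is below iso_level (an assumed convention;
    some implementations use the opposite test).
    """
    index = 0
    for i, value in enumerate(corner_values):
        if value < iso_level:
            index |= 1 << i
    return index


def interpolate_edge(p0, p1, v0, v1, iso_level=0.0):
    """Linearly interpolate the point where the surface crosses edge p0-p1."""
    if abs(v1 - v0) < 1e-12:
        # Both ends have (nearly) the same value, so just use the first corner.
        return tuple(float(a) for a in p0)
    t = (iso_level - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


if __name__ == "__main__":
    # Only corner 0 is inside, so only bit 0 is set.
    print(cube_case_index([-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]))
    # The surface crosses half way along this edge.
    print(interpolate_edge((0, 0, 0), (1, 0, 0), -1.0, 1.0))
```

In a full implementation the case index selects which edges are cut and how the interpolated points are joined into triangles, up to 5 per cell.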

Using a buffer does present some issues. The buffer has to be a fixed size, but the number of vertices generated is unknown until the algorithm is run. To get around this I created a buffer that can hold the maximum number of vertices that marching cubes can generate, which is 5 triangles (15 vertices) per voxel. When the algorithm runs only the generated vertices are written to the buffer, leaving all other vertices in the buffer zero filled. This means that when rendering, the vertex shader still runs for every vertex in the buffer, but as the zero-filled triangles have no area no fragments are generated for them. This is a bit inefficient, but I have compared this method with a buffer converted to a normal Unity mesh and there does not seem to be any difference in rendering performance. Maybe in a more complex scene it could become an issue. To get around this and only have the generated vertices in the buffer, maybe some sort of setup with an append buffer or counter buffer (to generate indices for the vertices) could be used. Considering append buffers seem to be a bit buggy in Unity at the moment, I thought it was best to stick to a normal buffer.
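As a rough illustration of what the worst-case buffer costs, the following Python sketch works out the vertex count and size in bytes for a cube of N voxels per side. The figure of 7 floats per vertex is an assumption taken from the code a reader quotes in the comments (`sizeof(float)*7`), and the function names are made up for this example. It also shows why a zero-filled triangle produces no fragments: its area is exactly zero.

```python
import math

MAX_TRIANGLES_PER_VOXEL = 5  # worst case for marching cubes
VERTS_PER_TRIANGLE = 3
FLOATS_PER_VERTEX = 7        # assumption, taken from the code quoted in the comments
BYTES_PER_FLOAT = 4


def max_vertex_count(n):
    """Worst-case number of vertices for an n x n x n block of voxels."""
    return n * n * n * MAX_TRIANGLES_PER_VOXEL * VERTS_PER_TRIANGLE


def buffer_size_bytes(n):
    """Size of a buffer large enough to hold the worst case."""
    return max_vertex_count(n) * FLOATS_PER_VERTEX * BYTES_PER_FLOAT


def triangle_area(a, b, c):
    """Area of a triangle from half the length of the edge cross product."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(k * k for k in cross))


if __name__ == "__main__":
    for n in (32, 48):
        print(n, max_vertex_count(n), buffer_size_bytes(n))
    # A zero-filled (unused) triangle has no area, so it rasterizes to nothing.
    zero = (0.0, 0.0, 0.0)
    print(triangle_area(zero, zero, zero))
```

Under these assumptions a 32 cubed volume needs room for 491,520 vertices (about 13 MB) and a 48 cubed volume for 1,658,880 vertices (about 46 MB), most of which will normally stay empty.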

Marching Cubes GPU 4D noise

This scene is a bit of a performance test. The voxel generation and the marching cubes pass are run every frame, using 4D noise to animate the mesh over time. Doing this every frame is quite demanding for the GPU and, unsurprisingly, it can be a bit slow. On my computer I can only generate a cube of voxels that’s 40 cubed at 60 fps. Any more than that and the frame rate starts to drop to unusable levels. It does look rather cool to see the mesh move over time, however.
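The idea of animating with a fourth coordinate can be sketched on the CPU in a few lines of Python. This is only an illustration: `field_4d` is a made-up smooth function standing in for 4D Perlin noise, and the names and the `scale` parameter are assumptions, not the project's code. The last helper shows the amount of work implied by regenerating a 40 cubed volume at 60 fps.

```python
import math


def field_4d(x, y, z, t):
    """Stand-in for 4D noise: a smooth scalar field that changes with time t."""
    return math.sin(x + t) * math.cos(y - 0.5 * t) + math.sin(z + 0.25 * t)


def generate_voxels(n, t, scale=0.2):
    """Sample the field on an n x n x n grid at time t."""
    return [
        [
            [field_4d(x * scale, y * scale, z * scale, t) for z in range(n)]
            for y in range(n)
        ]
        for x in range(n)
    ]


def voxels_per_second(n, fps):
    """Number of voxel samples needed per second when regenerating every frame."""
    return n * n * n * fps


if __name__ == "__main__":
    frame_a = generate_voxels(4, t=0.0)
    frame_b = generate_voxels(4, t=1.0)
    # The same grid point takes a different value on a later frame.
    print(frame_a[0][0][0], frame_b[0][0][0])
    print(voxels_per_second(40, 60))
```

At 40 cubed and 60 fps that is 3,840,000 noise samples per second before any marching cubes work is done, which gives some feel for why larger volumes quickly become too slow.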

Marching Cubes GPU Mesh Read Back

This scene generates the voxels and runs marching cubes the same way as the first scene. The difference here is that the data in the buffer is read back from the GPU and used to create a normal Unity mesh with a collider. Obviously this is more practical, but the read back does come at a cost. This method is still much faster than generating the mesh on the CPU (non-threaded). Using the previous marching cubes project that runs on the CPU it takes about 120ms to generate a 32 cubed voxel volume. Using this GPU version it takes about 60ms, not counting the time to make a collider, which is about 20ms. A lot of that time, however, is the generation of the Unity mesh. Just reading back the buffer takes about 40ms. Obviously you could optimize this much more than this simple implementation does. A good idea would be to only read back what is needed, or even to just render the mesh as a buffer and only read back and make a collider for what’s needed.
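One way to avoid building a mesh out of the unused part of the buffer is to drop the zero-filled triangles after the read back. The Python sketch below shows that compaction step on plain lists. It is an assumption about how this could be done, not a description of what the project does, and `compact_triangles` is a name made up for the example.

```python
def compact_triangles(vertices):
    """Drop zero-filled triangles from a read-back vertex list.

    vertices is a flat list of (x, y, z) tuples, three per triangle.
    Returns (positions, indices) containing only the triangles that
    were actually written by the marching cubes pass.
    """
    positions = []
    indices = []
    for i in range(0, len(vertices) - 2, 3):
        triangle = vertices[i:i + 3]
        # An unused slot has all nine components equal to zero.
        if all(component == 0.0 for vertex in triangle for component in vertex):
            continue
        for vertex in triangle:
            indices.append(len(positions))
            positions.append(vertex)
    return positions, indices


if __name__ == "__main__":
    zero = (0.0, 0.0, 0.0)
    buffer_data = [
        (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),  # real triangle
        zero, zero, zero,                                    # unused slot
        (1.0, 1.0, 0.0), (2.0, 1.0, 0.0), (1.0, 2.0, 0.0),  # real triangle
    ]
    positions, indices = compact_triangles(buffer_data)
    print(len(positions), indices)
```

Note that a real triangle with one vertex at the origin is kept, because only triangles whose three vertices are all zero are skipped.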

I have also added a stage that generates smoothed normals by taking the derivatives of the voxel values and interpolating them based on the vertex positions. This is the same method used in the simple voxel terrain project, but running on the GPU.
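A minimal Python sketch of this kind of normal generation follows, assuming central differences for the derivatives and a linear blend between the gradients at the two ends of the edge a vertex lies on. The function names, the clamping at the volume border and the choice to use the gradient without flipping its sign are assumptions for illustration. Whether the normal is the gradient or its negative depends on how the voxel values are defined.

```python
import math


def gradient(voxels, x, y, z):
    """Central-difference gradient of a cubic voxel grid, clamped at the borders."""
    n = len(voxels)

    def sample(i, j, k):
        i = min(max(i, 0), n - 1)
        j = min(max(j, 0), n - 1)
        k = min(max(k, 0), n - 1)
        return voxels[i][j][k]

    return (
        sample(x + 1, y, z) - sample(x - 1, y, z),
        sample(x, y + 1, z) - sample(x, y - 1, z),
        sample(x, y, z + 1) - sample(x, y, z - 1),
    )


def normalize(v):
    """Return v scaled to unit length, or a zero vector if v has no length."""
    length = math.sqrt(sum(k * k for k in v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(k / length for k in v)


def interpolated_normal(voxels, corner0, corner1, t):
    """Blend the gradients at two grid corners by t (0..1) and normalize."""
    g0 = gradient(voxels, *corner0)
    g1 = gradient(voxels, *corner1)
    blended = tuple(a + t * (b - a) for a, b in zip(g0, g1))
    return normalize(blended)


if __name__ == "__main__":
    # A field that increases linearly along x has a gradient pointing along +x.
    n = 4
    voxels = [[[float(x) for _ in range(n)] for _ in range(n)] for x in range(n)]
    print(interpolated_normal(voxels, (1, 1, 1), (2, 1, 1), 0.5))
```

Here `t` would be the same interpolation factor used to place the vertex along the edge, so the normal varies smoothly across the surface instead of being constant per triangle.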

Project files.

MarchingCubesGPU

17 thoughts on “Marching Cubes on the GPU in Unity”

  1. Hi!
    Awesome work there!
    I’ve also written a Marching Cubes implementation that runs on the GPU (though not in Unity).
    I’ve considered porting it to Unity, but for my use case (destructible voxel terrain), I found that the main perf bottleneck was updating the mesh collider, and I also see that in your project, the generation of the mesh collider also takes almost 300ms on my machine. How did you measure the 20ms generation time you mentioned in your post?

    • When you run the scene the total time is printed out which is the noise time + marching cubes time + mesh renderer creation + mesh collider time.

      Commenting out lines 154 and 155, which create the collider, makes a difference of about 20ms.

      These times are for a size (N) of 32. The default is set to 48 so will take longer unless you change it.

      The total time on my computer for the default scene is also around 300ms.

      • Oh, you are 100% right. I had modified the code to only measure the creation of the mesh collider, but I missed the part where you said the measurements were done with N = 32. I should read more carefully next time! Using that size, I also got similar results. Thanks again. Still an awesome job though. I think I’ll try to port my impl and see how it compares.

      • Dual Contouring is similar to Marching Cubes but it can produce sharp features where marching cubes cannot. Having sharp features can be useful when visualising architecture with voxels. I would love to see what you can do with Dual Contouring in Unity in the future. Looking forward to it as there aren’t many implementations of it in Unity unlike marching cubes.

  2. Great example! Would love to see Dual Marching Cubes implementation, it’s much more scalable. I made similar mc builder using compute shaders but still haven’t done a dual mc or dual contouring.

  3. Hey, this is really nice, thanks :)

    Did you find a better way to do the m_meshBuffer.SetData(new float[SIZE*7]); ?
    With N = 40 I get about 40 fps (25 ms) in the 4D Noise scene,
    but using instead
    m_meshBuffer.Release();
    m_meshBuffer = new ComputeBuffer(SIZE, sizeof(float)*7);
    I get 40 fps (25ms, render 13-17ms) but with N = 96 !!!
    If the Game window in the Editor isn’t maximized it’s just about 3 ms.
    The high render time is conspicuous.
    Are there any cons to doing it like this? Or even a better way? :D

    Thanks a lot for the code :)

    • There are some things that I have not really optimized, and that’s one of them.

      I am sure there are better ways to do it, and if creating a new buffer every frame is faster, do it that way.

      I will look into better optimizations at some point in the future, but for now it’s really just meant to be used as an example.

      You will probably find that there is an optimal size for N. Too high and there will be too many verts for the GPU to process in an efficient manner.

  4. Hey, this is really interesting stuff! Out of curiosity, instead of using Perlin Noise could the voxels be generated based on the colliders from other objects?

    Something like this, with “water” advancing down slopes

    I realize that it would never actually “flow” in a fluid simulation sense for practical game purposes (though that 4D noise example is crazy cool under those constraints!), but more like a slow update, like Minecraft water, but with smooth voxels.

    Just curious to know if it’s possible anyways with today’s tech.

    Cheers :)

    P.S. WordPress was giving me some problems with my password while posting so hopefully this isn’t a double post.

    • You can use any method to generate the voxel values. Perlin noise can be a bit slow, so it could be replaced with something better.

      If you want the voxels to match an existing surface like a mesh or collider, you would need to use something like a signed distance field for the voxel values.
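As a small illustration of the signed distance field idea, this Python sketch fills a voxel grid with the signed distance to a sphere, negative inside and positive outside, so the surface sits where the value is zero. A sphere is used only because its distance function is simple. The function names are made up for the example, and computing distances to an arbitrary mesh or collider is a harder problem that is not shown here.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance from point to a sphere: negative inside, positive outside."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    dz = point[2] - center[2]
    return math.sqrt(dx * dx + dy * dy + dz * dz) - radius


def voxelize_sdf(n, center, radius):
    """Fill an n x n x n grid with signed distances to the sphere."""
    return [
        [
            [sphere_sdf((x, y, z), center, radius) for z in range(n)]
            for y in range(n)
        ]
        for x in range(n)
    ]


if __name__ == "__main__":
    voxels = voxelize_sdf(9, center=(4, 4, 4), radius=3.0)
    print(voxels[4][4][4])  # at the center, well inside
    print(voxels[7][4][4])  # on the surface
    print(voxels[0][0][0])  # a corner of the grid, outside
```

Running marching cubes on such a grid with an iso level of zero would then extract the sphere's surface.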

  5. Hello scrawk, I just discovered your blog and I am really impressed! So many interesting topics, so much to learn from you. But it seems like this blog is abandoned… is it? If not, what do you plan to publish next? Maybe Dual Contouring, Dual Marching Cubes? Or something else? Whatever, I am looking forward to it! Thank you for your work.
