V-Ray RT GPU benchmark test

By Elliott Smith

3D Consultant


Date: September 30, 2010

Category: 3D Modelling, Render Farms & HPC

Tags: Rendering, Quadro, Chaos Group, V-Ray RT, Tesla, Performance Benchmark, NVIDIA GeForce


Like everyone else in the 3D world, we were cock-a-hoop when we found out that the V-Ray RT GPU beta was available for testing. Once we calmed down, we got down to some serious testing to find out how fast it can render and which graphics card is the best choice.

As a baseline, we rendered the scene in the image below using the standard production renderer on an Intel Core i7 920 CPU. The render took 25 minutes 36 seconds to complete.

render 1

We also repeated the render using slightly lower settings, closer to what you would use for a draft render. This time the render completed in 9 minutes 36 seconds.

The scene is made up of 1 million polygons and uses a single V-Ray dome light with an HDR image to provide varied reflections. Every material is highly reflective, which traditionally means longer render times. We used fairly high render settings to get as close to the final look as possible but without taking too long.

The test

So that the tests were fair, we let each GPU render reach 32 samples per pixel, which I hope you'll agree provides a decent enough image to base a creative decision on.

render 2

Of course there is still noise in the image but, for the type of feedback we are looking for, it is perfect. From this image you can see if the materials, reflections and lights you have applied are correct. So, instead of waiting 25 minutes (or less if you reduce the render settings), you can see within a few seconds if you need to change anything.

Results

At this point, we'd like to remind you that this is the beta of the GPU version. So, while these results are accurate, we can't say they are definitive. Not yet anyway.

We've only tested a few graphics cards so far, but in the near future we hope to be able to give you a much more comprehensive list of benchmarked GPUs, so you can choose which is the most appropriate card for your needs and budget. The results can be seen in the chart below.

render 3

We found that even an entry-level NVIDIA GeForce graphics card will give you adequate feedback after 38 seconds. That's about 15 times faster than the 9 minute 36 second draft render using the traditional method. The Tesla C2050 is even more impressive, clocking in at about 55 times faster!
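The speed-up figures above follow directly from the render times quoted in this article. As a quick sanity check, the sketch below redoes the arithmetic in Python. The variable names are our own, and the Tesla time is only what the "about 55 times faster" figure implies, not a measured value read from the chart.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds render time to seconds."""
    return minutes * 60 + seconds


# Render times quoted in this article.
cpu_full = to_seconds(25, 36)   # production settings on the i7 920
cpu_draft = to_seconds(9, 36)   # draft settings on the i7 920
gpu_entry = 38                  # entry-level GeForce, 32 samples per pixel

# Speed-up of the entry-level GPU over each CPU render.
speedup_vs_draft = cpu_draft / gpu_entry
speedup_vs_full = cpu_full / gpu_entry

# Time implied by "about 55 times faster" than the draft render.
# This is derived from the quoted ratio, not a measurement.
tesla_implied_seconds = cpu_draft / 55

print(f"Entry-level GPU vs draft render: {speedup_vs_draft:.1f}x")
print(f"Entry-level GPU vs full render:  {speedup_vs_full:.1f}x")
print(f"Implied Tesla C2050 time:        {tesla_implied_seconds:.1f} s")
```

Bear in mind that the GPU image at 32 samples per pixel is noisier than either CPU render, so these ratios compare time to useful feedback rather than time to an identical image.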

The difference between them? CUDA cores. V-Ray RT uses the OpenCL API instead of CUDA, but you can still predict the speed of an NVIDIA GPU fairly well by looking at its number of CUDA cores. In the comparison above, the GeForce GTS 250 has 128 cores whereas the Tesla C2050 has 448.
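If render time really is inversely proportional to the number of CUDA cores, which is an assumption we have only checked against the handful of cards tested so far, you can make a rough estimate for an untested card from a measured one. The sketch below does this in Python; the function name `estimate_seconds` is hypothetical, and the only inputs are the core counts and render times quoted in this article.

```python
def estimate_seconds(known_seconds, known_cores, target_cores):
    """Estimate render time on another GPU, assuming time scales as 1 / cores."""
    if known_cores <= 0 or target_cores <= 0:
        raise ValueError("core counts must be positive")
    return known_seconds * known_cores / target_cores


# Measured reference: GeForce GTS 250, 128 CUDA cores, 38 seconds.
GTS250_SECONDS = 38
GTS250_CORES = 128

# Prediction for the Tesla C2050 (448 CUDA cores).
tesla_estimate = estimate_seconds(GTS250_SECONDS, GTS250_CORES, 448)

# Compare against the 9 minute 36 second CPU draft render (576 seconds).
predicted_speedup = 576 / tesla_estimate

print(f"Estimated Tesla C2050 time: {tesla_estimate:.1f} s")
print(f"Predicted speed-up vs draft render: {predicted_speedup:.0f}x")
```

The prediction of roughly 53 times faster is close to the figure of about 55 times that we measured, which is why core count looks like a reasonable first guide. It ignores clock speed, memory bandwidth and architectural differences between GPU generations, so treat it as an estimate only.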

So which GPU is best?

Luckily for you, we have created a useful table that shows the range of GPUs we can offer and the all-important information about each one.

render 4

You may have noticed that we have only included NVIDIA GPUs and not ATI cards. So far we haven't been able to gauge how the number of processor cores on the ATI cards correlates to render time. However, it seems to be widely accepted that NVIDIA cards currently process OpenCL work much more efficiently than ATI cards do.

A few things to bear in mind

If you have a big scene that requires 3 or 4 GB of RAM, you will need a GPU with at least that much RAM on board, because the scene has to be loaded onto the GPU before it can be rendered. So, for huge scenes, the Tesla range looks to be the best choice.

If your scenes are more moderate in their memory needs, your primary consideration should be the number of CUDA cores, as more cores will give you a faster render.
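The two rules above (the scene must fit in the card's RAM, then more CUDA cores is better) can be written down as a simple selection routine. This is a minimal sketch rather than a buying guide: the `Gpu` type and `pick_gpu` function are hypothetical, the list covers only three cards as examples, and the memory and core figures are the published specifications for those cards.

```python
from typing import NamedTuple, Optional, Sequence


class Gpu(NamedTuple):
    name: str
    memory_gb: float
    cuda_cores: int


# Example cards only; not the full range in the table above.
CARDS = [
    Gpu("Quadro 4000", 2.0, 256),
    Gpu("GeForce GTX 480", 1.5, 480),
    Gpu("Tesla C2050", 3.0, 448),
]


def pick_gpu(scene_gb: float, cards: Sequence[Gpu] = CARDS) -> Optional[Gpu]:
    """Return the card with the most CUDA cores that can hold the scene."""
    # Rule 1: the scene has to fit in the card's on-board RAM.
    fits = [card for card in cards if card.memory_gb >= scene_gb]
    if not fits:
        return None  # nothing is big enough; the scene needs optimising
    # Rule 2: among the cards that fit, more cores means a faster render.
    return max(fits, key=lambda card: card.cuda_cores)


for scene_gb in (1.0, 2.0, 4.0):
    choice = pick_gpu(scene_gb)
    print(scene_gb, "GB ->", choice.name if choice else "no card fits")
```

In practice you would want some headroom above the raw scene size, and price will matter too, so use the result as a starting point.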

We found that when we used a single graphics card both to process the render and to drive the video output, artefacts appeared on screen as soon as we clicked render. For best results, we recommend a standard GPU for the displays and a secondary GPU to be used solely for the processing.

The Tesla C2050 requires dual 6-pin power cables and takes up the space of two PCIe slots. It also requires 238 watts, so check not only your current power consumption and your PSU, but also that you have an additional 6-pin power cable, as we only got one when we opened the box!

If you have any questions at all about this or would like to arrange a demo with your scenes, please feel free to email us at 3D@Jigsaw24.com or call on 03332 409 309. Just remember that this is a beta version, so there may be a degree of onsite troubleshooting going on!

8 comments


Ben Kitching, Jigsaw 3D Consultant
Tuesday, January 03, 2012

Thanks werez. You're right in pointing out that the FP64 performance of the Quadro 2000, 600 and all of the GeForce cards is artificially limited, although this will only slow down certain 64-bit calculations, so it's not really an issue unless you're dealing with massive numbers or representing them to lots of decimal places. If you check Elliott's graph you'll see that V-Ray RT scales linearly as you increase CUDA cores, but well spotted!

werez
Friday, December 30, 2011

'As far as I know all CUDA cores are equal.' Unfortunately that's not true. FP64 performance of GeForce GPUs and low-end Quadros is limited to 1/16.

Elliott Smith
Thursday, December 02, 2010

Hi Mike, CUDA and OpenCL are both GPU computing APIs. CUDA is NVIDIA only, and OpenCL should be present on most new ATI and NVIDIA GPUs. They are both languages / protocols that programs like V-Ray RT and Bunkspeed use to run calculations on the GPU and, in this case, accelerate rendering.

When choosing a GPU for V-Ray and Bunkspeed, you need to consider your typical scene size. If the RAM requirements of your scene are 2GB (fairly large), you will need a GPU with that amount of RAM on board, otherwise you will not be able to load the scene onto the board unless you do some optimisation. So, the amount of RAM dictates the scene size and the number of cores dictates the speed of the render.

So, let's say that you have scenes that are 2GB in size. You should be looking at the Quadro range, perhaps the Quadro 4000, which has 256 CUDA cores and is a great price. If you were looking for more, then you would go for a Quadro 5000 or Tesla C1060 or even a Tesla C2050, which has 3GB of RAM and 448 CUDA cores. If your needs are more moderate, then you could perhaps look at the gaming cards, such as the GTX 480, which has 480 cores and 1.5GB of RAM.

A word of caution when choosing a gaming card, though. The gaming cards aren't supposed to be used intensively and will be prone to burn out if used daily. Although more expensive, the Quadros / Teslas will not burn out, will have a much higher life expectancy and will also stay current for much longer. You tend to find that the GeForce range gets updated every year.

Hope that helps, Elliott

P.S. To answer your question, yes, both the Quadros and Teslas will be insanely fast!

Mikemigs
Thursday, December 02, 2010

Thanks for the great write up. Can you please clarify the CUDA vs. OpenCL difference? I'm looking to purchase a new GPU for rendering in both CUDA renderers (Bunkspeed Shot) AND VRay RT GPU. Would a highly specced Quadro or Tesla kick butt in both departments? Thanks in advance.

Elliott Smith
Thursday, October 28, 2010

In terms of viewport performance, I really haven't noticed much difference between the Quadro and GeForce cards, but you can download 3ds Max performance drivers for Quadro cards, so that must improve something! I don't do a lot of animation, but I guess they would help with playback and frame rate, much like they would do with games. As far as I know all CUDA cores are equal. In fact I've just recently done some simple maths on the results I've got and can safely say that, regardless of which GPU you use, if you double the cores, you halve the render time. Using this I've estimated the render times for both the GTX 480 and GTX 285 and hope to be able to test the theory in the next week or so. Hope that answers all your questions.

Paul
Wednesday, October 27, 2010

How are you finding the GTX cards performing in 3ds Max these days for viewport performance compared to the Quadros? Also, are all CUDA cores created equal, as in their clock speed etc? Will one CUDA core on one GPU be the same as on another? Thanks for providing the benchmark info, very helpful.

Elliott Smith
Monday, October 04, 2010

Hi Anshuman, I've not yet tested any of the GTX 400 series, but will do so as soon as possible. It is possible to gauge how fast they will be by looking at the number of CUDA cores they have. The GTX 460 = 336, GTX 465 = 352, GTX 470 = 448 and the GTX 480 = 480. In terms of CUDA cores, these are as fast as the current Quadros and Teslas. None have more than 1.5GB of RAM, so you will be limited by how much you can fit onto the card, unless you get clever with Xrefs or proxies (if they are supported in the final release) or reduce the size of your bitmaps. Keep checking back and I should have some results up soon.

Anshuman
Sunday, October 03, 2010

Hi! do you have any GTX 400 series benchmarks???






© 1992 - 2012 Jigsaw Systems Ltd trading as Jigsaw24. Registered Company No. 2682904. All rights reserved.