Intel Performance Tuning Protection Plan

Purchased the Intel Performance Tuning Protection Plan for the i7 5930K. Now running at 4.5GHz with memory at 3K. Running on a Corsair Hydro Series H110i GT! ...

Guild Wars 2 - Bringing Antialiasing to the game

It's no surprise to see another game hit the market lacking AA (anti-aliasing) features, and Guild Wars 2 is no exception. MMORPGs tend to get...

Syfy's Defiance game Beta Screenshots

Below are a few screenshots from Syfy's upcoming "Defiance" game, due out on April 2, 2013. Defiance is a first of its kind, where the game will be in close...

Steam Box, Steam for Linux

Valve has made one of its first steps toward a "Steam Box." Steam for Linux was released today for the Ubuntu Linux distribution, complete with a...

Lowering game latency with WTFast tunneling

Taking a dive into WTFast's latency tunneling service -- it's like your internet on steroids. Internet latency is a gamer's worst nightmare. Any network...

Logitech G13 Advanced Gameboard

The G13 Advanced Gameboard is Logitech’s answer to gamers’ needs. It’s a fantastic lightweight, portable alternative for gamers who don’t want...

Rosewill Thor V2 Computer Case

Rosewill, the company behind quality hardware and affordable cases, debuted the redesigned Thor V2 case earlier this year, and it has since become one of the...

Electronic Arts: Battlefield 3

As one of the most anticipated games released by EA (Electronic Arts), Battlefield 3 sets itself apart from its rivals with a great storyline and gameplay...

Nvidia says large GPGPU speed up claims were due to bad original code

Nvidia has said that most of the outlandish performance increase figures touted by GPGPU vendors were down to poor original code rather than the sheer brute-force computing power provided by GPUs.

Both AMD and Nvidia have been using real-world code examples and projects to promote the performance of their respective GPGPU accelerators for years, but now it seems some of the eye-popping figures, including speed-ups of 100x or 200x, were not down to the computing power of GPGPUs alone. Sumit Gupta, GM of Nvidia’s Tesla business, told The INQUIRER that such figures were generally down to starting with unoptimised CPU code.

During Intel’s Xeon Phi pre-launch press conference call, Intel cast doubt on some of the orders-of-magnitude speed-up claims that had been bandied about for years. Now Gupta has told The INQUIRER that while those large speed-ups did happen, they were possible only because the original code was poorly optimised, so the bar was set very low.

Gupta said, “Most of the time when you saw the 100x, 200x and larger numbers, those came from universities. Nvidia may have taken university work and shown it, and it has a 100x on it, but really most of those gains came from academic work. Typically we find that when you investigate why someone got 100x [speed up] it is because they didn’t have good CPU code to begin with. When you investigate why they didn’t have good CPU code you find that typically they are domain scientists, not computer science guys -- biologists, chemists, physicists -- and they wrote some C code and it wasn’t good on the CPU. It turns out most of those people find it easier to code in CUDA C or CUDA Fortran than they do to use MPI or Pthreads to go to multi-core CPUs, so CUDA programming for a GPU is easier than multi-core CPU programming.”
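
For readers unfamiliar with CUDA, here is a minimal SAXPY (y = a*x + y) sketch of our own -- not code from Nvidia or Gupta -- that shows why the model appeals to domain scientists: the body of the serial C loop simply becomes the kernel, and the runtime spreads it across threads.

```cuda
#include <stdio.h>
#include <stdlib.h>

// Each GPU thread computes one element of y = a*x + y; the loop over i
// that a C programmer would write disappears into the launch grid.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *x = (float *)malloc(bytes);
    float *y = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    float *d_x, *d_y;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
    cudaDeviceSynchronize();

    cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %.1f\n", y[0]); // expect 4.0 = 2*1 + 2

    cudaFree(d_x); cudaFree(d_y);
    free(x); free(y);
    return 0;
}
```

A Pthreads port of the same loop would need explicit thread creation, work partitioning and joining, and an MPI version adds rank management and message passing on top of that -- which is Gupta’s point about why domain scientists reach for CUDA first.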

According to Gupta, users who have optimised their code to squeeze the most performance out of the CPU see somewhat more sedate gains. “Most people we find who have optimised CPU code, and really you’ll only find optimised CPU code in the HPC world, get between 5x to 10x speed up; that’s the average speed up that people get. In some cases it’s even less. We’ve seen people getting speed ups of 2x, but they are delighted with 2x because there is no way for them to get a sustainable 2x speed up from where they are today,” said Gupta.
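
The arithmetic behind those figures is simple: the reported multiplier is just the baseline run time divided by the GPU run time, so the quality of the baseline sets the headline. A toy sketch with invented timings (the numbers below are ours, purely illustrative):

```cuda
#include <stdio.h>

// Illustrative only: the run times below are made up, to show how the
// reported "speed-up" depends entirely on which CPU baseline is measured.
int main(void)
{
    const double gpu_s       = 1.0;    // hypothetical GPU run time
    const double naive_cpu_s = 150.0;  // unoptimised single-thread C code
    const double tuned_cpu_s = 8.0;    // vectorised, multi-core HPC code

    printf("speed-up vs naive CPU code: %.0fx\n", naive_cpu_s / gpu_s); // 150x
    printf("speed-up vs tuned CPU code: %.0fx\n", tuned_cpu_s / gpu_s); // 8x
    return 0;
}
```

It is the same hypothetical GPU run in both cases, but measured against the naive baseline it makes a press release, while against the tuned baseline it lands in the 5x to 10x range Gupta describes.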

Gupta’s comments about code optimisation will resonate with many researchers, who juggle tight paper deadlines with writing funding proposals, and for whom it is far more prudent to spend time working on the theory, modelling a solution and evaluating the results than optimising code, now that computing power is far easier to come by. Nevertheless, Gupta said that when it comes to choosing whether to optimise code for the CPU or the GPGPU, researchers do a simple cost analysis.

Gupta said, “Even if you assume it is the same effort to do multi-core CPU versus GPU, let’s say multi-core CPU gives you a 10x speed up but CUDA gives you 100x over where you are today, you’ll obviously go for the bigger speed up and work on that platform first, and that’s why you end up with these guys getting these phenomenal speed ups. It’s only because they have really bad original CPU code and don’t have either the interest or energy or time to make it into good CPU code.”

While Gupta’s candor over the source of the speed-ups touted by Nvidia and AMD in the past is refreshing, and should bring GPGPU accelerators back down to earth for some people, it should be noted that 2x speed-ups are still very desirable for many researchers. As always with benchmarks, though, it is wise to apply a healthy dose of skepticism.

via The Inquirer

