Saturday, September 11, 2010

Lightning Fast Builds with Visual Studio 2010 and an SSD

I reduced my Visual Studio 2010 C++ build time from 21 minutes to 5 minutes! You can too. Here's how.

I'm a build performance junkie. If there's one thing I really hate in life, it's sitting around waiting for builds to complete. Fifteen years ago, the very first article I published was titled Speeding Up Visual C++. It was all about making Visual C++ 1.51 go faster on what was then a state of the art computer - an ISA bus Gateway DX2/50. Woo hoo! My recommendations were:
  1. Use Precompiled Headers.
  2. Upgrade to 16MB of memory.
  3. Use 4 megabytes of Disk Cache.
  4. Upgrade your Hard Drive to Fast SCSI or Enhanced IDE.
  5. Turn off Browse Info.
  6. 32 Bit File Access.
Today computers are thousands of times faster, but rotating platter disk drives are still desperately slow. The seek time of 7200RPM drives has changed very little in the last ten years, although the transfer rate for sequential files has risen dramatically. That problem, combined with Visual Studio's desire to create hundreds of .tlog temporary files, quarter gigabyte .sdf files, and project size bloat means that the average build may be even slower today than it was fifteen years ago.

Historically, your CPU would be sitting idle most of the time waiting for the hard disk to keep up. Linking performance is based almost entirely on your disk's random access read/write performance. The highly rated Western Digital Caviar Black can only perform about 100 random access IOPS (I/O Operations Per Second.) It takes multiple I/O operations per OBJ file, so a significant fraction of the link time is waiting for the hard drive to do its job.

Enter the latest generation of SSDs driven by the Sandforce Controller, such as the OCZ Vertex 2. These drives can do over 40,000 IOPS - 400 times faster than a Caviar Black. And Visual Studio 2010 build performance is phenomenal. In fact, these drives are so fast that the time the build spends waiting on the disk becomes negligible. This SSD will easily hit over 50MB/sec of 4KB random writes. In contrast, the ultra-zippy 10,000 RPM VelociRaptor can only do about 3.5MB/sec. (Note that disk striping or mirroring has minimal impact on build performance because the linker isn't smart enough to use scatter/gather to force the queue depth high enough to let the command queuing on the drive work its magic.)
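
As a rough check on the figures above, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it assumes decimal megabytes (1MB = 1,000,000 bytes) and 4KB = 4096 bytes, which the drive figures quoted here may or may not use.

```cpp
#include <cassert>

// Convert a sustained small-block transfer rate into I/O operations per second.
// Assumes decimal megabytes (1 MB = 1,000,000 bytes); treat results as approximate.
inline double iops_from_throughput(double megabytes_per_sec, double block_bytes) {
    assert(block_bytes > 0.0);
    return megabytes_per_sec * 1000000.0 / block_bytes;
}

// Ratio of two rates, for example SSD IOPS over hard disk IOPS.
inline double speedup(double fast, double slow) {
    assert(slow > 0.0);
    return fast / slow;
}
```

Under those assumptions, speedup(40000.0, 100.0) gives the 400x figure, 50MB/sec of 4KB writes works out to roughly 12,200 writes per second, and 3.5MB/sec works out to roughly 850.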

Now that the hard disk performance no longer matters, our next bottleneck is the CPU. You can tell your boss you need one of those monster hyper-threaded quad core i7 processors such as the 875k or, for the money-is-no-object crowd, the hyper-threaded six core 980X. Visual Studio 2010 automatically uses parallel builds. My 875k pegs all eight logical cores (four physical cores with hyper-threading) simultaneously at 100%. Compiles proceed eight files at a time and the CPUs stay pegged until the compile is finished. I've never seen a project build so fast.

The next bottleneck is probably your RAM. If you have 4GB RAM running on a 32-bit OS, you are severely limited and you probably won't be able to run eight compiles (much less 24 compiles if you are using parallel MSBuild tasks, as I explain in Part 2.) Upgrade to Windows 7 64-bit with 8GB of RAM.
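
To see why memory becomes the limit, here is a back-of-the-envelope sketch. The function name and both inputs are assumptions for illustration only: the per-compile working set varies widely by project and was not measured for this article.

```cpp
#include <cassert>
#include <cstdint>

// Number of compiler processes that fit in memory at the same time.
// usable_mb:      RAM left for the build after the OS and IDE (assumed figure).
// per_compile_mb: working set of one compiler process (assumed figure;
//                 measure your own project before relying on it).
inline std::uint32_t max_concurrent_compiles(std::uint32_t usable_mb,
                                             std::uint32_t per_compile_mb) {
    assert(per_compile_mb > 0);
    return usable_mb / per_compile_mb;  // integer division rounds down
}
```

With hypothetical inputs of 2,500MB usable and 400MB per compile, max_concurrent_compiles(2500, 400) returns 6, short of the eight compiles the CPU could otherwise run. Plug in numbers from your own machine to see where your limit is.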

So it's interesting that the types of recommendations for build performance haven't changed much. Here is my updated list:
  1. Use Precompiled Headers.
  2. Upgrade to 8GB of memory.
  3. Upgrade to 64-bit Windows.
  4. Upgrade your Hard Drive to a Sandforce-based SSD.
  5. Turn off Browse Info. (This is still true. Browse Info is different than Intellisense.)
  6. Check your motherboard's SATA implementation. At SSD speeds, not all controllers are created equal.
In Part 2 of this article, I'll talk about how to tune your build system to keep all that hardware busy.
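
For the first item on the list, a precompiled header might look something like the sketch below. The selection of headers is purely illustrative and not taken from my project. With Visual C++, one source file is compiled with /Yc to create the precompiled header and the rest are compiled with /Yu to use it.

```cpp
// stdafx.h: the name the Visual Studio project wizard conventionally uses.
// The contents below are a hypothetical example, not a recommendation.
#ifndef STDAFX_H
#define STDAFX_H

// Standard library headers: large, widely used, and effectively never edited.
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Project headers are deliberately left out. A header that changes often
// would force every file that uses the precompiled header to rebuild.

#endif  // STDAFX_H
```

The design choice is to keep the precompiled header limited to headers that are both expensive to parse and stable, so that day-to-day edits never invalidate it.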

The system used for this article was:
  • ASUS P7P55D-E Pro motherboard.
  • Intel 875k i7 processor.
  • OCZ Vertex 2 120GB SSD.
  • 8GB RAM.
  • Thermaltake MUX-120 heatsink.
Do not use the SATA 3 Marvell controller on the ASUS motherboard. The standard ICH10 controller is much faster with SSDs.

The prior system that took 21 minutes was a Core 2 Duo 2.4 GHz with ASUS P5B Deluxe and Caviar Black 750GB hard drive.

23 comments:

  1. I'm not sure if you mean a RAM drive now or 15 years ago. I actually tried to use a RAM drive for temporary files a few months ago with Visual Studio 2008 and the performance impact was negligible. Definitely not worth the effort.

    Certainly moving the entire build to a RAM drive would make things go a lot faster, but an SSD is a lot cheaper and much less susceptible to power failures and user error.

    Back in the day, a RAM drive for temporary files was absolutely required to get halfway decent build speeds. I think I dropped my recommendation for a RAM drive when Visual C++ started using precompiled headers.

  2. I've been researching to try to find out if the same OCZ drive would increase my compile speed - it's good to hear that this worked well for you.

  3. Hi Jim. I typically install my OS and Visual Studio 2010 on my C: disk, and my C++ workspaces on my D: disk. I can only afford one SSD and one conventional disk for my new computer. Should I use the SSD for my C: or my D: ? Thanks.

  4. Without question, your SSD is where Windows is installed and where Program Files is located, so C:. You can then use symbolic links to move some directories to other drives. For example, my home directory is C:\Users\jimb, but my Documents, Pictures and Videos directories all point to D:.

  5. Thanks for the advice, Jim. Why did you symlink to another drive? To save space on the SSD?

  6. My SSD is only 120GB. My pictures alone would take up most of that.

  7. What about using 2 SSD drives in RAID 0? Would that make sense?

  8. I doubt that using SSD drives in RAID 0 would have any impact at all. Remember that Windows does write behind, so you don't have to wait for PDB files (for example) to be written to the drive before continuing. For all intermediate files, you should have enough RAM that they are being cached. So the only thing striped SSDs would speed up is reading the source code, but since your SSD can read 200MB/sec, there's little point in striping SSDs unless you have a desperate need to shave 200ms off of your build.

  9. (Disregarding precompiled headers)
    My comment would be do this:
    Make a second project as static lib, move all your code in there, and then in your original project, include the new lib, and remove all the cpp files - agree?

  10. Not sure what you are trying to accomplish with a static lib in your description. If you have some code that never changes, then it's reasonable to move it to a .lib, but unless you've gone to the work to make it a completely separate set of modules with no dependencies, you'll need to rebuild that library whenever a related header file changes, which defeats the purpose of the library, especially since the downside of a .lib is that it breaks compile-while-debugging.

    For my products, we have a .lib that includes a lot of the common code, but we have eight products that rely on that code and the library is rebuilt very frequently.

  11. Just to echo your findings, I got a fast 128GB SSD for my dev machine and saw my build time drop from 30 mins to under 5 mins (large C++ soln with many small files so very IO bound). The impact this has on my productivity is enormous. Additionally, many other slow tasks, such as SVN checkouts, using email clients, and even installing and launching applications, are supercharged. I've seldom been as impressed with a single hardware upgrade. I'll be recommending to my boss that every developer on my 30+ team gets one. It will pay for itself in a week.

    1. Replace SVN with Git and you'll go even faster.

    2. Anonymous,

      Let me explain to you what your recommendation would mean in a professional software development environment. I have nothing against Git, but you don't seem to grasp what you are proposing. Chris said he has 30 developers and a large project. That means to transition to Git, he would have to retrain 30 people and rewrite all of the build scripts, release scripts, and test-driven development scripts. Total cost for a project of that size could easily exceed $100,000. Your proposed "improved productivity" will be completely overrun by the weeks or months of effort to retrain and redo scripts and the opportunity cost of having those developers reassigned for that time.

      Therefore, I have to say that switching to Git on this project is a pretty bad idea.

  12. One way to optimize your compilation time is to use one header file that includes all your header files and do the same for cpp files (one cpp that includes all cpp files). Exclude all files from the project except these two files. It will be 20 times faster, and it simplifies cross-platform development.

    1. Jamie,

      This is really, really bad advice. Your idea means that removing a single character from a file causes a full rebuild of your entire project. For example, on my project (a relatively modest size 250,000 lines of code), that would mean that removing a comment would take a couple of minutes to rebuild instead of three seconds. Your strategy also means that compilation is handled by a single core instead of multiple cores. On my 8-core system, that would cause nearly an 8x increase in my build time.

      Finally, your idea of including multiple .cpp files together completely breaks the concept of compilation units. You lose any semblance of information hiding and coupling because all header files (and therefore all classes) are available in all files. This can lead to very sloppy development practices.

      I've worked with two libraries that use your strategy (UW IMAP and SQLite) and both of them use the technique solely to make distribution simpler, not to decrease build time.

      Bundling headers together can be a good idea - but the idea was implemented by Microsoft at least 15 years ago and it's called precompiled headers. Again though, the headers you select to be precompiled have to be very carefully chosen. Putting all headers into your precompiled header will cause horrible rebuild dependencies that will slow down your builds dramatically.

      Therefore, if you are seeing a "20x" improvement in your build times by following the strategy you described, then your project build strategy is badly designed and needs to be rethought from the ground up to depend on parallel builds and precompiled headers.

  13. Hey Jim,

    My personal project has ~50,000 lines of code, and with this method it takes 2 seconds to compile and link with Visual Studio and gcc. I don't need to maintain a precompiled header, and I stopped playing with compiler options. The goal of this method is to minimize IO access. At work we have 3.5 million lines of code, and we have a tool that builds several big files, because some compilers don't support files with more than 65,536 lines of code.

    Parallel build is a really good option, except if your compiler doesn't support this feature. Don't forget that some of us work with really old devices or crappy compilers.

  14. Jamie,

    I've actually done quite a lot of work on cross platform and even some embedded work. You don't need your compiler to support parallel builds. Any time you are building on a Linux/UNIX/Windows system you can use parallel make instead. I was using pmake 20 years ago on a 4 processor Sun server. Builds were really, really fast :-)

  15. > Without question, your SSD is where Windows is installed

    I don't really understand why this is the case. We have a very large, complex C++ project that takes over an hour to build and produces about 20G of object/pch/pdb files (and we may have 4-5 branches of the same object code active at any one time), so we're looking at getting a 120G+ SSD for each developer.

    It seems to make sense that the build artifacts (given there are so many) go on the SSD?

    1. Anonymous,

      You are correct, I was a little hasty in my answer. My original reasoning was that, if you are asking that question in the first place, then you are probably on a tight budget, and therefore your project probably isn't very big. In that case, the productivity gains of Windows and Visual Studio running faster from the SSD are going to outweigh the gains from building from the SSD. (I found the productivity gains from having Windows on SSD to be substantial. Reboots happen in seconds instead of minutes. Word, Excel and OneNote start instantly.)

      For a "well financed" project, it never occurred to me that this question would even be asked because Windows, Program Files, and the dev project should all be on the SSD. Move everything else (Documents, Pictures, Videos, Music) to rotating platter. This is how my own dev system is organized.

      For a large project such as yours, I agree, if you have to choose then use the SSD for your build directory.

  16. Thanks for that, mate. I'd love to have an SSD as the main drive (with Windows et al. on it) but it is still cost prohibitive in large sizes. We were looking to go for a 500G main drive (Windows, dev tools etc) and a 100G SSD 'build' drive, but I'd heard so many varying views regarding dev setup with an SSD that I was very confused :-)

    1. Make sure your TEMP directory is on the SSD too. Also, you can move Visual Studio onto the SSD by using a junction link in Program Files.

  17. That's a great idea. I would never have thought to put the TEMP folder on the SSD - but of course that's where the compiler/linker puts all the intermediary stuff so it would make a significant improvement!
