Saturday, September 11, 2010

Lightning Fast Builds with Visual Studio 2010 and an SSD

I reduced my Visual Studio 2010 C++ build time from 21 minutes to 5 minutes! You can too. Here's how.

I'm a build performance junkie. If there's one thing I really hate in life, it's sitting around waiting for builds to complete. Fifteen years ago, the very first article I published was titled Speeding Up Visual C++. It was all about making Visual C++ 1.51 go faster on what was then a state of the art computer - an ISA bus Gateway DX2/50. Woo hoo! My recommendations were:
  1. Use Precompiled Headers.
  2. Upgrade to 16MB of memory.
  3. Use 4 megabytes of Disk Cache.
  4. Upgrade your Hard Drive to Fast SCSI or Enhanced IDE.
  5. Turn off Browse Info.
  6. 32 Bit File Access.
Today computers are thousands of times faster, but rotating platter disk drives are still desperately slow. The seek time of 7200RPM drives has changed very little in the last ten years, although the transfer rate for sequential files has risen dramatically. That problem, combined with Visual Studio's desire to create hundreds of .tlog temporary files, quarter-gigabyte .sdf files, and general project size bloat, means that the average build may be even slower today than it was fifteen years ago.

Historically, your CPU would be sitting idle most of the time waiting for the hard disk to keep up. Linking performance is based almost entirely on your disk's random access read/write performance. The highly rated Western Digital Caviar Black can only perform about 100 random access IOPS (I/O operations per second). It takes multiple I/O operations per OBJ file, so a significant fraction of the link time is spent waiting for the hard drive to do its job.

Enter the latest generation of SSDs built around the SandForce controller, such as the OCZ Vertex 2. These drives can do over 40,000 IOPS - 400 times faster than a Caviar Black. And Visual Studio 2010 build performance is phenomenal. In fact, these drives are so fast that disk I/O stops being a meaningful part of the build time. This SSD will easily hit over 50MB/sec of 4KB random writes. In contrast, the ultra-zippy 10,000 RPM VelociRaptor can only do about 3.5MB/sec. (Note that disk striping or mirroring has minimal impact on build performance because the linker isn't smart enough to use scatter/gather to force the queue depth high enough to let the command queuing on the drive work its magic.)

Now that hard disk performance no longer matters, our next bottleneck is the CPU. You can tell your boss you need one of those monster hyper-threaded quad-core i7 processors such as the 875K or, for the money-is-no-object crowd, the hyper-threaded six-core 980X. Visual Studio 2010 automatically uses parallel builds. My 875K pegs all eight cores simultaneously at 100%. Compiles proceed eight files at a time and the CPUs stay pegged until the compile is finished. I've never seen a project build so fast.

The next bottleneck is probably your RAM. If you have 4GB RAM running on a 32-bit OS, you are severely limited and you probably won't be able to run eight compiles (much less 24 compiles if you are using parallel MSBuild tasks, as I explain in Part 2). Upgrade to Windows 7 64-bit with 8GB of RAM.
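
If you drive builds from a command prompt rather than the IDE, here is a minimal sketch of the same idea; the solution name and the core count are placeholders, not details from this article. cl.exe also reads extra switches from the CL environment variable, so /MP can be turned on without editing every project.

    rem Ask cl.exe to compile multiple files at once within each project.
    set CL=/MP
    rem Build up to 8 projects in parallel (placeholder count; match it to your cores).
    msbuild MySolution.sln /m:8 /p:Configuration=Release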

So it's interesting that the types of recommendations for build performance haven't changed much. Here is my updated list:
  1. Use Precompiled Headers. (See the command-line sketch after this list.)
  2. Upgrade to 8GB of memory (the old advice said 16MB).
  3. Upgrade to 64-bit Windows (the old advice said a 4-megabyte disk cache).
  4. Upgrade your hard drive to a SandForce-based SSD (the old advice said Fast SCSI or Enhanced IDE).
  5. Turn off Browse Info. (This is still true. Browse Info is different from IntelliSense.)
  6. Check your motherboard's SATA implementation (the old item was "32-Bit File Access"). At SSD speeds, not all controllers are created equal.
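
For item 1, here is a minimal sketch of what precompiled headers look like at the cl.exe level; the file names are placeholders, not from any real project. The IDE sets the equivalent switches for you under C/C++ > Precompiled Headers, and Browse Info stays off as long as /FR is never passed.

    rem Build the precompiled header once via stdafx.cpp, then reuse it for every other file.
    cl /c /EHsc /Yc"stdafx.h" stdafx.cpp
    cl /c /EHsc /MP /Yu"stdafx.h" widget.cpp gadget.cpp gizmo.cpp
    rem Omitting /FR means no Browse Info (.sbr) files are generated.
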
In Part 2 of this article, I'll talk about how to tune your build system to keep all that hardware busy.

The system used for this article was:
  • ASUS P7P55D-E Pro motherboard.
  • Intel Core i7-875K processor.
  • OCZ Vertex 2 120GB SSD.
  • 8GB RAM.
  • Thermaltake MUX-120 heatsink.
Do not use the SATA 3 Marvell controller on the ASUS motherboard. The standard ICH10 controller is much faster with SSDs.

The prior system that took 21 minutes was a Core 2 Duo 2.4 GHz with ASUS P5B Deluxe and Caviar Black 750GB hard drive.

38 comments:

  1. I'm not sure if you mean a RAM drive now or 15 years ago. I actually tried to use a RAM drive for temporary files a few months ago with Visual Studio 2008 and the performance impact was negligible. Definitely not worth the effort.

    Certainly moving the entire build to a RAM drive would make things go a lot faster, but an SSD is a lot cheaper and much less susceptible to power failures and user error.

    Back in the day, a RAM drive for temporary files was absolutely required to get halfway decent build speeds. I think I dropped my recommendation for a RAM drive when Visual C++ started using precompiled headers.

  2. I've been researching to try to find out if the same OCZ drive would increase my compile speed - it's good to hear that this worked well for you.

    1. Careful: both a colleague and I independently found the OCZ Synapse with Dataplex to be desperately unreliable.

  3. Hi Jim. I typically install my OS and Visual Studio 2010 on my C: disk, and my C++ workspaces on my D: disk. I can only afford one SSD and one conventional disk for my new computer. Should I use the SSD for my C: or my D: ? Thanks.

  4. Without question, your SSD is where Windows is installed and where Program Files is located, so C:. You can then use symbolic links to move some directories to other drives. For example, my home directory is C:\Users\jimb, but my Documents, Pictures and Videos directories all point to D:.
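
    From an elevated command prompt, the move looks roughly like this; D:\Documents is just an example target folder:

        rem Move the folder to the big drive, then leave a symbolic link at the old path.
        robocopy "C:\Users\jimb\Documents" "D:\Documents" /E /MOVE
        rmdir "C:\Users\jimb\Documents"
        mklink /D "C:\Users\jimb\Documents" "D:\Documents"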

  5. Thanks for the advice, Jim. Why did you symlink to another drive? To save space on the SSD?

  6. My SSD is only 120GB. My pictures alone would take up most of that.

  7. What about using 2 SSD drives in RAID 0? Would that make sense?

  8. I doubt that using SSD drives in RAID 0 would have any impact at all. Remember that Windows does write-behind, so you don't have to wait for PDB files (for example) to be written to the drive before continuing. For all intermediate files, you should have enough RAM that they are being cached. So the only effect striped SSDs would have is on reading the source code, but since your SSD can read 200MB/sec, there's little point in striping SSDs unless you have a desperate need to shave 200ms off of your build.

  9. (Disregarding precompiled headers)
    My comment would be to do this:
    Make a second project as a static lib, move all your code in there, and then in your original project link in the new lib and remove all the cpp files - agree?

  10. Not sure what you are trying to accomplish with a static lib in your description. If you have some code that never changes, then it's reasonable to move it to a .lib. But unless you've gone to the work of making it a completely separate set of modules with no dependencies, you'll need to rebuild that library whenever a related header file changes, which defeats the purpose of the library - especially since the other downside of a .lib is that it breaks compile-while-debugging.

    For my products, we have a .lib that includes a lot of the common code, but we have eight products that rely on that code and the library is rebuilt very frequently.

  11. Just to echo your findings, I got a fast 128GB SSD for my dev machine and saw my build time drop from 30 mins to under 5 mins (large C++ soln with many small files so very IO bound). The impact this has on my productivity is enormous. Additionally many other slow tasks such as SVN checkouts and other operations, using email clients and even installing and launching applications is super charged. I've seldom been as impressed with a single hardware upgrade. I'll be recommending to my boss that every developer on my 30+ team gets one. It will pay for itself in a week.

    1. Replace SVN with Git and you'll go even faster.

    2. Anonymous,

      Let me explain to you what your recommendation would mean in a professional software development environment. I have nothing against Git, but you don't seem to grasp what you are proposing. Chris said he has 30 developers and a large project. That means that to transition to Git, he would have to retrain 30 people and rewrite all of the build scripts, release scripts, and test-driven development scripts. The total cost for a project of that size could easily exceed $100,000. Your proposed "improved productivity" would be completely overrun by the weeks or months of effort to retrain and redo scripts, and by the opportunity cost of having those developers reassigned for that time.

      Therefore, I have to say that switching to Git on this project is a pretty bad idea.

  12. One way to optimize your compilation time is to use one header file that includes all your header files, and do the same for cpp files (one cpp that includes all cpp files). Exclude all files from the project except these two files; it will be 20 times faster and it simplifies cross-platform development.

    1. Jamie,

      This is really, really bad advice. Your idea means that removing a single character from a file causes a full rebuild of your entire project. For example, on my project (a relatively modest 250,000 lines of code), that would mean that removing a comment would take a couple of minutes to rebuild instead of three seconds. Your strategy also means that compilation is handled by a single core instead of multiple cores. On my 8-core system, that would cause nearly an 8x increase in my build time.

      Finally, your idea of including multiple .cpp files together completely breaks the concept of compilation units. You lose any semblance of information hiding and coupling because all header files (and therefore all classes) are available in all files. This can lead to very sloppy development practices.

      I've worked with two libraries that use your strategy (UW IMAP and SQLite) and both of them use the technique solely to make distribution simpler, not to decrease build time.

      Bundling headers together can be a good idea - but the idea was implemented by Microsoft at least 15 years ago and it's called precompiled headers. Again though, the headers you select to be precompiled have to be very carefully chosen. Putting all headers into your precompiled header will cause horrible rebuild dependencies that will slow down your builds dramatically.

      Therefore, if you are seeing a "20x" improvement in your build times by following the strategy you described, then your project build strategy is badly designed and needs to be rethought from the ground up to depend on parallel builds and precompiled headers.

  13. Hey Jim,

    My personal project counts ~50,000 lines of code; with this method it takes 2 seconds to compile and link with Visual Studio and GCC. I don't need to maintain precompiled headers and I stopped playing with compiler options. The goal of this method is to minimize I/O access. At work we have 3.5 million lines of code, and we have a tool that builds some big files because some compilers don't support files with more than 65,536 lines of code.

    Parallel builds are a really good option, except when your compiler doesn't support the feature; don't forget that some of us work with really old devices or crappy compilers.

  14. Jamie,

    I've actually done quite a lot of work on cross platform and even some embedded work. You don't need your compiler to support parallel builds. Any time you are building on a Linux/UNIX/Windows system you can use parallel make instead. I was using pmake 20 years ago on a 4 processor Sun server. Builds were really, really fast :-)

  15. > Without question, your SSD is where Windows is installed

    I don't really understand why this is the case. We have a very large, complex C++ project that takes over an hour to build and produces about 20GB of object/pch/pdb files (we may have 4-5 branches of the same object code active at any one time), so we're looking at getting a 120GB+ SSD for each developer.

    It seems to make sense that the build artifacts (given there are so many) go on the SSD?

    1. Anonymous,

      You are correct, I was a little hasty in my answer. My original reasoning was that, if you are asking that question in the first place, then you are probably on a tight budget, and therefore your project probably isn't very big. In that case, the productivity gains of Windows and Visual Studio running faster from the SSD are going to outweigh the gains from building from the SSD. (I found the productivity gains from having Windows on SSD to be substantial. Reboots happen in seconds instead of minutes. Word, Excel and OneNote start instantly.)

      For a "well financed" project, it never occurred to me that this question would even be asked because Windows, Program Files, and the dev project should all be on the SSD. Move everything else (Documents, Pictures, Videos, Music) to rotating platter. This is how my own dev system is organized.

      For a large project such as yours, I agree, if you have to choose then use the SSD for your build directory.

  16. Thanks for that, mate. I'd love to have the SSD as the main drive (with Windows et al. on it) but it is still cost prohibitive in large sizes. We were looking to go for a 500GB main drive (Windows, dev tools, etc.) and a 100GB SSD 'build' drive, but I'd heard so many varying views regarding dev setup with an SSD I was very confused :-)

    1. Make sure your TEMP directory is on the SSD too. Also, you can move Visual Studio onto the SSD by using a junction link in Program Files.
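
      Roughly, from an elevated prompt; S: stands in for whatever drive letter your SSD gets, and the Visual Studio path shown is just the default VS 2010 location:

          rem S: is a placeholder for the SSD's drive letter.
          mkdir S:\Temp
          setx TEMP S:\Temp
          setx TMP S:\Temp
          rem Move Visual Studio onto the SSD and leave a junction at its old Program Files path.
          robocopy "C:\Program Files (x86)\Microsoft Visual Studio 10.0" "S:\Microsoft Visual Studio 10.0" /E /MOVE
          rmdir "C:\Program Files (x86)\Microsoft Visual Studio 10.0"
          mklink /J "C:\Program Files (x86)\Microsoft Visual Studio 10.0" "S:\Microsoft Visual Studio 10.0"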

  17. That's a great idea. I would never have thought to put the TEMP folder on the SSD - but of course that's where the compiler/linker puts all the intermediary stuff so it would make a significant improvement!

  18. Hi Jim,

    After doing some reading about SSD, I read that some had a short life expectancy.

    What's your view on that?
    And in your experience how long does a SSD last when used for the operating system and for compiling?

    Cheers,
    Franck

  19. Short is relative. Yes, it will probably not last ten years if you beat on it doing builds all day. But who cares? In the last 24 months, prices have dropped by 50% and will continue to fall. Throw it away every two years and get a new one.

    The fast builds on that drive save me a couple of hours per week of wasted time. That paid for my SSD within the first month. After that, it's just gravy.

    Also, SSDs usually come with reserve space. If you buy a 120GB drive (for example), it's really a 128GB drive with 8GB in reserve. A 100GB drive is really a 128GB drive with 28GB in reserve. All of that extra space is used all of the time for wear leveling; it isn't held back until bad sectors appear, the way spare sectors are on a spinning platter.

    My SSD is 90% full (which is bad for the wear leveling) and it has been used for builds writing gigabytes of data every day. After two years, I've had zero issues. So my advice is, don't worry about it. (I'm running an OCZ Vertex 2 120GB.) Just make sure you get an SSD with good reviews on Newegg. I'm probably going to upgrade to an OCZ Vertex 4 in the near future.

  20. Hello Jim

    I have just ordered my first SSD (an OCZ Agility 4, 256GB) in order to reduce C++ compile/link times. I was wondering how to organize my projects and was delighted to come across your post.

    Question 1: I was unable to find part 2 of this post. Could you provide a reference to it?

    Question 2: I found another article which suggests partitioning the SSD for backup purposes. As I have no experience working with SSDs, in your opinion, is there any reason I should not partition the SSD?

    Thanks,

    Ian




    1. Ian,

      I think you'll find answers to some of your questions in the comments for this article. Sorry about Part II, it was never written. However, the article would have been about how to set up MSBUILD, not how to set up your system.

      I followed that "partitioning" strategy for years on rotating platters and finally gave up on it: I kept filling up one of the partitions while the other was half empty, and I'd then waste hours resizing partitions. Total waste of time.

      The real question for SSD devices is how to back them up. You absolutely must back them up, because when they fail, they often fail completely.

      I have other blog entries on backups, but I back up my SSD every night to a separate computer. By using Gigabit Ethernet and Windows Server, I can back the whole thing up and verify it in less than 30 minutes. (Usually less than ten minutes for incremental backups.) If you use a consumer-level NAS, it will take longer, but it will still easily back up while you sleep.
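
      I won't repeat the exact scripts here, but a nightly mirror of a source tree to a server share can be as simple as this sketch; the server name, share name, and folders are made up:

          rem \\backupserver\jimb-backup is a placeholder share, not a real machine name.
          rem /MIR mirrors the tree, deleting server-side files that no longer exist locally.
          robocopy C:\Projects \\backupserver\jimb-backup\Projects /MIR /R:1 /W:1 /LOG:C:\backup.log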


  21. Hi Jim,

    Thank you for your great article. I read your recent answer about the life expectancy of SSDs. We might want to try out the OCZ Vertex 4 in the near future. I just want to know if you and I generate the same order of magnitude of data in a typical day.

    We use IncrediBuild, so at any time (day and night) our workstations are building for our coworkers/CI/night-build workstations. This generates an impressive amount of file I/O (at least a couple of tens of GB?).

    When we build our source tree from scratch (Debug and Release x64), we generate about 50 GB of files (obj, pdb, lib, dll, exe, redistributables). This does not count the tons of temporary files our friend MSBuild might have generated in the %TEMP% dir.

    So a rough estimate might point to more than 200 GB of I/O per day. Do you think it is realistic to use an SSD as our main hard drive, with the OS, programs, temp, IncrediBuild temp, and our sandboxes on it? If we do so with the amount of I/O we generate, will it last at least 2-3 years?

    We just want to avoid buying drives that then blow up within the first 6 months. If that happens for about 50 developers, it starts to cause bigger problems than build times.

    Thanks a lot again for the article!
    Frédérick

  22. Hi Frédérick,

    Here are my thoughts. Given that you've bought IncrediBuild, your company certainly sees the value of faster builds. For 50 developers, every wasted hour costs you a man-week of salary. So my guess is that you could afford to replace every SSD monthly and you would still come out way ahead on the ROI.

    For the least downtime, my solution would be to have two SSDs per workstation, one for Windows and Program Files (none of which change much) and a second drive devoted to build files. If the second drive dies, pull it out, slap in a new one, and you are ready to go. Less than five minutes of downtime.

    I also wonder if IncrediBuild can handle failed hard drives and drop the machine from the cluster. When using dozens of computers, failed drives are a certainty and I'd expect IncrediBuild to have a solution.

    Hope this helps.

  23. A SandForce-based SSD?
    Sir, are you trolling us? That's one of the least stable controllers ever made.

    If you take an SSD, you pretty much have to choose between these 2:
    - Samsung 830
    - Crucial M4

    Not only are these way more stable, they also get better speeds than SandForce SSDs (actual benchmarked speed, not box speed).

    1. Anonymous, I think you'll find that the article was written in 2010. Both of the controllers you described came out in 2011. As I haven't used either the Samsung or the Crucial controller, I'll leave it to readers to do their own research.

    2. Follow-up: I bought the Samsung 840 (250GB) to replace my OCZ Vertex and I've been very happy with it.

  24. Hi Jim,

    I just wanted to give the community a heads-up on how we are doing right now. We spoke with the OCZ guys and they recommended that we try out the Deneva 2 drives. They estimate that with an average of 100 GB of I/O per day, the drive will last 5 years, since it is an enterprise-grade part. We will try this out on one of our workstations and will come back with the results sometime in January.

    Best Regards,
    Frédérick

  25. Hi Jim,

    We did some benchmark tests with an OCZ Deneva 2 and the results are impressive:

    Parallel linking of multiple projects with IncrediBuild is no longer an issue. We used to see link times of about 5-10 minutes when linking 3 to 4 huge applications at the same time. We now see link times of about a minute when linking in parallel, which is exactly the same as if we linked each application on its own. This is mainly due to the massive IOPS of the SSD. In the end, we observe a gain of 40% to 50% on a complete build (from scratch) with IB.

    Also, the "file cache effect" (when files are already in RAM) is much less visible when compiling and linking. In fact, we now see a difference in link time of only about 10% when the required files (.lib, .obj, etc.) are not in cache. This is great because we used to see a link time difference of more than 100% with a standard hard drive.

    Thanks again for your article,
    Frédérick

  26. Hi Jim. Can you please link me to the part of this blog where I can set up the details explained in this article? Thanks.

  27. Part II was never written. Please see other comments. It was intended to describe MSBUILD, not system configuration, so it probably wouldn't have helped anyway.

  28. I made lots of tests myself. In my scenario the best setup is one SandForce SSD as drive C:, one WD Black 1 TB partitioned as E: (30 GB as the *first* partition) / D: (the rest), and one WD Black 1 TB with 3 or 4 partitions (T:, P:, R: and S:).
    The partition functions are based on SVN concepts, and are as follows:
    - C: Windows + progs
    - D: My Documents and all general-purpose data
    - E: Temp drive (Must set TMP and TEMP environment to reference it)
    - P: Production code + installers, etc
    - R: Branch partition
    - S: Stable partition
    - T: Trunk partition (this should be the first physical partition on the disk)
    Regarding the CPU, AMD CPUs gain more benefit from the /MP switch, in addition to parallel project builds. For example, comparing a Phenom X6 1090T vs. an i7 920 (considered 'equivalent' according to benchmark sites), I obtained 9:30 vs. 13:00 build times! This is for a huge C++ project. An FX-8350 will perform the same build in 6:30, with the same disk setup.

    The main goal of this setup is to split the OS from the Temp and solution files, giving more parallel disk access. I tried to use the SSD as the TEMP drive, but the WD Black seems to work better with its huge cache (static RAM has outperformed flash from the very beginning). But this may change with faster SSD drives.

    Mathias
