I recently got an 8-core CPU (AMD FX-8350, 4 GHz) and ran some tests to see how many parallel compilation jobs gave the fastest build. The tests compiled "The Powder Toy" (it uses SCons, but its -j option does the same thing as make's).
-j1 (one core only) takes 97.877s
-j4 (number of "modules") takes 27.201s
-j8 (number of cores) takes 19.586s
-j9 (cores+1) takes 19.463s
-j10 (cores+2) takes 19.583s
-j16 (cores*2) takes 19.861s
As you can see, cores+1 was the fastest on this computer, but not by much: every run from -j8 to -j16 finished within about 0.4s of the others.
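To put the numbers in perspective, here is a small Python sketch that turns the timings above into speedups relative to the -j1 run. The times are copied from my measurements; the names `timings`, `speedups` and `fastest` are just made up for this example, and a single run per setting is assumed, so small differences could easily be noise.

```python
# Wall-clock build times in seconds, keyed by the -j value used.
# Copied from the measurements above (one run per setting).
timings = {1: 97.877, 4: 27.201, 8: 19.586, 9: 19.463, 10: 19.583, 16: 19.861}


def speedups(times):
    """Return {jobs: speedup factor relative to the -j1 run}."""
    baseline = times[1]
    return {jobs: baseline / seconds for jobs, seconds in times.items()}


def fastest(times):
    """Return the -j value with the shortest build time."""
    return min(times, key=times.get)


if __name__ == "__main__":
    for jobs, factor in speedups(timings).items():
        print(f"-j{jobs}: {timings[jobs]:.3f}s, {factor:.2f}x faster than -j1")
    print(f"fastest setting: -j{fastest(timings)}")
```

Running it shows that the speedup levels off at roughly 5x from -j8 onward, so adding more jobs than cores makes very little difference on this machine.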