Here is the head-to-head comparison on my new Xyzzy-built Broadwell (i3) NUC. Both programs were run 4-threaded on the system's 2 physical cores, since that setup gives the best per-iteration timing for both programs on this machine. These timings and ratios can be compared to the Haswell ones in the above post:
Code:
FFTlen   Prime95    Mlucas     Timing Ratio
(Kdbl)   msec/iter  msec/iter  [Mlucas/P95]  Comments
------   ---------  ---------  ------------  ------------
  1024       3.894      6.869      1.76
  1152       4.634      8.294      1.79
  1280       4.990      8.702      1.74
  1408       5.502     10.118      1.84      [Prime95 1440K]
  1536       6.203     10.298      1.66
  1664       6.506     11.562      1.78      [Prime95: average of the 1600K and 1728K timings]
  1792       7.473     11.904      1.59
  1920       7.843     13.186      1.68
  2048       7.898     13.946      1.77
  2304       8.889     15.846      1.78
  2560       9.930     17.281      1.74
  2816      11.369     19.931      1.75      [Prime95 2880K]
  3072      12.465     22.373      1.79
  3328      13.688     23.541      1.72      [Prime95 3360K]
  3584      14.567     25.318      1.74
  3840      16.079     27.987      1.74
  4096      16.917     29.488      1.74
  4608      19.762     34.077      1.72
  5120      21.736     37.573      1.73
  5632      25.657     43.197      1.68      [Prime95 5760K]
  6144      26.867     50.179      1.87
  6656      30.958     51.091      1.65      [Prime95 6720K]
  7168      32.399     54.929      1.70
  7680      34.025     60.411      1.78
  8192      34.791     65.911      1.89
Avg:                               1.75
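
The ratio column is just the Mlucas msec/iter divided by the Prime95 msec/iter at each FFT length, rounded to two places. A minimal Python sketch of that arithmetic follows, using four rows copied from the table. The names `ROWS`, `timing_ratios` and `mean_ratio` are made up for illustration and are not part of either program:

```python
# Each entry: (FFT length in Kdoubles, Prime95 msec/iter, Mlucas msec/iter).
# Only four rows from the table are copied here to keep the sketch short.
ROWS = [
    (1024, 3.894, 6.869),
    (2048, 7.898, 13.946),
    (4096, 16.917, 29.488),
    (8192, 34.791, 65.911),
]


def timing_ratios(rows):
    """Return (FFT length, Mlucas/Prime95 ratio rounded to 2 places) per row."""
    return [(fft, round(mlucas / p95, 2)) for fft, p95, mlucas in rows]


def mean_ratio(rows):
    """Arithmetic mean of the unrounded Mlucas/Prime95 ratios."""
    return sum(mlucas / p95 for _, p95, mlucas in rows) / len(rows)


if __name__ == "__main__":
    for fft, ratio in timing_ratios(ROWS):
        print(f"{fft:>6}  {ratio:.2f}")
    print(f"mean over {len(ROWS)} rows: {mean_ratio(ROWS):.2f}")
```

Because only a subset of rows is included, the mean printed here will not match the full-table average; pasting in all 25 rows would be needed to recheck that figure.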