Vector addition benchmark in C, C++ and Fortran
The inner loops being benchmarked do different things in the different versions.
In the C++ version, c[] is hot loaded into the cache outside of the timing loop by: vector<double> c[M];
In the C version, that first touch of c[] happens inside the timing loop. Thus the C version is doing more work inside the timed region.
Try timing the same region in both versions, or pre-initialize the c[] array in the C version as the C++ version does, and you should see the same performance.
In both cases, C/C++ aliasing analysis can determine that the arrays do not overlap, so the same optimized loop can be used, i.e. the Fortran, C and C++ versions are all on a level playing field for this example.
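As a rough sketch of the pre-initialization suggested above, the C version could write to c[] once before the timer starts. The names vec_add, touch, time_vec_add and the pretouch_c flag are made up for illustration and are not from the original benchmark. clock() is used here only as a simple portable timer, which may differ from what the original code uses.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: the element-wise sum being benchmarked. */
static void vec_add(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Hypothetical helper: write every element once, so the first touch
 * of the array happens before the timed region. */
static void touch(double *v, size_t n, double value)
{
    for (size_t i = 0; i < n; ++i)
        v[i] = value;
}

/* Hypothetical driver: returns the CPU time in seconds for `reps` calls
 * to vec_add on arrays of length n (n must be > 0), or -1.0 if an
 * allocation fails. If pretouch_c is nonzero, c is written once before
 * the timer starts. The last element of c is stored in *check so the
 * result of the loop is actually used. */
static double time_vec_add(size_t n, int reps, int pretouch_c, double *check)
{
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) {
        free(a);
        free(b);
        free(c);
        return -1.0;
    }

    touch(a, n, 1.0);
    touch(b, n, 2.0);
    if (pretouch_c)
        touch(c, n, 0.0);   /* first touch of c happens outside the timing */

    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        vec_add(a, b, c, n);
    clock_t stop = clock();

    *check = c[n - 1];
    free(a);
    free(b);
    free(c);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_vec_add once with pretouch_c set to 0 and once with it set to 1, for the same n and reps, gives the two timings to compare. Any difference between them is the cost of first touching c[], not of the addition loop itself.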