A response to the blog post "{n} times faster than C". Our final program achieved a speedup of 128x (36 GiB/s throughput) by reformulating the problem and leveraging SIMD intrinsics.
It would be great if they also measured opt1_idiomatic with _ => unreachable!() as the fallback arm. In theory the compiler could optimize that better than _ => 0.
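
For concreteness, here is a rough sketch of the two variants being compared. It assumes opt1_idiomatic is a byte-wise match that adds 1 for 's' and subtracts 1 for 'p'; that shape, and the function names below, are assumptions for illustration, not code copied from the post. Note that unreachable!() is a checked panic, so the compiler still has to keep a branch to the panic path; the unsafe std::hint::unreachable_unchecked is the form that actually lets the optimizer assume the arm never runs.

```rust
// Assumed baseline: any byte other than b's' or b'p' contributes 0.
pub fn count_default_zero(input: &str) -> i64 {
    input
        .bytes()
        .map(|b| match b {
            b's' => 1_i64,
            b'p' => -1,
            _ => 0,
        })
        .sum()
}

// Variant suggested in the comment: any other byte panics at runtime.
// This is only equivalent to the baseline if the input is guaranteed
// to contain nothing but b's' and b'p'.
pub fn count_unreachable(input: &str) -> i64 {
    input
        .bytes()
        .map(|b| match b {
            b's' => 1_i64,
            b'p' => -1,
            _ => unreachable!(),
        })
        .sum()
}
```

The two functions agree only on inputs made up entirely of 's' and 'p', so any benchmark comparing them would need to restrict the input accordingly.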