Well that blew up, huh? If you follow emulation or just gaming on the whole, you've probably heard about the controversy around the Dolphin Steam release and the Wii Common Key. A lot of conclusions have been drawn, and while we've wanted to defend ourselves, we thought it would be prudent to contact lawyers first to make sure that our understanding of the situation was legally sound. That took some time, which was frustrating for us and for our users, but now we are educated and ready to give an informed response.
They have a fixed-size output, yes. That output is almost always far smaller than the inputs the hash supports. The fact that hashes also accept shorter inputs only increases the number of possible inputs, because those come in addition to all the full-length messages. The point is that the input space is a fuckton of orders of magnitude larger than the output space, which means collisions are unconditionally guaranteed to exist.
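To make the pigeonhole argument concrete, here's a rough sketch. `tiny_hash` is a made-up toy (SHA-256 truncated to 16 bits), not anything a real system should use. With only 65,536 possible outputs, any 65,537 distinct inputs must contain a collision, and in practice one shows up much sooner.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> bytes:
    """Toy hash (assumption for illustration): first 2 bytes of SHA-256,
    so there are only 2**16 = 65,536 possible outputs."""
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Hash the strings "0", "1", "2", ... until two share an output."""
    seen = {}  # maps hash output -> first message that produced it
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg

a, b, h = find_collision()
print(f"{a!r} and {b!r} both hash to {h.hex()}")
```

The loop can't run past input number 65,536, because by then there are more inputs than possible outputs. A full 256-bit hash has the same property, you just can't afford the search.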
Half your points are specific to a cryptographic hash, which isn’t the only kind of hash or the only useful kind of hash, but since that’s what you’re talking about, fine.
Collisions existing is normal. All you can do is make sure that finding a collision isn’t easier than finding the actual input (for a password application), and that finding a collision from a modified input is hard (for a checksum). The collisions still exist. In some applications of hashing, e.g. semantic hashing, collisions for similar inputs are desirable.
Yes, this is the point of a hash, but it’s not hard to do.
Again, same thing. Deterministic code isn’t that hard to do.
Preventing predictability is the only point of a cryptographic hash (besides being deliberately heavy to prevent brute force). If there aren’t systematic flaws that make the distribution of outputs distinguishable from randomness, your cryptographic hash is doing its job.
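A quick way to see both properties, as a sketch rather than a real statistical test: hash the same message twice to check determinism, then flip a single input bit and count how many of the 256 output bits change. The message and the `bit_diff` helper are just illustrative choices. For a hash that looks random you'd expect roughly half the bits to flip.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = bytearray(b"hello world")
h1 = hashlib.sha256(bytes(msg)).digest()

# Determinism: hashing the same input again gives the same digest.
same_again = hashlib.sha256(b"hello world").digest() == h1

# Flip the lowest bit of the first byte and hash again.
msg[0] ^= 1
h2 = hashlib.sha256(bytes(msg)).digest()

changed = bit_diff(h1, h2)
print(f"deterministic: {same_again}, output bits changed: {changed} of 256")
```

One sample proves nothing about the hash, of course. It just shows what "no visible relationship between similar inputs" looks like in practice.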