Hash
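A short, fixed-length sequence of letters and numbers (usually hexadecimal digits) that a hash function calculates from data and that acts as a fingerprint of that data. If the hashes of two pieces of data match, the data are almost certainly the same.
When two different pieces of data have the same hash, there is a Hash Collision. This is not desired and, by accident, happens very rarely.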
Example
Taking as data the contents of the _ReadMe_.txt file that was in Pleasuredome torrents, its calculated hashes are:
Hash Function | Hash Value of the data |
---|---|
CRC32 | 5fb670d2 |
MD5 | 5b471f252ab111d415357f00469cfb43 |
SHA-1 | e2c1eff8c010a4f3fc0d97df559c2dd2c5b8667e |
In theory, if another file with the same hash values listed above is found, the contents of the two files are the same.
Of the hash functions in the table above, SHA-1 has the lowest probability of a Hash Collision because it produces the longest value (160 bits, against 128 bits for MD5 and 32 bits for CRC32). CRC32 has the highest.
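As a rough illustration, the three kinds of hash in the table above can be calculated with Python's standard `zlib` and `hashlib` modules. This is only a minimal sketch; the function name `file_hashes` and its `chunk_size` parameter are made up for this example and are not part of any ROM manager.

```python
import hashlib
import zlib


def file_hashes(path, chunk_size=65536):
    """Return the CRC32, MD5 and SHA-1 of a file as lowercase hex strings.

    Hypothetical helper for illustration only.
    """
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
            sha1.update(chunk)
    return {
        "crc32": f"{crc & 0xFFFFFFFF:08x}",  # always 8 hex digits
        "md5": md5.hexdigest(),              # 32 hex digits
        "sha1": sha1.hexdigest(),            # 40 hex digits
    }
```

Calling `file_hashes("_ReadMe_.txt")` on a copy of the file returns a dictionary with one entry per hash function, which can then be compared with the values in the table.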
Size comparison
ROM Managers compare information (name, size and hashes of files) on a dat-file to files stored on a computer. For each file to be identified, all given information must match.
Computationally, comparing the sizes of files is much faster than calculating and comparing the hashes of files. So, in order to gain speed, ROM managers compare the file sizes first, and only if they match, the hashes of the files on the computer are calculated and compared.
If two files have the same hash, their content is the same and consequently their sizes are equal. In this situation, the hashes alone are enough to identify a file. In fact, size information isn't obligatory in dat-files. If size information is absent, the hashes of the files on the computer always have to be calculated.
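The size-first check described in this section could be sketched as follows. This is an assumption-laden illustration, not the code of any actual ROM manager: the `entry` dictionary (with a `crc` key and an optional `size` key) and the function names are hypothetical, and only CRC32 is compared to keep the example short.

```python
import os
import zlib


def crc32_of(path, chunk_size=65536):
    """Return the CRC32 of a file as an 8-digit lowercase hex string."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"


def matches_entry(path, entry):
    """Check a file against a hypothetical dat-file entry.

    `entry` is assumed to look like {"size": 9, "crc": "cbf43926"};
    the "size" key may be missing.
    """
    expected_size = entry.get("size")
    if expected_size is not None and os.path.getsize(path) != expected_size:
        # Cheap check failed: the file cannot match, so skip hashing.
        return False
    # Sizes match (or no size was given): the hash must be calculated.
    return crc32_of(path) == entry["crc"].lower()
```

When the entry has no size, the early return is never taken and the file is always hashed, which is the slower case mentioned above.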
Additional information
Usually, compression software (such as 7-Zip or WinRAR) is able to show the calculated CRC32 of each file inside an archive. See the image below for an example:
To calculate the hashes of a file or archive, read the How to hash a file with clrmamepro guide.
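For ZIP archives specifically, the CRC32 values stored inside the archive can also be read without extracting anything, using Python's standard `zipfile` module. The sketch below assumes a plain ZIP file (7z and RAR archives are not supported by this module), and the function name `stored_crc32s` is made up for this example.

```python
import zipfile


def stored_crc32s(archive_path):
    """Map each file name in a ZIP archive to its stored CRC32 (hex string)."""
    with zipfile.ZipFile(archive_path) as archive:
        return {
            info.filename: f"{info.CRC:08x}"  # CRC32 of the uncompressed data
            for info in archive.infolist()
            if not info.is_dir()              # skip directory entries
        }
```

The values come from the archive's own headers, so they are the same ones that compression software displays.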
See also