
A lot of sites offer MD5 or SHA sums to verify the validity of your download, but why do some things rely almost entirely on this?

Is there anything in place to prevent people from just replacing the checksum with the malicious binary's checksum?

4 Answers


Your problem is a common problem in security known as authentication.

A checksum solves the problem of integrity: you know that what you just downloaded arrived intact, but you do not know who it came from.

If someone is carrying out a MitM attack, they will obviously try to replace the checksum of whatever download you requested with the checksum of the binary they want you to have instead.

If you want to assess authenticity (prove that the binary was issued by someone you trust), then you need an authentication method such as a digital signature.

The typical process goes this way:

  • the original owner issues a public key, which he shares for authentication purposes (see asymmetric cryptosystems)
  • the original owner computes the hash of the document he wants to sign
  • the owner signs the hash with his private key, so that anyone can verify it with the public key
  • the client gets the document
  • the client gets the public key of the original owner and validates it through a chain-of-trust protocol (see the relevant documentation)
  • the client verifies that the signature on the hash is correct
  • the client verifies that the checksum of the program matches

If you follow such a process, you cannot be subject to a MitM attack. The key point of the protocol here is obviously the verification of the public key. If you fail to verify that the public key really belongs to the owner, you can be deceived.
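To make the "client verifies" steps concrete, here is a minimal sketch in Python using the third-party cryptography package. It assumes an RSA key with PKCS#1 v1.5 padding and SHA-256, and the file names are placeholders; real projects document their own signing scheme, so check theirs before adapting this.

```python
# Sketch: verify a detached RSA signature over a downloaded file.
# Key format (PEM), padding/hash choices, and file names are assumptions for illustration.
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.exceptions import InvalidSignature

with open("publisher_pubkey.pem", "rb") as f:
    public_key = serialization.load_pem_public_key(f.read())  # the owner's public key, validated out of band

with open("download.bin", "rb") as f:
    data = f.read()  # the document the client received

with open("download.bin.sig", "rb") as f:
    signature = f.read()  # the signature published by the owner

try:
    # verify() hashes the data itself and checks the signature against that hash
    public_key.verify(signature, data, padding.PKCS1v15(), hashes.SHA256())
    print("Signature valid: the file was signed by the holder of this key")
except InvalidSignature:
    print("Signature INVALID: do not trust this download")
```

Note that the sketch only proves the file was signed by whoever holds the private key; trusting that key is exactly the chain-of-trust step mentioned above.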

M'vy

In the most simple and insecure case, both the file and the checksum are served via a plaintext protocol like HTTP or FTP. In this case, no, there is nothing to prevent Man-in-the-Middle modification of both the file and the checksum.

Your first question, though, was, "Why do some processes rely on published checksums?" The answer is that, if properly authenticated, checksums provide integrity protection. That is to say, if I can guarantee the integrity and authenticity of the checksum, I can verify the integrity of the downloaded file.

This may seem redundant: if I have some method of verifying the authenticity and integrity of the checksum, why not apply the same method to the file itself? The answer is that the checksum is not primarily intended to address malicious tampering; it is intended to detect accidental data corruption. If you have a network connection, disk drive, or RAM module that introduces errors at a rate of 1 bit in every 10 MB, then the odds are very good that a 50 MB download will contain an error, but quite low that the 20-byte checksum will.
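As a concrete illustration of that integrity check, a few lines of Python with the standard hashlib module are enough. The file name and the "published" digest below are placeholders:

```python
import hashlib

# Digest published by the provider (placeholder value), ideally obtained
# over an authenticated channel such as HTTPS.
PUBLISHED_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# Hash the downloaded file in chunks so large downloads need not fit in RAM.
h = hashlib.sha256()
with open("download.iso", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

if h.hexdigest() == PUBLISHED_SHA256:
    print("Checksum matches: no corruption detected")
else:
    print("Checksum mismatch: the download is corrupted (or has been tampered with)")
```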

This is all well and good, but then how can we protect against malicious tampering? Here are some solutions:

  1. Provide the checksum over an authenticated and integrity-protected channel. The most common solution here is HTTPS, in which TLS provides authentication, integrity, and encryption. The data provider can (and should) double down and provide the file over this channel as well.

  2. Provide a cryptographic signature of the file. Instead of just providing integrity protection, this method also provides authenticity, but requires a little more work on the part of the downloader, who must either have or securely obtain the provider's public key in order to verify the signature. The same underlying principle is used to provide integrity protection within TLS, but a separate signature for the file relies on a different key distribution channel which may or may not be harder for the attacker to corrupt.

These methods can and should be combined, since there will be some users who distrust the protections of TLS, and others who cannot be bothered to verify a cryptographic signature or who, for various reasons, can only support basic checksumming.
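A minimal sketch of option 1, using only the Python standard library: both the file and its checksum are fetched over HTTPS, so TLS supplies the authentication and integrity that a bare checksum lacks. The URLs are hypothetical, and the checksum file is assumed to use the common sha256sum "digest  filename" format:

```python
import hashlib
import urllib.request

# Hypothetical URLs for a provider that serves both artifacts over HTTPS.
FILE_URL = "https://example.org/tool-1.0.tar.gz"
SUM_URL = "https://example.org/tool-1.0.tar.gz.sha256"

data = urllib.request.urlopen(FILE_URL).read()
# Assumed "digest  filename" layout; take the first whitespace-separated token.
published = urllib.request.urlopen(SUM_URL).read().decode().split()[0]

if hashlib.sha256(data).hexdigest() == published:
    print("OK: file matches the checksum served over HTTPS")
else:
    print("MISMATCH: discard the download")
```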

Of course, all of this raises the question of whether the file can be corrupted in such a way that the checksum is still valid. This is called a preimage attack, and MD5 has been shown to be theoretically vulnerable. You should always use the most secure hash function available: SHA-2 and SHA-3 are good choices; MD5 and SHA-1 are more risky.

bonsaiviking
  • the last paragraph is incorrect: MD5 is vulnerable to a collision attack (an attacker can produce two different files with the same hash with less than the theoretical 2^64 complexity, but they must be the producer of both files). MD5 is not vulnerable to a second-preimage attack (modify input, keep hash) https://crypto.stackexchange.com/questions/3441/is-a-second-preimage-attack-on-md5-feasible – Marek May 30 '18 at 08:19
  • @Marek Thanks for the clarification. The paragraph as written is technically correct: Aoki & Sasaki demonstrated a preimage attack with complexity of 2^123.4 in 2009. This is still beyond practical limits, but it demonstrates the general weakness of the algorithm and why it should be replaced by better ones when possible. – bonsaiviking May 30 '18 at 16:15

Files can get downloaded and passed around. At some point, they can be modified maliciously.

Posting the checksum on a site provides a second vector of verification. For a bad actor to succeed, they would also have to compromise the vendor's website and modify the published hash (checksum).

schroeder

No.

Checksums are only there to verify your download, in order to detect transfer errors.

Of course, a man in the middle could present a fake directory, with infected binary files and correctly recreated checksum lists.

The way to prevent this is to sign your checksums with PGP or S/MIME.

Alternatively, you could checksum your checksum list and send the result through a secure channel (real paper mail, phone, or another encrypted and/or non-internet channel).