I saw a few website APIs that use a checksum as a GET parameter.

For example: http://api.test.com/get/?people=fathers&view=all&hash={28 character Base64-like string}.

The hash depends on all of the GET parameters: changing any of them, or adding new ones, would alter the hash.

I was wondering what kind of hashing such websites use. I tried decoding the Base64 and got gibberish text with weird characters I've never seen before.

They probably do use Base64 because if I supply an invalid Base64 hash then it throws a Base64 error.

But why do gibberish characters come out when I decode it? Do I have to decode those characters further?

Joan Clark
  • "when I decode it, gibberish characters come out?" You can Base64-encode arbitrary data. To represent, say, an MD5 hash safely in a URL, you could display the bytes as ASCII letters and numbers, but you could also Base64-encode it. If you Base64-decode the sequence to ASCII, you may or may not get printable characters. – Arminius Oct 25 '18 at 15:59
  • Possibly. The Base64 could just be the encoded value of ciphertext. Some of the other facets of the URI might be the key and initialization vector of the ciphertext. Or... they could be encoded themselves or represent values that translate to something on server-side: e.g. fathers=1g*D43!#$%asd but is only exposed on the server level. – thepip3r Oct 25 '18 at 15:59
  • Ah, both make sense. @thepip3r that's probably what they do. After decoding the Base64, I get some random Chinese characters as the "ciphertext" and a weird umbrella character. I'll probably never be able to work out how to decode that strange ciphertext. – Joan Clark Oct 25 '18 at 16:02
  • This is kind of the heart of reverse engineering and/or cracking. Understand what's possible so you can try and piece the puzzle back together. You'd need to try varying techniques to try and continue to unwrap the values (that might include Google translate from Chinese, Japanese, Korean) until you get what you believe is the underlying plaintext. – thepip3r Oct 25 '18 at 16:14

1 Answer

Most likely the checksum is just that: a check to make sure that none of the parameters changed in transit. The simplest way to do something like that is with a SHA-256 (or even MD5) hash of the URL parameters. For this purpose even MD5 can be a reasonable choice of hashing algorithm (we're not talking about passwords here). A hashing algorithm is used because changing a single character changes the output hash, and (even with MD5) it is difficult to find another set of inputs that matches a given output.
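As a rough illustration of the idea, here is a minimal Python sketch. The canonical form (sorted, URL-encoded parameters), the use of SHA-256, and the helper names `checksum` and `verify` are all assumptions made for the example; the real API may build and check its hash differently.

```python
import base64
import hashlib
from urllib.parse import urlencode


def checksum(params):
    """Hash the parameters and return the digest as URL-safe Base64 text."""
    # Assumption: parameters are sorted and URL-encoded before hashing.
    canonical = urlencode(sorted(params.items()))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def verify(params, received_hash):
    """Recompute the hash on the receiving side and compare."""
    return checksum(params) == received_hash


original = {"people": "fathers", "view": "all"}
h = checksum(original)

# Same parameters: the recomputed hash matches.
print(verify(original, h))

# One changed value: the recomputed hash no longer matches.
tampered = {"people": "mothers", "view": "all"}
print(verify(tampered, h))
```

The receiver never needs to "decode" the hash; it only recomputes it from the parameters it received and checks for equality.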

With a hash function you will absolutely get gibberish if you Base64-decode the result. The hash function effectively generates "random" bits, so there is only a small chance that the raw binary data happens to consist of printable ASCII characters. Instead you get gibberish. That's okay though, because the exact value of the hash doesn't matter, only that whoever receives the GET data on the other end and hashes it gets exactly the same hash as the one passed along in the request.
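To see why decoding produces gibberish, the sketch below hashes a hypothetical query string, Base64-encodes the digest, and decodes it again. SHA-1 is used purely as an assumption for illustration (its 20-byte digest happens to encode to 28 characters); nothing here confirms which algorithm the API actually uses.

```python
import base64
import hashlib

# Hypothetical query string; the real API's input to the hash is unknown.
query = "people=fathers&view=all"

digest = hashlib.sha1(query.encode("utf-8")).digest()  # 20 raw bytes
token = base64.b64encode(digest).decode("ascii")       # Base64 text for the URL

# Decoding the Base64 gives back the raw digest bytes, not readable text.
raw = base64.b64decode(token)
print(len(token), len(raw))

# Hex is the sensible way to look at a digest.
print(raw.hex())

# Forcing the bytes through a text decoding is what yields odd symbols;
# UTF-16 in particular tends to land on CJK code points.
print(raw.decode("utf-16-le", errors="replace"))
```

There is no further layer to peel off: the decoded bytes are the hash itself, and they are best inspected as hex rather than as text.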

I'm not sure off the top of my head which hash function would produce a 28-character string when Base64-encoded (MD5 produces 22 or 24 characters, depending on whether the padding is kept). It could be a custom hash. It could be something else altogether, but my best guess is that it is just the output of some kind of hashing algorithm.
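One way to narrow the candidates is to compare Base64 lengths for a few common digests. The sketch below is only a length comparison; the helper name `b64_len` and the choice of algorithms are assumptions, and a matching length does not prove which function a site uses.

```python
import base64
import hashlib


def b64_len(algorithm):
    """Return (digest bytes, Base64 length with padding, length without)."""
    digest = hashlib.new(algorithm, b"example").digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return len(digest), len(encoded), len(encoded.rstrip("="))


for name in ("md5", "sha1", "sha256"):
    print(name, b64_len(name))
```

Of these three, only the 20-byte SHA-1 digest encodes to 28 characters with padding; other 20-byte outputs would give the same length.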

Conor Mancone
  • 28 ÷ 4 × 3 = 21, so depending on the padding (=) at the end it's 21, 20, or 19 octets. 20 is popular, including at least SHA-1 and RIPEMD-160, or an HMAC of either of those. – dave_thompson_085 Oct 26 '18 at 02:17