2

I'm working on a small game website and don't expect too much traffic.

Would md5(time . rand) suffice to have a "random enough" identifier for a game?

(Or is it easy to get a clash that way?)

U. Windl
SandTh
  • 2
    secure enough for ... what? session info? UID generation of diff accounts? https://datatracker.ietf.org/doc/html/rfc4122 – CaffeineAddiction Aug 25 '21 at 17:22
  • session info (the hash is needed just for as long as the game lasts) – SandTh Aug 25 '21 at 17:39
  • What does the size of the site have to do with anything? And "small" in what way? – schroeder Aug 25 '21 at 17:39
  • 1
    Why not use established session management libraries? And how is this a security question? – schroeder Aug 25 '21 at 17:40
  • I heard md5 isn't secure anymore and it's difficult to get it to be secure, I wondered if hashing time and rand won't get me different games with the same hash. The size of the site is relevant because I don't expect 10.000 requests per second. – SandTh Aug 25 '21 at 17:43
  • 1
    MD5 is fine for some things but not for others; the notes about it being "insecure" are largely about hashing passwords. It's most likely fine for session info. – CaffeineAddiction Aug 25 '21 at 17:44
  • Also, your site being "small" does not affect its risk profile as much as things like: are you taking CC info, are you collecting PII, are you dealing with passwords? – CaffeineAddiction Aug 25 '21 at 17:46
  • no, this mechanism wouldn't affect passwords at all... I just wondered to what extent it covered this particular use-case. – SandTh Aug 25 '21 at 18:03
  • @CaffeineAddiction ... no, that's completely false. Plain MD5 isn't secure for password hashing, but no fast hash is, not even new stuff like SHA3. MD5 collision weakness means it is insecure for a ton of other stuff, like digital signatures, potentially HMACs, checksums, and so on. This construction isn't secure either, but that's not MD5's fault; the actual problem is that there's not nearly enough entropy. If there was enough entropy, there'd be no point in hashing at all! – CBHacking Jun 29 '23 at 13:20
  • @CBHacking ... I think you misread my comment ... because you're basically agreeing with me after prefixing it with "no, that's completely false". Also, OP isn't hashing passwords ... he is hashing session information for a game. – CaffeineAddiction Jun 29 '23 at 19:27
  • Yeah, and MD5 isn't secure for that (though the reason is that no unkeyed hash would be secure for that). There is nothing that you should use MD5 for anymore; if you want a secure hash it isn't secure, if you want a fast insecure checksum there are faster and more-informative checksums, etc. It isn't "fine" for anything, and its weakness is not about passwords (either in general or specifically in comparison to other fast secure hashes) like you claimed. – CBHacking Jun 30 '23 at 03:42
  • @CBHacking that is factually incorrect ... there are plenty of things that currently use MD5 and have no reason to stop. It is a perfectly acceptable form of hashing when security is not a concern. Creating a UUID, for example, does not require a cryptographic hash. RFC4122 is an example of this and supports both MD5 and SHA-1. – CaffeineAddiction Jun 30 '23 at 16:06

2 Answers

3

An MD5 hash doesn't add any randomness; its output is entirely determined by its input. So md5(time . rand) will be at best "as random" as just time . rand; and since time is predictable, the randomness all comes from rand. So:

  1. you can skip the other parts and just use rand directly; and
  2. you need to look closely at where rand is coming from
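
As a small illustration of the determinism point, the sketch below hashes the same concatenated input twice. The fixed values stand in for time() and rand() and are assumptions chosen only so the run is repeatable.

```php
<?php
// Hypothetical fixed inputs standing in for time() and rand().
$time = 1629912000;
$rand = 12345;

// md5() is a pure function of its input: the same string always
// produces the same 32-character hex digest.
$first  = md5($time . $rand);
$second = md5($time . $rand);

var_dump($first === $second); // the two digests are identical
```

Anyone who can guess the two inputs can therefore reproduce the identifier; the hash step hides nothing by itself.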

In case you're not familiar, computers (in the sense of the silicon chips we're used to thinking of as CPUs, GPUs, etc) are really bad at being random, because they're really good at doing exactly what they're told. Many simple "random" functions (e.g. PHP's rand() function) are "pseudo-random number generators": given a seed like the time the process started, they iterate through a pattern that's hard to spot if you don't know the seed, but isn't actually random. Some are better than others, but ultimately all are predictable if you try hard enough.
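
A rough sketch of what "iterating through a pattern" means in practice: seeding PHP's Mersenne Twister generator with the same value twice yields the same sequence. The helper name sequenceForSeed and the seed 42 are made up for this example.

```php
<?php
// Hypothetical helper: the first $count outputs of mt_rand()
// after seeding the generator with $seed.
function sequenceForSeed(int $seed, int $count): array
{
    mt_srand($seed);
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = mt_rand();
    }
    return $values;
}

$a = sequenceForSeed(42, 5);
$b = sequenceForSeed(42, 5);

var_dump($a === $b); // same seed, same "random" numbers
```

In normal use the seed is chosen automatically rather than fixed, but the output is still fully determined by it.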

A "true random number generator" (sometimes referred to as "strong" or "cryptographically sound") takes random events happening outside the computer (fine variations in background noise, the exact millisecond you press keys on your keyboard, or famously a bank of lava lamps) and turns them into a stream of truly unpredictable numbers.

Most programming environments now provide access to true random numbers, because they are such an important factor in many security systems. They use dedicated hardware built into the circuitry of the CPU, or measure noisy inputs, and provide a standardised stream of bytes. For example, PHP provides a random_bytes() function which accesses this system source.

The only real downside is they can be slower, and they can "run out of entropy" (become predictable because you're asking for more output than they have input). This is extremely unlikely to be a problem for the use case you describe.

If the problem is that your source of randomness gives you non-printable characters, you can use something like base64 to encode them directly, without the risk of collision inherent in any hash.
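
One possible way to do that encoding, as a sketch: the function name randomGameToken, the 16-byte default and the URL-safe character substitution are choices made for this example, not part of the answer.

```php
<?php
// Hypothetical helper: a printable, URL-safe token built directly
// from secure random bytes, with no hash involved.
function randomGameToken(int $numBytes = 16): string
{
    $raw = random_bytes($numBytes);
    // Swap the two base64 characters that are awkward in URLs,
    // then drop the trailing "=" padding.
    return rtrim(strtr(base64_encode($raw), '+/', '-_'), '=');
}

$token = randomGameToken();
echo $token, PHP_EOL;
```

With the 16-byte default the token is 22 characters long, and every distinct byte string maps to a distinct token.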

IMSoP
  • 1
    The part of this answer about time being useless is correct (it may even be harmful because it reduces the chance that an improperly seeded rand will be detected during testing). The part about needing to look at what rand does is correct but incomplete. The function called rand in PHP (and most other languages) is not cryptographically secure and must not be used for any purpose where security is relevant, such as generating a session cookie. – Gilles 'SO- stop being evil' Aug 25 '21 at 21:52
  • 1
    The part about computers being bad at randomness is a decade out of date: modern computers include an entropy source. You have to get it via the operating system. Running out of randomness is not a thing (it's a theoretical concern, but if you start with enough entropy for a cryptographic application, you have enough entropy to outlast your computer). Speed can be a concern if you're generating a lot of random data for a numerical simulation, but it's irrelevant if you're getting a few hundred bytes of random data to serve a web page. – Gilles 'SO- stop being evil' Aug 25 '21 at 21:55
  • @Gilles'SO-stopbeingevil' I fail to see how it is out of date. Modern OSes add entropy sources to their pseudo-random functions and use tricks like stored randomness. And there are some “CPRNG” built into the silicon of some chips (these lack independent audits, though…). Basically, until you have accrued enough random bits you can run out of them. “Stirring your random source” does not help here. Computers are still bad at randomness; that hasn’t changed in over 30 years. We just got better at using the randomness present in the system. – LvB Aug 26 '21 at 08:26
  • 1
    @LvB Stored randomness isn't a “trick”, it's how OSes have worked for decades. Modern chips have TRNG, i.e. sources of actual randomness (not pseudorandomness which comes from deterministic computation). This was very uncommon 30 years ago, but it's common now. Until you have accrued enough random bits, you can't do any cryptography, so you won't run out of anything. And once you have accrued enough you won't run out. For example, if you want a 128-bit security level, you need to accrue 128 bits of entropy before anything else, and that will last you forever. – Gilles 'SO- stop being evil' Aug 26 '21 at 11:13
  • @Gilles'SO-stopbeingevil' it is a trick…. Just because it’s commonly used does not make it not a trick. And it’s hardly ever used in embedded systems (where it’s needed more, imho). TRNGs on the die are still pseudo (they're basically floating connections on the die hovering between the transition states, influenced by the EM field they sit in… which usually comes from the rest of the die). – LvB Aug 26 '21 at 11:23
  • @Gilles'SO-stopbeingevil' True random sources are scarce; I know of only one: nuclear decay…. The no-entropy problem that was common a few years ago is still present. Until the point where the OS has accrued enough randomness to be completely independent of its initial state, the random values it can produce are limited. Determining how much entropy you have is also extremely hard, so we tend to be conservative, to be sure we have at least that level of entropy. – LvB Aug 26 '21 at 11:25
  • @LvB Until the OS has accrued enough randomness to be completely independent from its initial state, it cannot produce random values at all. There are far more random sources than nuclear decay. All modern PC processors have one. Lack of entropy is still a problem on embedded devices (e.g. network appliances) that lack an entropy source, but it's not a problem on modern PC and smartphones because they do have an entropy source. – Gilles 'SO- stop being evil' Aug 26 '21 at 11:39
  • @Gilles'SO-stopbeingevil' I think we use a different definition of random here. True random (unpredictable and independent) is not something you find in a deterministic system (like a computer or discrete logic). I mentioned the type of TRNG used in most computers these days (albeit only one variant… the other variants work the same in essence). And I agree that entropy past boot + 10s is always enough on modern PCs and mobiles. In short, I think we agree on most points but used different terminology. Further discussion would be academic in nature, which isn’t bad, just not for here, imho. – LvB Aug 26 '21 at 11:46
  • @IMSoP By computer, I do mean the object that's on your desk (or in your pocket), but all the relevant parts are in a single chip. No, it isn't deterministic these days. Modern high-end CPUs (and more and more low-end CPUs) include a TRNG (i.e. an entropy source, something that is not deterministic). – Gilles 'SO- stop being evil' Aug 26 '21 at 12:46
  • As for getting values from entropy, you do indeed use the entropy to seed a CSPRNG (you can't directly obtain crypto-quality randomness from a TRNG due to biases: you have to “whiten” it, and the way to do this is (a part of) a CSPRNG). Once you have a CSPRNG, it lasts effectively forever. The reason to periodically reseed a RNG is not that entropy wears out, it's to protect against the risk of compromise of the RNG state (e.g. via side channels). – Gilles 'SO- stop being evil' Aug 26 '21 at 12:47
  • Your edit doesn't fix anything. Modern PC CPUs include a TRNG. The randomness is coming from inside the chip. Regarding the second point: the goal of an RNG is to be unpredictable by an adversary. Any practical RNG is a combination of an entropy source (a.k.a. TRNG) and a CSPRNG (deterministic cryptographic algorithm). A CSPRNG with a 128-bit seed can produce about 2^128 bits of randomness. Thanks to cryptography, you can get more randomness than the amount of initial entropy. But a CSPRNG with a 1-bit seed can produce at most 1 bit of randomness. – Gilles 'SO- stop being evil' Aug 26 '21 at 17:15
  • @Gilles'SO-stopbeingevil' I give up. You're using some hyper-pedantic definition of "CPU" and "randomness", so that whatever I say will still be "wrong". Nothing you've said actually changes anything of value in my answer, so this conversation has been a complete waste of time. – IMSoP Aug 26 '21 at 17:54
  • The only case where applying MD5 could make a difference is when relying on "security by obscurity": using time . rand literally (decimal or hex or BASE64) may indicate to the experienced user how it is constructed (specifically how many digits or bits there are). If it's not known that the output is from MD5 (especially if it's truncated), it may be harder to guess. However, if the user knows the algorithm, MD5 does not add anything to security here. – U. Windl Jun 29 '23 at 12:35
2

MD5 is not the problem here; the problem is the amount of entropy in your solution. It matters little whether you use MD5 or SHA512: the entropy comes from rand, and that can provide very little randomness depending on the implementation.
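
To put a rough number on the "clash" part of the question, the sketch below uses the standard birthday approximation. The inputs are assumptions for illustration only: rand() yielding about 2^31 distinct values, a deliberately pessimistic 1000 identifiers created within the same second (so the time part adds nothing), and a 128-bit identifier from a secure source for comparison.

```php
<?php
// Birthday approximation: probability that at least two of $n identifiers,
// drawn uniformly from $space possible values, are equal.
function approxCollisionProbability(float $n, float $space): float
{
    $x = $n * ($n - 1) / (2.0 * $space);
    // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -expm1(-$x);
}

// Assumed: ~2^31 possible rand() values, 1000 IDs in one second.
$weak = approxCollisionProbability(1000, 2 ** 31);

// Assumed: 128 random bits per ID, one million games.
$strong = approxCollisionProbability(1e6, 2 ** 128);

printf("%.3e\n%.3e\n", $weak, $strong);
```

Accidental clashes are only part of the picture. The main concern in the answers here is how guessable the value is, and that depends on the entropy of the input rather than on the hash.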

If you have PHP 7 or newer, there's a function made exactly for that: random_bytes. It will generate cryptographically secure pseudo-random bytes for you. You don't need MD5 for that, only bin2hex(random_bytes(64)), and you are good to go.
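
A minimal sketch of that call wrapped in a helper. The name newGameId and the 16-byte default are assumptions for illustration. Sixteen bytes hex-encode to 32 characters, the same length as an MD5 digest; the 64 bytes suggested above simply produce a longer string.

```php
<?php
// Hypothetical helper: a hex-encoded identifier taken straight
// from the system's secure random source.
function newGameId(int $numBytes = 16): string
{
    // random_bytes() throws if no secure source is available (PHP 7+).
    return bin2hex(random_bytes($numBytes));
}

$id   = newGameId();    // 32 hex characters
$long = newGameId(64);  // 128 hex characters
echo $id, PHP_EOL;
```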

As a side note, when people say that MD5 is insecure, they usually mean its use as a password storage solution. MD5 is a very fast hash function and that's an issue for passwords: an attacker can try billions of different passwords per second. For deterministic filenames, session identifiers and other kinds of data where an attacker won't bruteforce an entire dictionary, MD5 is a good choice.

ThoriumBR
  • 2
    MD5 is insecure for many things beyond passwords, most notably for integrity checks and in particular asymmetric digital signatures. That has nothing to do with speed, though, and is all about collision and preimage attack resistance. In fact, for passwords specifically, PBKDF2-MD5 is not notably weaker than PBKDF2 with any other "secure" hash function (you might want a slightly higher iteration count since the algorithm is faster, but that's part of why it's tunable; the actual security flaws in MD5 aren't relevant there). – CBHacking Aug 25 '21 at 20:06
  • Good points. Just wanted to be between the "MD5 is insecure, don't touch it" and "MD5 is fine, I use it for everything". – ThoriumBR Aug 25 '21 at 21:35