I'm working on a small game website and don't expect too much traffic.
Would md5(time . rand) suffice to have a "random enough" identifier for a game?
(Or is it easy to get a clash that way?)
An MD5 hash doesn't add any randomness; its output is entirely determined by its input. So md5(time . rand) will be at best "as random" as just time . rand; and since time is predictable, the randomness all comes from rand. So:
- you might as well use rand directly; and
- you need to look at where rand is coming from.

In case you're not familiar, computers (in the sense of the silicon chips we're used to thinking of as CPUs, GPUs, etc.) are really bad at being random, because they're really good at doing exactly what they're told. Many simple "random" functions (e.g. PHP's rand() function) are "pseudo-random number generators": given a seed such as the time the process started, they iterate through a pattern that's hard to spot if you don't know the seed, but isn't actually random. Some are better than others, but ultimately all are predictable if you try hard enough.
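As a minimal sketch of that predictability, assuming PHP 7.1 or newer (where rand() is an alias of mt_rand() and srand() an alias of mt_srand()): seeding the generator with the same value reproduces the same "random" number, and hashing it with md5() just reproduces the same identifier. The helper name weak_game_id and the fixed seed and timestamp are hypothetical, chosen only to make the example repeatable.

```php
<?php
// Hypothetical helper mirroring md5(time . rand) from the question.
// The timestamp is passed in so the example is repeatable.
function weak_game_id(int $seed, int $timestamp): string
{
    mt_srand($seed);                     // same seed => same pseudo-random sequence
    return md5($timestamp . mt_rand());  // md5 adds no randomness of its own
}

$first  = weak_game_id(12345, 1629928320);
$second = weak_game_id(12345, 1629928320);

// Both calls yield the identical 32-character hex string.
var_dump($first === $second);            // bool(true)
```

Anyone who can guess the seed and the timestamp can compute the identifier themselves; the hash only hides how it was built.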
A "true random number generator" (sometimes referred to as "strong" or "cryptographically sound") takes random events happening outside the computer (fine variations in background noise, the exact millisecond you press keys on your keyboard, or famously a bank of lava lamps) and turns them into a stream of truly unpredictable numbers.
Most programming environments now provide access to true random numbers, because they are such an important factor in many security systems. They use dedicated hardware built into the circuitry of the CPU, or measure noisy inputs, and provide a standardised stream of bytes. For example, PHP provides a random_bytes() function which accesses this system source.
The only real downside is that they can be slower, and they can "run out of entropy" (become predictable because you're asking for more output than they have input). This is extremely unlikely to be a problem for the use case you describe.
If the problem is that your source of randomness gives you non-printable characters, you can use something like base64 to encode them directly, without the risk of collision inherent in any hash.
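For instance, a rough sketch of that approach in PHP 7 or newer: take bytes from random_bytes() and encode them with a URL-safe variant of base64. The function name game_id and the choice of 16 bytes (128 bits) are assumptions made for illustration, not something taken from the question.

```php
<?php
// Hypothetical helper: a printable, URL-safe identifier from the system's random source.
function game_id(int $bytes = 16): string
{
    $raw = random_bytes($bytes);                 // raw bytes, may be non-printable
    $b64 = base64_encode($raw);                  // printable, but uses + / and = padding
    return rtrim(strtr($b64, '+/', '-_'), '=');  // swap to URL-safe characters, drop padding
}

echo game_id(), PHP_EOL;   // 22 characters for 16 bytes
```

Because base64 is a reversible encoding rather than a hash, two different byte strings can never map to the same identifier.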
time being useless is correct (it may even be harmful because it reduces the chance that an improperly seeded rand will be detected during testing). The part about needing to look at what rand does is correct but incomplete. The function called rand in PHP (and most other languages) is not cryptographically secure and must not be used for any purpose where security is relevant, such as generating a session cookie.
– Gilles 'SO- stop being evil'
Aug 25 '21 at 21:52
time . rand literally (decimal or hex or Base64) may indicate to the experienced user how it is constructed (specifically how many digits or bits there are). If it's not known that the output is from MD5 (especially if it's truncated), it may be harder to guess. However, if the user knows the algorithm, MD5 does not add anything to security here.
– U. Windl
Jun 29 '23 at 12:35
MD5 is not the problem here; the problem is the amount of entropy in your solution. It matters little whether you use MD5 or SHA512: the entropy comes from rand, and that may not be very random, depending on the implementation.
If you have PHP 7 or newer, there's a function made exactly for that: random_bytes. It will generate cryptographically secure pseudo-random bytes for you. You don't need MD5 for that, only bin2hex(random_bytes(64)), and you are good to go.
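A small sketch of that one-liner wrapped in a function, assuming PHP 7 or newer. The name hex_game_id and the 16-byte default are hypothetical; 16 bytes is used here only because it gives the same 32-character length as an md5() digest, while 64 bytes matches the expression above.

```php
<?php
// Hypothetical helper wrapping the bin2hex(random_bytes(...)) idea.
function hex_game_id(int $bytes = 16): string
{
    // Each byte becomes two hex characters, so the result has 2 * $bytes characters.
    return bin2hex(random_bytes($bytes));
}

echo hex_game_id(), PHP_EOL;             // 32 hex characters, same length as an md5() digest
echo strlen(hex_game_id(64)), PHP_EOL;   // 128
```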
As a side note, when people say that MD5 is insecure, they mean using MD5 as a password storage solution. MD5 is a very fast hash function, and that's an issue for passwords: an attacker can try billions of different passwords per second. For deterministic filenames, session identifiers and other kinds of data where an attacker won't brute-force an entire dictionary, MD5 is a good choice.