This function does not work correctly for me. The generated Gravatars do not match the ones actually produced by using the corresponding address.

The problem was traced to differing output from the Hash function.

VLC points out that the Hash documentation pages for version 7 and version 8 have common examples but show different outputs.

From version 7 I get:

IntegerString[Hash["gmailuser+10511@gmail.com", "MD5"], 16, 32]
"cc270b8fbfb33469629c14e358183dd7"

Version 8 users apparently get:

"d9c48b644b8cc89bbcd07ec7a54dafc9"

Why do I get a different result and how do I get the right one?

Mr.Wizard

3 Answers

I believe the difference is that Hash in v7 included the quotation marks around the string in the data it hashed, whereas Hash in v8 does not. For example, in M7:

In[1]:= Hash["test", "MD5"]

Out[1]= 64111166190477440563271147919838643147

and in M8:

In[1]:= Hash["\"test\"", "MD5"]

Out[1]= 64111166190477440563271147919838643147

Therefore, I'd say that the M8 hash is the correct one.
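
One way to tell which behavior a given kernel has is to compare against the value shown above. This is only a minimal sketch; hashIncludesQuotesQ is a hypothetical name, not a built-in function:

```mathematica
(* Hypothetical helper: True if Hash["test", "MD5"] equals the value shown
   above for M7, i.e. the hash of the string with its quotation marks. *)
hashIncludesQuotesQ[] :=
  Hash["test", "MD5"] === 64111166190477440563271147919838643147

hashIncludesQuotesQ[]
```

Going by the outputs above, this should return True in M7 and False in M8.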

Note that it appears that FileHash has not changed across versions. The following hash function will produce the same results in v7 and v8, although those results will not be the same as what you get from Hash in either version:

myHash[expr_, type_] := Module[{fname, res},
  fname = Close[OpenTemporary[]];
  Put[expr, fname];
  res = FileHash[fname, type];
  DeleteFile[fname];
  res
  ]
KAI

Older Mathematica versions included the enclosing quotation marks when generating the hash.

To get rid of the quotation marks you can use the function below instead of the standard Hash function. The credit for the function goes to Mark Fisher.

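StringHash[string_String, type_: "MD5"] :=
  Module[{stream, file, hash},
    stream = OpenWrite[];          (* open a temporary file *)
    WriteString[stream, string];   (* write the characters without quotation marks *)
    file = Close[stream];          (* Close returns the file name *)
    hash = FileHash[file, type];
    DeleteFile[file];
    hash
  ]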
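
For the Gravatar case in the question, the integer returned by StringHash can be formatted the same way the question formats the output of Hash. The following is a sketch only: gravatarHash is a hypothetical helper name, and the trimming and lowercasing step is an assumption based on Gravatar's usual convention for email addresses, not something taken from this thread.

```mathematica
(* gravatarHash is a hypothetical helper; it relies on StringHash defined above. *)
(* Assumption: the address is trimmed and lowercased before hashing. *)
gravatarHash[email_String] :=
  IntegerString[StringHash[ToLowerCase[StringTrim[email]], "MD5"], 16, 32]

gravatarHash["gmailuser+10511@gmail.com"]
```

If StringHash behaves as described, the result should match the version 8 value quoted in the question rather than the version 7 one. WriteString uses the current character encoding, so addresses containing non-ASCII characters may need extra care.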
VLC

As an alternative to the solution by @VLC / Mark Fisher, the following JLink code will compute the correct MD5 hash (as returned by V8 or Unix md5sum):

Needs["JLink`"]
LoadJavaClass["java.security.MessageDigest"];

md5[s_String] :=
  JavaBlock @ Module[{d, b}
  , d = java`security`MessageDigest`getInstance["MD5"]
  ; d@update[JavaNew["java.lang.String", s]@getBytes[]]
  ; b = JavaObjectToExpression[d@digest[]]
  ; StringJoin @@ IntegerString[b /. n_?Negative :> 256+n, 16, 2]
  ]


md5["gmailuser+10511@gmail.com"]
(* d9c48b644b8cc89bbcd07ec7a54dafc9 *)

We can use this function to see how V7 is adding quotation marks to a string before hashing:

md5["\"gmailuser+10511@gmail.com\""]
(* cc270b8fbfb33469629c14e358183dd7 *)
WReach
    +1 -- incidentally I tried a Trace[. . . , TraceInternal->True] on Hash and it seems to already be using Java. It's also almost comical how complicated it appears to be. – Mr.Wizard Oct 23 '12 at 18:07