I've been trying to use the hyphenation patterns embedded in the file dehyphtex.tex, which is part of the hyphen-german package. Unfortunately, the file seems to use an encoding that is neither UTF-8 nor ASCII. Hundreds of words listed in the file contain characters (mostly vowels with umlauts, but also e with an acute accent and others) that are rendered as �. I'm on MacTeX 2012 and use TeXworks as my editor, which uses UTF-8 as its default input encoding. So far, reloading the file by trial and error with each of the several dozen alternative input encodings the editor offers has not produced a readable result.
I guess I could hand-edit the file to replace every � with the correct UTF-8-encoded character, but I'm hoping there's a more automated way of doing this. Does anyone know offhand which input encoding this file uses, and/or does anyone know of a handy method for converting a file of unknown input encoding into a UTF-8-encoded file?
[...] recode you can change it – Dec 17 '12 at 21:07
[...] latin-1 with utf-8 and save (a copy of) the file. :) – egreg Dec 17 '12 at 21:18
[...] latin-1 and ISO-8859-15... – Mico Dec 17 '12 at 21:30
[...] latin1 is ISO-8859-1, while ISO-8859-15 is called latin9. Confusing, yes. Concerning accented letters both encodings are identical, the differences being in some infrequent characters and mostly in the encoding of the euro symbol, which was not present in latin1 and later added in latin9 – JLDiaz Dec 18 '12 at 00:29
[...] iconv program does the conversion. A freeware program running also on Windows is Charco – egreg Dec 18 '12 at 00:32
[...] Notepad++, a nice editor, which can read and save any encoding. – JLDiaz Dec 18 '12 at 00:36
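For readers who would rather script the conversion hinted at in the comments, here is a minimal Python sketch. It assumes the file really is Latin-1 (ISO-8859-1), as the comments suggest. The function name `convert_to_utf8` and the output file name are made up for illustration and are not part of any package.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="latin-1"):
    """Re-encode the text file `src` as UTF-8 and write it to `dst`.

    `source_encoding` is an assumption about the input, not something
    this function detects. Returns the decoded text.
    """
    raw = Path(src).read_bytes()
    try:
        # If the bytes are already valid UTF-8, keep them as they are.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a character, so this decode
        # cannot fail; a wrong guess shows up as wrong letters instead.
        text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


# Hypothetical usage, writing a copy and leaving the original untouched:
#   convert_to_utf8("dehyphtex.tex", "dehyphtex-utf8.tex")
```

Because decoding as Latin-1 never raises an error, it is worth checking a few words with umlauts in the output before trusting it. If the file turned out to be ISO-8859-15 instead, passing `source_encoding="iso-8859-15"` should handle it. On the command line, the iconv route mentioned in the comments would look something like `iconv -f ISO-8859-1 -t UTF-8 dehyphtex.tex > dehyphtex-utf8.tex`.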