1

I use Mac, OSX 10.11.5. When I paste text from pdf files to emacs (tex mode), I get characters like "∈", "Ω". I need to convert them to LaTex equivalents "\in" "\Omega". Any hint?

2 Answers2

2

Using suggestions form various editors (thanks a lot!) I implemented this solution, which is ok for me

#!/usr/bin/pythonw
# -*- coding: utf-8 -*-

import sys
import re

# NOTE: literals enclosing backckslash are forced to raw using prefix r'...'
repldict = {'Ω':r'\Omega','?<8A><86>':r'\subseteq','?<8A><82>':r'\subset',
        '?<9F>?':'<','?<9F>?':'>',
        '?<88><88>':r'\in','?<97>':r'\times','?<80><99>':'*apostrofo*',
        'μ':r'\mu','λ':r'\lambda','?<86>':r'\phi',
        '?<86><92>':r'\rightarrow','·':r'\cdot','?<88>?':'||',
        '?<89>?':r'\le',
        '?<88><9E>':r'\infty','ε':r'\varepsilon','Φ':r'\Phi',
        '?<88><92>':r'-','?<80><9C>':r'``','?<80><9D>':r'"','?<80><94>':r'-'}
def replfunc(match):
return repldict[match.group(0)]

def main():
 regex = re.compile('|'.join(re.escape(x) for x in repldict))

 inFile = sys.argv[1]
 fin = open(inFile,'r')
 outFile = 'pdf2latexChars' + '.tex'
 fout = open(outFile, 'w')
 print 'inFile=' + inFile + '; outFile=' + outFile 

 for line in fin:
    fout.write(regex.sub(replfunc,line))

if __name__ == '__main__':
 main()
0

I just did this in PHP. You should be able to do something like this in Python:

$string = "I want the Latex equivalent of Ω, Δ, etc.";
$string = str_replace(
array("Ω","Δ"),
array("$\Omega$", "$\Delta$"),
$string);

string variable will now be: "I want the Latex equivalent of $\Omega$, $\Delta$, etc."