1

I need to process the following text to get rid of the strange symbols such as:

â<80><99> â<80><9c> â<80>?

Example text:

With the mystery unexplained, the Hyatt tried to give its guests a sense of security by posting a guard in its lobby. But Wolf couldnâ<80><99>t shake the notion that a thief could re-enter her room at any time. â<80><9c>I had dreams about it for many nights,â<80>?says Wolf, a 66-year-old Dell IT services consultant traveling in Houston for business.

Can anyone help me with this? I'd like to either delete them manually with a command in vi or remove them with a script.

  • Looks like you're editing a UTF-8 file in a vi that doesn't understand UTF-8, try using vim instead. –  Dec 24 '12 at 06:59
  • You may need to change the language support to UTF-8, which is available in the Window Preferences of the application you are using. –  Dec 24 '12 at 07:01
  • @muistooshort vi became vim at least 25 years ago. – Shiplu Mokaddim Dec 25 '12 at 05:20
  • @Shiplu vim was only publicly released 21 years ago. Some OSes still use vi as default (IIRC, this includes FreeBSD). It's a valid suggestion. – Bob Dec 25 '12 at 09:22

2 Answers

0

I found the text in question here: http://www.forbes.com/sites/andygreenberg/2012/11/26/security-flaw-in-common-keycard-locks-exploited-in-string-of-hotel-room-break-ins/

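The characters causing the problem are typographic ("smart") quotes and an apostrophe, which are not the standard ASCII quote and apostrophe characters. In UTF-8 each of them is encoded as three bytes (the apostrophe ’ is E2 80 99, for example), and that is what shows up as â<80><99> when the file is not displayed as UTF-8.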

I pasted that text into my copy of vim, and it handled those characters just fine.

But here's how to do replaces when this kind of thing happens: http://aditya.sublucid.com/2008/01/18/replacing-those-pesky-smart-quotes-in-vim/
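
If you'd rather do it with a script than inside the editor, here is a minimal sketch in Python. It assumes the file really is UTF-8 and that you want plain ASCII quotes in place of the smart ones; the names `SMART_TO_ASCII`, `asciify_quotes`, and `asciify_bytes` are made up for this example. If some sequences in the file are already damaged (the `â<80>?` in the question may be one), strict UTF-8 decoding will raise an error and you would need to handle those bytes separately.

```python
# Map common typographic punctuation to ASCII equivalents.
SMART_TO_ASCII = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (used as apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

# Translation table usable with str.translate().
_TABLE = str.maketrans(SMART_TO_ASCII)


def asciify_quotes(text: str) -> str:
    """Return text with smart quotes replaced by plain ASCII quotes."""
    return text.translate(_TABLE)


def asciify_bytes(raw: bytes) -> bytes:
    """Decode UTF-8 bytes, replace smart quotes, and re-encode as UTF-8."""
    return asciify_quotes(raw.decode("utf-8")).encode("utf-8")
```

To clean a file, read it in binary mode, pass the contents through `asciify_bytes`, and write the result back out.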

DWright
  • 233
  • Many thanks! It works and is just what I need. I didn't intend to inspire any interest in the text content though. Lesson learned :) –  Dec 24 '12 at 08:49
0

This issue mostly occurs when you transfer a file from a Windows or DOS machine. To get rid of those unwanted special characters, use the "dos2unix" utility:

mkannan@talksense-dr:~/tmp$ dos2unix test.sh 
dos2unix: converting file test.sh to UNIX format ...
  • dos2unix converts line endings (from CR + LF to just LF). This looks more like a problem with the encoding of quotes. (@querystack confirmed in a comment on the other answer that it is an issue with 'smart quotes', which are not part of the ASCII character set.) – Bob Dec 25 '12 at 09:23
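
To illustrate the point in the comment, here is a rough sketch in Python of a line-ending conversion. `crlf_to_lf` is a hypothetical stand-in for what dos2unix does to line endings, not its actual implementation. The sample bytes contain the UTF-8 encoding of the smart apostrophe (E2 80 99) followed by a CR + LF line ending.

```python
def crlf_to_lf(raw: bytes) -> bytes:
    """Replace every CR+LF pair with a single LF; leave all other bytes alone."""
    return raw.replace(b"\r\n", b"\n")


# "couldn’t" with a smart apostrophe, ending in a DOS-style line ending.
sample = b"couldn\xe2\x80\x99t\r\n"
converted = crlf_to_lf(sample)

# The line ending changes, but the three bytes of the apostrophe are still there.
still_has_smart_quote = b"\xe2\x80\x99" in converted
```

The line ending is normalized, but the quote bytes pass through untouched, so the strange symbols would still appear.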