
Bug introduced in 10.0 and fixed in 10.0.2


I tried to import a file in Mathematica 10 (on Windows 7) whose path contains some Chinese characters. After holding Ctrl+Shift and dragging the file into the front end, I got a path like

c:\\Users\\m&p\\Desktop\\桌面\\test.txt

But after running

Import["c:\\Users\\m&p\\Desktop\\\:684c\:9762\\test.txt"]

Mathematica 10 gives the following error message:

Import::nffil: File not found during Import.

What should I do to make Mathematica 10 support paths containing Chinese characters?

Michael E2
matheorem
  • What kind of computer are you using: Macintosh, Raspberry Pi, Linux, or Windows? – librik Jul 17 '14 at 02:41
  • Works fine on OS X. This is expected to be OS-dependent. Are you on Windows? Which language version of Windows? There is also a setting in Windows (I no longer remember where it is, as I don't have a Windows computer here) that adjusts the character set for software that doesn't support Unicode. If you like to use Chinese path names, it's worth setting that to Chinese. Otherwise non-Unicode-aware Windows programs won't see those paths. – Szabolcs Jul 17 '14 at 02:48
  • I cannot reproduce this problem under Linux. Here is what I tried, successfully: (1) create the directory /tmp/试验; (2) put a file try.dat with some data in it; (3) in Mma 10, evaluate Import["/tmp/试验/try.dat"]. I had no problem getting the content of the file. Is it perhaps OS-dependent? – Yi Wang Jul 17 '14 at 10:47
  • I can reproduce this under Win 8.1, and the slash (/) workaround in the Q/A linked by @AlexeyPopkov works fine. – Silvia Jul 17 '14 at 19:04
  • @YiWang Under Windows the names of the folders in a path to a file are by default separated by backslashes ("\"). The problem appears specifically because of the backslashes, as in the example Import["D:\\tmp\\试验\\try.dat"]. Backslashes are probably not used in paths under Linux. If so, you cannot see this bug with Import but can still run into it in the other manner described in the answer by Tetsuo Ichii. – Alexey Popkov Jul 17 '14 at 19:15
  • @AlexeyPopkov Yes, I confirm that if I put a special character after \\, I also get strange character codes under Linux. – Yi Wang Jul 18 '14 at 06:39
  • I tried to edit the bug notice (see http://meta.mathematica.stackexchange.com/questions/1610/standard-header-for-bugs-tagged-posts-for-easy-searching) but SE won't let me save the edit because of the Chinese character! Perhaps you can edit it? – Michael E2 Aug 06 '15 at 18:46
  • @MichaelE2 That is weird. I got the same error as well. I think it's a change that was introduced after this post was written/edited. Digging in meta SE, I found this, and the SE developer's response was that it's related to some OS X Unicode kiss-of-death issue. Not sure if they extended that to certain Chinese characters as well. Probably worth asking on our meta so that we can forward it to the devs. – rm -rf Aug 06 '15 at 19:30
  • @TheToad Thanks. This seems closer. I'll see if I can figure out whether the workaround there will work. – Michael E2 Aug 06 '15 at 19:35
  • @MichaelE2 Interesting; I was not aware of this issue or the work-around. Thanks both for the edit and for the education. – Mr.Wizard Aug 07 '15 at 04:14

1 Answer


In Mathematica 10.0.0 for Windows, I have experienced a similar problem.
When a non-ASCII character is placed after \ in a string, it is decoded incorrectly. (The character '\' is used as the path separator in Windows.)

ToCharacterCode["\\a", "Unicode"](*OK*)
{92, 97}

ToCharacterCode["\\", "Unicode"](*OK*)
{92}

ToCharacterCode["μ", "Unicode"](*OK*)
{956}

ToCharacterCode["\\μ", "Unicode"](*Strange!*)
{92, 92, 58, 48, 51, 98, 99}
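
The codes in that last result are all plain ASCII, so converting them back into a string shows what the kernel appears to have received. The following is only a small check added for illustration (it is not part of the original report); it uses the built-in functions FromCharacterCode, Characters and FromDigits:

```mathematica
(* Character codes reported above for "\\μ" in 10.0.0 on Windows *)
codes = {92, 92, 58, 48, 51, 98, 99};

(* Rebuild the string from those codes *)
decoded = FromCharacterCode[codes];

(* List the individual characters: backslash, backslash, colon, 0, 3, b, c *)
Characters[decoded]

(* 03bc is the hexadecimal code point of μ, i.e. decimal 956 *)
FromDigits["03bc", 16]
```

So the string seems to hold the literal text of the escape sequence \:03bc rather than the single character μ, which would explain why a path containing it cannot be found.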

"μ" in this example could equally be a Chinese or Japanese character. This is a serious problem because such characters can occur in Windows file paths such as "C:\MyData\μ-channel\". You can avoid the problem by using forward slashes instead: "c:/MyData/μ-channel/".
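
A minimal sketch of the forward-slash workaround, assuming a file test.txt (the name is borrowed from the question) inside the example folder. The path is hypothetical, and I have not re-checked this on every affected version:

```mathematica
(* Hypothetical path built from the example above; adjust it to your own file *)
path = "c:/MyData/μ-channel/test.txt";

(* No backslash precedes the non-ASCII character here,
   so this should return one code per character *)
ToCharacterCode["/μ", "Unicode"]

(* Import only if the kernel can actually see the file *)
If[FileExistsQ[path], Import[path], $Failed]
```

If ToCharacterCode returns more than two codes for "/μ", the string is still being decoded incorrectly and the workaround does not apply to your setup.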

Sektor
Tetsuo Ichii