2
epath = SystemDialogInput["FileOpen",WindowTitle -> "Select a .xls to Open"]; epath2 = Input["WHAT TAB IS THE INFORMATION IN?"];\[Lambda] = SetPrecision[Import[epath, {"Data", epath2, 6, excelfile2}], 10];

When the file is being imported, it is being converted from xls to zip file. How do you stop it from changing the file format? The error I am getting is

Import::noelem: The Import element "Data" is not present when importing as ZIP.

If I trace the file I can find that it is a zip file, but is there a way to either convert it back or prevent it in the first place?

In[135]:= heldexpression // ReleaseHold

Out[135]= "ZIP"

In[124]:= epath

Out[124]= "\\\\Client\\H$\\Downloads\\BMB P4 Large Pump NASA VII Case \
5 \[Mu]500 80pts.xls"

If I choose any of my three input files the contents are the same.

{"[Content_Types].xml", "_rels\\.rels","xl\\_rels\\workbook.xml.rels","xl\\workbook.xml", "xl\\styles.xml", \"xl\\theme\\theme1.xml", "xl\\worksheets\\sheet1.xml", \"xl\\sharedStrings.xml", "xl\\worksheets\\sheet2.xml", \"docProps\\app.xml", "xl\\calcChain.xml", "docProps\\core.xml"}
J. M.'s missing motivation
  • 124,525
  • 11
  • 401
  • 574
Rookey
  • 43
  • 6
  • Might your file actually be a "zip" file, perhaps containing the spreadsheet you want. You could make a copy, rename it to xxx.zip and try unzipping it. – mikado Mar 15 '20 at 12:41
  • The file format is a .xlsx originally, I changed it to a .xls to see if that would solve the problem. I used the exact same notebook code on another computer. The files were local in that instance, where as this particular error is happening using a school network vpn. – Rookey Mar 15 '20 at 21:37
  • 5
    I'm voting to close this question as off-topic because renaming a file's extension does not change the file's format. – Edmund Mar 19 '20 at 03:50

4 Answers4

5

You should always specify the format in Import. "Data" is not the format. It is the input element. Add the format to the command.

If you don't, the system will guess the format based on the file contents and the file name. I don't think it's a good idea to rely on guesses when any level of reliability is expected. XLSX files are in fact ZIP files, so technically it's not a bad guess.

Thus, if you use

Import[filename, {"Data", 1}]

or similar, Mathematica will try to guess the file format from the file name and file contents. Then it tries to import the 1st part of the "Data" element of the guessed format.

Instead, use

Import[filename, {"XLS", "Data", 1}]

to import an XLS file or

Import[filename, {"XLSX", "Data", 1}]

to import an XLSX file. If the file is not in the given format, Import will throw an error. It will not try to guess the format.

Note that XLS and XLSX are different file formats. The extension should not be changed. XLSX happens to also be a ZIP file. If you renamed an XLSX file to XLS, what likely happened was this: Mathematica tries to guess the format. The extension indicated XLS, but the magic byte didn't match. So it made a new guess, and determined that it was a ZIP file based on the magic byte (not the extension).

J. M.'s missing motivation
  • 124,525
  • 11
  • 401
  • 574
Szabolcs
  • 234,956
  • 30
  • 623
  • 1,263
  • 1
    It should be noted that while XLSX files are in fact ZIP files as you've pointed out, XLS files are a binary format. And according to this documentation page, I would have expected the code in the question to work (of course it is still helpful to be explicit about the file type) – Lukas Lang Mar 15 '20 at 14:37
  • @LukasLang Unless that xls files was in fact an xlsx with the wrong extension. That could have thrown Mathematica off and cause the misdetection, but Excel could still open it – Szabolcs Mar 15 '20 at 14:56
  • epath is stored as the .xls as shown above, how do you specify the format further? I've run this notebook on a different laptop with the .xls on the same harddrive, but i am using a school network to use their mathematica. Would that have an effect on the the file itself? – Rookey Mar 15 '20 at 21:45
  • 2
    @LukasLang Looks like my guess was spot on: OP renamed and .xlsx to .xls – Szabolcs Mar 18 '20 at 09:13
  • 1
    @Rookey First, do not mix XLS with XLSX: they are not the same format and they are no interchangeable. You can look up Import in the documentation to learn about specifying formats and import elements. Use Import[..., {"XLSX", "Data", 1}] and not Import[..., {"Data", 1}]. This prevents the system from guessing the file format, and forces using the format you specify. If the file is not actually in that format, Import will throw an error. – Szabolcs Mar 18 '20 at 09:15
3

As I attempted to explain to you in the comments, the issue is due to your method of changing the .xlsx extension to .xls by renaming it through your file browser (i.e., Finder or Windows Explorer) instead of resaving it using the software within which the original .xlsx file was created. Your questions in an attempt to solve this, while well-thought-out, will not move us towards solving the problem.

Before we continue, it is important to note that Import and Export use the "Data" element by default, so you need not specify it as you do in your code.

The problem may be solved by any of the following:

  • Saving the file as .xls through the spreadsheet software used to create the source .xlsx file.
    Renaming it will not change the structure of the file internally, and you will continue to get the erroneous results which prompted you to post this question in the first place.
  • Importing the .xlsx with Import["filepath.xlsx"] or Import["filepath.xlsx",{"XLSX","Data"}].
  • If you do re-save the file as .xls, you may do the same as above, except for .xls instead of .xlsx, with Import["filepath.xls"] or Import["filepath.xls",{"XLS","Data"}].

Once you have imported the file, you are, of course, welcome to use Export in order to change the format to .xls, .xlsx, .csv, or any other supported format with marked success. But you will not see any productive results by merely renaming the extension of the file which leaves the internal structure unaltered, as it is better and successful to either re-save the file to your desired format using the original spreadsheet software, or use Export because both of these will alter the internal structure to the format that you tell them you want.

I hope this provides you with a full enough explanation. Note that I have tested these processes on 12.0 and 12.1.

CA Trevillian
  • 3,342
  • 2
  • 8
  • 26
0

How about saving the data as csv file? for example:

a = Import["file.csv", "CSV"]
a

The output is:

{{0.9, 1.32}, {-1.14, 1.36}}
Max
  • 1
  • 1
  • Didn't change anything. The imported file is still a zip file? I'm not sure why, If i make a blank excel its still imports it as a zip file. – Rookey Mar 15 '20 at 22:03
0

The file path name had a non ASCII value in the path. Once that was changed it worked. Also I had to change the file extension back to .xlsx in order for it to be not recognized as a zip file? Apparently it confuses the file path for a zip file instead of an excel because they are similar? Any details on this?

Rookey
  • 43
  • 6
  • How are you changing the extensions? – CA Trevillian Mar 17 '20 at 03:47
  • right click, rename. – Rookey Mar 17 '20 at 06:13
  • Oh that’s likely the issue then. See what happens if you re-save it as a .xls file instead of just renaming it? What I presume is happening is that it is being treated as a .zip file for the reasons that have already been mentioned. You might also find success with the renamed file in specifying the file type along with the proper extension. – CA Trevillian Mar 17 '20 at 06:21
  • Ok, I'm confused as to why an.xls is a .zip file? How would I succeed in this using a .xls? if my file path ends in .xls what is the adjustment I make to the import command to make sure it is not confused. Thanks – Rookey Mar 17 '20 at 06:30
  • If internally a .xlsx is a .zip file, renaming it would not change how it is internally. That is one issue that you might solve by actually re-saving it as a .xls file, instead of just renaming it. For Import, you should add the designation of the proper file extension as mentioned in another answer here for CSV. Except you use XLS. Don’t consider, in this case, that the specification of the file-path is enough. You can use “.” within a filename in Windows. So, if you don’t specify XLS to Import, and it is internally a .zip, then it is clear what the outcome would be. – CA Trevillian Mar 17 '20 at 06:39
  • The original file was always an excel file. I'm using a Mac. It was never a .zip file. How come import thinks that an .xls it is a .zip file? – Rookey Mar 17 '20 at 06:44
  • Likely the “.” aspect of my last comment is a feature in Mac OS too. This result is likely because you created an .xlsx file and merely renamed it, instead of saving it as a .xls—and so, because you’ve given it an incorrect extension (did not actually “save as”) then Import will treat it as what it is internally. As stated above by multiple users, .xlsx is internally a .zip file. Therefore, if you’ve not specified “XLS” to Import, it can only assume the extension based on the internal structure, a .zip file, as the extension is not a valid one due to only renaming it and not re-saving it. – CA Trevillian Mar 17 '20 at 06:50
  • 1
    Ok I sort of understand. The rename doesn't necessarily change the file extension. I would have to do so in excel. But I don't understand why this is a .zip file, why isn't it a .csv like when it imports it correctly? Is there a way to unzip that file so that it comes out correctly as the csv that it becomes? – Rookey Mar 17 '20 at 20:19
  • 1
    The thing with non-ASCII characters and XLS(X) import is a bug in some versions of Mathematica. https://mathematica.stackexchange.com/q/76871/12 – Szabolcs Mar 18 '20 at 09:21