15

I have a sheet in an Excel file already formatted: different colors, fonts, columns width and cells styles. I want to fill-out values from Mathematica into specific cells of that sheets, or use that sheet as a formatted template to create a new sheets, "copying" the stylings and data in the corresponding cells.

Alexey Popkov
  • 61,809
  • 7
  • 149
  • 368
jss
  • 405
  • 3
  • 8
  • @Mr.Wizard I edited the question so as to explicitly state the bearing with Mathematica – jss Dec 05 '12 at 20:38
  • Okay, thank you. This now looks like it may be a duplicate of one of these. Please take a look and let me know. http://mathematica.stackexchange.com/q/11759/121 or http://mathematica.stackexchange.com/q/8725/121 – Mr.Wizard Dec 05 '12 at 20:57
  • The first post seems discontinued. I am just learning from the second one (q/8725) and let you know later. – jss Dec 05 '12 at 21:29
  • @Mr.Wizard The example there override the whole file. I need to write in specific cells already formated on a previously existing excel file. – jss Dec 05 '12 at 21:40
  • Keeping intact the original format of the sheet, just writing data there, mostly numbers and strings. – jss Dec 05 '12 at 21:48
  • @Mr.Wizard I want not be dealing with .NET and NETLink from Mathematica to do that. – jss Dec 05 '12 at 22:13
  • 1
    jss, sorry I was not here sooner to reply. I shall reopen this question for you. Please add sufficient detail of the operation you wish to perform, and include as much of an example as is practical. Please also make it clear in the question how it is distinct from those to prior ones (which, if I understand, it is) or others may close your question again; this is because even if an old question is unanswered ("discontinued" as you put it) a new but identical question should not be started. – Mr.Wizard Dec 06 '12 at 07:55
  • You can use the VBA API from Mathematica, see here http://mathematica.stackexchange.com/q/4594/66 – faysou Dec 06 '12 at 10:15
  • @FaysalAberkane: that's actually using the COM interface via .NET and what I and presumabely he refer to as the .NET solution, which the OP doesn't want to deal with but most probably needs anyway... – Albert Retey Dec 06 '12 at 14:52
  • 1
    I am unable, from the little information supplied, to form a coherent idea of exactly what asker expects to accomplish in Mathematica. – m_goldberg Dec 06 '12 at 18:36
  • @AlbertRetey not necessarily. Likely the easiest thing to do is 1. load the original in its entirety, 2. modify the specific cells, and 3. save the modified spreadsheet. Doesn't need VBA via COM or .NET. Of course, that will likely destroy the formatting, but I do not know if it will or not. – rcollyer Dec 06 '12 at 18:36
  • @rcollyer: if you look at my and Alexeys answers you'll find that variants of the import/export strategy have been mentioned but need some extra efforts to preserve the formatting as well. Using the xml-format as Alexey suggested looks like the least demanding but also isn't really straightforward. I personally think that none of these alternatives is much easier than using the COM interface and that's why I'm speculating that the OP might well end up using it. I didn't mean to indicate that there aren't other possibilities... – Albert Retey Dec 06 '12 at 23:55

3 Answers3

12

I see several ways how you could achieve what you want. Any of them either needs extra "non-mathematica" software, efforts or knowledge. I think the best way is to learn how to interact with excel via .Net/COM as is described e.g. in the documentation, but you have mentioned that you want to avoid that. Here are the alternatives that I can think of:

Use Import/Export with FormattedData (broken?)

In newer versions there is the possibility to use "FormattedData" as the optional element argument for excel files. While import seems to work I couldn't get the export to work even for a very simple example. If you get the export to work for your specific files, you could import with something like:

data = Import[xlsfilename,"FormattedData"]

then manipulate the data (needs some extra care as the formatting need to be taken into account), e.g. this will change the value of a formatted number entry of the cell b2 in the first sheet, the content as imported is wrapped with Style and the first argument of Style is the actual data:

data[[1, 2, 2, 1]] = 4;

once you have replaced the data, you should be able to reexport (this is the part that doesn't seem to work, at least for me):

Export[xlsfilename, data,"FormattedData"]

Use .NET/COM

I've read that you don't want to delve into this, but I think it is the one possibility that definitely would work. It needs some effort and learning of course, especially if you haven't done anything like that before. On the other hand it isn't very difficult and it's the one method that I did succesfully use before. I'm sure you'll get answers to get you started if you ask for this in a more specific way.

Import/Export as xml

the current XLSX format is actually a zipped archive containing some XML files which you could import, manipulate within mathematica and reexport. I have no real experience in how well that works but the following might get you started. This will show the various filenames:

Import[xlsxfilename, "ZIP"]

this will import the content of the first sheet:

Import[xlsxfilename, {"ZIP","xl\\worksheets\\sheet1.xml"}]

manipulating these files might be cumbersome though, as the data and styling is spread over several files. I think getting this to work will be a lot more effort than learning how to manipulate excel files via .NET. Using the information that Alexey provides in his answer it looks like working with files saved as XML from Excel might be less effort than working with the xlsx-files directly.

Find and Use other libraries to manipulate XLS files

you might be able to find 3rd party libraries which let you read/write xls files, written e.g. in Java. You could try to use these from Mathematica. I doubt that this would be any easier than using .NET/COM which is readily available, but it could be a solution if there is no excel/.NET/COM for the platform you are planning to run this on. Actually there is such a java library coming with Mathematica which seems to be used for the current import/export functionality($InstallationDirectory/SystemFiles/Converters/Java/jxl.jar, looks like that is jexcelapi, but I'm not sure). I have no experience what exactly it can do and if/how you can make use of it directly.

Use ExcelLink

There is a commercial add-on to Mathematica called ExcelLink. I also have no experience in using this myself and I don't know if it supports the very recent versions of Mathematica, but it seems to provide the functionality you want. See the link for more information about it.

Albert Retey
  • 23,585
  • 60
  • 104
  • I did it in .NET with Visual Studio and wanted to migrate to Mathematics fully, but without wraping .NET. XML choice seems quite flexible but as you said will be a lot more effort. – jss Dec 07 '12 at 17:52
  • ExcelLink seems to me the better choice but not free. – jss Dec 07 '12 at 17:53
  • There are a lot of things that have influence on what "better" would mean for your specific use case and situation -- and that we can't know. This is why I tried to show the alternatives, at least as known to me. ExcelLink might well be the best choice. I'm not sure but think that it relies on .Net or COM under the hood, though. So it is, AFAIK, only available on Windows... – Albert Retey Dec 08 '12 at 10:58
10

You can copy a tabulated data to clipboard and then paste it in a sheet with formatted cells without loss of formatting:

CopyToClipboard@
 Cell[StringReplace[
   ExportString[myTable, "TSV"], 
   "\r\n" -> "\n"], TextClipboardType -> "PlainText"]

Also you can automate this process by using Data ► Import from external source... ► Import data... Excel menu item and selecting a .dat file exported from Mathematica. Upon import the formatting of the Excel sheet will be retained. Also you can made a template Excel file with formatting and already tuned data import. This method even allows to add working hyperlinks as shown here.

Alternatively, you could save your Excel file as XML file, Import this file in Mathematica and manipulate it as you wish. Excel works with such files exactly as with its native format.

Alexey Popkov
  • 61,809
  • 7
  • 149
  • 368
  • And for version 7, TableForm[myTable] and copy as Plain Text ... – Chris Degnen Dec 06 '12 at 13:57
  • @Chris It can be automatized too (via FrontEndTokenExecute[nb, "Copy"]), see here. – Alexey Popkov Dec 06 '12 at 14:43
  • 1
    @ Alexey - Yes, something like this maybe: http://stackoverflow.com/a/7608554/879601 – Chris Degnen Dec 06 '12 at 14:47
  • @AlexeyPopkov: can you automatize the paste into the correct excel sheet also? I doubt that this can be done from within Mathematica (without using .NET/COM which probably would make possible more straightforward solutions). Am I unaware of something obvious?? – Albert Retey Dec 06 '12 at 14:54
  • 1
    @Albert Yes, there is a workaround which allows automatic import of the textual data into appropriate place in an Excel worksheet. In Excel, go to the "Data" menu, select "Import..." submenu and then "Import data...". I recommend to use the .dat format for intermediate file since Excel does not lock this file for writing as opposed to the .txt files. – Alexey Popkov Dec 06 '12 at 15:12
  • @AlexeyPopkov: sorry, I probably wasn't clear about what I meant (and honestly I'm also not sure what the OP really is after). I meant with automatization an approach where you'd do everything programmatically from within Mathematica without interacting with the excel gui. Do you see a possibility to do that without the COM interface? – Albert Retey Dec 06 '12 at 15:19
  • @Albert Yes, I have already showed a way to do this in the body of the post: one can use XML format instead of XLS and manipulate it in Mathematica without use of Excel. – Alexey Popkov Dec 06 '12 at 15:32
  • @AlexeyPopkov: o.k., that I overlooked. It's probably far better to use that xml-format than to cope with the zipped xml-files of xlsx... Very good point, thanks. – Albert Retey Dec 06 '12 at 15:54
  • CopyToClipboard could be an economic good choice for me. – jss Dec 07 '12 at 18:03
  • @AlexeyPopkov I couldn't follow (understand) from these comments if complete automation have been succeded. I mean automating "Paste as values" in specific cells within the formated sheet. – jss Dec 07 '12 at 18:03
  • @jss There is no complete automation with the CopyToClipboard method: you still need paste the clipboard content by hands. The complete automation could be achieved with the XML method but it requres much more work for complete implementation. – Alexey Popkov Dec 07 '12 at 18:28
4

Here's a go at doing things from the extracted XML level. I only bothered to do this for the values, but if I ever use this I'll extend it to styles too probably.

One note before we start is that since CreateArchive can only take a directory, I had to use RunProcess and a system level zip compressor. On Mac (and I think Unix) that's zip. For your OS change that (it's under excelCompress).

So the basic idea is this:

Extract the file into a temp dir:

Clear[excelExtract];
excelExtract[file_] := 
 With[{eDir = 
    FileNameJoin@{$TemporaryDirectory, "_excel_extractor", 
      FileBaseName[file]}},
  Quiet@
   DeleteDirectory[
    eDir,
    DeleteContents -> True
    ];
  CreateDirectory[eDir,
   CreateIntermediateDirectories -> True
   ];
  CopyFile[file,
   FileNameJoin@{DirectoryName@eDir, FileBaseName[file] <> ".zip"},
   OverwriteTarget -> True
   ];
  ExtractArchive[
   FileNameJoin@{DirectoryName@eDir, FileBaseName[file] <> ".zip"},
   eDir
   ];
  DeleteFile[
   FileNameJoin@{DirectoryName@eDir, FileBaseName[file] <> ".zip"}];
  eDir
  ]

Create a worksheet-level update function:

Clear[excelValuesUpdate];
excelValuesUpdate[
   file_String?(FileExistsQ[#] && FileExtension[#] === "xml" &), 
   vals : _List | _Rule, overwrite : True | False : False] :=

  With[{xml = Import[file]},
   Export[file,
    excelValuesUpdate[xml, vals, overwrite],
    "XML",
    "ElementFormatting" -> False
    ]
   ];
excelValuesUpdate[
   xml : XMLObject[dec___][spec_, els_, r___],
   vals_,
   overwrite : True | False : False
   ] :=
  With[{vs =
     excelValuesFormat@
      Normal@
       excelValuesMerge[
        If[overwrite,
         vals,
         {excelValuesExtract[els], vals}
         ]
        ]
    },
   XMLObject[dec][spec,
    els /.
     XMLElement["sheetData", __] :>
      vs,
    r
    ]
   ];
excelValuesUpdate[dir_String?DirectoryQ, ws_String,
   vals_,
   overwrite : True | False : False
   ] :=

  With[{f = 
     FileNameJoin@{dir, "xl", "worksheets", 
       StringTrim[ws, ".xml"] <> ".xml"}},
   If[FileExistsQ[f],
    excelValuesUpdate[f, vals, overwrite],
    $Failed
    ]
   ];
excelValuesUpdate[dir_String?DirectoryQ, ws : _Integer : 1,
   vals_,
   overwrite : True | False : False
   ] :=

  With[{f = 
     FileNames["*.xml", FileNameJoin@{dir, "xl", "worksheets"}]},
   If[Length[f] >= ws,
    excelValuesUpdate[f[[ws]],
     vals,
     overwrite
     ],
    $Failed
    ]
   ];
excelValuesUpdate[file_, ws : _String | _Integer : 1,
  vals_,
  overwrite : True | False : False
  ] :=
 With[
  {eDir = 
    FileNameJoin@{$TemporaryDirectory, "_excel_extractor", 
      FileBaseName[file]}},
  If[! DirectoryQ[eDir], excelExtract[file]];
  excelValuesUpdate[eDir, ws, vals, overwrite]
  ]

This calls excelValuesFormat, excelValuesMerge, and excelValuesExtract which do the XML processing:

Here's the formatter:

Clear[excelValuesFormat];
excelValuesFormat[rules_] :=
  With[{coreData =
     GroupBy[
      With[{
          v = Lookup[#[[2]], "CellValue", Null],
          m = Lookup[#[[2]], "CellMeta", {}]
          },
         #[[1]] ->
          XMLElement["c",
           {"r" -> #[[1]], Sequence @@ m},
           {
            Replace[v, {
              Null -> Nothing,
              _ :>
               XMLElement["v", {},
                {ToString[v, InputForm]}
                ]
              }]}
           ]
         ] & /@ rules,
      (StringTrim[#[[1]], LetterCharacter ..] &) -> Last
      ]
    },
   XMLElement["sheetData", {},
    KeyValueMap[
     With[{
        keys =
         FromDigits[Flatten[{LetterNumber[#]}, 1], 26] & /@

          StringTrim[
           Lookup[List @@ #2[[All, 2]],
            "r"
            ],
           DigitCharacter ..
           ]
        },
       XMLElement["row",
        Flatten@
         {"r" -> ToString@#, "spans" ->
           StringRiffle[
            ToString /@
             MinMax[keys],
            ":"
            ]
          },
        #2[[Ordering[keys]]]
        ]
       ] &,
     coreData
     ]
    ]
   ];

The extractor:

Clear[excelValuesExtract];
excelValuesExtract[xml_] :=
  Cases[xml,
   XMLElement["c", {a___, "r" -> id_, b___}, v_] :>
    (id ->
      <|
       "CellValue" ->
        Replace[v,
         {
          {XMLElement["v", _, {s_String}]} :> ToExpression@s,
          _ -> Nothing
          }],
       "CellMeta" -> {a, b}
       |>),
   \[Infinity]
   ];

And the merger:

excelValueAssociationFormat[v_] :=

  If[AssociationQ[v] && 
    AnyTrue[{"CellValue"}, KeyMemberQ[v, #] &],
   v,
   <|
    "CellValue" -> v
    |>
   ];

Clear[excelValuesMerge];
excelValuesMerge[rules_] :=
 Merge[rules,
  <|

     "CellValue" -> 
      Replace[Lookup[#, "CellValue", 
        Nothing], {{___, l_} :> l, {} -> Null}],
     "CellMeta" -> 
      DeleteDuplicates@Apply[Join, Lookup[#, "CellMeta", {}]]
     |> &@*Map[excelValueAssociationFormat]
  ]

These store the cell data as an Association with style info in "CellMeta" and the value in "CellValue".

Then finally we need our compression function to repack the XML:

Clear[excelCompress];
excelCompress[file_, path_] :=

  With[{eDir = 
     FileNameJoin@{$TemporaryDirectory, "_excel_extractor", 
       FileBaseName[path]}},
   If[DirectoryQ[eDir],
    Quiet@DeleteFile[file];
    Switch[$OperatingSystem,
     "MacOSX",
     RunProcess[{"zip", "-r", ExpandFileName@file,
       Sequence @@
        Map[
         FileNameDrop[#, FileNameDepth[eDir]] &,
         FileNames["*", eDir, \[Infinity]]
         ]},
      ProcessDirectory -> eDir
      ],
     _,
     RunProcess[{"zip", "-r", ExpandFileName@file,
       Sequence @@
        Map[
         FileNameDrop[#, FileNameDepth[eDir]] &,
         FileNames["*", eDir, \[Infinity]]
         ]},
      ProcessDirectory -> eDir
      ]
     ];
    DeleteDirectory[eDir, DeleteContents -> True];
    file,
    $Failed
    ]
   ];
excelCompress[file_] :=
 excelCompress[file, file]

Now I made a dumb little spreadsheet to work with:

dumb spreadsheet

And we'll just add at value at "A10":

excelExtract[$workBook]

"/private/var/folders/9t/tqc70b7d61v753jkdbjkvd640000gp/T/_excel_\
extractor/bleh"

excelValuesUpdate[$workBook,
 "A10" -> 20
 ]

"/private/var/folders/9t/tqc70b7d61v753jkdbjkvd640000gp/T/_excel_\
extractor/bleh/xl/worksheets/sheet1.xml"

excelCompress[$workBook]

"~/Desktop/bleh.xlsx"

Note that the changes aren't merged until we re-compress. And here's the output:

update

One nice thing about this is that it lets us update with more precision. Everything else about it is worse.

Finally here's a function to do all of this in one go:

excelUpdate[file_, worksheet : _String | _Integer : 1,
  vals_, overwrite : True | False : False] :=
 (
  excelExtract[file];
  excelValuesUpdate[file,
   worksheet,
   vals,
   overwrite
   ];
  excelCompress[file]
  )

And we can use this like so:

excelUpdate[$workBook,
 {
  "B15" -> 10,
  "G22" -> 10,
  "C1" ->
   <|
    "CellValue" -> 1,
    "CellMeta" ->
     {
      "s" -> "1"
      }
    |>
  }
 ]

update 2

Note that odd Association spec. That's how I'm handling the formatting. The styles are numbered, I guess. So you use "s"->"n" to refer to the nth style. In my case yellow background is the first.

The style spec seems to be somewhat involved and not worth going in to, so I'd consider this a way to replace values without preserving styling.

Finally I've put this in a package here. You can use it like so:

Get["https://raw.githubusercontent.com/b3m2a1/mathematica-tools/master/ExcelUpdate.m"];
ExcelUpdate[file,vals]
b3m2a1
  • 46,870
  • 3
  • 92
  • 239
  • 1
    Excel 2003 (and as I suppose later versions too) can save its Workbooks in the "XML Table" format preserving all the formatting and styling information and then open and edit them exactly as native Excel workbooks (they even recognized by the Windows Explorer as Excel files despite the XML extension!). Some time ago I tried to investigate how it is possible to generate them from Mathematica with explicit control over formatting: (1), (2), (3). – Alexey Popkov Sep 15 '17 at 18:45
  • @AlexeyPopkov Very cool. I did this mostly to see if I could, but sometime when I'm not sick of Excel (this was the first time it had been opened on my machine since ~2014 and hopefully I can go another 3 years without it) I'll play with that. – b3m2a1 Sep 15 '17 at 18:49