
I have found an HTML page that was clearly created with Mathematica's Save as HTML option. Is there a way to reverse this operation? That is, can I open the HTML file in Mathematica and render it as an nb file again?

Red
  • Unless it is specified otherwise via Export, almost everything important is rasterized by default, so I don't think so. Why not embedded CDF? – Kuba Dec 09 '13 at 11:26
  • @Kuba OK, I didn't realize that not only images but also expressions were rasterized. What is "embedded CDF"? (I am not the author of the HTML page) – Red Dec 09 '13 at 12:35
  • CDF (Computable Document Format). Usually there are links to the source notebooks, but if there aren't any, maybe you can ask the author? I don't believe a text recognition approach will be successful with Mathematica syntax. – Kuba Dec 09 '13 at 12:40
  • @Kuba I don't think he created the webpage himself, he probably wants to run something he found online. Red, is this correct? I've felt the pain of that before, uncopyable rasterized cells are very annoying. – Szabolcs Dec 09 '13 at 19:57
  • @Szabolcs You are right, but I know it, have you read all the comments? :) – Kuba Dec 09 '13 at 20:00
  • @Kuba I was confused by "why not embedded CDF", it sounded like you assumed he created the document. – Szabolcs Dec 09 '13 at 20:24
  • @Szabolcs Yes, it was so. But then OP said "I am not the author of the html page" – Kuba Dec 09 '13 at 20:28

1 Answer


I created a test image with Heike's code from "How to create word clouds?" and posted it here:

[image: the test image]

So let's download it:

pic = Import["https://i.stack.imgur.com/Ni4Kl.png"]; 

For a full HTML page you can use Import[....html, "Images"].
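For a whole page, the same idea could be sketched roughly as follows. The file name "page.html" and the helper name ocrImages are hypothetical, and the height 1100 is only the value that happened to work for the test image, so other pages may need a different size.

```mathematica
(* Hypothetical helper: resize each image to height h, then run OCR on it *)
ocrImages[imgs_List, h_Integer] :=
  TextRecognize[ImageResize[#, {Automatic, h}]] & /@ imgs;

(* Hypothetical path: replace with the saved HTML file *)
file = "page.html";

(* The "Images" element gives the list of images embedded in the page *)
imgs = If[FileExistsQ[file], Import[file, "Images"], {}];

(* One recognized string per image *)
texts = ocrImages[imgs, 1100];
```

Each element of texts still needs the manual corrections described for the single-image case.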

TextRecognize[ImageResize[pic, {Automatic, 1100}]]

[image: output of TextRecognize]

Not perfect, but it's something. The result is a String, so after corrections you can convert it to InputForm.
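One possible way to do that conversion, as a sketch: check the corrected string with SyntaxQ, then parse it with ToExpression, wrapping the result in Hold so that nothing evaluates before you have looked at it. The string corrected below is a made-up stand-in for a manually fixed OCR result, not the actual output.

```mathematica
(* Made-up stand-in for an OCR result after manual corrections *)
corrected = "Total[Range[10]^2]";

(* True when the string is syntactically valid Wolfram Language input *)
valid = SyntaxQ[corrected];

(* Parse as InputForm but keep the expression unevaluated inside Hold *)
held = ToExpression[corrected, InputForm, Hold];

(* Evaluate only once the held expression looks right *)
result = ReleaseHold[held];
```

If SyntaxQ returns False, the string still contains misreads (unbalanced brackets are a typical case) and needs another round of corrections.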

You can also play with ImageResize; I chose 1100 because it gives nice output.
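To compare sizes side by side, something like the following could be used. The candidate heights are arbitrary values picked for illustration; only 1100 comes from the answer itself.

```mathematica
pic = Import["https://i.stack.imgur.com/Ni4Kl.png"];

(* Arbitrary candidate heights; 1100 is the one used above *)
heights = {600, 800, 1100, 1400};

(* Association from each height to the text recognized at that size *)
results = AssociationMap[
   TextRecognize[ImageResize[pic, {Automatic, #}]] &, heights];

(* Show each recognized text under a label with its height *)
Column[KeyValueMap[Labeled[#2, #1, Top] &, results]]
```

Picking the best height is still a matter of reading the outputs and choosing the one that needs the fewest corrections.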

I think that the first operation should be

StringReplace[..., {"»?" -> "#", "f?" -> "#", " . " -> "."}]

:)
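As a small illustration of that cleanup step, here are the same replacement rules applied to an invented sample string. The string raw is made up to contain one of the misreads and is not taken from the real OCR output.

```mathematica
(* Replacement rules from the answer; extend them for other misreads *)
rules = {"»?" -> "#", "f?" -> "#", " . " -> "."};

(* Invented sample containing a misread Slot character *)
raw = "»?^2 & /@ Range[3]";

cleaned = StringReplace[raw, rules]
(* "#^2 & /@ Range[3]" *)

ToExpression[cleaned]
(* {1, 4, 9} *)
```

The rules are plain string substitutions, so they can also change legitimate occurrences of the same characters, for example a name ending in f followed by a question mark. The cleaned string is worth reading through before it is evaluated.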

Kuba