
When I try to get this answer from Mathematica.SE by evaluating

Import["https://mathematica.stackexchange.com/a/95725/21532"]

I just get some text without any formatting. But what I want is an image like this:

Can Mathematica do this? Maybe to do what I want I should use my web browser? But I don't know how to do it that way.

Really my question is how can I make Mathematica retrieve the question or the full page as an image when all I have is its URL?


This is not exactly a duplicate of the question linked in the comments. When I evaluate

dotNetBrowserScreenshot["https://mathematica.stackexchange.com/a/95725/21532"]

I don't get just the answer, but the full page. I could add the option Height -> 500, but I usually don't know what value to give. Can anybody improve dotNetBrowserScreenshot to meet my need?

yode
  • Re. comment about the duplicate: You have to create a new HTML document with just the HTML that you want to render, if you want to render just a part of the document. IMHO this question is not about Mathematica. – C. E. Jul 09 '16 at 14:52
  • @C.E. Sorry if this annoys you, but have you ever run dotNetBrowserScreenshot["http://mathematica.stackexchange.com/a/95725/21532", Height -> 500]? That part is exactly what I want, as in this output. I think dotNetBrowserScreenshot can be improved, so this post still has its value. :) – yode Jul 09 '16 at 15:03
  • You can compare it with LaTeX if you want: if you have a LaTeX document, can you use Mathematica to determine how many pixels a certain part of the document will take up in a PDF? The answer quite clearly is that you can't without running the LaTeX code through a compiler that creates a PDF document with all the parts laid out. But even when you have the PDF, there's no simple way to determine how many pixels a certain part takes up. What you can do is take only the LaTeX of the part that you want and render just that in a PDF document by itself. Then you know that you'll get the right result. – C. E. Jul 09 '16 at 16:04
  • It's not clear what you mean by "get an image like this" - what are the actual requirements? It seems that you want to scroll to a certain part of the web page automatically and then set a certain rectangle for which an image should be produced. That appears to be outside the scope of Mathematica because it's a web browser automation question. But maybe I don't understand what you actually want. – Jens Jul 09 '16 at 16:08
  • @Jens The requirement is: when the input is "http://mathematica.stackexchange.com/a/95725/21532", the output is this picture. Actually, dotNetBrowserScreenshot["http://mathematica.stackexchange.com/a/95725/21532", Height -> 500] gives exactly what I want; I just don't like having to specify Height -> 500. So I hope dotNetBrowserScreenshot can be improved. – yode Jul 09 '16 at 16:23
  • So you want the default height to be 500 rather than Automatic? Just change the value in Options[dotNetBrowserScreenshot] – Simon Woods Jul 09 '16 at 16:29
  • @SimonWoods No, I hope the Height option will adapt automatically to what I need. In this example the value is 500, but I don't know what the value should be when I give a different link. – yode Jul 09 '16 at 16:33
  • @SimonWoods I'm just confused about why my description leads to so many misunderstandings. – yode Jul 09 '16 at 16:34
  • Adjusting the height to the exact size of the answer isn't feasible even in a regular browser. What if the answer doesn't fit in the window even when maxing out the screen? – Jens Jul 09 '16 at 17:20
  • I guess one could build a copy of the formatted answer from the XMLObject (cutting unwanted parts), send it to a browser and tell it to print that. But getting the formatting right (which seems to be required) isn't a Mathematica issue. – Jens Jul 09 '16 at 17:33
  • @Jens If you use the Firefox browser, when you open the link to a particular answer, the answer briefly flashes with a yellow background to highlight it. I don't know whether this behavior can help. – yode Jul 09 '16 at 17:43

1 Answer


This is an answer to the original question. It does not cover the revision concerning dotNetBrowserScreenshot.

  1. If you give Import a URL, you are going to get back whatever the site you contact gives you in response. In this case it's going to be the whole page.

  2. Without any qualifying 2nd argument, you are going to just scrape the text off the page.

  3. You can find out what 2nd arguments you might give with

    Import["http://mathematica.stackexchange.com/a/95725/21532", "Elements"]
    

    which returns

    {"Data", "FullData", "Hyperlinks", "ImageLinks", "Images", "Plaintext", 
     "Source", "Title", "XMLObject"}
    
  4. You will get complete formatting information if you give either "Source" or "XMLObject" as the 2nd argument. You will still need to isolate the answer you want from the rest of the page, but writing the code to do that is doable.
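
Isolating one answer from the "XMLObject" result could look roughly like the sketch below. It is only an illustration: it assumes the answer is wrapped in a div whose id has the form answer-<number>, which is a guess about the page markup rather than something established here. The sample HTML string and the helper name elementsWithId are invented so that the example runs without a network connection.

```mathematica
(* Stand-in for a downloaded page; this markup is invented for illustration *)
html = "<html><body>
  <div id=\"question\"><p>Question text</p></div>
  <div id=\"answer-95725\" class=\"answer\"><p>Answer text</p></div>
  <div id=\"answer-1\" class=\"answer\"><p>Another answer</p></div>
</body></html>";

(* Parse to symbolic XML, the same kind of expression that
   Import[url, "XMLObject"] returns for a live page *)
xml = ImportString[html, {"HTML", "XMLObject"}];

(* Hypothetical helper: collect every div whose id attribute equals id *)
elementsWithId[doc_, id_String] :=
  Cases[doc, XMLElement["div", {___, "id" -> id, ___}, _], Infinity];

(* Assumption: the wanted answer carries the id "answer-95725" *)
matches = elementsWithId[xml, "answer-95725"];

(* Serialize the isolated element back to a markup string *)
answerHTML = ExportString[First[matches], "XML"];
```

For a real page, xml would come from Import[url, "XMLObject"] instead of ImportString. The resulting answerHTML keeps the markup of that one answer only; turning it into an image still requires something that renders HTML, such as a browser.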

m_goldberg