44

Is there a way to copy and paste code snippets from SE to Mathematica if these snippets are interspersed with text?

For example, in both the question and the answer of "Morphing Graphics, color and location", the code blocks are separated by text and graphics.

Pasting this into Mathematica in n steps is tiresome. Perhaps there is some nice way to make pasting as comfortable as the other way round with the code and graphics palettes?

Yves Klett

4 Answers

41

Code extractor using the StackExchange API

The following code uses the 2.0 version of the SE API and has also been cleaned up a bit (place it in your kernel's init.m or your custom functions package if you'd like to be able to use it anytime).

The function takes a single string argument, which is the URL obtained from the share link under a question/answer.

Example

[Image: example usage]


importCode[url_String] := 
 With[{
   (* extract the contents of each <pre><code> block, rejecting matches that span several blocks *)
   filterCode = StringCases[#, ("<pre><code>" ~~ ("\n" ...) ~~ x__ ~~ ("\n" ...) ~~ 
         "</code></pre>") /; StringFreeQ[x, "<pre><code>" | "</code></pre>"] :> x] &,

   (* decode the HTML entities in the returned post body *)
   convertEntities = StringReplace[#, {"&gt;" -> ">", "&lt;" -> "<", "&amp;" -> "&", "&quot;" -> "\""}] &,

   (* write each snippet into the current notebook as an Input cell *)
   makeCodeCell = Scan[NotebookWrite[EvaluationNotebook[], Cell[Defer@#, "Input", CellTags -> "Ignore"]] &, Flatten@{#}] &,

   (* drop the leading "http://", split on "/", then query the API with the post id (`1`) and site name (`2`) *)
   postInfo = Import[ToString@ StringForm["http://api.stackexchange.com/2.1/posts/`1`?site=`2`&filter=!9hnGsretg", #3, #1] & @@ {First@StringCases[#, Shortest[s__] ~~ "." ~~ ___ :> s], #2, #3} & @@ StringSplit[StringDrop[url, 7], "/"][[;; 3]], "JSON"]},

  OptionValue["items" /. postInfo, "body"] // filterCode // convertEntities // makeCodeCell]
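The densest part of the function is the step that turns the share URL into an API request. As a rough sketch of that step only, the two helpers below (parseShareURL and apiRequestURL are hypothetical names, not part of the answer) split a share link into its pieces and rebuild the request URL used above. Unlike the original, which drops exactly seven characters for "http://", this sketch assumes the link may start with either "http://" or "https://".

```mathematica
(* parseShareURL and apiRequestURL are hypothetical helpers, not part of the answer *)
parseShareURL[url_String] :=
 Module[{parts, site},
  (* strip the scheme, then split "host/type/id/user" on "/" *)
  parts = StringSplit[
    StringReplace[url, StartOfString ~~ ("http://" | "https://") -> ""], "/"];
  (* the API site name is taken to be the host up to its first "." *)
  site = First@StringCases[First@parts, StartOfString ~~ Shortest[s__] ~~ "." :> s];
  {"site" -> site, "type" -> parts[[2]], "id" -> parts[[3]]}]

(* rebuild the request URL that importCode passes to Import *)
apiRequestURL[url_String] :=
 With[{p = parseShareURL[url]},
  "http://api.stackexchange.com/2.1/posts/" <> ("id" /. p) <>
   "?site=" <> ("site" /. p) <> "&filter=!9hnGsretg"]

apiRequestURL["http://mathematica.stackexchange.com/q/3535/45431"]
```

Only the post id and the site name end up in the request; the "q" or "a" segment and the trailing user id are ignored, which is why the same function accepts share links for both questions and answers.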

NOTE: I don't do any rigorous error checking or check to see if you're entering a valid Stack Exchange URL or if the question/answer is deleted (deleted posts cannot be accessed via the API), etc. So if you get any errors, it might be worthwhile to check if there's something wrong on the site.

Also, the SE API limits you to 300 calls/day/IP, if I remember correctly. That's quite a lot of calls for any reasonable person, and ideally you shouldn't exceed it. Nevertheless, the possibility of being throttled is something to keep in mind if you also happen to be using the API for other purposes, such as site statistics.

xzczd
rm -rf
  • Could it grab the URL from the clipboard? – Dr. belisarius Aug 22 '12 at 14:55
  • @Verde You could add a definition: importCode[] := importCode[ First@Cases[NotebookGet@ClipboardNotebook[], Cell[x_, ___] :> x, Infinity]] which will take whatever is in the clipboard (no checks to see if it is a URL or not), but I don't think this is useful, as the chances of having forgotten and copied something else are high... pasting it as an argument shouldn't be hard, and I would prefer explicit over clipboard copy :) – rm -rf Aug 22 '12 at 15:16
  • The call throttling problem can be circumvented if you use an API key. I've written about that before. I believe it was in my SO API answer that you were referring to in a comment above. – Sjoerd C. de Vries Aug 26 '12 at 06:39
  • I think you could first filter out blockquotes before extracting code blocks, so the result doesn't get messed up by blockquotes that contain a code block, which some users use for formatting output. Also, IMO Shortest[...] would be enough for HTML parsing instead of the StringFreeQ[...] condition. Maybe something like StringCases[StringReplace[#,Shortest["<blockquote>"~~__~~"</blockquote>"]:>""],Shortest["<pre><code>"~~("\n"...)~~x__~~("\n"...)~~"</code></pre>"]:>x]. – Silvia Jun 12 '13 at 11:21
  • @Silvia I agree that one can implement workarounds to filter out the blockquote, but I'm against that because the issue here is not the code, but the incorrect use of formatting styles. Tomorrow, someone else might start using a different tag for some other unintended purpose, and we can't be catching/tracking them all. I figured it was better to educate and reason with users to use only code blocks for code than to add exceptions. Re: Shortest, yes, I think they're equivalent and it looks cleaner. – rm -rf Jun 12 '13 at 12:01
  • @R.M. An animated GIF showing how it works would be very helpful. – LCarvalho Jan 04 '17 at 12:50
  • Maybe this post could be sanitized and put into the resource function repo? – Gravifer Mar 09 '21 at 08:52
  • I take the liberty to add a rule "&quot;" -> "\"" for handling ", feel free to rollback if you don't like it :) . – xzczd Jun 26 '22 at 02:57
  • I take the liberty to use NotebookWrite instead of CellPrint according to the discussion in https://mathematica.stackexchange.com/q/269955/1871. Feel free to rollback if you don't like it :) . – xzczd Feb 21 '23 at 06:33
20

You could do something like this:

string = "(Paste Here)"

exps = Select[
   string ~StringSplit~ "\n\n",
   SyntaxQ@# && ! MatchQ[MakeExpression@#, _@__Times | _@Null] &];

CellPrint@Cell[#, "Input"] & ~Scan~ exps
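To see what the Select test keeps and rejects, here is a small sketch on made-up input (the sample text and the name keepQ are illustrative assumptions, not from the answer). A paragraph of ordinary words is valid syntax but parses as a product of symbols, so the Times pattern rejects it; a paragraph with an unclosed bracket fails SyntaxQ outright.

```mathematica
(* made-up sample: two prose paragraphs and two code paragraphs *)
sample = "Here is some explanatory text\n\nPlot[Sin[x], {x, 0, 2 Pi}]\n\nThen define f[x as follows\n\nf[x_] := x^2";

(* same test as in the answer, given a name for reuse *)
keepQ = SyntaxQ@# && ! MatchQ[MakeExpression@#, _@__Times | _@Null] &;

(* should keep only the Plot[...] line and the definition of f *)
Select[StringSplit[sample, "\n\n"], keepQ]
```

One limitation worth keeping in mind: a paragraph consisting of a single word, such as "Note", parses as a lone symbol rather than a product, so it would pass the test and be pasted as an Input cell.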
Mr.Wizard
16

I posted a possible answer for this on meta (Download questions or chats for offline reading), but perhaps it belongs here on the main site instead. I have a paclet that downloads a Stack Exchange question URL and creates a notebook version in which the code blocks can be evaluated.

It can be downloaded from:

https://github.com/carlwoll/Stack-Exchange-Stylesheet/releases/tag/v0.1-alpha

Download the .paclet file, and then run:

PacletInstall[file]

To use, do:

<<StackExchange`
NotebookPut @ StackExchangeView["http://mathematica.stackexchange.com/q/3535/45431"]

where I use this question as an example (the NotebookPut won't be necessary when the paclet is final). The following is a snippet of what the notebook output looks like:

[Image: snippet of the notebook output]

I use Import[url, "XMLObject"] instead of the Stack Exchange API (since I didn't know about the API when I started), so I need to investigate the merits of using the API or not.

It is also possible to use style key tabbing (Tab at the start of a cell), Shift-Enter, and right-click to convert "StackExchange"-styled cells to a markdown version, a hybrid WYSIWYG version, or a deployed version (although this aspect is a bit buggy). This is what the notebook looks like after converting the snippet to the deployed version:

[Image: the same snippet after conversion to the deployed version]

As you can see, the deployed version still needs work (h2 and * formatting).

Feedback is welcome.

Carl Woll
6

The StackAPI answer seems really nice; however, it failed for me when I tested it, so I coded a bare-bones implementation that simply pulls out any code blocks from an arbitrary HTML page, without needing it to be from Stack Exchange or even well-formed HTML:

 codeBlocks[url_] := CellPrint[
   Cell[#, "Input"] & /@ 
     StringCases[Import[url, "Source"], 
     "<pre><code>" ~~ p : (Shortest[___]) ~~ "</code></pre>" :> p]]
jVincent
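Because codeBlocks works on the raw page source, the extracted strings still contain HTML entities such as &gt; and &amp;. A minimal sketch, assuming a made-up HTML fragment in place of a real page, that combines the Shortest pattern from this answer with the entity rules from the API-based answer (the names html, blocks, and decode are illustrative):

```mathematica
(* made-up HTML fragment standing in for Import[url, "Source"] *)
html = "<p>Text</p><pre><code>If[a &gt; b, a, b]\n</code></pre><p>More</p><pre><code>x &amp;&amp; y\n</code></pre>";

(* same extraction pattern as codeBlocks *)
blocks = StringCases[html, "<pre><code>" ~~ p : Shortest[___] ~~ "</code></pre>" :> p];

(* entity rules borrowed from the API-based answer *)
decode = StringReplace[#, {"&gt;" -> ">", "&lt;" -> "<", "&amp;" -> "&", "&quot;" -> "\""}] &;

(* decode each block and trim the trailing newline *)
StringTrim /@ decode /@ blocks
```

The decoded strings can then be wrapped in Cell[#, "Input"] exactly as in codeBlocks.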
  • Could you share the exact call you made that failed? What error did you get? A bunch of badly formed string boxes? – rm -rf Aug 22 '12 at 13:57
  • @R.M I ran the code posted in the answer and the suggested command StackAPI`GetCode["http://meta.mathematica.stackexchange.com/a/307/5"], which returned a set of errors from StringCases and StringReplace. It seems the call to the API is failing, returning {"answers" -> {}, "page" -> 1, "pagesize" -> 30, "total" -> 0} from the Import. – jVincent Aug 23 '12 at 08:53
  • For some reason, I wasn't pinged by this comment from you back in August. Anyway, the reason it failed for you is that this question was originally on meta (which is where I wrote this answer) and was then migrated here. The suggested command pointed to the original, but since that was deleted, you couldn't retrieve it using the API (this was something I had mentioned in the post). In any case, my answer used v1.0 of the API, which was bound to fail at some point since they've deprecated it. I updated my answer today with a new version (using API v2.0), which is when I saw this comment. – rm -rf Feb 16 '13 at 00:40