I want to build a bot to automate web browsing, meaning things like:
- filling forms
- pressing "submit" buttons
- finding certain text inside pages
- and so on...
How can I do this with Mathematica?
The `Import` function only downloads a single web page; it doesn't support cookies and the other features needed to build a complete automated bot. Does Mathematica have a package for this?
`Import` works). The question is very general in this form, and I'd be inclined to say Mathematica is not the right tool for this (you'll end up using JLink or .NETLink anyway). But if you can give a very specific example, we can think about how to implement it in Mathematica (or will be able to say with more confidence that it's not possible without external libraries) – Szabolcs Feb 15 '12 at 10:32

`wget`. You can include it in Mma code by using `Run`. About to go to bed, but do a search, because I posted something here in answer to another question a couple of weeks ago. ...here it is: http://mathematica.stackexchange.com/questions/1186/downloading-files-without-using-import/1211#1211 – Mike Honeychurch Feb 15 '12 at 11:26

`curl`: http://stackoverflow.com/a/6977128/695132 – Szabolcs Feb 15 '12 at 12:08

`RunCurl[x_String, dir_:"C:\\directorywherecurlis\\"] := Module[{id = ToString[Round[AbsoluteTime[]]], run, res}, run = Run2[StringJoin["%comspec% /c ", dir, "curl.exe ", x, " > ", dir, "curl", id, ".log 2>", dir, "curl", id, ".err"]]; res = Import[StringJoin[dir, "curl", id, ".log"], "Text"]; DeleteFile[StringJoin[dir, "curl", id, ".log"]]; (If[FileExistsQ[#1], DeleteFile[#1]] & )[StringJoin[dir, "curl", id, ".err"]]; res];` – Rolf Mertig Feb 15 '12 at 12:37

`` Run2[cmd_String] := Module[{shell}, Switch[$OperatingSystem, "Windows", If[$OperatingSystem === "Windows", Needs["NETLink`"]; shell = NETLink`CreateCOMObject["WScript.shell"]; ]; shell[run[StringReplace[cmd, {"\n" -> "", "\r" -> ""}], 0, True]], "Unix", Run[cmd], "MacOSX", Run[cmd]]]; `` – Rolf Mertig Feb 15 '12 at 12:37

`Import[x,"Source"]` where x is the site (all manually downloaded w/ wget) and then find content using `StringCases[]`, i.e. `trlist = StringCases[pagetext, Shortest["<tr>" ~~ ___ ~~ "</tr>"]];` (which would find all text within rows in a page arranged in that way, for example) – canadian_scholar Feb 15 '12 at 14:24

`curl` for FTP-ing Mma content but it was only once or twice. `wget` is something I use regularly. – Mike Honeychurch Feb 15 '12 at 21:33

`ValueQ` question. Originally I voted to close because I thought it had been asked, not because it's a bad question (it is a good question). (Just to avoid any misunderstanding on why I vote to close.) – Szabolcs Mar 26 '12 at 05:59

`Import[]` in version 9! – CHM Apr 02 '12 at 05:16
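
The approach suggested in the comments (fetch the page with an external tool such as `curl`, then search its source with `StringCases`) could be sketched roughly as follows. This is only an illustration under several assumptions: `fetchPage`, `containsTextQ`, the cookie-jar file and the sample HTML are hypothetical names and data introduced here, and `curl` is assumed to be installed and on the system path. The `curl` options used are `-s` (silent), `-b`/`-c` (read and write a cookie file), `-d` (send form fields as a POST) and `-o` (output file).

```mathematica
(* Hypothetical helper: fetch a URL with curl, keeping cookies in the file jar.
   If postData (e.g. "user=me&pass=secret") is given, curl submits it as a form POST.
   Assumes curl is on the system path; returns the page source or $Failed. *)
fetchPage[url_String, jar_String, postData_String: ""] :=
 Module[{out, cmd, res},
  out = FileNameJoin[{$TemporaryDirectory,
     "page" <> ToString[Round[AbsoluteTime[]]] <> ".html"}];
  cmd = "curl -s -b \"" <> jar <> "\" -c \"" <> jar <> "\" -o \"" <> out <> "\"" <>
    If[postData === "", "", " -d \"" <> postData <> "\""] <>
    " \"" <> url <> "\"";
  Run[cmd];
  res = If[FileExistsQ[out], Import[out, "Text"], $Failed];
  If[FileExistsQ[out], DeleteFile[out]];
  res]

(* Sample page source, inlined so the text-search part runs without network access. *)
pagetext = "<table><tr><td>alpha</td></tr><tr><td>beta</td></tr></table>";

(* Shortest stops each match at the first closing tag, so rows are matched one by one. *)
trlist = StringCases[pagetext, Shortest["<tr>" ~~ ___ ~~ "</tr>"]]

(* Keep only the text inside each table cell. *)
cells = StringCases[pagetext, Shortest["<td>" ~~ c___ ~~ "</td>"] :> c]

(* Hypothetical helper: does a phrase occur anywhere in the page source? *)
containsTextQ[page_String, phrase_String] := ! StringFreeQ[page, phrase]
containsTextQ[pagetext, "beta"]
```

String patterns like these are fragile on real-world HTML, and the shell quoting in `fetchPage` is deliberately simplistic, so anything beyond simple, regularly structured pages would probably need an external library via JLink or .NETLink, as the first comment suggests.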