18

I am trying to write a function that takes a string and returns the Google search hits for it (I have yet to decide on the output format and other details).

I have looked around and found nothing that works: the WSDL-based Google search service that the docs reference for WebServices has been retired and returns a 404.

Any hints, other than scraping the results page by hand?

m_goldberg
mgm

2 Answers

11

After looking more closely at the comment by Szabolcs, I chose this approach (I had tried it for images, but it hadn't crossed my mind to use it for web search):

result = Import[ "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=furry%20beasts", "JSON"]

and then

"url" /. ("results" /. ("responseData" /. result))

to get the links to the pages. Or, to get the "preview" of the pages:

"content" /. ("results" /. ("responseData" /. result))

It works like a charm, even though it is unsupported and bound to break at some point.
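
To get closer to the function asked for in the question, the two steps above can be wrapped into small helpers. This is only a minimal sketch: the names `googleURL` and `hitField` are hypothetical, and the `mock` response is hand-written to mirror the nesting used in the replacements above so that the extraction can be tried without calling the (unsupported) service.

```mathematica
(* Hypothetical helper: build the request URL for a query string.
   URLEncode percent-encodes characters such as spaces. *)
googleURL[query_String] :=
  "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=" <>
   URLEncode[query]

(* Hypothetical helper: pull one field out of every hit of an imported
   response, using the same nested replacements as above. *)
hitField[response_, field_String] :=
  field /. ("results" /. ("responseData" /. response))

(* Hand-written mock (an assumption, not real output) with the same
   nesting as the imported JSON. *)
mock = {"responseData" -> {"results" -> {
      {"url" -> "http://example.com/a", "content" -> "first"},
      {"url" -> "http://example.com/b", "content" -> "second"}}}};

hitField[mock, "url"]
(* {"http://example.com/a", "http://example.com/b"} *)
```

With a live connection one would call `hitField[Import[googleURL["furry beasts"], "JSON"], "url"]` instead of using the mock.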

For reference: https://developers.google.com/web-search/docs/

mgm
8

Mathematica now supports a native connection to the GoogleCustomSearch API.

For example:

gs = ServiceConnect["GoogleCustomSearch"]
gs["Search", {"Query" -> "Jennifer Lawrence"}]


You can also run an image search:

gs["Search", {"Query" -> "Jennifer Lawrence", "SearchType" -> "Image"}]
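
Both calls can be folded into one function that takes the query string, which is roughly what the question asks for. The following is a sketch under assumptions: `searchHits` is a hypothetical name, and `mockService` is a stand-in that simply echoes its parameters so the wrapper can be checked without credentials. In real use you would pass the `gs` object returned by `ServiceConnect`.

```mathematica
(* Hypothetical wrapper: run a "Search" request for a query string on a
   service object, with optional extra parameters given as a list of rules. *)
searchHits[service_, query_String, opts : {___Rule} : {}] :=
  service["Search", Join[{"Query" -> query}, opts]]

(* Stand-in for a connected service (an assumption for illustration):
   it returns the parameters it receives instead of search results. *)
mockService["Search", params_List] := params

searchHits[mockService, "Jennifer Lawrence"]
(* {"Query" -> "Jennifer Lawrence"} *)

searchHits[mockService, "Jennifer Lawrence", {"SearchType" -> "Image"}]
(* {"Query" -> "Jennifer Lawrence", "SearchType" -> "Image"} *)
```

Passing the service object as an argument keeps the wrapper independent of any global connection, so the same function would work with `gs` once the credentials described below are set up.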


To use the GoogleCustomSearch API you need an API key and a custom search engine ID.

To get the API key, first go to https://console.developers.google.com and create a project if you don't have one yet. Once you have the project (assuming you are using the new console interface), click the Credentials menu on the left, where all your credentials are listed. If you don't have one, click Add credentials; for GoogleCustomSearch you need an API key of the Browser key type.

For the custom search engine ID, go to https://cse.google.com/all and add a search engine. Once it is created, click on it to open a screen with many options. One of them is a button labelled Search engine ID; click it to get the ID you need.

Here is the official Wolfram documentation for the Google API.

You may also want to try the BingSearch service, which is available from Mathematica as well. The functionality is basically the same. Here is the official documentation for that.

xtian777x