For a number of years I have been compiling a comprehensive bibliography of works that include mention of fishes of the family Scorpaenidae. Consequently, I am always on the lookout for datasets that can be mined for this purpose.

Recently, I learned that I could use this method to automatically search the PubMed database:

pubmed = ServiceConnect["PubMed"]

scorpaenidaeSearch = pubmed["PublicationSearch", "Query" -> "Scorpaenidae", MaxItems -> 1000, "Elements" -> "FullData"]

I am able to get much of the pertinent data directly (although it only seems to return recent publications). From this I can obtain a list of the UIDs of the located publications by using the following commands.

scorpSearch = Values[Normal[scorpaenidaeSearch]]
scorpSearch[[All, 1]]

However, the "FullData" parameter does not return the publication abstract. For an individual UID I can obtain the corresponding Abstract using

pubmed["PublicationAbstract", "ID" -> "37344374"]

However, I can't figure out the syntax to generate the abstracts from a list of UIDs without violating the NCBI limit of not more than one request every 3 seconds.

How can I place Pause[5] into the following mapping, so that it generates the abstracts for the list of UIDs returned by the search above?

abstracts = 
  pubmed["PublicationAbstract", "ID" -> #] & /@ scorpSearch[[All, 1]]

I've tried to construct a Do loop for this purpose, but cannot work out an appropriate syntax for the mapping. Without a pause, the mapping violates the rate limit on the service and the request is denied. The following does not work.

 abstracts = 
  Do[Pause[5]; pubmed["PublicationAbstract", "ID" -> String[#]] &, 125] /@ 
  scorpSearch[[All, 1]]

(125 records are returned for this particular search)

Nor does:

abstracts = (pubmed["PublicationAbstract", "ID" -> ToString[#]] &;
  Pause[5]) /@ scorpSearch[[All, 1]]
creidhne
Stuart Poss

1 Answer

uids = Normal @ scorpaenidaeSearch[All, "UID"]

abstracts = {};

Do[Pause[5]; 
 AppendTo[abstracts, 
  pubmed["PublicationAbstract", "ID" -> ToString[i]]], 
 {i, uids}]
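
The same throttling can also be written as a mapping, which is closer to the form tried in the question. The sketch below is only an illustration: throttledMap is a hypothetical helper name, and the squaring function stands in for the PubMed request so the code runs without a network connection. The key point is the parentheses: in (... &; Pause[5]) /@ list the & closes the function before the ;, so Pause runs once and is never part of the mapped function, whereas (Pause[5]; ...) & puts both steps inside it.

```wolfram
(* Hypothetical helper, not part of the original answer: apply f to each
   element of list, pausing `delay` seconds before every call. *)
throttledMap[f_, list_List, delay_ : 5] :=
  Table[Pause[delay]; f[x], {x, list}]

(* The same idea written inline with Map. The parentheses make Pause and the
   call one compound expression, and & turns all of it into a function. *)
squares = (Pause[0.1]; #^2) & /@ {1, 2, 3}

(* The helper with a stand-in function and no delay. *)
throttledMap[#^2 &, {1, 2, 3}, 0]
```

With the pubmed connection and uids defined above, the equivalent call would presumably be abstracts = throttledMap[pubmed["PublicationAbstract", "ID" -> ToString[#]] &, uids], which also avoids growing a list with AppendTo.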

View abstracts 5 through 7:

Row[Panel /@ abstracts[[5 ;; 7]]]

$Version
"13.3.0 for Linux x86 (64-bit) (June 3, 2023)"
kglr
  • The code as written works for reading records from abstracts. However, when placed into the following write stream, it generates an error without creating the file. – Stuart Poss Jul 12 '23 at 15:37
  • stream = OpenWrite[scorpaenidaeAbstractFile]; Write[stream, abstracts[[All]]]; Close[stream] produces: URLFetch::invhttp: A library error occurred. The raw details are: "libcurl error (18): transfer closed with outstanding read data remaining" – Stuart Poss Jul 12 '23 at 15:44
  • @StuartPoss, I was able to run the code s = OpenWrite["testfile"]; Write[s, abstracts] to create a file without any error messages (version 13.3.0 for Linux x86). Try putting Close[stream] in a separate cell after executing Write[stream, abstracts]. If that does not work, you might consider posting the issue as a new question. – kglr Jul 12 '23 at 16:56
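
As a possible alternative to the OpenWrite/Write/Close sequence discussed in the comments, Put writes an expression to a file in a single call and Get reads it back. The sketch below uses a made-up two-element list and a hypothetical file name in the temporary directory in place of the real abstracts and scorpaenidaeAbstractFile. It does not address the URLFetch::invhttp message itself, which appears to come from a network request rather than from the file write.

```wolfram
(* Hypothetical stand-in for the list of downloaded abstracts. *)
sample = {"first abstract", "second abstract"};

(* Hypothetical file location; any writable path should work. *)
file = FileNameJoin[{$TemporaryDirectory, "scorpaenidaeAbstracts.m"}];

Put[sample, file];      (* write the expression to the file in one call *)
restored = Get[file];   (* read it back as a Wolfram Language expression *)

restored === sample
```

Because Put opens and closes the file itself, there is no stream left open if a later step fails.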