
I am a student trying to analyze GEO2R data for one of my courses. The IDs in the output differ between series, and I need to convert all of them to a common format.

In the process I encountered the following type of ID, whose origin I don't know:

"ID"    "logFC"
"SP_v2 4634"    "-0.9897758"
"SP_v2 3382"    "-0.8391782"
"SP_v2 4210"    "-1.1693583"
"SP_v2 2117"    "-1.0504727"
"SP_v2 3488"    "-0.9756444"
"SP_v2 1128"    "-0.8289103"
"SP_v2 2735"    "-0.8629999"
...

Each row represents a single gene. What kind of ID is this? The GEO accession is GSE97750.

terdon
hhoomn

1 Answer


This is possibly a wild goose chase, but a lot of searching led me to this sample:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2536852

This sample appears to use IDs of the right format, and it in turn leads to the platform:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL22166

This is a custom spotted cDNA array, and the metadata seems to suggest that it is a human platform.

The full annotation table gives gene symbols and Entrez IDs for most of the probes on the array:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?view=data&acc=GPL22166&id=53508&db=GeoDb_blob144

To take the example of the first line of your results:

ID          ORF     Entrez gene SEQUENCE
SP_v2 4634  ETNK1   55500       AAAGCAGCTTCATCTTTCAAAATTGATTTGCTCTGGTTTT

The Entrez Gene record for that ID matches up:

https://www.ncbi.nlm.nih.gov/gene/?term=55500
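
If you need to do this lookup for every row, a rough sketch in Python might look like the following. It assumes you have saved the full annotation table as tab-delimited text with the column names shown above (ID, ORF, Entrez gene), and that your GEO2R output is tab-delimited too; both are assumptions about your files, and the function names are just illustrative. The inline strings stand in for the real files and contain only the rows quoted in this thread.

```python
import csv
import io

# Stand-in for the GPL22166 annotation table: only the row quoted above.
# Assumption: the saved table is tab-delimited with these column names.
ANNOTATION_TSV = (
    "ID\tORF\tEntrez gene\n"
    "SP_v2 4634\tETNK1\t55500\n"
)

# Two rows from the GEO2R output in the question (assumed tab-delimited).
RESULTS_TSV = (
    '"ID"\t"logFC"\n'
    '"SP_v2 4634"\t"-0.9897758"\n'
    '"SP_v2 3382"\t"-0.8391782"\n'
)


def load_annotation(handle):
    """Map probe ID -> (gene symbol, Entrez ID) from a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["ID"]: (row["ORF"], row["Entrez gene"]) for row in reader}


def annotate(results_handle, annotation):
    """Yield (probe ID, symbol, Entrez ID, logFC); unmatched probes get None."""
    reader = csv.DictReader(results_handle, delimiter="\t")
    for row in reader:
        symbol, entrez = annotation.get(row["ID"], (None, None))
        yield row["ID"], symbol, entrez, float(row["logFC"])


annotation = load_annotation(io.StringIO(ANNOTATION_TSV))
annotated = list(annotate(io.StringIO(RESULTS_TSV), annotation))
for record in annotated:
    print(record)
```

With real files you would pass `open(path, newline="")` handles instead of the `io.StringIO` objects. Probes that are missing from the annotation table come back with `None` for the symbol and Entrez ID, so you can see which ones still need attention.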

sjcockell