
I have a table:

codebook = {{1, "BO4675365"}, {2, "4581X0DM7"}, {3, "BR8983933"}, {4, "BR2057668"},
   {5, "BN7431495"}, {6, "BG1120297"}, {7, "ZN796880"}, {8, "BO9805678"},
   {9, "BJ4669187"}, {10, "BM0350801"}, {11, "ZQ6021518"}, {12, "BV671559"},
   {13, "BO3833387"}, {14, "BU6273025"}, {15, "BQ1964585"}, {16, "BM7272362"},
   {17, "ZP1747481"}, {18, "BZ314326"}, {19, "ZM5972504"}, {20, "UV8801572"}}

To "read" a code, I search for the alphanumeric entry and read off the corresponding number:

Select[codebook, #[[2]] == "4581X0DM7" &] // AbsoluteTiming

which gives (timed on the full table, which has 7209 entries):

{0.021373, {{2, "4581X0DM7"}}}

To read just the entry number, I use

Select[codebook, #[[2]] == "4581X0DM7" &][[1, 1]]

which gives

2

Is there a faster way to pick the right entry? Running this lookup millions of times may be too slow. Basically, I'm looking for the fastest way to look up a code in the table and get its number.

apg

2 Answers


I would create two Associations, one for lookups by number and the other for lookups by code. For large tables, Associations will be considerably faster than Select.

1. By number

enum = First /@ GroupBy[First -> Last] @ codebook

Lookup[enum, 6, {}]

"BG1120297"

Lookup[enum, {6, 7, -1}, {}]

{"BG1120297", "ZN796880", {}}

enum[6]

"BG1120297"

2. By code

code = First /@ GroupBy[Last -> First] @ codebook;

Lookup[code, "ZP1747481", {}]

17

code["ZP1747481"]

17

3. KeyTake

If you want to see several codes together with their numbers:

KeyTake[{"BO4675365", "BO9805678"}] @ code

<|"BO4675365" -> 1, "BO9805678" -> 8|>

Convert to a List of pairs:

KeyValueMap[List] @ KeyTake[{"BO4675365", "BO9805678"}] @ code

{{"BO4675365", 1}, {"BO9805678", 8}}
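As a minimal sketch of an alternative construction, assuming every code in the second column is unique, the code-to-number Association can also be built directly with AssociationThread. The name codeToNumber is only illustrative, and the timing lines show one way to compare against Select on your own table; no particular speed-up is claimed here.

```mathematica
(* Assumption: the codes (second column) are unique.
   codeToNumber is a hypothetical name used only in this sketch. *)
codeToNumber = AssociationThread[codebook[[All, 2]] -> codebook[[All, 1]]];

codeToNumber["4581X0DM7"]
(* 2 *)

(* Several codes at once; codes that are absent return the default value *)
Lookup[codeToNumber, {"BM0350801", "NOTINLIST", "BR2057668"}, Missing["NotFound"]]
(* {10, Missing["NotFound"], 4} *)

(* Time many repeated lookups with each approach on your own data *)
First @ AbsoluteTiming[Do[codeToNumber["4581X0DM7"], {10^5}]]
First @ AbsoluteTiming[Do[Select[codebook, #[[2]] == "4581X0DM7" &][[1, 1]], {10^5}]]
```

The Association is built once, so its construction cost is spread over all subsequent lookups.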

eldo

Based on this 2013 answer by Mr.Wizard.

It is fast because it traverses the codebook only once, however many codes are requested, and it avoids Alternatives, which would be quite slow.

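getSubset[input_List, sub_List] := Module[{test},
  (* mark each requested code: test[code] = True *)
  (test@# = True) &~Scan~sub;
  (* one pass over input; Sow each matching row, tagged by its code *)
  Apply[Sequence,
    Reap[Cases[input, x : {_, y_?test} :> x~Sow~y], sub][[2]], {2}]]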

alphanums = {"BM0350801", "NOTINLIST", "BR2057668"};

found = getSubset[codebook, alphanums]

{{{10, "BM0350801"}}, {}, {{4, "BR2057668"}}}

First /@ Catenate[found]

{10, 4}

Entries that were not found can be identified with:

Extract[alphanums, Position[found, {}]]

{"NOTINLIST"}
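To make the shape of the result concrete, here is a small check (a sketch, not part of the original method) that compares getSubset with running a separate Cases search for each requested code. The per-code version scans the codebook once per code, whereas getSubset collects the same rows in a single pass. It assumes codebook, alphanums, and found are defined as above.

```mathematica
(* One Cases scan per requested code: simple, but traverses codebook repeatedly *)
perCode = Cases[codebook, {_, #}] & /@ alphanums;

(* Same rows, in the same order as alphanums, with {} for codes not present *)
perCode === found
(* True, given the outputs shown above *)
```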

Chris Degnen