How can I write an efficient version of the following selection statement:

Keys[Select[hash, # < constant &]]; // AbsoluteTiming

hash can be made from:

hash = Association[Table[i -> i^2, {i, 1, 10^6}]];

On my computer, last year's MacBook Pro, it takes almost half a second. However, this is not an acceptable time for working with hashes. Are there any other ways to do that? Thank you very much.

Paul
  • I will remark that the analogous operation, on Range[10^6]^2 (that is, a List rather than Association), is less than a factor of two faster. So it is not clear that the original expectation of greater speed is reasonable, unless there is a similar claim about speed of Select on the raw list. – Daniel Lichtblau Sep 08 '14 at 22:26
  • @Daniel Select is pretty slow on raw lists compared to numeric equivalents, when the latter are possible. Perhaps Select could be made to auto-compile like Fold etc.? – Mr.Wizard Sep 08 '14 at 23:28

1 Answer

I don't know if it is possible to do much to improve this for an Association object. If the conversion to lists of keys and values can be done ahead of time, outside the timed selection, then the numeric selection itself can be performed quite quickly using UnitStep and SparseArray properties:

(* hash randomized to demonstrate order independence *)
hash = Association[RandomSample @ Table[i -> i^2, {i, 1, 10^6}]];

keys   = Keys[hash];
values = Values[hash];

constant = 27;

keys[[
 SparseArray[UnitStep[values - constant], Automatic, 1]["AdjacencyLists"]
]]

{3, 5, 1, 2, 4}

Needs["GeneralUtilities`"]

keys[[
 SparseArray[UnitStep[values - constant], Automatic, 1]["AdjacencyLists"]
]] // AccurateTiming

0.00680001
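
To make it clearer why this expression picks out the right keys, here is a minimal sketch on a six-element example. The data below is made up for illustration and is not part of the original question; the names mask and pos are likewise only illustrative.

```wolfram
(* small illustrative data, not from the original post *)
keys     = {3, 5, 1, 2, 4, 6};
values   = keys^2;            (* {9, 25, 1, 4, 16, 36} *)
constant = 27;

(* UnitStep gives 0 where value < constant and 1 where value >= constant *)
mask = UnitStep[values - constant]
(* {0, 0, 0, 0, 0, 1} *)

(* with default element 1, the explicitly stored entries are the zeros,
   so "AdjacencyLists" returns the positions where value < constant *)
pos = SparseArray[mask, Automatic, 1]["AdjacencyLists"]

(* Part extraction by position preserves the original key order *)
keys[[pos]]

(* cross-check against the Select form from the question *)
keys[[pos]] === Keys[Select[AssociationThread[keys, values], # < constant &]]
```

Because UnitStep[0] is 1, a value exactly equal to the constant is excluded, which matches the strict # < constant test in the question.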

Unfortunately, the conversion to lists is about two orders of magnitude slower than the selection itself:

AccurateTiming[
 Keys[hash];
 Values[hash];
]

0.665001

For the time being you may be better served by a different data structure.
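
As one possible reading of "a different data structure", the sketch below keeps keys and values as two parallel lists from the start, so that no Association-to-list conversion is needed at query time. This is an assumption about what might suit the use case, not a recommendation from the answer; selectBelow is a hypothetical helper name, not a built-in.

```wolfram
(* hypothetical layout: two parallel integer lists instead of an Association *)
keys   = RandomSample @ Range[10^6];
values = keys^2;

(* selectBelow is an illustrative helper: keys whose value is < c *)
selectBelow[keys_List, values_List, c_] :=
 keys[[SparseArray[UnitStep[values - c], Automatic, 1]["AdjacencyLists"]]]

(* returns the keys 1 through 5 in whatever order they occur in keys *)
selectBelow[keys, values, 27]
```

The trade-off is that parallel lists give up the fast lookup by key that an Association provides, so this only helps if value-based selection is the dominant operation.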

Mr.Wizard