The Goal
I have a list of 2 million sublists of positive integers. Flattened, it has about 60 million elements. The goal is to flatten the list (with Catenate) and delete all duplicates.
The Issue
The integers are very large (> 10^11), and DeleteDuplicates fails with the error SystemException["MemoryAllocationFailure"].
My Attempts
As a stand-in for the real data, let list be defined as:
a := RandomInteger[10^10, 30];
list = Table[a, 2*^6];
Using Catenate and DeleteDuplicates on list consumes a lot of memory:
MaxMemoryUsed[DeleteDuplicates@Catenate@list]
(* 3475344280 *)
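For scale: if the flattened data is held as a packed array of 64-bit machine integers (an assumption, though it should hold for integers below 2^63), 60 million elements need roughly 60*10^6 * 8 bytes = 480 MB, so the peak above is about 7 times the raw data size. A small sketch to check the per-element cost on a reduced version of the dummy data (flatSmall is a name introduced here for illustration only):

```wolfram
(* Reduced dummy data: 1000 sublists of 30 integers each. *)
(* flatSmall is an illustrative name, not part of the original post. *)
flatSmall = Developer`ToPackedArray[
   Catenate[Table[RandomInteger[10^10, 30], 1000]]];

(* True if the integers are stored as a packed machine-integer array *)
Developer`PackedArrayQ[flatSmall]

(* Total bytes; for a packed array this is close to 8 * Length[flatSmall] *)
ByteCount[flatSmall]
```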
I have tried folding Join and DeleteDuplicates over the list:
Fold[DeleteDuplicates[Join[#1, #2]]&, {}, list]
This uses less memory but is far too slow (about 300 times slower).
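A possible middle ground between the two attempts above, as an untested sketch rather than a known solution: fold over chunks of many sublists instead of over single sublists, so that DeleteDuplicates is called far fewer times. The helper name chunkedDeleteDuplicates and the parameter chunkSize are hypothetical, and the right chunk size would have to be found by experiment. Since the running result still has to hold every distinct element, this can only lower the peak memory if the data contain many duplicates.

```wolfram
(* Hypothetical helper: deduplicate chunk by chunk instead of sublist by sublist. *)
(* chunkSize is an assumed tuning parameter, not taken from the original post. *)
chunkedDeleteDuplicates[lists_List, chunkSize_Integer?Positive] :=
  Fold[
    (* #1 is the running duplicate-free result, #2 is the next chunk of sublists *)
    DeleteDuplicates[Join[#1, Catenate[#2]]] &,
    {},
    Partition[lists, UpTo[chunkSize]]
  ]

(* Small usage example: four sublists processed in chunks of two *)
chunkedDeleteDuplicates[{{1, 2}, {2, 3}, {3, 4}, {1, 5}}, 2]
(* {1, 2, 3, 4, 5} *)
```

With chunkSize set to 1 this reduces to the Fold attempt above, and with chunkSize at least Length[list] it reduces to DeleteDuplicates@Catenate@list, so the parameter trades time against memory between those two extremes.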
I also tried @RayKoopman's method, which used even more memory and crashed my computer:
DD[data_] := Part[data,Sort@Part[Range[Length@data][[#]],
Most@FoldList[Plus,1,Length/@Split@data[[#]]]]]& @ Ordering@data;
DD@Catenate@list
The Question
How can one remove all duplicates from a very long list in a memory-efficient way?
Note: My list consists of only positive integers.