I have some code that processes very large flat files into separate HSQL(Standalone) databases, one per file, using ParallelSubmit and JDBC with the strategy in this post (19542). Lately I keep getting the Java error GC overhead limit exceeded. I found this post (2391189), whose comments suggest that the issue could be caused by more than one program running in the same Java VM.
I tried running a set of two files in series and both completed, whereas the ParallelSubmit approach fails for the same files. I would therefore like to run a separate Java VM on each subkernel so that this issue is avoided.
I am already running ReinstallJava[JVMArguments -> "-Xms500m -Xmx6g"]; in the front end, so I believe the problem is the interaction between the parallel jobs and not a lack of memory.
How do I launch a separate Java VM on each subkernel?
In terms of 19542, this would mean creating a Java VM at the start of loadBigFile and dropping it at the end of the function. Unless, of course, there is a better way to do this.
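A minimal sketch of what such a per-subkernel setup might look like, for concreteness. It assumes that J/Link is available on every subkernel and that calling ReinstallJava inside ParallelEvaluate restarts the Java runtime attached to that subkernel rather than the one attached to the main kernel; the heap sizes are illustrative values only, not recommendations.

```mathematica
(* Sketch only: heap sizes are illustrative assumptions *)
Needs["JLink`"];   (* load on the main kernel so the J/Link symbols resolve *)
LaunchKernels[];

(* On each subkernel, restart Java with its own heap limits *)
ParallelEvaluate[
  Needs["JLink`"];
  JLink`ReinstallJava[JLink`JVMArguments -> "-Xms500m -Xmx2g"];
  Needs["DatabaseLink`"];
];

(* ... submit the per-file jobs with ParallelSubmit as before ... *)

(* When all jobs have finished, shut down the Java runtime on each subkernel *)
ParallelEvaluate[JLink`UninstallJava[];];
```

If each subkernel already starts its own Java runtime by default, the ReinstallJava call issued on the main kernel would not have raised the heap limit on the subkernels, which is one thing a setup like this would test.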
CloseSQLConnection, SQLResultSetClose, etc). – WReach Jan 10 '18 at 00:38
SQLInsert on separate HSQL(Standalone) databases. The function creates the database for its file, and connections are opened and closed correctly. Each file takes between 15 and 30 minutes to read into the database, with tens of millions of rows. It has been running fine until recently. – Edmund Jan 10 '18 at 01:16