A few tricks are required, but it is possible to create a database in LibreOffice and read the data via DatabaseLink in MMA. Here is a summary of worked for me with MMA 11.0, LibreOffice 5.0 and GNU/Linux.
Database Creation
The trick in creating the database is to use all capital letters for table names and column names. When I used lower case names, MMA could see the names, but I could not access the data with the SQL command I was using. Later, I will describe how to access data in tables with mixed case identifiers. However, if you can use upper case from the beginning, everything will be easier.
From the LibreOffice menu, I selected Insert->Table Design... to create a new database with one table. The table had 3 columns for integer, varchar and float data. I entered 3 rows of data, saved the table, and quit LibreOffice. It is probably a good idea to quit LibreOffice entirely.
The database filename is "db_137967.odb". I tried to open it in MMA, but could not get that to work. All MMA wanted to do was to create a new database, using "db_137967.odb" as the database name.
Database Extraction
The next trick is to extract the database files from the .odb file, which is simply a zip archive. Working at the command line, I created a scratch directory, moved db_137967.odb to the directory, and extracted the database files using the GNU unzip utility. No options required, just enter unzip db_137967.odb at the prompt to extract everything. Or if you want unzip db_137967.odb database/* will extract only the good stuff.
Unzipping the archive creates several files and directories. The five important files are in a directory named "database". I assumed all five of these files are required. I renamed (mv command) "log" to db_137697.log, renamed "data" to db_137967.data, and likewise for "properties", "backup" and "script". I moved these 5 files to the directory with my MMA notebooks. I noticed that unzipping the archive does not remove the "db_137967.odb" file, so LibreOffice can still read it.
I tried to unzip the .odb archive in MMA using Import[], but made no progress. It would be nice to have a MMA function that did this extraction.
It should be noted that it may be possible to create a conflict between the database name and the notebook name. My notebook name is "mma_137967.nb" and when I used the same name for the "log", "data", "properties", etc file, I had problems. So, I changed the database filenames to "da_137967", and the problems disappeared. Once it started working with different names, I did not try to re-create conflict.
Database Connection
Finally, we get to the MMA part. We have quit LibreOffice, we have extracted and renamed the 5 database files, and we are ready to read the data into MMA. Here's the MMA code to make the connection:
Needs["DatabaseLink`"]
conn = OpenSQLConnection[
JDBC["HSQL(Standalone)",
ToFileName[{NotebookDirectory[]}, "db_137967"]
]
];
info = Transpose@SQLConnectionInformation[conn];
Dimensions@info
(* {170, 2} *)
A few things to note in the above code: first, don't forget the Needs[] statement. Second, we do not really give a filename, just the "db_137967" part. MMA will open the files it needs. In this example all 5 of the files were in my NotebookDirectory[]. After opening the connection, I read the info, made sure that I got 170 rows and 2 columns. Note the use of Transpose in the above.
Later, I used the Dataset command to obtain a scrollable view of the info. Of the 170 entries in info, the ones that relate to "identifiers" appear to be especially relevant. They can be extracted with
TableForm@Select[info,
StringContainsQ[#[[1]], ___ ~~ "Identifier" ~~ ___] &]
Dataset@info;
(**
IdentifierQuoteString "
StoresLowerCaseIdentifiers False
StoresLowerCaseQuotedIdentifiers False
StoresMixedCaseIdentifiers False
StoresMixedCaseQuotedIdentifiers False
StoresUpperCaseIdentifiers True
StoresUpperCaseQuotedIdentifiers False
SupportsMixedCaseIdentifiers False
SupportsMixedCaseQuotedIdentifiers True
**)
From the above, one might suspect the database stores only the uppercase table names and column names. Likewise for names of views and of other database structures. However, these values also hint that mixed case names could be used, if they are quoted with the double quote character. As it turns out, lower case and mixed case identifiers are allowed, as we shall see.
Database Table Access
Here are the MMA commands that accessed the table data. The first command lists the table names.
SQLTableNames[conn]//ColumnForm
SQLColumnNames[conn,"PROJECTS"]//TableForm
SQLExecute[conn,"SELECT COUNT(*) FROM PROJECTS"]//TableForm
SQLExecute[conn,"SELECT TOP 10 * FROM PROJECTS ORDER BY NAME"]//TableForm
(**
PROJECTS
PROJECTS ID
PROJECTS NAME
PROJECTS BUDGET
3
2 Cold Fusion 3.14159
3 Negative Gravity -1.12358*10^7
1 Transmutation 1234.57
**)
The second command lists the table name / column name pairs. The third command gives a count of the number of entries in the PROJECTS table. The fourth command displays up to 10 of the data entries in the table in alphabetical order by project name.
Mixed Case Identifiers
I went back to LibreOffice and added some tables that had lower case and mixed case table names and column names. I extracted the 5 database files from the .odb file, as described above. Then, in MMA, I created a connection, exactly as above, and got the following results:
TableForm@SQLColumnNames[conn, "TestResults"]
quote = Last@FirstCase[info, {"IdentifierQuoteString", _}];
sqlcmd = StringJoin[{"SELECT COUNT(*) FROM ", quote,
"TestResults", quote}]
TableForm@SQLExecute[conn, sqlcmd]
sqlcmd = StringJoin[{"SELECT TOP 10 * FROM ",
quote, "TestResults", quote,
" ORDER BY ", quote, "ColumnNo2", quote}]
TableForm@SQLExecute[conn, sqlcmd]
(**
TestResults ColumnNo1
TestResults ColumnNo2
TestResults ColumnNo3
2
123 data-123 123.4
987 data-987 987.6
**)
The first SQLColumnNames[] lists the 3 table name / column name pairs. The table name for this query is "TestResults" and MMA recognizes the mixed case identifier. To access the data, we must put the mixed case identifiers in quotes.
The second command selects the appropriate quote character from the info variable, which is computed in the first section of code above.
The third command is a StringJoin[] that assembles an SQL command with quote characters on the table name. A simpler way to do this would be to embed \" in the string, but I didn't think of that until later. The fourth command in another SQLExecute[], with the result that there are 2 rows of data in the table.
The fifth command is another StringJoin[] that creates a second SQL command to access the data. In this second SQL the quotes are on both the table name and the column name. The column name is "ColumnNo2". The final SQLExecute[] executes the SQL command, which produces two lines of data from the table.
This same approach worked on a database table that had all lower case table names and column names. Because we must put that quote character on our mixed case identifiers, this method is a little messy. The simplest thing that works is to use all upper case identifiers, if possible.
Database Cleanup
When I closed the connection, I took a look at the file names that were left. I noticed MMA left behind one "tmp" directory and an "lck" file. Here's the MMA code that closed the database connection and showed me the new file and the new directory:
CloseSQLConnection[conn];
ColumnForm[FileNameTake[#, -1] & /@
FileNames["db_137967*", NotebookDirectory[]]
]
(*
db_137967.backup,
db_137967.data,
db_137967.lck,
db_137967.log,
db_137967.properties,
db_137967.script,
db_137967.tmp *)
So, that is about it. Some of it is a bit messy, but a shell script could greatly simplify the database extraction process. It would be nice to know how well this approach works with large data sets. Does the need to unzip the archive become an impediment?
SQLExecute[conn, "show tables"]and it gives$Failed(JDBC: unexpected token: show). But the connection is open. I just saw that it indeed seems to open the database because there are several temp files that appear with the extensionlck,log,properties,script... – halirutan Feb 23 '17 at 04:39"org.hsqldb.jdbcDriver"and the protocol"jdbc:hsqldb:file:"seems to match as well. Still, I don't get any tables out of my database. – halirutan Mar 02 '17 at 05:24