
I have a very large JSON file. It contains 27,000 records.

A record looks like this:

    {
      adlibJSON: {
        recordList: {
          record: [
            {
              @attributes: {
                priref: "4372",
                created: "2011-12-09T23:09:57",
                modification: "2012-08-11T17:07:51",
                selected: "False"
              },
              acquisition.date: [
                "1954"
              ],
              acquisition.method: [
                "bruikleen"
              ],
              association.person: [
                "Backer, Bregitta"
              ],
              association.subject: [
                "heraldiek"
              ],
              collection: [
                "Backer, collectie"
              ], ... ...

The problem is that this is not valid JSON: the quotes around the names are missing.

For example, acquisition.date should be "acquisition.date".

I need to edit this big JSON file and add all the quotation marks; otherwise the file doesn't parse with, for example, D3.js.

What is the best way to repair this JSON file?

  • Which OS are you using? – Dennis Jan 22 '13 at 20:52
  • To the migrate voters: I don't think this is off topic. This can be easily fixed by a sed command or something similar. And it's certainly not a programming question, so it's probably off topic for Stack Overflow. – Dennis Jan 22 '13 at 20:54

2 Answers


This is my solution:

I'd use a decent text editor with regex find-and-replace capability (e.g., Visual Studio or UltraEdit).

Then find:

^\s*(\w+\.\w+)\s*:

and for the names with two (or more) dots:

^\s*(\w+(?:\.\w+)+)\s*:

and replace with

"$1":

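Or you could use PowerShell:

    # Read the file as an array of lines, so ^ matches at the start of each line
    $allText = gc yourfile.txt
    # -replace only returns the result, so save it (fixed.txt is a placeholder name)
    $allText -replace '^\s*(\w+\.\w+)\s*:', '"$1":' | Set-Content fixed.txt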
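
The same find-and-replace idea can also be sketched in Python, in case no regex-capable editor is at hand. Treat this as a minimal sketch rather than a definitive fix. The pattern is my own assumption and is broader than the ones above: it also accepts names without a dot (such as collection) and names starting with @ (such as @attributes), and it assumes every name starts its own line, as in the question's sample. The names UNQUOTED_NAME and quote_names and the shortened sample record are made up for illustration.

```python
import json
import re

# Assumed pattern: an unquoted name at the start of a line, optionally
# indented, made of word characters and dots, optionally starting with "@".
# Lines whose name is already quoted start with '"' and therefore do not match.
UNQUOTED_NAME = re.compile(r'^([ \t]*)([@\w][\w.]*)[ \t]*:', re.MULTILINE)


def quote_names(text: str) -> str:
    """Wrap every unquoted name at the start of a line in double quotes."""
    # \1 keeps the original indentation, \2 is the bare name.
    return UNQUOTED_NAME.sub(r'\1"\2":', text)


# Shortened, closed-off version of the record from the question.
sample = '''{
adlibJSON: {
recordList: {
record: [
{
@attributes: {
priref: "4372",
created: "2011-12-09T23:09:57"
},
acquisition.date: [
"1954"
]
}
]
}
}
}'''

fixed = quote_names(sample)

# json.loads raises json.JSONDecodeError if the result is still not valid JSON,
# so this doubles as a check that the repair worked.
data = json.loads(fixed)
```

For the real file you would read the text with open(), pass it through quote_names, and write the result to a new file so the original stays untouched. Running the result through json.loads first is a cheap way to find out whether anything else in the file is broken. If a string value ever starts a line with a bare word followed by a colon, this pattern would wrongly quote it too, so check the output.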

You could use Hjson. If you only need it for one file, use the online interface at https://hjson.github.io/try.html (I've used it with 100,000+ lines of JSON without problems); otherwise, there are libraries for multiple programming languages and a CLI at https://hjson.github.io/.

CennoxX