I want to write a data parser script. The example data is:
name: John Doe
description: AM
email: john@doe.cc
lastLogon: 999999999999999
status: active
name: Jane Doe
description: HR
email: jane@doe.cc
lastLogon: 8888888888
status: active
...
name: Foo Bar
description: XX
email: foo@bar.cc
status: inactive
The key-value pairs are always in the same order (name, description, email, lastLogon, status), but some of the fields may be missing. It is also not guaranteed that the first record is complete.
The expected output is delimiter-separated values, e.g. CSV:
John Doe,AM,john@doe.cc,999999999999999,active
Jane Doe,HR,jane@doe.cc,8888888888,active
...
Foo Bar,XX,foo@bar.cc,n/a,inactive
My current solution uses a while read loop. The main part of my script:
while IFS= read -r line; do
    # a new name: line starts a new record, so reset status
    grep -q '^name:' <<< "$line" && status=''
    case "${line,,}" in
        name*)   name=${line#*: } ;;
        desc*)   description=${line#*: } ;;
        email*)  email=${line#*: } ;;
        last*)   lastLogon=${line#*: } ;;
        status*) status=${line#*: } ;;
    esac
    # status is the last key of every record, so once it is set
    # the record is complete and can be printed
    if test -n "$status"; then
        printf '%s,%s,%s,%s,%s\n' "${name:-n/a}" "${description:-n/a}" \
            "${email:-n/a}" "${lastLogon:-n/a}" "$status"
        unset name description email lastLogon status
    fi
done < input.txt
This works, but it is obviously very slow. The execution time for 703 lines of data:
real 0m37.195s
user 0m2.844s
sys 0m22.984s
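Almost all of that is sys time, which I suspect comes from forking a grep process for every input line (just a guess on my part, not something I have measured). A bash pattern match would do the same test without a fork:

[[ $line == name:* ]] && status=''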
I'm thinking about an awk approach, but I'm not experienced enough with it.
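From what I have read so far, I imagine something like the following rough, untested sketch: buffer each record in an array, flush it whenever a new name: line appears, and fill missing fields with n/a. The field list is taken from my data above; it assumes values never contain ": ".

awk -F': ' '
BEGIN {
    OFS = ","
    n = split("name description email lastLogon status", keys, " ")
    for (i = 1; i <= n; i++) want[keys[i]] = 1
}
# print the buffered record, substituting n/a for missing fields
function flush(    i, out) {
    if (!seen) return
    out = (keys[1] in rec) ? rec[keys[1]] : "n/a"
    for (i = 2; i <= n; i++)
        out = out OFS ((keys[i] in rec) ? rec[keys[i]] : "n/a")
    print out
    split("", rec)              # portable way to clear an array
    seen = 0
}
$1 == "name" { flush() }        # a name: line starts a new record
($1 in want) { rec[$1] = $2; seen = 1 }
END { flush() }                 # do not forget the last record
' input.txt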