Instead of using the pythonic pandas way to produce such thing from two .csv files I am trying to see if I can do this with awk.
What I have is the following:
file1.csv | file2.csv
name,value_a,value_b | name,value_a,value_b
A,2,3 | A,3,5
B,1,5 | B,5,7
C,5,1 | C,9,4
D,9,2 | D,10,20
The column $1 is the same in both .csv files. But the columns $2 and $3 are different. What I wish to do is to produce a new file that has the same column $1 but for the columns $2 and $3, namely value_a and value_b it has the difference of those two.
The column value_a should have the difference produced when subtracting the second column of file1.csv from file2.csv where value_b should have the difference produced when subtracting the third column of file1.csv from file2.csv
So, the result should look like this.
diff.csv
name,value_a,value_b
A,1,2
B,4,2
C,4,3
D,1,18
Can this be done by awk ? Or should I stick with the python3 way ?
Thanks in advance
My attempt:
awk -F, '(FNR==NR){hl[$1]=$2;lh[$1]=$3;next}
{hl[$1]=$2-hl[$1];lh[$1]=$3-lh[$1];next} #update the arrays with the differences
END{ for (i in hl){print i,":",hl[i],lh[i]}}' file_1.csv file_2.csv #print them
Is there a better way to do it?
nextstatement on my code. Is there a better way to do it? – ex1led Mar 09 '21 at 13:54nextis not necessary, but the first one is (see https://stackoverflow.com/q/32481877). If you set theOFS=,, your answer will be perfect. – Quasímodo Mar 09 '21 at 14:00