I have a dataset which I want to split based on the first column.
The data looks like this:
AA =
{{"Symbol", "Full Name", "Date", "Open", "High", "Low", "Close",
"Adj Close", "Volume", "Type"},
{"KO", "The Coca-Cola Company", {2013, 7, 19, 0, 0, 0.}, 40.88, 41.1,
40.79, 41.09, 41.09, 1.14011*10^7, "Stock"},
{"KO", "The Coca-Cola Company", {2013, 7, 18, 0, 0, 0.}, 40.86,
41.07, 40.74, 40.81, 40.81, 9.7088*10^6, "Stock"},
{"KO", "The Coca-Cola Company", {2013, 7, 17, 0, 0, 0.}, 40.55,
40.98, 40.31, 40.84, 40.84, 1.85131*10^7, "Stock"},
{"KO", "The Coca-Cola Company", {2013, 7, 16, 0, 0, 0.}, 39.78, 40.5,
39.5, 40.23, 40.23, 3.35756*10^7, "Stock"},
{"KO", "The Coca-Cola Company", {2013, 7, 15, 0, 0, 0.}, 41.05,
41.25, 40.93, 41.01, 41.01, 1.14184*10^7, "Stock"},
{"KO", "The Coca-Cola Company", {2013, 7, 12, 0, 0, 0.}, 41.03,
41.13, 40.73, 41.03, 41.03, 1.06864*10^7, "Stock"},
{"MCD", "McDonald", {2013, 7, 19, 0, 0, 0.}, 100.2, 100.41, 99.53,
100.27, 100.27, 4.5083*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 18, 0, 0, 0.}, 100.48, 100.77, 99.99,
100.18, 100.18, 3.4016*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 17, 0, 0, 0.}, 100.05, 100.35, 99.3,
100.1, 100.1, 5.3774*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 16, 0, 0, 0.}, 100.18, 101.12, 99.47,
100.88, 100.88, 4.4062*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 15, 0, 0, 0.}, 101.6, 101.73, 100.7,
100.75, 100.75, 4.4807*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 12, 0, 0, 0.}, 100.59, 101.81, 100.5,
101.58, 101.58, 4.7668*10^6, "Stock"},
{"MCD", "McDonald", {2013, 7, 11, 0, 0, 0.}, 100.75, 100.96, 99.76,
100.79, 100.79, 4.0641*10^6, "Stock"}}
I've been able to do it:
Split[AA, First[#1] === First[#2] &]
But I do not understand the theory behind it. This is how far I got
Split[list, test] treats pairs of adjacent elements as identical whenever applying the function test to them yields True.
From this I get that Split is a function where AA is my list, and that it splits the data into groups when the test yields true.
But I get confused here.
First[#1] === First[#2] &
First[{a, b, c}]
a
Based on this I get that it picks the first element of the list within the list, in my case "KO". This means that when First[#1] equals First[#2], it yields True. However, I don't understand the function of #.
How does this predicate function work? Does it take
Row 1 column 1 = Row 2 column 1 --> False
Row 2 column 1 = Row 3 column 1 --> True
Row 3 column 1 = Row 4 column 1 --> True
...
and then group all the True and all the False together and create separate lists?
I'm just not sure what #1 and #2 do in the list, and why & needs to be added at the end. This is the same function as above:
Split[AA, First[#1] === First[#2] &]
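For reference, here is a minimal sketch of what the slots stand for. It assumes the standard reading of #1 and #2 as the first and second arguments of a pure function, with & marking where the function body ends. The names sameSymbol1, sameSymbol2, sameSymbol3 and the shortened rows are hypothetical and only for illustration.

```mathematica
(* Three equivalent ways to write the same two-argument test. *)
sameSymbol1 = First[#1] === First[#2] &;                (* slot shorthand *)
sameSymbol2 = Function[First[#1] === First[#2]];        (* same, without & *)
sameSymbol3 = Function[{x, y}, First[x] === First[y]];  (* named arguments *)

(* Hypothetical rows, shortened from the data above. *)
row1 = {"KO", 40.88};
row2 = {"KO", 40.86};
row3 = {"MCD", 100.2};

sameSymbol1[row1, row2]  (* True: both rows start with "KO" *)
sameSymbol3[row1, row3]  (* False: "KO" versus "MCD" *)
```

Without the trailing &, First[#1] === First[#2] would be evaluated immediately as an ordinary expression instead of being passed to Split as a function.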
# is the Slot. You may want to take a look at Slot and PureFunction in the documentation. Also, this part of common pitfalls awaiting new users is going to help you. Next time please check this in order to know how to format your question. – Kuba Jul 22 '13 at 06:28
The False results given by the test are not grouped at the end; they mark the places where the list is split. Notice that the test is applied n-1 times, where n = Length@list. For example, Split[{1,1,2,2}, #1==#2&] gives the test sequence {True, False, True}, so you can see that it is not going to "group" all the sequences of False and True. Also, notice that === is not Equal; if you are new to Mathematica you will need to spend some time with the documentation. We are certainly going to help you, but with more complicated problems. – Kuba Jul 22 '13 at 07:03
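The adjacent-pair behaviour described in the comment can be sketched as follows. This is only an illustration of the idea, not how Split is implemented internally; the variable names list, tests, and rows are hypothetical, and rows is a shortened stand-in for the data in the question.

```mathematica
list = {1, 1, 2, 2};

(* Apply the test to each adjacent pair: n - 1 tests for n elements. *)
tests = MapThread[#1 == #2 &, {Most[list], Rest[list]}]
(* {True, False, True} *)

(* Split starts a new sublist wherever the test gave False. *)
Split[list, #1 == #2 &]
(* {{1, 1}, {2, 2}} *)

(* The same idea applied to rows compared by their first column. *)
rows = {{"KO", 1}, {"KO", 2}, {"MCD", 3}};
Split[rows, First[#1] === First[#2] &]
(* {{{"KO", 1}, {"KO", 2}}, {{"MCD", 3}}} *)
```

By the same logic, the header row of AA ends up in a sublist of its own, because "Symbol" differs from "KO".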