For conditional calculations, however, you will have to start using IF() statements, which can get complicated if you have several conditions you want to check for.
Consider the following Tab-separated data from a spreadsheet covering the month of September 2014:
Bed Wake DoW Condition nap (hrs) Rest (hrs)
9/2/2014 0:45:00 9/2/2014 5:50:00 1 5.08
9/3/2014 0:10:00 9/3/2014 5:50:00 2 5.67
9/4/2014 1:00:00 9/4/2014 5:50:00 3 4.83
9/5/2014 2:00:00 9/5/2014 11:30:00 4 pretty hung over today 9.50
9/6/2014 2:00:00 9/6/2014 9:45:00 5 1.5 9.25
9/7/2014 2:00:00 9/7/2014 9:45:00 6 7.75
9/8/2014 3:00:00 9/8/2014 10:40:00 0 Chuseok day 7.67
9/9/2014 2:00:00 9/9/2014 11:50:00 1 9.83
9/10/2014 0:30:00 9/10/2014 5:50:00 2 came into the office to study; caught a cold 5.33
9/10/2014 23:30:00 9/11/2014 5:50:00 3 cold sore appears on lip 6.33
9/11/2014 23:15:00 9/12/2014 6:20:00 4 7.08
9/13/2014 1:30:00 9/13/2014 11:00:00 5 9.50
9/14/2014 1:30:00 9/14/2014 11:00:00 6 9.50
9/15/2014 1:00:00 9/15/2014 5:50:00 0 common cold has moved to the chest; phlegm comes out 4.83
9/16/2014 0:00:00 9/16/2014 5:00:00 1 woke up 40 minutes early b/c of bad cough, condition is better than it was on Monday 5.00
9/17/2014 0:00:00 9/17/2014 5:50:00 2 5.83
9/18/2014 0:00:00 9/18/2014 6:00:00 3 6.00
9/19/2014 0:00:00 9/19/2014 5:50:00 4 5.83
9/19/2014 23:00:00 9/20/2014 9:30:00 5 10.50
9/21/2014 0:30:00 9/21/2014 10:00:00 6 9.50
9/22/2014 0:00:00 9/22/2014 5:55:00 0 5.92
9/23/2014 0:00:00 9/23/2014 9:00:00 1 9.00
9/24/2014 12:30:00 9/24/2014 17:30:00 2 5.00
9/24/2014 23:40:00 9/25/2014 5:50:00 3 6.17
9/25/2014 23:30:00 9/26/2014 5:50:00 4 6.33
9/27/2014 1:30:00 9/27/2014 10:00:00 5 1 9.50
9/28/2014 0:30:00 9/28/2014 10:00:00 6 9.50
9/29/2014 0:30:00 9/29/2014 5:55:00 0 5.42
9/29/2014 23:40:00 9/30/2014 5:55:00 1 6.25
9/30/2014 23:50:00 10/1/2014 5:55:00 2 6.08
The 3rd field, DoW (Day Of Week), takes values from 0 to 6, with 0 being Monday and 6 being Sunday. Getting an average value for hours slept Mon~Sun is trivial, as I can simply use AVERAGE() on the 6th column which is the field Rest.
But what if I want to find the average number of hours slept on the weekends (when DoW is 5 or 6)? Doing it the spreadsheet way would require an IF() statement checking if the 3rd field, DoW, is either 5 or 6 and then taking the average of the values in the 6th field, Rest, in the case that the IF conditions are satisfied.
Using Linux/UNIX text parsing tools is simpler, in my opinion. First I will copy the above TSV data into a text file named sept2014.txt
I will now print all lines to stdout (or I could redirect output to a file with > filename) satisfying the condition that the 3rd field contains a 5 or a 6.
$ cat sept2014.txt | awk -F'\t' '$3 == "5" || $3 == "6"'
The -F flag above designates the field separator character, which in the case above is TAB denoted by \t (single-quoted for a string literal). The default field separator in awk is non-TAB whitespace, so if fields are separated by spaces, there is no reason to explicitly state the field separator.
$N where N is some natural number, denotes the field number. $3 == "5" simply checks if the 3rd field has the value 5, while $3 checks if the 3rd field has the value 6.
The output of the above one-liner above is:
9/6/2014 2:00:00 9/6/2014 9:45:00 5 1.5 9.25
9/7/2014 2:00:00 9/7/2014 9:45:00 6 7.75
9/13/2014 1:30:00 9/13/2014 11:00:00 5 9.50
9/14/2014 1:30:00 9/14/2014 11:00:00 6 9.50
9/19/2014 23:00:00 9/20/2014 9:30:00 5 10.50
9/21/2014 0:30:00 9/21/2014 10:00:00 6 9.50
9/27/2014 1:30:00 9/27/2014 10:00:00 5 1 9.50
9/28/2014 0:30:00 9/28/2014 10:00:00 6 9.50
As you can see, only rows corresponding to DoW 5 or 6 (Saturday or Sunday) are printed. I can now copy and paste this data into a new sheet in the existing spreadsheet and calculate the average hours of sleep for the weekends. I think that sometimes quick text parsing with Linux/Unix text utilities is much faster than trying to write your own spreadsheet macro or create a multiply-nested spreadsheet formula.
댓글 없음:
댓글 쓰기