Remember the scene in The Matrix where Neo is looking over Cypher's shoulder at monitors filled with ones and zeros? Cypher tells Neo, "All I see now is blonde, brunette, redhead." I think of that scene frequently when I look at log files. Even though I have the full ASCII set in the logs, I can't make much out of a typical log file.
Yet, I'm in the kind of work where I need to be able to read log files. I learned early on how to work around my issues with log files. My tools are awk, grep, and sed. (There are graphical tools for going through log files, and they can be nice. These command line tools are available on almost any system, however.)
Awk is a programming language invented by three guys whose last names start with A, W, and K (Aho, Weinberger, and Kernighan). It's great for working with columns.
Grep stands for Global Regular Expression Print, and lets you filter files on a line by line basis.
Sed, or stream editor, allows you to take data and modify it. If, for instance, you want to replace the word 'fox' in a file with 'cat,' you can do this with sed.
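As a minimal sketch of that substitution, assuming a made-up file named animals.txt (it is not part of the log example), sed's s command with the g flag replaces every 'fox' on each line:

```shell
# Create a small hypothetical input file for illustration.
printf 'The quick brown fox\nA fox and another fox\n' > animals.txt

# s/old/new/g replaces every 'fox' on a line; without the g flag,
# only the first occurrence on each line would change.
replaced=$(sed 's/fox/cat/g' animals.txt)
echo "$replaced"
```

Note that this is a plain text match, so a word like 'foxes' would become 'cates' as well.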
To demonstrate, I made a fake log file to play with, saved as sample.log.
E Mon Apr 7 12:41:22.123 cherry 5.6
W Mon Apr 7 10:00:00.000 date 3.25
I Mon Apr 7 01:01:00.000 eggplant 4.12
W Sun Apr 6 12:51.20.100 cherry 5
I Sat Apr 5 10:00:00.123 cherry 4.5
E Fri Apr 4 9:00:00.567 cherry 5
For an example, let's say you want to know the average of the value in the last column (call it 'thing') when the service is 'cherry' and the message type is 'E' for error.
Grep works line by line, so the first thing I like to do is run the file through grep. In this case, I'll pull out only the lines that contain cherry as a whole word. The \< and \> anchor the match to the start and end of a word.
cat sample.log | grep "\<cherry\>"
E Mon Apr 7 12:41:22.123 cherry 5.6
W Sun Apr 6 12:51.20.100 cherry 5
I Sat Apr 5 10:00:00.123 cherry 4.5
E Fri Apr 4 9:00:00.567 cherry 5
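To see why matching cherry as a whole word matters, here is a small sketch using grep -w, which is another way to ask for whole-word matches. The cherrypie line is a made-up entry added only for contrast; it is not in the sample log.

```shell
# Two hypothetical log lines: one for cherry, one for a made-up cherrypie service.
printf 'E Mon Apr 7 12:41:22.123 cherry 5.6\nI Mon Apr 7 12:45:00.000 cherrypie 2.0\n' > words.log

# A plain substring match counts both lines.
plain_count=$(grep -c "cherry" words.log)

# -w only counts lines where cherry stands alone as a word.
word_count=$(grep -cw "cherry" words.log)

echo "substring matches: $plain_count, whole-word matches: $word_count"
```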
Next, I only want lines that indicate error messages. Off the top of my head, I can think of two ways to do this, but with Linux, there are probably another dozen ways that I'm not thinking of.
grep -w "E" --> Matches the letter E as a whole word anywhere on the line.
awk '$1 == "E"' --> Matches only lines where the 1st column is exactly E.
The awk method is more precise because it is possible that a single E surrounded by white space exists elsewhere in the log file.
E Mon Apr 7 12:41:22.123 cherry 5.6
E Fri Apr 4 9:00:00.567 cherry 5
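To make the difference between the two approaches concrete, here is a small sketch. The second line is invented for illustration: an informational message from a hypothetical service that happens to be named E.

```shell
# One real-looking error line, plus a made-up line with a lone E in column 6.
printf 'E Mon Apr 7 12:41:22.123 cherry 5.6\nI Mon Apr 7 02:00:00.000 E 1.0\n' > stray.log

# grep -w finds E as a word anywhere on the line, so it counts both lines.
grep_hits=$(grep -cw "E" stray.log)

# awk checks only the first column, so it counts just the real error line.
awk_hits=$(awk '$1 == "E" { n++ } END { print n + 0 }' stray.log)

echo "grep -w: $grep_hits, awk: $awk_hits"
```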
At this point, I've narrowed the log file to just the lines with cherry errors. The next step is to take the average value of the Error cherries. (Please note that the three consecutive awk statements can be written as one awk statement.)
cat sample.log | awk '$1 == "E"' | awk '$6 == "cherry"' | awk '{ total += $7 } END { print total/NR}'
5.3
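As a sketch of what that single-statement version might look like: the two conditions are joined with && and the script keeps its own counter of matching lines. The counter name n is just my choice for illustration, and the here-document only recreates the sample log so the snippet runs on its own.

```shell
# Recreate the sample log so this snippet is self-contained.
cat > sample.log <<'EOF'
E Mon Apr 7 12:41:22.123 cherry 5.6
W Mon Apr 7 10:00:00.000 date 3.25
I Mon Apr 7 01:01:00.000 eggplant 4.12
W Sun Apr 6 12:51.20.100 cherry 5
I Sat Apr 5 10:00:00.123 cherry 4.5
E Fri Apr 4 9:00:00.567 cherry 5
EOF

# One awk program: filter on columns 1 and 6, sum column 7, count matches in n.
average=$(awk '$1 == "E" && $6 == "cherry" { total += $7; n++ }
               END { if (n > 0) print total / n }' sample.log)
echo "$average"
```

The if (n > 0) guard simply avoids a divide-by-zero error when nothing matches.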
Let's break the awk averaging command down.
awk '{ total += $7 } END { print total/NR}'
The basic format of an awk command is:
# Basic awk format
awk 'BEGIN { actions taken before the first record is read}
/pattern/ { actions for each row}
END { actions after the last record is read} '
Since the keyword BEGIN isn't in my awk command, nothing is initialized explicitly. That's fine, because awk treats an unset variable as zero the first time it's used in arithmetic, so the results would be the same if the total variable were set to zero in a BEGIN action.
awk 'BEGIN {total = 0} { total += $7 } END { print total/NR}'
The main set of braces, { total += $7 }, has no pattern in front of it, so it runs for every record and adds the value of the 7th column to the variable total.
The braces after the END keyword indicate what should be done after all the records have been read. In this case, it prints the total of the 7th column divided by NR, the number of records read.
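One thing to watch: NR counts every record awk reads, not just the ones a pattern matched. Dividing by NR works in my pipeline only because the earlier steps already threw the other lines away. Here is a sketch of what happens if the filter moves into the same awk program and NR is left in place, next to a version that keeps its own counter (the name n is just illustrative):

```shell
# Recreate the sample log so this snippet is self-contained.
cat > sample.log <<'EOF'
E Mon Apr 7 12:41:22.123 cherry 5.6
W Mon Apr 7 10:00:00.000 date 3.25
I Mon Apr 7 01:01:00.000 eggplant 4.12
W Sun Apr 6 12:51.20.100 cherry 5
I Sat Apr 5 10:00:00.123 cherry 4.5
E Fri Apr 4 9:00:00.567 cherry 5
EOF

# Divides by NR: all 6 lines are counted, so the average comes out too low.
with_nr=$(awk '$1 == "E" && $6 == "cherry" { total += $7 }
               END { print total / NR }' sample.log)

# Divides by n: only the 2 matching lines are counted.
with_count=$(awk '$1 == "E" && $6 == "cherry" { total += $7; n++ }
                  END { if (n > 0) print total / n }' sample.log)

echo "dividing by NR: $with_nr"
echo "dividing by n:  $with_count"
```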
cat sample.log | grep "\<cherry\>" | awk '$1 == "E"' | awk 'BEGIN {total = 0 } { total += $7} {print $0} END {print "\nAverage price of Error cherries = " total/NR}'
In the above statement, I modified the command so that it still prints each row that meets the cherry error criteria and then, after a blank line, prints the average with a label in front.
E Mon Apr 7 12:41:22.123 cherry 5.6
E Fri Apr 4 9:00:00.567 cherry 5

Average price of Error cherries = 5.3
Awk can be used simply as a grep substitute. Awk can also be used as its own little programming language, including loops.
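As a small taste of that, here is a sketch that averages the last column for each service, using associative arrays and a for loop in the END block. The array names sum and count are just illustrative, and the output goes through sort because awk makes no promise about the order in which it loops over array keys.

```shell
# Recreate the sample log so this snippet is self-contained.
cat > sample.log <<'EOF'
E Mon Apr 7 12:41:22.123 cherry 5.6
W Mon Apr 7 10:00:00.000 date 3.25
I Mon Apr 7 01:01:00.000 eggplant 4.12
W Sun Apr 6 12:51.20.100 cherry 5
I Sat Apr 5 10:00:00.123 cherry 4.5
E Fri Apr 4 9:00:00.567 cherry 5
EOF

# Column 6 is the service name, column 7 the value. Accumulate per service,
# then loop over the services once all records have been read.
per_service=$(awk '{ sum[$6] += $7; count[$6]++ }
                   END { for (svc in sum) print svc, sum[svc] / count[svc] }' sample.log | sort)
echo "$per_service"
```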