The awk tool is excellent for processing text files and operating on lines, or sets of lines.

Conditional Statements

Here, we print a message when any of fields 3, 4 or 5 is missing.

awk '{
	if ($3 == "" || $4 == "" || $5 == "")
		print "Some field is missing";
}' ./myfile.txt
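
With the default whitespace separator, an empty trailing field and a short record are the same thing, so the check can also be written against NF. A minimal sketch, assuming a complete record has exactly five fields; the check_fields wrapper and the sample rows are made up for illustration:

```shell
# Hypothetical helper: report records that have fewer than five fields.
check_fields() {
	awk 'NF < 5 { print "Line " NR ": only " NF " fields" }' "$@"
}

# Invented sample input: the second record is short.
printf 'a 10 x y 100\nb 20 x\nc 30 x y 300\n' | check_fields
```

This reports the line number too, which helps when the file is large.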

Compare numeric fields and print Pass or Fail accordingly.

awk '{
	if ($2 >=50 && $5 >= 100)
		print $0,"=>","Pass";
	else
		print $0,"=>","Fail";
}' ./myfile.txt
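
The thresholds do not have to be hard-coded. A sketch of the same test with the limits passed in through -v, assuming the same field layout; grade_rows, min2 and min5 are hypothetical names and the sample rows are invented:

```shell
# Hypothetical helper: same Pass/Fail test, thresholds supplied with -v.
grade_rows() {
	awk -v min2=50 -v min5=100 '{
		# Both fields must reach their minimum for a Pass.
		print $0, "=>", (($2 >= min2 && $5 >= min5) ? "Pass" : "Fail")
	}' "$@"
}

# Invented sample input: first row passes, second fails on field 2.
printf 'a 60 x y 120\nb 40 x y 150\n' | grade_rows
```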

Even more maths as part of the conditions: average two fields and map the result to a rating.

awk '{
	sum=$2+$5;

	avg=sum/2;

	if ( avg >= 90 ) res="Great Client";
	else if ( avg >= 80) res="Good Client";
	else if (avg >= 70) res="OK Client";
	else res="LMIM";

	print $0 , " is " , res;

}' ./myfile.txt

Awk also supports the ternary operator (?:). Here it sets the output record separator so that every three input lines are joined into one comma-separated line.

awk 'ORS=NR%3 ? "," : "\n"' ./myfile.txt
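
To see what the ternary does, here is the same one-liner fed six short lines. The join3 wrapper is a hypothetical name used only for this illustration:

```shell
# Hypothetical wrapper around the one-liner above.
# ORS becomes "," on lines 1 and 2 of each group and "\n" on line 3.
join3() {
	awk 'ORS = NR % 3 ? "," : "\n"' "$@"
}

# Six input lines come out as two lines: a,b,c and d,e,f
printf 'a\nb\nc\nd\ne\nf\n' | join3
```

If the line count is not a multiple of three, the last output line ends with a comma and no newline.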

Find Longest Lines

This snippet prints the length and line number of every line, pipes that to a reverse numeric sort and shows the ten longest lines. Takes almost 0.4s to process an 80MiB file.

~ $ awk '{ print length, NR }' catalog.sql | sort -nr | head

This modification shows the same ten lines sorted by line number, highest first: an extra awk swaps the two columns, then sort runs again.

awk '{ print length, NR }' catalog.sql | sort -nr | head | awk '{ print $2, $1 }' | sort -nr
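
When only the single longest line matters, the sort can be skipped by tracking the maximum inside awk. A minimal sketch, not a replacement for the top-ten pipeline; longest_line is a hypothetical name and the input is invented:

```shell
# Hypothetical helper: print "length line_number" for the longest line.
# The strict > means the first line wins when several share the maximum.
longest_line() {
	awk 'length($0) > max { max = length($0); line = NR }
	     END { print max + 0, line + 0 }' "$@"
}

# Invented sample input: the second line is the longest, at 6 characters.
printf 'ab\nabcdef\nabc\n' | longest_line
```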

Code Line Counting

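find . -type f \( -name '*.c' -o -name '*.js' -o -name '*.rb' \) -exec cat {} + \
   | awk '/^[ \t]*#/ { hash_tick++; next }
       /^[ \t]*\/\// { line_tick++; next }
       /^[ \t]*$/ { void_tick++; next }
       /\/\*/ { wide_flag = 1 }
       # The line that only closes a range comment still counts as comment.
       wide_flag && /\*\// && !/\/\*/ { wide_flag = 0; wide_tick++; next }
       /\*\// { wide_flag = 0 }
       wide_flag { wide_tick++; next }
       /./ { code_tick++; next }
       END {
           # Adding 0 prints a zero instead of an empty string for unused counters.
           print "Hash Comment: " (hash_tick + 0)
           print "Line Comment: " (line_tick + 0)
           print "Range Comment: " (wide_tick + 0)
           print "Blank Lines: " (void_tick + 0)
           print "Code Lines: " (code_tick + 0)
           print "Total Lines: " NR
       }'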

Which will spew something like:

Hash Comment: 4
Line Comment: 5122
Range Comment: 2798
Blank Lines: 11686
Code Lines: 82145
Total Lines: 101755

Number of Changes

Comparing two directories for lines added/removed

diff -ruw /code.old /code.new \
     | awk '/^(---|\+\+\+) / { next }   # file headers are not changes
            /^-/ { out++ }; /^\+/ { new++ }
            END { print "Removed: " (out + 0); print "Added: " (new + 0) }'

Outputs something like:

diff: /code.old/http/scripts/states.js: No such file or directory
diff: /code.old/lib/barcode.php: No such file or directory
diff: /code.old/lib/inline.php: No such file or directory
Removed: 26131
Added: 19144
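
The same idea can be broken down per file by remembering the name from each +++ header. A sketch under a few assumptions: unified diff input, file names without spaces, and no changed line whose content itself begins with "-- " or "++ " (such a line would be mistaken for a header). count_changes and the sample diff are hypothetical:

```shell
# Hypothetical helper: per-file added/removed counts from a unified diff.
count_changes() {
	awk '
		/^\+\+\+ / { file = $2; next }   # new-file header names the file
		/^--- /    { next }              # old-file header, not a removal
		/^\+/      { added[file]++ }
		/^-/       { removed[file]++ }
		END {
			for (f in added)   seen[f] = 1
			for (f in removed) seen[f] = 1
			for (f in seen)
				printf "%s: +%d -%d\n", f, added[f], removed[f]
		}' "$@"
}

# Invented sample diff: one line added, two removed.
sample_diff='--- /code.old/a.txt
+++ /code.new/a.txt
@@ -1,4 +1,3 @@
 keep
-old one
-old two
+new one
 keep'
printf '%s\n' "$sample_diff" | count_changes
```

The order of a for-in loop is not defined in awk, so pipe the result through sort when several files are involved.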

See Also