Let’s start using it.

AWK is a powerful text-parsing tool for Unix and Unix-like systems, but it is also considered a programming language because it has programmable functions that you can use for routine parsing tasks. You probably won’t use AWK to develop your next GUI application, and it likely won’t replace your default scripting language, but it is a powerful utility for specific tasks.

These tasks can be surprisingly diverse. The best way to understand what problems AWK can solve is to learn AWK. You’ll be amazed at how AWK can help you get more done with less effort.

The basic syntax of AWK is:

awk [options] 'pattern {action}' file
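As a rough sketch of how the pattern and the action fit together, the commands below run on three throwaway lines (made up for illustration) fed in through a pipe, since awk reads standard input when no file is named. Either part can be left out: a pattern with no action prints each matching line, and an action with no pattern runs on every line.

```shell
# Pattern and action: for lines containing "b", print the first field.
printf 'a 1\nb 2\nc 3\n' | awk '/b/ {print $1}'

# Pattern only: the default action prints the whole matching line.
printf 'a 1\nb 2\nc 3\n' | awk '/b/'

# Action only: with no pattern, the action runs on every line.
printf 'a 1\nb 2\nc 3\n' | awk '{print $2}'
```

The first command prints b, the second prints b 2, and the third prints 1, 2, and 3 on separate lines.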

First, create this sample file and save it as colours.txt.

name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5

Data is separated into columns by one or more spaces. The data you want to analyze is usually organized in some way. The columns are not always separated by spaces, or even by commas or semicolons, but the data usually has a predictable format, especially in log files or data dumps. You can use that format to help AWK extract and process the data you care about.

Print columns

In AWK, the print statement displays whatever you specify. There are many predefined variables that you can use, but the most common are the integer-numbered variables that refer to the columns of a text file. Give it a try:

$ awk '{print $2}' colours.txt
color
red
yellow
red
purple
green
purple
brown
brown
yellow

Here, AWK displays the second column, represented by $2. This is relatively straightforward, so you might guess that print $1 shows the first column, print $3 shows the third column, and so on.

To display all columns, use $0.

The number after the dollar sign ($) is an expression, so $2 and $(1+1) mean the same thing.
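A quick way to check these claims is to pipe two rows of the sample data straight into awk. This is only a sketch; the rows are copied from the sample file with the padding removed.

```shell
# $(1+1) is evaluated first, so this prints the second column.
printf 'apple red 4\nbanana yellow 6\n' | awk '{print $(1+1)}'

# Columns can be printed in any order; the comma inserts a single space.
printf 'apple red 4\nbanana yellow 6\n' | awk '{print $3, $1}'

# $0 is the whole line, unchanged.
printf 'apple red 4\nbanana yellow 6\n' | awk '{print $0}'
```

The first command prints red and yellow, the second prints 4 apple and 6 banana, and the third echoes both input lines as they are.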

Select columns conditionally

The sample file you are using is very structured. It has one row that acts as a header, and the columns are directly related to each other. By defining conditions, you can limit what AWK returns when it reads this data. For example, to find the rows whose second column is yellow and print the contents of the first column:

$ awk '$2=="yellow"{print $1}' colours.txt
banana
pineapple

Regular expressions also work. This expression matches values of $2 that contain a p, followed by one or more characters of any kind, followed by another p:

$ awk '$2 ~ /p.+p/ {print $0}' colours.txt
grape      purple 10
plum       purple 2
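The pattern above is not anchored, so it can match anywhere inside the field. As a small sketch of two related forms, the commands below use the ^ anchor and the !~ (does not match) operator on a scratch file; the file name regex-demo.txt is made up for this example, and its three rows are taken from the sample data.

```shell
# Scratch file with three rows from the sample data (hypothetical file name).
printf 'grape purple 10\nkiwi brown 4\npineapple yellow 5\n' > regex-demo.txt

# Unanchored: $2 contains "p", at least one character, then another "p".
awk '$2 ~ /p.+p/ {print $1}' regex-demo.txt

# Anchored: ^ requires $1 to start with "p".
awk '$1 ~ /^p/ {print $1}' regex-demo.txt

# Negated: rows whose second column does not match the expression.
awk '$2 !~ /p.+p/ {print $1}' regex-demo.txt
```

The first command prints grape, the second prints pineapple, and the third prints kiwi and pineapple.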

AWK interprets numbers naturally. For example, to print the rows with a number greater than 5 in the third column:

$ awk '$3 > 5 {print $1, $2}' colours.txt
name color
banana yellow
grape purple
apple green
potato brown
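The header row appears in that output because its third column, amount, is text rather than a number, so awk falls back to a string comparison, and a letter sorts after the digit 5. One way to leave the header out, sketched here on the assumption that it is always the first line, is to add a condition on NR, the predefined variable that holds the current line number. The file name numeric-demo.txt is made up for this example.

```shell
# Scratch file: the header plus three rows from the sample data.
printf 'name color amount\nbanana yellow 6\nkiwi brown 4\ngrape purple 10\n' > numeric-demo.txt

# NR > 1 skips line 1 (the header); && requires both conditions to hold.
awk 'NR > 1 && $3 > 5 {print $1, $2}' numeric-demo.txt
```

This prints banana yellow and grape purple, with no header line.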

Field separator

By default, AWK uses whitespace (spaces and tabs) as the field separator. However, not all text files use whitespace to separate fields. For example, create a file called colours.csv with the following:

name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5

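AWK handles the data in exactly the same way, as long as you specify which character to use as the field separator in your command. Use the -F option (GNU awk also accepts the long form --field-separator) to define the separator. Note that the option is an uppercase -F; lowercase -f does something different (it reads the AWK program from a file).

$ awk -F"," '$2=="yellow" {print $1}' colours.csv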
banana
pineapple
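The separator can also be set from inside the program instead of on the command line. The sketch below assigns the predefined FS variable in a BEGIN block, which runs before any input is read; the file name separator-demo.csv is made up for this example.

```shell
# Scratch CSV file: the header plus two rows from the sample data.
printf 'name,color,amount\nbanana,yellow,6\nkiwi,brown,4\n' > separator-demo.csv

# Setting the separator with the -F option.
awk -F, '$2 == "yellow" {print $1}' separator-demo.csv

# The same query with FS assigned in a BEGIN block.
awk 'BEGIN { FS = "," } $2 == "yellow" {print $1}' separator-demo.csv
```

Both commands print banana. The BEGIN form can be convenient when the program lives in its own file, because the separator then travels with the script.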

Save the output

With output redirection, you can write the results to a file. For example:

$ awk -F, '$3>5 {print $1, $2}' colours.csv > output.txt

This creates a file named output.txt containing the results of the AWK query.

You can also split files into multiple files grouped by column data. For example, if you want to split colours.txt into multiple files based on the color displayed on each line, you can include redirection statements in awk to redirect each query:

$ awk '{print > ($2 ".txt")}' colours.txt

This generates files with names such as yellow.txt and red.txt, along with a color.txt that holds the header row.
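Here is a self-contained sketch of the same split that keeps the generated files in their own directory and skips the header, so no color.txt is created. The directory split-demo and the file fruit.txt are made-up names, and the NR > 1 condition assumes the header is always the first line.

```shell
# Scratch directory and input file (hypothetical names).
mkdir -p split-demo
printf 'name color amount\napple red 4\nbanana yellow 6\nstrawberry red 3\n' > split-demo/fruit.txt

# NR > 1 skips the header row.
# The parentheses make the whole concatenation the target file name.
awk 'NR > 1 { print > ("split-demo/" $2 ".txt") }' split-demo/fruit.txt

# Show one of the generated files.
cat split-demo/red.txt
```

The final cat prints the two red rows, apple and strawberry. Inside awk, > empties each target file the first time it is opened and then keeps adding lines to it for the rest of the run.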

In the next article, you’ll learn more about fields, records, and some of the powerful AWK variables.

This post is adapted from Hacker Public Radio, a community technology podcast.


Via: opensource.com/article/19/…

By Seth Kenlon (lujun9972

This article was originally compiled by LCTT and published by Linux China.