How to Use the Bash Sort Command

The sort command is one of the most useful tools available in Bash, and also one of the most underrated. It sorts incoming data (whether from a file or from the output of another command) numerically or alphabetically. With its many options, sort makes processing and analyzing text much faster and easier.

Bash Sort Command

The basic usage is extremely simple: type the command name followed by the name of the file. Without options, sort compares whole lines character by character, in the order defined by the current locale, and prints them in ascending order.

Note! Using sort does not overwrite the file, but simply shows the result of manipulating it.
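
As a minimal sketch of the basic usage, the snippet below creates a small scratch file (its contents and the temporary path are made up for illustration), sorts it, and confirms that the file itself is unchanged. LC_ALL=C is set only to make the character order predictable.

```shell
export LC_ALL=C                  # byte-wise ordering, independent of the user's locale

file=$(mktemp)                   # hypothetical scratch file
printf '%s\n' pear apple mango > "$file"

sorted=$(sort "$file")           # sort prints the ordered lines to standard output
printf '%s\n' "$sorted"

after=$(cat "$file")             # the file still holds the original order
printf '%s\n' "$after"
rm -f "$file"
```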

By the way, fields are separated by blanks (spaces or tabs) by default. This can be changed with a special flag, but more on that later.

Sort also works with pipes, which pass the output of one utility to another as its input. This is useful when working with system information. For example, you can take the first five lines of the /etc/passwd file with head and pipe them through sort.
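
The pipeline described above might look like this. It assumes /etc/passwd exists, as it does on typical Linux systems, and the actual output depends on the accounts present on the machine.

```shell
# First five lines of /etc/passwd, in file order
unsorted=$(head -n 5 /etc/passwd)
printf '%s\n' "$unsorted"

# The same five lines, passed through a pipe to sort
sorted=$(head -n 5 /etc/passwd | sort)
printf '%s\n' "$sorted"
```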

If you run the utility without specifying a file, it waits for input from the keyboard. After you press Ctrl + D (end of input), it sorts the entered lines according to the given options (if any), prints the result, and exits.

The real power of the utility lies in its flags, which give you finer control over how the data is ranked.

Extended Use of Sort Command

Many options extend what the command can do. The examples in this section use a file of debtors, money.txt, in which each line contains a name, the amount owed, and a date.

Reverse Sorting

You already know the usual ranking, which starts from the first field. To perform it in reverse order (in this case, descending), add the -r (or --reverse) flag.
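
A minimal sketch of reverse sorting. The original sample file is not reproduced here, so the four debtor lines below (name, amount, month and day) are invented for illustration.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical debtors: name, amount owed, month and day
debtors='Petrov 300 Jan 25
Ivanov 1500 Mar 12
Sidorov 12000 Mar 3
Orlov 250 Feb 14'

ascending=$(printf '%s\n' "$debtors" | sort)       # Ivanov, Orlov, Petrov, Sidorov
descending=$(printf '%s\n' "$debtors" | sort -r)   # Sidorov, Petrov, Orlov, Ivanov
printf '%s\n' "$descending"
```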

Field Denotation

Of course, you do not always need to sort by the first field. If we want to list the debtors from the largest debt to the smallest, we have to add the -k (or --key) flag and specify the second field, as in -k2.

However, something is wrong here. The order of the lines changes, but they are not ranked by the amount of debt. The reason is that the field is compared as text; working with numbers requires a separate flag, but more on that later.
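
The effect can be reproduced with a few invented lines (a stand-in for the original file). Sorting by the second field without a numeric flag compares the amounts as text, so 12000 lands before 1500 and 250 before 300.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical debtors: name, amount owed, month and day
debtors='Petrov 300 Jan 25
Ivanov 1500 Mar 12
Sidorov 12000 Mar 3
Orlov 250 Feb 14'

# The key starts at field 2 and is compared character by character
by_text=$(printf '%s\n' "$debtors" | sort -k2)
printf '%s\n' "$by_text"
```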

Consider another file, in which the second field holds the person's initial, a dot, and then the surname.

If you sort this file by the second field, the lines are, obviously, ordered by the initial.

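To sort by the surname, add a dot and the character position after the field number. Here the position is 3 (the first character is the initial, the second is the dot), so the key becomes -k2.3. Keep in mind that unless -t or -b is given, GNU sort counts characters from the start of the blanks that precede the field, so add -b to count from the first visible character.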
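
A small sketch of a character offset inside a key, using GNU sort. The three lines are invented; each has a number followed by a field of the form initial.surname.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical list: a number, then initial.surname
people='1 A.Petrov
2 B.Ivanov
3 C.Sidorov'

# -k2 compares the whole second field, so the initial decides the order
by_initial=$(printf '%s\n' "$people" | sort -k2)

# -k2.3 starts the key at the third character of field 2;
# -b makes the count begin at the first non-blank character
by_surname=$(printf '%s\n' "$people" | sort -b -k2.3)
printf '%s\n' "$by_surname"
```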

Now consider different data. The first and last names are separated by a space and are therefore separate fields. In addition, some surnames occur more than once.

Although the command specifies only the second field, people with the same surname are still arranged in order (ascending or descending) by the field to the right, which here is the first name. This happens because a key written as -k2 extends from the second field to the end of the line.

If the first names should not be taken into account, limit the key to the second field alone by adding another 2 after a comma: -k2,2.
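
The difference between an open-ended key and a key limited to one field can be sketched as follows. The names are invented, and the layout (number, surname, first name) is an assumption about the original file.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical list: number, surname, first name
people='1 Ivanov Petr
2 Ivanov Anna
3 Petrov Boris'

# -k2: the key runs from field 2 to the end of the line,
# so the first name decides the order of the two Ivanov lines
open_key=$(printf '%s\n' "$people" | sort -k2)

# -k2,2: the key is the surname only; the two Ivanov lines tie,
# and sort falls back to comparing the whole lines
field_only=$(printf '%s\n' "$people" | sort -k2,2)

printf '%s\n' "$open_key" "---" "$field_only"
```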

The utility also supports secondary sorting by another field. To do this, add another -k flag with the appropriate parameters. So, if the debts of people who share a surname should appear in a certain order (descending or ascending), specify that key after the primary one.

But, as we found out earlier, numbers are not handled the same way as letters, so let us look at them first.

Numerical Sorting

Handling numbers has some pitfalls. By default, sort treats digits as ordinary characters rather than as numeric values, so it orders them by their character codes (their position in the ASCII table) instead of by magnitude.

For example, take a file with the numbers from 1 to 15, one per line.

If you sort it without options, the result is 1, 10, 11, 12, 13, 14, 15, 2, 3 and so on up to 9.

That is not what we expected. Why? Because the utility compares the lines character by character. The character 1 comes before 2 in the ASCII table, so every number that starts with 1 is placed first, and the same rule is then applied to the characters that follow.

To rank numbers correctly, add the -n (or --numeric-sort) option.
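
The two behaviours can be compared directly. This sketch uses seq to generate the numbers instead of a file, and xargs only to print each result on a single line.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

as_text=$(seq 1 15 | sort | xargs)        # compared character by character
as_numbers=$(seq 1 15 | sort -n | xargs)  # compared by numeric value

printf '%s\n' "$as_text"      # 1 10 11 12 13 14 15 2 3 4 5 6 7 8 9
printf '%s\n' "$as_numbers"   # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```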

We return to our first debtors. We need to sort them by the amount of debt in descending order. Now we know how to do it.
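
Put together, a descending numeric ranking by the second field could look like this. The debtor lines are invented and only stand in for the original file.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical debtors: name, amount owed, month and day
debtors='Petrov 300 Jan 25
Ivanov 1500 Mar 12
Sidorov 12000 Mar 3
Orlov 250 Feb 14'

# -k2 selects the amount, -n compares it as a number, -r puts the largest first
largest_first=$(printf '%s\n' "$debtors" | sort -k2 -n -r)
printf '%s\n' "$largest_first"
```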

Now it’s time to highlight a few nuances of working with sort:

  1. Short forms of flags (a single letter) can be written together after one hyphen, for example -nr instead of -n -r.
  2. Flags can be given in any order. The main thing is to keep each flag's parameter next to that flag (the two can also be written together, as in -k2). For example, sort -nr -k2 money.txt and sort -k2 -nr money.txt give the same result.
    However, letters written after the field number belong to that key, and a key with its own ordering letters no longer inherits the global flags. For example, sort -rk2n money.txt ranks in ascending order, not descending, because the -r to the left of k2n does not apply to that key. Write -k2nr or -nrk2 instead.
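
The ordering pitfall can be checked with a short sketch. The behaviour shown is that of GNU sort, and the debtor lines are invented for illustration.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical debtors: name, amount owed, month and day
debtors='Petrov 300 Jan 25
Ivanov 1500 Mar 12
Sidorov 12000 Mar 3
Orlov 250 Feb 14'

global_flags=$(printf '%s\n' "$debtors" | sort -nrk2)   # key inherits -n and -r: descending
key_flags=$(printf '%s\n' "$debtors" | sort -k2nr)      # n and r attached to the key: descending
mixed=$(printf '%s\n' "$debtors" | sort -rk2n)          # key has its own n, so -r is not applied: ascending

printf '%s\n' "$mixed"
```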

Finally, back to the debtors who share a surname. We wanted a secondary sort by the amount of debt. The letter n written after the fourth field marks that key as numeric. Attaching it to the key is the clearer choice, because a global -n would also be applied to the surname key.
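
A sketch of the secondary sort, assuming a layout of number, surname, first name, and debt (the lines themselves are invented):

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical list: number, surname, first name, debt
people='1 Ivanov Petr 500
2 Ivanov Anna 200
3 Petrov Boris 900
4 Ivanov Oleg 50'

# Primary key: surname only (-k2,2). Secondary key: debt, compared as a number (-k4n)
ranked=$(printf '%s\n' "$people" | sort -k2,2 -k4n)
printf '%s\n' "$ranked"
```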

Sort Numbers in Mathematical Form

A rather useful option is -g (--general-numeric-sort), which ranks numbers written in scientific notation. For floating-point numbers, the decimal separator is normally the dot, not the comma (this depends on the locale).

For example, suppose a file contains several numbers, among them 12.12 and 10e3.

With the -n option, 10e3 is placed before 12.12, which is, of course, wrong: 10e3 means 10 × 10³, and that is obviously larger than 12.12. The -n option reads only the leading 10 and ignores the exponent. With the -g option, everything falls into place.
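
The difference is easy to reproduce. In this sketch the third value, 3, is added only to make the ordering clearer, and xargs prints each result on one line.

```shell
export LC_ALL=C   # the dot is the decimal separator in this locale

numbers='12.12
10e3
3'

with_n=$(printf '%s\n' "$numbers" | sort -n | xargs)   # 10e3 is read as 10
with_g=$(printf '%s\n' "$numbers" | sort -g | xargs)   # 10e3 is read as 10000

printf '%s\n' "$with_n"   # 3 10e3 12.12
printf '%s\n' "$with_g"   # 3 12.12 10e3
```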

Sort by Month

The program can even rank by month of the year. In the first file, a date is given next to the amount. To sort by month, use the -M (--month-sort) flag together with the corresponding field.

But we would also like the entries within each month to be in the right order. For that, use the secondary sorting we are already familiar with.
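
A sketch of month sorting with a secondary key. The date format of the original file is not known, so the layout below (month abbreviation in field 3, day in field 4) is an assumption, and the lines are invented. Month abbreviations are recognized according to the locale; LC_ALL=C gives the English ones.

```shell
export LC_ALL=C   # English month abbreviations, byte-wise ordering

# Hypothetical debtors: name, amount owed, month, day
debtors='Petrov 300 Jan 25
Ivanov 1500 Mar 12
Sidorov 12000 Mar 3
Orlov 250 Feb 14'

# Primary key: month (-k3,3M). Secondary key: day as a number (-k4,4n)
by_date=$(printf '%s\n' "$debtors" | sort -k3,3M -k4,4n)
printf '%s\n' "$by_date"
```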

Separator Denotation

By default, as you know, fields are separated by blanks. To use a different separator, specify it with the -t flag (--field-separator=SEP).

So, ranking the /etc/passwd file by any field other than the first is not possible without this option. For example, take the last 5 entries of the file, with the colon as the separator and the UID (the third field) as the reference field.
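
A sketch of sorting colon-separated records by UID. To keep the result predictable, it uses four invented lines in the /etc/passwd format rather than the real file.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical entries in /etc/passwd format (the UID is the third field)
entries='alice:x:1002:1002::/home/alice:/bin/bash
sshd:x:105:65534::/run/sshd:/usr/sbin/nologin
bob:x:1001:1001::/home/bob:/bin/bash
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin'

# -t: sets the colon as the separator; -k3,3n sorts by the UID as a number
by_uid=$(printf '%s\n' "$entries" | sort -t: -k3,3n)
printf '%s\n' "$by_uid"
```

On a real system, the same flags can be applied to the output of tail -n 5 /etc/passwd.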

Sorting Check

The -c (--check) option lets you check whether the data is already sorted according to the specified condition. If the data is in order, nothing is displayed on the screen. Otherwise, the program reports the file name and the number of the first line that is out of order.
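
A sketch of the check, reading from standard input instead of a file. It also captures the exit status, which is zero for ordered input and non-zero otherwise; the exact wording of the message may differ between implementations.

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Ordered input: no message, exit status 0
ok_status=0
ok_report=$(printf '%s\n' apple mango pear | sort -c 2>&1) || ok_status=$?

# Unordered input: a message naming the first offending line, non-zero status
bad_status=0
bad_report=$(printf '%s\n' apple pear mango | sort -c 2>&1) || bad_status=$?

printf 'ordered input:   status %s, message "%s"\n' "$ok_status" "$ok_report"
printf 'unordered input: status %s, message "%s"\n' "$bad_status" "$bad_report"
```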

Remove Duplicate Lines

If the same field contains duplicate data (as with the surnames), the duplicates can be hidden with the -u (--unique) flag. But if we specify just the second field, we will not get the expected result.

The reason is that the program looks for unique keys that span from the second field to the end of the line, and no two such keys are identical. Therefore, to hide lines with the same surname, restrict the key to the second field alone with a comma: -k2,2.
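
A sketch of both variants, using invented lines in an assumed layout of number, surname, and first name:

```shell
export LC_ALL=C   # byte-wise ordering for predictable results

# Hypothetical list: number, surname, first name
people='1 Ivanov Petr
2 Ivanov Anna
3 Petrov Boris'

# Key from field 2 to the end of the line: all three keys differ, nothing is dropped
open_key=$(printf '%s\n' "$people" | sort -u -k2)

# Key limited to the surname: only one Ivanov line is kept
field_only=$(printf '%s\n' "$people" | sort -u -k2,2)

printf '%s\n' "$open_key" "---" "$field_only"
```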

Ignoring Spaces

Earlier, we considered “ideal” files in which everything is filled in without flaws. But sometimes a field is preceded by an extra space, and the program treats it as a character during comparison (the space, by the way, comes before the letters in the ASCII table). There may even be several spaces in a row. The -b (--ignore-leading-blanks) flag makes sort ignore such leading blanks.
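
The effect of a stray space can be sketched with two invented lines, the first of which has two spaces before its second field:

```shell
export LC_ALL=C   # byte-wise ordering, where the space sorts before letters

lines='a  zeta
b alpha'

# Without -b the extra space is part of the key and sorts before "a"
with_blanks=$(printf '%s\n' "$lines" | sort -k2)

# With -b the leading blanks are skipped, so "alpha" comes before "zeta"
ignoring_blanks=$(printf '%s\n' "$lines" | sort -b -k2)

printf '%s\n' "$with_blanks" "---" "$ignoring_blanks"
```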

These are the most commonly used options of the sort command. There are others, but they are more specialized and therefore used less often. You can read more about them in man sort.

Cyril Kardashevsky
