14. uniq (Unique)

Lesson Content

The uniq (unique) command is another useful tool for parsing text.

Let’s say you had a file with lots of duplicates:

reading.txt
book
book
paper
paper
article
article
magazine

To remove the duplicates, you can use the uniq command:

$ uniq reading.txt
book
paper
article
magazine

Let’s get a count of how many times each line occurs:

$ uniq -c reading.txt
2 book
2 paper
2 article
1 magazine
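
On many implementations (GNU coreutils, for one), the count column is right-aligned with leading spaces, so the raw output may be indented compared with the listing above. The following is a minimal sketch, assuming awk is available and that every line is a single word, that recreates the file and trims the padding. The temporary file and the variable names are illustrative only.

```shell
# Recreate reading.txt from the lesson in a temporary file (the path is arbitrary).
tmpfile=$(mktemp)
printf '%s\n' book book paper paper article article magazine > "$tmpfile"

# uniq -c prefixes each line with its count; awk re-prints the two fields
# separated by a single space, which drops any leading padding.
counts=$(uniq -c "$tmpfile" | awk '{print $1, $2}')
echo "$counts"

rm -f "$tmpfile"
```

Lines that contain spaces would need a different approach, since this awk command prints only the first word after the count.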

Let’s get only the lines that are not repeated:

$ uniq -u reading.txt
magazine

Let’s get only the lines that are repeated (one copy of each):

$ uniq -d reading.txt
book
paper
article
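
The -u and -d options split the file into two complementary groups: every distinct line shows up in exactly one of the two outputs. Here is a small sketch for trying both on a throwaway copy of the file. The temporary file and the variable names are assumptions made for the example, not part of the lesson.

```shell
# Recreate reading.txt from the lesson in a temporary file (the path is arbitrary).
tmpfile=$(mktemp)
printf '%s\n' book book paper paper article article magazine > "$tmpfile"

# -u keeps only lines that are not repeated.
only_once=$(uniq -u "$tmpfile")

# -d keeps one copy of each line that is repeated.
repeated=$(uniq -d "$tmpfile")

echo "appears once:"
echo "$only_once"
echo "repeated:"
echo "$repeated"

rm -f "$tmpfile"
```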

Note: uniq does not detect duplicate lines unless they are adjacent.

Let’s say you had a file with duplicates which are not adjacent:

reading.txt
book
paper
book
paper
article
magazine
article

$ uniq reading.txt
book
paper
book
paper
article
magazine
article

Unlike in the very first example, the output still contains every line, because none of the duplicates are adjacent.

To overcome this limitation of uniq, we can use sort in combination with uniq:

$ sort reading.txt | uniq
article
book
magazine
paper
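
The sort-then-uniq pipeline above can be tried on a scratch file, as sketched below. The sketch also runs sort -u, which sorts and drops duplicates in one step and should give the same result for whole-line comparisons like this one. The temporary file and the variable names are illustrative assumptions.

```shell
# Recreate the non-adjacent example in a temporary file (the path is arbitrary).
tmpfile=$(mktemp)
printf '%s\n' book paper book paper article magazine article > "$tmpfile"

# sort places identical lines next to each other so uniq can collapse them.
deduped=$(sort "$tmpfile" | uniq)
echo "$deduped"

# sort -u sorts and removes duplicate lines in a single command.
deduped_u=$(sort -u "$tmpfile")

rm -f "$tmpfile"
```

Note that either approach changes the order of the lines, which may matter if the original order is meaningful.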

Exercise

What result would you get if you tried uniq -uc?

Quiz Question

What command would you use to remove duplicates in a file?

1. [ ] one
2. [ ] only
3. [ ] distinct
4. [x] uniq

The uniq command deletes repeated lines in a file. The uniq command reads either standard input or a file specified by the InFile parameter. The command first compares adjacent lines and then removes the second and succeeding duplications of a line. Duplicated lines must be adjacent. (Before issuing the uniq command, use the sort command to make all duplicate lines adjacent.) Finally, the uniq command writes the resultant unique lines either to standard output or to the file specified by the OutFile parameter. The InFile and OutFile parameters must specify different files.