
How to Find and Remove Duplicate Files in Linux using 'fdupes' Command Line Tool

Have you ever downloaded a PDF document from the Internet, moved it to some folder, and ten months later downloaded it again because you could not find the first one? Do you have ‘Document’, ‘Document(1)’, and ‘Document(2)’ all clustered in the same Downloads folder?

The increased availability of the Internet over the years means that users frequently download a file again rather than spend time with the (often slow and dull) search functionality of their file explorer. This, coupled with unorganized folder structures, often creates a chaotic storage situation in which duplicate files can consume multiple gigabytes of space.

To deal with these duplicate files, the GNU/Linux community offers a plethora of command-line and GUI-based options. One such easy-to-use command-line tool is ‘fdupes’.

Find Duplicates using ‘fdupes’ in Linux

To find duplicates in a particular directory, type fdupes <directory_path> in the Linux terminal and run it. Alternatively, go to the required directory using cd and run fdupes . (the . in the command means the current directory on the Linux command line).

However, this only checks the files directly inside the given directory. If the directory contains other directories (which can in turn contain a hierarchy of directories below them), pass the -r (recursive) flag to fdupes so that they are scanned as well.

fdupes -r <directory_path>
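
To try both forms safely, the following sketch builds a throwaway directory with known duplicates and scans it. The directory layout and file names are made up for illustration, and the fdupes calls are skipped if the tool is not installed.

```shell
#!/bin/sh
# Build a scratch directory (hypothetical layout) with known duplicates.
demo=$(mktemp -d)
mkdir -p "$demo/nested"
printf 'same content\n' > "$demo/Document.txt"
cp "$demo/Document.txt" "$demo/Document(1).txt"         # duplicate in the same directory
cp "$demo/Document.txt" "$demo/nested/Document(2).txt"  # duplicate one level down
printf 'different content\n' > "$demo/notes.txt"        # not a duplicate

if command -v fdupes >/dev/null 2>&1; then
    echo "Top level only:"
    fdupes "$demo"
    echo "Including subdirectories:"
    fdupes -r "$demo"
else
    echo "fdupes is not installed; skipping the scan" >&2
fi
```

The first scan should report only the two top-level copies, while the scan with -r should also list the copy under nested/. The scratch directory can be removed with rm -r "$demo" afterwards.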

Removing Duplicates

Now that we have the list of duplicate files, we can make use of the rm command in Linux to remove the duplicates consuming unnecessary space.

rm <filename>
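
File names such as ‘Document(1)’ contain characters that the shell treats specially, so they need to be quoted when passed to rm. A minimal sketch, using a scratch directory and made-up file names:

```shell
#!/bin/sh
# Scratch directory with hypothetical file names, for illustration only.
scratch=$(mktemp -d)
printf 'report\n' > "$scratch/Document.pdf"
cp "$scratch/Document.pdf" "$scratch/Document(1).pdf"
cp "$scratch/Document.pdf" "$scratch/Document copy.pdf"

# Quote each name so parentheses and spaces reach rm literally;
# "--" marks the end of options in case a name starts with a dash.
rm -- "$scratch/Document(1).pdf" "$scratch/Document copy.pdf"

ls "$scratch"   # only Document.pdf is left
```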

But what if there are a large number of duplicate files, and we want to keep one and remove the rest of them? It becomes quite cumbersome to remove each file one by one using rm in such a case.

For this, we can use the -d flag. For each set of duplicates, it prompts the user to choose the file to keep and deletes the rest.

fdupes -d <directory_path>

Note: As with most Linux commands, the flags can also be used in combination.

fdupes -rd <directory_path>

Use the -N flag along with -d to keep the first file in each set of duplicates and remove the others, without the command prompting for which files to keep.

fdupes -rdN <directory_path>
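
Because -N deletes without asking, it can help to preview what would be removed first. The sketch below rests on two assumptions: that fdupes prints each set of duplicates as consecutive lines separated by a blank line, and that the first line of a set is the file -N would keep. The helper name list_extra_copies is hypothetical and the sample input is made up; file names containing newlines are not handled.

```shell
#!/bin/sh
# list_extra_copies: hypothetical helper that reads fdupes-style output on
# stdin and prints every file except the first one of each set.
list_extra_copies() {
    awk '
        /^$/        { first_seen = 0; next }   # blank line ends a set
        !first_seen { first_seen = 1; next }   # skip the first file of a set
                    { print }                  # the rest are extra copies
    '
}

# Made-up sample in the assumed output format (two sets of duplicates).
sample='/tmp/demo/a.txt
/tmp/demo/a(1).txt
/tmp/demo/sub/a(2).txt

/tmp/demo/b.pdf
/tmp/demo/b copy.pdf
'
extras=$(printf '%s\n' "$sample" | list_extra_copies)
printf '%s\n' "$extras"
```

In real use the helper would be fed from a scan, for example fdupes -r <directory_path> | list_extra_copies, and the resulting list reviewed before running the command with -dN.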

These are the most useful options in the fdupes command to efficiently get rid of duplicate files.

Note that if the command is run on a larger folder (e.g. on /home or on the root folder /), fdupes will take some time to run and will display a progress indicator on the terminal.


