Delete files with same size

So, I have a directory full of files. Some of them are duplicates, and I'd like to delete those duplicates. The only way I have to recognize duplicate files is by their size. So how would I delete them?
Just because two files are the same size doesn't mean they're the same.
Should I try using some hash thing then? Like CRC32 or MD5 to check for duplicates? Because I know there are duplicates.
Are the files big? If not, you could read the data from fileA and compare it with the data from fileB.
> Should I try using some hash thing then? Like CRC32 or MD5 to check for duplicates?

Computing the hash requires processing every byte in the file. If two files hash to the same value, they are probably identical (with probability close to one if the hash is cryptographically strong), but you still need to compare the bytes to be certain. Computing a hash speeds things up only if it can be computed once, cached, and then reused many times.

1. Compare file sizes
2. If they are equal, and if we have pre-computed hash values for at least one file, compare the hashes.
3. If the hashes are equal (or step two was not performed), compare the files byte by byte.
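
Step two could be sketched roughly as below. This is only an illustration: FNV-1a is used as a stand-in because it is short to write out (it is not cryptographically strong, and it is not CRC32 or MD5), and the names fnv1a_of_file, cached_hash and maybe_duplicates are made up for this example. The cache lives in memory only, so it helps only when the same file is compared against many others in one run.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <unordered_map>

// FNV-1a (64-bit): a simple non-cryptographic hash, used here only as a cheap filter
std::uint64_t fnv1a_of_file( const std::string& path )
{
    std::ifstream file( path, std::ios::binary ) ;
    std::uint64_t hash = 0xcbf29ce484222325ull ; // FNV offset basis
    using iterator = std::istreambuf_iterator<char> ;
    for( iterator iter(file) ; iter != iterator() ; ++iter )
    {
        hash ^= static_cast<unsigned char>(*iter) ;
        hash *= 0x100000001b3ull ; // FNV prime
    }
    return hash ;
}

// hypothetical in-memory cache: each path is hashed at most once
std::uint64_t cached_hash( const std::string& path )
{
    static std::unordered_map< std::string, std::uint64_t > cache ;
    const auto found = cache.find(path) ;
    if( found != cache.end() ) return found->second ;
    return cache[path] = fnv1a_of_file(path) ;
}

// false: the files certainly differ
// true: they might be identical, and step three must still confirm it
bool maybe_duplicates( const std::string& path_one, const std::string& path_two )
{
    return cached_hash(path_one) == cached_hash(path_two) ;
}
```

Note that a file which cannot be opened hashes like an empty file here, which is one more reason the byte-by-byte comparison has the final say.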

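#include <fstream>
#include <algorithm>
#include <iterator>

// returns true only if both files can be opened and contain identical bytes
bool duplicates( const char* path_one, const char* path_two )
{
    std::ifstream first( path_one, std::ios::binary ) ;
    std::ifstream second( path_two, std::ios::binary ) ;
    using iterator = std::istreambuf_iterator<char> ;
    // the four-iterator std::equal (C++14) also checks that both files end together,
    // so a file that is merely a prefix of the other is not reported as a duplicate
    return first && second &&
            std::equal( iterator(first), iterator(), iterator(second), iterator() ) ;
}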
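
To get from comparing two files to actually cleaning up the directory, something along these lines might work. It is a sketch, assuming a C++17 compiler with std::filesystem; same_contents and remove_duplicates are names invented for this example. It groups the files by size (step one), skips the optional hash step, and compares files of equal size byte by byte (step three). Which copy of a duplicate survives depends on the directory iteration order, which is unspecified, so try it on a copy of the directory first.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <map>
#include <vector>

namespace fs = std::filesystem ;

// byte-by-byte comparison; false if either file cannot be opened
bool same_contents( const fs::path& path_one, const fs::path& path_two )
{
    std::ifstream first( path_one, std::ios::binary ) ;
    std::ifstream second( path_two, std::ios::binary ) ;
    using iterator = std::istreambuf_iterator<char> ;
    return first && second &&
            std::equal( iterator(first), iterator(), iterator(second), iterator() ) ;
}

// deletes every regular file in 'directory' (not recursive) whose contents
// duplicate a file that was kept; returns the number of files removed
std::size_t remove_duplicates( const fs::path& directory )
{
    // step one: group the files by size; collect all paths before deleting anything
    std::map< std::uintmax_t, std::vector<fs::path> > by_size ;
    for( const auto& entry : fs::directory_iterator(directory) )
        if( entry.is_regular_file() ) by_size[ entry.file_size() ].push_back( entry.path() ) ;

    std::size_t removed = 0 ;
    for( const auto& [ size, paths ] : by_size )
    {
        std::vector<fs::path> kept ; // distinct files of this size seen so far
        for( const auto& path : paths )
        {
            // step three: compare only against files of the same size
            const bool is_duplicate = std::any_of( kept.begin(), kept.end(),
                [&path]( const fs::path& other ) { return same_contents( path, other ) ; } ) ;
            if( is_duplicate ) { if( fs::remove(path) ) ++removed ; }
            else kept.push_back(path) ;
        }
    }
    return removed ;
}
```

Calling remove_duplicates( "some_directory" ) would then leave one copy of each distinct file. Most sizes usually occur only once, so most files are never opened at all.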