Don't format, overwrite before you sell! - Recovering data from 2nd hand HDDs

People who sell their computer usually care about their privacy and remember to format the drives. However, this is often not enough, and in this blog post I'll explain why by showing how I recovered data from an HDD bought on eBay.

The fact that formatting a harddrive does not, in certain circumstances, delete its data is nothing new, but I wanted to test this myself.

Buying a HDD

For a few days I kept an eye on various used harddrive auctions on what is probably the most popular 2nd hand platform. Finally, I was lucky enough to get a cheap 250GB 3.5" SATA harddrive from Samsung. It took me a while, because I didn't want to buy an oversized 0.5TB or 1TB drive and I had to make sure that it was a used drive.

The auction's description said that the data had been "deleted" and the harddrive came with a label "formatted".

[Image: 2nd hand 250GB harddrive]

The setup

This part of the blogpost is a bit hardware-hacky-ish, so feel free to skip to the next one.

I only had one SATA-to-USB connector from an old external 2.5" harddrive left, but the USB port didn't deliver enough current to power the 3.5" harddrive.

[Image: SATA-to-USB connector]

I had to find another way. Luckily, I had a BananaPi running Arch Linux lying around, which has one SATA port.

The only missing thing was the power supply. I found an old PSU (power supply unit) in my closet, but a PSU doesn't switch on without a mainboard unless you use a trick (on ATX units, typically bridging the green PS_ON wire to a black ground wire). The final setup looked like this:

[Image: HDD recovery setup]

I used another 256GB 2.5" harddrive as the backup location for the recovered data.

The recovery

Reading the S.M.A.R.T. values with smartctl reveals that the disk is quite old:

$> sudo smartctl -A /dev/sda | grep -i hours
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       9589
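
To put the raw value into perspective, the hours can be converted into days. A minimal sketch, assuming the attribute line has the same column layout as above (raw value as a plain integer in the last column); the function name power_on_days is made up for this example:

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# Power_On_Hours raw value converted to whole days (rounded down).
# Assumes the raw value is the last column and a plain integer,
# as in the output above.
power_on_days() {
    awk '$2 == "Power_On_Hours" { printf "%d\n", $NF / 24 }'
}

# Usage (device name as in the post):
#   sudo smartctl -A /dev/sda | power_on_days
```

For the drive above this gives 399 days, i.e. a bit more than a year of pure running time.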

It came with a fresh NTFS partition:

$> sudo parted -l 
Model: ATA SAMSUNG HD252KJ (scsi)
Disk /dev/sda: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End    Size   Type     File system  Flags
 1      1049kB  250GB  250GB  primary  ntfs         boot

After mounting, it just showed an empty folder. A layman might think that all the old data is now gone, but in reality creating a new filesystem (a quick format) only overwrites the filesystem's bookkeeping, i.e. the references to the files, not the data itself.

I decided to use photorec from the testdisk package. After starting the program, you select the drive or partition to recover and a destination, then let the program do its work. Photorec ignores the (no longer existing) filesystem and tries to recover files from the raw bytes on the drive by looking for the characteristic headers of known file formats.
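
The basic idea behind this kind of recovery (often called file carving) can be illustrated with a few lines of shell. This is only a sketch of the principle, not what photorec does internally: it assumes GNU grep with -P support and merely lists the byte offsets at which a JPEG start marker (the bytes FF D8 FF) appears in a raw image or device. The function name is made up for this example.

```shell
# Hypothetical helper: print the byte offsets of JPEG start-of-image markers
# (FF D8 FF) found in a raw disk image or block device given as $1.
# Assumes GNU grep:
#   -a  treat binary data as text
#   -b  print the byte offset of each match
#   -o  print only the matching part
#   -P  Perl-style regex, needed for the \xHH byte escapes
jpeg_offsets() {
    LC_ALL=C grep -abo -P '\xff\xd8\xff' "$1" | LC_ALL=C cut -d: -f1
}

# Usage: copy the first GiB of the drive into a file and scan it
#   sudo dd if=/dev/sda bs=4M count=256 of=/tmp/first-gib.img
#   jpeg_offsets /tmp/first-gib.img
```

Finding the start of a file is the easy part. A real carving tool also has to know many more formats and work out where each file ends.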

The whole recovery process took a couple of hours, but in the end photorec recovered ~175k files:

[Image: Photorec recovered 175k files]

Changing into the recovery folder and manually counting the files gives a slightly bigger number.

$> cd /mnt/rescue/recovery

$> find . -type f > /tmp/filelist.txt

$> wc -l /tmp/filelist.txt 
181836 /tmp/filelist.txt

$> du -sh /mnt/rescue/recovery/
155G	/mnt/rescue/recovery/

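I was able to recover about 155 GB, roughly 62% of the drive's nominal capacity, from the apparently cleaned device. (Since du -h counts in binary units while the drive's 250GB is decimal, the real share is a bit higher, about 66%.)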

The most common file types by extension are:

$> grep -oP "\..{1,4}\$" /tmp/filelist.txt | sort | uniq -c | sort -rn | head -n 20 
  51641 .jpg
  47832 .txt
  13960 .dll
   8783 .java
   6721 .gif
   5658 .gz
   5147 .mp3
   4458 .DLL
   3994 .png
   3886 .exe
   3618 .mpg
   3369 .xml
   3207 .html
   2798 .sys
   2156 .h
   1806 .lnk
   1019 .swc
    989 .EXE
    822 .ttf
    689 .bmp
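
Note that the count above is case-sensitive, so .dll and .DLL (or .exe and .EXE) show up as separate entries. A possible variant folds the extensions to lower case first. Again, this is just a sketch that assumes the same /tmp/filelist.txt format (one path per line); the function name is made up.

```shell
# Hypothetical helper: count file extensions case-insensitively.
# Reads one path per line from the file given as $1 and prints
# "count .ext" lines, most frequent first. Only extensions of 1-4
# characters are considered, mirroring the grep above; paths without
# such an extension are skipped.
count_extensions() {
    sed -n 's/.*\(\.[^./]\{1,4\}\)$/\1/p' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c | sort -rn
}

# Usage:
#   count_extensions /tmp/filelist.txt | head -n 20
```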

I didn't want to go through all 175 thousand files, so I first limited myself to the picture formats jpg, bmp and png.
Those files were mostly thumbnails, but there were also a few hundred full-resolution pictures. After seeing some sensitive personal pictures I stopped immediately, as I don't want to violate anyone's privacy.

Interestingly, photorec was able to recover full seasons of a children's show.

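It didn't work very well with documents; at least, LibreOffice only showed gibberish. One likely reason is that fragmented files cannot be put back together without the old filesystem metadata. Photorec also misclassified JavaScript and XML files as Java files.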

I did a quick grep for common keywords like username or password, but it only returned lots of JavaScript, XML, JSON or other not immediately interesting content. A more determined attacker might find more useful stuff.
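
Such a keyword search can be sketched as follows. This is not necessarily the exact command used here, just one way to do it, assuming GNU grep; the function name and the keyword list are made up for the example.

```shell
# Hypothetical helper: list files below directory $1 that contain
# one of a few credential-related keywords.
#   -r  recurse into subdirectories
#   -I  skip files that look binary
#   -i  ignore case
#   -l  print only the names of matching files
#   -E  extended regex, so that | means "or"
find_keyword_files() {
    grep -rIilE 'username|password' "$1"
}

# Usage:
#   find_keyword_files /mnt/rescue/recovery | head
```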

Nevertheless, it proves the point: the data hadn't been deleted properly.

How to do it right

To safely delete data, it is enough to overwrite the entire contents of the harddrive with zeroes or random data. The downside is that this might take a while.

If you're using Windows Vista or later, a full format (as opposed to a quick format) will overwrite all data.

If you're using Windows XP or earlier, you'll need a special tool.

If you're using Linux (e.g. from a live CD), you can use dd to overwrite everything with zeroes. Replace sdX with the actual harddrive and double-check the name, because dd will not ask before overwriting:

$> sudo dd if=/dev/zero of=/dev/sdX bs=4M

After that, nobody should be able to find any data on the drive. To make sure, you can run

$> sudo dd if=/dev/sdX bs=4M | hexdump -C | less

The output should look similar to the following:

[Image: A cleaned harddisk only contains zeroes]

The first line shows only zeroes, and the * in the second line indicates that all following lines are identical, i.e. also zero.
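
If you want to try this without risking a real disk, the same steps work on an ordinary file. The following sketch uses two made-up helper functions: one overwrites an existing file with zeroes in place, the other checks that a file or device contains nothing but zeroes by comparing it against /dev/zero with cmp. It assumes a shell with dd, head, wc and cmp available.

```shell
# Hypothetical helper: overwrite an existing file with zeroes, keeping its size.
# conv=notrunc stops dd from truncating the target before writing.
zero_fill() {
    [ -f "$1" ] || return 1
    size=$(($(wc -c < "$1")))
    head -c "$size" /dev/zero | dd of="$1" conv=notrunc 2>/dev/null
}

# Hypothetical helper: succeed only if $1 is readable and contains only
# zero bytes. cmp prints a "differ" line on stdout at the first non-zero
# byte; if $1 simply ends before the endless /dev/zero does, stdout stays empty.
is_all_zero() {
    [ -r "$1" ] && [ -z "$(cmp "$1" /dev/zero 2>/dev/null)" ]
}

# Usage on a scratch file:
#   printf 'secret data' > /tmp/scratch.img
#   zero_fill /tmp/scratch.img
#   is_all_zero /tmp/scratch.img && echo "only zeroes left"
```

On the real drive, the same comparison would be sudo cmp /dev/sdX /dev/zero. If cmp only complains that it reached the end of /dev/sdX and reports no differing byte, the drive contains nothing but zeroes.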

I bought two harddrives from the German Informationstechnikzentrum Bund (ITZBund). The picture above proves that they follow the IT Grundschutzkatalog and securely erase harddrives before they sell them.