Don't format, overwrite before you sell! - Recovering data from 2nd hand HDDs
People who sell their computer usually care about their privacy and remember to format the drives. However, this is often not enough. In this blog post I'll explain why by showing how I recovered data from an HDD bought on eBay.
That formatting a hard drive does not, under certain circumstances, delete its data is nothing new, but I wanted to test it myself.
Buying a HDD
I kept an eye on used hard drive auctions on ebay.com, probably the most popular second-hand platform, for a few days. Finally, I was lucky enough to get a cheap 250GB 3.5" SATA hard drive from Samsung. It took me a while because I didn't want to buy an oversized 0.5TB or 1TB drive, and I had to make sure that it was a used drive.
The auction's description said that the data had been "deleted" and the hard drive came with a label saying "formatted".
The setup
This part of the blog post is a bit hardware-hacky-ish, so feel free to skip to the next one.
I only had one SATA-to-USB connector left, from an old external 2.5" hard drive, but the USB port didn't deliver enough current to power the 3.5" hard drive.
I had to find another way. Luckily, I had a BananaPi with Arch Linux lying around, which has one SATA port.
The only missing thing was the power supply. I found an old PSU (power supply unit) in my closet, but one needs to use a trick to power up a PSU without a mainboard. The final setup looked like this:
I used another 256GB 2.5" hard drive as the backup location for the recovered data.
The recovery
Reading the S.M.A.R.T. values with smartctl reveals that the disk is quite old:
$> sudo smartctl -A /dev/sda | grep -i hours
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 9589
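To put the raw value into perspective, the hours can be converted into days and years of operation. The following is a minimal sketch; `power_on_age` is a hypothetical helper name, and it assumes the raw value is a plain number in the last column, as in the output above (some drives report it in a different format).

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and convert
# the raw Power_On_Hours value (last column) into days and years.
power_on_age() {
    awk '/Power_On_Hours/ {
        hours = $NF
        printf "%d hours = %.1f days = %.2f years\n", hours, hours / 24, hours / 8760
    }'
}

# Usage on the real drive (device name as in this post):
#   sudo smartctl -A /dev/sda | power_on_age
```

For the drive above this works out to roughly 400 days of continuous operation.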
It came with a fresh NTFS partition:
$> sudo parted -l
Model: ATA SAMSUNG HD252KJ (scsi)
Disk /dev/sda: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1049kB 250GB 250GB primary ntfs boot
After mounting, it just showed an empty folder. A layperson might think that all the old data is now gone, but in reality creating a new filesystem only overwrites the references to the files, not the data itself.
I decided to use photorec from the testdisk package. After starting the program, you select the drive or partition to recover and a destination, and then let the program do its work. Photorec ignores the (non-existent) filesystem and tries to recover files from the raw bits and bytes on the drive.
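The basic idea behind this kind of recovery can be illustrated with a few lines of shell: many file formats start with a fixed byte sequence, and JPEG files begin with the bytes FF D8 FF. The sketch below only lists the byte offsets where that sequence occurs in a raw image or device. It is not how photorec is implemented (its signature handling is far more elaborate); `find_jpeg_offsets` is a hypothetical name, and GNU grep with `-P` support is assumed.

```shell
# Hypothetical helper: print the byte offsets of JPEG start markers
# (FF D8 FF) in a raw image file or block device.
# Assumes GNU grep; LC_ALL=C makes grep match raw bytes, not characters.
find_jpeg_offsets() {
    LC_ALL=C grep -obUaP '\xff\xd8\xff' "$1" | cut -d: -f1
}

# Usage (reading a block device usually needs root):
#   sudo sh -c '. ./carve.sh; find_jpeg_offsets /dev/sda1' | head
```

Each offset is only a candidate; a real recovery tool also has to figure out where the file ends and whether its content is intact.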
The whole recovery process took a couple of hours, but in the end photorec recovered ~175k files:
Changing into the recovery folder and manually counting the files gives a slightly bigger number.
$> cd /mnt/rescue/recovery
$> find . -type f > /tmp/filelist.txt
$> wc -l /tmp/filelist.txt
181836 /tmp/filelist.txt
$> du -sh recovery/
155G recovery/
I was able to recover 62% (155 GB) of the drive's 250 GB from the apparently cleaned device.
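The percentage is simply the recovered size divided by the drive's capacity. A minimal sketch of the arithmetic, with `recovered_share` as a hypothetical helper name:

```shell
# Hypothetical helper: share of the drive that was recovered, in percent.
# $1 = recovered size, $2 = drive capacity (both in the same unit).
recovered_share() {
    awk -v rec="$1" -v total="$2" 'BEGIN { printf "%.0f%%\n", 100 * rec / total }'
}

recovered_share 155 250   # prints 62%
```

Note that `du -h` reports sizes in powers of 1024 while parted's GB is decimal, so 155G is closer to 166 GB and the real share is probably a few percentage points higher.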
The most common file types by extension are:
$> grep -oP "\..{1,4}\$" /tmp/filelist.txt | sort | uniq -c | sort -rn | head -n 20
51641 .jpg
47832 .txt
13960 .dll
8783 .java
6721 .gif
5658 .gz
5147 .mp3
4458 .DLL
3994 .png
3886 .exe
3618 .mpg
3369 .xml
3207 .html
2798 .sys
2156 .h
1806 .lnk
1019 .swc
989 .EXE
822 .ttf
689 .bmp
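The listing above counts extensions case-sensitively, which is why .dll and .DLL (or .exe and .EXE) show up as separate entries. A small variation folds everything to lower case before counting. This is a sketch, with `count_extensions` as a hypothetical helper name, and it only considers alphanumeric extensions of up to four characters:

```shell
# Hypothetical helper: count file extensions in a file list,
# ignoring upper/lower case. $1 = file with one path per line.
count_extensions() {
    grep -oE '\.[A-Za-z0-9]{1,4}$' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c | sort -rn
}

# Usage with the list created earlier:
#   count_extensions /tmp/filelist.txt | head -n 20
```

With the numbers above, .dll would then add up to 18418 files and .exe to 4875.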
I didn't want to go through all 175 thousand files, so I first limited myself to the picture formats jpg, bmp and png.
Those files were mostly thumbnails, but there were also a few hundred full-resolution pictures. After seeing some sensitive personal pictures I stopped immediately, as I don't want to violate anyone's privacy.
Interestingly, photorec was able to recover full seasons of a children's show.
Recovery did not work well for documents, or at least LibreOffice only showed gibberish. Photorec also misclassified JavaScript and XML files as Java files.
I did a quick grep for common keywords like username or password, but it only returned lots of JavaScript, XML, JSON and other not-immediately-interesting content. A more determined hacker might find more useful stuff.
Nevertheless, it proves the point that the data had not been deleted properly.
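A keyword search of this kind could look roughly like the following. This is a sketch and not the exact command I ran; `keyword_hits` is a hypothetical helper name, and GNU grep is assumed for the `-r` and `-I` options.

```shell
# Hypothetical helper: list recovered text files that mention
# "username" or "password" (case-insensitive).
# -r: recurse, -I: skip binary files, -l: print only file names.
keyword_hits() {
    grep -rIliE 'username|password' "$1" | sort
}

# Usage with the recovery folder from above:
#   keyword_hits /mnt/rescue/recovery | less
```

Because it only prints file names, the result is easier to skim than pages of matching lines.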
How to do it right
To safely delete data, it is enough to overwrite the entire contents of the hard drive with (random) data. The downside is that this might take a while.
If you're using Windows Vista or later, a full format (not a quick format) will overwrite all data.
If you're using Windows XP or earlier, you'll need a special tool.
If you're using Linux (on a Live-CD), you can use dd (replace sdX with the actual hard drive) to overwrite everything with zeroes:
$> dd if=/dev/zero of=/dev/sdX bs=4M
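Because dd overwrites whatever it is pointed at without asking, a small guard around the command can prevent wiping the wrong target. The following is a sketch under a few assumptions: `wipe_drive` is a hypothetical helper name, /proc/mounts is available (Linux), and dd is the GNU coreutils version (for `status=progress`).

```shell
# Hypothetical helper: overwrite a whole drive with zeroes, but refuse
# if the target is not a block device or if it is currently mounted.
wipe_drive() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi
    # Matches the device itself and its partitions (e.g. /dev/sdX1).
    if grep -q "^$dev" /proc/mounts; then
        echo "refusing: $dev or one of its partitions is mounted" >&2
        return 1
    fi
    # conv=fsync flushes the data to the drive before dd exits.
    dd if=/dev/zero of="$dev" bs=4M status=progress conv=fsync
}

# Usage (as root, replace sdX with the actual drive):
#   wipe_drive /dev/sdX
```

dd stops with a "No space left on device" message once it reaches the end of the drive; that is expected here. Using /dev/urandom as the input would write random data instead of zeroes.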
After that, nobody should be able to find any data on the drive. To make sure, you can run
$> dd if=/dev/sdX bs=4M | hexdump -C | less
The output should look similar to the following one:
The first line shows only zeroes and the * in the second line indicates that the rest is also zero.
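Instead of reading the hexdump by eye, the check can also be scripted. The sketch below strips all zero bytes from the input and tests whether anything is left; `is_all_zero` is a hypothetical helper name, and reading a block device usually requires root.

```shell
# Hypothetical helper: succeed only if the given file or device
# contains nothing but zero bytes.
is_all_zero() {
    # tr removes every NUL byte; head stops at the first byte that remains.
    leftover=$(tr -d '\000' < "$1" | head -c 1 | wc -c)
    [ "$leftover" -eq 0 ]
}

# Usage (replace sdX with the actual drive):
#   is_all_zero /dev/sdX && echo "drive is clean" || echo "data found"
```

On a clean drive this has to read everything, so it takes about as long as the wipe itself; on a drive that still holds data it stops at the first non-zero byte.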
I bought two hard drives from the German Informationstechnikzentrum Bund (ITZBund). The picture above proves that they follow the IT Grundschutzkatalog and securely erase hard drives before they sell them.