What I'm up to these days

Playing With Hard Disks for the Internet of the Future


Recently I was generously given an account on trash.town, a public Unix server made out of recycled computer parts. Trashtown is one of a handful of small, self-organized, non-commercial online communities that seem to have formed here and there across the internet over the last decade or so. They're a touch nostalgic, hearkening back to the earlier days of GeoCities and BBSes, but they're also a bit radical. In our current age of hegemonic social media companies, small-scale and decentralized services can help us imagine alternative futures for internet infrastructure, both physical and social.

Of course trash.town isn't just decentralized; it is also made out of trash, and that is what drew me to it in particular. Over the years I have accumulated lots of computer parts that mostly end up collecting dust on my shelf. It would be so cool if some of this stuff that I have lying around could become part of some kind of rad new internet. So I've decided to donate something, and in particular at least one hard disk.

In the rest of this post I'll take you through the steps and experiments I went through while preparing a hard disk for shipping to a third party.

It is often said that it's essential that you wipe data from a disk before shipping to protect the privacy of the previous owner. I would like to demonstrate to the reader and myself why this is true, how to recover improperly wiped data, and finally how to properly wipe data and confirm that it is gone.


Introduction

The drive I am using in this post is 80 GB with two NTFS partitions. The first is a 100 MB boot partition and the second occupies the remaining space. The tools I use are partclone, dd, scalpel, fdisk and mkfs.btrfs.

If you want to follow along, be careful: all of these tools can cause data loss. Preferably use a hard drive you don't need, and be sure to change the device names in the examples (/dev/sdb and so on) to ones appropriate to your system.
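One habit that helps here is checking that the disk you are about to touch is not mounted anywhere. Below is a minimal sketch of such a check. The function name is_mounted and the sample mounts file are my own inventions for illustration; on a real Linux system you would leave off the second argument so that it reads /proc/mounts.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE, or any partition whose name starts with DEVICE,
# appears in the first column of MOUNTS_FILE (default: /proc/mounts).
# Hypothetical helper for illustration, not part of any standard tool.
is_mounted() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Demo against a made-up mounts file so nothing real is inspected.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /tmp/mnt btrfs ro,relatime 0 0
EOF

if is_mounted /dev/sdb "$sample"; then
    echo "/dev/sdb is in use, unmount it first"
fi
```

The prefix match is deliberately crude: asking about /dev/sdb also catches /dev/sdb1, which is what you want before wiping a whole disk.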

Clone the partition

So that I could reset to the beginning if I made any mistakes, I began by backing up both partitions. To do so I used partclone, which quickly creates sparse backups of entire partitions: it saves only the blocks the filesystem marks as in use.

# partclone.ntfs -c -s /dev/sdb2 -o ntfssdb2_clone.img

The good thing about partclone is that it makes sparse backups automatically, so even though /dev/sdb2 takes up nearly all of the 80 GB disk, the clone shrank to a much smaller size, 23 GB in my case.

To restore our image to the partition (if it still exists) we run the following command:

# partclone.restore -s ntfssdb2_clone.img -o /dev/sdb2

Mess things up a little bit

Now we're gonna wipe our drive the wrong way by simply deleting our old partitions and reformatting the disk. After deleting both partitions with fdisk (gnome-disks or some other software works fine too) I created a single /dev/sdb1 partition taking up the whole disk and formatted it as btrfs.

## we format the partition we created previously
# mkfs.btrfs -L weirdolddisk /dev/sdb1

We can now mount the disk and poke around and confirm that nothing seems to be there.

# mkdir -p /tmp/mnt
# mount -t btrfs -o ro /dev/sdb1 /tmp/mnt
# ls /tmp/mnt

## that line is meant to be blank, there is nothing there.
# umount /tmp/mnt

Data Carving With Scalpel

All the data from before should more or less still be on the disk, even if we can't see it. To recover it we need a type of tool called a data carver. A data carver reads raw data from a disk and looks for telltale signs of well-known file types, e.g. JPEG headers and footers. Once it finds them, the data carver can attempt to recover the lost file and write it to an external storage medium.
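To make the idea concrete, here is a toy version of the first half of a carver's job: finding where JPEG files might start. It is only a sketch and not how scalpel itself is implemented. The function name find_jpeg_headers and the scratch file are made up for illustration, and piping every byte through od and awk would be far too slow for a whole disk.

```shell
#!/bin/sh
# find_jpeg_headers FILE
# Prints the byte offset of every ff d8 ff sequence in FILE, the three
# bytes that open a JPEG. Hypothetical helper for illustration only.
find_jpeg_headers() {
    od -An -v -tx1 "$1" | awk '{
        for (i = 1; i <= NF; i++) {
            a = b; b = c; c = $i        # slide a three-byte window
            if (a == "ff" && b == "d8" && c == "ff")
                print n - 2             # offset where the window starts
            n++
        }
    }'
}

# Demo: four bytes of filler, then the start of a JFIF header.
scratch=$(mktemp)
printf 'AAAA\377\330\377\340JFIF' > "$scratch"
find_jpeg_headers "$scratch"    # prints 4
```

A real carver goes further: from each candidate offset it searches forward for the matching footer, up to a maximum size, and writes out everything in between.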

For this experiment I used a data carving tool called scalpel. Scalpel needs to be configured as to which file fingerprints it should be looking for and attempting to recover. I am not after anything in particular so I just went with jpeg and gif files. Uncommenting the following lines of the default config in /etc/scalpel/scalpel.conf achieved the desired result.

# GIF and JPG files (very common)
gif y   5000000     \x47\x49\x46\x38\x37\x61    \x00\x3b
gif y   5000000     \x47\x49\x46\x38\x39\x61    \x00\x3b
jpg y   5242880     \xff\xd8\xff???Exif     \xff\xd9    REVERSE
jpg y   5242880     \xff\xd8\xff???JFIF     \xff\xd9    REVERSE
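Those hex strings are less cryptic than they look. As far as I understand the config format, each rule lists the file extension, whether the match is case sensitive, the maximum number of bytes to carve, the header bytes and the footer bytes. Here is a small sketch that turns a header into readable text. decode_signature is a name I made up, and it only handles plain \xNN escapes, not the ? wildcards in the jpg rules.

```shell
#!/bin/sh
# decode_signature STRING
# Prints the bytes named by a run of \xNN escapes, e.g. '\x47\x49\x46'.
# Hypothetical helper for illustration; wildcards are not supported.
decode_signature() {
    for hex in $(printf '%s' "$1" | sed 's/\\x/ /g'); do
        # Convert each hex pair to octal, the escape form POSIX printf accepts.
        printf "\\$(printf '%03o' "0x$hex")"
    done
    printf '\n'
}

decode_signature '\x47\x49\x46\x38\x37\x61'    # prints GIF87a
decode_signature '\x47\x49\x46\x38\x39\x61'    # prints GIF89a
```

So the two gif rules simply match the two versions of the GIF magic string.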

Once the configuration file is set up, the invocation for scalpel is very simple:

# scalpel /dev/sdb

After a bit, files should start showing up in ./scalpel-output. Poking around at the images there I saw mostly icons and other Windows assets, as well as some vacation photos from over a decade ago.

Destroy the Data

Now that we know scalpel and other data carving tools can easily recover data even after formatting, how do we go about ensuring that our data is unrecoverable in the future? We explicitly overwrite the data on the disk with garbage. dd is the standard Unix tool for reading and writing just about anything to a disk, so we will use that.

## This command will destroy all data on your drive
# dd if=/dev/zero of=/dev/sdb bs=32M status=progress

Linux users over the years have used variations of this command in attempts to troll one another into destroying their drives, and as such there's something spooky about running it. But now let's overcome our fear together by understanding it in detail.

if=/dev/zero specifies the input file: a special device that produces an endless stream of 0's.

of=/dev/sdb says we want to write all those zeros to our disk.

bs=32M increases the size of the blocks that dd reads and writes. A larger block size leads to faster writes up to a point. This one took about seven minutes for 80 GB on my machine.

status=progress adds a nice running progress readout to the output of dd. No longer do users need to nervously send signal USR1 every few seconds to dd's pid to confirm that nothing is going awry with their writes.
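If the command still feels spooky, the same options can be rehearsed on a throwaway file first. This is a minimal sketch under my own assumptions: rehearse_wipe and the scratch file are made up for illustration, and count= and conv=notrunc are only needed because a regular file has no fixed end. On a real disk, dd simply runs until it reaches the end of the device and stops with a "No space left on device" message, which is expected.

```shell
#!/bin/sh
# rehearse_wipe FILE MIB
# Fills FILE with MIB mebibytes of random "old data", then overwrites it
# in place with zeros. Hypothetical helper for illustration only.
rehearse_wipe() {
    target=$1
    mib=$2
    # Stand-in for the old contents of the disk.
    dd if=/dev/urandom of="$target" bs=1M count="$mib" 2>/dev/null
    # The wipe itself; conv=notrunc overwrites in place, as on a device.
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=notrunc 2>/dev/null
}

scratch=$(mktemp)
rehearse_wipe "$scratch" 4

# Peek at the result; od collapses repeated identical lines into a "*".
od -An -tx1 "$scratch" | head -n 2
```

The scratch file is left in place so you can poke at it; delete it when you are done.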

Once our command finishes, our drive will only contain 0's. But how do we know? What if we're really nervous and we'd just like to check and make sure that really nothing is there? dd is the swiss army knife of disks and can be used to read arbitrary blocks straight from the disk. We can use it to read only the first 512 bytes from our drive and then display them with the hexdump utility xxd to boost our confidence that our operation was a success.

# dd if=/dev/sdb bs=512 count=1 | xxd
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000140: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Yup, all 0's.
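Of course the first 512 bytes are only a single sector. Here is a minimal sketch of checking everything instead, using a made-up helper called is_all_zero. It is demonstrated on scratch files; pointing it at /dev/sdb would need root and would read the entire disk.

```shell
#!/bin/sh
# is_all_zero FILE
# Succeeds only if FILE contains nothing but zero bytes.
# Hypothetical helper for illustration only.
is_all_zero() {
    # tr drops every zero byte; head stops at the first byte that survives.
    leftover=$(tr -d '\000' < "$1" | head -c 1 | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}

# Demo: one file of pure zeros, one with a stray byte appended.
clean=$(mktemp)
dirty=$(mktemp)
dd if=/dev/zero of="$clean" bs=1024 count=64 2>/dev/null
cp "$clean" "$dirty"
printf 'x' >> "$dirty"

for f in "$clean" "$dirty"; do
    if is_all_zero "$f"; then echo "all zeros"; else echo "leftover data"; fi
done
```

The loop prints "all zeros" for the first file and "leftover data" for the second.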

Conclusion

I'm quite happy with how the whole experiment went. I feel more confident slicing and copying bits around disks than when I started.

As for the political dimensions of my mundane act of bit pushing and recycling, I probably can't emphasize enough how small it is in terms of impact. But in valence I really dig it. There should be no confusion that technology is something that arises out of civilization, and has all the attendant political and moral dimensions of any other social process. A decentralized internet of nice weirdos is an increasingly salient alternative to the current internet technology landscape, one that holds interesting political potentialities.