My data-recovery story

I was looking through wayback machine at snapshots of my website, when I came across one from 2005. It reminded me of something I had almost forgotten. At some point in 2005, the network card in my FreeBSD server started to die. I got myself a new card and set about replacing the dying one. I can’t recall why anymore, but I guess I had needed to disconnect the hard-drive at some point. I remember that after I plugged it back in and booted up, I was greeted by a screenful of terrifying error-messages. Something horrible had happened to the drive that held my home directory, my website source-code, and my database. I had lost about 6 years worth of posts and images on my website. My first instinct was to power down the machine to prevent anything more being written to the drive, which I immediately did. After that I think I tried a bunch of disk-recovery tools to try and recover my data. But this was difficult because the filesystem was UFS. I can’t remember if there were any UFS recovery tools at the time, or if I tried them, but I remember having tried almost everything I could think of.

Out of desperation, I think I finally decided to use dd. I started dumping the data from the drive using the lowest size-setting possible in dd (I want to say it is a byte, but I don’t really remember). I then piped this into a perl script that would examine each byte, looking for magic numbers. The drive had been corrupted so badly that there wasn’t even any trace of a coherent filesystem anymore. I knew that the data I was getting were most-probably fragmented, but I didn’t care at this point. I would guess the file-type by looking for magic numbers, and then I would start dumping that data into a file until I found an ending marker, or if the file-type didn’t have one, until the start of another magic number. I remember having various settings in the script so that I could tune its behavior, especially when dealing with false positives. My priority was to retrieve my pictures, website, programming projects, and database. For my source-code I only had to look for ASCII data. For pictures I looked for file markers for JPG, PNG, and GIFs. The database was difficult though, because I was using MySQL. By sheer chance, I had decided to take a SQL dump of my website’s database the day before for backup purposes (ironically, on the very drive that would die the next day). This was ASCII data, and so it was one of the first things my script found.

I ran this script over a couple of hours I think, and then for most of the next day for good measure. Then I began the tedious process of sifting through these files, weeding out false positives. All said and done, I retrieved a good chunk of my data. I think I got back around 80% of my pictures, and almost all of my code and website source. It was a scary few days, but I’m glad that my desperation drove me to try something like this!