A hard drive image is a complete, exact copy of an entire hard drive. Making one is sometimes called ghosting or mirroring a drive. Using Linux we can make, verify, and compress hard drive images easily. We can then also decompress the image, write it back to the drive, and verify that the result is correct.
By making an exact copy of the hard drive we can easily return to that point of time at a later date. For computer forensics this is a must. For computer repair it certainly is handy.
One might be tempted to use this method for backing up files. However, the time involved in creating an image, and the fact that the computer can’t be used during that time, make other methods much more practical.
I ended up using Knoppix after playing around with several distributions for imaging. Knoppix seemed to have a huge variety of system tools compared to other distributions, and the big winner was that Samba was installed on the LiveCD, so I could use network shares to write the needed files. I did end up with some networking issues but I fixed them quickly.
That being said, just about any distribution should work, as the tools being used are standard core utilities for Linux. I would recommend staying away from smaller distributions, as they tend to have busybox instead of the actual core utils. I came across some problems using the busybox version of dd on Damn Small Linux.
First you’ll need to download and burn a Linux LiveCD. I used the Knoppix 5.1.1 CD. Next you’ll need to boot the computer you want to image using the LiveCD. Hopefully all goes well and you’re now greeted with some sort of graphical user interface.
We don’t want any of that silliness. Find a command prompt as quickly as possible.
Now most of the following commands must be run as root. I’m going to assume that you are root and leave sudo out of the commands. If you are not root, please become so or modify the commands to run with root privileges. You may need to set or reset some passwords to gain access.
Once you do, the work is fairly easy, but it will take quite a long time.
The first step is to get an MD5 hash for the hard drive you want to image. MD5 is a cryptographic hash algorithm used for security and to check the integrity of files. We’ll need this to verify that we have properly made our image. Unfortunately I skipped this step my first time, which wasted a few days of my time. We are going to assume that your hard drive shows up as /dev/hda, so to get the MD5 hash of the drive issue the command:
md5sum /dev/hda
This will take a while, and it may appear your computer is unresponsive during this time. Be patient. Eventually, after what could be several hours depending on the size and speed of the drive, you should get a single line of output: a string of 32 hexadecimal characters followed by the device name.
The characters that make up your string will be specific to your drive. You’ll want to write that string down, or be a smart guru and add > hda.md5 to the end of the md5sum command so the result is saved to a file.
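As a rough sketch of the save-to-a-file idea, the following can be tried safely because it uses a small scratch file as a stand-in for the drive. The names WORKDIR, SOURCE and CHECK_RESULT are made up for this illustration; on the real system SOURCE would be /dev/hda and everything would run as root. It assumes an md5sum that supports the -c (check) option, as the GNU version does.

```shell
# Scratch stand-in for the drive (an assumption for this demo);
# on the real system this would be SOURCE=/dev/hda.
WORKDIR=$(mktemp -d)
SOURCE="$WORKDIR/fake-drive.bin"
printf 'pretend this is a whole hard drive\n' > "$SOURCE"

# Save the hash, together with the name it was computed from, to a file.
md5sum "$SOURCE" > "$WORKDIR/hda.md5"

# Later, md5sum -c re-reads the source named in that file and compares
# its current hash against the saved one.
if md5sum -c "$WORKDIR/hda.md5" > /dev/null; then
    CHECK_RESULT="match"
else
    CHECK_RESULT="MISMATCH"
fi
echo "checksum check: $CHECK_RESULT"
```

Note that md5sum -c checks whatever path is recorded in the file, so against a saved hash of /dev/hda it re-checks the drive itself, not an image file.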
Now that you’ve got your checksum, you are ready to image the entire drive. The command is fairly simple:
dd if=/dev/hda of=/where/to/put/hda.img
Every time you use dd you’ll want to double check to make sure you have everything correct before hitting enter. One small mistake of switching your input file (if) and output file (of) and you don’t have a hard drive to speak of anymore. Please take care.
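One way to make that double check a habit is a small guard that runs before dd when creating an image. This is only a sketch: check_dd_args is a hypothetical helper I made up for illustration, not something that ships with dd, and it only covers the imaging direction (when restoring, the output is supposed to be a drive).

```shell
# check_dd_args INPUT OUTPUT  (hypothetical helper, not part of dd)
# Refuses the easiest mistakes when *creating* an image:
#   - an input that cannot be read
#   - an output that is itself a block device (a drive)
#   - an output file that already exists and would be overwritten
check_dd_args() {
    input=$1
    output=$2
    if [ ! -r "$input" ]; then
        echo "refusing: cannot read input $input" >&2
        return 1
    fi
    if [ -b "$output" ]; then
        echo "refusing: output $output is a block device" >&2
        return 1
    fi
    if [ -e "$output" ]; then
        echo "refusing: output $output already exists" >&2
        return 1
    fi
    return 0
}

# Intended use, with the paths from this article:
# check_dd_args /dev/hda /where/to/put/hda.img && dd if=/dev/hda of=/where/to/put/hda.img
```

Because dd only runs when the check succeeds, swapping the two arguments by accident ends in an error message instead of an overwritten drive.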
There’s actually a bit more we can do here. If we try to image a 40 GB hard drive we will obviously need 40 GB of free space. However, we can compress the image while it is being generated and use far less. You should still plan on needing as much space as the full capacity of the drive: even a 300 GB hard drive with only 5 GB of files will require 300 GB of space for an uncompressed image. Compression might bring the requirement down considerably, but there is no easy way of knowing in advance by how much. Because dd copies bits at a low level, it doesn’t know that the other 295 GB are unused. If that area happens to hold a randomized pattern of bits, it may not compress well.
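To get a feel for how much the contents matter, here is a small experiment that compresses 1 MiB of zeros and 1 MiB of random bytes. It works only on scratch files in a temporary directory; the variable names are just for this illustration.

```shell
WORKDIR=$(mktemp -d)

# 1 MiB of zeros (like never-used disk space) and 1 MiB of random bytes
# (like leftover or encrypted data).
dd if=/dev/zero    of="$WORKDIR/zeros.bin"  bs=1024 count=1024 2>/dev/null
dd if=/dev/urandom of="$WORKDIR/random.bin" bs=1024 count=1024 2>/dev/null

# Size in bytes of each file after gzip; tr strips any padding wc adds.
ZERO_SIZE=$(gzip -c "$WORKDIR/zeros.bin" | wc -c | tr -d ' ')
RANDOM_SIZE=$(gzip -c "$WORKDIR/random.bin" | wc -c | tr -d ' ')

echo "zeros:  $ZERO_SIZE bytes after gzip"
echo "random: $RANDOM_SIZE bytes after gzip"
```

The zeros should shrink to a tiny fraction of their size, while the random data should stay at roughly 1 MiB. An image of a real drive will land somewhere between those two extremes.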
But I’ve digressed. To compress the image while creating it forget the above dd command and issue:
dd if=/dev/hda | gzip > /where/to/put/hda.img.gz
Again this command will take forever and a day, and your computer may seem unresponsive. However, there is some hope here. If you send the right signal to dd, it will display a basic status report. To do this you’ll first have to find the process ID, which is easily found by typing:
ps -A | grep 'dd'
This should post one line which has the process ID first. Now say this id is 1234. I can then issue the command:
kill -USR1 1234
The USR1 signal tells dd to just write a status output and keep on trucking. You should see something like:
3385223+0 records in
3385223+0 records out
1733234176 bytes (1.7 GB) copied, 6.42173 seconds, 270 MB/s
There are elaborate shell scripts to tweak dd to produce better status reports, but I did just fine by putting the above command in my second prompt and occasionally walking by and hitting up and enter.
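If walking by gets old, a few lines of shell can do the hitting-up-and-enter for you. This is a minimal sketch, not one of the elaborate scripts: report_dd_progress and POKES_SENT are names I made up, and it assumes a dd that answers USR1 with a status report, as GNU dd does. A process that does not handle USR1 is normally terminated by it, so make sure the process ID really belongs to dd.

```shell
# report_dd_progress PID INTERVAL  (hypothetical helper name)
# Sends USR1 to PID every INTERVAL seconds and stops as soon as the
# signal can no longer be delivered, i.e. once the process has exited.
POKES_SENT=0
report_dd_progress() {
    pid=$1
    interval=$2
    while kill -USR1 "$pid" 2>/dev/null; do
        POKES_SENT=$((POKES_SENT + 1))
        sleep "$interval"
    done
}

# Intended use from the second prompt, where 1234 is the ID found with ps:
# report_dd_progress 1234 60
```

The status lines still appear in the terminal where dd itself is running, since that is where dd writes them. Newer versions of GNU dd also accept status=progress, which prints a running total without any signals.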
By now you should have wasted about a day of your life and have an image of the hard drive sitting somewhere safe. We first want to make sure the image is correct. If you gzipped the image, you’ll need to uncompress it first, because the hash has to be taken over the uncompressed data. Otherwise simply take the md5sum of the image file.
Again this will take a bit. After this number pops up verify that it matches the previous number exactly. If it does you should be safe. If it doesn’t something went wrong. Repeat the whole thing over again.
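For a gzipped image, the uncompressed data can also be streamed straight into md5sum, so no space is needed for an uncompressed copy. The sketch below shows the idea on a small scratch file standing in for the drive; WORKDIR, SOURCE, DRIVE_HASH, IMAGE_HASH and VERIFY_RESULT are names made up for this illustration.

```shell
# Scratch stand-in for the drive (an assumption for this demo);
# on the real system this would be SOURCE=/dev/hda.
WORKDIR=$(mktemp -d)
SOURCE="$WORKDIR/fake-drive.bin"
printf 'pretend this is a whole hard drive\n' > "$SOURCE"

# Hash of the "drive" taken before imaging; cut keeps only the hash field.
DRIVE_HASH=$(md5sum < "$SOURCE" | cut -d ' ' -f 1)

# Compressed image, made the same way as in the article.
dd if="$SOURCE" 2>/dev/null | gzip > "$WORKDIR/hda.img.gz"

# Decompress to stdout and hash the stream; nothing is written to disk.
IMAGE_HASH=$(gzip -dc "$WORKDIR/hda.img.gz" | md5sum | cut -d ' ' -f 1)

if [ "$DRIVE_HASH" = "$IMAGE_HASH" ]; then
    VERIFY_RESULT="image matches drive"
else
    VERIFY_RESULT="image does NOT match drive"
fi
echo "$VERIFY_RESULT"
```

Comparing only the hash fields avoids a false mismatch caused by the different names md5sum prints after each hash.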
At this point you have an image of the entire drive. You are now free to do anything you want to the hard drive and can get back to where you started easily by putting the image back on the hard drive. To do this simply invoke the command:
dd if=/where/to/put/hda.img of=/dev/hda
Again we can use the USR1 trick to display status. Once this command has completed we can again verify the drive by taking its md5sum like we did at the very beginning. Provided this all matches up, you should be able to remove your LiveCD and reboot the system right back to where you started.
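If the image was compressed with gzip, it has to be decompressed on the way back to the drive. One way to do that, sketched here on scratch files rather than a real drive, is to pipe gzip -dc into dd. TARGET stands in for /dev/hda, and all the names below are assumptions made for this demo.

```shell
WORKDIR=$(mktemp -d)

# Pretend drive contents and a gzipped image of them.
printf 'pretend this is a whole hard drive\n' > "$WORKDIR/fake-drive.bin"
gzip -c "$WORKDIR/fake-drive.bin" > "$WORKDIR/hda.img.gz"

# Stand-in for the drive being restored; on the real system: TARGET=/dev/hda
TARGET="$WORKDIR/restored-drive.bin"

# Decompress to stdout and let dd write the stream to the target.
gzip -dc "$WORKDIR/hda.img.gz" | dd of="$TARGET" 2>/dev/null

# cmp -s is silent and succeeds only if the two files are byte-identical.
if cmp -s "$WORKDIR/fake-drive.bin" "$TARGET"; then
    RESTORE_RESULT="identical"
else
    RESTORE_RESULT="different"
fi
echo "restored copy is $RESTORE_RESULT"
```

On a real restore there is no original file left to cmp against, which is why the md5sum of the drive recorded at the start is the thing to compare with.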