Tuesday, February 24, 2015

Automating DFIR - How to series on programming libtsk with python Part 6

Hello Reader,
         I really hope you've read all the prior posts in this series because it just keeps building from here! Here are the previous parts if you need to refer to them; each contains the knowledge needed to understand what we talk about in this post.

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator

Following this post the series continues:

Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system 
Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies  

What you'll need for this part:

1. Download the sample E01 located here: https://mega.co.nz/#!uhgQzZpL!F9aoPo6pZ_m9cKpYoK5ND_NY26GjCg7YKS60StVrl98

What you should already have installed at this point (from Part 1):

1. Python 2.7 32bit
2. pytsk
3. pyewf

Accessing an E01 image and extracting a file

The time has arrived. Now that we've explained a lot of Python constructs and background for creating these DFIR automation scripts, we are prepared to talk about libewf and its Python binding, pyewf.

Now libtsk and its Python wrapper pytsk are no slouch; they can open the following image formats:

  • Single Raw Images
  • Split Raw Images
  • Single VMDKs, but not their snapshots. Full VMDK support is available from pyvmdk
  • Single VHD, full VHD support is available from pyvhd
  • Live disks

With the pyewf library we can access the following image formats:

  • Single E01 or Expert Witness Format images
  • Split E01 image
  • Compressed Raw Images (aka smart format .S01)
  • Single non encrypted Ex01 v1 images
  • Split non encrypted Ex01 v1 images
  • non encrypted Lx01 v1 images
  • L01 images

There is another library for AFF image access, but that is a topic for another post.

So the first thing we need to do is import the pyewf library using the import statement, which you should be very familiar with by now:

import pyewf

The next thing we need to do is gather up all the possible parts of the image we want to load. Most examiners create multi-part images in the process of their forensic imaging. The multi-part imaging preference first came around because we wanted to archive images to media such as CDs or DVDs, but there is another nice thing about multi-part images. You can hash each part of a multi-part image to verify the image's consistency on a per-part basis rather than having to hash the contents of the image itself. This is especially useful when you are copying the image to a new drive and you want to know which image segment didn't copy over correctly. Comparing the segment hashes lets you replace a single bad segment rather than copying over the whole image again.
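The per-segment check described above can be sketched with nothing but the standard library. This is only an illustration: the helper names hash_segments and mismatched_segments are made up for this example, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
import os


def hash_segments(paths, chunk_size=1024 * 1024):
    """Return {segment file name: SHA-256 hex digest} for each segment."""
    digests = {}
    for path in paths:
        digest = hashlib.sha256()
        with open(path, "rb") as segment:
            # Read in chunks so a large segment never has to fit in memory.
            for chunk in iter(lambda: segment.read(chunk_size), b""):
                digest.update(chunk)
        # Key by file name so source and copy can live in different folders.
        digests[os.path.basename(path)] = digest.hexdigest()
    return digests


def mismatched_segments(source, copy):
    """List segment names whose digest is missing or different in the copy."""
    return sorted(name for name in source if copy.get(name) != source[name])
```

Hash the segments on the source drive, hash them again on the destination, and mismatched_segments tells you which files need to be copied over again.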

pyewf gives us a handy way to gather up all the sequential parts of a multi-part image with a single function named glob. The glob function takes the file name of the first segment and, following the rules for how multi-part image extensions are sequentially named (E01-EZZ), returns the full list of segment file names.
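To get a feel for the naming rule glob follows, here is a small sketch of the sequence. It assumes the usual convention of E01 through E99 followed by EAA through EZZ, and the helper name segment_extension is hypothetical. It is not part of pyewf, and you don't need it, since glob does this work for you.

```python
import string


def segment_extension(number):
    """Extension of the Nth segment (1-based), assuming E01..E99 then EAA..EZZ."""
    if number < 1:
        raise ValueError("segment numbers start at 1")
    if number <= 99:
        return "E%02d" % number
    letters = string.ascii_uppercase
    index = number - 100  # 0 is EAA, 1 is EAB, and so on
    if index >= len(letters) ** 2:
        raise ValueError("past EZZ, which this sketch does not cover")
    return "E" + letters[index // 26] + letters[index % 26]
```

Under that assumption segment 100 comes out as EAA, the first name after E99.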

filenames = pyewf.glob("SSFCC-Level5.E01")



Here you can see we are storing the result of glob for our example E01 in a variable named filenames. Our example image is a single image segment, but our code will work for both single and split image files. The next thing we need to do is open up a handle to our image. We do this first by creating a handle object using handle() and storing it in a variable. In the code below we are calling handle and storing the result in the variable ewf_handle.


ewf_handle = pyewf.handle()

Next we need to use this new object to open up our image. We use the open function contained within the handle object to do so. In the code below we are calling the open method stored within ewf_handle on the filenames variable we made. 

ewf_handle.open(filenames)

Now when we used pytsk we next needed to create an Img_Info object, and we still do. However, since pytsk does not read the formats that pyewf handles, we need an Img_Info object that gets its data through pyewf. We are calling ewf_Img_Info here, passing in our ewf_handle object and storing the result in our imagehandle variable as seen below:

imagehandle = ewf_Img_Info(ewf_handle)

Now ewf_Img_Info is not something provided by the pyewf library. Instead it's a new class we are creating in our program that is based on pytsk's Img_Info class but extends it to handle the pyewf supported formats. So to do this we need to create a class named ewf_Img_Info and declare that it inherits from pytsk's Img_Info.

class ewf_Img_Info(pytsk3.Img_Info):

The class specified in () after ewf_Img_Info is the class we are inheriting from. Next we need to create a constructor for our class so that when we invoke ewf_Img_Info it can prepare a libtsk compatible object that we can use going forward. It looks something like this:

  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)

Note the class we are building here is taken from the example provided at: https://github.com/libyal/libewf/wiki/Development 


So the first thing we are doing is creating our constructor function named __init__, which takes self and the ewf_handle object as parameters. The next thing we are doing is taking a reference to ewf_handle and storing it within the class as self._ewf_handle. Lastly we are using a neat function called super to call the constructor of the class we are inheriting from. The call looks like super(ewf_Img_Info, self).__init__(...), where super is given our class and our instance, and __init__ is the constructor of the parent class, which in this case is pytsk3.Img_Info. super can call any method of the parent class, which becomes important when we get to method overriding below. When we call the parent constructor we need to pass it two arguments, the url and the type. The url is set to "" and the type is set to TSK_IMG_TYPE_EXTERNAL. For a full list of image types you can pass into Img_Info go here and look at the enumerations section: http://www.sleuthkit.org/sleuthkit/docs/api-docs/tsk__img_8h.html
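If super is new to you, the pattern is easier to see without any forensic libraries involved. The sketch below uses two made-up classes, BaseImage and WrappedImage, as stand-ins for pytsk3.Img_Info and our ewf_Img_Info. Nothing in it comes from pytsk or pyewf.

```python
class BaseImage(object):
    """Hypothetical stand-in for a parent class such as pytsk3.Img_Info."""

    def __init__(self, url="", type=None):
        self.url = url
        self.type = type

    def describe(self):
        return "base image of type %s" % self.type


class WrappedImage(BaseImage):
    def __init__(self, handle):
        # Keep our own reference first, then let the parent set itself up.
        self._handle = handle
        super(WrappedImage, self).__init__(url="", type="EXTERNAL")

    def describe(self):
        # Overriding: reuse the parent's answer and add to it.
        parent_text = super(WrappedImage, self).describe()
        return parent_text + " backed by " + repr(self._handle)


image = WrappedImage("fake-handle")
print(image.describe())
```

The child's constructor stores what it needs and hands the rest to the parent, and the overridden describe shows that super can reach the parent's other methods as well as its constructor.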

Great, now our constructor has built an Img_Info object that is based on the pytsk Img_Info class, so the object will be compatible with all the other pytsk functions we've called before. Now that we have an Img_Info object you might think we are done, but we need to do one more thing. We need to override the functions provided by Img_Info for closing the handle to the image, reading data from an image and getting the size of the media contained within the image. If we used the base pytsk Img_Info functions they would fail as they do not understand the image formats that pyewf handles for us. So to override just those functions we just need to declare them within our new ewf_Img_Info class as follows:

  def close(self):
    self._ewf_handle.close()

The close function defined above will call the ewf_handle object's close method instead of Img_Info's close method.

  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)

The read function defined above takes the offset of where to start reading and the total amount to read like the standard Img_Info method would, but it does the reading using the ewf_handle object's version of seek and read. It then returns the data read by ewf_handle's read function to whoever called it.
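The seek-then-read pattern works with any object that offers seek and read, so you can try it without an image at all. In this sketch io.BytesIO stands in for the pyewf handle, and the HandleReader class is a hypothetical name used only for this example.

```python
import io


class HandleReader(object):
    """Hypothetical adapter: read(offset, size) built on seek() and read()."""

    def __init__(self, handle):
        self._handle = handle

    def read(self, offset, size):
        # Move to the requested position, then read from there.
        self._handle.seek(offset)
        return self._handle.read(size)


# io.BytesIO plays the part of the pyewf handle in this sketch.
reader = HandleReader(io.BytesIO(b"0123456789ABCDEF"))
first = reader.read(10, 4)
second = reader.read(2, 3)  # an earlier offset works too, order doesn't matter
```

Because every call seeks before it reads, the caller can ask for any offset in any order, which is how libtsk requests data when it walks a file system.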

  def get_size(self):
    return self._ewf_handle.get_media_size()

The get_size function is using get_media_size from the ewf_handle object to return the total size of the media rather than the standard Img_Info get_size method. 

There we go, with those functions now defined we've created a pytsk compatible object that we can now pass into the rest of our code from part 3 as if we were dealing with an image format pytsk supports natively. The complete code follows:

#!/usr/bin/python
# Sample program for step 5 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
     
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
  def close(self):
    self._ewf_handle.close()
  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)
  def get_size(self):
    return self._ewf_handle.get_media_size()

filenames = pyewf.glob("SSFCC-Level5.E01")
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    fileobject = filesystemObject.open("/$MFT")
    print "File Inode:",fileobject.info.meta.addr
    print "File Name:",fileobject.info.name.name
    print "File Creation Time:",datetime.datetime.fromtimestamp(fileobject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S')
    outFileName = str(partition.addr)+fileobject.info.name.name
    print outFileName
    # Open in binary mode so the extracted bytes are written unmodified
    outfile = open(outFileName, 'wb')
    filedata = fileobject.read_random(0,fileobject.info.meta.size)
    outfile.write(filedata)
    outfile.close()

You can grab this code from our series GitHub here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v5.py

You may have noticed I dropped the code we added in part 5 for automatic elevation. That is because we do not need to run as administrator to access an image file, and if you don't need administrative privileges your code shouldn't run with them. In part 7 of our series we will cover taking in command line parameters so you don't have to hard code your image file names, after which we will move on to hashing and recursing through file systems.