Monday, May 25, 2015

Automating DFIR - How to series on programming libtsk with python Part 13

Hello Reader,
          This is part 13 of a many-part series. I don't want to commit myself to listing all the planned parts, as I feel that curses me to never finish. We've come a long way since the beginning of the series, and in this part we solve one of the most persistent issues most of us have: getting easy access to volume shadow copies.

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Recursively through multiple file system types

What you will need for this part:

You will need to download and install the pyvshadow MSI found here for Windows. This is the Python library that provides us with volume shadow access. It binds to the libvshadow library:
https://github.com/log2timeline/l2tbinaries/blob/master/win32/pyvshadow-20150106.1.win32-py2.7.msi

The vss helper library I found on the plaso project website:
https://github.com/dlcowen/dfirwizard/blob/master/vss.py

You will need to download the following sample image that has volume shadows on it:
https://mega.nz/#!LlRFjbZJ!s0k263ZqKSw_TBz_xOl1m11cs2RhIIDPoZUFt5FuBgc
Note: this is a 4GB rar file that uncompresses to a 25GB raw image.

The Concept:

Volume shadow copies are different from any other file system we've dealt with thus far. The volume shadow subsystem wraps around an NTFS partition (I'm not sure how it will be implemented in ReFS) and stores within itself a differential, cluster-based backup of changes that occur on the disk. This means that every view our tools give us of the data contained within the volume shadow copies is an emulated file system view, built from the database of differential cluster maps stored within the volume shadow system. This is what the libvshadow library does for us: it parses the differential database and creates a view of the data contained within it as a file system that can be parsed.

As with all things technical there is a catch. The base libvshadow was made to work with raw images only. There is, of course, a way around this, already coded into dfvfs, which I am testing for a future part so we can access shadow copies from other types of forensic images.


As per Joachim Metz:
1. VSC is a large subsystem in Windows that can even store file copies on servers
2. VSS (volsnap on-disk storage) is not a file system but a volume system; it lies below NTFS. I opt to read: https://googledrive.com/host/0B3fBvzttpiiSZDZXRFVMdnZCeHc/Paper%20-%20Windowless%20Shadow%20Snapshots.pdf
Which means I am grossly oversimplifying things. In short, I am describing how a volume system that exists within an NTFS volume is being interpreted as a file system for you by our forensic tools. Do not confuse VSS snapshots for actual complete volumes; they are differential, cluster-based records.

The Code:

The first thing we have to do is import the new library you just downloaded and installed, as well as the helper class found on the Github.


import vss
import pyvshadow

Next, since our code is NTFS specific, we need to extend our last multi-filesystem script to do something special if it detects that it's accessing an NTFS partition:

print "File System Type Detected .",filesystemObject.info.ftype,"."
  if (str(filesystemObject.info.ftype) == "TSK_FS_TYPE_NTFS_DETECT"):
    print "NTFS DETECTED"
We do this, as seen above, by comparing the string "TSK_FS_TYPE_NTFS_DETECT" to the converted enumerated value contained in filesystemObject.info.ftype. If the values match then we are dealing with an NTFS partition, which we can test for volume shadow copies.

To do the test we are going to start using our new libraries:

    volume = pyvshadow.volume()
    offset=(partition.start*512)
    fh = vss.VShadowVolume(args.imagefile, offset)
    count = vss.GetVssStoreCount(args.imagefile, offset)

First we create a pyvshadow object by calling the volume() constructor and storing the object in the variable named volume. Next we declare the offset again as the beginning of our detected NTFS partition. Then we use our vss helper library for two different purposes. The first is to get a volume object that we can work with to access the volume shadows stored within. When we call the VShadowVolume function in our helper vss class we pass it two arguments: the name of the raw image we passed in to the program, and the offset to the beginning of the NTFS partition stored within the image.

The second vss helper function we call is GetVssStoreCount, which takes the same two arguments but does something very different. As the name implies, it returns the number of shadow copies present on the NTFS partition. The returned value is a count starting at 1, but the actual indexes of the shadow copies start at 0, meaning the last valid index is count - 1. The other thing to know, based on my testing, is that you may get more volume shadows returned than are available on the native system. This is because when libvshadow parses the database it shows all available instances, including volume shadow copies that have been deleted by the user or system but still partially exist within the database.
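Spelled out with a concrete number (a minimal sketch, assuming GetVssStoreCount returned 3), the indexing works like this:

count = 3                  # what GetVssStoreCount might return
vstore = 0
while (vstore < count):    # valid store indexes are 0, 1 and 2
  print "will open shadow copy index", vstore
  vstore = vstore + 1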

Not all NTFS partitions contain volume shadow copies so we need to test the count variable to see if it contains a result:

    if (count):
      vstore=0
      volume.open_file_object(fh)

If it does contain a result (the helper returns nothing if there are no volume shadows on the partition) then we need to do two things. First we keep track of which volume shadow we are working with, which always begins at 0. Then we have the pyvshadow volume object we created in the code above open the fh volume object.

Once we have the pyvshadow volume object working with our vss helper class's fh volume object, we can start getting to the good stuff:

      while (vstore < count):
        store = volume.get_store(vstore)
        img = vss.VShadowImgInfo(store)

We are using a while loop here to iterate through all of the available shadow copies using the count - 1 logic we discussed before (indexes run from 0 up to count - 1). Next we ask the pyvshadow volume object to return a view of the volume shadow copy we are working with (whatever value vstore is currently set to) from the get_store function, and store it in the 'store' variable.

We then take the object stored in the 'store' variable and call the vss helper class function VShadowImgInfo, which returns an ImgInfo object that we can pass into pytsk and that will work with our existing code:

        vssfilesystemObject = pytsk3.FS_Info(img)
        vssdirectoryObject = vssfilesystemObject.open_dir(path=dirPath)
        print "Directory:","vss",str(vstore),dirPath
        directoryRecurse(vssdirectoryObject,['vss',str(vstore)])
        vstore = vstore + 1

So we are now working with our volume shadow copy as we would any other pytsk object. We change what we print in the directory line to include which shadow copy we are working with. Next we change the parentPath variable we pass into the directoryRecurse function from [] (the empty list we used in our prior examples) to a list with two members: the first is the string 'vss' and the second is the string version of which shadow copy we are currently about to iterate and search. This is important so that we can uniquely export each file that matches the search expression without overwriting the file exported from a prior shadow copy.

Lastly we need to increment the vstore value for the next round of the while loop.
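To see what that buys us, here is how the outputPath format string from part 10 plays out with the new parentPath list (the partition number 3 is just an example value):

outputPath = './%s/%s/' % (str(partition.addr), '/'.join(parentPath))
# partition.addr == 3 and parentPath == ['vss', '1'] gives './3/vss/1/'
# the same file in shadow copy 2 lands in './3/vss/2/' instead of
# overwriting what was exported from shadow copy 1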

Only one more thing to do. We need to search the actual file system on this NTFS partition, and define an else condition to continue to handle all the other file systems that our libtsk version supports:

      #Capture the current volume as well
      directoryObject = filesystemObject.open_dir(path=dirPath)
      print "Directory:",dirPath
      directoryRecurse(directoryObject,[])
  else:
      directoryObject = filesystemObject.open_dir(path=dirPath)
      print "Directory:",dirPath
      directoryRecurse(directoryObject,[])

We don't have to change any other code! That's it! We now have a program that will search, hash and export like before, but that now has volume shadow access without any other programs, procedures or drivers.
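For reference, here is the whole NTFS branch assembled from the snippets above into one place. Treat this as a sketch rather than the verbatim script: names like args.imagefile, dirPath, and directoryRecurse come from the earlier parts of the series, and your surrounding loop may differ slightly.

  if (str(filesystemObject.info.ftype) == "TSK_FS_TYPE_NTFS_DETECT"):
    print "NTFS DETECTED"
    volume = pyvshadow.volume()
    offset = (partition.start*512)
    fh = vss.VShadowVolume(args.imagefile, offset)
    count = vss.GetVssStoreCount(args.imagefile, offset)
    if (count):
      vstore = 0
      volume.open_file_object(fh)
      # walk every shadow copy store, index 0 through count - 1
      while (vstore < count):
        store = volume.get_store(vstore)
        img = vss.VShadowImgInfo(store)
        vssfilesystemObject = pytsk3.FS_Info(img)
        vssdirectoryObject = vssfilesystemObject.open_dir(path=dirPath)
        print "Directory:","vss",str(vstore),dirPath
        directoryRecurse(vssdirectoryObject,['vss',str(vstore)])
        vstore = vstore + 1
    #Capture the current volume as well
    directoryObject = filesystemObject.open_dir(path=dirPath)
    print "Directory:",dirPath
    directoryRecurse(directoryObject,[])
  else:
    directoryObject = filesystemObject.open_dir(path=dirPath)
    print "Directory:",dirPath
    directoryRecurse(directoryObject,[])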

Running the program against the sample image looks like this:
E:\development>python dfirwizard-v12.py -i image.dd -t raw -o rtf.cev -e -s .*rtf
Search Term Provided .*rtf
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 2048
Partition has no supported file system
2 NTFS (0x07) 2048s(1048576) 204800
File System Type Detected . TSK_FS_TYPE_NTFS_DETECT .
NTFS DETECTED
WARNING:root:Error while trying to read VSS information: pyvshadow_volume_open_file_object: unable to open volume. libvshadow_io_handle_read_volume_header: invalid volume identifier.
3 NTFS (0x07) 206848s(105906176) 52219904
File System Type Detected . TSK_FS_TYPE_NTFS_DETECT .
NTFS DETECTED
Directory: vss 0 /
Directory: vss 1 /
Directory: vss 2 /
Directory: /
4 Unallocated 52426752s(26842497024) 2048
Partition has no supported file system

The resulting exported data directory structure (shown as a screenshot in the original post) has a folder per partition number, with vss/0, vss/1 and so on beneath it.
You can access the Github of all the code here: https://github.com/dlcowen/dfirwizard

and the code needed for this post here:

Next part, let's try to do this on a live system!

Sunday, May 10, 2015

Automating DFIR - How to series on programming libtsk with python Part 12

Hello Reader,
      How has a month passed since the last entry? To those reading this, I do intend to continue this series for the foreseeable future, so don't give up! In this part we will talk about accessing non-NTFS partitions and the file system support available to you in pytsk.

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system 

Following this post the series continues:

Part 13 - Accessing Volume Shadow Copies 

In this series so far we've focused on NTFS because that's where most of us spend our time investigating. However, the world is not Windows alone, and luckily for us the sleuthkit library that pytsk binds to is way ahead of us. Here is the full list of file systems that the sleuthkit library supports:

  • NTFS 
  • FAT12 
  • FAT16 
  • FAT32 
  • exFAT
  • UFS1 (FreeBSD, OpenBSD, BSDI ...)
  • UFS1b (Solaris - has no type)
  • UFS2 - FreeBSD, NetBSD.
  • Ext2 
  • Ext3 
  • SWAP 
  • RAW 
  • ISO9660 
  • HFS 
  • Ext4 
  • YAFFS2 


Now, that is what the current version of the sleuthkit supports; pytsk3, however, is compiled against an older version. I've tested the following file systems and confirmed they work with pytsk3:
  • NTFS
  • FAT12
  • FAT16
  • FAT32
  • EXT2
  • EXT3
  • EXT4
  • HFS
Based on my testing I know that exFAT does not appear to be supported in this binding, and from the testing of reader Hans-Peter Merkel I know that YAFFS2 is also not currently supported. I'm sure in the future, when the binding is updated, these file systems will come into scope.
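If you want to see which file system constants your particular pytsk3 build was compiled with, one rough check (just a sketch; having the constant present doesn't strictly guarantee the file system parses) is to list the TSK_FS_TYPE enum values the binding exposes:

import pytsk3

# print every file system type constant compiled into this pytsk3 build
for attr in sorted(dir(pytsk3)):
  if attr.startswith("TSK_FS_TYPE_"):
    print attr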

So now that we know what we can expect to work, let's change our code from part 10 to work against any supported file system, and allow it to work with multiple image types to boot.

What you will need:
I'm using a couple of sample images from the CFReDS project for this part, as they have different partitions for different variants of the same file system type. For instance, the ext sample image we will use has ext2, ext3, and ext4 partitions all on one small image. Pretty handy!

The first thing we need to change is how we open our images, to support multiple image types. We are going to be working with raw and e01 images all the time, and keeping two separate programs to work with each seems dumb. Let's change our code to allow us to specify which kind of image we are working with; in the future we may automate that as well!

We need to add a new required command line option where we specify the type of image we are going to be working with:

argparser.add_argument(
    '-t', '--type',
    dest='imagetype',
    action="store",
    type=str,
    default=False,
    required=True,
    help='Specify image type e01 or raw'
    )



We are defining a new flag (-t or --type) to pass in the type of image we are dealing with. We are then storing our input in the variable imagetype and making this option required.

Now we need to test our input and call the proper Img_Info class to deal with it. First let's deal with the e01 format. We are going to move all the pyewf-specific code into this if block:

if (args.imagetype == "e01"):
  filenames = pyewf.glob(args.imagefile)
  ewf_handle = pyewf.handle()
  ewf_handle.open(filenames)

Next we are going to define the code to work with raw images:

elif (args.imagetype == "raw"):
  print "Raw Type"
  imagehandle = pytsk3.Img_Info(url=args.imagefile)
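One thing worth recalling from part 6: pytsk3 can't consume the pyewf handle directly, so the e01 branch also depends on the ewf_Img_Info wrapper class we built back then. A condensed version of that standard pyewf glue code looks roughly like this (a sketch, following the pyewf documentation):

class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    # keep the pyewf handle and tell libtsk this is an external image type
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
  def close(self):
    self._ewf_handle.close()
  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)
  def get_size(self):
    return self._ewf_handle.get_media_size()

# inside the e01 branch above:
imagehandle = ewf_Img_Info(ewf_handle)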



One last big change to make and all the rest of our code will work:

for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  try:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
  except:
    print "Partition has no supported file system"
    continue
  print "File System Type Detected ",filesystemObject.info.ftype

We are moving our FS_Info call to open the file system into a try/except block so that if the partition type is not supported our program won't exit on error. Why do we have to test each partition? Because we can't trust the partition description to always tell us what file system it contains (Windows 8, for instance, changed the descriptions), and the only method tsk makes available to us to determine the file system is within the FS_Info class. So we will see if pytsk supports opening any partition we find, and if it does not we will print an error to the user. If we do support the file system type then we will print the type detected, and the rest of our code will work with no changes needed!

Let's see what this looks like on each of the example images I linked at the beginning of the post.

FAT:
E:\development>python dfirwizard-v11.py -i dfr-01-fat.dd -t raw
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 128
Partition has no supported file system
2 DOS FAT12 (0x01) 128s(65536) 16384
File System Type Detected  TSK_FS_TYPE_FAT16
Directory: /
3 DOS FAT16 (0x06) 16512s(8454144) 65536
File System Type Detected  TSK_FS_TYPE_FAT16
Directory: /
4 Win95 FAT32 (0x0b) 82048s(42008576) 131072
File System Type Detected  TSK_FS_TYPE_FAT32
Directory: /
5 Unallocated 213120s(109117440) 1884033
Partition has no supported file system

Ext:
E:\development>python dfirwizard-v11.py -i dfr-01-ext.dd -t raw
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 61
Partition has no supported file system
2 Linux (0x83) 61s(31232) 651175
File System Type Detected  TSK_FS_TYPE_EXT2
Directory: /
Cannot retrieve type of Bellatrix.txt
3 Linux (0x83) 651236s(333432832) 651236
File System Type Detected  TSK_FS_TYPE_EXT3
Directory: /
4 Linux (0x83) 1302472s(666865664) 651236
File System Type Detected  TSK_FS_TYPE_EXT4
Directory: /
5 Unallocated 1953708s(1000298496) 143445
Partition has no supported file system

HFS:
E:\development>python dfirwizard-v11.py -i dfr-01-osx.dd -t raw
Raw Type
0 Safety Table 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 40
Partition has no supported file system
2 GPT Header 1s(512) 1
Partition has no supported file system
3 Partition Table 2s(1024) 32
Partition has no supported file system
4 osx 40s(20480) 524360
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
5 osxj 524400s(268492800) 524288
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
6 osxcj 1048688s(536928256) 524288
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
7 osxc 1572976s(805363712) 524144
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
8 Unallocated 2097120s(1073725440) 34
Partition has no supported file system

There you go! We can now search, hash and extract from most things that come our way. In the next post we are going back to Windows and dealing with Volume Shadow Copies!

Follow the github repo here: https://github.com/dlcowen/dfirwizard



Sunday, April 26, 2015

National CCDC 2015 Red Team Debrief

Hello Reader,
              Here is this year's Red Team debrief. If you have questions please leave them below.

https://drive.google.com/file/d/0B_mjsPB8uKOAWnY5ZERHX0RUWEU/view?usp=sharing

Friday, April 3, 2015

Forensic Lunch 4/3/15 - Devon Kerr - WMI and DFIR and Automating DFIR

Hello Reader,

We had another great Forensic Lunch!

Guests this week:
Devon Kerr, talking about his work at Mandiant/FireEye and his research into WMI for both IR and attacker usage.

Matthew and I going into the Automating DFIR series and our upcoming talk at CEIC

You can watch the show on Youtube: https://www.youtube.com/watch?v=y-xtRkwaP2g

or below!


Sunday, March 22, 2015

Automating DFIR - How to series on programming libtsk with python Part 11

Hello Reader,
      I had a bit of a break thanks to a long overdue vacation, but I'm back, and the code I'll be talking about today has been up on the github repository for almost 3 weeks. If you ever want to get ahead, go there, as I write the code before I try to explain it! The Github repository is here: https://github.com/dlcowen/dfirwizard

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3 - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image

Following this post the series continues:

Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies

In this post we are going to augment the script from part 10, which went through an image and searched for and extracted files from all the NTFS partitions in an image; now we are going to do the same against all the NTFS partitions on a live system. You can obviously tweak this for any other file system, but we will get to that in later posts in this series.

The first thing we need is a way to figure out what partitions exist on a live system in a cross-platform way, so our future code can be tweaked to run anywhere. For this I chose the python library psutil, which can provide a wealth of information about the system it's running on, including information about available disks and partitions. You can read all about it here: https://pypi.python.org/pypi/psutil

To bring it into our program we need to call the import function again:

import psutil

and then, because we are going to work against a live running system again, we need our old buddy admin:

import admin

which, if you remember from part 5, will auto escalate our script to administrator just in case we forgot to run it as such.

We are going to strip out the functions we used to find all the parts of a forensic image and replace them with our code to test for administrative access:

if not admin.isUserAdmin():
  admin.runAsAdmin()
  sys.exit()
Next we replace the functions we called to get a partition table from a forensic image with a call to psutil to return a listing of partitions and iterate through them. The code looks like the following, which I will explain:

partitionList = psutil.disk_partitions()
for partition in partitionList:
  imagehandle = pytsk3.Img_Info('\\\\.\\'+partition.device.strip("\\"))
  if 'NTFS' in partition.fstype:

So here, instead of calling pytsk for a partition table, we are calling psutil.disk_partitions, which will return a list of the partitions available to the local system. I much prefer this method to trying to iterate through all volume letters, as we get back just those partitions that are available, as well as what file system they are recognized as running. Our list of active partitions is stored in the variable partitionList. Next we iterate through the partitions using the for operator, storing each partition returned in the partition variable. Then we create a pytsk3 Img_Info object for each partition returned, but only continue if psutil recognized the partition as NTFS.
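For reference, disk_partitions() returns a list of named tuples. On a hypothetical Windows box the output looks something like this (your devices and options will differ):

>>> import psutil
>>> psutil.disk_partitions()
[sdiskpart(device='C:\\', mountpoint='C:\\', fstype='NTFS', opts='rw,fixed'),
 sdiskpart(device='E:\\', mountpoint='E:\\', fstype='NTFS', opts='rw,fixed')]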

The next thing we are changing is the try/except block in our recursive directory function. Why? I found in my testing that live systems react much differently than forensic images in setting certain values in libtsk. So rather than using entryObject.info.meta.type to determine if I'm dealing with a regular file, I am using entryObject.info.name.type, which seems to always be set regardless of whether it's a live system or a forensic image. I'm testing whether I can capture the type of the file and its size here, as there are a lot of interesting special files that only appear at run time and will throw an error if you try to get their size.

try:
  f_type = entryObject.info.name.type
  size = entryObject.info.meta.size
except Exception as error:
  print "Cannot retrieve type or size of",entryObject.info.name.name
  print error.message
  continue

So in the above code I'm getting the type of file (lnk, regular, etc.) and its size, and if I can't I'm handling the error and printing it out before continuing on. You will see errors; live systems are an interesting place to do forensics.

I am now going to make a change I alluded to earlier in the series. We are going to buffer our reads and writes so we don't crash out of our program because we are trying to read a massive file into memory. This wasn't a problem in our first examples, as we were working from small test images I made, but now that we are dealing with real systems and real data we need to handle our data with care.

Our code looks as follows:

BUFF_SIZE = 1024 * 1024
offset = 0
md5hash = hashlib.md5()
sha1hash = hashlib.sha1()
if args.extract == True:
  # create the output directory and file handle once, before the read loop
  if not os.path.exists(outputPath):
    os.makedirs(outputPath)
  extractFile = open(outputPath+entryObject.info.name.name,'w')
while offset < entryObject.info.meta.size:
  available_to_read = min(BUFF_SIZE, entryObject.info.meta.size - offset)
  filedata = entryObject.read_random(offset,available_to_read)
  md5hash.update(filedata)
  sha1hash.update(filedata)
  offset += len(filedata)
  if args.extract == True:
    extractFile.write(filedata)

if args.extract == True:
  extractFile.close()

First we need to determine how much data we want to read or write at one time from a file. Following several other examples I've found, I'm setting that amount to one megabyte at a time by setting the variable BUFF_SIZE equal to 1024*1024. Next we need to keep track of where we are in the file we are dealing with; we do that by creating a new variable called offset and setting it to 0 to start with.

You'll notice that we are creating our hash objects, directories and file handles before we read in any data. That is because we want to do all of these things one time, prior to iterating through the contents of a file. If a file is a gigabyte in size then our read loop will execute 1,024 times, and we just want one hash and one output file to be created.

Next we start a while loop which will continue to execute until our offset is greater than or equal to the size of our file, meaning we've read all the data within it. Now files are not guaranteed to be allocated in one meg chunks, so to deal with that we are going to take advantage of a python function called min. min returns the smaller of the two values presented, which in our code are the size of the buffer and the remaining data left to read (the size of the file minus our current offset). Whichever value is smaller will be stored in the variable available_to_read.
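A quick worked example of that min() call, assuming a hypothetical 5,000,000 byte file with the offset currently at 4,718,592:

>>> BUFF_SIZE = 1024 * 1024
>>> min(BUFF_SIZE, 5000000 - 4718592)  # only 281,408 bytes remain, so read just that
281408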

After we know how much data we want to read in this iteration of our while loop, we read it as before from our entryObject, passing in the offset to start from and how much data to read, and storing the data read in the variable filedata. We then call the update function provided by our hashing objects. One of the nice things about the hashlib objects provided by python is that if you provide additional data to an already instantiated object it will just continue to build the hash, rather than you having to read all the data in at once.
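If you haven't used hashlib this way before, here is a quick interpreter demonstration that chunked updates produce the same digest as hashing all the data at once:

>>> import hashlib
>>> chunked = hashlib.md5()
>>> chunked.update('hello ')
>>> chunked.update('world')
>>> chunked.hexdigest() == hashlib.md5('hello world').hexdigest()
True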

Next we increment our offset by adding to it the length of data we just read, so we will skip past it on the next loop iteration. Finally, we write the data out to our output file if we elected to extract the files we are searching for.

I've added one last bit of code to help me catch any other weirdness that may seep through:

else:
  print "This went wrong",entryObject.info.name.name,f_type

An else to catch any condition that does not match one of our existing if statements.

That's it! You now have a super DFIR Wizard program that will go through all the active NTFS partitions on a running system and pull out and hash whatever files you want!

You can find the complete code here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v10.py

In the next post we will talk about parsing partition types other than NTFS, and then go into volume shadow copy access!

Friday, March 20, 2015

Forensic Lunch 3/20/15 - James Carder and Eric Zimmerman

Hello Reader!
           We had another great Forensic Lunch! This broadcast we had:

James Carder of the Mayo Clinic, @carderjames, talking all about automating your response process to separate the random attacks from the sophisticated attacks. You can hear James talk about this and much more at the SANS DFIR Summit, where he'll be a panelist! If you want to work with James, Mayo Clinic is hiring.

Mayo Clinic Infosec and IR Jobs: http://www.mayo-clinic-jobs.com/go/information-technology-engineering-and-architecture-jobs/255296/?facility=MN
Contact James Carder: carder.james@mayo.edu

Special Agent Eric Zimmerman of the FBI, @EricRZimmerman, talking about his upcoming in-depth Shellbags talk at the SANS DFIR Summit as well as his new tool called Registry Explorer. Registry Explorer and Eric's research into windows registries will be continued in the next broadcast. Whether you are interested in registries from a research, academic or investigative perspective this is a must see, and FREE, tool!

Eric's Blog: http://binaryforay.blogspot.com/
Eric's Github: https://github.com/EricZimmerman
Registry Explorer: http://binaryforay.blogspot.com/p/software.html

You can watch the broadcast here on Youtube: https://www.youtube.com/watch?v=lj7cMHySGSE

Or in the embedded player below:

Tuesday, March 3, 2015

Automating DFIR - How to series on programming libtsk with python Part 10

Hello Reader,
If you just found this series I have good news! There is way more of it to read and you should start at Part 1. See the links to all the parts below:

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3 - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image

Following this post the series continues:

Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies

For those of you who are up to date, let's get going! In this part of the series we are going to take our recursive hashing script and make it even more useful. We are going to allow the user to search for the kinds of files they want to hash with a regular-expression-enabled search, and give them the option to extract those files as they are found back to their original paths under your output directory. Ready? Let's go!

First we will need to import two more libraries, so many libraries! The good news is these are still standard python system libraries, so there is nothing new to install. The first library we will bring in is called 'os', which gives us os-related functions and will map the proper function based on the os you are running on. The second library, called 're', will provide us with regular expression support when evaluating our search criteria. We bring in those libraries as before with an import command:

import os
import re

                                Now we need to add two more command line arguments to let our user take advantage of the new code we are about to write:

                                argparser.add_argument(
                                '-s', '--search',
                                dest='search',
                                action="store",
                                type=str,
                                default='.*',
                                required=False,
                                help='Specify search parameter e.g. *.lnk'
                                )
                                argparser.add_argument(
                                '-e', '--extract',
                                dest='extract',
                                action="store_true",
                                default=False,
                                required=False,
                                help='Pass this option to extract files found'
                                    )

Our first new argument lets the user provide a search parameter. We store the search parameter in the variable args.search, and if the user does not provide one we default to .*, which will match anything.

The second argument uses a different kind of value than the rest of our options. We are setting the variable args.extract as a True or False value with the store_true option under action. If the user provides this argument then the variable will be true and the files matched will be extracted; if the user does not then the value will be false and the program will only hash the files it finds.

It's always good to show the user that the argument we received from them is doing something, so let's add two lines to see if we have a search term and print it:

if not args.search == '.*':
  print "Search Term Provided",args.search

Remember that .* is our default search term, so if the value stored in args.search is anything other than .* our program will print out the search term provided; otherwise it will just move on.

Our next changes all happen within our directoryRecurse function. First we need to capture the full path where the file we are looking at exists. We do this by combining the partition number and the full path that led to this file, to make sure it's unique between partitions:

outputPath = './%s/%s/' % (str(partition.addr),'/'.join(parentPath))
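As a quick illustration with made-up values, a file found on partition 2 beneath /$Extend/$RmMetadata would be exported under this path:

outputPath = './%s/%s/' % (str(partition.addr), '/'.join(parentPath))
# partition.addr == 2 and parentPath == ['$Extend', '$RmMetadata']
# gives outputPath == './2/$Extend/$RmMetadata/'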
                                
                                
                                
                                
                                
Next we go into the elif statement we wrote in the prior post to handle regular files with non-zero lengths. We add a new line of code at the beginning of the block that gets executed, to do our regular expression search, as follows:

elif f_type == pytsk3.TSK_FS_META_TYPE_REG and entryObject.info.meta.size != 0:
  searchResult = re.match(args.search,entryObject.info.name.name)

You can see we use the re library here, calling the match function it provides. We pass two arguments to the match function: the first is the search term the user provided and the second is the file name of the regular, non-zero-sized file we are inspecting. If the regular expression provided by the user is a match then a match object will be returned and stored in the searchResult variable; if there is no match then the variable will contain None. We write a conditional to test this result next:


if not searchResult:
  continue

This allows us to skip any file that did not match the search term provided. If the user did not specify a search term, our default value of .* will kick in and everything will be a match.
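Here is what that looks like in the interpreter; re.match returns a match object on a hit and None on a miss (the file name is borrowed from the sample image, and the object address is illustrative):

>>> import re
>>> re.match('.*jpg', 'ItsRob.jpg')
<_sre.SRE_Match object at 0x02A3B5D0>
>>> print re.match('.*jpg', 'notes.txt')
None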

Our last modification revolves around extracting the files if our user selected the option to do so. The code looks like this:

if args.extract == True:
  if not os.path.exists(outputPath):
    os.makedirs(outputPath)
  extractFile = open(outputPath+entryObject.info.name.name,'w')
  extractFile.write(filedata)
  extractFile.close()

The first thing we do is check whether the user passed the extract flag, causing args.extract to be set to True. If they did then we will extract the file; if not we skip it. If it is true, the first thing we do is make use of the os library's path.exists function. This lets us check whether the output directory we want to create (set in the outputPath variable above) already exists. If it does we can move on; if it does not then we call another os library function named makedirs. makedirs is nice because it will recursively create the path you specify, so you don't have to loop through and create all the directories in between if they don't exist.

Now that our output path exists, it's time to extract the file. We are modifying our old extractFile variable, appending our outputPath to the filename we want to create. This places the file we are extracting into the directory we have created. Next we write the data out to it as before, and then close the handle since we will be reusing it.

If I were to run this program against the Level5 image we've been working with, specifying both the extraction flag and a search term of .*jpg, it would look like this:

C:\Users\dave\Desktop>python dfirwizard-v9.py -i SSFCC-Level5.E01 -e -s .*jpg
Search Term Provided .*jpg
0 Primary Table (#0) 0s(0) 1
1 Unallocated 0s(0) 8064
2 NTFS (0x07) 8064s(4128768) 61759616
Directory: /
Directory: /$Extend/$RmMetadata/$Txf
Directory: /$Extend/$RmMetadata/$TxfLog
Directory: /$Extend/$RmMetadata
Directory: //$Extend
match  BeardsBeardsBeards.jpg
match  ILoveBeards.jpg
match  ItsRob.jpg
match  NiceShirtRob.jpg
match  OhRob.jpg
match  OnlyBeardsForMe.jpg
match  RobGoneWild.jpg
match  RobInRed.jpg
match  RobRepresenting.jpg
match  RobToGo.jpg
match  WhatchaWantRob.jpg
Directory: //$OrphanFiles

On my desktop there would be a folder named 2, and underneath it the full path to each file that matched the search term.

That's it! Part 9 had a lot going on, but now that we've built the base for our recursion it gets easier from here. The code for this part is located on the series Github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v9.py

Next, in part 11, we will do the same thing on live systems, but allow our code to enumerate all the physical disks present rather than hardcoding an option.