Hello Reader,
Welcome to part 9, I really hope you've read parts 1-8 before getting here. If not you will likely get pretty lost at this point. Here are the links to the prior parts in this series so you can get up to speed before going forward, each posts build on the next.
Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3 - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Following this post the series continues:
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies
Now that we have that taken care of it's time to start working on DFIR Wizard Version 8! The next step in our roadmap is to recurse through directories to hash files rather than just hashing specific files. To do this we won't need any additional libraries! However, we will need to write some new code and be introduced to some new coding concepts that may take a bit for you to really understand unless you already have a computer science background.
So let's start with a simplified program. We'll strip out all the code after we open up our NTFS partition and store it in the variable filesystemObject and our code now looks like this:
#!/usr/bin/python
# Sample program or step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
class ewf_Img_Info(pytsk3.Img_Info):
def __init__(self, ewf_handle):
self._ewf_handle = ewf_handle
super(ewf_Img_Info, self).__init__(
url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
def close(self):
self._ewf_handle.close()
def read(self, offset, size):
self._ewf_handle.seek(offset)
return self._ewf_handle.read(size)
def get_size(self):
return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
'-i', '--image',
dest='imagefile',
action="store",
type=str,
default=None,
required=True,
help='E01 to extract from'
)
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
if 'NTFS' in partition.desc:
filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
Now previously we were creating file object's by opening a file stored in a known path. If either a) don't know the path to the file or b) want to look at all the files in the file system we need something other than a known path stored in a file object to get us there. What we have instead is a directory object. We create a directory object by called the function open_dir that, filesystemObject inherited from FS_Info, giving it the path to the directory we want to open and then storing the result in a variable. Our code would look like this if we were opening the root directory (aka '/' ) and storing it in a variable named directoryObject.
directoryObject = filesystemObject.open_dir(path="/")
Now we have a directoryObject which makes available to us a list of all the files and directories stored within that directory. To access each entry in this directory listing we need to assign use a loop operator like we did when we went through all of the partitions available in the image. We are going to use the for loop operator, except instead of looping through available partitions we are going to loop through directory entries in a directory. The for loop looks like this:
for entryObject in directoryObject:
This for loop will assign each entry in the root directory to our variable entryObject one a time until it has gone through all of the directory entries in the root directory.
Now we get to do something with the entries returned. To begin with let's print out their names using the info.name.name attribute of entryObject, just like we did earlier in the series.
print entryObject.info.name.name
The completed code looks like this:
#!/usr/bin/python
# Sample program or step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
class ewf_Img_Info(pytsk3.Img_Info):
def __init__(self, ewf_handle):
self._ewf_handle = ewf_handle
super(ewf_Img_Info, self).__init__(
url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
def close(self):
self._ewf_handle.close()
def read(self, offset, size):
self._ewf_handle.seek(offset)
return self._ewf_handle.read(size)
def get_size(self):
return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='List files in a directory')
argparser.add_argument(
'-i', '--image',
dest='imagefile',
action="store",
type=str,
default=None,
required=True,
help='E01 to extract from'
)
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
if 'NTFS' in partition.desc:
filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
directoryObject = filesystemObject.open_dir(path="/")
for entryObject in directoryObject:
print entryObject.info.name.name
Welcome to part 9, I really hope you've read parts 1-8 before getting here. If not you will likely get pretty lost at this point. Here are the links to the prior parts in this series so you can get up to speed before going forward, each posts build on the next.
Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3 - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Following this post the series continues:
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies
Now that we have that taken care of it's time to start working on DFIR Wizard Version 8! The next step in our roadmap is to recurse through directories to hash files rather than just hashing specific files. To do this we won't need any additional libraries! However, we will need to write some new code and be introduced to some new coding concepts that may take a bit for you to really understand unless you already have a computer science background.
So let's start with a simplified program. We'll strip out all the code after we open up our NTFS partition and store it in the variable filesystemObject and our code now looks like this:
#!/usr/bin/python
# Sample program or step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
class ewf_Img_Info(pytsk3.Img_Info):
def __init__(self, ewf_handle):
self._ewf_handle = ewf_handle
super(ewf_Img_Info, self).__init__(
url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
def close(self):
self._ewf_handle.close()
def read(self, offset, size):
self._ewf_handle.seek(offset)
return self._ewf_handle.read(size)
def get_size(self):
return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
'-i', '--image',
dest='imagefile',
action="store",
type=str,
default=None,
required=True,
help='E01 to extract from'
)
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
if 'NTFS' in partition.desc:
filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
Now previously we were creating file object's by opening a file stored in a known path. If either a) don't know the path to the file or b) want to look at all the files in the file system we need something other than a known path stored in a file object to get us there. What we have instead is a directory object. We create a directory object by called the function open_dir that, filesystemObject inherited from FS_Info, giving it the path to the directory we want to open and then storing the result in a variable. Our code would look like this if we were opening the root directory (aka '/' ) and storing it in a variable named directoryObject.
directoryObject = filesystemObject.open_dir(path="/")
Now we have a directoryObject which makes available to us a list of all the files and directories stored within that directory. To access each entry in this directory listing we need to assign use a loop operator like we did when we went through all of the partitions available in the image. We are going to use the for loop operator, except instead of looping through available partitions we are going to loop through directory entries in a directory. The for loop looks like this:
for entryObject in directoryObject:
This for loop will assign each entry in the root directory to our variable entryObject one a time until it has gone through all of the directory entries in the root directory.
Now we get to do something with the entries returned. To begin with let's print out their names using the info.name.name attribute of entryObject, just like we did earlier in the series.
print entryObject.info.name.name
The completed code looks like this:
#!/usr/bin/python
# Sample program or step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
class ewf_Img_Info(pytsk3.Img_Info):
def __init__(self, ewf_handle):
self._ewf_handle = ewf_handle
super(ewf_Img_Info, self).__init__(
url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)
def close(self):
self._ewf_handle.close()
def read(self, offset, size):
self._ewf_handle.seek(offset)
return self._ewf_handle.read(size)
def get_size(self):
return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='List files in a directory')
argparser.add_argument(
'-i', '--image',
dest='imagefile',
action="store",
type=str,
default=None,
required=True,
help='E01 to extract from'
)
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
if 'NTFS' in partition.desc:
filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
directoryObject = filesystemObject.open_dir(path="/")
for entryObject in directoryObject:
print entryObject.info.name.name
If you were to run this against the SSFCC-Level5.E01 image you would get the following output:
C:\Users\dave\Desktop>python dfirwizard-v8.py -i SSFCC-Level5.E01
0 Primary Table (#0) 0s(0) 1
1 Unallocated 0s(0) 8064
2 NTFS (0x07) 8064s(4128768) 61759616
$AttrDef
$BadClus
$Bitmap
$Boot
$Extend
$LogFile
$MFT
$MFTMirr
$Secure
$UpCase
$Volume
.
BeardsBeardsBeards.jpg
confession.txt
HappyRob.png
ILoveBeards.jpg
ItsRob.jpg
NiceShirtRob.jpg
OhRob.jpg
OnlyBeardsForMe.jpg
RobGoneWild.jpg
RobInRed.jpg
RobRepresenting.jpg
RobToGo.jpg
WhatchaWantRob.jpg
$OrphanFiles
This code works and prints for us all of the files and directories located within the root directory we specified. However, most forensic images contain file systems that do have sub directories, sometimes thousands of them. To be able to go through all the files and directories stored within a partition we need to change our code to be able to determine if a directory entry is a file or a directory. If the entry is a directory then we need to check the contents of the sub directory as well. We will do this check over and over again until we have checked all the directories within the file system.
To write this kind of code we need to make use of a common computer science concept called recursion. In short instead of just writing the code to try to guess how many sub directories exist in an image we will write a function that will call itself every time we find a sub directory. That function will then call itself every time it finds a sub directory over and over until it reaches the end of that particular directory path and then unwind back.
Before we get into the new recursive function let's expand on part 7 by adding some more command line options. The first command line option we will add will allow the user to specify the directory we will start from when we do our hashing. I am not going to repeat what the code means as we covered it in part 7 so here is the new command line argument:
To write this kind of code we need to make use of a common computer science concept called recursion. In short instead of just writing the code to try to guess how many sub directories exist in an image we will write a function that will call itself every time we find a sub directory. That function will then call itself every time it finds a sub directory over and over until it reaches the end of that particular directory path and then unwind back.
Before we get into the new recursive function let's expand on part 7 by adding some more command line options. The first command line option we will add will allow the user to specify the directory we will start from when we do our hashing. I am not going to repeat what the code means as we covered it in part 7 so here is the new command line argument:
argparser.add_argument( |
'-p', '--path', |
dest='path', |
action="store", |
type=str, |
default='/', |
required=False, |
help='Path to recurse from, defaults to /' |
) What is new here that I need to explain is passing in a default. If the user does not provide a path to search then the default value we specify in the default setting will be stored in the variable path. Otherwise if the user does specify a path then that path will be stored in the path variable instead. The next command line option we will give the user will be to specify the name of the file we will write our hashes to. It wouldn't be very useful if all of our output went to the command line and vanished once we exited. Most forensic examiners I know eventually take their forensic tool output to programs like Microsoft Word or Excel to get it into something that whomever asked us to do our investigation can read. The code looks like this:
|
So above we are making sure we are dealing with a regular file that has 0 bytes of data. Once we know that is the case we are calling the same writerow function but now instead of generating hashes we are filling in the already known values for 0 byte files. The MD5 sum of a 0 byte file is d41d8cd98f00b204e9800998ecf8427e and the SHA1 sum of a 0 byte file is da39a3ee5e6b4b0d3255bfef95601890afd80709.
The only thing left to do now is to put in the except conditional for this entire section:
The only thing left to do now is to put in the except conditional for this entire section:
except IOError as e: |
print e |
continue Telling our program that if we got some kind of error in doing any of this to print the error to command window and move on to the next directory entry. That's it! We now have a version of DFIR Wizard that will recurse through every directory we specify in an NTFS partition, hash all the files contained with in it and then write that out long with other useful data to a csv file we specify. You can grab the full code here from our series Github: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v8.py Pasting in the all the code will break the blog formatting so I can't do that this time. In part 10 of this series we will extend this further to search for files within an image and extract them! Special thanks to David Nides and Kyle Maxwell who helped me through some late night code testing! |
Post a Comment