@night 1803 access accessdata active directory admissibility ads aduc aim aix ajax alissa torres amcache analysis anjp anssi answer key antiforensics apfs appcompat appcompatflags applocker april fools argparse arman gungor arsenal artifact extractor attachments attacker tools austin automating automation awards aws azure azuread back to basics backstage base16 best finds beta bias bitcoin bitlocker blackbag blackberry enterprise server blackhat blacklight blade blanche lagny book book review brute force bsides bulk extractor c2 carved carving case ccdc cd burning ceic cfp challenge champlain chat logs Christmas Christmas eve chrome cit client info cloud forensics command line computer forensics computername conference schedule consulting contest cool tools. tips copy and paste coreanalytics cortana court approved credentials cryptocurrency ctf cti summit cut and paste cyberbox Daily Blog dbir deep freeze defcon defender ata deviceclasses dfa dfir dfir automation dfir exposed dfir in 120 seconds dfir indepth dfir review dfir summit dfir wizard dfrws dfvfs dingo stole my baby directories directory dirty file system disablelastaccess discount download dropbox dvd burning e01 elastic search elcomsoft elevated email recovery email searching emdmgmt Encyclopedia Forensica enfuse eric huber es eshandler esxi evalexperience event log event logs evidence execution exfat ext3 ext4 extended mapi external drives f-response factory access mode false positive fat fde firefox for408 for498 for500 for526 for668 forenisc toolkit forensic 4cast forensic lunch forensic soundness forensic tips fraud free fsutil ftk ftk 2 full disk encryption future gcfe gcp github go bag golden ticket google gsuite guardduty gui hackthebox hal pomeranz hashlib hfs honeypot honeypots how does it work how i use it how to howto IE10 imaging incident response indepth information theft infosec pro guide intern internetusername Interview ios ip theft iphone ir itunes encrypted backups jailbreak jeddah jessica hyde joe sylve journals json jump lists kali kape kevin stokes kibana knowledgec korman labs lance mueller last access last logon leanpub libtsk libvshadow linux linux-3g live systems lnk files log analysis log2timeline login logs london love notes lznt1 mac mac_apt macmini magnet magnet user summit mathias fuchs md viewer memorial day memory forensics metaspike mft mftecmd mhn microsoft milestones mimikatz missing features mlocate mobile devices mojave mount mtp multiboot usb mus mus 2019 mus2019 nccdc netanalysis netbios netflow new book new years eve new years resolutions nominations nosql notifications ntfs ntfsdisablelastaccessupdate nuc nw3c objectid offensive forensics office office 2016 office 365 oleg skilkin osx outlook outlook web access owa packetsled paladin path specification pdf perl persistence pfic plists posix powerforensics powerpoint powershell prefetch psexec py2exe pyewf pyinstaller python pytsk rallysecurity raw images rdp re-c re-creation testing reader project recipes recon recursive hashing recycle bin redteam regipy registry registry explorer registry recon regripper remote research reverse engineering rhel rootless runas sample images san diego SANS sans dfir summit saturday Saturday reading sbe sccm scrap files search server 2008 server 2008 r2 server 2012 server 2019 setmace setupapi sha1 shadowkit shadows shell items shellbags shimcache silv3rhorn skull canyon skype slow down smb solution solution saturday sop speed sponsors sqlite srum ssd stage 1 stories storport sunday funday swgde syscache system t2 takeout telemetry temporary files test kitchen thanksgiving threat intel timeline times timestamps timestomp timezone tool tool testing training transaction logs triage triforce truecrypt tsk tun naung tutorial typed paths typedpaths uac unc understanding unicorn unified logs unread usb usb detective usbstor user assist userassist usnjrnl validation vhd video video blog videopost vlive vmug vmware volatility vote vss web2.0 webcast webinar webmail weekend reading what are you missing what did they take what don't we know What I wish I knew whitfield windows windows 10 windows 2008 windows 7 windows forensics windows server winfe winfe lite wmi write head xboot xfs xways yarp yogesh zimmerman zone.identifier

Automating DFIR - How to series on programming libtsk with python Part 11

Hello Reader,
      I had a bit of a break thanks to a long overdue vacation but I'm back and the code I'll be talking about today has been up on the github repository for almost 3 weeks, so if you ever want to get ahead go there as I write the code before I try to explain it! Github repository is here: https://github.com/dlcowen/dfirwizard

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image

Following this post the series continues:

Part 12 - Accessing different file systems
Part 13 - Accessing Volume Shadow Copies  

In this post we are going to augment the script on part 10 which went through an image and search/extracted files from all the NTFS partition in an image, and now we are going to do the same against all the NTFS partitions on a live system. You can obviously tweak this for any other file system but we will get to that in later posts in this series.

The first thing we need is a way to figure out what partitions exist on a live system in cross platform way so our future code can be tweaked to run anywhere. For this I choose the python library psutil which can provide a wealth of information about a system its running on, included information about available disks and partitions, you can read all about it here: https://pypi.python.org/pypi/psutil

To bring it into our program we need to call the import function again:

import psutil

and then because we are going to work against a live running system again we need our old buddy admin

import admin

which if you remember from part 5 will auto escalate our script to administrator just in case we forgot to run it as such.

We are going to strip out the functions we need to find all the parts of a forensic image and replace it with out code to test for administrative access:

if not admin.isUserAdmin():
Next we replace the functions we called to get a partition table from a forensic image with a call to psutil to return a listing of paritions and iterate through them. The code looks like the following which I will explain:

partitionList = psutil.disk_partitions()
for partition in partitionList:
  imagehandle = pytsk3.Img_Info('\\\\.\\'+partition.device.strip("\\"))
  if 'NTFS' in partition.fstype:

So here instead of calling pytsk for a partition table we are calling psutil.disk_partitions which will return a list of partitions that are available to the local system. I much prefer this method than trying to iterate through all volume letters as we will get back just those partitions available as well as what file system they are seen as running as. Our list of active partitions will be stored in the varaible partitionList. Next we will iterate through the partitions using the for operator storing each partition returned into the partition variable. Next we are creating a pytsk3 Img_Info object for each partition returned but only continuing if psutil recognized the partition is NTFS.

The next thing we are changing is our try catch blog in our recursive directory function. Why? I found in my testing that live systems react much differently than forensic images in setting certain values in libtsk. So rather than using entryObject.info.meta.type to determine if I'm dealing with a regular file I am using entryObject.info.name.type which seem to always be set regardless if its a live system or a forensic image. I'm testing to see if I can capture the type of the file and it's size here as there are a lot of interesting special files that only appear at run time that will throw an error if you try to get their size.

        f_type = entryObject.info.name.type
        size = entryObject.info.meta.size
      except Exception as error:
          print "Cannot retrieve type or size of",entryObject.info.name.name
          print error.message

So in the above code I'm getting the type of file (lnk, regular, etc..) and it's size and if I can't I'm handling the error and printing out the error before continuing on. You will see errors, live systems are an interesting place to do forensics.

I am now going to make a change I alluded to earlier on in the series. We are going to buffer out reads and writes so we don't crash our of our program because we are trying to read a massive file into memory. This wasn't a problem in our first examples as we were working from small test images I made before, but now they we are dealing with real systems and real data we need to handle our data with care.

Our code looks as follows:

            BUFF_SIZE = 1024 * 1024
            md5hash = hashlib.md5()
            sha1hash = hashlib.sha1()
            if args.extract == True:
                  if not os.path.exists(outputPath):
                  extractFile = open(outputPath+entryObject.info.name.name,'w')
            while offset < entryObject.info.meta.size:
                available_to_read = min(BUFF_SIZE, entryObject.info.meta.size - offset)
                filedata = entryObject.read_random(offset,available_to_read)
                offset += len(filedata)
                if args.extract == True:

            if args.extract == True:

First we need to determine how much data we want to read or write at one time from a file. I've copied several other examples I've found and I'm setting that amount to 1 meg of data at a time by setting the variable BUFF_SIZE equal to 1024*1024 or one megabyte. Next we need to keep track of where we are in the file we are dealing with, we do that by creating a new variable called offset and setting the offset to 0 to start with.

You'll notice that we are creating our hash objects, directories and file handles before we read in any data. That is because we want to do all of these things one time prior to iterating through the contents of a file. If a file is a gigabyte in size then our function will be called 1,024 times and we just want one hash and one output file to be created.

Next we starting a while loop which will continue to execute until our offset is greater or equal to the size of our file, meaning we've read all the data within it. Now files are not guaranteed to be allocated in 1 meg chunks, so to deal with that we are going to take advantage of a python function called min. Min returns the smaller of to values presented which in our code is the size of the buffer compared to the remaining data left to read (the size of the file - our current offset). Whichever value is smaller will be stored in the variable available_to_read.

After we know how much data we want to read in this execution of our while loop we are going to read it as before from our entryObject passing in the offset to start from and how much data to read, storing the data read into the variable filedata. We are then calling the update function provided by our hashing objects. One of the nice things our the hashlibs provided by python is that if you provide additional data to an already instantiated object it will just continue to build the hash rather than having to read it all in at once.

Next we are incrementing our offset by adding to itself the length of data we just read so we will skip past it on the next while loop execution. Finally we write the data out to our output file if we elected to extract the files we are searching for.

I've added one last bit of code to help me catch any other weirdness that may seep through.

          print "This went wrong",entryObject.info.name.name,f_type

An else to look for any condition that does not match one of existing if statements.

That's it! You now have a super DFIR Wizard program that will go through all the active NTFS partitions on a running system and pull out and hash whatever files you want!

You can find the complete code here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v10.py

In the next post we will talk about parsing partitions types other than NTFS and then go into volume shadow copy access!

Post a Comment


Author Name

Contact Form


Email *

Message *

Powered by Blogger.