@night 1803 access accessdata active directory admissibility ads aduc aim aix ajax alex levinson alissa torres amcache analysis anjp anssi answer key antiforensics apfs appcompat appcompatflags applocker april fools argparse arman gungor arsenal artifact extractor attachments attacker tools austin automating automation awards aws azure azuread back to basics backstage bam base16 best finds beta bias bitcoin bitlocker blackbag blackberry enterprise server blackhat blacklight blade blanche lagny book book review brute force bsides bulk extractor c2 carved carving case ccdc cd burning ceic cfp challenge champlain chat logs Christmas Christmas eve chrome cit client info cloud forensics command line computer forensics computername conference schedule consulting contest cool tools. tips copy and paste coreanalytics cortana court approved credentials cryptocurrency ctf cti summit cut and paste cyberbox Daily Blog dbir deep freeze defcon defender ata deviceclasses dfa dfir dfir automation dfir exposed dfir in 120 seconds dfir indepth dfir review dfir summit dfir wizard dfrws dfvfs dingo stole my baby directories directory dirty file system disablelastaccess discount download dropbox dvd burning e01 elastic search elcomsoft elevated email recovery email searching emdmgmt Encyclopedia Forensica enfuse eric huber es eshandler esxi evalexperience event log event logs evidence execution exfat ext3 ext4 extended mapi external drives f-response factory access mode false positive fat fde firefox for408 for498 for500 for526 for668 forenisc toolkit forensic 4cast forensic lunch forensic soundness forensic tips fraud free fsutil ftk ftk 2 full disk encryption future gcfe gcp github go bag golden ticket google gsuite guardduty gui hackthebox hal pomeranz hashlib hfs honeypot honeypots how does it work how i use it how to howto IE10 imaging incident response indepth information theft infosec pro guide intern internetusername Interview ios ip theft iphone ir itunes encrypted backups jailbreak jeddah jessica hyde joe sylve journals json jump lists kali kape kevin stokes kibana knowledgec korman labs lance mueller last access last logon lateral movement leanpub libtsk libvshadow linux linux forensics linux-3g live systems lnk files log analysis log2timeline login logs london love notes lznt1 mac mac_apt macmini magnet magnet user summit magnet virtual summit mari degrazia mathias fuchs md viewer memorial day memory forensics metaspike mft mftecmd mhn microsoft milestones mimikatz missing features mlocate mobile devices mojave mount mtp multiboot usb mus mus 2019 mus2019 nccdc netanalysis netbios netflow new book new years eve new years resolutions nominations nosql notifications ntfs ntfsdisablelastaccessupdate nuc nw3c objectid offensive forensics office office 2016 office 365 oleg skilkin osx outlook outlook web access owa packetsled paladin pancake viewer path specification pdf perl persistence pfic plists posix powerforensics powerpoint powershell prefetch psexec py2exe pyewf pyinstaller python pytsk rallysecurity raw images rdp re-c re-creation testing reader project recipes recon recursive hashing recycle bin redteam regipy registry registry explorer registry recon regripper remote research reverse engineering rhel rootless runas sample images san diego SANS sans dfir summit sarah edwards saturday Saturday reading sbe sccm scrap files search server 2008 server 2008 r2 server 2012 server 2019 setmace setupapi sha1 shadowkit shadows shell items shellbags shimcache silv3rhorn skull canyon skype slow down smb solution solution saturday sop speed sponsors sqlite srum ssd stage 1 stories storport sunday funday swgde syscache system t2 takeout telemetry temporary files test kitchen thanksgiving threat intel timeline times timestamps timestomp timezone tool tool testing training transaction logs triage triforce truecrypt tsk tun naung tutorial typed paths typedpaths uac unc understanding unicorn unified logs unread updates usb usb detective usbstor user assist userassist usnjrnl validation vhd video video blog videopost vlive vmug vmware volatility vote vss web2.0 webcast webinar webmail weekend reading what are you missing what did they take what don't we know What I wish I knew whitfield windows windows 10 windows 2008 windows 7 windows forensics windows server winfe winfe lite winscp wmi write head xboot xfs xways yarp yogesh zimmerman zone.identifier

Daily Blog #367: Automating DFIR with dfVFS part 1

Hello Reader,
         Today we begin again with a new Automating DFIR series.

If you want to show your support for my efforts, there is an easy way to do that. 

Vote for me for Digital Forensic Investigator of the Year here: https://forensic4cast.com/forensic-4cast-awards/

The last time we started this series (you can read that here http://www.hecfblog.com/2015/02/automating-dfir-how-to-series-on.html) we were using the pytsk library primarily to access images and live systems. This time around we are going to restart the series to the first steps to show how to do this with the dfVFS library which makes use of pytsk and many, many other libraries.

         In comparison to dfVFS pytsk is a pretty simple and straightforward library, but it does have its limitations. dfVFS (Digital Forensics Virtual Filesystem) is not just one library, its a collection of DFIR libraries with all the glue in between so things work together without reinventing the wheel/processor/image format again. This post will start with opening a forensic image and printing the partition table much like we did in part 1 of the original Automating DFIR with pytsk series. What is different is that this time our code will work with E01s, S01s, AFF and other image formats without us having to write additional code for it. This is because dfVFS has all sorts of helper functions built in to determine the image format and load the right library for you to access the underlying data.

Before you get started on this series make sure you have the python 2.7 x86 installed and have followed the steps in the following updated blog post about how to get dfVFS setup:

You'll also want to download our first forensic image we are working with located here:

When I got my new machine setup I realized that a couple new libraries were not included in the original post so I updated it. If you followed the post to get your environment setup before yesterday you should check the list of modules to make sure you have them all installed. Second on my system I had an interesting issue where the libcrypto library was being installed as crypto but dfVFS was calling it as Crypto (case matters). I had to rename the directory under \python27\lib\site-packages\crypto to Crypto and then everything worked.

If you want to make sure everything works then download the full dfvfs package from github (linked in the installing dfvfs post) and run the tests before proceeding any further.

Let's start with what the code looks like:
import sys
import logging

from dfvfs.analyzer import analyzer
from dfvfs.lib import definitions
from dfvfs.path import factory as path_spec_factory
from dfvfs.volume import tsk_volume_system


path_spec = path_spec_factory.Factory.NewPathSpec(
          definitions.TYPE_INDICATOR_OS, location=source_path)

type_indicators = analyzer.Analyzer.GetStorageMediaImageTypeIndicators(

source_path_spec = path_spec_factory.Factory.NewPathSpec(
            type_indicators[0], parent=path_spec)

volume_system_path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',

volume_system = tsk_volume_system.TSKVolumeSystem()

volume_identifiers = []
for volume in volume_system.volumes:
  volume_identifier = getattr(volume, 'identifier', None)
  if volume_identifier:

print(u'The following partitions were found:')

for volume_identifier in sorted(volume_identifiers):
  volume = volume_system.GetVolumeByIdentifier(volume_identifier)
  if not volume:
    raise RuntimeError(
        u'Volume missing for identifier: {0:s}.'.format(volume_identifier))

  volume_extent = volume.extents[0]
      u'{0:s}\t\t{1:d} (0x{1:08x})\t{2:d}'.format(
          volume.identifier, volume_extent.offset, volume_extent.size))


as you can tell this is much larger than our first code example from the pytsk series which was this:
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len

But the easier to read and smaller pytsk example is much more limited in functionality to what the dfVFS version can do. On its own our pytsk example could only work with raw images, our dfVFS example can work with multiple image types and already has built in support for multipart images and shadow copies!

Let's break down the code:
import sysimport logging

Here we are just importing in two standard python libraries. Sys for default python system library and logging which gives a mechanism to standardized our logging of errors and information messages that we can tweak so we can give different levels of information based on what level of logging is being requested.

Next we are bringing in multiple dfVFS functions:

from dfvfs.analyzer import analyzer

We are bringing in 4 helper functions here from dfVFS. First we are bringing in the 
analyzer function which can determine for us the type of image or archive or partition we 
are attempting to access, in its current version it can auto detect the following:

  • bde - bitlockered volumes
  • bzip2 - bzip2 archives
  • cpio - cpio archives
  • ewf - expert witness format aka e01 images
  • gzip - gzip archives
  • lvm - logical volume management, the linux partitioning system
  • ntfs - ntfs partitions
  • qcow - qcow images
  • tar - tar archives
  • tsk - tsk supported image types
  • tak partition - tsk identified partitions
  • vhd - vhd virtual drives
  • vmdk - vmware virtual drives
  • shadow volume - windows shadow volumes
  • zip - zip archives
from dfvfs.lib import definitions
The next helper function is the definitions function which maps our named types and 
identifiers to the values that the underlying libraries are expecting or returning. 

from dfvfs.path import factory as path_spec_factory
The path_spec_factory helper library is one of the cornerstones of understanding the 
dfVFS framework. path_spec's are what you pass into most of the dfvfs functions and contain 
the type of object you are passing in (from the definitions helper library), the location
where this thing is (either on the file system you are running the script from or the 
location within the image you are pointing to) and the parent path if there is one. As we
go through this code you'll notice we make multiple path_spec's as we work to make the 
object we need to pass to the right helper function to access the forensic image. 

         from dfvfs.volume import tsk_volume_system

This helper library is create a pytsk volume object for us to allow us to use pytsk to enumerate and access volumes/partitions. 


Here we are creating a variable called source_path and storing within it the name of the forensic
image we would like to work with. In future examples we will work with other images types and
sizes but this is a small and simple image. I've tested this script with vhds and e01s and both opened
without issue and without changing any code other than the name of the file.

path_spec = path_spec_factory.Factory.NewPathSpec(
definitions.TYPE_INDICATOR_OS, location=source_path)

Our first path_spec object. Here we are calling the path_spec_factory helper function and using the 
NewPathSpec function to return a path spec object. We are passing in the type of file we are working
with which is TYPE_INDICATOR_OS which is defined in the dfVFS wiki as a file contained within
the operating system and where the file is located as the location by passing in the source_path 
variable we made in the line above. 

type_indicators = analyzer.Analyzer.GetStorageMediaImageTypeIndicators(

Next we are letting the Analyzer helper function's Get Storage Media Image Type Indicator 
function to let it figure out what kind of image, file system, partition or archive 
we are dealing with by its signature. It will return back the type into a variable called
type indicators.

source_path_spec = path_spec_factory.Factory.NewPathSpec( type_indicators[0], parent=path_spec)

Once we have the type of thing we are working with we want to generate another path_spec
object that has that information within it so our next helper library knows what it is
dealing with. We do that by calling the same NewPathSpec function but now for a type
we are passing in the first result that was stored in the type_indicator. Now I am cheating
a bit here to make things simple, I should be checking to see how many types are being
returned and if we know what we are dealing with. However that won't make this program
any easier to read and I'm giving you an image that will work correctly with this code.
In future blog posts we will put in the logic to detect and report such errors.

volume_system_path_spec = path_spec_factory.Factory.NewPathSpec(
definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',

Another path_spec object! Now that we have a path_spec object that identifies its type as
a forensic image we can create a path_spec object for the partitions contained within it.
You can see we are passing in the type of TSK_PARTITION meaning this is a tsk object 
that will work with the tsk volume functions. Again in future posts we will write code to
determine if there are in fact partitions here to deal with, ,but for now just know it 

volume_system = tsk_volume_system.TSKVolumeSystem()volume_system.Open(volume_system_path_spec)

Now we are going back to our old buddy libtsk aka pytsk. We are creating a TSK Volume
object and storing it in volume_system. Then we are opening the path_spec object we just
made that contains a valid tsk object and passing that into our new tsk volume object and
telling it to open it.

volume_identifiers = []
for volume in volume_system.volumes:
volume_identifier = getattr(volume, 'identifier', None)
if volume_identifier:
Here we are initializing a list object called volume_identifiers and then making use of the volumes
function within the tsk volume_system object to return a list of volumes aka partitions stored within
the tsk volume object we just opened. Our for loop will then iterate through each volume returned
and for each volume it will grab the identifier attribute from the volume object created in the for loop
and store the result in the volume_identifier variable. 

Our last line of code is checking to see if a volume_identifier was returned, if it was then we will append it to our list of volume_identifiers we initialized prior to our for loop. 

print(u'The following partitions were found:')print(u'Identifier\tOffset\t\t\tSize') for volume_identifier in sorted(volume_identifiers): volume = volume_system.GetVolumeByIdentifier(volume_identifier) if not volume: raise RuntimeError( u'Volume missing for identifier: {0:s}.'.format(volume_identifier)) volume_extent = volume.extents[0] print( u'{0:s}\t\t{1:d} (0x{1:08x})\t{2:d}'.format( volume.identifier, volume_extent.offset, volume_extent.size)) print(u'')
In this last bit of code we are printing out the information we know so far about these
partitions/volumes. We do that by doing a for loop over our volume_indetifiers list.
For each volume stored within it we are calling the GetVolumeByIdentifier function and 
storing the returned object in the volume variable. 

We then print three properties from the volume object returned, the identifier 
(the partition or volume number), the offset to where the volume begins (in decimal
and hex) and lastly how large the volume is.

Woo, that's it! I know that is a lot to go through for an introduction post but it all
build on this and within a few posts you will really begin to understand the power of

You can download this python script from the github here: 

Post a Comment


Author Name

Contact Form


Email *

Message *

Powered by Blogger.