Automating DFIR - How to series on programming libtsk with python Part 13

Automating DFIR - How to series on programming libtsk with python Part 13

Hello Reader,
          This is part 13 of a many-planned part series. I don't want to commit myself to listing all the planned parts as I feel that curses me to never finish. We've come a long way since the beginning of the series and in this part we solve one of the most persistent issues most of us have, getting easy access to volume shadow copies.

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Recursively through multiple file system types

What you will need for this part:

You will need to download and install the pyvshadow MSI found here for Windows. This is the Python library that provides us with volume shadow access. It binds to the libvshadow library:

The vss helper library I found on the plaso project website:

You will need to download the following sample image that has volume shadows on it:!LlRFjbZJ!s0k263ZqKSw_TBz_xOl1m11cs2RhIIDPoZUFt5FuBgc Note this is a 4GB rar file that when uncompressed will become a 25GB raw image

The Concept:

Volume shadow copes are different than any other file system we've deal with this far. The volume shadow subsystem wraps around a NTFS partition (I'm not sure how it will be implemented in REFS) and stores within itself a differential cluster based backup of changes that occur on the disk. This means that all views our tools give us of the data contained with the volume shadow copies are emulated file systems views based on the database of differential cluster map stored within the volume shadow system. This is what the libvshadow library does for us, it parses the differential database end creates a view of the data contained within it as a file system that can be parsed. 

As with all things technical there is a catch. The base libvshadow was made to work with raw images only. There is of course a way around this, already coded into dfvfs, that I am testing for another part so we can access shadow copies from other types of forensic images.

As per Joachim Metz:
1. VSC is a large subsystem in Windows that can even store file copies on servers2. VSS (volsnap on-disk storage) is not a file system but a volume system it lies below NTFS. I opt to read:
Which means I am grossly oversimplifying things. In short I am describing how a volume system that exists within a NTFS volume is being interpreted as a file system for you by our forensic tools. Do not confuse VSS for actual complete volumes, they are differential cluster based records.

The Code:

The first thing we have to do is import the new library you just downloaded and installed as well as the helper class found on the Github.

import vss
import pyvshadow

Next since our code is NTFS specific we need to extend our last multi filesystem script to now to do something special if it detects that its accessing a NTFS partition:

print "File System Type Detected .",,"."
  if (str( == "TSK_FS_TYPE_NTFS_DETECT"):
    print "NTFS DETECTED"
We do this as seen above by comparing the string "TSK_FS_TYPE_NTFS_DETECT" to the converted enumerated value contained in If the values match then we are dealing with a NTFS partition that we can test to see if it has volume shadow copies.

To do the test are going to start using our new libraries:

    volume = pyvshadow.volume()
    fh = vss.VShadowVolume(args.imagefile, offset)
    count = vss.GetVssStoreCount(args.imagefile, offset)

First we create a pyshadow object by calling the volume function constructor and storing the object in the variable named volume. Next we are declaring the offset again to the beginning of our detected NTFS partition. Next we are going to use our vss helper library for two different purposes. The first is to get a volume object that we can work with in getting access to the volume shadows stored within. When we call the VShadowVolume function in our helper vss class we pass it two arguments. The first is the name of the raw image we passed in to the program and the offset to the beginning of the NTFS partition stored within the image. 

The second vss helper function we call is GetVssStoreCount which takes the same two arguments but does something very different. The function as the name implies returns the number of shadow copies that are present on the NTFS partition. The returned value starts at a count of 1 but the actual index of shadow copies starts at 0. Meaning whatever value is returned here we will have to treat as count -1 for our code. The other thing to know, based on my testing, you may get more volume shadows returned than are available on the native system. This is because when libvshadow parses the database it shows all available instances, including those volume shadow copies that have been deleted by the user or system but still partially exist within the database.

Not all NTFS partitions contain volume shadow copies so we need to test the count variable to see if it contains a result:

    if (count):

If it does contain a result (the function will set the count variable to undefined if there are no volume shadows on the partition) then we need to do two things first. The first is to keep track of which volume shadow we are working with which always begins with 0. Next we need to take our pyvshadow volume object and have it access the volume object named fh into the volume we created in the code above.

Once we have the pyvshadow volume object working with our vss helper class fh volume object we can start getting to the good stuff:

      while (vstore < count):
        store = volume.get_store(vstore)
        img = vss.VShadowImgInfo(store)

We are using a while loop here to iterate through all of the available shadow copies using the n -1 logic we discussed before (meaning that there is starting from 0 count -1 copies to go through). Next we are going to get the pyvshadow volume object to return a view of the volume shadow copy we are working with (whatever value vstore is currently set to) from the volume_get_store function and store it in the 'store' variable. 

We will then take the object stored in the 'store' variable and call the vss helper class function VShadowImgInfo which will return a imgInfo object that we can pass into pytsk and will work with our existing code:

        vssfilesystemObject = pytsk3.FS_Info(img)
        vssdirectoryObject = vssfilesystemObject.open_dir(path=dirPath)
        print "Directory:","vss",str(vstore),dirPath
        vstore = vstore + 1

So we are now working with our volume shadow copy as we would any other pytsk object. We are changing what we print now in the directory line to include which shadow copy we are working with. Next we are changing the parentPath variable we pass into the directoryRecurse function from [] or an empty list we used in our prior examples to a list which includes two members. The first member is the string 'vss' and the second is the string version of which shadow copy we are currently about to iterate and search. This is important so that we can uniquely export out each file that matches the search expression without overwriting the last file exported in prior shadow copy. 

Lastly we need to increment the vstore value for the next round of the while loop.

Only one more thing to do. We need to search the actual file system on this NTFS partition and define an else condition to continue to handle all the other file systems that our libtsk version supports:

      #Capture the live volume
      directoryObject = filesystemObject.open_dir(path=dirPath)
      print "Directory:",dirPath
      directoryObject = filesystemObject.open_dir(path=dirPath)
      print "Directory:",dirPath

We don't have to change any other code! That's it! We now have a program that will search, hash and export like before but now has volume shadow access without any other programs, procedures or drivers.

Running the program against the sample image looks like this:
E:\development>python -i image.dd -t raw -o rtf.cev -e  -s .*r
Search Term Provided .*rtf
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 2048
Partition has no supported file system
2 NTFS (0x07) 2048s(1048576) 204800
File System Type Dectected . TSK_FS_TYPE_NTFS_DETECT .
WARNING:root:Error while trying to read VSS information: pyvshadow_volume_open_f
ile_object: unable to open volume. libvshadow_io_handle_read_volume_header: inva
lid volume identifier.
3 NTFS (0x07) 206848s(105906176) 52219904
File System Type Dectected . TSK_FS_TYPE_NTFS_DETECT .
Directory: vss 0 /
Directory: vss 1 /
Directory: vss 2 /
Directory: /
4 Unallocated 52426752s(26842497024) 2048
Partition has no supported file system

The resulting exported data directory structure looks like this: 
Automating DFIR - How to series on programming libtsk with python Part 13
You can access the Github of all the code here:

and the code needed for this post here:

Next part lets try to do this on a live system!

Post a Comment