
Daily Blog #374: Automating DFIR with dfVFS part 4

Hello Reader,
            In our last entry in this series we took our partition listing script and added support for raw images. Now our simple script should be able to work with forensic images, virtual disks, raw images and live disks.

If you want to show your support for my efforts, there is an easy way to do that. 

Vote for me for Digital Forensic Investigator of the Year here: https://forensic4cast.com/forensic-4cast-awards/


Now that we have that working, let's actually get it to do something useful, like extracting a file.

First let's look at the code now:

import sys
import logging

from dfvfs.analyzer import analyzer
from dfvfs.lib import definitions
from dfvfs.path import factory as path_spec_factory
from dfvfs.volume import tsk_volume_system
from dfvfs.resolver import resolver
from dfvfs.lib import raw

source_path="stage2.vhd"

path_spec = path_spec_factory.Factory.NewPathSpec(
          definitions.TYPE_INDICATOR_OS, location=source_path)

type_indicators = analyzer.Analyzer.GetStorageMediaImageTypeIndicators(
          path_spec)

if len(type_indicators) > 1:
  raise RuntimeError((
      u'Unsupported source: {0:s} found more than one storage media '
      u'image types.').format(source_path))

if len(type_indicators) == 1:
  path_spec = path_spec_factory.Factory.NewPathSpec(
      type_indicators[0], parent=path_spec)

if not type_indicators:
  # The RAW storage media image type cannot be detected based on
  # a signature so we try to detect it based on common file naming
  # schemas.
  file_system = resolver.Resolver.OpenFileSystem(path_spec)
  raw_path_spec = path_spec_factory.Factory.NewPathSpec(
      definitions.TYPE_INDICATOR_RAW, parent=path_spec)

  glob_results = raw.RawGlobPathSpec(file_system, raw_path_spec)
  if glob_results:
    path_spec = raw_path_spec

volume_path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',
        parent=path_spec)

volume_system = tsk_volume_system.TSKVolumeSystem()
volume_system.Open(volume_path_spec)

volume_identifiers = []
for volume in volume_system.volumes:
  volume_identifier = getattr(volume, 'identifier', None)
  if volume_identifier:
    volume_identifiers.append(volume_identifier)
 
print(u'The following partitions were found:')
print(u'Identifier\tOffset\t\t\tSize')

for volume_identifier in sorted(volume_identifiers):
  volume = volume_system.GetVolumeByIdentifier(volume_identifier)
  if not volume:
    raise RuntimeError(
        u'Volume missing for identifier: {0:s}.'.format(volume_identifier))

  volume_extent = volume.extents[0]
  print(
      u'{0:s}\t\t{1:d} (0x{1:08x})\t{2:d}'.format(
          volume.identifier, volume_extent.offset, volume_extent.size))

print(u'')

path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/p1',
        parent=path_spec)

mft_path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK, location=u'/$MFT',
        parent=path_spec)

file_entry = resolver.Resolver.OpenFileEntry(mft_path_spec)


stat_object = file_entry.GetStat()

print(u'Inode: {0:d}'.format(stat_object.ino))
print(u'Name: {0:s}'.format(file_entry.name))
extractFile = open(file_entry.name,'wb')
file_object = file_entry.GetFileObject()

data = file_object.read(4096)
while data:
          extractFile.write(data)
          data = file_object.read(4096)

extractFile.close()
file_object.close()

The first thing I changed was the image I'm working from, switching back to stage2.vhd.

source_path="stage2.vhd"

At this point, though, you should be able to pass it any type of supported image.
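
If you'd rather not edit the script each time, a minimal tweak (my own sketch, not part of the original script) is to accept the image path on the command line and fall back to the hardcoded default:

import sys

if len(sys.argv) > 1:
  source_path = sys.argv[1]  # image path passed on the command line
else:
  source_path = "stage2.vhd"  # fall back to the hardcoded default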

Next, after the code we first wrote to list the partitions within an image, we add a new path specification layer to make an object that points to the first partition within the image.

path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/p1',
        parent=path_spec)

You can see we are using the TSK_PARTITION type again because we know this is a partition, but the location has changed from the prior partition path spec object we made. That prior object pointed to the root of the image so we could iterate through the partitions, while the new object references just the first partition.

Next we make another path specification object that builds on the partition type object.

mft_path_spec = path_spec_factory.Factory.NewPathSpec(
        definitions.TYPE_INDICATOR_TSK, location=u'/$MFT',
        parent=path_spec)

Here we are creating a TSK object and telling it that we want it to point to the file $MFT at the root of the file system. Notice we didn't have to tell it the kind of file system, the offset to where it begins, or any other data. The resolver and analyzer helper classes within dfVFS will figure all of that out for us, if they can. In tomorrow's post we will put in some more conditional code to detect when they in fact cannot do that for us.

So now that we have a path spec object that references the file we want to work with, let's get an object for that file.

file_entry = resolver.Resolver.OpenFileEntry(mft_path_spec)

The resolver helper class's OpenFileEntry function takes the path spec object we made that points to the $MFT and, if it can access it, returns an object that references it.
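
One thing to watch for: if dfVFS cannot resolve the path spec to an actual file, you won't get a usable object back. A quick guard (my own addition, assuming OpenFileEntry returns None on failure) makes that case obvious:

if file_entry is None:
  # the lookup failed; stop here instead of crashing later
  raise RuntimeError(
      u'Unable to open file entry for: {0:s}'.format(mft_path_spec.location))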

Next we are going to gather some data about the file we are accessing.

stat_object = file_entry.GetStat()

First we use the GetStat function available on the file entry object to return information about the file into a new object called stat_object. This is similar to running the stat command on a file.

Next we are going to print what I'm referring to below as the inode number:
print(u'Inode: {0:d}'.format(stat_object.ino))

The MFT doesn't have inodes; this is actually the MFT record number, but the concept is the same. We are reading the stat_object property ino to access the MFT record number. You could also access the size of the file, the dates associated with it, and other data, but this is a good starting place.
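
As an example of that (a hedged sketch; the exact attribute names on the stat object are my assumption, so getattr is used to avoid an error if one is missing):

print(u'Size: {0!s}'.format(getattr(stat_object, 'size', None)))
print(u'Modified: {0!s}'.format(getattr(stat_object, 'mtime', None)))
print(u'Created: {0!s}'.format(getattr(stat_object, 'crtime', None)))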

Next we want to print the name of the file we are accessing.
print(u'Name: {0:s}'.format(file_entry.name))


The file_entry object's name property contains the name. This is much easier than with pyTsk, where we had to walk a meta sub-object property structure to get the file name out.

Now we need to open a file handle for the file where we want to write the MFT data.

extractFile = open(file_entry.name,'wb')

Notice two things. One, we are using the file_entry.name property directly in the open call, which means our extracted file will have the same name as the file in the image. Two, we are passing in the mode 'wb', which means the file handle can be written to and should be treated as a binary file. This is important on Windows systems because, when you write out binary data, any newline bytes could be translated unless you pass in the binary mode flag.

Now we need to interact with not just the properties of the file in the image, but the data it's actually storing.

file_object = file_entry.GetFileObject()

We do that by calling the GetFileObject function on the file_entry object. This gives us a file object, just like extractFile, that normal Python functions can read from. The file handle is stored in the variable file_object.
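
As a quick sanity check (my own addition, not in the original script, and it assumes the dfVFS file object supports seek), you can peek at the first four bytes, which for an MFT record should be the ASCII signature FILE, and then rewind before extracting:

header = file_object.read(4)  # MFT records begin with the 'FILE' signature
print(u'Record signature: {0!s}'.format(header))
file_object.seek(0)  # rewind so the copy loop below gets the whole file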

Now we need to read the data from the file in the image and then write it out to a file on the disk.

data = file_object.read(4096)
while data:
          extractFile.write(data)
          data = file_object.read(4096)

First we read from the file handle we opened to the image. We read 4k of data and then enter a while loop. The while loop says that as long as the read call to file_object returns data, keep reading 4k chunks. When we reach the end of the file our data variable will come back empty and the while loop will stop iterating.

While there is data, the write function on the extractFile handle writes the data we read, then we read the next 4k chunk and iterate through the loop again.
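
If you prefer, the standard library can do the same chunked copy for you. This is a drop-in alternative to the loop above (assuming the dfVFS file object behaves like a normal readable file here), not what the script uses:

import shutil

shutil.copyfileobj(file_object, extractFile, 4096)  # copy in 4k chunks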

Lastly, for good measure, we are going to close the handles to both the file within the image and the file we are writing to on our local disk.

extractFile.close()
file_object.close()

And that's it!

In future posts we are going to access volume shadow copies, take command line options, iterate through multiple partitions and directories, and add a GUI. Lots to do, but we will do it one piece at a time.
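
As a rough preview of the multi-partition idea (my own sketch, and it assumes base_path_spec is the image-level path spec saved before we added the '/p1' layer), you would loop over the identifiers we already collected and build a path spec per partition:

for volume_identifier in sorted(volume_identifiers):
  # build a partition path spec for each partition instead of hardcoding '/p1'
  partition_spec = path_spec_factory.Factory.NewPathSpec(
      definitions.TYPE_INDICATOR_TSK_PARTITION,
      location=u'/{0:s}'.format(volume_identifier),
      parent=base_path_spec)
  mft_spec = path_spec_factory.Factory.NewPathSpec(
      definitions.TYPE_INDICATOR_TSK, location=u'/$MFT',
      parent=partition_spec)
  # ...then open and extract from mft_spec the same way we did above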

You can download this post's code on GitHub: https://github.com/dlcowen/dfirwizard/blob/master/dfvfsWizardv3.py
