Wednesday, December 12, 2018

Daily Blog #565: Seeing Double (access dates)

Hello Reader,
         Got some medicine today so hopefully I'll be able to stop coughing tomorrow. In the meantime I'd like to point you to some very interesting work Maxim Suhanov is doing. You can read the tweet thread here:

https://twitter.com/errno_fail/status/1073012513187479553


Maxim found that Windows is keeping two last access dates, one on the disk and one in memory for a single file if Last Access dates are enabled. In the below python script you can see he can actually see the contents of both version of the timestamp:


#!/usr/bin/env python3

import os
from time import sleep

FILE_PATH = 'ts.txt'

def get_atime_1():
 result = os.stat(FILE_PATH, follow_symlinks = False)
 return result.st_atime

def get_atime_2():
 for entry in os.scandir('.'):
  if entry.name.lower() == FILE_PATH.lower():
   return entry.stat().st_atime

#print('Starting up...')
#sleep(15)
print(get_atime_1(), get_atime_2())

https://gist.github.com/msuhanov/74fd3c795883e0491277e6e27f2434e3

That to me is fascinating, it looks like one entry is coming from the stat of the file itself while the other is coming from the directory index. This is going to become more testing material in the near future. 

16 comments:

  1. I think it's reasonably well established that last access time is cached to avoid excessive writes to disk, and flushed to disk only at certain points in time or file/volume processing. Some of those are almost certainly related to cache management -- if cache gets full, for example, or cache contents gets too old (>= 1 hour). Others are related to file management, such as closing the file by a process.

    This finding could perhaps be interpreted that one of the syscalls does populate st_atime from that cache, while the other does not. This seems possible, perhaps likely, to be unintentional.

    ReplyDelete
    Replies
    1. This comment has been removed by a blog administrator.

      Delete
  2. If this is a cache-related issue only, then two different timestamps won't survive a reboot. But they do, sometimes. And since sometimes the "new" timestamp doesn't survive a reboot, there is a cache (which may be flushed or discarded during a reboot). Thus, there are two issues: a cached timestamp and a second, different, timestamp on a disk.

    ReplyDelete
    Replies
    1. 'Sometimes' is odd. I would like to see some explanation for that -- or perhaps a way to ensure that whatever the explanation is, it's not a methodological mistake.

      Delete
    2. So, I have confirmed that there could be different last access timestamps for the same file in the $MFT and $I30 structures on a disk. In Windows 10 (October 2018 Update), this could be easily reproduced. In Windows 7 SP1, this happens "sometimes" (and it's very hard for me to reproduce the issue, but I copied the $MFT and $I30 files from a successful test yesterday, so I'm sure; when this issue can't be reproduced, this means that I see the same value of the last access timestamp both in $MFT and $I30 structures on a disk after a reboot).

      Delete
  3. > I think it's reasonably well established that last access time is cached to avoid excessive writes to disk...

    No, it's not. Let me quote ForensicsWiki, an old version of an article (was there before 2015): "On NTFS, access time has a resolution of 1 hour".

    ReplyDelete
    Replies
    1. And, I presume, it cited the source for that conclusion? (I suspect the source to be Microsoft's documentation of the FILETIME structure).

      However, it can't be correct as it stands. If the 'resolution' is one hour, last access time stamps would be exactly one hour apart. And they're not.

      I suspect this to be a misinterpretation of another Microsoft source, this time titled 'File Times' that says, among other things: "The NTFS file system delays updates to the last access time for a file by up to 1 hour after the last access."

      That is, unfortunately, somewhat ambiguous, as it doesn't say what 'access time' is meant. My personal interpretation is that is the on-disk datum.

      Many readers seem to miss the '_up_to_ 1 hour'. The statement does not exclude faster update, only sets an upper limit to them.

      Those statements, to me, point to a cache of last access timestamps (at least).

      We should add yet another Microsoft statement (from the same source): "The only guarantee about a file time stamp is
      that the file time is correctly reflected when the handle that makes the
      change is closed." Again, ambiguity as to what time stamp, but I'm fairly certain on-disk time stamps are intended. (And again we may raise the question to what extent write caching may make that statement incorrect.) However, I may be wrong: perhaps it's a time stamp returned by some unspecified API call.

      Dave Kleiman once posted a rather informal test that he said he had performed many times to show the truth of the '1 hour resolution' idea, but I never got it to work in the way he expected. So to me '1-hour resolution' does not exist. Perhaps that's also something that may need to be tested further.



      Delete
    2. > My personal interpretation is that is the on-disk datum.

      This is also supported by this piece of documentation: https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/fsutil-behavior

      It reads:

      "File-based Last Access Time queries are accurate even if all on-disk values are not current. NTFS returns the correct value on queries because the accurate value is stored in memory."

      > Those statements, to me, point to a cache of last access timestamps (at least).

      And this is true, I have seen this cache in the NTFS driver taken from Windows 10 (I call it a shadow last access timestamp, because it's stored in a separate in-memory field of a structure).

      But, in Windows 10, last access updates can be hold for a period of time that is actually longer than an hour. In a recent test, it took two hours for the update to hit the disk (and there were no handles opened for that file, at least for that long).

      Delete
    3. > But, in Windows 10, last access updates can be hold for a period of time that is actually longer than an hour.

      While other updates to the $MFT data (new file records in my tests) can be seen almost immediately. So, this is not a global cache for disk writes.

      Delete
    4. I probably misunderstand but ... How was that two-hour test performed if there were no file handles created?

      Delete
    5. Create a file, write something to it, close it, then open it again, close it, record that the last access timestamp has been updated to the current time. Use TSK to print the last access timestamp from on-disk metadata in a loop. Wait until the timestamp on a disk is updated. The result is: I waited for 2 hours. The file wasn't opened anywhere (by me) for this waiting period.

      Delete
    6. Fix: during this waiting period*

      Delete
    7. That loop -- was it tight, or did it sleep for fairly long periods? Say, print, then sleep for a few seconds? (just paranoia: might the loop affect NTFS time stamp flushing behaviour by keeping the disk busy?)

      Otherwise, seems solid, ... provided that nothing unknown happened.

      I wonder if there's a way to remove or reduce that 'unknown' component. Check the USN Journal? or use DiskMon? Process Monitor?

      Delete
    8. There was a sleep call to wait for 10 seconds. Even if this could keep the disk busy and affect the results, such results are still valid (constant disk I/O isn't uncommon on desktop systems).

      Delete
    9. The main value of this finding seems to be in situations of live acquiry, where one method of obtaining last access time stamp gives a 'better' time stamp than we can get by other methods.

      So the next question is probably 'how much better?' You have one (many?) test that suggests two hours better. Is that the improvement we can expect? Or is it an outlier? At this point, it probably becomes a question of quantifying 'better', by performing repeated tests and seeing how they vary. And also checking if this is behaviour that is applicable to additional Windows releases.

      But last access time is generally not relied on, just because just about anything changes it. So ... does this finding increase that lack of reliability? (It looks so to me, just now.) Or does it help remove it? Either would be a useful result, but the former may not need very deep additional investigation once the mechanism has been established as repeatable.

      I'm dropping off here: thanks for reporting an interesting find. I look forward to further tests results, but probably only as a lurker.

      Delete
  4. I'm with Maxim on this one. There appears to be more than just a write cache, needs more testing to know what exactly.

    ReplyDelete