Friday, August 2, 2013

Daily Blog #40: Web 2.0 Forensic Part 5

Hello Reader,
                    In the past posts in this series we've focused on what you can recover from web 2.0 sites, how data sits on the disk and how data is transmitted across the network. In this post we talk about what these messages fields mean and how to build a quick carver for them. Tomorrow is Saturday Reading and I will be including a link to today's Forensic Lunch cast which i think was the best so far!

Mail folder summary view versus Mail folder full view:
What I noticed in viewing the data as it went across the network is that there are two distinct types of data streams being sent, at least to chrome. The first being the page of the mailbox you requested which contains the message summaries as well as the message contents themselves. The second being additional pages of the mail folder being viewed where only the message summaries are being sent and cached for faster loading to the user.

The full view is the first page sent and contains data in two sections, the first is the message summary for example here is a message summary for my daily win4n6 mailing list digest:

,["cs","140395ee6229f7d4","140395ee6229f7d4",1,,,1375366638336000,"140395ee6229f7d4",["140395ee6229f7d4"]
,[]
,[]
,[["140395ee6229f7d4",["^all","^i","^smartlabel_group","^unsub"]
]
]
,,,[]
,[["","win4n6@yahoogroups.com"]
,["No Reply","notify-dg-win4n6@yahoogroups.com"]
]
,,,[]
,[]
,,,"Digest Number 1388","[win4n6] Digest Number 1388"]
,

Each section of the inbox view with full messages starts with ["cs" which i'm guessing to mean 'content start' and ends with ,["ce"] as shown below. 
]
,0]
,["ce"]
So we can recover full messages with a regex as simple as 
(\["cs",.+\["ce"\]) 

However this is a greedy expression and may capture multiple messages within it.

Other fields of interest in the header include the message number internally assigned by gmail this can be seen as "140395ee6229f7d4", the message sender "win4n6@yahoogroups.com" and subject ""[win4n6] Digest Number 1388"". 

When the content of the message begins you will see ["ms" which again I can only assume is short for message start as seen below:

["ms","140395ee6229f7d4","",4,"win4n6@yahoogroups.com","","win4n6@yahoogroups.com",1375352053000,"There are 5 messages in this issue. Topics in this digest: 1a. Re: TightVNC F...",["^all","^i","^smartlabel_group","^unsub"]
,0,1,"[win4n6] Digest Number 1388",["140395ee6229f7d4",["win4n6@yahoogroups.com"]
,[]
,[]
If this a mail folder summary view (which I've seen for pages preloaded after the first) this would be the end of content cached and retrievable. If this is the first page of the mail folder then it will be followed with the text of the message itself

,["No Reply \u003cnotify-dg-win4n6@yahoogroups.com\u003e"]
,"[win4n6] Digest Number 1388","There are 5 messages in this issue.\... Huge message digest here removed for readability\n",[[]
,[0]
,"",[]
]
,0,[[]
,[["win4n6","win4n6@yahoogroups.com"]
]
,[]
,[]
,[]
,[]
]
,"Thu, Aug 1, 2013 at 5:14 AM",[]
,1,0,0,0,1,"returns.groups.yahoo.com","yahoogroups.com","","\u003c1375352053.298.19336.m7@yahoogroups.com\u003e","[win4n6] Digest Number 1388","\u003cwin4n6.yahoogroups.com\u003e",,[0]
,,[]
,,0,[0]
,-1,,,[]
,[]
,0,0,1,0,0,,,[]
,,5314,-1]
,,0,"5:14 AM","5:14 am",0,,,"",["en"]
,0,"Thu, Aug 1, 2013 at 5:14 AM",[]
,,,,0,,"win4n6.yahoogroups.com",,0,1,"","win4n6@yahoogroups.com",[[]
,[["win4n6","win4n6@yahoogroups.com"]
]
,[]
,[]
,[]
,[]
]
,-1,,,,"yahoogroups.com",,[]
,[[[2013,7,31,5,37,,0,0]
,,"Wed Jul 31, 2013 5:37 am",0,0,0,0]
,[[2013,7,31,10,6,,0,0]
,,"Wed Jul 31, 2013 10:06 am",0,0,0,1]
,[[2013,7,31,8,28,,0,0]
,,"Wed Jul 31, 2013 8:28 am",0,0,0,3]
,[[2013,7,31,8,42,,0,0]
,,"Wed Jul 31, 2013 8:42 am",0,0,0,4]
,[[2013,7,31,8,50,,0,0]
,,"Wed Jul 31, 2013 8:50 am",0,0,0,6]
]
,0]
,["ce"]
You'll notice there is no matching message end (me) to the message start (ms) as we saw in the cs and ce pairing earlier. Instead the message ends with some index data about the messages in the thread related to this message so it can display them easily and finished with "ce"] again.

For each message retrieved from gmail you'll find these pairings. On Tuesday I'll dig into the javascript that interprets this data to see if we can find more data points for analysis. Until then happy hunting for gmail fragments and I hope you stick around for tomorrow's Saturday reading and Sunday Funday!