Monday, March 15, 2010

Tidbits

Windows 7 XPMode
I was doing some Windows 7 testing not long ago, during which I installed a couple of applications in XPMode. The first thing I found was that you actually have to open the XP VPC virtual machine and install the application there; once you're done, the application should appear on your Windows 7 desktop.

What I found in reality is that not all applications installed in XPMode appear to Windows 7. I installed Skype and Google Chrome, and Chrome did not and would not appear on the Windows 7 desktop.

Now, the next step is to examine the Registry...for both the Windows 7 and XPMode sides...to see where artifacts reside.

When it comes to artifacts, there are other issues to consider, as well...particularly when the system you're examining may not have the hardware necessary to run XPMode, so the user may opt for something a bit easier, such as VirtualBox's Seamless Mode (thanks to Claus and the MakeUseOf blog for that link!).

Skype IMBot
Speaking of Skype, the guys at CERT.at have an excellent technical analysis and write-up on the Skype IMBot. Some of what this 'bot does is pretty interesting, in its simplicity...for example, disabling AV through 'net stop' commands.

I thought that the Registry persistence mechanism was pretty interesting, in that it used the ubiquitous Run key and the Image File Execution Options key. As much as I've read about the IFEO key, and used it in demos to show folks how the whole thing worked, I've only seen it used once in the wild.
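If you've got an acquired Software hive handy, checking for this sort of persistence is pretty straightforward. Here's a minimal sketch using Parse::Win32Registry (the same Perl module RegRipper is built on) that simply flags any application subkey under the IFEO key carrying a Debugger value; consider it an illustration rather than a polished plugin:

use strict;
use warnings;
use Parse::Win32Registry;

# ifeo_check.pl - minimal sketch: flag IFEO subkeys that carry a Debugger value
my $hive = shift or die "Usage: ifeo_check.pl <Software hive>\n";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a valid hive: $hive\n";
my $ifeo = $reg->get_root_key
    ->get_subkey('Microsoft\\Windows NT\\CurrentVersion\\Image File Execution Options')
    or die "IFEO key not found\n";

foreach my $app ($ifeo->get_list_of_subkeys) {
    if (my $dbg = $app->get_value('Debugger')) {
        printf "%-28s Debugger = %s  (key LastWrite: %s)\n",
            $app->get_name, $dbg->get_data, scalar gmtime $app->get_timestamp;
    }
}

Keep in mind that there are legitimate uses for the Debugger value (that's how the demos work), so a hit here is a lead to run down, not a verdict.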

The only thing in the report that I really wasn't 100% on board with was on pg 7, where the authors refer to "very simple rootkit behavior"...hiding behavior, yes, but rootkit? Really?

ZBot
I found an update about ZBot over at the MMPC site. I'd actually seen variant #4 before.

Another interesting thing about this malware is something I'd noticed in the past with other malware, particularly Conficker. Even though there are multiple variants, and as each new variant comes out, everyone...victims, IR teams, and AV teams...is in react mode, there's usually something about the malware that remains consistent across the entire family.

In the case of ZBot, one artifact or characteristic that remains persistent across the family is the Registry persistence mechanism; that is, this one writes to the UserInit value. This can be extremely important in helping IT staff and IR teams in locating other infected systems on the network, something that is very often the bane of IR; how to correctly and accurately scope an issue. I mean, which would you rather do...image all 500 systems, or locate the infected ones?

From the MMPC write-up, there appears to be another Registry value (i.e., the UID value mentioned in the write-up) that IT staff can use to identify potentially infected systems.

The reason I mention this is that responders can use this information to look for infected systems across their domain, using reg.exe in a simple batch file. Further, checks of these Registry keys can be added to tools like RegRipper, so that a forensic analyst can quickly...very quickly...check for the possibility of such an infection, either during analysis or even as part of the data in-processing. With respect to RegRipper, there are already plugins available that pull the UserInit value, and it took me about 5 minutes (quite literally) to write one for the UID value.
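On live systems across a domain, the check can be as simple as something like reg query "\\HOST\HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v Userinit, looped over hosts in a batch file. For an acquired Software hive, here's a minimal sketch of the same check via Parse::Win32Registry; the "anything after the trailing comma is suspicious" heuristic is my own assumption, so treat hits as leads:

use strict;
use warnings;
use Parse::Win32Registry;

# userinit_check.pl - minimal sketch of the UserInit check against an
# acquired Software hive
my $hive = shift or die "Usage: userinit_check.pl <Software hive>\n";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a valid hive: $hive\n";
my $key  = $reg->get_root_key
    ->get_subkey('Microsoft\\Windows NT\\CurrentVersion\\Winlogon')
    or die "Winlogon key not found\n";

printf "Winlogon key LastWrite: %s\n", scalar gmtime $key->get_timestamp;
if (my $val = $key->get_value('Userinit')) {
    my $data = $val->get_data;
    print "Userinit = $data\n";
    # the stock value ends in "userinit.exe," with nothing after the comma;
    # anything appended after it deserves a much closer look
    print "** Possible ZBot-style persistence **\n" if ($data =~ /,\s*\S+/);
}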

Millennials
I was talking with a colleague recently about how pervasive technology is today for the younger set. Kids today have access to so much that many of us didn't have when we were growing up...like computers and cell phones. Some of us may use one email application, whereas our kids go through dozens, and sometimes use more than one email, IM, and/or social networking application at a time.

I've also seen where pictures of people have been posted to social networking sites without their knowledge. It seems that with the pervasiveness of technology comes an immediate need for gratification and a complete dismissal of the privacy and rights of others. While some people won't like it when it happens to them, they have no trouble taking pictures of people and posting them to social networking sites without their knowledge or permission. Some of the same people will freely browse social networking sites, making fun of what others have posted...but when it's done to them, suddenly it's creepin' and you have no right.

Combine these attitudes with the massive pervasiveness of technology, and you can see that there's a pretty significant security risk.

From a forensics perspective, I've seen people on two social networking sites, while simultaneously using three IM clients (none of which is AIM) on their computer (and another one on an iTouch), all while texting others from their cell phone. Needless to say, trying to answer a simple question like, "...was any data exfiltrated?" is going to be a bit of a challenge.

Accenture has released research involving these millennials, those younger folks for whom technology is so pervasive, and in many cases, for whom "privacy" means something totally different from what it means to you, me, and even our parents. In many ways, I think that this is something that lots of us need to read. Not too long ago, I noted that when asked how they look for browser activity when examining a system, a number of folks responded that they start by looking at the TypedURLs key...to which I asked, what if IE isn't the default browser? Let's take that a step further...what if the computer isn't the default communications device? Many times LE will try to tie an IP address from log files to a specific device used by a particular person...but now the question is, which device? Not only will someone have a laptop and a cell phone, but now what happens when they tether the devices?

The next time you see some younger folks sitting around in a group, all of them texting other people, or you see someone using a laptop and a cell phone, think about the challenges inherent to answering the most basic questions.

USN Journal
Through a post and some links in the Win4n6 Yahoo Group, I came across some interesting material regarding the NTFS USN Journal file (thanks, Jimmy!). Jimmy pointed to Lance's EnScript for parsing the NTFS change journal; Lance's page points to the MS USN_RECORD structure definition. Thanks to everyone who contributed to the thread.
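Working from that USN_RECORD definition, a bare-bones parser isn't a huge undertaking. The sketch below walks USN_RECORD_V2 entries in an exported $UsnJrnl:$J file, printing the timestamp, Reason flags, and file name for each record; it skips sparse/zeroed fill crudely and does no Reason-flag decoding, so it's a starting point, not a finished tool:

use strict;
use warnings;
use Encode;

# usnparse.pl - minimal sketch: walk USN_RECORD_V2 entries in an exported
# $UsnJrnl:$J file; layout per the MS USN_RECORD structure definition
my $file = shift or die "Usage: usnparse.pl <exported \$J file>\n";
open(my $fh, '<:raw', $file) or die "$file: $!\n";
my $size = -s $file;
my $ofs  = 0;

while ($ofs + 60 <= $size) {
    seek($fh, $ofs, 0);
    read($fh, my $peek, 8) == 8 or last;
    my ($len, $major) = unpack("Vv", $peek);
    # skip over sparse/zeroed fill and anything that isn't a v2 record
    if ($len < 60 || $len > 0x10000 || $major != 2) { $ofs += 8; next; }
    seek($fh, $ofs, 0);
    read($fh, my $rec, $len) == $len or last;
    my (undef, undef, undef, $fref, $pref, $usn, $ft_lo, $ft_hi,
        $reason, $src, $secid, $attr, $fn_len, $fn_ofs)
        = unpack("Vvva8a8a8VVVVVVvv", $rec);
    # FILETIME (100ns ticks since 1601) -> Unix epoch
    my $t = int((($ft_hi * 2**32) + $ft_lo) / 10_000_000) - 11644473600;
    my $name = decode('UTF-16LE', substr($rec, $fn_ofs, $fn_len));
    printf "%s  reason=0x%08x  %s\n", scalar gmtime($t), $reason, $name;
    $ofs += ($len + 7) & ~7;    # records are 8-byte aligned
}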

An important point about the USN Journal...the change journal is not enabled by default on XP, but it is on Vista.

So, why's this important? Well, a lot of times you may find something of value by parsing files specifically associated with the file system, such as the MFT (using analyzeMFT.py, for example).

Another example is that I've found valuable bits in the $LogFile file, both during practice, and while working real exams. I simply export the file and run it through strings or BinText, and in a couple of cases I've found information from files that didn't appear in the 'active' file system.

For more detailed information (re: structures, etc.) about the NTFS file system, check out NTFS.com and the NTFS documentation on SourceForge.

Resources
MFT Analysis/Data Structures
Missing MFT Entry

LNK Files
Speaking of files...Harry Parsonage has an interesting post on Windows shortcut/LNK files, including some information on using the available timestamps (there are nine of them) to make heads or tails of what was happening on the system. Remember that many times, LNK files are created through some action taken by a user, so this analysis may help you build a better picture of user activity.
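As a quick illustration of where some of those timestamps live: in addition to the MAC times on the LNK file itself, the ShellLinkHeader embeds the target's creation, access, and write FILETIMEs at fixed offsets (0x1C, 0x24, and 0x2C, per the MS-SHLLINK documentation). A minimal sketch to pull the embedded three:

use strict;
use warnings;

# lnk_times.pl - minimal sketch: pull the target's three embedded FILETIMEs
# from a shortcut's ShellLinkHeader
my $file = shift or die "Usage: lnk_times.pl <file.lnk>\n";
open(my $fh, '<:raw', $file) or die "$file: $!\n";
read($fh, my $hdr, 76) == 76 or die "Short read - not a shortcut?\n";

my @labels  = qw(Creation Access Write);
my @offsets = (0x1C, 0x24, 0x2C);
for my $i (0 .. 2) {
    my ($lo, $hi) = unpack("VV", substr($hdr, $offsets[$i], 8));
    # FILETIME -> Unix epoch; an all-zero FILETIME will show as a pre-1970 date
    my $t = int((($hi * 2**32) + $lo) / 10_000_000) - 11644473600;
    printf "%-8s : %s UTC\n", $labels[$i], scalar gmtime($t);
}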

Monday, March 01, 2010

Thoughts on posing questions, and sharing

I ran across a question on a list recently that I responded to when I saw it, but as time has passed, I've reconsidered my response somewhat.

The question I saw had to do with RegRipper, specifically my thoughts on meeting the needs of the community and creating new plugins. Basically, all I've ever asked for in that regard is a concise description of the need or issue, and a sample hive file. The person asking the question wanted to know if I seriously expected folks to provide hive files from live cases. My initial reaction was no, there are other ways to provide the necessary data, such as setting up a test environment, replicating the issue, and sending me that hive file. However, I began to reconsider that response...if someone doesn't really know the difference between a Registry key and a value, and they have a question, how would they go about crafting the question? Once they do that, how would they go about discerning the responses they received, and figuring out which applied to what they were working on?

Seriously, there are a lot of things out there that require specific use of language, and specificity of language can be somewhat lacking in our community.

Taking that a step further, one of the problems I've seen for a number of years is that some questions that need to be asked simply don't get asked, because people in the community don't want to share information; apparently, "sharing information" has a number of different connotations. Some folks don't want it publicly known that they don't know something...even if asking the question means that they'll end up knowing the answer. I've seen this before...I didn't want to ask the question, because I didn't want to look dumb. To that, my response is usually along the lines of, so you don't ask the question, and we overcharge the customer for an inferior deliverable, our billing rate drops, AND you don't know the answer for the next time you need it. Really...which situation really makes you look dumb? Another one I see is that some folks don't ask questions publicly because they just don't want others to know that they had to ask...to which I usually suggest that if they had asked the question, they'd then know the answer, obviating the issue altogether.

Others apparently don't ask questions because they're afraid that they'll have to give up sensitive information...information about a case that they're working on, etc. I understand that folks working CP cases don't want that stuff out...and to be honest, I don't either. I do want to help...and sometimes, due to the "cop-nerd language barrier", the best and fastest way to help is to get the actual Registry hive or Event Log file. And guess what? Hive files don't (usually) contain graphics.

Like many folks, my desire to help comes from just that...a desire to help. If my helping makes it easier for an LE to be prepared to address the Trojan Defense, or better yet, to do so in a manner that gets a plea agreement, then that's good. I do NOT want to see the images, and I and others can help without seeing them.

Another issue is that some folks don't ask questions because they don't know enough about the situation to ask the question. This can be a particular issue in digital forensics, because there are certain things that really make a difference in how the respondent answers...such as, the file system, or even the version of the operating system. NTFS is different from FAT is different from ext2/3, and Windows XP has a number of differences from Windows 2000, as well as Vista.

Here's an example...some folks will ask questions such as, "how do I tell when a file was first created on a system?", without really realizing that the system in question, and perhaps even the document type, can greatly affect the answer. So sometimes the initial question is asked, but there may not be any response to (repeated) requests for clarification to the original question.

Does the version of Windows really matter, generally speaking? When you're dealing with any kind of IR or forensic analysis, the answer is most often going to be "yes".

So the big question is, if you have a question, do you want an answer to it? Are you willing to provide the necessary information such that someone can provide a succinct response? I know some folks who will not even attempt to answer a question that requires an encyclopedic answer.

Before we go on, let me say that I completely understand and agree that we can't know everything. No one of us can know it all...that's where there's strength in a community of sharing. There's no way that you're going to know everything you need to know for every exam...there are going to be things that we don't remember (maybe from a training course a couple of years ago, or something you read once...), and there are going to be things that we just don't know.

So what can we, as a community, do? Well, one way to look at it is that the question I have...well, someone else in the room or on the board may have the same question; they may not know it yet. So if that question gets asked, then others will be able to see the answers and then ask the next question, expanding that information. The point is that no one of us is as smart as all of us together.

Find someone you can trust, someone you're willing to share information with. If you need to, establish an NDA. Have community meetings in local areas. If you don't feel comfortable sharing with some folks because you don't know them...get to know them.

The other option is that you learn to do it yourself...and that's not always going to work. You may spend 8 months examining MacOSX systems, and suddenly have to examine a Windows 7 system. What're you going to do then? Sure, spending all weekend gettin' giggy wit' Google will likely net (no pun intended) you something, but at what point do you reach overload?

Over the years, I've met a number of folks with skills and abilities for which I have a great deal of respect, and some of those I've reached out to for assistance when I've needed it. Conversely, I've done my best to respond to those folks who've reached out to me with questions regarding areas I'm specifically interested in.

Anyway, I'll bring this rambling to a close...

Addendum: Sometimes a really good place to start with questions is to seek answers at the ForensicsWiki. This is also a good place to post the answers once you get them.

Tuesday, February 23, 2010

More on AV write-ups

Okay, okay...the title of this post isn't the greatest, I get it...but no pun intended. Anyway, I left it as is because (a) I couldn't think of anything witty, and (b) it is kinda funny.

Anyway, on with the show...

I was looking at an issue recently and I came across the following statement in a malware write-up:

It also creates a hidden user account named "HelpAssistant" and creates the following hidden folder: C:\Documents and Settings\HELPASSISTANT

Hhhmmm. Okay, so an artifact for an infected system is this hidden user account...interesting. So I go to a command prompt on my Windows XP box and type net user, and I get a list of user accounts on my system, one of which is HelpAssistant. Wow. So does that mean I'm infected?

Well, the next thing I do is export the Registry hive files from my system and hit the SAM hive with the samparse.pl RegRipper plugin, and I see:

Username : HelpAssistant [1000]
Full Name : Remote Desktop Help Assistant Account
User Comment : Account for Providing Remote Assistance
Account Created : Mon Aug 7 20:23:36 2006 Z
Last Login Date : Never
Pwd Reset Date : Mon Aug 7 20:23:36 2006 Z
Pwd Fail Date : Never
Login Count : 0
--> Password does not expire
--> Account Disabled
--> Normal user account

Okay, so this is a normal user account, never logged in, and apparently created in 2006. I think that this is interesting, because I installed this system in Sept 2009. It appears that this is a default account that's created with certain values already set.

Now, a little research tells me that this is an account used for Remote Assistance. If that's the case, does malware create or take over the account? It's possible, with the appropriate privileges, to use the API (or the net user commands) to delete and then create the account. To see if this is what happened, you may be able to find some information in the Event Log (assuming the proper auditing is enabled...) having to do with account deletion/creation. Another analysis technique is to examine the RID on the account as RIDs are assigned sequentially, and to check the unallocated space within the SAM hive (using regslack) to see if the original key for the HelpAssistant account was deleted.
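To eyeball the RID sequence in an acquired SAM hive, a few lines of Perl with Parse::Win32Registry will do; this rough sketch just lists account RIDs alongside the corresponding key LastWrite times, and leaves the "does anything look out of order?" judgment to the analyst:

use strict;
use warnings;
use Parse::Win32Registry;

# rid_list.pl - rough sketch: list account RIDs and key LastWrite times
# from an acquired SAM hive; gaps or out-of-order LastWrite times may
# merit a closer look
my $reg = Parse::Win32Registry->new(shift) or die "Not a valid hive file\n";
my $users = $reg->get_root_key->get_subkey('SAM\\Domains\\Account\\Users')
    or die "Users key not found\n";

foreach my $k (sort { hex($a->get_name) <=> hex($b->get_name) }
               grep { $_->get_name =~ /^[0-9A-F]+$/i }   # skip the Names key
               $users->get_list_of_subkeys) {
    printf "RID %4d  key LastWrite: %s\n",
        hex($k->get_name), scalar gmtime $k->get_timestamp;
}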

What about this hidden thing? Well, as the write-up never states how the account is hidden, one thing to consider is that the fact that it's hidden is part of normal system behavior. That's right...Windows has a special Registry key that tells it to hide user accounts from view on the Welcome screen, essentially making those accounts hidden. Win32/Starter and Win32/Ursnif both take advantage of this key.
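For reference, the key generally named in those write-ups is the SpecialAccounts\UserList key under Winlogon; an account named in a value there with DWORD data of 0 doesn't appear on the Welcome screen. A minimal sketch to dump it from a Software hive:

use strict;
use warnings;
use Parse::Win32Registry;

# hidden_users.pl - minimal sketch: dump the Winlogon SpecialAccounts\UserList
# key; a value with DWORD data 0 hides that account from the Welcome screen
my $reg = Parse::Win32Registry->new(shift) or die "Not a valid hive file\n";
my $key = $reg->get_root_key->get_subkey(
    'Microsoft\\Windows NT\\CurrentVersion\\Winlogon\\SpecialAccounts\\UserList')
    or die "UserList key not found\n";

foreach my $val ($key->get_list_of_values) {
    printf "%-20s = %d%s\n", $val->get_name, $val->get_data,
        $val->get_data == 0 ? "  <-- hidden from Welcome screen" : "";
}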

This is just another example of how AV write-ups can be incomplete and misleading, and how responders and analysts should not allow themselves to be misled by the information provided in these write-ups.

Researching Artifacts

One of the things I really like about this industry is that there's always something new...a new challenge, a new twist to old questions, etc. This is fun, because I like approaching these issues in novel ways.

Here's an example; I recently found this article discussing an issue with web cams on laptops issued to high school students having been allegedly turned on remotely and used to monitor students in their homes. More and more laptops are available with built-in web cams, and web cams are relatively inexpensive. How long before there are stalking cases or civil suits in which the victim's web cam is enabled? The "Trojan Defense" (ie, the malware did it, not me) has been around for a while, so how long before we can expect to see other devices (web cams, in particular) being recognized as a source for illicit images, or somehow involved in other issues or crimes? Not long afterward, we're going to hear, "hey, I didn't do it...it was the virus."

So the novel approach comes in when you start to consider, what are the artifacts of the use of a web cam on a system? How do you tell if a web cam (or any other device) has been used, and more importantly, how do you address attribution? Was it the local user that started the web cam, was it malware, or was the web cam activated remotely by a legitimate user (or, activated remotely by someone with access to a legitimate user account)?

So what happens when this sort of issue lands on an analyst's desk? This may be an example of one of those new, we-haven't-seen-this-kind-of-thing-before issues. There very likely isn't a public repository of data, artifacts, and analysis plans somewhere, is there? Maybe there's a private one, but how does that help folks who don't have access to it, particularly if it's only accessible by a very small group of individuals? Where do folks go to start developing answers to questions like those in the previous paragraph, and once they determine those answers, what do they then do with the information? Is it available to the next analyst who runs into this sort of thing, or do we have to start all over again?

There's a good deal of research that goes on in a number of areas within the IR/DF community...file carving, for example. However, a lot of new issues that land on an analyst's desk are just that...new. New issue, new device, new operating system. Most of us are intimately familiar with the fact that the automated analysis approach we used on XP systems was, in some cases, broken when we got our first Vista system in for analysis. Oh, and hey...guess what? Windows 7 is out...in some ways, we need to start all over again.

So what happens when something new...some new issue, operating system, or application...comes out? Sometimes, someone puts forth the effort to conduct analysis and document the process and the findings, and then make that available, like what was done with Limewire examinations, for example.

Speaking of artifacts, I've posted before about browser stuff to look at beyond the traditional TypedURLs key and index.dat files. Well, as it happens, there appears to be data indicating that it's not so much the browser itself that's being targeted...it's the stuff running in support of the browser. Brian Krebs posted recently about BLADE (no, not this Blade); the point of the post is that it isn't the browser that's the issue, it's what's running behind the scenes: the plugins, the add-ons, etc.

Consider this...someone gets an email or IM with a link to a PDF or other file format, and they click on it. Their default browser is opened, but it isn't the browser that's popped...it's the old/unpatched version of Adobe Reader (or some other unpatched add-on) that results in the system being compromised. Ultimately, a compromise like this could lead to significant losses. So while there will be artifacts in the browser history, this tells us that we need to look beyond that artifact if we're going to attribute an incident to the correct root cause; finding the first artifact and attributing the issue to a browser drive-by may not be correct, and in the long run, may hurt both your employer's reputation, and most certainly your customer. What happens if your customer reads your report and updates or changes the browser used throughout their infrastructure, only to get hit again?

IT firm loses...a lot!

I caught a very interesting post on Brian Krebs' site this morning...you'll find it here.

As an incident responder, the first thing that caught my eye was:

Since the incident, he has conducted numerous scans with a variety of anti-virus and anti-malware products – which he said turned up no sign of malicious software.

Ouch! When I read things like that, I hope that it's not all (nor the first thing) that was done, and that it's a gross, over-simplification of the summation of response activities. Most times, though, it isn't.

I've read Brian's stuff for years, and I think that he's done a great job of bringing some very technical issues into the public eye without subjecting them to the glitz and hoopla that you see in shows like CSI. For example, while Brian mentioned some specific malware that could have been involved, he also made a very clear statement at the beginning of a paragraph that it has not been confirmed that this or any other malware had been involved. I think that's very important when presenting these kinds of stories.

So, look at the situation...the IT firm had a dedicated system with extra protective measures that was used to perform online banking. Even with those measures in place (I did some research on biometric devices back in 2001, and they don't provide the level of protection one would think), a bank official "...said the bank told him that whoever initiated the bogus transaction did so from another Internet address in New Hampshire, and successfully answered two of his secret questions."

I think that Brian's story is a very good illustration of what many of us see in the response community.

Malware may have been associated with what happened, but no one knows for sure. Many of us have been on-site, working with victims, and AV scans can't find anything, but the victims were clearly (and we later determine it to be true) subject to some sort of malware infection. It's interesting how an AV scan won't find anything, but check a few Registry keys and you start to find all sorts of interesting things.

Many of the "protection measures" that folks have in place are easily circumvented, or worse, lead the victims themselves to not consider that as an avenue of infection or compromise, because of the fact that they do have that "protection".

Finally, if malware was involved in this situation, it's a great illustration of how attacks are becoming smarter...for example, rather than logging keystrokes, as pointed out in the article, the malware will read the contents of the form fields; when it comes to online banking and some of the protective measures that have been put in place, this approach makes sense.

Friday, February 19, 2010

Fun Analysis Stuff

Event Log Analysis
Here's another one for all of you out there doing Event Log Analysis. I installed Office 2007 (ie, version 12) on an XP system, and now I have two new .evt files...Microsoft Office Sessions and Microsoft Office Diagnostics. The Microsoft Office Sessions Event Log really seems promising...most of the events are ID 7000 or 7003 (session terminated unexpectedly). The ID 7000 events include how long the session was up, and how long it was active. While the event record doesn't appear to include a specific username or SID, this information can be correlated to Registry data...UserAssist, RecentDocs, application MRUs, etc...to tie the session to a specific user.

As we've seen before, Event Log records can be very useful...sorting them based on record number may show us that the system clock had been manipulated in some way; they can also show activity on a system during a specific time frame.

Timeline Analysis
Speaking of Event Log records, an interesting and useful way to determine if the system clock had been set back is to sort Event Log records by event record number and observe the times...for each sequential record number, does the generated time for the record increment accordingly?
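This check automates nicely. The sketch below assumes you've already dumped each record's number and generated time to simple "recnum|time_generated" lines (a hypothetical export from whatever EVT parser you use, with times as Unix epoch values), pre-sorted on record number; it then flags any point where the generated time runs backward:

use strict;
use warnings;

# clockcheck.pl - minimal sketch: flag generated times that run backward
# as record numbers climb; input assumed sorted on record number
my ($prev_num, $prev_time);
while (<>) {
    chomp;
    my ($num, $time) = split /\|/;
    if (defined $prev_time && $time < $prev_time) {
        printf "Possible clock change: record %d at %s follows record %d at %s\n",
            $num, scalar gmtime($time), $prev_num, scalar gmtime($prev_time);
    }
    ($prev_num, $prev_time) = ($num, $time);
}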

Another way to check for this (on XP) via the Event Log is to look for event ID 520, with a source of "Security". This event indicates the system time was successfully changed, and includes such information as the PID and name of the process responsible for the change, as well as the old system time (prior to the change) and the new time. An excellent resource and example of this is Lance's first practical.

Now, does event ID 520 necessarily mean that the user changed the system time? By itself, no, it doesn't. In fact, if you create a timeline using the image Lance provided in his first practical, incorporating the Event Logs, you'll see where event ID 520 is in close association with an event ID 35, with a source of W32Time...the system time was automatically updated by the W32Time service! You'll also find a number of instances where the system time was updated via other means. I'll leave it as an exercise for the reader to determine those other means.

An interesting side-effect of creating a timeline using multiple sources is that it provides us with event context. Historically, timelines have consisted of primarily file system metadata, and as such, did not give us a particularly clear picture of what was going on on the system when, say, a file was accessed or modified. Who was logged in, and from where? Was a user logged in? Was someone logged in via RDP? Was the file change a result of someone running a remote management tool such as PSExec, or perhaps due to something else entirely?

Devices
It's been a while since Cory Altheide and I published a paper on tracking USB removable storage devices on Windows systems. Recently, Cory asked me about web cams, and I started looking around to see what I could find out about these devices. As you might think, Windows stores a LOT of information about devices that have been connected to it...and with USB ports, and so many devices coming with USB cables, it just makes sense to connect them to your computer for updates, etc.

Now you may be wondering...who cares? Someone has a web cam...so what? Well, if you're law enforcement, you might be interested to know if a web cam, or a digital camera, or a cell phone...pretty much anything capable of taking or storing pictures...had been connected to the system. Or if there's an issue of communications, and you know the application (Skype, NetMeeting, etc.), then knowing that there was a web cam attached might be of interest. I'm thinking that having device information would be useful when dealing with pictures (EXIF data), as well as looking at different aspects of the use of applications such as Skype...did the user send info as an IM, or via video chat, etc.?
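As a first pass at the "what's been plugged in?" question, you can walk the Enum\USB key in the System hive. This rough sketch prints the DeviceDesc for each device instance; note the assumption that ControlSet001 is the current control set...on a real exam, verify that via the Select key's Current value:

use strict;
use warnings;
use Parse::Win32Registry;

# usb_devs.pl - rough sketch: walk ControlSet001\Enum\USB in a System hive
# and print each device instance's DeviceDesc, as a first pass at spotting
# web cams, cameras, and the like
my $reg = Parse::Win32Registry->new(shift) or die "Not a valid hive file\n";
my $usb = $reg->get_root_key->get_subkey('ControlSet001\\Enum\\USB')
    or die "Enum\\USB key not found\n";

foreach my $dev ($usb->get_list_of_subkeys) {        # VID_xxxx&PID_xxxx keys
    foreach my $inst ($dev->get_list_of_subkeys) {   # instance/serial number
        my $desc = $inst->get_value('DeviceDesc');
        printf "%-24s %-28s %s\n", $dev->get_name, $inst->get_name,
            $desc ? $desc->get_data : '';
    }
}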

Interestingly, I have access to a Dell Latitude with a built-in web cam, and I took a couple of pictures with the software native to Windows XP...the pictures were placed in the "All Users" profile.

Speaking of taking pictures, got nannycam? Microsoft PowerToys for XP includes a Webcam Timershot application.

Resources
If you don't have a copy of the paper that Cory and I wrote, there's another one available here.

Addendum: Volume Shadow Copies
Much as with System Restore Points, you can't say enough about accessing files in Volume Shadow Copies...I'm sure that a lot of it bears repeating. Continually. Like from the sausage factory.

Wednesday, February 17, 2010

More Links, and a Thanks

A Good Example (aka, The Need for Training)
I was reading Brian Krebs' blog the other morning, and found an interesting story...interesting in light of a lot of the reports that have hit the streets recently regarding IR and forensics response.

What stood out in the article was:

...discovered on Feb. 5 that the computer used by their firm’s controller was behaving oddly and would not respond. The company’s computer technician scoured the system with multiple security tools...

...and...

The following Monday, Feb. 8, United Shortline received a call from the Tinker Federal Credit Union at Tinker Air Force Base in Oklahoma, inquiring about a suspicious funds transfer...

Sound familiar? Outside third-party notification of the issue, untrained staff responding to the incident, stomping on everything (i.e., scouring...with multiple security tools)...not too far off from what some of the reports have been illustrating, and what many have seen on their own. Oh, yeah, and the bad guys got away with about half the money.

And that's not all. Brian also posted on an incident at the city of Norfolk, VA (terminal23 picked up on the story, as well, from Brian); this one also has some prime quotes. A city official was quoted as saying, "We speculate it could have been a ‘time bomb’..."...while the investigation was still on-going, a relatively high-up official is speculating. It appears that the primary indicator thus far is that files were deleted from the system32 directories on what may be as many as 784 PCs and laptops.

There appear to be some indications that the malware infection...keeping in mind that there doesn't seem to be anything definitive...originated from a print server:

However, an exact copy of the malware on that server may never be recovered, as city computer technicians quickly isolated and rebuilt the offending print server.

Look, I understand what folks are going to say..."hindsight is 20/20". However, if anything, these stories should be good examples of what NOT to do when faced with an incident. I know that a lot of folks would say, it's easier to just rebuild the system...but look what happens when you use that approach. And when you rebuild the system and have no idea how the incident occurred, how do you prevent it from happening in the future? It appears from the article that law enforcement was contacted...but why, and to what end? I understand that someone wants this treated as a crime, but it's generally not helpful if your house is burglarized to burn the house down and then call the police.

But again...this is nothing new. Folks who respond to incidents and try to assist customers on an emergency basis have been saying this for years...be prepared, because it's not if you get hit by an incident, it's when. I completely understand that you can't know everything, but EMTs are prepared for most types of incidents they would encounter, right? In the case of most victim organizations, though, the issue is not so much that they got hit, but how they reacted.

What could they have done better? After all, one shouldn't point out deficiencies without offering up a solution, right? Well, going back to the article on Tinker Federal...a lot of folks would look at me and say, "hey, at the time there was nothing to justify the time and expense of imaging the system." Okay...I track with you on that...but there's more to IR than imaging systems. How about having a tool set in place to collect specific information, and gather selected files? Make it automated so that it works quickly and efficiently, every time. That way, if you need it, you have something...if you don't need it, fine.

It's a matter of education and training. If you need an example of what to look to for training, try EMTs or the military. EMTs have a defined triage process for assessing victims. The military has immediate actions...stuff we learned to do because the time would come when you needed that particular skill, but you don't have time to look it up in a book or on Wikipedia. That time would be at 3am one morning, after you'd been without sleep for over 2 days, had little food, and needed a particular skill that could save your life, and the lives of your fellow service members.

Rootkit Detection via BSoD
Seems there's a bit of malware that had been detected due to a BSoD after an update. Apparently, after applying the MS10-015 patch, some folks were experiencing BSoDs...at least in some cases, this had to do with Tidserv on their system; the malware had infected drivers, and those drivers made calls to invalid RVA addresses.

Symantec - Backdoor.Tidserv!inf
MS calls it Win32/Alureon; this one is a lot of fun...according to MS, it hides itself (think rootkit) by intercepting file system driver I/O packets.

The really good news is that, according to MS, the code has already been updated to no longer use hard-coded RVAs...which means if your system gets (re) infected, you're not likely to be able to use a BSoD to detect what your AV missed...

MMPC
A recent posting regarding Pushbot over on the MMPC made for some good reading. I picked up several interesting things from just this one posting; code reuse, passing around what works, perhaps some sloppy setup (leaving non-functioning code in the program...although that may be a red herring). The fact that the base code is open and shared, and used/modified by a number of different folks really highlights to me why the good guys and victims always seem to be playing catch up.

The MMPC posting says that the purpose of malware like this is to infect as many systems as possible. On a large scale, you might expect this malware to be somewhat noisy, but on a smaller scale, not so much...infect a system, reach out to an IRC server, spread via some mechanism. There won't be a lot in the way of re-infections. Oh, but wait...some variants appear to write a value to the ubiquitous Run key...nice! Looks like you've got an enterprise-wide detection mechanism available to any domain admin with a copy of Notepad!

What really isn't stated explicitly in the posting is even more telling. While this malware is described as being "old school", it still uses multiple propagation mechanisms. This has apparently made it enough of an issue that MS has added Pushbot to the MRT.

Thanks
Now and again, I get thank yous from readers of my books and/or blog, mostly directly into my inbox. I greatly appreciate the time folks take to reach out and say these things. I also sometimes get a comment or TY that I'd like to share with others, in part because it helps validate some of the things I do or put out there.

I received the following recently from Lt Chris Taylor of the City of Richmond, Indiana Police Dept (posted here with his permission):

I also want to thank you for your contributions to the field of Digital Forensics. Between your book, your blog, and the information you provide on the various list serves I subscribe to, the info and tools you’ve provided have shaved countless hours off of processing cases, making me more efficient as an examiner. Thanks again!

Thanks, LT! I greatly appreciate your words!

Monday, February 15, 2010

Links Plus

I've spent a lot of space in this blog talking about timeline analysis lately, so I wanted to take something of a different tack for a post or two...mostly to let the idea of timeline analysis marinate a bit, and let folks digest that particular analysis technique.

PDF Forensics
Didier Stevens has provided a fantastic resource and tools for analyzing PDF files...so much so, that some have been incorporated into VirusTotal. Ray Yepes has provided an excellent article for locating MYD files, MySQL database files used by Adobe Organizer that maintain information about PDF files that have been accessed. Congrats, Ray, on some excellent work!

Web Browser Forensics
When most folks think "web browser forensics", they think cache and cookie files. I also mentioned some other browser stuff that might be of interest...in particular bookmarks and favorites, as well as some other tidbits. Bringing even more to the game, Harry Parsonage has put together an excellent resource describing web browser session restore forensics (woany released a tool inspired by the paper). Here's some additional value add to Harry's information, from the sausage factory.

Associated with web browser forensics, Jeff Hamm has written an excellent paper (all of the papers are excellent!) regarding Google Toolbar Search Artifacts. Jeff also has a paper available regarding the Adobe Photoshop Album Cache File.

Resources
Woany also has other tools...woanware...available for parsing other data that may be associated with web browser forensics, as well as data from other sources. Some of the other interesting tools include ForensicUserInfo and RegExtract.

NirSoft provides a number of excellent utilities for password recovery, etc. If you're analyzing an acquired image, you may need to boot the image with LiveView and login to run some of the tools.

JADSoftware has several excellent tools, including a couple of free ones. Even the ones that aren't free are definitely worth the purchase price, particularly if you're doing the kind of work that requires you to look at these areas a lot.

Activity
Now and again, I see a posting to a forum or receive an email, and the basic question is, how do I determine if there was activity on a system during a specific time period?

The historical approach to answering this type of question is to look at the file system metadata, and see if there are any file creation, access, or modification times during the window in question. However, this presents us with a couple of challenges. In Vista, MS disabled updating of file last access times by default...it's the out-of-the-box behavior now, not something an administrator has to go set. Then what happens if we're looking for activity on a system from a couple of weeks or months ago? File system metadata will show us the most recent changes to the system, but much of that may be relatively close to our current time and not give us a view into what may have happened during the time window we're interested in.

However, we have more than just file system metadata available to us to answer this type of question (I know...we're circling back to timeline analysis now...):

MFT Analysis: Generate a timeline based on $FILE_NAME attribute timestamps. Chapter 12 of Brian Carrier's File System Forensic Analysis book contains a good deal of information relating to these timestamps.

Event Log Analysis: Generate a timeline based on EVT/EVTX file entries. For EVT records, don't rely on just those in the system32\config\*.evt files; see if there's any indication of records being backed up, and also check the pagefile and unallocated space. All you may need to demonstrate activity during a time window is the event record header information anyway.

Log Files: Windows systems maintain a number of log files that record activity. For example, there's the Task Scheduler log file (SchedLgu.txt), setupapi.log, mrt.log, etc. If you're looking at a Windows XP system, System Restore Points each have an rp.log file that states when the Restore Point was created, as well as the reason for the creation, giving you more than just "something happened on this day". Also, look for application logs, particularly AV application logs...some AV applications may also write to the Application Event Log, in addition to maintaining their own log files.

File Metadata: Lots of applications maintain time-stamped information embedded within the structure of the files they interact with; for example, application Prefetch files on XP and Vista. Also, Scheduled Task/.job files. Office documents are also widely known for maintaining a staggering amount of very useful metadata.

Registry Analysis: Ah...the Registry. In some cases, time-stamped information is maintained as Registry key LastWrite times, but there is also considerable information maintained in binary value data, as well. The system-wide hives...SAM, Software, System, and Security...will maintain some useful information (LastShutdownTime, etc.), but you may find more valuable information in the user's NTUSER.DAT and USRCLASS.DAT hives. Also, don't forget that you may also find considerable information in the unallocated space within hive files! Specifically, when keys are deleted, their LastWrite time is updated to reflect when they were deleted, providing what may be some very valuable information.

Of course, when we're talking about Registry hives, we also have to keep in mind that we may have hive files available in either XP System Restore Points, or within Volume Shadow Copies.

In short, if you need to determine if there was activity on a system during a particular window, and perhaps relate that activity to a particular user account, there are a number of data sources available to you. This type of question lends itself very well to timeline analysis, too.
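As a quick illustration, once those sources have been normalized into a single timeline, the "was there activity between X and Y?" question takes just a few lines to answer. This sketch assumes a pipe-delimited time|source|host|user|description layout with Unix epoch times...the delimiter is an assumption on my part, so adjust to whatever your timeline tools emit:

use strict;
use warnings;

# tl_window.pl - minimal sketch: pull events from a pipe-delimited
# "time|source|host|user|description" timeline that fall within a window;
# start/end are passed as Unix epoch times to keep the sketch simple
my ($start, $end, $file) = @ARGV;
die "Usage: tl_window.pl <start_epoch> <end_epoch> <timeline file>\n"
    unless defined $file;

open(my $fh, '<', $file) or die "$file: $!\n";
while (<$fh>) {
    chomp;
    # limit of 5 keeps any pipes in the description intact
    my ($t, $src, $host, $user, $desc) = split /\|/, $_, 5;
    next unless $t >= $start && $t <= $end;
    printf "%s  %-6s %-12s %-12s %s\n", scalar gmtime($t), $src, $host, $user, $desc;
}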

Monday, February 08, 2010

MFT Analysis

As an aside to timeline analysis, I've been considering the relative confidence levels inherent to certain data sources, something I had discussed with Cory. One of the things we'd discussed was the relative confidence level of file system metadata, specifically the timestamps in the $STANDARD_INFORMATION attribute versus those in the $FILE_NAME attribute. Brian Carrier addresses some specifics along these lines in chapter 12 of his File System Forensic Analysis book.

So, I've been looking at the output of tools like Mark Menz's MFTRipper and David Kovar's analyzeMFT.py tools. Based on the information in Brian's book and my chat with Cory, it occurred to me that quite a bit of analysis could be done automatically, using just the MFT and one of the two tools. One thing that could be done is to compare the timestamps in both attributes, as a means of possibly detecting the use of anti-forensics, similar to what Lance described here.

Another thing that could be done is to parse the output of the tools and build a bodyfile using the timestamps from the $FILE_NAME attribute only. However, this would require rebuilding the directory paths from just what's available in the MFT...that is, record numbers, and file references that include the parent record number for the file or folder. That's the part that I got working tonight...I rebuilt the directory paths from the output of David's tool...from there, it's a trivial matter to employ the same code with Mark's tool. And actually, that's the hardest part of the code...the rest is simply extracting timestamps and translating them, as necessary.
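For the curious, the core of the path-rebuilding logic fits in a few lines. This sketch assumes one line per MFT record in a hypothetical "recnum,parentnum,name" form (massaged from the analyzeMFT.py output, with the 16-bit sequence number already masked off the parent reference), and walks the parent chain for each record:

use strict;
use warnings;

# mft_paths.pl - sketch of the path-rebuilding idea; input is one line per
# MFT record as "recnum,parentnum,name" (a real tool would need a delimiter
# that can't appear in file names)
my %ent;
while (<>) {
    chomp;
    my ($num, $parent, $name) = split /,/, $_, 3;
    $ent{$num} = [ $parent, $name ];
}

foreach my $num (sort { $a <=> $b } keys %ent) {
    my (@path, %seen);
    my $cur = $num;
    while (defined $cur && exists $ent{$cur} && !$seen{$cur}++) {
        last if $cur == 5;               # MFT record 5 is the root directory
        unshift @path, $ent{$cur}[1];
        $cur = $ent{$cur}[0];
    }
    print '\\', join('\\', @path), "\n";
}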

Also, I didn't want to miss mentioning that there's a tool for performing temporal analysis of the MFTRipper output from Mark McKinnon over at RedWolf Computer Forensics. I haven't tried it yet, but Mark's stuff is always promising.

Timeline Analysis...do we need a standard?

Perhaps more appropriately, does anyone want a standard, specifically when it comes to the output format?

Almost a year ago, I came up with a 5-field output format for timeline data. I was looking for something to 'define' events, given the number of data sources on a system. I also needed to include the possibility of using data sources from other systems, outside of the system being examined, such as firewalls, IDS, routers, etc.

Events within a timeline can be concisely described using the following five fields:

Time - A timeline is based on times, so I put this field first, as the timeline is sorted on this field. Now, Windows systems have a number of time formats...the 32-bit Unix time_t format, 64-bit FILETIME objects, and the 128-bit SYSTEMTIME format. The FILETIME object has granularity to 100 nanoseconds, and the SYSTEMTIME structure has granularity to the millisecond...but is either really necessary? I opted to settle on the Unix time_t format, as the other times could be easily reduced to that format, without losing significant granularity.

Source - This is the source from which the timeline data originates. For example, using TSK's fls.exe allows the analyst to compile file system metadata. If the analyst parses the MFT using MFTRipper or analyzeMFT, she still has file system metadata. The source remains the same, even though the method of obtaining the data may vary...and as such, should be documented in the analyst's case notes.

Sources can include Event Logs (EVT or EVTX), the Registry (REG), etc. I had thought about restricting this to 8-12 characters...again, the source of the data is independent of the extraction method.

Host - This is the host or name of the system from which the data originated. I included this field, as I considered being able to compile a single timeline using data from multiple systems, and even including network devices, such as firewalls, IDS, etc. This can be extremely helpful in pulling together a timeline for something like SQL injection, including logs from the web server, artifacts from the database server, and data from other systems that had been connected to.

Now, when including other systems, differences in clocks (offsets, skew, etc.) need to be taken into account and dealt with prior to entering the data into the timeline; again, this should be thoroughly documented in the analyst's case notes.

Host ID information can come in a variety of forms...MAC address, IP address, system/NETBios name, DNS name, etc. In a timeline, it's possible to create a legend with a key value or identifier, and have the timeline generation tools automatically translate all of the various identifiers to the key value.

This field can be set to a suitable length (25 characters?) to contain the longest identifier.

User - This is the user associated with the event. In many cases, this may be empty; for example, consider file system or Prefetch file metadata - neither is associated with a specific user. However, for Registry data extracted from the NTUSER.DAT or USRCLASS.DAT hives, the analyst will need to ensure that the user is identified, whereas this field is auto-populated by my tools that parse the Event Logs (.evt files).

Much like the Host field, users can be identified in a variety of means...SID, username, domain\username, email address, chat ID, etc. This field can also have a legend, allowing the analyst to convert all of the various values to a single key identifier.

Usually, a SID will be the longest method of referring to a user, and as such would likely be the maximum length for this field.

Description - This is something of a free-form, variable length field, including enough information to concisely describe the event in question. For Event Log records, I tend to include the event source and ID (so that it can be easily researched on EventID.net), as well as the event message strings.
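Pulling the five fields together, a single event might look something like this (pipe-delimited purely for illustration...tabs or another delimiter would serve just as well):

1266192000|EVT|WEBSRV01|S-1-5-18|Security/520;The system time was changed...

The first field sorts, the middle three give you source, system, and user context at a glance, and everything else rides in the description.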

Now, for expansion, there may need to be a number of additional, perhaps optional fields. One is a means for grouping individual events into a super-event or a duration event, such as in the case of a search or AV scan. How this identifier is structured still needs to be defined; it can consist of an identifier in an additional column, or it may consist of some other structure.

Another possible optional field can be a notes field of some kind. For example, event records from EVT or EVTX files can be confusing; adding additional information from EventID.net or other credible sources may add context to the timeline, particularly if multiple examiners will be reviewing the final timeline data.

This format allows for flexibility in storing and processing timeline data. For example, I currently use flat ASCII text files for my timelines, as do others. Don has mentioned using Highlighter as a means for analyzing an ASCII text timeline. However, this does not preclude using a database rather than flat text files; in fact, as the amount of data grows and as visualization methods are developed for timelines, using a database may become the standard for storing timeline data.

It is my hope that keeping the structure of the timeline data simple and well-defined will assist in expanding the use of timeline creation and analysis. The structure defined in this post is independent of the raw data itself, as well as the means by which the data is extracted. Further, structure is independent of the storage means, be it a flat ASCII text file, a spreadsheet or a database. I hope that those of us performing timeline analysis can settle/agree upon a common structure for the data; from there, we can move on to visualization methods.

What are your thoughts?