LNK Parsing: You’re doing it wrong (I)
Recent developments in the malware arena (MS10-046) have raised awareness of LNK files. Given the nature of the LNKs used by this specific malware, I tried to challenge a few members of the forensic community to question what's publicly known about LNKs, but it seems I failed miserably.
Instead, I've decided to post what I researched nearly 2 years ago. This is what I found...
An introduction no one cares about
What are LNKs really? For a long time there has been little information about them. I remember at least two papers which were quite useful back in the day: one written by fellow researcher Andrés Tarascó, titled Análisis de Estructuras de ficheros LNK ("LNK Files Structure Analysis", it's in Spanish); the other I haven't been able to find online anymore, but I'd say it was an even earlier reference on the actual format. Only recently did Microsoft publish the Microsoft Shell Link Binary File Format.
Microsoft describes it as "a data object that contains information that can be used to access another data object". In other words, what we commonly call a Shortcut. It is a very small file with an "lnk" extension.
At its core the most common type of LNK file has:
- A Header, which tells us what kind of information the LNK contains, along with the timestamps of the file it points to.
- An ITEMIDLIST, which is really an optional field but is present in almost 99% of LNKs.
- A LinkInfo structure, which contains either LocalInfo, telling us where the file can be found on our local system, or a NetworkInfo object, which is present when the file is to be reached over the network (as in, it's not supposed to be on our computer).
LNKs are also quite easy to spot whenever you come across one as they look like this on-disk:
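If you'd rather check this programmatically, here's a minimal Python sketch. It relies on the fixed first 20 bytes that Microsoft's format document defines: a header size of 0x4C followed by the shell link CLSID. The function name `looks_like_lnk` is mine, purely for illustration.

```python
# First 20 bytes of a shell link: HeaderSize (0x0000004C, little-endian)
# followed by the CLSID 00021401-0000-0000-C000-000000000046 as stored on disk.
LNK_MAGIC = bytes.fromhex("4c000000" "0114020000000000c000000000000046")


def looks_like_lnk(data: bytes) -> bool:
    """Return True if the buffer starts with the shell link signature."""
    return data[:len(LNK_MAGIC)] == LNK_MAGIC
```

Feed it the first bytes of any file, for example `looks_like_lnk(open(path, "rb").read(20))`, where `path` is whatever file you want to test.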
Having seen the basic layout of the most common LNKs we're ready for the next step. What we're gonna do is we'll create a sample LNK and do a fast analysis. I won't get too technical, no worries.
I recommend you follow the same steps I do so you understand what's happening under the hood in case you get lost.
Setting up the environment
We'll use Didier Stevens' recently released 010 Editor LNK Template for the 010 Editor program.
- First we'll create 2 files on our desktop called "goforit.txt" and "hax0red.txt"
- We'll make it a bit more dramatic by putting a "duh" as the contents of the first file, and "PWNT!!!!!!1!!11 one one" for the second.
- You know what? We'll set Notepad's default font to Comic Sans 32 too.
Got it? Cool.
Next, we'll create a shortcut to the "goforit.txt" file via Drag with the right-mouse button > Drop > Create Shortcuts here. This will make it get named "Shortcut to goforit.txt.lnk" right? Well, you won't see the extension no matter what, but no worries.
Then we'll open cmd.exe and create a copy of this LNK with a different extension so that we can view it. We'll call it "Shortcut to goforit.txt.lnl" instead.
OK, open 010 Editor and apply the LNK Template to this file. You'll see something like this (minus the fancy colors and maybe some structure name).
This link has a Header with a populated ITEMIDLIST, some LinkInfo which in this case points to a Local File plus some extra information we won't care about. It seems a fairly standard LNK file to me.
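If you don't have 010 Editor at hand, the header alone already tells you which of these structures are present. What follows is only a sketch, assuming the layout in Microsoft's format document: a 76-byte header with the 32-bit LinkFlags field at offset 20. The bit names are the specification's, the function name `link_flags` is mine, and only the low eight bits are decoded.

```python
import struct

# Names of LinkFlags bits 0..7 as used in Microsoft's format document.
LINK_FLAGS = [
    "HasLinkTargetIDList", "HasLinkInfo", "HasName", "HasRelativePath",
    "HasWorkingDir", "HasArguments", "HasIconLocation", "IsUnicode",
]


def link_flags(header: bytes) -> list[str]:
    """Decode the low eight LinkFlags bits of a shell link header."""
    if len(header) < 76:
        raise ValueError("a shell link header is 76 bytes long")
    # LinkFlags sits right after HeaderSize (4 bytes) and the CLSID (16 bytes).
    (flags,) = struct.unpack_from("<I", header, 20)
    return [name for bit, name in enumerate(LINK_FLAGS) if flags & (1 << bit)]
```

A LNK like ours should report at least HasLinkTargetIDList and HasLinkInfo, matching what the template shows.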
Now take your time to parse it with the LNK Parser of your choice. I'm gonna use EnCase's LNK parser as it's what I have available that most of you might have as well. Whatever tool you use you'll probably get the following information, if not less:
Link File: Shortcut to goforit.txt.lnk
Full Path: Case 1\Single Files\Shortcut to goforit.txt.lnk
Offset: 0
Size: 455
File Flags: HASITEMID | ISFILEORFOLDER | HASRELATIVEPATH | HASWORKINGDIRECTORY
File Attributes: ARCHIVE
Show Window Value: SW_NORMAL_WT
Created Date: 09/08/10 22:37:28
Last Written Date: 09/08/10 22:37:43
Last Accessed Date: 09/08/10 22:37:28
Volume Label: C
Media Type: Fixed
Volume Serial: 14 2D 33 67
File Length: 3
Icon File Name:
Command Line:
Base Path: C:\Documents and Settings\nnnn\Desktop\goforit.txt
Application Path:
Working Directory: C:\Documents and Settings\nnnn\Desktop
Share Name:
Mapped Drive Letter:
Description:
Everything looks good, right? Great, this LNK is gonna be our baseline.
What no one told you
Let's focus on the actual data contained in the LNK, and in particular on data redundancy. How many times does the filename "goforit.txt" appear? A whopping 4.
- Twice on the ITEMIDLIST. One is ASCII encoded, the other Unicode.
- Once as the LocalBasePath of the LinkInfo structure.
- Once as the RelativePath
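You can do the counting yourself without a hex editor. This is a rough sketch that simply searches the raw bytes for the name in both encodings; it knows nothing about the LNK structure, so it won't tell you which field each hit belongs to. The function name `count_name` is made up for this example.

```python
def count_name(data: bytes, name: str) -> dict[str, int]:
    """Count raw occurrences of a file name as ASCII and as UTF-16LE bytes."""
    return {
        "ascii": data.count(name.encode("ascii")),
        "utf-16-le": data.count(name.encode("utf-16-le")),
    }
```

Run it as `count_name(open(path, "rb").read(), "goforit.txt")`, with `path` pointing at your copy of the LNK, and add up the two numbers.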
So, do you know where your LNK parser gets the information for the file path it reports? Actually, do you know where it doesn't?
The usual algorithm used to resolve the actual path given a LNK file of a post-mortem system goes like this:
- If the LNK has a Local target, the local path is the LocalBasePath member of the LinkInfo structure.
- If the LNK has a Network target, the network path is the NetworkSharePath member of the LinkInfo structure + RemainingPath, if present. (Note: "+" means to concatenate.)
- If RelativePath and WorkingDirectory are present, WorkingDirectory + RelativePath might give you the filename. (Note: I don't remember seeing this case alone in the wild, but it can happen and it DOES work. And yes, I mean ONLY having these)
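The three rules above can be sketched in a few lines of Python. Take this as an illustration of the usual parser logic, not as any specific tool's code: `ParsedLnk` is a hypothetical container whose field names follow this post, and joining with `ntpath` instead of plain string concatenation is my own choice, made so the path separator is handled.

```python
import ntpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedLnk:
    # Hypothetical container; field names follow the post, not any tool's API.
    local_base_path: Optional[str] = None
    network_share_path: Optional[str] = None
    remaining_path: Optional[str] = None
    working_directory: Optional[str] = None
    relative_path: Optional[str] = None


def resolve_target(lnk: ParsedLnk) -> Optional[str]:
    """Resolve the target path the way the usual algorithm does."""
    if lnk.local_base_path:                      # rule 1: local target
        return lnk.local_base_path
    if lnk.network_share_path:                   # rule 2: network target
        if lnk.remaining_path:
            return ntpath.join(lnk.network_share_path, lnk.remaining_path)
        return lnk.network_share_path
    if lnk.working_directory and lnk.relative_path:   # rule 3: fallback
        joined = ntpath.join(lnk.working_directory, lnk.relative_path)
        return ntpath.normpath(joined)
    return None
```

Notice that nothing in this function ever looks at the ITEMIDLIST.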
This covers 2 of the 4 references we've got. What happens with the data in the ITEMIDLIST? It's skipped.
Why? Well, it seems no one cares about it, probably because it's such an opaque data field. The official Microsoft Shell Link Binary File Format doesn't shed light on how to interpret its contents, nor does any other public source of information about the LNK file format that I know of.
Why care then? After all we're getting the right information from the other fields.
Or are we?
Let's manipulate our LNK. We're gonna change the strings on the ITEMIDLIST member of the LNK file to point to "hax0red.txt" instead of "goforit.txt" and we'll run the resulting LNK.
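We'll start by switching 010 Editor to overwrite mode (the Insert key toggles between insert and overwrite) and modifying the data. The rule is strict: overwrite the existing bytes in place and don't add a single byte if you don't know what you're doing. Conveniently, "goforit.txt" and "hax0red.txt" are both 11 characters long, so the new name fits exactly.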
Now save it, create a copy and change the "lnl" extension to "lnk".
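If hex editing by hand isn't your thing, the same modification can be sketched in Python. This assumes the layout from Microsoft's format document (a 76-byte header, LinkFlags at offset 20, and a 2-byte size followed by the ITEMIDLIST when bit 0 is set) and ASCII-only file names of equal length. The function name `patch_idlist` is mine.

```python
import struct

HEADER_SIZE = 76          # fixed size of the shell link header
HAS_ID_LIST = 0x00000001  # LinkFlags bit 0: an ITEMIDLIST follows the header


def patch_idlist(data: bytes, old: str, new: str) -> bytes:
    """Replace a name inside the ITEMIDLIST only, without changing any size."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    (flags,) = struct.unpack_from("<I", data, 20)
    if not flags & HAS_ID_LIST:
        raise ValueError("this LNK has no ITEMIDLIST")
    (idlist_size,) = struct.unpack_from("<H", data, HEADER_SIZE)
    start = HEADER_SIZE + 2
    end = start + idlist_size
    region = data[start:end]
    # Patch both the ASCII and the Unicode copy of the name.
    for encoding in ("ascii", "utf-16-le"):
        region = region.replace(old.encode(encoding), new.encode(encoding))
    # Everything outside the ITEMIDLIST (LinkInfo, RelativePath...) is untouched.
    return data[:start] + region + data[end:]
```

Because only the ITEMIDLIST region is rewritten, LocalBasePath and RelativePath keep saying "goforit.txt", which is exactly the situation we want to test.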
And run it
Uh oh... Wait wait... This has to be wrong. Let's run it through our LNK parser again...
Link File: Copy of Shortcut to goforit.txt.lnk
Full Path: Case 1\Single Files\Copy of Shortcut to goforit.txt.lnk
[...]
Base Path: C:\Documents and Settings\nnnn\Desktop\goforit.txt
Let's see what EnCase gives us on the Symbolic Link column for the Entry Table.
Now let's check the properties of the LNK on our Live System.
So what's going on here? Well, you just saw it: Windows gives priority to the ITEMIDLIST information when resolving LNKs, so whatever is in the ITEMIDLIST prevails for path resolution no matter what the other data fields say.
At this point you might be thinking:
But it still says "C:\Documents and Settings\nnnn\" on the File Properties and that string certainly isn't on the ITEMIDLIST so it must be getting it from the information I am already fetching with my parser!
Well spotted, Mr. Eagle. But, trust me, it is NOT getting this information from the data fields on the LNK. Yeah, surprise.
Let's do one more test in case you don't believe me. Let's modify both places of the LNK where it says "nnnn" (my test user) to try to make it point to the file "hax0red.txt" from user "kkkk" instead.
Now save it and get its properties.
It is surprising, but there's an explanation for it, and that explanation is exactly what the ITEMIDLIST is about.
The data the ITEMIDLIST contains is not defined anywhere because only whoever handles its data knows how to treat it. It is the whole point of an ITEMIDLIST, to be a neutral storage of sorts.
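What IS defined is the container. Going by Microsoft's format document, an ITEMIDLIST is a sequence of items, each prefixed by a 2-byte size that includes the size field itself, and closed by a 2-byte zero terminator. Here's a small sketch that splits the list into its opaque items and deliberately makes no attempt to interpret them; the function name `split_idlist` is mine.

```python
import struct


def split_idlist(idlist: bytes) -> list[bytes]:
    """Split raw ITEMIDLIST bytes into opaque item payloads."""
    items = []
    offset = 0
    while offset + 2 <= len(idlist):
        (size,) = struct.unpack_from("<H", idlist, offset)
        if size == 0:  # terminator: end of the list
            break
        if size < 2 or offset + size > len(idlist):
            raise ValueError("corrupt ITEMIDLIST")
        # The size counts its own two bytes, so the payload starts after them.
        items.append(idlist[offset + 2:offset + size])
        offset += size
    return items
```

Each returned payload only means something to whichever handler owns it, which is the point of the paragraph above.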
The main handler for LNK files is built so that it can pass the resolve-and-execute duties on to other handlers when one is specified. The LNK we created specifies no handler and is treated as a kind of default case. In that case the filename it refers to is relative to the Desktop of the user currently opening the LNK.
Again, if you don't believe me: copy this LNK file anywhere on the filesystem, log in as another user and click on Properties. In my case I used user "IIIIIIII" and the LNK itself was on the C root. These are my properties:
There's a lot more to the ITEMIDLIST contents which I'll try to cover in detail in one or more blog posts. Of particular interest will be how a different handler is specified and how to interpret data for each of the most common handlers.
Wrapping up
I've shown you which data is actually used by Windows for resolving LNK files. As this data is currently not being parsed by LNK parsing tools, you're not getting the whole picture. Used in an evil way, it's a means of concealing the real target of an LNK from the eyes of the average forensic investigator. I guess you could call it an anti-forensic technique that attacks current tools.
From the half-empty-glass point of view we could put it this way:
Not only have you been looking at half the picture, you've been looking at the bad half thinking it was the good one.
And since (I think we can all agree here) we use automated tools to parse LNK files and never check the LNKs themselves by hand, one by one, we're bound to miss this.
Now, being more realistic:
Q: What does it mean for forensic investigations? Am I getting wrong information?
A: No. The information you get is definitely there, but in my opinion you were giving it more credit than it really deserves. The information used by Windows regarding the actual target resides on the ITEMIDLIST, mostly.
Q: Is it common that users modify the LNK files this way to cover their tracks? Are there tools to automate this?
A: Absolutely not from what I've seen so far. First, it requires knowing it. Second, it requires automation. I have yet to see it in a real case scenario and, as far as I know, there are no tools to automate hiding in plain sight this way.
Q: How does it change what I've been doing so far with LNK files?
A: Not much really. As the information is replicated both in the ITEMIDLIST and the LinkInfo structure, and LinkInfo is the place where you've been getting all the information, you should be safe... mostly. But it's still wrong to assume LinkInfo has the right information as far as resolving goes. What's more, not all LNK files have a LinkInfo structure at all, nor do they need one. That's why some LNKs, especially after Vista, seem to hold no information as far as the average LNK parser is concerned.
So, did you know this happened with LNKs? Do you like the trick? If you have any questions, make sure to drop a comment. Thanks!