CYANLAB

13Aug/100

LNK Parsing: You’re doing it wrong (II)

As we saw in the first article, Windows gives priority to data in the ITEMIDLIST to resolve LNKs whenever there's both an ITEMIDLIST and a LinkInfo structure, so let's focus on the actual format of the ITEMIDLIST.

An ITEMIDLIST is nothing more than a list of contiguous ITEMID/SHITEMID (IID) entries. It is terminated by an empty IID, which is defined as a 0-size IID. On disk it looks like this:

typedef struct _ITEMIDLIST
{
	USHORT size;
	BYTE data[ size ];
} ITEMIDLIST;

ITEMIDLIST.data contains all the contiguous IIDs. For LNK files the IIDs are in order as they have to form a path so that IID #0 will be the first element, IID #1 the 2nd, etc.

Example
Let's say you have the path "C:\TEMP\file.tmp"

C: would be in an IID
TEMP would be in the next IID
file.tmp would be the last IID

Every IID conforms to the following definition:

typedef struct _ITEMID
{
	USHORT size;
	BYTE data[ size - 2 ];
} ITEMID;
typedef ITEMID IID;

Note the difference in size here. The size for each IID accounts for the size member itself as well.

I mentioned the ITEMIDLIST has to be NULL-IID terminated, but it also has to validate on size, as shell32.dll appears to check both the terminating IID and the total size of the list.
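To make the layout concrete, here is a minimal Python sketch of walking ITEMIDLIST.data and cutting it into IIDs. The function name split_idlist is made up for illustration, and the strict rule that the terminator must sit exactly at the end of the list is an assumption about how the size check works, not something confirmed from shell32.dll.

```python
import struct

def split_idlist(data):
    """Split ITEMIDLIST.data into IIDs (each returned with its size WORD)."""
    items = []
    pos = 0
    while True:
        if pos + 2 > len(data):
            raise ValueError("ran off the end before the terminating IID")
        (size,) = struct.unpack_from("<H", data, pos)
        if size == 0:
            break  # empty IID: end of the list
        if size < 2 or pos + size > len(data):
            raise ValueError("IID size does not fit in the list")
        items.append(data[pos:pos + size])
        pos += size
    # Assumption: the terminator has to be the last two bytes of the list.
    if pos + 2 != len(data):
        raise ValueError("terminating IID is not at the end of the list")
    return items
```

Each returned item still carries its own size WORD, so the offsets used later in the article (relative to the start of the IID) can be applied to it directly.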


Having the basic layout mapped, let's go into some more detail.

How to call different namespaces

If you recall from the first article I mentioned that the default behaviour, when no other handler is specified, is to translate the data as if it was relative to the Desktop of the user. That's because the default namespace is CDesktopFolder.

And how do you specify other namespaces? By its CLSID. A CLSID is nothing more than a well-known unique identifier for a specific Windows component.
How do you specify the CLSID? You have to shape the first IID in the list like this (in pseudo-C):

IID.size = (WORD) 0x14
*IID.data = (WORD) TYPE
*(IID.data+2) = (GUID) CLSID

IID.size is 0x14 because it adds up the size member itself (2 bytes), the TYPE (2 bytes) and the CLSID (0x10 bytes). TYPE is specific to each namespace.
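Read back from a file, that first IID could be decoded roughly like this. It's a sketch: parse_clsid_iid is a hypothetical helper name, and the TYPE is returned as a raw number because its meaning depends on the namespace.

```python
import struct
import uuid

def parse_clsid_iid(iid):
    """Decode a CLSID-type IID into (TYPE, '{CLSID}')."""
    size, type_word = struct.unpack_from("<HH", iid, 0)
    if size != 0x14 or len(iid) < 0x14:
        raise ValueError("not a 0x14-byte CLSID IID")
    # bytes_le handles the mixed-endian on-disk GUID layout for us.
    clsid = uuid.UUID(bytes_le=bytes(iid[4:20]))
    return type_word, "{" + str(clsid).upper() + "}"
```

The string that comes out is in the same form the registry uses, so it can be looked up as-is.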

Probably the fastest way to know what a CLSID is about is to check the registry. Get the GUID (get it right!) and look under HKEY_CLASSES_ROOT\CLSID\{CLSID-YOU-JUST-FOUND} just like this:

Registry view of a CLSID

It should point you to the right DLL and sometimes it will be so kind as to give you some information about what it relates to. If you want anything more than this you'll probably have to do reverse engineering.

So anyway, what are the most common CLSIDs? Here's a table of the ones I tend to see on real systems:

Common CLSIDs

{20D04FE0-3AEA-1069-A2D8-08002B30309D} - CDrivesFolder
Base Path: My computer

An absolute path and the most common form of LNK you'll probably find.

{208D2C60-3AEA-1069-A2D7-08002B30309D} - CNetRootFolder
Base Path: My Network Places
It points to network folders which have been saved and is mostly found in "target.lnk" files in the NetHood folder.

{450D8FBA-AD25-11D0-98A8-0800361B1103} - CMyDocsFolder
Base Path: My Documents
For LNKs relative to the My Documents folder. It does not specify the user as it is relative to the current one.

{871C5380-42A0-1069-A2EA-08002B30309D} - Belongs to ieframe.dll
Base Path: The interwebs
It points to URLs using your default browser. It's not that common really. I've seen it mostly in the Start Menu for Programs that add a LNK to their webpage or for items under the NetHood that point to an FTP.
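In a parser, the list above fits naturally in a lookup table. A small sketch follows; the dictionary and the describe_clsid helper are illustrative names, and the descriptions are just the ones given in this article.

```python
# CLSID -> (handler, base path), as listed in this article.
KNOWN_CLSIDS = {
    "{20D04FE0-3AEA-1069-A2D8-08002B30309D}": ("CDrivesFolder", "My Computer"),
    "{208D2C60-3AEA-1069-A2D7-08002B30309D}": ("CNetRootFolder", "My Network Places"),
    "{450D8FBA-AD25-11D0-98A8-0800361B1103}": ("CMyDocsFolder", "My Documents"),
    "{871C5380-42A0-1069-A2EA-08002B30309D}": ("ieframe.dll", "URL"),
}

def describe_clsid(clsid):
    """Return (handler, base path), or None for a CLSID not in the table."""
    key = "{" + clsid.strip().strip("{}").upper() + "}"
    return KNOWN_CLSIDS.get(key)
```

Anything that returns None is a candidate for the registry check described earlier.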

Let's look at some examples. Keep in mind my test system is Windows XP SP3. If you test the same on a Vista/7 system you should see little to no difference in the results, except for the cases described later in this article. If you see something which fails epically in your case, please comment!

You'll see that I'll be pointing to files called "pointedfile.txt" in the examples. It's useful for 2 reasons: first, it's a long file name, which makes Windows assign a separate 8.3 name; second, setting the same baseline allows for easier comparison between different LNK types.

You can download the same samples I'm using right here: Sample LNKs for the article LNK Parsing II

Absolute-path LNKs: My Computer relative

We'll start with an absolute path LNK. I've created a file at G:\pointedfile.txt and created a LNK for it through the right-click menu. This is how it looks under the Template:
(Note: I've colored the ITEMIDs alternately so you can easily tell them apart).

Template view of a standard absolute LNK

This one has 4 IIDs.

  • The first one points to the CLSID of My Computer "{20D04FE0-3AEA-1069-A2D8-08002B30309D}" in GUID form "E0 4F D0 20 - EA 3A - 69 10 - A2 D8 - 08 00 2B 30 30 9D" (the first 3 parts of a GUID are always stored in little-endian form, which is why you see them reversed).
  • The second one holds the drive itself. I've never seen it point to something that is not a drive. You only have to Skip() one byte (it's always 0x2F) and do an ANSI string_read(). Since drive letters can't be Unicode, that's fine.
  • The third one holds the second (and last) element of the path and is the "base case". To extract the name you follow these simple steps. Don't hold your breath
    • offset: See the last 2 bytes of the IID? "1C 00" at file position 203. Read that as little endian (so it's 28).
    • Skip offset (0x1C in this case) from the start of the IID (or offset-2 from data). We are right on "36 00" at byte 151 of the file.
    • check: Read the value 2 bytes from there, so (IID+offset+2) which is (WORD) 0x03. If it's lower than 0x03...
      • exit_8.3: it seems to mean there's no other name or that it is invalid so you take the first filename, which always seems located at 0xC bytes past data, append it to the path you have and skip to the next IID (not our case so we go to the next step)
    • fullname_unicode_search: Check what's at IID+offset+0x10, which in our case is (WORD) 0x14. We are at byte 167 of the file.
      • If it's greater than or equal to what's at (IID+offset), which is (WORD) 0x36, goto exit_8.3 (not our case)
      • If it's smaller than 0x14 goto exit_8.3 (not our case either)
    • Now if (IID+offset+0x10) contains something different than (WORD) 0, which is our case as it contains 0x14, take this value (0x14) and do a Unicode string_read() at IID+offset+0x14, which points exactly to the start of the "pointedfile.txt" string. Then skip to the next IID.
    • fullname_ansi_search: Otherwise check what's at IID+offset+0x12. We'd be at byte 169 of the file.
      • If it's greater than or equal to what's at (IID+offset) goto exit_8.3
      • If it's smaller than 0x14 goto exit_8.3
    • Then if (IID+offset+0x12) contains something different than (WORD) 0, take this value and do an ANSI string_read() at IID+offset+this value, which should point to an ANSI string holding the full name. Then skip to the next IID.
    • If all fails. Shoot yourself.
    • Rinse and repeat until you get to an empty ITEMID.

Not bad at all for only 82 bytes.

So after using the algorithm we get G:\pointedfile.txt. Which is exactly where it points. Great!
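The steps above can be sketched in Python as follows. This is one reading of the walkthrough, not a dump of what shell32.dll does: the helper names are invented, cp1252 for "ANSI" is an assumption, the Unicode string is read at IID+offset+value (0x14 in the sample), and the ANSI branch is only tried when the Unicode WORD is 0.

```python
import struct

def u16(buf, pos):
    """Read a little-endian WORD."""
    return struct.unpack_from("<H", buf, pos)[0]

def read_ansi(buf, pos):
    """NUL-terminated single-byte string (cp1252 is an assumption)."""
    end = buf.find(b"\x00", pos)
    if end == -1:
        end = len(buf)
    return buf[pos:end].decode("cp1252", errors="replace")

def read_utf16(buf, pos):
    """NUL-terminated UTF-16LE string."""
    end = pos
    while end + 2 <= len(buf) and buf[end:end + 2] != b"\x00\x00":
        end += 2
    return buf[pos:end].decode("utf-16-le", errors="replace")

def parse_drive_iid(iid):
    """Second IID: skip the 0x2F byte, then read the drive as ANSI."""
    if iid[2] != 0x2F:
        raise ValueError("not a drive IID")
    return read_ansi(iid, 3)

def parse_name_iid(iid):
    """Base-case IID: prefer the full name, fall back to the 8.3 name."""
    size = u16(iid, 0)
    short_name = read_ansi(iid, 2 + 0x0C)    # exit_8.3: 0xC bytes past data
    offset = u16(iid, size - 2)              # last 2 bytes of the IID
    if offset + 0x14 > size:                 # extra guard, not in the article
        return short_name
    block_size = u16(iid, offset)            # 0x36 in the sample
    if u16(iid, offset + 2) < 0x03:          # check
        return short_name
    uni = u16(iid, offset + 0x10)            # fullname_unicode_search
    if uni != 0:
        if uni >= block_size or uni < 0x14:
            return short_name
        return read_utf16(iid, offset + uni)
    ansi = u16(iid, offset + 0x12)           # fullname_ansi_search
    if ansi != 0:
        if ansi >= block_size or ansi < 0x14:
            return short_name
        return read_ansi(iid, offset + ansi)
    return short_name
```

Concatenating parse_drive_iid() on the second IID with parse_name_iid() on each following IID (with a backslash between folder levels) rebuilds the path.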

Oh! If it wasn't enough...
I forgot to mention that there's an additional check against the base case type IIDs for the byte at data+0x2. I have yet to see a LNK that fails to pass this check, but it doesn't change the fact it's performed:

if ((data[2] & 0x70) != 0x30)

Then it's not a valid ITEMID and won't be parsed. In our case it validates as it's 0x32 and (0x32 & 0x70) = 0x30.
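The mask arithmetic is easy to try out on its own. In this tiny sketch the function name is made up, and it takes the byte value directly rather than reading it from an IID:

```python
def passes_base_case_check(value):
    """True when the byte survives the (value & 0x70) == 0x30 test."""
    return (value & 0x70) == 0x30

# 0x32 is the value seen in the sample; 0x2F is the drive marker byte.
results = {hex(v): passes_base_case_check(v) for v in (0x32, 0x2F)}
```

0x32 keeps bits 0x30 under the 0x70 mask and passes, while 0x2F only keeps 0x20 and is rejected.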

Let's go for the next one.

My Documents relative path

This time I've created a LNK pointing to a file in My Documents which, again, is called pointedfile.txt. This is what it looks like:

Template view of a typical My Documents relative LNK

We only have 3 IIDs here. As it is relative, we have less data for the handler.

  • The first one points to the CLSID of My Documents "{450D8FBA-AD25-11D0-98A8-0800361B1103}" in GUID form "BA 8F 0D 45 - 25 AD - D0 11 - 98 A8 - 08 00 36 1B 11 03"
  • The second one holds the filename. Do you see a pattern here? No? I'll wait
  • ...
  • ...
  • ...
  • ...
  • ...
  • Now? Yep. If you use the same algorithm as in the My Computer case you get the filename! The only difference is that you have to append the My Documents path to it. So I tend to translate them as %MYDOCS%\[rest]

So we got %MYDOCS%\pointedfile.txt easily. Next please!

Intertubes style

This one seems fairly uninteresting, except for an uncommon case that I mention below. I created the samples by going to My Network Places and clicking on Add Network Place. Here's the first one:

Template view of an Intarnets-style LNK

Three (3) or more IIDs

  • One for the handlin'. Holds the CLSID which is "{871C5380-42A0-1069-A2EA-08002B30309D}". You know how it's stored already.
  • Two for the parsin'. This one always seems to be the start of the URL.
    • type: Read a WORD at data and you get a type
    • When type is 0x8061 (this case)
      • Go to IID.data+0x4 and do a Unicode string_read() to get the URL. This type is for straight URLs, which is exactly the example above. You get http://127.0.0.1:8308/ui
    • When type is 0x0361 it holds the protocol, the username, and the hostname. It's the example you have right below.
      • Skip to data+0xC. Read as a FILETIME. It holds the date this LNK was created.
      • hostsize: Skip to data+0x28. Read a DWORD. This is how many bytes in the IID are reserved for the host
      • Skip to data+0x28+0x4 and do an ANSI string_read() for max hostsize bytes. You got the host
      • usersize: Skip to data+0x2c+hostsize. Read a DWORD. This is how many bytes in the IID are reserved for the username.
      • If it's 0x4 you're assured there's no username, so go to the next step. Otherwise skip to data+0x2c+hostsize+0x4 and do an ANSI string_read() for max usersize bytes. There's the username.
      • protosize: Skip to data+0x2c+hostsize+0x4+usersize. Read a DWORD. This is how many bytes in the IID are reserved for the proto.
      • Skip to data+0x2c+hostsize+0x4+usersize+0x4+0x4 (yes, we skip an additional DWORD that always seems to be 0) and do an ANSI string_read() for max protosize bytes. Got the proto.
      • Join it all as proto://user@host/
    • You shouldn't find anything different than this in the second IID
  • The rest should be folder types which you parse like this:
    • ansifolder: Skip to data+0x24 and do an ANSI string_read(). You got the folder in ANSI.
    • boundary: Now you have to calculate the next 4-byte boundary after strlen(ansifolder). If you read 2 chars, your boundary is 4. If you read 3 it's still 4. If you read 4, it's 8 and so on...
    • unicodefolder: Skip to data+0x24+boundary and do a Unicode string_read(). There's the folder name in Unicode, just in case.
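Here is a hedged Python sketch of the second-IID rules above (types 0x8061 and 0x0361). It takes data, that is, the IID without its size WORD, so the offsets match the ones in the list. The function names, the cp1252 decoding and dropping the "user@" part when there is no username are assumptions made for the example.

```python
import struct

def _ansi(buf, pos, limit):
    """ANSI string_read() for at most `limit` bytes (cp1252 assumed)."""
    raw = bytes(buf[pos:pos + limit]).split(b"\x00", 1)[0]
    return raw.decode("cp1252", errors="replace")

def _utf16(buf, pos):
    """NUL-terminated UTF-16LE string."""
    end = pos
    while end + 2 <= len(buf) and buf[end:end + 2] != b"\x00\x00":
        end += 2
    return bytes(buf[pos:end]).decode("utf-16-le", errors="replace")

def parse_url_data(data):
    """Return (url, filetime); filetime is None for straight URLs."""
    (kind,) = struct.unpack_from("<H", data, 0)
    if kind == 0x8061:
        return _utf16(data, 0x4), None
    if kind != 0x0361:
        raise ValueError("unexpected type 0x%04X" % kind)
    (filetime,) = struct.unpack_from("<Q", data, 0x0C)  # raw FILETIME value
    pos = 0x28
    (hostsize,) = struct.unpack_from("<I", data, pos)
    host = _ansi(data, pos + 4, hostsize)
    pos += 4 + hostsize                      # data+0x2c+hostsize
    (usersize,) = struct.unpack_from("<I", data, pos)
    user = "" if usersize == 4 else _ansi(data, pos + 4, usersize)
    pos += 4 + usersize                      # data+0x2c+hostsize+0x4+usersize
    (protosize,) = struct.unpack_from("<I", data, pos)
    proto = _ansi(data, pos + 8, protosize)  # skip the extra zero DWORD
    auth = user + "@" if user else ""
    return "%s://%s%s/" % (proto, auth, host), filetime
```

The FILETIME is returned as the raw 64-bit count of 100-nanosecond intervals since 1601-01-01 and left for the caller to convert.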

What's this 0x0361 subtype and weird folder case I mention? It's exactly this:

Another flavor of the same kind of LNK

This LNK parsed with the above algorithm yields: ftp://user@ftp.microsoft.com/test
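The folder IIDs and the final join can be sketched the same way. Again, this is only an illustration under assumptions: next_boundary, parse_folder_data and join_url are invented names, data is the IID without its size WORD, and cp1252 stands in for "ANSI".

```python
def next_boundary(length):
    """Next multiple of 4 strictly above `length`: 2 -> 4, 3 -> 4, 4 -> 8."""
    return (length // 4 + 1) * 4

def parse_folder_data(data):
    """Return (ansifolder, unicodefolder) from a folder-type IID's data."""
    raw = bytes(data[0x24:]).split(b"\x00", 1)[0]
    ansifolder = raw.decode("cp1252", errors="replace")
    pos = 0x24 + next_boundary(len(raw))
    end = pos
    while end + 2 <= len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    unicodefolder = bytes(data[pos:end]).decode("utf-16-le", errors="replace")
    return ansifolder, unicodefolder

def join_url(base, folders):
    """Append folder names to a base that already ends in '/'."""
    return base + "/".join(folders)
```

With the base from the second IID and a single folder IID named "test", join_url("ftp://user@ftp.microsoft.com/", ["test"]) gives the URL shown above.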

Only four CLSIDs?

Actually not. When I said these 4 are the most common I lied (albeit just a bit). These were the most common before Windows Vista. Now they are just common :)

Vista brought a few changes to the Operating System and LNKs were not an exception. Quite a few operating system components are now accessed through LNKs. Of all the changes there's a significant one which affects how path resolution is done for the My Documents relative LNKs. My Documents is now placed inside the "Libraries" concept and uses a new CLSID.

Here are the new LNK types and their CLSIDs I've seen so far (if I say nothing in the description it means that I haven't seen data being passed to it and that it should only open that functionality):

New CLSIDs added on Vista or higher

{031E4825-7B94-4DC3-B131-E946B44C8DD5} Windows 7 Libraries Relative Path. Which includes My Documents.
Obviously holds data.

{22877A6D-37A1-461A-91B0-DBDA5AAEBC99} Windows 7 Recent Places.

{5399E694-6CE5-4D6C-8FCE-1D8870FDCBA0} Windows Vista Control Panel.
I think it used the TYPE to specify which CPL to load. I have no Vista at hand atm to double check.

{26EE0668-A00A-44D7-9371-BEB064C98683} Windows 7 Control Panel Subsystem.
The component to load is specified as another CLSID-type IID which follows this one. With some components, a 3rd IID with data is included.
It's interesting to note that Windows Vista uses a different CLSID. I haven't researched what changed here.

{2559A1F1-21D7-11D4-BDAF-00C04F60B9F0} Links to the Windows Vista Help and Support.

{2559A1F3-21D7-11D4-BDAF-00C04F60B9F0} Links to the Windows Vista/7 Run.
This one was unexpected.

{2559A1F5-21D7-11D4-BDAF-00C04F60B9F0} Links to the Windows Vista/7 E-mail app

{3080F90D-D7AD-11D9-BD98-0000947B0257} Links to the Windows Vista/7 Show Desktop functionality.

{3080F90E-D7AD-11D9-BD98-0000947B0257} Links to the Windows Vista Windows Switcher.

{4336A54D-038B-4685-AB02-99BB52D3FB8B} Links to files in the Windows Vista Public Folder.
It does wield data. Lots of it.

{59031A47-3F72-44A7-89C5-5595FE6B30EE} Links to the Windows Vista/7 Special Folders.

{645FF040-5081-101B-9F08-00AA002F954E} Links to the Windows Vista Recycle Bin.

{ED228FDF-9EA8-4870-83B1-96B02CFE0D52} Links to the Windows Vista/7 Game Explorer.

{ED7BA470-8E54-465E-825C-99712043E01C} Links to some tasks under the Control panel for Windows 7. Oddly I've only found them in the Volume Shadow Copies :?.

What's next?

The next article in this series will take a look at the last of the common LNKs, the My Network Places (NET SHARE) type. We'll get hands deep on Vista-specific LNK files such as the new MyDocuments relative type and we'll have fun doing some modding to create unusual LNKs.

See you soon!
