Windows Shellbag Forensics
Microsoft Windows uses a set of Registry keys known as “shellbags” to maintain the size, view, icon, and position of a folder when using Explorer. These keys are useful to a forensic investigator. Shellbags persist information for directories even after the directory is removed, which means that they can be used to enumerate past mounted volumes, deleted files, and user actions.
Yuandong Zhu, Pavel Gladyshev, and Joshua James provided a nice overview of the investigative value of shellbags in “Using shellbag information to reconstruct user activities” [pdf]; however, they do not describe how to programmatically access the data. Allan S Hay went into greater detail in his December 2004 document “MiTeC Registry Analyser” [pdf], although he also leaves out a thorough analysis of the format. TZWorks provides an effective closed-source shellbag parser, sbag, but does not explain its algorithm. Yogesh Khatri first described the basic structure of Windows Shell Items in his blog post for 42 LLC entitled Shell BAG Format Analysis. Joachim Metz went on to describe the binary format of the Windows Shell Item structures in great detail in Windows Shell Item format specification [pdf]. This page documents an approach to parsing shellbags in detail and introduces an open-source, cross-platform shellbag parser.
Shellbag locations
Shellbags may be found in a few locations, depending on operating system version and user profile. On a Windows XP system, shellbags may be found under:
HKEY\_USERS\{USERID}\Software\Microsoft\Windows\Shell\
HKEY\_USERS\{USERID}\Software\Microsoft\Windows\ShellNoRoam\
On such a system, the Registry key HKEY\_USERS\{USERID}\ is persisted in the NTUSER.DAT hive file.
On a Windows 7 system, shellbags may be found under:
HKEY\_USERS\{USERID}\Local Settings\Software\Microsoft\Windows\Shell\
Here, the Registry key HKEY\_USERS\{USERID}\ is persisted in the UsrClass.dat hive file.
Shellbag Parsing
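Before opening any keys, a parser needs to know which hive file and key path to look in. The lookup below is a minimal sketch that only restates the locations listed above. The names SHELLBAG_LOCATIONS and shell_key_paths are hypothetical, and the key paths are written relative to the root of the hive file, which is an assumption about how the hive will be opened.

```python
# Hypothetical lookup table. Key paths are relative to the hive root and are
# copied from the locations listed in this article, nothing more.
SHELLBAG_LOCATIONS = {
    "Windows XP": {
        "hive": "NTUSER.DAT",
        "keys": [
            r"Software\Microsoft\Windows\Shell",
            r"Software\Microsoft\Windows\ShellNoRoam",
        ],
    },
    "Windows 7": {
        "hive": "UsrClass.dat",
        "keys": [r"Local Settings\Software\Microsoft\Windows\Shell"],
    },
}


def shell_key_paths(os_name):
    """Return (hive file name, candidate key paths) for a known OS name."""
    entry = SHELLBAG_LOCATIONS[os_name]
    return entry["hive"], list(entry["keys"])
```

A tool that does not know the OS version in advance could simply try every key path in the table and keep the ones that exist.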
Let us begin with the Shell\ key. The Shell\ key does not have any values. Under the Shell\ key are two keys: Shell\Bags\ and Shell\BagMRU\.
FOLDERDATA
Each subkey under Shell\Bags\ is named with an increasing integer starting from one, such as Shell\Bags\1\ or Shell\Bags\2\. Let us call these subkeys FOLDERDATA, since each represents one item viewed in Explorer, and this is usually a folder. FOLDERDATA subkeys do not have any values, but often have subkeys. The most common subkey is Shell\Bags\{Int}\Shell\, but there are a few other possibilities (ComDlg, Desktop, etc.). The subkeys under a FOLDERDATA describe the settings, position, and icon when viewing the folder in Explorer. In particular, a Registry value whose name begins with ItemPos specifies the location of the icons for a given desktop resolution. For example, on my Windows 7 system, the Registry key HKEY\_USERS\{USERID}\Local Settings\Software\Microsoft\Windows\Shell\Bags\6\Shell\{5C4F28B5-F869-4E84-8E60-F11DB97C5CC7} has 12 values that record various configurations. This set includes the value ItemPos1427x820(1), which has type REG_BINARY and length 0x120.
With no tools beyond Regedit (or Regview.py), Windows 8.3 filenames (e.g. MOZILL\~1.LNK) and Unicode filenames (e.g. Mozilla Firefox.lnk) stand out in the raw value. Fortunately, by applying the formats found in Joachim’s paper, more details can be extracted. Throughout this document, I refer to this Registry value type as an ITEMPOS value.
ITEMPOS values
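ITEMPOS values are recognised by their name, so a parser first has to pick them out of a key's value list. The sketch below assumes names shaped like the ItemPos1427x820(1) example; the helper name parse_itempos_name is hypothetical, and the meaning of the trailing number in parentheses is not covered here, so it is returned as an opaque integer.

```python
import re

# Matches names such as "ItemPos1427x820(1)". The "(n)" suffix is treated as
# optional and opaque, since its meaning is not described in this article.
ITEMPOS_NAME = re.compile(r"^ItemPos(\d+)x(\d+)(?:\((\d+)\))?$")


def parse_itempos_name(value_name):
    """Return (width, height, suffix) for an ITEMPOS value name, else None."""
    match = ITEMPOS_NAME.match(value_name)
    if match is None:
        return None
    width, height, suffix = match.groups()
    return int(width), int(height), (int(suffix) if suffix is not None else None)
```

Filtering a key's value names through this function leaves only the ITEMPOS candidates, along with the desktop resolution each one was recorded for.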
The ITEMPOS value’s structure is a list of Windows File Entry Shell Items (SHITEM_FILEENTRY) terminated by an entry whose size field is zero. The list begins at offset 0x10. Each item is preceded by 0x8 bytes whose meaning is unknown. The minimum size of a SHITEM_FILEENTRY structure is 0x15 bytes, so entries whose size field is less than 0x15 should be skipped. The valid SHITEM_FILEENTRY items have the following structure (in pseudo-C / 010 Editor template format):
FILEREFERENCE is a 64-bit MFT file reference structure (48-bit MFT record number, 16-bit MFT sequence number). FILEATTRS is a 16-bit set of flags that specifies attributes such as whether the item is read-only or a system file. Applying this template to the ITEMPOS Registry value, we see there are four list items: one invalid entry and three SHITEM_FILEENTRY items.
    0x000-0x00F, 0x11C-0x11F --> header/footer
    0x010, 0x02C, 0x07A, 0x0DC, 0x114 (8 bytes each) --> unknown padding (item position?)
    0x018-0x02B --> invalid SHITEM_FILEENTRY
    0x034-0x079, 0x082-0x0DB, 0x0E4-0x113 --> SHITEM_FILEENTRY
    0000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0010  15 00 00 00 51 00 00 00 14 00 1F 60 40 F0 5F 64  ....Q......`@._d
    0020  81 50 1B 10 9F 08 00 AA 00 2F 95 4E 15 00 00 00  .P......./.N....
    0030  A0 00 00 00 46 00 3A 00 02 02 00 00 10 3D 0C 8E  ....F.:......=..
    0040  20 00 43 79 67 77 69 6E 2E 6C 6E 6B 00 00 2C 00   .Cygwin.lnk..,.
    0050  03 00 04 00 EF BE 10 3D 0C 8E 10 3D 0C 8E 14 00  .......=...=....
    0060  00 00 43 00 79 00 67 00 77 00 69 00 6E 00 2E 00  ..C.y.g.w.i.n...
    0070  6C 00 6E 00 6B 00 00 00 1A 00 15 00 00 00 02 00  l.n.k...........
    0080  00 00 5A 00 3A 00 42 06 00 00 10 3D 91 7C 20 00  ..Z.:.B....=.| .
    0090  4D 4F 5A 49 4C 4C 7E 31 2E 4C 4E 4B 00 00 3E 00  MOZILL~1.LNK..>.
    00A0  03 00 04 00 EF BE 10 3D 91 7C 10 3D 61 85 14 00  .......=.|.=a...
    00B0  00 00 4D 00 6F 00 7A 00 69 00 6C 00 6C 00 61 00  ..M.o.z.i.l.l.a.
    00C0  20 00 46 00 69 00 72 00 65 00 66 00 6F 00 78 00   .F.i.r.e.f.o.x.
    00D0  2E 00 6C 00 6E 00 6B 00 00 00 1C 00 41 01 00 00  ..l.n.k.....A...
    00E0  51 00 00 00 30 00 31 00 00 00 00 00 10 3D 2C 81  Q...0.1......=,.
    00F0  10 00 4D 49 52 00 1E 00 03 00 04 00 EF BE 10 3D  ..MIR..........=
    0100  B0 80 10 3D A7 8C 14 00 00 00 4D 00 49 00 52 00  ...=......M.I.R.
    0110  00 00 12 00 41 01 00 00 51 00 00 00 00 00 00 00  ....A...Q.......
Taking the first valid entry from offset 0x34, let’s parse out the fields from the binary. The following block visually maps out the relevant bytes, while the table translates each field into a human-readable value.
    0x00-0x01 --> SHITEM_FILEENTRY size
    0x04-0x07 --> filesize
    0x08-0x0B, 0x22-0x25, 0x26-0x29 --> timestamp
    0x0E-0x18, 0x2E-0x43 --> filename
    0000  46 00 3A 00 02 02 00 00 10 3D 0C 8E 20 00 43 79  F.:......=.. .Cy
    0010  67 77 69 6E 2E 6C 6E 6B 00 00 2C 00 03 00 04 00  gwin.lnk..,.....
    0020  EF BE 10 3D 0C 8E 10 3D 0C 8E 14 00 00 00 43 00  ...=...=......C.
    0030  79 00 67 00 77 00 69 00 6E 00 2E 00 6C 00 6E 00  y.g.w.i.n...l.n.
    0040  6B 00 00 00 1A 00                                k.....
Offset | Field | Value |
---|---|---|
0x00 | SHITEM_FILEENTRY size | 0x46 |
0x04 | Filesize | 0x202 |
0x08 | Modified Date | August 16, 2010 at 17:48:24 |
0x0E | 8.3 Filename | Cygwin.lnk |
0x22 | Created Date | August 16, 2010 at 17:48:24 |
0x26 | Accessed Date | August 16, 2010 at 17:48:24 |
0x2E | Unicode Filename | Cygwin.lnk |
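The walk over an ITEMPOS value and the field offsets in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the shellbags.py source: the function names are hypothetical, the timestamps are assumed to be packed 16-bit DOS date and time words, and only the fixed-offset fields (size, filesize, modified date, 8.3 filename) are decoded. The input bytes are the 0x120-byte value from the hex dump above.

```python
import struct
from datetime import datetime

# The 0x120-byte ITEMPOS value, copied from the hex dump in this article.
ITEMPOS_HEX = """
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
15 00 00 00 51 00 00 00 14 00 1F 60 40 F0 5F 64
81 50 1B 10 9F 08 00 AA 00 2F 95 4E 15 00 00 00
A0 00 00 00 46 00 3A 00 02 02 00 00 10 3D 0C 8E
20 00 43 79 67 77 69 6E 2E 6C 6E 6B 00 00 2C 00
03 00 04 00 EF BE 10 3D 0C 8E 10 3D 0C 8E 14 00
00 00 43 00 79 00 67 00 77 00 69 00 6E 00 2E 00
6C 00 6E 00 6B 00 00 00 1A 00 15 00 00 00 02 00
00 00 5A 00 3A 00 42 06 00 00 10 3D 91 7C 20 00
4D 4F 5A 49 4C 4C 7E 31 2E 4C 4E 4B 00 00 3E 00
03 00 04 00 EF BE 10 3D 91 7C 10 3D 61 85 14 00
00 00 4D 00 6F 00 7A 00 69 00 6C 00 6C 00 61 00
20 00 46 00 69 00 72 00 65 00 66 00 6F 00 78 00
2E 00 6C 00 6E 00 6B 00 00 00 1C 00 41 01 00 00
51 00 00 00 30 00 31 00 00 00 00 00 10 3D 2C 81
10 00 4D 49 52 00 1E 00 03 00 04 00 EF BE 10 3D
B0 80 10 3D A7 8C 14 00 00 00 4D 00 49 00 52 00
00 00 12 00 41 01 00 00 51 00 00 00 00 00 00 00
"""


def dos_datetime(date_word, time_word):
    """Decode 16-bit DOS date and time words (raises ValueError if invalid)."""
    return datetime(
        1980 + (date_word >> 9), (date_word >> 5) & 0x0F, date_word & 0x1F,
        time_word >> 11, (time_word >> 5) & 0x3F, (time_word & 0x1F) * 2,
    )


def parse_itempos(blob):
    """Yield one dict per SHITEM_FILEENTRY found in an ITEMPOS value."""
    offset = 0x10                          # the list begins at offset 0x10
    while offset + 10 <= len(blob):
        offset += 8                        # 0x8 bytes of unknown meaning
        (size,) = struct.unpack_from("<H", blob, offset)
        if size == 0:                      # zero-size entry terminates the list
            break
        if size >= 0x15:                   # smaller entries are skipped
            item = blob[offset:offset + size]
            filesize, date_word, time_word = struct.unpack_from("<IHH", item, 0x04)
            name_end = item.index(b"\x00", 0x0E)
            yield {
                "offset": offset,
                "size": size,
                "filesize": filesize,
                "modified": dos_datetime(date_word, time_word),
                "short_name": item[0x0E:name_end].decode("ascii", "replace"),
            }
        offset += size


blob = bytes.fromhex("".join(ITEMPOS_HEX.split()))
for entry in parse_itempos(blob):
    print(hex(entry["offset"]), entry["short_name"], entry["filesize"], entry["modified"])
```

Run against the dump, this reports entries at 0x34 (Cygwin.lnk), 0x82 (MOZILL~1.LNK), and 0xe4 (MIR), and the first entry decodes to a filesize of 514 bytes and a modified time of 2010-08-16 17:48:24, matching the table. The created and accessed dates and the Unicode filename sit in the trailing part of each item and are left out to keep the sketch short.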
BagMRU tree
To recover file paths from shellbags, we’ll need to consider the Registry keys under BagMRU. The subkeys under Shell\BagMRU form a recursive, tree-like structure that mirrors the file system on disk. Shell\BagMRU is the root of the tree. Each subkey is a node representing a folder and, like a folder, may contain child nodes. Yet, unlike (most) folders, the nodes are named as increasing integers from zero. For example, the branch Shell\BagMRU\0 might have the children 0, 1, and 2.
All nodes in this tree have a value named MRUListEx, and many have a value named NodeSlot. NodeSlot is what interests us, as it forms the link between the filesystem tree structure and the FOLDERDATA keys. A NodeSlot value has type REG_DWORD and should be interpreted as a pointer to the FOLDERDATA key with the same name. For example, on my workstation, the key Shell\BagMRU\1\1\3\0 has a NodeSlot value of 144. This means that the FOLDERDATA Shell\Bags\144\ corresponds to a folder with a path of four components. What are they? The components are described by the values at Shell\BagMRU\1, Shell\BagMRU\1\1, Shell\BagMRU\1\1\3, and Shell\BagMRU\1\1\3\0.
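The NodeSlot link can be sketched as a small tree walk. To stay self-contained, the example below models the BagMRU tree as nested dictionaries instead of reading a real hive; the dictionary layout, the function name walk_node_slots, and every NodeSlot number other than 144 are made up for illustration.

```python
# Toy stand-in for the BagMRU tree. Each node has an optional "NodeSlot" and a
# "subkeys" dict. Only the value 144 comes from the article's example; the
# other slot numbers are invented.
BAGMRU = {
    "NodeSlot": 1,
    "subkeys": {
        "1": {"NodeSlot": 20, "subkeys": {
            "1": {"NodeSlot": 37, "subkeys": {
                "3": {"subkeys": {                      # node without NodeSlot
                    "0": {"NodeSlot": 144, "subkeys": {}},
                }},
            }},
        }},
    },
}


def walk_node_slots(node, key_path=r"Shell\BagMRU"):
    """Yield (BagMRU key path, FOLDERDATA key path) for nodes with a NodeSlot."""
    slot = node.get("NodeSlot")
    if slot is not None:
        yield key_path, "Shell\\Bags\\" + str(slot)
    for name, child in node["subkeys"].items():
        yield from walk_node_slots(child, key_path + "\\" + name)


for bagmru_key, folderdata_key in walk_node_slots(BAGMRU):
    print(bagmru_key, "->", folderdata_key)
```

With a real hive, the same recursion would run over Registry keys and read NodeSlot as a REG_DWORD value, but the shape of the walk stays the same.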
SHITEMLIST
In addition to the values MRUListEx and NodeSlot, nodes of the Shell\BagMRU tree have one value for each subkey. The values have the same name as the subkey; since the subkeys are named as increasing integers, so are the values. Each value records metadata about the filesystem path component associated with the subkey. The values have type REG_BINARY and an internal binary structure known as an SHITEMLIST. An SHITEMLIST is formed by contiguous items terminated by an empty item. In practice, though, the SHITEMLIST of a BagMRU node will have two entries: a relevant entry and the empty terminator item. The first word of each SHITEM gives the item’s size.
Joachim’s paper on Windows shell items is the best resource for understanding the variations among SHITEM entries. From a high level, there are at least ten types of items that range from SHITEM_FILEENTRY and SHITEM_FOLDERENTRY to SHITEM_CONTROLPANELENTRY. For each of these types, we can extract at least a path component such as “My Documents” or “\myserver”. Fortunately, most items have type SHITEM_FOLDERENTRY, which provides additional metadata including MAC timestamps. A small number of items do not conform to the known structure, although these do not usually contain any human-readable strings or hints.
Putting it all together
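One building block is worth spelling out first: splitting an SHITEMLIST into its items. The sketch below assumes that the size word is a 16-bit little-endian integer that counts the whole item including the size word itself, which is consistent with the 0x46-byte entry shown earlier. The function name split_shitemlist is hypothetical, and decoding the individual item types is left out.

```python
import struct


def split_shitemlist(blob):
    """Return the raw bytes of each SHITEM, stopping at the empty terminator."""
    items = []
    offset = 0
    while offset + 2 <= len(blob):
        (size,) = struct.unpack_from("<H", blob, offset)
        if size == 0:                      # empty item marks the end of the list
            break
        if size < 2 or offset + size > len(blob):
            raise ValueError("corrupt SHITEM size at offset 0x%x" % offset)
        items.append(blob[offset:offset + size])
        offset += size
    return items
```

For a typical BagMRU value this returns a single item, since the list usually holds one relevant entry followed by the terminator.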
With the SHITEMLIST structure in hand, we now have enough information to comprehensively parse Windows shellbags. To do this, first recurse down the Shell\BagMRU keys while building complete directory paths. At each node, record any available metadata and look up the associated FOLDERDATA. Recall that the FOLDERDATA may indicate some of the items contained by the directory, so record this metadata, too. Finally, format and enjoy!
The following code block lists the algorithm in a Pythonish language for the programmers in the room.
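What follows is a minimal sketch of that recursion under stated assumptions, not the shellbags.py source. It runs over toy dictionaries rather than a real hive: "components" stands for the path component already decoded from the SHITEMLIST value named after each subkey, BAGS stands for items already recovered from each FOLDERDATA, and all names and sample values are hypothetical.

```python
# Toy stand-ins for Registry data; every name and value here is hypothetical.
BAGMRU = {
    "NodeSlot": 1,
    "components": {"0": "My Computer"},    # one decoded value per subkey
    "subkeys": {
        "0": {
            "NodeSlot": 2,
            "components": {"0": "C:"},
            "subkeys": {
                "0": {"NodeSlot": 3, "components": {}, "subkeys": {}},
            },
        },
    },
}
# Items recovered from each FOLDERDATA (e.g. via ITEMPOS), keyed by NodeSlot.
BAGS = {3: ["Cygwin.lnk", "MOZILL~1.LNK"]}


def walk_shellbags(node, bags, path=""):
    """Yield (directory path, NodeSlot, FOLDERDATA items) for each folder."""
    if path:                               # the BagMRU root has no component
        slot = node.get("NodeSlot")
        yield path, slot, bags.get(slot, [])
    for name, child in node["subkeys"].items():
        # The parent holds the value that describes the child's path component.
        component = node["components"][name]
        child_path = path + "\\" + component if path else component
        yield from walk_shellbags(child, bags, child_path)


for folder, slot, items in walk_shellbags(BAGMRU, BAGS):
    print(folder, slot, items)
```

The design choice to carry the path down as an argument means each node can be reported as soon as it is visited, without a second pass to reassemble paths.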
Shellbags.py
Using these concepts, I’ve implemented a cross-platform shellbag parser for Windows XP and greater in the Python programming language. The code is freely available here, so all algorithms and structures are accessible to interested parties. I’ve licensed the code under the Apache 2.0 license, so please feel encouraged to take and improve the routines as you see fit. As a benchmark, shellbags.py tends to identify at least the items returned by the sbag utility, and in some cases returns more.
Shellbags.py accepts the path to a raw Registry hive acquired forensically as a command line argument. To ensure interoperability, output is formatted according to the Bodyfile specification by default. The following block lists a demonstration of me running shellbags.py against a Windows XP NTUSER.DAT Registry hive.
To improve readability, I ran the output through the mactime utility to generate a timeline of activity. The following block lists a portion of this sample.
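As a note on the Bodyfile format itself: each record is a single pipe-delimited line. The sketch below is not taken from shellbags.py and is not its output. It assumes the Sleuth Kit 3.x field order (MD5, name, inode, mode, UID, GID, size, atime, mtime, ctime, crtime), fills fields a shellbag cannot supply with 0, and treats naive datetimes as UTC, which is an assumption about the source timestamps. The function name bodyfile_line is hypothetical.

```python
from datetime import datetime, timezone


def bodyfile_line(name, size, mtime=None, atime=None, ctime=None, crtime=None):
    """Format one record in the assumed TSK 3.x body file field order."""
    def epoch(value):
        if value is None:
            return 0                       # unknown timestamps are written as 0
        # Assumption: naive datetimes are interpreted as UTC.
        return int(value.replace(tzinfo=timezone.utc).timestamp())

    # MD5, name, inode, mode, UID, GID, size
    fields = ["0", name, "0", "0", "0", "0", str(size)]
    fields += [str(epoch(t)) for t in (atime, mtime, ctime, crtime)]
    return "|".join(fields)


print(bodyfile_line("Cygwin.lnk", 0x202, mtime=datetime(2010, 8, 16, 17, 48, 24)))
```

Lines in this shape can be fed to mactime, which sorts them into a timeline by timestamp.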