Rafael, on Jan 31 2005, 20:17, said:
ZoRoNaX has more I'm sure ;)
Good job Rafael!
And yes I do have more :P , here is the information I have gathered during my research:
File Layout:
<wim header>
<file data>
<file definition + image directory lists>
<wim configuration file>
Rafael, on Jan 30 2005, 23:41, said:
WIM HEADER:
8 bytes: Signature/WIM version(?)
x bytes: ?
8 bytes: Signature MSWIM\0\0\0
4 bytes: Header length (including signature length)
4 bytes: Unknown
2 bytes: Compression On/Off Flag ?
- 0x1: Uncompressed file data
- 0x2: Does mean something too...
- 0x3: Compressed file data (Probably 0x1 OR 0x2, i.e. both bits set)
2 bytes: Compression Type Flag ?
- 0x0: If no compression is set (if Compression On/Off Flag = 0x1)
- 0x1: MSZIP (deprecated?)
- 0x2: LZNT
- 0x4: LZX
4 bytes: Unknown (hex: 00 80 00 00), If no compression is set hex: 00 00 00 00
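For anyone who wants to poke at this, here is a rough Python sketch of reading the fixed header fields listed above. It assumes the fields start at the MSWIM signature, are packed without padding, and are little-endian; none of that is confirmed, and the names (parse_wim_header, the dictionary keys) are made up for illustration.

```python
import struct

# Assumed layout: 8-byte signature, 4-byte header length, 4 unknown bytes,
# 2-byte compression flag, 2-byte compression type, 4 unknown bytes.
# Little-endian ("<") is a guess, not something verified.
HEADER_FIXED = struct.Struct("<8sIIHHI")

# Compression type values as listed in the post.
COMPRESSION_TYPES = {0x0: "none", 0x1: "MSZIP", 0x2: "LZNT", 0x4: "LZX"}


def parse_wim_header(buf):
    """Decode the fixed fields at the start of the header (hypothetical helper)."""
    (signature, header_length, unknown1,
     comp_flag, comp_type, unknown2) = HEADER_FIXED.unpack_from(buf, 0)
    if signature != b"MSWIM\0\0\0":
        raise ValueError("signature is not MSWIM")
    return {
        "header_length": header_length,
        "unknown1": unknown1,
        "compression_flag": comp_flag,
        # 0x3 is the value seen for compressed images, 0x1 for uncompressed.
        "compressed": comp_flag == 0x3,
        "compression_type": COMPRESSION_TYPES.get(comp_type, "unknown"),
        "unknown2": unknown2,
    }
```

The section entries that follow these fields are not touched here.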
The remaining bytes of the header specify the different sections in the WIM file. As far as I have seen, all image files contain 3 section entries here: the first one for the File Lookup Table, the second one for the XML Configuration file and the third one for the image to load while booting. If no boot image is specified, this entry is completely zero filled. Each entry here is 24 bytes in size.
7 bytes: Size / If the file is compressed, it's the compressed file size.
1 byte: Flag/Type (XML Configuration = 0x2, File Lookup Table = 0x2, Boot Image List = 0x6)
8 bytes: Offset in image (from the beginning)
8 bytes: Uncompressed Size
It could also be that the Size and Flag/Type fields are both 4 bytes instead of 7 bytes and 1 byte, but I don't know.
If you look at the flag for the Boot Image, it's 0x6. This is probably 0x2 OR 0x4, so I think that if flag 0x4 is set, the Offset points to the Image List for the boot image. For the XML Configuration and the File Lookup Table sections, the Offset points to those sections themselves.
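A possible way to decode the three 24-byte section entries, as a sketch only. It assumes the 7 + 1 byte split (not the 4 + 4 alternative), little-endian integers, and the entry order described above; the function and key names are invented for the example.

```python
ENTRY_SIZE = 24
# Order as observed in the post: lookup table, XML configuration, boot image.
SECTION_NAMES = ("lookup_table", "xml_config", "boot_image")


def parse_section_entry(raw):
    """Decode one 24-byte section entry (assumed little-endian)."""
    if len(raw) != ENTRY_SIZE:
        raise ValueError("a section entry must be 24 bytes")
    if not any(raw):
        return None  # zero-filled entry, e.g. no boot image specified
    return {
        "size": int.from_bytes(raw[0:7], "little"),   # compressed size if compressed
        "flags": raw[7],                               # 0x2 or 0x6 in observed files
        "offset": int.from_bytes(raw[8:16], "little"),
        "uncompressed_size": int.from_bytes(raw[16:24], "little"),
    }


def parse_section_entries(buf):
    """Decode the three entries from the bytes that follow the fixed header fields."""
    entries = {}
    for index, name in enumerate(SECTION_NAMES):
        chunk = buf[index * ENTRY_SIZE:(index + 1) * ENTRY_SIZE]
        entries[name] = parse_section_entry(chunk)
    return entries
```

The caller has to pass in the bytes starting at the first section entry, since where exactly they start inside the header is not pinned down here.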
Rafael, on Jan 30 2005, 23:41, said:
File/Folder Data:
* Unknown how starting point is determined
* Three formats: Uncompressed, LZNT1, LZX (MSZIP)
* If uncompressed, data terminates with 0xFF 0xFE (?)
* If compressed (LZX), data terminates with CAB header signature (0x4D534346)
* File Data locations are stored in the File Lookup Table
* Three formats: Uncompressed, LZNT1, LZX (In the old version, 4008 and 4015, it was MSZIP)
* Compression type is stored in the File Lookup Table entry for the file.
Rafael, on Jan 30 2005, 23:41, said:
8 bytes = file byte length (or real file length?)
(max = 18,446,744,073,709,551,615 bytes)
8 bytes = file start offset (verified)
4 bytes = 0x00000001 (unknown)
4 bytes = 0x00000001 (unknown)
8 bytes = 0x00000004 (real length? or in wim length?)
20 bytes = file SHA-160 hash (verified)
1 byte = 0x00 (separator? doubt it.)
* Afterwards: 0x00, could be 0x02 (end of files?)
* Proceeding bytes = Unknown at this time
File Lookup Table:
The File Lookup Table contains one entry for each piece of file data. Note that multiple files in the Directory list(s) can point to the same File Data (so they have the same contents).
Every Entry in this table is 52 bytes long:
7 bytes: Compressed file length (if compressed) otherwise Uncompressed file length
1 byte: Type Flag
- 0x0: Not compressed
- 0x2: Image Listing
- 0x4: Compressed
8 bytes: Offset in image
8 bytes: Uncompressed File Length
4 bytes: File Identifier
4 bytes: Unknown
20 bytes: SHA-1 hash of the file data (used for determining if files are different)
It could also be that the Size and Type fields are both 4 bytes instead of 7 bytes and 1 byte, but I don't know.
If Type Flag 0x2 is set, the SHA-1 hash is zero filled.
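Here is a rough sketch of walking the File Lookup Table with the 52-byte entry layout above. Same caveats as before: little-endian and the 7 + 1 byte split are assumptions, and iter_lookup_entries and its key names are hypothetical.

```python
LOOKUP_ENTRY_SIZE = 52  # 7 + 1 + 8 + 8 + 4 + 4 + 20


def iter_lookup_entries(table):
    """Yield one dict per 52-byte entry in the raw (uncompressed) table bytes."""
    if len(table) % LOOKUP_ENTRY_SIZE:
        raise ValueError("table length is not a multiple of 52")
    for pos in range(0, len(table), LOOKUP_ENTRY_SIZE):
        raw = table[pos:pos + LOOKUP_ENTRY_SIZE]
        type_flag = raw[7]
        yield {
            "stored_size": int.from_bytes(raw[0:7], "little"),
            "type_flag": type_flag,
            "is_image_listing": bool(type_flag & 0x2),
            "is_compressed": bool(type_flag & 0x4),
            "offset": int.from_bytes(raw[8:16], "little"),
            "uncompressed_size": int.from_bytes(raw[16:24], "little"),
            "file_id": int.from_bytes(raw[24:28], "little"),
            "unknown": raw[28:32],
            "sha1": raw[32:52].hex(),  # all zeros when flag 0x2 is set
        }
```

Building a dictionary keyed on file_id from this would give the lookup used by the directory listings.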
Rafael, on Jan 30 2005, 23:41, said:
File/Folder Data Definition Schtuff(?):
* Under assumption already uncompressed
[Image]
RootDrive = #
[Security]
"0x0"= #
[x1]
"(1)" = (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (next string if exists)
+ RootDrive - 0 or 1 integer - Designates whether or not we're working on the root drive?
+ "0x0" - 365-ish character long security descriptor - Unknown purpose
+ "(1)" - filename string - Name of the file we're describing here
Arguments for above:
(2) - File/Folder flag? (hex) (boolean)
(3) - DOS 8.3 Compatible Filename (string)
(4) - Unknown (hex) (0x20?)
(5) - Unknown (hex) (timestamp data?)
(6) - Unknown (hex) (timestamp data?)
(7) - Unknown (hex) (timestamp data?)
(8) - Unknown (hex) (timestamp data?)
(9) - Unknown (hex) (timestamp data?)
(10) - Unknown (hex) (timestamp data?)
(11) - Unknown (hex) (always 0x00?)
(12) - Unknown (hex) (?)
(13) - Unknown (hex) (?)
* CRLF follows with 0x00 if no more file strings
Image List:
The list is just a plain INI file (compressed) which contains the files and directories and the references to the file data. If no compression is being used, it looks like this file has another format.
General INI Format:
[SectionName]
Name=Value
"Long Name"=Value
The RootDrive attribute in the [Image] section tells which Directory Listing Section to apply to the root of the drive on which to apply the image.
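To play with the listing once it is decompressed, a small hand-rolled reader for the general INI layout above could look like this. It is only a sketch: parse_ini is a made-up name, and it assumes one entry per line with quoted names never containing a double quote.

```python
def parse_ini(text):
    """Parse [Section] / Name=Value / "Long Name"=Value lines into nested dicts."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            continue
        if current is None:
            continue  # ignore anything before the first section
        if line.startswith('"'):
            end = line.find('"', 1)
            if end == -1:
                continue  # unterminated quoted name, skip it
            name = line[1:end]
            _, sep, value = line[end + 1:].partition("=")
        else:
            name, sep, value = line.partition("=")
        if sep:
            current[name.strip()] = value.strip()
    return sections
```

The standard configparser module is avoided on purpose, because by default it lowercases names and would keep the quotes as part of the name.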
The [Security] section contains the ACL lists for the files and directories. This is why images can only be applied on NTFS file systems.
[Security]
"0x0"=01000480140000...
"0x0" = Name = Security List Identifier in hexadecimal format. If the decimal identifier is 10, the hexadecimal format is 0xA (no leading zeros between the 0x and the A).
01000480140000 = ACL list data which contains the access objects for the file or directory.
The [xN] sections contain the various directory listings with references to the files and directories. The N is also in hexadecimal format and represents the Directory Identifier: [xA] contains the listing for the directory whose identifier is 10 (decimal). The listing to use for the root of the drive is given by the RootDrive attribute in the [Image] section. Every other Directory Listing Section is referenced through an identifier in a listing entry.
[xA]
"msn6.exe"=0x775,,0x20,0x8439AF48,0x1C2F7E3,0x843C11A2,0x1C2F7E3,0x8E25AC40,0x1C2F7DA,0x27,,,455EC0039363D7....
"(1)" = (2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14)
(1) - "msn6.exe" = Long Name of the file
(2) - The Identifier for the file or directory. If it's a file, you can find it in the File Lookup Table with this identifier. If it's a directory, it's defined in the [ID] section, where ID is the hexadecimal notation without the leading zero (identifier 0xA is defined in section [xA]).
(3) - The "short" 8.3 name of the file or directory
(4) - Flags
- 0x1: Read Only (File Attribute R)
- 0x2: Hidden (File Attribute H)
- 0x4: System (File Attribute S)
- 0x8: Archive (File Attribute A)
- 0x10: Folder
- 0x20: File
- 0x800: Compressed (File Attribute C)
(5) - File creation time Low DWord (part of C/C++ FILETIME structure)
(6) - File creation time High DWord (part of C/C++ FILETIME structure)
(7) - File last opened?? time Low DWord (part of C/C++ FILETIME structure)
(8) - File last opened?? time High DWord (part of C/C++ FILETIME structure)
(9) - File last modified time Low DWord (part of C/C++ FILETIME structure)
(10) - File last modified time High DWord (part of C/C++ FILETIME structure)
(11) - Security Section Entry identifier
(12) - ????
(13) - ????
(14) - Also something of the ACL list. This one is file specific.
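Putting the field list above into code, a listing line could be split up roughly like this. Treat it as a sketch under assumptions: the whole entry sits on one line, short names contain no commas, and parse_listing_entry plus the key names are invented here.

```python
# Flag bits as listed for field (4).
FLAG_NAMES = {
    0x1: "ReadOnly", 0x2: "Hidden", 0x4: "System", 0x8: "Archive",
    0x10: "Folder", 0x20: "File", 0x800: "Compressed",
}


def _hex(value):
    """Convert a hex string such as '0x27' to int, or None when the field is empty."""
    return int(value, 16) if value else None


def parse_listing_entry(line):
    """Split one '"name"=v2,v3,...,v14' line into named fields."""
    end = line.index('"', 1)                 # closing quote of the long name
    name = line[1:end]
    values = line[end + 1:].lstrip("=").split(",")
    values += [""] * (13 - len(values))      # pad missing trailing fields
    flags = _hex(values[2]) or 0
    return {
        "name": name,
        "identifier": _hex(values[0]),
        "short_name": values[1],
        "flags": [label for bit, label in FLAG_NAMES.items() if flags & bit],
        "created": (_hex(values[3]), _hex(values[4])),    # FILETIME low, high
        "accessed": (_hex(values[5]), _hex(values[6])),   # "last opened??" pair
        "modified": (_hex(values[7]), _hex(values[8])),
        "security_id": _hex(values[9]),
        "unknown": (values[10], values[11]),
        "acl_data": values[12],
    }
```

For the msn6.exe example, this gives identifier 0x775, an empty short name and the single flag "File".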
C/C++ FILETIME structure
The FILETIME structure holds a 64-bit count of 100-nanosecond intervals since January 1, 1601 (UTC), stored as a low DWORD and a high DWORD.
If you want to convert it to a "normal" date/time stamp, you can use the Win32 API calls for that, such as FileTimeToSystemTime.
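Outside of Windows the conversion is easy to do by hand. A minimal Python sketch, taking the low and high DWORDs exactly as they appear in a listing entry (the function name is made up):

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this moment.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(low, high):
    """Combine the two DWORDs and convert the tick count to a UTC datetime."""
    ticks = (high << 32) | low
    # 10 ticks per microsecond; the sub-microsecond remainder is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

For example, filetime_to_datetime(0x8439AF48, 0x1C2F7E3) converts the creation time from the msn6.exe entry.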
ZoRoNaX
This post has been edited by ZoRoNaX: 31 January 2005 - 07:57 PM