Ouch my eye!

Do not look at LASER with remaining eye!


The CAT is out of the bag

After practically melting my brain figuring out the LZSS compressor over the past several days, it’s time to take a break from the PIC file format for a post or two and focus on another file format from MicroProse: the CAT file, found with games such as Silent Service II. The file seems to be some sort of archive that contains other asset files. Let’s dig in and tear it apart to see what we can learn from it.

Before we get into the thick of things, we should discuss where we have seen CAT files so far. This post is largely motivated by the CAT files included with SS2. Back when I started this adventure, I looked through a lot of different files in search of hidden PIC assets. That’s how I found that SPR files were also PIC files. I made a note of the CAT file back then, because it appears to be an aggregate of many files with a catalogue of its contents at the top, and with SS2 many of those file names had PIC extensions. With that being the topic for this post, I did a quick scan for files with the CAT extension among the titles I have assets for. (As always, I’m open to submissions and tips as to what other titles use this format.)

  • Silent Service II – 1990
  • Darklands – 1992
  • Sid Meier’s Covert Action – 1990
  • Gunship 2000 – 1991
  • Knights of the Sky – 1990
  • Sword of the Samurai – 1989
  • M1 Tank Platoon* – 1989

This is probably not a comprehensive list, but these are the titles I’ve been able to identify so far. I could also swear that I saw a more advanced version of the CAT file with some titles back when I first started scouring games for hidden PIC files, but I can’t seem to find them now. It is also quite likely that there are files with a different extension that are actually CAT files, just as we saw with the PIC file. We will try to identify those too at some point, but first the time has come to figure out the MicroProse CAT file format.


First look

So what does a CAT file look like you may ask… time to show you what I saw. As Silent Service II (SS2) was the inspiration for this post, I’ll use a couple of CAT assets from there. When you install the game you have to select your graphics adapter, and the game will install only the associated assets. I have the assets for both the EGA and VGA installs, and I’ve chosen EBOOT.CAT and VBOOT.CAT as my guinea pigs.

File: test/EBOOT.CAT  [50031 bytes]
Offset    x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF  Decoded Text
0000000x: 0B 00 42 49 54 53 2E 50 49 43 00 00 00 00 9A 8A  · · B I T S . P I C · · · · · ·
0000001x: C8 14 98 08 00 00 0A 01 00 00 43 41 50 54 43 41  · · · · · · · · · · C A P T C A
0000002x: 42 2E 50 49 43 00 9D 7B C5 14 D0 05 00 00 A2 09  B . P I C · · { · · · · · · · ·
0000003x: 00 00 43 52 45 44 49 54 2E 50 49 43 00 00 97 7C  · · C R E D I T . P I C · · · |
0000004x: E2 14 72 1F 00 00 72 0F 00 00 44 49 53 50 4C 41  · · r · · · r · · · D I S P L A
0000005x: 59 2E 50 49 43 00 33 45 C7 14 AD 0C 00 00 E4 2E  Y . P I C · 3 E · · · · · · · .
0000006x: 00 00 47 41 55 47 45 53 30 30 2E 50 49 43 97 68  · · G A U G E S 0 0 . P I C · h
0000007x: C7 14 80 1F 00 00 91 3B 00 00 53 43 4F 50 45 2E  · · · · · · · ; · · S C O P E .

0B 00: 0x000B (11) File/Record count?

File: test/VBOOT.CAT  [57265 bytes]
Offset    x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF  Decoded Text
0000000x: 06 00 42 49 54 53 2E 50 49 43 00 00 00 00 77 45  · · B I T S . P I C · · · · w E
0000001x: CF 14 F6 0A 00 00 92 00 00 00 43 52 45 44 49 54  · · · · · · · · · · C R E D I T
0000002x: 2E 50 49 43 00 00 26 7B E2 14 2F 63 00 00 88 0B  . P I C · · & { · · / c · · · ·
0000003x: 00 00 53 50 52 49 54 45 53 2E 50 49 43 00 7D 45  · · S P R I T E S . P I C · } E
0000004x: CF 14 F9 1D 00 00 B7 6E 00 00 53 53 5F 54 49 54  · · · · · · · n · · S S _ T I T
0000005x: 4C 45 2E 50 49 43 A2 80 55 14 A0 11 00 00 B0 8C  L E . P I C · · U · · · · · · ·
0000006x: 00 00 53 54 45 58 54 2E 50 49 43 00 00 00 92 45  · · S T E X T . P I C · · · · E
0000007x: CF 14 FA 03 00 00 50 9E 00 00 53 55 42 31 2E 50  · · · · · · P · · · S U B 1 . P

06 00: 0x0006 (6) File/Record count?

The first thing that jumps out at me, after the obvious file names in the ASCII view, is the first 2 bytes. Despite the first filename being the same, these two bytes differ between the files. I wonder if this isn’t a count prefix, as we have seen with the SPC file and the MPS show formats which we’ve looked at before. A quick count of readable file names does seem to support that. If we then assume the first part of the record is the file name, we have 24 bytes per record. A quick scan confirms that this pattern holds.
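
The count-prefix idea can be sanity checked in a few lines of C. This is only a sketch: the names `read_le16`, `cat_table_size` and `cat_count_plausible` are made up for illustration, and the little-endian byte order is an assumption taken from the dumps above (0B 00 reading as 11).

```c
#include <stddef.h>
#include <stdint.h>

#define CAT_RECORD_SIZE 24u  /* record size observed in the hex dumps */

/* Read a 16 bit little-endian value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Size in bytes of the 2 byte count prefix plus all catalogue records. */
static size_t cat_table_size(uint16_t count) {
    return 2u + (size_t)count * CAT_RECORD_SIZE;
}

/* Returns 1 if a file of file_size bytes could hold a catalogue with the
 * count stored in its first two bytes, 0 otherwise. */
static int cat_count_plausible(const uint8_t *header, size_t file_size) {
    uint16_t count;
    if (file_size < 2) return 0;
    count = read_le16(header);
    return count > 0 && cat_table_size(count) <= file_size;
}
```

For EBOOT.CAT the prefix gives 11 records and a 266 byte catalogue, which comfortably fits inside the 50031 byte file.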


Determining the record layout

So far we’ve determined we have a 24 byte record. Let’s see what else we can determine, and build up a structure we can then use to read out all the records. Focusing on those 24 bytes, we know the first part is the name of the file, ASCII encoded. With DOS, back in the era of these games, file names were limited to an 8.3 format: a maximum of 8 characters for the name, and 3 for the extension. That means 12 bytes total if we include the period, which we can see in the text view of the file.

File: test/EBOOT.CAT  [50031 bytes]
Offset    x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF  Decoded Text
0000000x:       42 49 54 53 2E 50 49 43 00 00 00 00 9A 8A      B I T S . P I C · · · · · ·
0000001x: C8 14 98 08 00 00 0A 01 00 00                    · · · · · · · · · · 

9A 8A C8 14: 0x14C88A9A
98 08 00 00: 0x00000898 (2200)
0A 01 00 00: 0x0000010A (266)

Looks like the space for the filename is exactly 12 bytes long, as the 13th byte is non-zero (and not ASCII). What seems to follow the filename is three 32 bit values (some could be 16 bit, but I’d say the last 8 bytes are most certainly 32 bit values). So with that we can start to put a structure together. But first let’s see if we can’t figure out some of the other values we see here. One of these values MUST be an index to where the data exists in the file, or at the very least be a size value, so the reader is able to extract the files contained within.

Now we’re looking at EBOOT here, which had 11 records, and at 24 bytes each that means 11 x 24 = 264 (0x0108). Add the two bytes we already consumed for the record count at the start and we get 266 (0x010A). That would be the earliest we could see file data here, assuming it starts right after the catalogue records at the top. And sure enough, the 3rd 32 bit entry is 0x010A. So I think we can safely say the 3rd entry in our struct is the absolute position of the file data within the CAT file. The previous entry, with a value of 2200, looks suspiciously like a good candidate for length; the first 32 bit value certainly would not be. What happens if we add those two together? 2200 + 266 = 2466 (0x09A2). And what does the next record show us for its position?

File: test/EBOOT.CAT  [50031 bytes]
Offset    x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF  Decoded Text
0000001x:                               43 41 50 54 43 41                      C A P T C A
0000002x: 42 2E 50 49 43 00 9D 7B C5 14 D0 05 00 00 A2 09  B . P I C · · { · · · · · · · ·
0000003x: 00 00                                            · · 

A2 09 00 00: 0x000009A2 (2466)

Well would you look at that! With that I think we can safely consider those last two 32 bit values to be length and position respectively. That only leaves us with the first value. It could be two 16 bit values, but any way I look at it, the numbers don’t jump out as something obvious right away. It’s possible this is a CRC or checksum of the file, but I highly doubt it, as that would consume resources that you wouldn’t want to spend in a game. Let’s extract a bunch of them and look at them side by side.

00: 9A 8A C8 14: 0x14c88a9a
01: 9D 7B C5 14: 0x14c57b9d
02: 97 7C E2 14: 0x14e27c97
03: 33 45 C7 14: 0x14c74533
04: 97 68 C7 14: 0x14c76897
05: AF 6C C7 14: 0x14c76caf
06: 13 7D C8 14: 0x14c87d13
07: 64 81 C6 14: 0x14c68164
08: 52 51 C6 14: 0x14c65152
09: A0 55 DB 14: 0x14db55a0
10: 2D 85 CB 14: 0x14cb852d

Well now that is interesting: they all start with 0x14. This is far too regular to be the result of a CRC or checksum. It must be something else, and I think I know what. Back in the DOS days, time and date were stored as a 32 bit bit-packed structure, which could also be accessed as 2 separate 16 bit packed structures. The high 16 bits hold the date, and the low 16 bits hold the time. This is unlike modern timestamps, which are usually a single value representing some unit of time since the epoch. For DOS the epoch is Jan 1, 1980, but the value is a set of packed fields rather than a linear count. The DOS timedate bit packed struct looks something like the following. (Remember that ints in this era are 16 bits.)

struct DOSTIMEDATE {
    struct DOSTIME {
        unsigned int ticks:5; // 0-29 (1t = 2 seconds)
        unsigned int min:6;   // 0-59
        unsigned int hour:5;  // 0-23
    } time;
    struct DOSDATE {
        unsigned int mday:5;  // 1-31
        unsigned int mon:4;   // 1-12
        unsigned int year:7;  // years since 1980
    } date;
};
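
On a modern compiler the bit-field struct above is not guaranteed to line up with the on-disk bits, since bit-field layout is implementation-defined in C. A rough sketch of the same layout using shifts and masks follows. The type `dos_fields_t` and the function `dos_unpack` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Broken-down DOS timestamp (helper type invented for this sketch). */
typedef struct {
    int year, month, day;      /* year is the full calendar year */
    int hour, minute, second;  /* seconds have a 2 second resolution */
} dos_fields_t;

/* Unpack a 32 bit DOS date/time value: date in the high 16 bits,
 * time in the low 16 bits. */
static dos_fields_t dos_unpack(uint32_t ts) {
    uint16_t date = (uint16_t)(ts >> 16);
    uint16_t time = (uint16_t)(ts & 0xffffu);
    dos_fields_t f;
    f.year   = ((date >> 9) & 0x7f) + 1980;  /* 7 bits, years since 1980 */
    f.month  = (date >> 5) & 0x0f;           /* 4 bits, 1-12 */
    f.day    = date & 0x1f;                  /* 5 bits, 1-31 */
    f.hour   = (time >> 11) & 0x1f;          /* 5 bits, 0-23 */
    f.minute = (time >> 5) & 0x3f;           /* 6 bits, 0-59 */
    f.second = (time & 0x1f) * 2;            /* 5 bits of 2 second ticks */
    return f;
}
```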

With the DOS layout for date and time the year occupies the 7 most significant bits of the value. So as a quick check we can do some math to see what we get. 0x14 >> 1 = 0x0A which is 10, 10+1980 = 1990… well that is certainly a reasonable value for the year. Remember that these files are with SS2, which came out in 1990. With that I think we can proceed under the assumption that the first 32 bits after the name are the timestamp, followed by the size, and then the position. That leaves us with the following structure.

typedef struct {
    char name[12];      // in DOS 8.3 format
    uint32_t timestamp; // DOS timestamp
    uint32_t size;      // data length of this entry
    uint32_t offset;    // absolute file offset to start of data for this entry
} cat_entry_t;
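
Since the struct is 12 + 4 + 4 + 4 = 24 bytes with no padding needed, reading it straight from disk will likely work on a little-endian machine. For anything else, a byte-by-byte parse is safer. The sketch below assumes little-endian fields (as the dumps suggest) and uses made-up names (`cat_entry_parsed_t`, `read_le32`, `cat_parse_entry`). It also widens the name to 13 bytes, because a full 8.3 name such as GAUGES00.PIC fills all 12 bytes and leaves no room for a terminating NUL.

```c
#include <stdint.h>
#include <string.h>

/* In-memory copy of one catalogue record (type invented for this sketch). */
typedef struct {
    char name[13];       /* 12 name bytes plus a terminating NUL */
    uint32_t timestamp;  /* DOS timestamp */
    uint32_t size;       /* data length of this entry */
    uint32_t offset;     /* absolute file offset of the data */
} cat_entry_parsed_t;

/* Read a 32 bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse one raw 24 byte catalogue record into out. */
static void cat_parse_entry(const uint8_t rec[24], cat_entry_parsed_t *out) {
    memcpy(out->name, rec, 12);
    out->name[12] = '\0';  /* the on-disk name may use all 12 bytes */
    out->timestamp = read_le32(rec + 12);
    out->size      = read_le32(rec + 16);
    out->offset    = read_le32(rec + 20);
}
```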

Parsing the CAT file

Now that we know the structure, I think it’s time to parse the file and see what we get. First we’ll read in the 16 bit count, and then read in count catalogue records. I won’t bother with the details of reading the file and allocating memory, as that is pretty basic, and boring. For converting the DOS date/time we can use the following function.

#include <stdint.h>
#include <time.h>

// converts a DOS 32 bit packed date/time val to a unix time_t,
// treating the packed fields as local time
time_t dos_to_local(uint32_t dostime) {
    struct tm utime = {0};
    utime.tm_hour = ((dostime >> 11) & 0x1f);
    utime.tm_min  = ((dostime >>  5) & 0x3f);
    utime.tm_sec  = ((dostime <<  1) & 0x3e); // resolution is 2 sec
    utime.tm_mon  = ((dostime >> 21) & 0x0f) - 1; // struct tm months are 0-11
    utime.tm_mday = ((dostime >> 16) & 0x1f);
    utime.tm_year = ((dostime >> 25) & 0x7f) + 1980 - 1900; // DOS epoch is 1980
    utime.tm_isdst = -1; // let mktime decide whether DST applies
    return mktime(&utime);
}

With the code written let’s see what we get

MicroProse CAT File Viewer
Opening: 'test/EBOOT.CAT'	File Size: 50031 bytes
Catalogue contains 11 items
 1:     BITS.PIC   2200   8 Jun 1990  17:20:52 [pos:0000010a]
 2:  CAPTCAB.PIC   1488   5 Jun 1990  15:28:58 [pos:000009a2]
 3:   CREDIT.PIC   8050   2 Jul 1990  15:04:46 [pos:00000f72]
 4:  DISPLAY.PIC   3245   7 Jun 1990  08:09:38 [pos:00002ee4]
 5: GAUGES00.PIC   8064   7 Jun 1990  13:04:46 [pos:00003b91]
 6:    SCOPE.PIC   4123   7 Jun 1990  13:05:30 [pos:00005b11]
 7:  SPRITES.PIC   8577   8 Jun 1990  15:08:38 [pos:00006b2c]
 8: SS_TITLE.PIC   3367   6 Jun 1990  16:11:08 [pos:00008cad]
 9:    STEXT.PIC   2056   6 Jun 1990  10:10:36 [pos:000099d4]
10:     SUB1.PIC   5277  27 Jun 1990  10:13:00 [pos:0000a1dc]
11:      TBT.PIC   3318  11 Jun 1990  16:09:26 [pos:0000b679]

MicroProse CAT File Viewer
Opening: 'test/VBOOT.CAT'	File Size: 57265 bytes
Catalogue contains 6 items
 1:     BITS.PIC   2806  15 Jun 1990  08:11:46 [pos:00000092]
 2:   CREDIT.PIC  25391   2 Jul 1990  15:25:12 [pos:00000b88]
 3:  SPRITES.PIC   7673  15 Jun 1990  08:11:58 [pos:00006eb7]
 4: SS_TITLE.PIC   4512  21 Feb 1990  15:05:04 [pos:00008cb0]
 5:    STEXT.PIC   1018  15 Jun 1990  08:12:36 [pos:00009e50]
 6:     SUB1.PIC  15719  27 Jun 1990  15:03:48 [pos:0000a24a]

Well that looks pretty promising. All the dates and times look reasonable, as do all the other values. We can do a quick check of completeness by adding the final length to the last start position and seeing if that brings us to the end of the file. For EBOOT, 0xb679 + 3318 = 0xc36f, or 50031, which is EOF. Checking VBOOT we get 0xa24a + 15719 = 0xdfb1, or 57265, also EOF. So it looks like we have a complete decode here. That was easy!
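
The same check can be extended to every record rather than just the last one. The sketch below assumes what the two test files show: data starts right after the catalogue, each entry begins where the previous one ends, and the last one ends at EOF. Other CAT files may not be packed this tightly, so treat a failure as a hint rather than proof of corruption. `cat_span_t` and `cat_layout_is_contiguous` are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Just the two fields needed for the layout check (hypothetical type). */
typedef struct {
    uint32_t size;    /* data length of the entry */
    uint32_t offset;  /* absolute file offset of the entry */
} cat_span_t;

/* Returns 1 when the first entry starts right after the catalogue, every
 * entry starts where the previous one ended, and the last one ends exactly
 * at file_size. Returns 0 otherwise. */
static int cat_layout_is_contiguous(const cat_span_t *e, size_t count,
                                    uint32_t file_size) {
    uint32_t expected = 2u + 24u * (uint32_t)count;  /* count prefix + records */
    size_t i;
    for (i = 0; i < count; i++) {
        if (e[i].offset != expected) return 0;
        expected += e[i].size;
    }
    return expected == file_size;
}
```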


Extracting the contents

Writing an extractor should be pretty simple, seek to the given offset, and copy the given length number of bytes out to a file. So let’s do that. I’ll add a parameter to the command line of my program to select which file, and we can pull them out selectively.
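
A minimal sketch of that seek-and-copy step, using plain stdio. This is not the code from my program, just an illustration, and `cat_copy_entry` is a made-up name. It copies through a small buffer so large entries don’t need to be held in memory all at once.

```c
#include <stdint.h>
#include <stdio.h>

/* Copy size bytes starting at offset in cat to out.
 * Returns 0 on success, -1 on a seek, read or write failure. */
static int cat_copy_entry(FILE *cat, uint32_t offset, uint32_t size, FILE *out) {
    unsigned char buf[4096];
    if (fseek(cat, (long)offset, SEEK_SET) != 0) return -1;
    while (size > 0) {
        size_t want = size < sizeof buf ? size : sizeof buf;
        size_t got = fread(buf, 1, want, cat);
        if (got == 0) return -1;  /* hit EOF early or a read error */
        if (fwrite(buf, 1, got, out) != got) return -1;
        size -= (uint32_t)got;
    }
    return 0;
}
```

The caller would open the CAT file in binary mode ("rb"), open the output file under the name from the catalogue record ("wb"), and pass in the record’s offset and size.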

MicroProse CAT File Extractor
Opening: 'test/VBOOT.CAT'	File Size: 57265 bytes
Catalogue contains 6 items
 1:     BITS.PIC   2806  15 Jun 1990  08:11:46 [pos:00000092]
 2:   CREDIT.PIC  25391   2 Jul 1990  15:25:12 [pos:00000b88]
 3:  SPRITES.PIC   7673  15 Jun 1990  08:11:58 [pos:00006eb7]
 4: SS_TITLE.PIC   4512  21 Feb 1990  15:05:04 [pos:00008cb0] [Extracting]
 5:    STEXT.PIC   1018  15 Jun 1990  08:12:36 [pos:00009e50]
 6:     SUB1.PIC  15719  27 Jun 1990  15:03:48 [pos:0000a24a]

% ls -al *.PIC
-rw-r--r--  4512 21 Feb  1990 SS_TITLE.PIC

Yes, I spent way too much time doing the unnecessary thing and applied the “correct” timestamp to the file 😉 But otherwise it worked! We have a file of the correct size (and creation date in this case). Now it’s a PIC and we can render PIC files, so let’s see what we get… Oh wait, SS2 uses PIC89 so we’ll have to do a quick edit to strip off the added header, and correct the format byte as necessary. (I really need to get back to the PIC code to make it recognize the other types.)

Bingo! Though I did take the liberty of using the palette files that come with SS2 and rendered a few times until I found the correct one. (PALETTE.002)

The MicroProse CAT file has turned out to be a blindingly simple format. It didn’t take much effort to discern all of the fields in the header/record data. I’m not going to cover creating a CAT file, as that will be just as easy and not really worth writing about; there is nothing new there. But it will be included when I release the code, which will hopefully be soon. (I plan to release all the code I’ve developed on this blog in the near future.)


Bulk Extraction

In Steve Jobs style… One last feature before we wrap this post up. As a final quick change to the program I made it so that we can extract all files in the CAT file, instead of one at a time. (I know, groundbreaking feature, amazing nobody has thought of it before)

MicroProse CAT File Extractor
Opening: 'test/VBOOT.CAT'	File Size: 57265 bytes
Catalogue contains 6 items
 1:     BITS.PIC   2806  15 Jun 1990  08:11:46 [pos:00000092] [Extracting]
 2:   CREDIT.PIC  25391   2 Jul 1990  15:25:12 [pos:00000b88] [Extracting]
 3:  SPRITES.PIC   7673  15 Jun 1990  08:11:58 [pos:00006eb7] [Extracting]
 4: SS_TITLE.PIC   4512  21 Feb 1990  15:05:04 [pos:00008cb0] [Extracting]
 5:    STEXT.PIC   1018  15 Jun 1990  08:12:36 [pos:00009e50] [Extracting]
 6:     SUB1.PIC  15719  27 Jun 1990  15:03:48 [pos:0000a24a] [Extracting]

MicroProse CAT File Extractor
Opening: 'test/EBOOT.CAT'	File Size: 50031 bytes
Catalogue contains 11 items
 1:     BITS.PIC   2200   8 Jun 1990  17:20:52 [pos:0000010a] [Extracting]
 2:  CAPTCAB.PIC   1488   5 Jun 1990  15:28:58 [pos:000009a2] [Extracting]
 3:   CREDIT.PIC   8050   2 Jul 1990  15:04:46 [pos:00000f72] [Extracting]
 4:  DISPLAY.PIC   3245   7 Jun 1990  08:09:38 [pos:00002ee4] [Extracting]
 5: GAUGES00.PIC   8064   7 Jun 1990  13:04:46 [pos:00003b91] [Extracting]
 6:    SCOPE.PIC   4123   7 Jun 1990  13:05:30 [pos:00005b11] [Extracting]
 7:  SPRITES.PIC   8577   8 Jun 1990  15:08:38 [pos:00006b2c] [Extracting]
 8: SS_TITLE.PIC   3367   6 Jun 1990  16:11:08 [pos:00008cad] [Extracting]
 9:    STEXT.PIC   2056   6 Jun 1990  10:10:36 [pos:000099d4] [Extracting]
10:     SUB1.PIC   5277  27 Jun 1990  10:13:00 [pos:0000a1dc] [Extracting]
11:      TBT.PIC   3318  11 Jun 1990  16:09:26 [pos:0000b679] [Extracting]

While we’re here, now that we have extracted everything from our test files, might as well see what we have.

% picid-all VBOOT
 'SPRITES.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11
  'CREDIT.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11
   'STEXT.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11
'SS_TITLE.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11
    'SUB1.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11
    'BITS.PIC': PIC90: Type: 6 Image: 320x200 0x0b: max-bits: 11

% picid-all EBOOT
 'SPRITES.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
 'DISPLAY.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
  'CREDIT.PIC': PIC90: Type: f Image: 320x200 0x0b: max-bits: 11
   'STEXT.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
     'TBT.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
'SS_TITLE.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
'GAUGES00.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
    'SUB1.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
   'SCOPE.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
    'BITS.PIC': PIC90: Type: ? Image: 320x200 UNKNOWN TYPE 'e'
 'CAPTCAB.PIC': PIC90: Type: f Image: 320x200 0x0b: max-bits: 11

Oh!


By Thread


