We have some unfinished business to attend to in light of the discoveries we made last time. I no longer trust my prior results with the RLE encoder, so we need to go back and revisit them. Last post we wrapped things up by solving the red herring of the LZW compressor generating garbage, which calls into question the issue we had with the same file during RLE encoding. But first, we need to put all the pieces of the encoder together, and with that perhaps we can kill two birds with one stone: we can address the questionable RLE result, and we can fully validate our encode and decode stacks.
Gluing the PIC encoder together
The process here is going to be very similar to what we did when we glued all the pieces together for the decoder in this post. Just like with the decoder, we’re going to start by wrapping our LZW and RLE state types with a PIC state type.
typedef struct {
uint8_t max_width; // maximum code width for the file
bool isPacked; // flag for pixel packing 16 colour files
rle_state_t rle; // RLE state machine
lzw_state_t lzw; // LZW state machine
} pic_state_t;
This is pretty much identical to what we had with the decoder, though it no longer carries the raw format identifier, as we now understand how to properly decode, and encode, that byte. Next is our main entry point to encode a file. For simplicity this time, I stayed with file in and file out, instead of using a memory buffer.
int pic_compress(FILE *dst, FILE *src) {
pic_state_t *ctx;
if(NULL == (ctx = (pic_state_t *)calloc(1, sizeof(pic_state_t)))) {
return PIC_NOMEM;
}
// init the contexts here
lzw_init(&ctx->lzw);
rle_init(&ctx->rle);
int8_t format_identifier = fgetc(src); // read in the format byte
ctx->isPacked = (format_identifier > 0); // positive value indicates packed pixels
ctx->max_width = abs(format_identifier); // absolute value is the max encoding width
fputc(format_identifier, dst);
if(ctx->max_width > LZW_MAX_CODE_WIDTH) ctx->max_width = LZW_MAX_CODE_WIDTH;
// compress the rest of the file into the output file
int rval = pack(dst, src, ctx);
free(ctx);
return rval;
}
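To make the handling of the format byte concrete, here is a small standalone sketch of the same decode step. The helper name parse_format_byte, the format_info_t struct, and the value 12 used for LZW_MAX_CODE_WIDTH are assumptions for illustration only, not taken from the actual source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define LZW_MAX_CODE_WIDTH 12 /* assumed value, for illustration only */

typedef struct {
    bool    is_packed;  /* positive format byte means packed pixels */
    uint8_t max_width;  /* maximum LZW code width, after clamping */
} format_info_t;

/* Hypothetical helper mirroring the format byte handling in pic_compress(). */
static format_info_t parse_format_byte(uint8_t raw) {
    int8_t signed_id = (int8_t)raw;  /* the byte is interpreted as signed */
    format_info_t info;
    info.is_packed = (signed_id > 0);
    int width = abs((int)signed_id);
    if (width > LZW_MAX_CODE_WIDTH) width = LZW_MAX_CODE_WIDTH;
    info.max_width = (uint8_t)width;
    return info;
}
```

Under those assumptions, a format byte of 0x0B describes a packed file with a maximum code width of 11, while 0xF5 (that is, -11) describes an unpacked file with the same width.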
The next step is our packing routine, which simply reads bytes from the input file until EOF, packs pixels based on the isPacked flag, and then passes the packed pixels off to the RLE encoder, rle_encode().
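As a rough illustration of that packing loop, here is a minimal sketch. It is not the actual pack() routine: the names pack_sketch, sink_fn, collect_t and collect are hypothetical. It assumes the raw input holds one pixel per byte, that packing puts the first pixel in the low nibble, and that an odd trailing pixel is padded with zero. None of those details are confirmed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the next stage; in the post this role is played by rle_encode(). */
typedef int (*sink_fn)(int symbol, void *user);

/* Hypothetical packing loop. Assumes one pixel per input byte and, when
 * packing, the first pixel in the low nibble (an assumption, not confirmed). */
static int pack_sketch(FILE *src, bool is_packed, sink_fn sink, void *user) {
    int first;
    while (EOF != (first = fgetc(src))) {
        int symbol = first;
        if (is_packed) {
            int second = fgetc(src);
            if (EOF == second) second = 0;  /* odd pixel count: pad with 0 (assumption) */
            symbol = (first & 0x0F) | ((second & 0x0F) << 4);
        }
        int rval = sink(symbol, user);
        if (0 != rval) return rval;
    }
    return sink(EOF, user);  /* forward EOF so later stages can flush */
}

/* Tiny sink that records what it receives, used only to try the loop out. */
typedef struct { unsigned char data[16]; size_t len; bool saw_eof; } collect_t;

static int collect(int symbol, void *user) {
    collect_t *c = (collect_t *)user;
    if (EOF == symbol) { c->saw_eof = true; return 0; }
    if (c->len >= sizeof c->data) return -1;  /* sketch-only capacity limit */
    c->data[c->len++] = (unsigned char)symbol;
    return 0;
}
```

With these assumptions, the three input pixels 1, 2, 3 become the two bytes 0x21 and 0x03 when packing is on, and pass through unchanged when it is off. The RLE encoder that receives these symbols is listed next.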
int rle_encode(FILE *dst, int symbol, pic_state_t *pic) {
rle_state_t *ctx = &pic->rle;
if(EOF == symbol) {
// we've reached the end, we need to drain out anything we may not have emitted yet.
int rval = rle_drain(dst, pic);
if(PIC_NOERROR != rval) return rval;
return lzw_compress(dst, symbol, pic); // pass the EOF forward
}
if(RLE_TOKEN == symbol) {
// handle the special case of the token value in the stream
int rval = rle_drain(dst, pic);
if(PIC_NOERROR != rval) return rval;
return rle_escape(dst, pic);
}
if(false == ctx->encoding) {
// not currently in an encoding, store and count the symbol and exit
ctx->encoding = true;
ctx->symbol = symbol;
ctx->count++;
return PIC_NOERROR;
}
if(ctx->symbol != symbol) { // symbols changed
int rval = rle_drain(dst, pic); // drain it out
if(PIC_NOERROR != rval) return rval;
ctx->encoding = true; // store and count the new symbol
ctx->symbol = symbol;
ctx->count++;
return PIC_NOERROR;
}
ctx->count++;
if(255 == ctx->count) { // max count reached, drain it out
return rle_drain(dst, pic);
}
return PIC_NOERROR;
}
The RLE code is largely the same. The one notable exception is the change in the EOF branch, which now forwards EOF to the next stage via lzw_compress() so that it can flush any pending data as well. All the other RLE routines saw similar changes, with their respective fputc() calls replaced with lzw_compress() calls.
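To see why forwarding EOF matters, consider this toy two-stage chain. It is only a sketch with hypothetical names (last_stage_t, last_stage_put, first_stage_put, run_chain), not the real RLE or LZW code. The final stage always holds one symbol back, much as the LZW stage holds a pending chain code, so its last symbol only reaches the output when it is told the stream has ended.

```c
#include <stdio.h>

/* Toy final stage: holds one symbol back until the next symbol or EOF arrives. */
typedef struct { int pending; int has_pending; } last_stage_t;

static int last_stage_put(FILE *dst, int symbol, last_stage_t *st) {
    if (st->has_pending) {
        if (EOF == fputc(st->pending, dst)) return -1;
        st->has_pending = 0;
    }
    if (EOF == symbol) return 0;  /* flushed, nothing more to hold */
    st->pending = symbol;
    st->has_pending = 1;
    return 0;
}

/* Toy first stage: passes symbols through; forward_eof == 0 models the bug
 * where EOF is swallowed instead of being handed to the next stage. */
static int first_stage_put(FILE *dst, int symbol, last_stage_t *next, int forward_eof) {
    if (EOF == symbol && !forward_eof) return 0;
    return last_stage_put(dst, symbol, next);
}

/* Pushes every byte of `in` plus a final EOF through both stages and
 * returns how many bytes reached the output, or -1 on failure. */
static long run_chain(const char *in, int forward_eof) {
    FILE *dst = tmpfile();
    if (NULL == dst) return -1;
    last_stage_t st = {0, 0};
    for (const char *p = in; *p; p++) {
        if (0 != first_stage_put(dst, (unsigned char)*p, &st, forward_eof)) {
            fclose(dst);
            return -1;
        }
    }
    first_stage_put(dst, EOF, &st, forward_eof);
    long written = ftell(dst);
    fclose(dst);
    return written;
}
```

Feeding "ABC" through the chain writes all three bytes when EOF is forwarded, but only two when it is not, because the final symbol stays stuck in the last stage.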
// continually compresses input symbols until EOF is reached
int lzw_compress(FILE *dst, int symbol, pic_state_t *pic) {
lzw_state_t *ctx = &pic->lzw;
// passing EOF triggers a flushing of the output
if(EOF == symbol) {
write_code(dst, ctx->chain_code, ctx); // write the final string code
write_code(dst, LZW_EOF, ctx); // flush the bit buffer
return PIC_NOERROR; // nothing more to do
}
// startup condition
if(LZW_EOC == ctx->chain_code) {
// no need to look it up, all single character codes are
// guaranteed to be in the dictionary
ctx->chain_code = symbol;
return PIC_NOERROR;
}
// see if chain_code + symbol exists already
uint16_t new_code = seek_chain(ctx->chain_code, symbol, ctx);
if(LZW_NO_CODE != new_code) { // match found, keep extending
ctx->chain_code = new_code;
return PIC_NOERROR;
}
// write out the old code
write_code(dst, ctx->chain_code, ctx);
// handle table expansion and reset
if(ctx->next_code >= ctx->resize_code) {
// write_code(dst, symbol, ctx);
// check if we are at max bits, if so reset instead of resize
if(LZW_MAX_CODE_WIDTH > ctx->code_bits) {
lzw_resize(ctx);
} else {
uint8_t chain_symbol = ctx->dictionary[ctx->chain_code].symbol;
lzw_reset(ctx);
// root the new table on the last symbol emitted
ctx->chain_code = chain_symbol;
}
}
// create a new code
// add it to the hash before the dictionary code updates the value
ctx->hash_table[ctx->hash_point] = ctx->next_code;
// add the new string to the dictionary, consumes next_code and increments
lzw_add_code(ctx->chain_code, symbol, ctx);
// start the next string with the current symbol
ctx->chain_code = symbol;
return PIC_NOERROR;
}
The LZW code is nearly identical to what we had before; the only change is how the context is passed in and unwrapped.
That is pretty much it. All we need to do now is build it and wrap some simple shell scripts around this and the decoder, so we can use them to validate each other against as much source data as we can give them. First up is a script that takes in a file name, decodes it to a temporary file, then encodes the temporary file back to a PIC. It generates an MD5 signature for both the original and the re-encoded file, and then compares the signatures.
#!/bin/sh
# strip off the path from the input file name for cleaner prints
name=$(basename "$1")
# decode the file to a temporary file
./pic2mpraw "$1" "OUT.PIX" > /dev/null 2>&1
# re-encode the temporary file back to a PIC
./mpraw2pic "OUT.PIX" "OUT.PIC" > /dev/null 2>&1
# hash the files, and compare the results
master=$(md5 -q "$1" 2>&1 )
encode=$(md5 -q "OUT.PIC" 2>&1 )
# plain test and printf keep this working under a strict POSIX sh
if [ "$master" = "$encode" ]; then
result=$(printf '\033[32m[pass]\033[0m')
else
result=$(printf '\033[7;31m[FAIL]\033[0m')
fi
# cleanup
[ -e OUT.PIX ] && rm "OUT.PIX"
[ -e OUT.PIC ] && rm "OUT.PIC"
printf '%12s: \033[2m%s\033[22m %s\n' "$name" "$master" "$result"
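When a file does fail, the hashes only tell us that the two files differ, not where. A small helper along the following lines could report the first differing offset, which is usually the more useful clue when chasing an encoder bug. This is only a sketch, and the name first_difference is hypothetical; it is not part of the tools described here.

```c
#include <stdio.h>

/* Returns the offset of the first differing byte between two streams,
 * or -1 if they are identical. A length mismatch counts as a difference
 * at the offset where the shorter stream ends. */
static long first_difference(FILE *a, FILE *b) {
    long offset = 0;
    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb) return offset;
        if (EOF == ca) return -1;  /* both streams ended together */
        offset++;
    }
}
```

Opening the original and the re-encoded file in binary mode and passing both to this function would give either -1 for a match or the position of the first mismatch.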
Finally, another simple script finds all .PIC and .SPR files within a given directory, and then calls the above script for each matching file found.
#!/bin/sh
find "$(pwd)/${1}" -maxdepth 5 -type f -name "*.PIC" -not -path '*/\.*' -exec ./check-one {} \;
find "$(pwd)/${1}" -maxdepth 5 -type f -name "*.SPR" -not -path '*/\.*' -exec ./check-one {} \;
Note from the future: This is an ongoing effort, and as such certain details are incorrect, and will change over time as more titles using the format are discovered. With that said, please note that the designations used here for the variants do change in the future, as earlier dates of first use are uncovered. Both the PIC90 and PIC91 formats will be discovered to occur a year earlier than they do here, making them PIC89 and PIC90 respectively.
(As of Jun 21 2024)
I crave validation
Now that we have our decoder and encoder written, let’s use them to validate our work by decoding actual game assets, and then re-encoding the decoded result and comparing it to the original. As noted above, I wrote up a pair of scripts to find all the .PIC and .SPR files in a given directory, and then go through the process of decoding, encoding and comparing the results. Below are the results for F15 Strike Eagle II and the Desert Storm scenario expansion pack.
% check-all f15-se2
DEATH.PIC: 6e26bcc7228da3563f17c1c919e14349 [pass]
COCKPIT.PIC: ff1bba14e570245e94e5f1a6186422a3 [pass]
HISCORE.PIC: 537bbf274d79ebaa616f55917d14bf19 [pass]
RIGHT.PIC: 94190aa3fe2f7de2395b15dab9c04a53 [pass]
MEDAL.PIC: 3f75ad96c4d9c61842d58e087394d968 [pass]
PROMO.PIC: fa3efcb64547c7fc73bc165b4d06bbf2 [pass]
256RIGHT.PIC: d18609791beb5a4b6bbc3c95a49979ee [pass]
TITLE16.PIC: bd7a9c10dcd06fec62af1071a678ad7f [pass]
WALL.PIC: 0839cb62142b5d3e5058b596ad36fb32 [pass]
LEFT.PIC: 789455e28757793fec416cc444477063 [pass]
ARMPIECE.PIC: f6b8b7b27b1de44282ca04ea6d369ea4 [pass]
REAR.PIC: 0bbc6acfeaef8c7988cd5e75aaaf6320 [pass]
DESK.PIC: 07ae72e86ae2c38cb7293f86cb108e93 [pass]
TITLE640.PIC: 14c7e302d9ba0b3567196f43b2a914a6 [pass]
256PIT.PIC: b6651ea956bec71890bf90de9a31fb1d [pass]
ADV.PIC: fad492070c3afb3a11f32ad266df428d [pass]
256REAR.PIC: 4c4704170d85e18c842e18b1f438d266 [pass]
LABS.PIC: 157e724e8daa31336fe9f06255bd73c4 [pass]
256LEFT.PIC: 8a8d0d29a6789de4971a5381cb89a60c [pass]
DBICONS.SPR: a490a0bf84b2c36dedab4fee0372bd09 [pass]
F15.SPR: ac3d782b7a7c446dcf3036d532a3dbae [pass]
LIBYA.SPR: fb234541a8fd588684c80417eebbf729 [pass]
PERSIAN.SPR: fff6a0e3d2d16af739d9a36eb413339f [pass]
ME.SPR: 8079c4abfa88c04aad08e82908b7554e [pass]
VN.SPR: 17e00b718ea797eff553ffca1d3ba2bb [pass]
% check-all f15-se2-ds
3.PIC: e488a9127ba89f636fd0151da599ddb1 [FAIL]
2.PIC: a48a7ec1de1637da2d132a0fab3b4894 [FAIL]
1.PIC: 82b6b193954a0abba22f5f8267291d14 [FAIL]
4.PIC: d81029a10c5bcb0705ac98b6cc75621d [FAIL]
NCAPE.SPR: a019f6eade84d4d6a2c8034850979c14 [pass]
JP.SPR: e2252dea49554e9d51482e6ba0a22a5b [pass]
CEUROPE.SPR: bacf2b850fcc6b592f92f046c89b15c8 [pass]
This is fantastic! This confirms that the bad data was already in place when I was doing the RLE encoding. I'm not sure when I messed up the files, but I have so many similarly named variations that it's easy to do. You might ask, but what about the four that show FAIL? Those four files are not part of the game. They are part of an attached slideshow demo for some other titles (F117A Nighthawk and Gunship 2000), and they use a newer version of the PIC file format (91), so I would expect to see them fail here. I'd be more worried if they passed!
Well, since we have no issues, how about we look into some of the other games we know to use the same format, and check their files? (Many will actually be identical to the ones we already tested, as MicroProse reused several assets between games.)
% check-all f19
BAD.PIC: e91114812b9882fd03f0ec1225f7282b [pass]
ROSTER.PIC: 50a23dd329707b0b255d7014b778da0c [pass]
COCKPIT.PIC: 33bf8ae93b3e5316e98f4b16bd0ad61d [pass]
RESCUE.PIC: 1018baa5065a5c817afa9bd954868401 [pass]
MEDAL.PIC: c5dd9230d19e27c75bf36d43e1e34c1c [pass]
TITLE16.PIC: 5993be8af48dec234e8cfb7e9c131dfb [pass]
TASS.PIC: de362718594cbb0d44aff18445b78565 [pass]
ARMING.PIC: 2acbef6bc8dd1e2db7bf5c053ad916c5 [pass]
CLIP.PIC: ec1a76337430de510e163a3749dd9022 [pass]
NOTE.PIC: 2da73b28c90a4039dcb3a702d56af413 [pass]
GRAVE.PIC: a856080d7d4a35c4067198680d62a34d [pass]
BARTRI.PIC: 3e65c8ee0c5e2a1d6b6b5420bb0a20fe [pass]
256PIT.PIC: 4cc05dc64129b30b4fe164164f0e0f33 [pass]
ADV.PIC: d11ccbfdad6b71b6f31f412f41697238 [pass]
BARLONE.PIC: 73f12d7432d2de5fecfe775ba519de6d [pass]
GOOD.PIC: 5eb3ecfc9ae18ce54bea510301777ae9 [pass]
BARFULL.PIC: 9c17d3fd446e4aab433e6082f0dc977e [pass]
CREDIT16.PIC: cd50bf8f70432fdb093e2571f383a5ec [pass]
FLAG.PIC: a0a2ebe761b15f56014c2fdb46717d1d [pass]
LAND.PIC: 0684f8db2828f4103e93bb6552e73247 [pass]
MAPS.SPR: 0ff3f60793efa8d4d2981b4f98915368 [pass]
MEDAL.SPR: 1111c1562fc5a12740a4c4e9cf81bc42 [pass]
DBICONS.SPR: d246a3631d9ce6e4746415d4f360ec68 [pass]
ARMING.SPR: 8b235908ca74b93885dd43f7744f94da [pass]
NCAPE.SPR: a019f6eade84d4d6a2c8034850979c14 [pass]
LIBYA.SPR: fb234541a8fd588684c80417eebbf729 [pass]
PERSIAN.SPR: fff6a0e3d2d16af739d9a36eb413339f [pass]
F19.SPR: b95c418e696ff977f3e3022829ce67ae [pass]
CEUROPE.SPR: bacf2b850fcc6b592f92f046c89b15c8 [pass]
% check-all f117a
ROSTSPRT.PIC: 0247ff11c22610cb825844ef778bbce0 [pass]
N_CAPE.PIC: da3b521e06977290959b479dcc6d384a [pass]
PLANES5.PIC: 79e4837af12a0735e2b4df0dc35b2685 [pass]
RESCUE.PIC: dc60e482dcc9b5775fa58dea113100ed [pass]
PLANES4.PIC: 6b756df049e2de3bf7513244c7ded43c [pass]
ORDRSCRN.PIC: 00716e2a405cb61856d97b425b53d5b2 [pass]
VIETNAM.PIC: 986030959f80295568528661c990229f [pass]
ORDRSPRT.PIC: 8da2970829c6df7de0512a8b3fc52363 [pass]
PLANES1.PIC: fffdf8b197e6e3cde18e3826bc4134d9 [pass]
CEMETARY.PIC: 92ef318c9c56c1de9ef3fee53169ddca [pass]
PLANES3.PIC: 1a0e8b92c6e3117487f06f8518500ed2 [pass]
ROSTSCRN.PIC: 0508225374c59446082108b9f9f607a0 [pass]
PLANES2.PIC: 000e1d07a427639ca897df78ea36e633 [pass]
BREFSCRN.PIC: 7b416c3be29683eda15ebd0660397263 [pass]
MID_EAST.PIC: 88675a0cb16ef898f21d59c0afe9c4a8 [pass]
KUWAIT.PIC: f0b5bec340621862c9105d723a2f4c49 [pass]
256RIGHT.PIC: aea3fcbf4b4ca520c0438150ebbc07ac [pass]
HANGSCRN.PIC: 89d6eca2bc4300b094d2ec7ec8267d71 [pass]
HOMESCRN.PIC: d694600ffcc916d21cb62d41d542a443 [pass]
ENDHANGR.PIC: 2f07a70e9718fde552f93587c54749a9 [pass]
SWAP.PIC: 0f4c861db2f1b57777f1d91f2da15271 [pass]
HANGSPRT.PIC: 4ac9e737b0fb0e234691e3d48159f92a [pass]
TASS.PIC: 7108fb5c06e9b2cf27536d28ca2193cf [pass]
ARMSSCRN.PIC: 83a65d134dbd25326e50f432e8ec11e7 [pass]
KOREA.PIC: b506fa625e1b77435587e1b64f87c0dd [pass]
FLIGHT.PIC: 12307c6664a5e9f3b14fdf1f221972ca [pass]
BREFICON.PIC: daccea4654b43190407745b8708dfc18 [pass]
PERSIAN.PIC: e0f3e22b9ff0452f2bf2a73fb5a768ac [pass]
C_EUROPE.PIC: 53a4a031cf457c083e9e14b100e33e29 [pass]
MEDALS.PIC: c36266b4f259ed9acdc6ccde1dcc3d05 [pass]
LIBYA.PIC: cc93671793e143a1a042a911e8072032 [pass]
ORDNANCE.PIC: ceba719c7670e79576bba690bbc6c746 [pass]
CUBA.PIC: b8214a2358aae299348ebdf1509bc278 [pass]
LANDED1.PIC: 0b47f91b2c2279e3d99f2d9b2435f255 [pass]
DESK.PIC: d2011fb3a796a07567ef47f7e67ffa6f [pass]
WEAPONS1.PIC: 348de752660c609bbbe93a69d623edc2 [pass]
WEAPONS3.PIC: d0abf587a009cf52c39d544762fc28ab [pass]
LANDED2.PIC: cd0e20a76c552325f736db061368c713 [pass]
256PIT.PIC: 38e1c13be2b14661bc4e89c274b6c66a [pass]
AWARDS.PIC: f7afe28e6c851c98b978f33d37ba28d1 [pass]
LANDED3.PIC: 6ffd01da7a1f97a776f19d0f3e03960c [pass]
CLIMBIN.PIC: b58f3f2189ff9c41feb5e42f186a1d40 [pass]
ADV.PIC: da1c4dc96f71344fa1ffeb99184e51d7 [pass]
WEAPONS2.PIC: d9d28cd4b7627507cd2c912f6e43be51 [pass]
FOLDER1.PIC: 79a4304a526ce221dd1ec4feab542e3d [pass]
256REAR.PIC: 429308d94f0ed25bd2348c13ff22e047 [pass]
FOLDER2.PIC: ef6d69d6b8862519ff78dc40a350f304 [pass]
LANDED4.PIC: 770d20bcb166aa44c2899c2a7bbe6a2d [pass]
REQUESTR.PIC: fe0097be8eb45e7453f1c4f150cdbbb1 [pass]
FLAG.PIC: 5e87f60659f9a4376c28f10f862e91c9 [pass]
AIR_ID.PIC: 0904951e6c454831071279ebda5f5acc [pass]
256LEFT.PIC: 2944c0964a473d0e6d07c0b52f0c2ba4 [pass]
That’s a great outcome, and I’m sure it will light up the eyes of any modders out there. I know I’m happy, and relieved, to be over this hurdle with the PIC File Format. Next steps will be to add the code necessary to read and write the two other variants we are aware of at this stage. This version seems to sit at the core of those, so that effort should largely just be handling of the file header. Then we will clean things up, and make it ready to share with the community at large.
This post is part of a series of posts surrounding my reverse engineering efforts of the PIC file format that MicroProse used with their games, specifically F-15 Strike Eagle II (though I plan to trace the format through other titles to see if and how it changes). To read my other posts on this topic you can use this link to my archive page for the PIC File Format, which will contain all my posts to date on the subject.

