Ouch my eye!

Do not look at LASER with remaining eye!


Unfinished Business

We have some unfinished business to take care of in light of the discoveries we made last time. I no longer trust my prior results with the RLE encoder, so we need to go back and revisit them. Last post we wrapped things up by solving the red herring of the LZW compressor generating garbage, which calls into question the issue we had with the same file during RLE encoding. But first, we need to put all the pieces of the encoder together, and with that perhaps we can kill two birds with one stone: address the questionable RLE result, and fully validate our encode and decode stacks.

Gluing the PIC encoder together

The process here is going to be very similar to what we did when we glued all the pieces together for the decoder in this post. Just like with the decoder, we’re going to start by wrapping our LZW and RLE state types with a PIC state type.

typedef struct {
    uint8_t max_width; // maximum code width for the file
    bool isPacked;     // flag for pixel packing 16 colour files
    rle_state_t rle;   // RLE state machine
    lzw_state_t lzw;   // LZW state machine 
} pic_state_t;

This is pretty much identical to what we had with the decoder, though it no longer carries the raw format identifier, as we now understand how to properly decode, and encode, that byte. Next is our main entry point to encode a file. For simplicity this time, I stayed with file in and file out, instead of using a memory buffer.

int pic_compress(FILE *dst, FILE *src) {
    pic_state_t *ctx;

    if(NULL == (ctx = (pic_state_t *)calloc(1, sizeof(pic_state_t)))) {
        return PIC_NOMEM;
    }

    // init the contexts here
    lzw_init(&ctx->lzw);
    rle_init(&ctx->rle);

    int8_t format_identifier = fgetc(src); // read in the format byte
    ctx->isPacked = (format_identifier > 0);  // positive value indicates packed pixels
    ctx->max_width = abs(format_identifier);  // absolute value is the max encoding width
    fputc(format_identifier, dst);

    if(ctx->max_width > LZW_MAX_CODE_WIDTH) ctx->max_width = LZW_MAX_CODE_WIDTH;

    // compress the rest of the input file into the output file
    int rval = pack(dst, src, ctx);

    free(ctx);
    return rval;
}
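To make the format byte handling concrete, here is a minimal standalone sketch of the same interpretation: the sign selects pixel packing, and the magnitude, clamped to a limit, gives the maximum code width. The names parse_format_byte and format_info_t are hypothetical, and the limit is passed in as a parameter rather than taken from LZW_MAX_CODE_WIDTH. Unlike the listing above, it also rejects an EOF result from fgetc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical container for the two facts carried by the format byte.
typedef struct {
    bool    is_packed;  // positive byte: pixels are packed
    uint8_t max_width;  // magnitude of the byte, clamped to the limit
} format_info_t;

// raw is the value returned by fgetc(): 0..255, or negative on EOF.
// Returns false when no byte was available.
static bool parse_format_byte(int raw, int width_limit, format_info_t *out) {
    assert(out != NULL);
    if (raw < 0 || raw > 255) return false;

    // reinterpret the unsigned byte as a signed 8-bit value
    int id = (raw >= 128) ? raw - 256 : raw;

    out->is_packed = (id > 0);
    int width = abs(id);
    if (width > width_limit) width = width_limit;
    out->max_width = (uint8_t)width;
    return true;
}
```

With this sketch a byte of 0x0B reads as packed with a width of 11, while 0xF5 (which is -11 as a signed byte) reads as unpacked with the same width.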

The next step is our packing routine, which simply reads bytes from the input file until EOF, packs pixels based on the isPacked flag, and then passes the packed pixels off to the RLE encoder.
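As a rough illustration of the packing step alone, the sketch below folds two 4-bit pixels into each output byte. It is not the actual pack() routine: it works on memory buffers instead of files, the name pack_nibbles is hypothetical, and placing the first pixel in the low nibble is an assumption that this post does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Packs `count` 4-bit pixels from src into dst, two pixels per byte.
// ASSUMPTION: the first pixel of each pair lands in the low nibble.
// dst must have room for (count + 1) / 2 bytes.
// Returns the number of bytes written.
static size_t pack_nibbles(uint8_t *dst, const uint8_t *src, size_t count) {
    assert(dst != NULL && src != NULL);
    size_t out = 0;
    for (size_t i = 0; i < count; i += 2) {
        uint8_t lo = src[i] & 0x0F;
        // an odd trailing pixel is padded with zero in the high nibble
        uint8_t hi = (i + 1 < count) ? (uint8_t)(src[i + 1] & 0x0F) : 0;
        dst[out++] = (uint8_t)(lo | (hi << 4));
    }
    return out;
}
```

In the real encoder each packed byte would be handed to rle_encode() as it is produced, rather than collected in a buffer. The RLE stage that receives those bytes follows.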

int rle_encode(FILE *dst, int symbol, pic_state_t *pic) {
    rle_state_t *ctx = &pic->rle;

    if(EOF == symbol) {
        // we've reached the end, we need to drain out anything we may not have emitted yet.
        int rval = rle_drain(dst, pic);
        if(PIC_NOERROR != rval) return rval;
        return lzw_compress(dst, symbol, pic); // pass the EOF forward
    }

    if(RLE_TOKEN == symbol) {
        // handle the special case of the token value in the stream
        int rval = rle_drain(dst, pic);
        if(PIC_NOERROR != rval) return rval;
        return rle_escape(dst, pic);
    }

    if(false == ctx->encoding) {
        // not currently in an encoding, store and count the symbol and exit
        ctx->encoding = true;
        ctx->symbol = symbol;
        ctx->count++;
        return PIC_NOERROR;
    }

    if(ctx->symbol != symbol) { // symbols changed
        int rval = rle_drain(dst, pic); // drain it out
        if(PIC_NOERROR != rval) return rval;
        ctx->encoding = true; // store and count the new symbol
        ctx->symbol = symbol;
        ctx->count++;
        return PIC_NOERROR;
    } 

    ctx->count++;
    if(255 == ctx->count) { // max count reached, drain it out
        return rle_drain(dst, pic);
    }
    return PIC_NOERROR;
}

The RLE code is largely the same. The one notable exception is the change in the EOF branch, which now forwards EOF to the next stage so it can flush any pending data as well. All the other RLE routines saw similar changes, with their respective fputc() calls replaced with lzw_compress() calls.
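The chaining idea can be seen in isolation with a toy two-stage pipeline. This is only a sketch of the pattern: the run stage emits plain (count, symbol) pairs rather than the PIC token format, and sink_t, run_stage_t, sink_put, run_drain and run_put are all hypothetical names. What it shares with the code above is that every stage takes one symbol per call, and EOF first drains the current stage and is then passed downstream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>  // EOF

// Final stage: collects bytes and records that EOF asked it to flush.
typedef struct {
    uint8_t buf[64];
    size_t  len;
    bool    flushed;
} sink_t;

// Run-counting stage that feeds the sink.
typedef struct {
    bool    encoding;
    int     symbol;
    int     count;
    sink_t *next;
} run_stage_t;

static int sink_put(sink_t *s, int symbol) {
    assert(s != NULL);
    if (EOF == symbol) { s->flushed = true; return 0; }
    if (s->len >= sizeof s->buf) return -1;  // out of room
    s->buf[s->len++] = (uint8_t)symbol;
    return 0;
}

// Emit the pending run, if any, as a (count, symbol) pair.
static int run_drain(run_stage_t *r) {
    if (!r->encoding) return 0;
    int rval = sink_put(r->next, r->count);
    if (0 == rval) rval = sink_put(r->next, r->symbol);
    r->encoding = false;
    r->count = 0;
    return rval;
}

static int run_put(run_stage_t *r, int symbol) {
    if (EOF == symbol) {
        // drain our own state first, then let the next stage flush too
        int rval = run_drain(r);
        if (0 != rval) return rval;
        return sink_put(r->next, EOF);
    }
    if (r->encoding && r->symbol != symbol) {
        int rval = run_drain(r);  // symbol changed, emit the old run
        if (0 != rval) return rval;
    }
    r->encoding = true;
    r->symbol = symbol;
    if (255 == ++r->count) return run_drain(r);  // max count reached
    return 0;
}
```

Feeding it 'a', 'a', 'a', 'b' leaves the 'b' run pending inside the run stage. Only the EOF call pushes it out and marks the sink as flushed, which is the reason EOF has to travel through every stage.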

// compresses one input symbol per call; passing EOF flushes the output
int lzw_compress(FILE *dst, int symbol, pic_state_t *pic) {

    lzw_state_t *ctx = &pic->lzw;

    // passing EOF triggers a flushing of the output
    if(EOF == symbol) {
        write_code(dst, ctx->chain_code, ctx); // write the final string code
        write_code(dst, LZW_EOF, ctx); // flush the bit buffer
        return PIC_NOERROR; // nothing more to do
    }

    // startup condition
    if(LZW_EOC == ctx->chain_code) {
        // no need to look it up, all single character codes are
        // guaranteed to be in the dictionary
        ctx->chain_code = symbol; 
        return PIC_NOERROR;
    }

    // see if chain_code + symbol exists already
    uint16_t new_code = seek_chain(ctx->chain_code, symbol, ctx); 

    if(LZW_NO_CODE != new_code) { // match found, keep extending
        ctx->chain_code = new_code;
        return PIC_NOERROR;
    }

    // write out the old code
    write_code(dst, ctx->chain_code, ctx);

    // handle table expansion and reset
    if(ctx->next_code >= ctx->resize_code) {
        // write_code(dst, symbol, ctx);
        // check if we are at max bits, if so reset instead of resize
        if(LZW_MAX_CODE_WIDTH > ctx->code_bits) {
            lzw_resize(ctx);
        } else {
            uint8_t chain_symbol = ctx->dictionary[ctx->chain_code].symbol;
            lzw_reset(ctx);
            // root the new table on the last symbol emitted 
            ctx->chain_code = chain_symbol; 
        }
    }

    // create a new code
    // add it to the hash before the dictionary code updates the value
    ctx->hash_table[ctx->hash_point] = ctx->next_code;
    // add the new string to the dictionary, consumes next_code and increments 
    lzw_add_code(ctx->chain_code, symbol, ctx); 
    // start the next string with the current symbol
    ctx->chain_code = symbol;

    return PIC_NOERROR;
}

The LZW code is nearly identical to what we had before; the only change is how the context is passed in and unwrapped.
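Since write_code() is not shown here, the following is a minimal sketch of what a variable-width code writer with a final flush could look like. It assumes codes are packed least significant bit first, which this post does not state, and bitwriter_t, bw_put and bw_flush are hypothetical names. It writes to a small memory buffer instead of a FILE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t acc;      // bits waiting to be written
    int      nbits;    // valid bits in acc, always below 8 between calls
    uint8_t  out[32];  // output bytes
    size_t   len;      // bytes written so far
} bitwriter_t;

// Append `width` bits of `code`. ASSUMPTION: LSB-first packing.
static void bw_put(bitwriter_t *bw, uint16_t code, int width) {
    assert(width > 0 && width <= 16);
    uint32_t masked = (uint32_t)code & ((1u << width) - 1u);
    bw->acc |= masked << bw->nbits;
    bw->nbits += width;
    while (bw->nbits >= 8) {  // emit every completed byte
        assert(bw->len < sizeof bw->out);
        bw->out[bw->len++] = (uint8_t)(bw->acc & 0xFF);
        bw->acc >>= 8;
        bw->nbits -= 8;
    }
}

// Write out any partial byte, padding the unused high bits with zero.
static void bw_flush(bitwriter_t *bw) {
    if (bw->nbits > 0) {
        assert(bw->len < sizeof bw->out);
        bw->out[bw->len++] = (uint8_t)(bw->acc & 0xFF);
        bw->acc = 0;
        bw->nbits = 0;
    }
}
```

Two 9-bit codes only fill 18 bits, so without the flush the last two bits would never reach the output. That is why the EOF branch has to push something through the bit buffer after writing the final string code.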

That is pretty much it. All we need to do now is build it and wrap some simple shell scripts around this and the decoder, so we can use them to validate each other against as much source data as we can give them. First up is a script that takes in a file name, decodes it to a temporary file, then encodes the temporary file back to a PIC. It then generates an MD5 signature for the original and the re-encoded file, and compares the two.

#!/bin/sh

# strip off the path from the input file name for cleaner prints
name=$(basename "$1")

# decode the file to a temporary file
./pic2mpraw "$1" "OUT.PIX" > /dev/null  2>&1

# re-encode the temporary file back to a PIC
./mpraw2pic "OUT.PIX" "OUT.PIC" > /dev/null  2>&1

# hash the files, and compare the results 
master=$(md5 -q "$1" 2>&1 )
encode=$(md5 -q "OUT.PIC" 2>&1 )
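# plain sh has no [[ ]], and echo does not portably expand \033, so use [ ] and printf
if [ "$master" = "$encode" ]; then
    result=$(printf '\033[32m[pass]\033[0m')
else
    result=$(printf '\033[7;31m[FAIL]\033[0m')
fi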

# cleanup
[ -e OUT.PIX ] && rm "OUT.PIX"
[ -e OUT.PIC ] && rm "OUT.PIC"

printf '%12s: \033[2m%s\033[22m  %s\n' "$name" "$master" "$result"

Finally, another simple script finds all .PIC and .SPR files within a given directory and calls the above script for each matching file.

#!/bin/sh

find "$(pwd)/${1}" -name "*.PIC" -maxdepth 5 -type f -not -path '*/\.*' -exec ./check-one {} \;
find "$(pwd)/${1}" -name "*.SPR" -maxdepth 5 -type f -not -path '*/\.*' -exec ./check-one {} \;

I crave validation

Now that we have our decoder and encoder written, let’s use them to validate our work by decoding actual game assets, and then re-encoding the decoded result and comparing it to the original. As noted above, I wrote up a pair of scripts to find all the .PIC and .SPR files in a given directory, and then go through the process of decoding, encoding and comparing the results. Below are the results for F15 Strike Eagle II and the Desert Storm scenario expansion pack.

% check-all f15-se2
   DEATH.PIC: 6e26bcc7228da3563f17c1c919e14349  [pass]
 COCKPIT.PIC: ff1bba14e570245e94e5f1a6186422a3  [pass]
 HISCORE.PIC: 537bbf274d79ebaa616f55917d14bf19  [pass]
   RIGHT.PIC: 94190aa3fe2f7de2395b15dab9c04a53  [pass]
   MEDAL.PIC: 3f75ad96c4d9c61842d58e087394d968  [pass]
   PROMO.PIC: fa3efcb64547c7fc73bc165b4d06bbf2  [pass]
256RIGHT.PIC: d18609791beb5a4b6bbc3c95a49979ee  [pass]
 TITLE16.PIC: bd7a9c10dcd06fec62af1071a678ad7f  [pass]
    WALL.PIC: 0839cb62142b5d3e5058b596ad36fb32  [pass]
    LEFT.PIC: 789455e28757793fec416cc444477063  [pass]
ARMPIECE.PIC: f6b8b7b27b1de44282ca04ea6d369ea4  [pass]
    REAR.PIC: 0bbc6acfeaef8c7988cd5e75aaaf6320  [pass]
    DESK.PIC: 07ae72e86ae2c38cb7293f86cb108e93  [pass]
TITLE640.PIC: 14c7e302d9ba0b3567196f43b2a914a6  [pass]
  256PIT.PIC: b6651ea956bec71890bf90de9a31fb1d  [pass]
     ADV.PIC: fad492070c3afb3a11f32ad266df428d  [pass]
 256REAR.PIC: 4c4704170d85e18c842e18b1f438d266  [pass]
    LABS.PIC: 157e724e8daa31336fe9f06255bd73c4  [pass]
 256LEFT.PIC: 8a8d0d29a6789de4971a5381cb89a60c  [pass]
 DBICONS.SPR: a490a0bf84b2c36dedab4fee0372bd09  [pass]
     F15.SPR: ac3d782b7a7c446dcf3036d532a3dbae  [pass]
   LIBYA.SPR: fb234541a8fd588684c80417eebbf729  [pass]
 PERSIAN.SPR: fff6a0e3d2d16af739d9a36eb413339f  [pass]
      ME.SPR: 8079c4abfa88c04aad08e82908b7554e  [pass]
      VN.SPR: 17e00b718ea797eff553ffca1d3ba2bb  [pass]
% check-all f15-se2-ds 
       3.PIC: e488a9127ba89f636fd0151da599ddb1  [FAIL]
       2.PIC: a48a7ec1de1637da2d132a0fab3b4894  [FAIL]
       1.PIC: 82b6b193954a0abba22f5f8267291d14  [FAIL]
       4.PIC: d81029a10c5bcb0705ac98b6cc75621d  [FAIL]
   NCAPE.SPR: a019f6eade84d4d6a2c8034850979c14  [pass]
      JP.SPR: e2252dea49554e9d51482e6ba0a22a5b  [pass]
 CEUROPE.SPR: bacf2b850fcc6b592f92f046c89b15c8  [pass]

This is fantastic! It confirms that the bad data was already in place when I was doing the RLE encoding. I’m not sure when I messed up the files, but with so many similarly named variations it’s an easy mistake to make. You might ask: what about the four that show FAIL? Those four files are not part of the game; they belong to an attached slideshow demo for some other titles (F117A Nighthawk and Gunship 2000) and use a newer version of the PIC file format (91), so I would expect them to fail here. I’d be more worried if they passed!

F117A Demo Slide included with Desert Storm
Gunship 2000 Demo Slide included with Desert Storm

Well, since we have no issues, how about we look into some of the other games we know use the same format, and check their files? (Many will actually be identical to the ones we already tested, as MicroProse reused several assets between games.)

% check-all f19  
     BAD.PIC: e91114812b9882fd03f0ec1225f7282b  [pass]
  ROSTER.PIC: 50a23dd329707b0b255d7014b778da0c  [pass]
 COCKPIT.PIC: 33bf8ae93b3e5316e98f4b16bd0ad61d  [pass]
  RESCUE.PIC: 1018baa5065a5c817afa9bd954868401  [pass]
   MEDAL.PIC: c5dd9230d19e27c75bf36d43e1e34c1c  [pass]
 TITLE16.PIC: 5993be8af48dec234e8cfb7e9c131dfb  [pass]
    TASS.PIC: de362718594cbb0d44aff18445b78565  [pass]
  ARMING.PIC: 2acbef6bc8dd1e2db7bf5c053ad916c5  [pass]
    CLIP.PIC: ec1a76337430de510e163a3749dd9022  [pass]
    NOTE.PIC: 2da73b28c90a4039dcb3a702d56af413  [pass]
   GRAVE.PIC: a856080d7d4a35c4067198680d62a34d  [pass]
  BARTRI.PIC: 3e65c8ee0c5e2a1d6b6b5420bb0a20fe  [pass]
  256PIT.PIC: 4cc05dc64129b30b4fe164164f0e0f33  [pass]
     ADV.PIC: d11ccbfdad6b71b6f31f412f41697238  [pass]
 BARLONE.PIC: 73f12d7432d2de5fecfe775ba519de6d  [pass]
    GOOD.PIC: 5eb3ecfc9ae18ce54bea510301777ae9  [pass]
 BARFULL.PIC: 9c17d3fd446e4aab433e6082f0dc977e  [pass]
CREDIT16.PIC: cd50bf8f70432fdb093e2571f383a5ec  [pass]
    FLAG.PIC: a0a2ebe761b15f56014c2fdb46717d1d  [pass]
    LAND.PIC: 0684f8db2828f4103e93bb6552e73247  [pass]
    MAPS.SPR: 0ff3f60793efa8d4d2981b4f98915368  [pass]
   MEDAL.SPR: 1111c1562fc5a12740a4c4e9cf81bc42  [pass]
 DBICONS.SPR: d246a3631d9ce6e4746415d4f360ec68  [pass]
  ARMING.SPR: 8b235908ca74b93885dd43f7744f94da  [pass]
   NCAPE.SPR: a019f6eade84d4d6a2c8034850979c14  [pass]
   LIBYA.SPR: fb234541a8fd588684c80417eebbf729  [pass]
 PERSIAN.SPR: fff6a0e3d2d16af739d9a36eb413339f  [pass]
     F19.SPR: b95c418e696ff977f3e3022829ce67ae  [pass]
 CEUROPE.SPR: bacf2b850fcc6b592f92f046c89b15c8  [pass]
% check-all f117a
ROSTSPRT.PIC: 0247ff11c22610cb825844ef778bbce0  [pass]
  N_CAPE.PIC: da3b521e06977290959b479dcc6d384a  [pass]
 PLANES5.PIC: 79e4837af12a0735e2b4df0dc35b2685  [pass]
  RESCUE.PIC: dc60e482dcc9b5775fa58dea113100ed  [pass]
 PLANES4.PIC: 6b756df049e2de3bf7513244c7ded43c  [pass]
ORDRSCRN.PIC: 00716e2a405cb61856d97b425b53d5b2  [pass]
 VIETNAM.PIC: 986030959f80295568528661c990229f  [pass]
ORDRSPRT.PIC: 8da2970829c6df7de0512a8b3fc52363  [pass]
 PLANES1.PIC: fffdf8b197e6e3cde18e3826bc4134d9  [pass]
CEMETARY.PIC: 92ef318c9c56c1de9ef3fee53169ddca  [pass]
 PLANES3.PIC: 1a0e8b92c6e3117487f06f8518500ed2  [pass]
ROSTSCRN.PIC: 0508225374c59446082108b9f9f607a0  [pass]
 PLANES2.PIC: 000e1d07a427639ca897df78ea36e633  [pass]
BREFSCRN.PIC: 7b416c3be29683eda15ebd0660397263  [pass]
MID_EAST.PIC: 88675a0cb16ef898f21d59c0afe9c4a8  [pass]
  KUWAIT.PIC: f0b5bec340621862c9105d723a2f4c49  [pass]
256RIGHT.PIC: aea3fcbf4b4ca520c0438150ebbc07ac  [pass]
HANGSCRN.PIC: 89d6eca2bc4300b094d2ec7ec8267d71  [pass]
HOMESCRN.PIC: d694600ffcc916d21cb62d41d542a443  [pass]
ENDHANGR.PIC: 2f07a70e9718fde552f93587c54749a9  [pass]
    SWAP.PIC: 0f4c861db2f1b57777f1d91f2da15271  [pass]
HANGSPRT.PIC: 4ac9e737b0fb0e234691e3d48159f92a  [pass]
    TASS.PIC: 7108fb5c06e9b2cf27536d28ca2193cf  [pass]
ARMSSCRN.PIC: 83a65d134dbd25326e50f432e8ec11e7  [pass]
   KOREA.PIC: b506fa625e1b77435587e1b64f87c0dd  [pass]
  FLIGHT.PIC: 12307c6664a5e9f3b14fdf1f221972ca  [pass]
BREFICON.PIC: daccea4654b43190407745b8708dfc18  [pass]
 PERSIAN.PIC: e0f3e22b9ff0452f2bf2a73fb5a768ac  [pass]
C_EUROPE.PIC: 53a4a031cf457c083e9e14b100e33e29  [pass]
  MEDALS.PIC: c36266b4f259ed9acdc6ccde1dcc3d05  [pass]
   LIBYA.PIC: cc93671793e143a1a042a911e8072032  [pass]
ORDNANCE.PIC: ceba719c7670e79576bba690bbc6c746  [pass]
    CUBA.PIC: b8214a2358aae299348ebdf1509bc278  [pass]
 LANDED1.PIC: 0b47f91b2c2279e3d99f2d9b2435f255  [pass]
    DESK.PIC: d2011fb3a796a07567ef47f7e67ffa6f  [pass]
WEAPONS1.PIC: 348de752660c609bbbe93a69d623edc2  [pass]
WEAPONS3.PIC: d0abf587a009cf52c39d544762fc28ab  [pass]
 LANDED2.PIC: cd0e20a76c552325f736db061368c713  [pass]
  256PIT.PIC: 38e1c13be2b14661bc4e89c274b6c66a  [pass]
  AWARDS.PIC: f7afe28e6c851c98b978f33d37ba28d1  [pass]
 LANDED3.PIC: 6ffd01da7a1f97a776f19d0f3e03960c  [pass]
 CLIMBIN.PIC: b58f3f2189ff9c41feb5e42f186a1d40  [pass]
     ADV.PIC: da1c4dc96f71344fa1ffeb99184e51d7  [pass]
WEAPONS2.PIC: d9d28cd4b7627507cd2c912f6e43be51  [pass]
 FOLDER1.PIC: 79a4304a526ce221dd1ec4feab542e3d  [pass]
 256REAR.PIC: 429308d94f0ed25bd2348c13ff22e047  [pass]
 FOLDER2.PIC: ef6d69d6b8862519ff78dc40a350f304  [pass]
 LANDED4.PIC: 770d20bcb166aa44c2899c2a7bbe6a2d  [pass]
REQUESTR.PIC: fe0097be8eb45e7453f1c4f150cdbbb1  [pass]
    FLAG.PIC: 5e87f60659f9a4376c28f10f862e91c9  [pass]
  AIR_ID.PIC: 0904951e6c454831071279ebda5f5acc  [pass]
 256LEFT.PIC: 2944c0964a473d0e6d07c0b52f0c2ba4  [pass]

That’s a great outcome, and I’m sure it will light up the eyes of any modders out there. I know I’m happy, and relieved, to be over this hurdle with the PIC file format. Next steps will be to add the code necessary to read and write the two other variants we are aware of at this stage. This version seems to sit at the core of those, so that effort should largely just be handling of the file header. Then we will clean things up and make it ready to share with the community at large.

By Thread


