In my last post on the PIC93 version of the MicroProse PIC file format we left off having successfully decompressed the LZSS-compressed image data within the file and reconstituted the image. That part was easy, as the code was pretty much already written for us in the form of the unlzexe github repository. Getting to the point of identifying that Bellard-LZSS was the compression actually used wasn’t easy, but in the end we got there. In this post we’re going to try to reverse the process. This could prove to be far more challenging, as we don’t have a full reference to fall back on.
Where to begin?
I could probably try to reach out to Fabrice Bellard to see if he would be willing to release his source after all these years, and I may do that still, but that would be the easy way out. Part of this is the fun of the challenge in figuring it out for ourselves. It is also possible that the MicroProse implementation isn’t an exact match for Bellard’s, but close enough that the Bellard decoder can handle it. One option might be to reverse engineer LZEXE itself by looking at the generated machine code, but I don’t want to go that route. I’m not trying to recreate LZEXE, I just want to recreate the compression algorithm itself. We have a working decompression module, so we should be able to use that to validate our compression module, much in the same way we did with LZW.
Since we don’t have a reference implementation for the Bellard-LZSS compression algorithm, we’re going to use the Okumura-LZSS implementation as a guide, just as Fabrice Bellard did back in 1989. Most of the specifics should be fairly straightforward to implement, as we can see how the decoding of Bellard’s version differs from the Okumura version, but there are some specifics that will still need some figuring out.
| Data/Pointer Type | Total Byte Usage | Flag Bits Used | Stream Bytes Used | Distance Bits | Distance Range | Length Bits | Length Range |
|---|---|---|---|---|---|---|---|
| Literal uncompressed byte | 1.125 | 1 | 1 | N/A | N/A | N/A | N/A |
| Short distance | 1.5 | 4 | 1 | 8 | -1 – -256 | 2 | 2–5 |
| Long distance | 2.25 | 2 | 2 | 13 | -1 – -8,192 | 3 | 3–9 |
| Long distance/long length | 3.25 | 2 | 3 | 13 | -1 – -8,192 | 8 | 3–256 |

*Table 1: Bellard-LZSS data and pointer types, as documented by cosmodoc.*
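As a quick sanity check of the “Total Byte Usage” column: each entry is just the flag bits (at 1/8 of a byte each) plus the whole stream bytes. A throwaway helper (my own, not part of the format) confirms the arithmetic:

```c
#include <assert.h>

// Cost in bytes of one encoded symbol: flag bits cost 1/8 byte each,
// stream bytes are whole bytes. Helper name is my own invention.
static double symbol_cost_bytes(int flag_bits, int stream_bytes) {
    return flag_bits / 8.0 + stream_bytes;
}
```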
For reference here is the main compression loop for the Okumura-LZSS implementation.
void Encode(void)
{
int i, c, len, r, s, last_match_length, code_buf_ptr;
unsigned char code_buf[17], mask;
InitTree(); /* initialize trees */
code_buf[0] = 0; /* code_buf[1..16] saves eight units of code, and
code_buf[0] works as eight flags, "1" representing that the unit
is an unencoded letter (1 byte), "0" a position-and-length pair
(2 bytes). Thus, eight units require at most 16 bytes of code. */
code_buf_ptr = mask = 1;
s = 0; r = N - F;
for (i = s; i < r; i++) text_buf[i] = ' '; /* Clear the buffer with
any character that will appear often. */
for (len = 0; len < F && (c = getc(infile)) != EOF; len++)
text_buf[r + len] = c; /* Read F bytes into the last F bytes of
the buffer */
if ((textsize = len) == 0) return; /* text of size zero */
for (i = 1; i <= F; i++) InsertNode(r - i); /* Insert the F strings,
each of which begins with one or more 'space' characters. Note
the order in which these strings are inserted. This way,
degenerate trees will be less likely to occur. */
InsertNode(r); /* Finally, insert the whole string just read. The
global variables match_length and match_position are set. */
do {
if (match_length > len) match_length = len; /* match_length
may be spuriously long near the end of text. */
if (match_length <= THRESHOLD) {
match_length = 1; /* Not long enough match. Send one byte. */
code_buf[0] |= mask; /* 'send one byte' flag */
code_buf[code_buf_ptr++] = text_buf[r]; /* Send uncoded. */
} else {
code_buf[code_buf_ptr++] = (unsigned char) match_position;
code_buf[code_buf_ptr++] = (unsigned char)
(((match_position >> 4) & 0xf0)
| (match_length - (THRESHOLD + 1))); /* Send position and
length pair. Note match_length > THRESHOLD. */
}
if ((mask <<= 1) == 0) { /* Shift mask left one bit. */
for (i = 0; i < code_buf_ptr; i++) /* Send at most 8 units of */
putc(code_buf[i], outfile); /* code together */
codesize += code_buf_ptr;
code_buf[0] = 0; code_buf_ptr = mask = 1;
}
last_match_length = match_length;
for (i = 0; i < last_match_length &&
(c = getc(infile)) != EOF; i++) {
DeleteNode(s); /* Delete old strings and */
text_buf[s] = c; /* read new bytes */
if (s < F - 1) text_buf[s + N] = c; /* If the position is
near the end of buffer, extend the buffer to make
string comparison easier. */
s = (s + 1) & (N - 1); r = (r + 1) & (N - 1);
/* Since this is a ring buffer, increment the position
modulo N. */
InsertNode(r); /* Register the string in text_buf[r..r+F-1] */
}
if ((textsize += i) > printcount) {
printf("%12ld\r", textsize); printcount += 1024;
/* Reports progress each time the textsize exceeds
multiples of 1024. */
}
while (i++ < last_match_length) { /* After the end of text, */
DeleteNode(s); /* no need to read, but */
s = (s + 1) & (N - 1); r = (r + 1) & (N - 1);
if (--len) InsertNode(r); /* buffer may not be empty. */
}
} while (len > 0); /* until length of string to be processed is zero */
if (code_buf_ptr > 1) { /* Send remaining code. */
for (i = 0; i < code_buf_ptr; i++) putc(code_buf[i], outfile);
codesize += code_buf_ptr;
}
printf("In : %ld bytes\n", textsize); /* Encoding is done. */
printf("Out: %ld bytes\n", codesize);
printf("Out/In: %.3f\n", (double)codesize / textsize);
}
As you can see, unlike the simple elegance of the decompressor, the compressor is far more complex. We will just have to take things one little piece at a time, and hopefully have a functioning compressor at the end of it all. The above code, and the table above it, along with the decompression code, will serve as our guides in putting the compressor together. Also, from the table pulled from cosmodoc, we can see that we have 3 different pointer types, whereas the Okumura implementation has only one, so we have some work to do still.
Implementing the Bellard-LZSS compressor
To try to minimize major structural changes to the Okumura reference, we will use file streams for both input and output. Once we have a working algorithm, we can look at changing one, or both to memory based streams. For now we want to concentrate on making only the changes necessary to implement the Bellard variation, and not how we might want to optimize things.
Unlike decompression, where we don’t need to buffer any data, for compression we need to be able to prefix the 16 flag bits before the compressed bytes they represent; let’s call this a compression unit. Once a compression unit is full (all the flag bits used), we write the whole thing out. Now we need to determine the maximum length of a compression unit. So let’s consider a few cases, from the decompression perspective. If flags were FFFF (all ones) then the data portion would be 16 bytes of uncompressed data; not likely, but we are considering extremes here to find the extents. Next would be the case of a 00 bit sequence, in which two more bits are consumed plus one byte of control data. As 4 flag bits are consumed in this scenario, the data portion would only be 4 bytes. The final case to consider is a 01 bit sequence, in which only 2 flag bits are consumed, with 2-3 bytes of additional data. In this worst case we could need 24 bytes of data. So that means we need a buffer that can hold our 16 bits of flags plus up to 24 bytes of data, 26 bytes total. For simplicity let’s round this up to 32, leaving us with the following 32 byte structure.
typedef struct {
uint16_t flags; // our 16 bit control flags
uint8_t data[30]; // our data for this compression unit
} lzss_unit_t;
Note there is nothing in there for how many flag bits or bytes of additional data are actually valid; this needs to be managed externally, as it is not part of what is written out. Instead of actually tracking how many bits of the flags are used, we will just track the mask used to insert the bits; when it overflows to zero, we know that all 16 are used and we need to flush and start over. To do that, we will wrap the above structure in another, to hold our working context. (We will add to this structure as we go.)
typedef struct {
uint16_t mask; // bit mask for the current flag bit.
int byte_count; // tracks the number of data bytes in the compression unit
lzss_unit_t unit; // the current 'working' lzss compression unit
} lzss_ctx_t;
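Before moving on, the worst-case sizing argument from above can be double-checked in code. This is just my own sketch of the reasoning: the hungriest consumer of data bytes per flag bit is the 01 (long pointer) case, at 2 flag bits for up to 3 data bytes, so one full 16-bit flags word can demand at most 24 data bytes, which fits comfortably in the 30-byte data field.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t flags;     // our 16 bit control flags
    uint8_t  data[30];  // our data for this compression unit
} lzss_unit_t;

// Worst-case data bytes one full flags word can describe: the 01 (long
// pointer) case consumes only 2 flag bits but up to 3 data bytes.
static int worst_case_data_bytes(void) {
    return (16 / 2) * 3;
}
```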
Since LZSS is a sliding window type compression, the next thing we need is the window itself. We were able to avoid this with the decompressor, since we decompressed to memory and used the memory buffer as our window. We could do this again with the compressor by having the entire uncompressed source in memory, but to simplify things we will manage it separately for now. The question now becomes: how large is the window? To answer that we need to consider the maximum reference offset we can produce. From the decompression code (or the table earlier) we can determine that the maximum reference offset is 13 bits (2^13 = 8192), meaning we need an 8K window. To facilitate quickly finding potential matching strings we will also need some form of tree structure; Okumura uses a binary tree, and we will just reuse that same code here, as it does not affect any of the other changes we need for the Bellard implementation. So our updated context structure now looks like this.
#define MAXBITS (13)
#define TABLESZ (1 << MAXBITS)
#define TABLEMSK (TABLESZ - 1) // mask for modulo arithmetic on the sliding window
#define STRINGSZ (256) // maximal matching string length
#define THRESHOLD (2) // break point for encoding literals or pointers
#define NIL (TABLESZ) // marker of empty tree entry
#define MAXFLAG (1UL << 16) // used for checking when FLAGS are full
typedef struct {
uint16_t mask; // bit mask for the current flag bit.
int byte_count; // tracks the number of data bytes in the compression unit
lzss_unit_t unit; // the current 'working' lzss compression unit
// set by insert_node()
uint16_t match_pos; // position of longest match
uint16_t match_len; // length of longest match
// search tree
uint16_t parent[TABLESZ + 1]; // parent node of search tree
uint16_t rchild[TABLESZ + 1 + 256]; // right child of search tree + 256 root entries
uint16_t lchild[TABLESZ + 1]; // left child of search tree
// sliding window
uint8_t window[TABLESZ + STRINGSZ - 1];
} lzss_ctx_t;
Binary trees
Before we get into the main compression loop, we need a few support functions. These are largely rehashes of the Okumura versions; I’ve just tried to clean things up a bit and make them more readable by giving the variables more meaningful names. The biggest changes here are around the init, which takes care of most of the initialization of our context.
/*
* initialize the LZSS state context
*
* lchild[] and rchild[] can be left uninitialized (up to TABLESZ)
* parent[] These must be set to NIL to mark as empty
* rchild[] TABLESZ + (0 to 256) are set to NIL - these are the 256 root entries
*/
static void init_state(lzss_ctx_t *ctx) {
// zero out the entire structure, including the buffers
bzero(ctx, sizeof(lzss_ctx_t)); // zero out the buffer
// Clear the buffer with any character that will appear often.
for (int i = 0; i < TABLESZ - STRINGSZ; i++)
ctx->window[i] = ' ';
// initialize the 256 root entries
for (int i = TABLESZ + 1; i <= TABLESZ + 256; i++) {
ctx->rchild[i] = NIL;
}
// mark all entries as empty
for (int i = 0; i < TABLESZ; i++) {
ctx->parent[i] = NIL;
}
ctx->mask = 0x0001;
}
Next is our insert_node(). This function is responsible for adding a new node, but also for returning the position of the maximal-length match. It is the main workhorse of our compression, as it is responsible for finding the matches we reference back to.
// Inserts string of length STRINGSZ, and returns the longest-match position
// and length via the context variables match_pos and match_len.
// If match_len = STRINGSZ, then removes the old node in favour of the new
// one, because the old one will be deleted sooner.
static void insert_node(int root, lzss_ctx_t *ctx) {
int len, pos, cmp;
uint8_t *key;
cmp = 1;
key = &ctx->window[root];
pos = TABLESZ + 1 + key[0];
ctx->rchild[root] = ctx->lchild[root] = NIL;
ctx->match_len = 0;
for ( ; ; ) {
if (cmp >= 0) {
if (ctx->rchild[pos] != NIL)
pos = ctx->rchild[pos];
else {
ctx->rchild[pos] = root;
ctx->parent[root] = pos;
return;
}
} else {
if (ctx->lchild[pos] != NIL)
pos = ctx->lchild[pos];
else {
ctx->lchild[pos] = root;
ctx->parent[root] = pos;
return;
}
}
for (len = 1; len < STRINGSZ; len++) {
if ((cmp = key[len] - ctx->window[pos + len]) != 0)
break;
}
if (len > ctx->match_len) {
ctx->match_pos = pos;
if ((ctx->match_len = len) >= STRINGSZ)
break;
}
}
ctx->parent[root] = ctx->parent[pos];
ctx->lchild[root] = ctx->lchild[pos];
ctx->rchild[root] = ctx->rchild[pos];
ctx->parent[ctx->lchild[pos]] = root;
ctx->parent[ctx->rchild[pos]] = root;
if (ctx->rchild[ctx->parent[pos]] == pos)
ctx->rchild[ctx->parent[pos]] = root;
else
ctx->lchild[ctx->parent[pos]] = root;
ctx->parent[pos] = NIL; // remove pos
}
Finally we also need a way to remove nodes from the trees as they age out of the sliding window. And that’s the last of the utility functions we need for managing our binary tree.
// deletes node 'node' from tree
static void delete_node(int node, lzss_ctx_t *ctx) {
int child;
if (ctx->parent[node] == NIL)
return; // not in tree
if (ctx->rchild[node] == NIL)
child = ctx->lchild[node];
else if (ctx->lchild[node] == NIL)
child = ctx->rchild[node];
else {
child = ctx->lchild[node];
if (ctx->rchild[child] != NIL) {
do {
child = ctx->rchild[child];
} while (ctx->rchild[child] != NIL);
ctx->rchild[ctx->parent[child]] = ctx->lchild[child];
ctx->parent[ctx->lchild[child]] = ctx->parent[child];
ctx->lchild[child] = ctx->lchild[node];
ctx->parent[ctx->lchild[node]] = child;
}
ctx->rchild[child] = ctx->rchild[node];
ctx->parent[ctx->rchild[node]] = child;
}
ctx->parent[child] = ctx->parent[node];
if (ctx->rchild[ctx->parent[node]] == node)
ctx->rchild[ctx->parent[node]] = child;
else
ctx->lchild[ctx->parent[node]] = child;
ctx->parent[node] = NIL;
}
Flags and the compression unit
Now we are on to one of the first implementation differences between the Okumura and Bellard implementations. With Okumura there was only ever one flag bit per coded symbol: either a literal byte, or a coded pointer. With Bellard we have several types of symbols, and a variable number of flags. As such we will need to manage the flags (and compression unit) outside of the main loop, as they may need to be flushed and reset at any time. So like with our decompression code, where we had a read_bit() function, we will create a write_bit() function here to manage the flag bits and compression unit.
// write a single bit to the flags word
// any non-zero value will be written as a 1 bit to flags
// if flags becomes full, the compression unit is dumped and reset.
int write_bit(uint8_t bit, lzss_ctx_t *ctx, FILE *dst) {
if(0 != bit) {
ctx->unit.flags |= ctx->mask;
}
ctx->mask <<= 1;
// all bits used, write it out and reset the compression unit
if(0 == ctx->mask) { // write the fields separately in case of padding
if(1 != fwrite(&ctx->unit.flags, sizeof(uint16_t), 1, dst)) {
return -1;
}
if(1 != fwrite(&ctx->unit.data, ctx->byte_count, 1, dst)) {
return -1;
}
ctx->mask = 0x0001; // start back at bit 0
ctx->byte_count = 0; // reset the count
bzero(&ctx->unit, sizeof(lzss_unit_t)); // clear the compression unit
}
return 0;
}
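To convince myself the flush logic behaves, here is a cut-down, in-memory analogue of write_bit() (no FILE involved, and the names are mine) that just counts flushes; after exactly 16 bits the unit resets:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint16_t flags; uint8_t data[30]; } demo_unit_t;
typedef struct { uint16_t mask; int byte_count; demo_unit_t unit; } demo_ctx_t;

// In-memory stand-in for write_bit(): returns 1 if this bit filled the
// flags word and forced a flush/reset, 0 otherwise.
static int demo_write_bit(uint8_t bit, demo_ctx_t *ctx) {
    if (bit) ctx->unit.flags |= ctx->mask;
    ctx->mask <<= 1;
    if (ctx->mask == 0) {                       // all 16 flag bits used
        ctx->mask = 0x0001;                     // start back at bit 0
        ctx->byte_count = 0;
        memset(&ctx->unit, 0, sizeof(ctx->unit));
        return 1;
    }
    return 0;
}

// Write n one-bits and report how many flushes occurred.
static int demo_flush_count(int n) {
    demo_ctx_t ctx = { .mask = 0x0001 };
    int flushes = 0;
    for (int i = 0; i < n; i++)
        flushes += demo_write_bit(1, &ctx);
    return flushes;
}
```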
The main compression loop
With that done, I think we are ready to begin our main compression loop. First let’s set up the loop according to Okumura’s implementation, though I’m following the cleaner version that is part of Apple’s kernel sources. (This was also the version I used as a reference for the binary tree functions.)
int lzss_compress(FILE *dst, FILE *src) {
int result = 0; // return value at 'finish'
lzss_ctx_t *ctx = NULL; // the compression state machine context
uint16_t cur = TABLESZ - STRINGSZ; // position for the current char/string
uint16_t nxt = 0; // position the next byte will be inserted
uint16_t len = 0; // length of bytes in the buffer
int byte; // temporary holder for when fgetc() is
// used so we can check for EOF
// allocate our compression context, init all to 0
if(NULL == (ctx = (lzss_ctx_t *)calloc(1, sizeof(lzss_ctx_t)))) {
result = -2;
goto finish;
}
init_state(ctx); // properly initialize the context
// Read STRINGSZ bytes into the last STRINGSZ bytes of the buffer
len = fread(&ctx->window[cur], 1, STRINGSZ, src);
if(0 == len) { // empty file, nothing to compress
result = -3; // technically not an error
goto finish;
}
// Insert the STRINGSZ strings, each of which begins with one or more
// 'space' characters. Note the order in which these strings are
// inserted. This way, degenerate trees will be less likely to occur.
for(int i = 1; i < STRINGSZ; i++) {
insert_node(cur - i, ctx);
}
// Finally, insert the whole string just read.
// The global variables match_length and match_position are set.
insert_node(cur, ctx);
// the main compression loop
do {
int match_len = ctx->match_len;
int match_pos = ctx->match_pos;
if(match_len > len) { // can't match/remove more than we've read
match_len = len;
}
// match cases go here
// remove the bytes and update the tree
int i;
for (i = 0; (i < match_len) && (EOF != (byte = getc(src))); i++) {
delete_node(nxt, ctx); // Delete old strings and
ctx->window[nxt] = byte; // read new bytes
// replicate bytes close to the end of the buffer,
// to make the compares easier
if (nxt < STRINGSZ - 1) ctx->window[nxt + TABLESZ] = byte;
// Since this is a ring buffer, increment the position modulo TABLESZ.
nxt = (nxt + 1) & TABLEMSK;
cur = (cur + 1) & TABLEMSK;
// Register the string in text_buf[r..r+STRINGSZ-1]
insert_node(cur, ctx);
}
while (i++ < match_len) {
delete_node(nxt, ctx);
// After the end of text, no need to read
nxt = (nxt + 1) & TABLEMSK;
cur = (cur + 1) & TABLEMSK;
// but buffer may not be empty.
if (--len) insert_node(cur, ctx);
}
} while(0 < len);
// send any remaining compression unit data
if(0 != ctx->byte_count) { // write the fields separately in case of padding
if(1 != fwrite(&ctx->unit.flags, sizeof(uint16_t), 1, dst)) {
result = -4;
goto finish;
}
if(1 != fwrite(&ctx->unit.data, ctx->byte_count, 1, dst)) {
result = -4;
goto finish;
}
}
result = 0;
finish:
if(NULL != ctx) free(ctx);
return result;
}
The above is incomplete, but represents the core operations of the loop in terms of handling the sliding window and updating the search trees. Now we will go through each of our match cases one by one to flesh them out, as this is where things get very different from Okumura.
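One detail worth a quick check is the ring-buffer arithmetic in the loop above: because TABLESZ is a power of two, masking with TABLEMSK is the same as taking the position modulo TABLESZ, which is what lets cur and nxt wrap cleanly. A tiny sketch (helper name mine):

```c
#include <assert.h>
#include <stdint.h>

#define MAXBITS  (13)
#define TABLESZ  (1 << MAXBITS)
#define TABLEMSK (TABLESZ - 1)

// Advance a window position one step, wrapping at the 8K boundary.
static uint16_t ring_advance(uint16_t pos) {
    return (pos + 1) & TABLEMSK;
}
```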
Pointer sizes
The most basic case is when we have an uncompressed literal byte. This happens whenever the match length is less than the threshold length. To prevent expansion, the threshold is basically the break-even point between encoding a literal and encoding a pointer. That seems simple enough; however, with the Bellard implementation the threshold moves around a little. Generally speaking, the threshold is set to the size of the encoded match pointer. If the match is smaller than the size of the pointer, we just output the literal data; only when the match is longer than the size of the pointer do we output the pointer. The challenge here is that with Bellard we actually have 3 pointer sizes to worry about. The pointer types vary both in the distance they can travel back in the window, and in how long a length they can encode. Using Table 1 above, we can see that we have a short pointer, which can encode up to a length of 5 and travel back at most 256 bytes, and a long pointer, which can encode up to 9 bytes (or 256 for a long long), and can travel back the entire window size of 8,192 bytes. The good news is I think we can treat a long long the same as a long here, as it doesn’t come into play unless the match length is greater than 9 bytes, which is well beyond the threshold. So that leaves us with two different thresholds. For a short pointer the threshold is only 1: at a match of two bytes, the short pointer is already the smaller encoding. For a long pointer the threshold is 2, as the match needs to be at least 3 bytes before we see a reduction in size. We have an additional challenge in that the ranges actually overlap in terms of length, so length cannot be the only deciding factor. The other determining factor for which pointer to use is how many bits we need to encode the distance part of the pointer (how far back in the window we go to find the bytes we need). So with that in mind, I think we can use the following simple logic to select between the two pointer types.
uint16_t distance = (cur - match_pos) & TABLEMSK;
if((distance <= 256) && (match_len <= 5)) { // short pointer range
// handle short pointer cases here
} else { // long pointer range
// handle long (and long long) cases here
}
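As a sanity check of this selection logic, here is the same rule as a standalone predicate (my own helper, returning 0 for short and 1 for long) exercised at the boundary cases:

```c
#include <assert.h>
#include <stdint.h>

// 0 = short pointer range, 1 = long (or long long) range;
// mirrors the if/else above, helper name is my own.
static int pick_pointer(uint16_t distance, int match_len) {
    return (distance <= 256 && match_len <= 5) ? 0 : 1;
}
```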
Encoding literals
With that in place, we can finally start to look at encoding some data. As mentioned earlier, the simplest case is when we encode a literal byte of data. Not only is it the simplest case, it is also our most important one. Without being able to encode literals, we couldn’t build up a window that the decompressor could use to reconstruct the file. As stated earlier, our threshold value wiggles around a bit between the pointer sizes, but the handling is exactly the same. THRESHOLD is defined as having a value of 2 and is used for both cases, but in the 2nd case it is a less-than-or-equal relationship instead of just a less-than. To encode a literal we write a 1 as the flag bit, and then a single byte of data to the output stream. That leaves us with the following for the short pointer case.
if(match_len < THRESHOLD) { // too short to encode, send as literal
match_len = 1; // send 1 byte
// send flag 1 - literal
if(0 != (result = write_bit(1, ctx, dst))) goto finish;
ctx->unit.data[ctx->byte_count++] = ctx->window[cur];
}
For the long pointer case we have this. (identical except for the comparison in the if statement)
if(match_len <= THRESHOLD) { // too short to encode, send as literal
match_len = 1; // send 1 byte
// send flag 1 - literal
if(0 != (result = write_bit(1, ctx, dst))) goto finish;
ctx->unit.data[ctx->byte_count++] = ctx->window[cur];
}
Now is where things start to get interesting, so we will focus on one pointer type at a time.
The short pointer
Short pointers work by using a couple of extra flag bits to encode the length, giving a range of 0-3; since we cannot encode anything less than 2, we add 2 to this, leaving us with a range of 2-5. Before we can encode the length, however, we need to encode the pointer type. This is done by sending 2 zero flag bits for a short pointer, followed by the two bits for the length, meaning it takes 4 flag bits in total to encode a short pointer. We then encode 8 bits of distance into the output data stream. That leaves us with something that looks like this. (Note that this is implemented as the else condition to the if case looking to encode a literal above.)
} else { // encode as short pointer
// send flags 0,0 - short pointer
if(0 != (result = write_bit(0, ctx, dst))) goto finish;
if(0 != (result = write_bit(0, ctx, dst))) goto finish;
if(match_len > 5) match_len = 5; // short pointer can't encode more than 5 chars
uint8_t encode_len = match_len - THRESHOLD;
// send flags x,x - length
if(0 != (result = write_bit(encode_len & 0x02, ctx, dst))) goto finish;
if(0 != (result = write_bit(encode_len & 0x01, ctx, dst))) goto finish;
// send the distance pointer
ctx->unit.data[ctx->byte_count++] = (-((int16_t)distance) & 0x00ff);
}
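A worked example helps here: the stored distance byte is the two’s complement of the backward distance, truncated to 8 bits, matching the -1 to -256 range in Table 1. A throwaway helper (mine, not part of the format) makes the boundary values easy to see:

```c
#include <assert.h>
#include <stdint.h>

// Stored short-pointer distance byte: two's complement of the
// backward distance (1..256), truncated to 8 bits.
static uint8_t short_distance_byte(uint16_t distance) {
    return (uint8_t)(-((int16_t)distance) & 0x00ff);
}
```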
The long pointer
Unlike the short pointer, the long pointer uses only 2 flag bits to identify it; the rest is done with additional bytes in the data stream. A long pointer encodes 3 length bits and 13 distance bits into a 2 byte word in the data stream. This results in a full distance reach of 1-8,192 bytes. The 3 length bits give us a length range of 0-7; again, like the short pointer, we add 2 to this as we cannot encode fewer than 2 bytes (3 in this case), giving us a range of 2-9 bytes. A 3 bit length code of zero here is reserved to indicate a long long (or control) pointer, which leaves us with a range of 3-9 bytes for the length. The encoding used for length and distance is a little odd: the first byte is the low 8 bits of the distance, the low 3 bits of the second byte are the length, and the upper 5 bits of the second byte are the remaining distance bits. The result is something like the following. (Again, this is implemented as an else case to the literal check.)
} else { // encode as long pointer
// send flags 0,1 - long pointer
if(0 != (result = write_bit(0, ctx, dst))) goto finish;
if(0 != (result = write_bit(1, ctx, dst))) goto finish;
if(match_len > 256) match_len = 256; // can't encode more than 256 chars
uint16_t encode_len = match_len - THRESHOLD;
distance = (-((int16_t)distance) & 0x1fff);
// send low byte of distance
ctx->unit.data[ctx->byte_count++] = (distance & 0x00ff);
distance >>= 5; // shift in the upper 5 bits
distance &= 0x00f8; // mask off the low 3 bits
if(encode_len <= 7) { // long pointer can only have a length code up to 7
distance |= encode_len; // encode the length to the distance
// send high byte of distance + length
ctx->unit.data[ctx->byte_count++] = (distance & 0x00ff);
}
}
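To make the odd bit layout concrete, here is a small standalone sketch (my names, same packing as above) with a worked case: a distance of 1 and a length of 3 should give 0xFF 0xF9, since a backward distance of 1 sets all thirteen distance bits and the length code is 1.

```c
#include <assert.h>
#include <stdint.h>

// Pack a long pointer: out[0] = low 8 distance bits; out[1] = upper 5
// distance bits in the top of the byte, 3-bit length code in the bottom.
// match_len must be 3..9 here (length code 0 is reserved for long-long).
static void long_pointer_bytes(uint16_t distance, int match_len,
                               uint8_t out[2]) {
    uint16_t d = (uint16_t)(-((int16_t)distance)) & 0x1fff; // 13-bit two's complement
    out[0] = d & 0x00ff;                 // low byte of distance
    out[1] = ((d >> 5) & 0x00f8)         // upper 5 distance bits
           | (uint8_t)(match_len - 2);   // 3-bit length code
}
```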
The long long and control pointer
The long long pointer, or long distance and long length, is also used for some in-band signalling via certain reserved values. A value of zero here is used to mark EOF. There is one other code, a value of one, which is used to mark a segment change. This 2nd code is there to ensure that the segment pointers don’t fall out of range relative to the offset in the 16 bit DOS world. I think we can largely ignore this second code for now, as I don’t think it will come into play with any of the data we are compressing. Any other value is the length value minus 1, leaving a range of 3-256 bytes. To encode this longer length, a length code of zero is encoded into the distance/length byte of a long pointer, and then another length byte is pushed into the data stream, for a total of 3 data bytes. The end result will look something like the following. (This is set up as an else case to the encoded length check above.)
} else {
// send high byte of distance + '000' length code
ctx->unit.data[ctx->byte_count++] = (distance & 0x00ff);
encode_len++; // the stored value is the length - 1
// send length-1
ctx->unit.data[ctx->byte_count++] = (encode_len & 0x00ff);
}
The above code only encodes an actual length pointer; the special case control code for EOF is handled separately, when we exit the compression loop, just before we flush out the final compression unit.
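Again, a worked example, following the length value minus 1 rule described above (helper names are mine): a 256-byte match at a backward distance of 4,096 should pack to 0x00 0x80 0xFF, with the 000 length code in the second byte signalling the long length byte to follow.

```c
#include <assert.h>
#include <stdint.h>

// Pack a long long pointer: low distance byte, then the upper 5 distance
// bits with a 000 length code, then the match length minus 1 as its own
// byte. match_len must be 10..256 here.
static void long_long_pointer_bytes(uint16_t distance, int match_len,
                                    uint8_t out[3]) {
    uint16_t d = (uint16_t)(-((int16_t)distance)) & 0x1fff;
    out[0] = d & 0x00ff;               // low byte of distance
    out[1] = (d >> 5) & 0x00f8;        // upper 5 distance bits, length code 000
    out[2] = (uint8_t)(match_len - 1); // the long length, stored as length - 1
}
```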
Encoding End of File
To encode EOF we must send a long long style pointer, with a value of zero as the final length. Generally speaking the distance bits have no meaning, but the files we have to work with always seem to end with the sequence 00 f0 00, so we will do the same in our code. The code to encode the EOF marker is placed just after we exit the compression loop, but before we flush out the last compression unit from the buffer.
} while(0 < len);
// Encode EOF
if(0 != (result = write_bit(0, ctx, dst))) goto finish;
if(0 != (result = write_bit(1, ctx, dst))) goto finish;
ctx->unit.data[ctx->byte_count++] = 0x00; // offset (low) 00
ctx->unit.data[ctx->byte_count++] = 0xF0; // offset (high) F0
ctx->unit.data[ctx->byte_count++] = 0x00; // length 0
// send remaining compression unit
if(0 != ctx->byte_count) {
That is it for the compressor, for now. This post is getting on the long side again, so next time we will pick up with testing, debugging, and verification of the code we’ve written here.
This post is part of a series of posts surrounding my reverse engineering efforts of the PIC file format that MicroProse used with their games. Specifically F-15 Strike Eagle II (Though I plan to trace the format through other titles to see if and how it changes). To read my other posts on this topic you can use this link to my archive page for the PIC File Format which will contain all my posts to date on the subject.