This document describes the generic Huffman bitstream format used by several Wasteland resource files.
Some file formats wrap this bitstream in a container such as an msq block, but the encoding described here is the bitstream itself, independent of any specific file type.
All Huffman bits are read most-significant-bit first within each byte.
The Huffman payload consists of:
The decoded output size is not stored inside the Huffman bitstream itself. It must be known from the surrounding container or from the calling code.
The tree is serialized recursively.
To read one node:
1, the node is a leaf and the next byte is the symbol value.0, the node is an internal node.In pseudocode:
readNode():
if readBit() == 1:
return readByte()
else:
left = readNode()
readBit()
right = readNode()
return Node(left, right)
Important detail:
After the tree has been read, the remaining bits encode the payload bytes.
To decode one output byte:
0 means go to the left child.1 means go to the right child.Repeat until exactly decodedSize bytes have been produced.
In pseudocode:
for i in 0 .. decodedSize-1:
node = root
while node is not a leaf:
if readBit() == 0:
node = node.left
else:
node = node.right
output[i] = node.value
The tree and the compressed payload are a single continuous bitstream. There is no byte alignment between them.
After the required number of decoded bytes has been produced, a container format may advance to the next full byte boundary before reading the next field.