[Solved] C/C++ Compressing Integer to Short and Decompressing to Integer [closed]


For 16 bits, there are 216 = 65,536 possible settings. Each setting of bits can represent at most one value. So 16 bits of data can represent only 65,536 values.

32 bits have 232 = 4,294,967,296 possible settings. If your 32-bit data uses more than 65,536 of the possible values that could be represented, it is not possible to compress it to 16 bits and always recover the original value when decompressing it.

Options include:

  • If your 32 bits of data use no more than 65,536 values, you can create a code that maps each used value to a number from 0 to 65,535. Encoding the value then compresses the data to 16 bits, and decoding the value decompresses it.
  • If full accuracy is not required, you can map nearby 32-bit values to a single 16-bit value. For example, if values from 0 to 150,000 are used, you can shift right by two bits to get a value from 0 to 37,500, which will fit in a 16-bit unsigned short. You can recover approximately the original value by shifting left two bits. You could also divide by 3 instead of shifting (which is equivalent to dividing by 4), and that would reduce the error somewhat, but division takes longer than shifting in many processors. Division by 3 still produces a result that fits in 16 bits because 150,000/3 < 65536. You could also divide by 2.3 to reduce error further, but that is more work.
  • Do not use compression, because it is not possible to obtain the desired result of compressing a 32-bit value into 16 bits.

Regarding the shifting-and-losing-accuracy solution: Sometimes it is preferred to add a bit to the decompressed value to reduce the average and maximum error. For example, 100, 101, 102, and 103 would all be compressed to the same value, 25. Then simple recovery by multiplying by four would produce 100. This has errors of 0, −1, −2, and −3 for 100, 101, 102, and 103, respectively. (Average error magnitude is 1½, maximum is 3.) If we add 1 after multiplying, 25 decompresses to 101, and then the errors are +1, 0, −1, and −2 (average 1, maximum 2).

solved C/C++ Compressing Integer to Short and Decompressing to Integer [closed]