I’m guessing the problem is that in your compiler `char` is signed (the standard allows it to be either signed or unsigned; it’s implementation-defined). As a result, whenever you convert a `char` that has bit 7 set (values 0x80 through 0xFF) into any larger integer type, it’s treated as a negative value and gets sign-extended to preserve that value: in other words, bit 7 is copied into bit 8, bit 9, and so on, into all the higher bits of the bigger integer type. So 0xC7 can turn into 0xFFC7 or 0xFFFFFFC7. To prevent that from happening, cast the `char` to `unsigned char` first.
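
Here’s a minimal sketch of the difference, assuming a platform where plain `char` is signed (the values shown in the comments apply under that assumption):

```c
#include <stdio.h>

int main(void)
{
    char c = (char)0xC7;  /* with a signed char, this holds -57 */

    /* Sign extension: the negative value is widened, so bit 7 is
       effectively copied into all the higher bits. */
    unsigned int widened_signed = (unsigned int)c;

    /* Casting to unsigned char first keeps the value in 0x00..0xFF. */
    unsigned int widened_unsigned = (unsigned int)(unsigned char)c;

    printf("without cast: 0x%X\n", widened_signed);   /* 0xFFFFFFC7 */
    printf("with cast:    0x%X\n", widened_unsigned); /* 0xC7 */
    return 0;
}
```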