[Solved] Bit manipulation, return 0 if x != 0, or nonzero otherwise

Question

It turns out there is a general solution, which does not depend on word size, or even on two’s complement arithmetic. Here it is:

int is_zero(int x)
{
    unsigned    y   = x;
    unsigned    c0  = y^y;
    unsigned    c1  = ~(~c0 + ~c0);
    unsigned    c1h = ~c0 ^ (~c0 >> c1);
    unsigned    y2  = y + ~c0;
    unsigned    t1  = y ^ y2;
    unsigned    t2  = t1 & y2;
    unsigned    s   = t2 & c1h;
    int         r   = s >> c1;

    return r;
}

Note that everything is done in unsigned, which is necessary to avoid overflow issues, and also to force the right shift to zero-fill. I left the argument and return value as signed ints, with the first and last assignments simply changing the signed/unsigned behavior (the final shift is only present to avoid overflow – see the comment below).

The solution is actually pretty straightforward. As has been noted, constant generation is trivial. Zero is obtained by xor-ing something with itself. All 1’s is the bitwise complement of that. One is all 1’s xor’d with all 1’s shifted left one.

So far this is all pretty trivial. The tricky part is to get it to work regardless of word size. To do this, it constructs a value whose high-order bit is 0 if x is non-zero, and 1 if x is zero. It then masks this with a constant which is a single 1 in the high-order bit position, which is constructed from all 1’s xor’d with all 1’s shifted right 1. The shift is guaranteed to zero-fill (rather than sign-extend) since the value is unsigned.

Note that s is either zero or a one in the high-order bit position. The final return value, r, shifts s right by one in order to avoid overflow when assigning back to a signed integer. This fix was suggested by Paul Hankin. This makes the final return value either zero or else zero followed by one followed by all zeros.

If you want to avoid the implicit conversions between signed and unsigned at the beginning and end of the function, you can use a union to alias the values:

int is_zero(int x)
{
    union {int s; unsigned u;} su;
    su.s = x;
    unsigned    y   = su.u;
    unsigned    c0  = y^y;
    unsigned    c1  = ~(~c0 + ~c0);
    unsigned    c1h = ~c0 ^ (~c0 >> c1);
    unsigned    y2  = y + ~c0;
    unsigned    t1  = y ^ y2;
    unsigned    t2  = t1 & y2;
    unsigned    s   = t2 & c1h;
    su.u = s;
    int         r = su.s;

    return r;
}

In this case, the final shift of s is unnecessary, and the return value is either zero or else one followed by all zeros. Note that the C90 standard doesn’t allow mixed code and declarations, so if that’s an issue, you would have to separate the declarations from the assignments, but the net result would be the same.

Accepted Answer