Yes. With gcc5.x and later specifically, that specific expression is optimized very early to just p, even with optimization disabled, regardless of any possible runtime UB.
This happens even with a static array and compile-time constant size. gcc -fsanitize=undefined doesn’t insert any instrumentation to look for it either. There are also no warnings at -Wall -Wextra -Wpedantic.
int *add(int *p, long long x) {
    return (p+x) - x;
}

int *visible_UB(void) {
    static int arr[100];
    return (arr+200) - 200;
}
Using gcc -fdump-tree-original to dump its internal representation of program logic before any optimization passes shows that this optimization happened even before that in gcc5.x and newer (and it happens even at -O0).
;; Function int* add(int*, long long int) (null)
;; enabled by -tree-original
return <retval> = p;
;; Function int* visible_UB() (null)
;; enabled by -tree-original
{
static int arr[100];
static int arr[100];
return <retval> = (int *) &arr;
}
That’s from the Godbolt compiler explorer with gcc8.3 at -O0.
The x86-64 asm output is just:
; g++8.3 -O0
add(int*, long long):
mov QWORD PTR [rsp-8], rdi
mov QWORD PTR [rsp-16], rsi # spill args
mov rax, QWORD PTR [rsp-8] # reload only the pointer
ret
visible_UB():
mov eax, OFFSET FLAT:_ZZ10visible_UBvE3arr
ret
-O3 output is of course just mov rax, rdi.
gcc4.9 and earlier only do this optimization in a later pass, and not at -O0: the tree dump still includes the subtract, and the x86-64 asm is:
# g++4.9.4 -O0
add(int*, long long):
mov QWORD PTR [rsp-8], rdi
mov QWORD PTR [rsp-16], rsi
mov rax, QWORD PTR [rsp-16]
lea rdx, [0+rax*4] # RDX = x*4 = x*sizeof(int)
mov rax, QWORD PTR [rsp-16]
sal rax, 2
neg rax # RAX = -(x*4)
add rdx, rax # RDX = x*4 + (-(x*4)) = 0
mov rax, QWORD PTR [rsp-8]
add rax, rdx # p += x + (-x)
ret
visible_UB(): # but constants still optimize away at -O0
mov eax, OFFSET FLAT:_ZZ10visible_UBvE3arr
ret
This does line up with the -fdump-tree-original output:
return <retval> = p + ((sizetype) ((long unsigned int) x * 4) + -(sizetype) ((long unsigned int) x * 4));
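If x*4 overflows, you’ll still get the right answer: the dump shows the multiply and the negation being done in unsigned (sizetype) arithmetic, which wraps, so the two terms still cancel. In practice I can’t think of a way to write a function that would lead to the UB causing an observable change in behaviour.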
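To illustrate why the overflow case is harmless, here is a minimal sketch that mirrors the shape of that tree expression with plain 64-bit unsigned integers. The function name add_then_subtract and the use of std::uint64_t to model both the pointer and sizetype are assumptions for illustration, not something gcc emits.

```cpp
#include <cstdint>

// Hypothetical stand-in for the gcc4.9 -O0 expression
//   p + ((sizetype)(x * 4) + -(sizetype)(x * 4))
// with the pointer modelled as a 64-bit unsigned address.
std::uint64_t add_then_subtract(std::uint64_t p, long long x) {
    // Convert to unsigned first, then multiply: unsigned arithmetic wraps
    // modulo 2^64, so this is well-defined even if x*4 doesn't fit.
    std::uint64_t scaled = static_cast<std::uint64_t>(x) * 4u;
    std::uint64_t negated = 0u - scaled;  // wrapping negation
    return p + (scaled + negated);        // scaled + negated == 0 (mod 2^64)
}
```

Whatever x is, the two scaled terms sum to zero modulo 2^64, so the result is p again. This only models the integer arithmetic gcc generates; it says nothing about whether the C++ source expression itself has defined behaviour.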
As part of a larger function, a compiler would be allowed to infer some range info, like that p[x] is part of the same object as p[0], so reading memory in between (or out that far) is allowed and won’t segfault, e.g. allowing auto-vectorization of a search loop.
But I doubt that gcc even looks for that, let alone takes advantage of it.
(Note that your question title was specific to gcc targeting x86-64 on Linux, not about whether similar things are safe in gcc, e.g. if done in separate statements. I mean yes probably safe in practice, but won’t be optimized away almost immediately after parsing. And definitely not about C++ in general.)
I highly recommend not doing this. Use uintptr_t to hold pointer-like values that aren’t actual valid pointers, like you’re doing in the updates to your answer on "C++ gcc extension for non-zero-based array pointer allocation?".
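As a rough sketch of that uintptr_t approach (the helper names biased_base and element_at are hypothetical, and this assumes a flat address space where pointer/integer conversions preserve the address, as with gcc on x86-64 Linux):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: keep the out-of-range "virtual base" as an integer,
// never as an int*. Unsigned subtraction wraps instead of being UB,
// unlike computing first_elem - first_index on the pointer itself.
std::uintptr_t biased_base(int *first_elem, std::size_t first_index) {
    return reinterpret_cast<std::uintptr_t>(first_elem)
           - first_index * sizeof(int);
}

// Hypothetical helper: only form a real pointer once the index has been
// applied, so the resulting address is back inside the original array.
// The integer-to-pointer conversion is implementation-defined.
int *element_at(std::uintptr_t base, std::size_t index) {
    return reinterpret_cast<int *>(base + index * sizeof(int));
}
```

For example, with static int storage[10] and base = biased_base(storage, 1000), element_at(base, 1003) should point at storage[3] without an invalid int* ever being created along the way.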