TL;DR: do it the obvious way, and make sure your compiler optimizes it.
It’s hard to reason about performance without a full, reproducible program example. So let’s start with a straightforward implementation:
#include <array>
#include <algorithm>
std::array<int, 362856427> a = {};
int main()
{
a[500] = 1;
a[5000] = 1;
a[50000] = 1;
a[500000] = 1;
auto counter = 0u;
for (auto i = 0u; i < a.size(); ++i) {
if (a[i] != 1)
++counter;
}
return counter != 362856423;
}
Timing this, I got 1.79 seconds of user time. Then I realised my mistake, and added -O3
to my compilation command. That’s better:
g++ -std=c++17 -g -Wall -Wextra -O3 16385733.cpp -o 16385733
time ./16385733
0.07user 0.08system 0:00.16elapsed 98%CPU (0avgtext+0avgdata 2212maxresident)k
We can attempt to simplify the loop, but this makes no measurable difference (the optimizer already beat us to this one):
for (auto i = 0u; i < a.size(); ++i)
counter += a[i] != 1;
Another option is to make the code clearer, by using a standard algorithm:
auto counter = a.size() - std::count(a.begin(), a.end(), 1);
This consistently takes 50% longer than the obvious loop.
If your input array was substantially larger, you might be able to gain by parallelizing the computation like this:
auto counter = 0ul;
#pragma omp parallel for reduction(+:counter)
for (auto i = 0ul; i < a.size(); ++i)
counter += a[i] != 1;
My benchmarking showed this as being as slow as the standard algorithm when the array size is 362856427, and no faster when it’s increased to 3628564270.
There are a few ways to re-write in equivalent form:
for (auto i: a)
counter += i != 1;
for (auto p = a.data(); p < end; ++p)
counter += *p != 1;
All these exhibit similar performance, and aren’t improved with OpenMP.
So the short answer is
- do it the obvious way
- make sure your compiler is optimizing, and
- benchmark your alternatives.
solved Fast comparing array to the number