[Solved] CUDA compilation error: class template has already been defined

I also posted this question on the Nvidia CUDA forum. I reinstalled multiple times using the method from that post but still had the same “class template” problems. Then I reinstalled CUDA 9.1 and VS2017 ver 15.6.7, again with the same method, and it finally worked. Further problems that I encountered is in …

[Solved] Comparing 2 different scenarios on 2 different architectures when finding the max element of an array

I think you’re really asking about branchless vs. branching ways to do m = max(m, array[i]). C compilers will already compile the if() version to branchless code (using cmov), depending on optimization settings. They can even auto-vectorize the loop using packed-compare or packed-max instructions. Your 0.5 * abs() version is obviously terrible (a …
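
For reference, here is a minimal sketch of the two ways of writing m = max(m, array[i]) being compared. The function names max_if and max_ternary and the non-empty-array precondition are assumptions made for illustration; they are not taken from the original question. Both loops compute the same result, and an optimizing compiler is free to lower either one to a branch, a cmov, or a vectorized max.

```c
#include <assert.h>
#include <stddef.h>

/* Branching form: update m only when a larger element is found. */
int max_if(const int *array, size_t n)
{
    assert(n > 0);  /* assumed precondition: the array is not empty */
    int m = array[0];
    for (size_t i = 1; i < n; i++) {
        if (array[i] > m)
            m = array[i];
    }
    return m;
}

/* Conditional-expression form of the same update. Whether this becomes
 * branchless machine code is the compiler's decision, not the source's. */
int max_ternary(const int *array, size_t n)
{
    assert(n > 0);  /* assumed precondition: the array is not empty */
    int m = array[0];
    for (size_t i = 1; i < n; i++)
        m = (array[i] > m) ? array[i] : m;
    return m;
}
```

To see what your compiler actually does with each version, compile with something like gcc -O3 -S and read the generated assembly; the result depends on the compiler, the target, and the optimization level.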