I found an answer in a completely unrelated thread in the forums. Couldn’t find a Googleable answer, so posting here for future users’ sake.
Since CUDA calls are executed asynchronously, the error can surface at a later line than the one that actually caused it. To debug, you should run your code with
CUDA_LAUNCH_BLOCKING=1 python script.py
This forces each CUDA call to finish before the next line runs, so the error message is thrown at the line of code that actually failed.
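If you'd rather not type the variable every time, here is a minimal sketch of a small launcher that does the same thing from Python. The helper names `blocking_env` and `run_blocking` are made up for illustration, and `script.py` is just the placeholder name from above. It assumes the variable needs to be in the environment before the script starts, which is why it launches a fresh process instead of setting it midway through a running one.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set.

    `base` defaults to the current process environment; it is copied,
    so the caller's environment is not modified.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(script_path):
    """Run a script the same way as: CUDA_LAUNCH_BLOCKING=1 python script.py"""
    return subprocess.run([sys.executable, script_path], env=blocking_env())
```

Calling `run_blocking("script.py")` should then behave like the command line above. Expect the script to run slower this way, so it's probably best kept for debugging only.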
solved Pytorch crashes cuda on wrong line