[Solved] Cython interfaced with C++: segmentation fault for large arrays


The memory is being managed by your numpy arrays. As soon as they go out of scope (most likely at the end of the PySparse constructor) the arrays cease to exist, and all your pointers are invalid. This applies to both large and small arrays, but presumably you just get lucky with small arrays.

You need to hold a reference to all the numpy arrays you use for the lifetime of your PySparse object:

cdef class PySparse:

  # ----------------------------------------------------------------------------

  cdef Sparse *ptr
  cdef object _held_reference # added

  # ----------------------------------------------------------------------------

  def __cinit__(self,**kwargs):
      # ....
      # your constructor code goes here, unchanged...
      # ....

      self._held_reference = [data] # add any other numpy arrays you use to this list

As a rule you need to be thinking quite hard about who owns what whenever you’re dealing with C/C++ pointers, which is a big change from the normal Python approach. Getting a pointer from a numpy array does not copy the data and it does not give numpy any indication that you’re still using the data.
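The same ownership point can be demonstrated from pure Python with ctypes (a sketch, not Cython; the array name `arr` is illustrative). `arr.ctypes.data` hands back a bare integer address with no reference attached, exactly like taking a pointer in Cython:

```python
import ctypes
import numpy as np

# A raw pointer into the array's buffer: numpy neither copies the data
# nor records that anything outside still uses it.
arr = np.array([1.0, 2.0, 3.0])
ptr = ctypes.cast(arr.ctypes.data, ctypes.POINTER(ctypes.c_double))

# Safe only while `arr` is alive: the Python reference keeps the buffer valid.
value = ptr[1]   # reads the second element (2.0) through the raw pointer

# If `arr` were the only reference and it went away (del arr), `ptr` would
# dangle and reading ptr[1] would be undefined behaviour -- the same failure
# mode as the segfault in the question, just in miniature.
```

That is why the `_held_reference` list above works: as long as the `PySparse` instance holds the arrays, the buffers behind the C++ pointers stay alive.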


Edit note: In my original version I tried to use locals() as a quick way of gathering a collection of all the arrays I wanted to keep. Unfortunately, that doesn't seem to include cdefed arrays, so it didn't manage to keep the ones you were actually using (note here that astype() makes a copy unless you tell it otherwise, so you need to hold the reference to the copy, rather than the original passed in as an argument).
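The astype() behaviour is easy to check in plain numpy (variable names here are illustrative):

```python
import numpy as np

data = np.arange(4, dtype=np.float64)

# astype() copies by default, even when the dtype already matches...
copied = data.astype(np.float64)
# ...so a pointer taken from `copied` is backed by `copied`, not `data`.
# Holding a reference to `data` alone would not keep that buffer alive.
print(copied is data)   # False: a new buffer was allocated

# With copy=False and a matching dtype, the original array is returned.
same = data.astype(np.float64, copy=False)
print(same is data)     # True: same object, same buffer
```

So in the constructor it is the post-astype() array that must go into `_held_reference`, not the argument it was made from.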
