How (not) to copy a NumPy array

The Python code below puzzled me for a while. It initializes a NumPy array a and then assigns it to b, c, and d in three different ways, each of which looks like it might make a copy.

import numpy as np

a = np.arange(3,5)
#a = [3, 4]
b = a
c = a[:]
d = a.copy()

print(b is a) # True
print(c is a) # False
print(d is a) # False

print(a, b, c, d) #[3 4] [3 4] [3 4] [3 4]

a[0] = -11.

print(a, b, c, d) #[-11   4] [-11   4] [-11   4] [3 4]

Based on these assignments, I expected b to be the very same object as a, and c and d to be separate objects. The three "is" checks on lines 9-11 confirmed my expectations.

Next, I changed the first element of a to -11. I expected the first element of b to change as well, since b is only another name for a, while c and d would stay the same. To my great surprise, the first element of c changed too! This makes NumPy arrays behave very differently from lists, where slicing with [:] creates a new list (just uncomment line 4 to see the difference).
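
To make the contrast concrete, here is a minimal sketch that runs the same steps on a plain list and on an array. The names list_a, list_c, arr_a, and arr_c are only illustrative and do not appear in the code above.

```python
import numpy as np

# Plain Python list: slicing builds a new list.
list_a = [3, 4]
list_c = list_a[:]
list_a[0] = -11

# NumPy array: slicing builds a view on the same data.
arr_a = np.arange(3, 5)
arr_c = arr_a[:]
arr_a[0] = -11

print(list_c)  # [3, 4]      (unaffected by the change to list_a)
print(arr_c)   # [-11   4]   (follows the change to arr_a)
```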

With some help from StackOverflow and the SciPy Cookbook, I discovered that slicing a NumPy array with [:] does not copy the data; it returns a so-called view onto the same data. So even though a and c are different objects, as line 10 confirms, they share the same underlying memory, and a change made through one is visible through the other.
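
One way to check whether two arrays share data, rather than guessing from "is", is to look at the base attribute or to call np.shares_memory. The sketch below reuses the names a, c, and d from the example and assumes a owns its data, which is the case for an array returned by np.arange.

```python
import numpy as np

a = np.arange(3, 5)
c = a[:]        # view
d = a.copy()    # independent copy

print(c is a)                  # False: two distinct array objects
print(c.base is a)             # True: c borrows its data from a
print(d.base is None)          # True: d owns its own data
print(np.shares_memory(a, c))  # True
print(np.shares_memory(a, d))  # False
```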

The example above does not make clear why an assignment of the form c = a[:] would ever be useful. A better illustration is a variable a_even_indices = a[::2], which gives access to only the elements at the even indices of a, so that a simple assignment such as a_even_indices[:] = 3 writes straight into a.
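
A minimal sketch of that use case follows. The six-element array is an assumption chosen for illustration; the last two lines show that assigning without [:] merely rebinds the name and leaves a alone.

```python
import numpy as np

a = np.arange(6)            # [0 1 2 3 4 5]
a_even_indices = a[::2]     # view on the elements at indices 0, 2, 4

a_even_indices[:] = 3       # writes through the view into a
print(a)                    # [3 1 3 3 3 5]

a_even_indices = 7          # rebinds the name only; a is not touched
print(a)                    # [3 1 3 3 3 5]
```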

My most important lesson: to get a real, independent copy of a NumPy array's data, always use the copy function that NumPy provides (a.copy() or np.copy(a)) rather than slicing.
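
The sketch below shows both spellings on the array from the example. It also adds one caveat that goes beyond the example above: for arrays with dtype=object, NumPy's copy duplicates only the array of references, not the objects inside, so copy.deepcopy from the standard library is needed there. The names d1, d2, objs, shallow, and deep are illustrative.

```python
import copy
import numpy as np

# Numeric array: both forms give an independent copy.
a = np.arange(3, 5)
d1 = a.copy()
d2 = np.copy(a)
a[0] = -11
print(d1, d2)      # [3 4] [3 4]

# Object array: copy() still shares the contained Python objects.
objs = np.empty(2, dtype=object)
objs[0] = [1, 2]
objs[1] = [3]
shallow = objs.copy()
deep = copy.deepcopy(objs)
objs[0].append(99)
print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```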