NumPy

One of the most powerful packages in the Python ecosystem is NumPy. In order to use the NumPy package, we must import it.

Importing packages

The ability to import packages is a large part of what makes the Python programming language so powerful and attractive: algorithms and methods developed by other (better) programmers can be easily used by anyone. To import the NumPy package, we use the following.

import numpy as np

This allows us to access all elements of the NumPy package, using the dot syntax that you have seen previously for the math module. Therefore, we can access the log function in NumPy by calling,

np.log(7.38905609893065)
2.0

The as keyword in the import statement simply lets us create a shortcut. So, instead of writing numpy.log we can write just np.log (a lot of programming is about saving keystrokes). np is the standard abbreviation for NumPy.

The above method imports all of the NumPy package; however, this may not be necessary. For example, if you only want to use the pi object from the NumPy package, it is possible to import only this.

from numpy import pi

print(pi)
3.141592653589793

The above code tells the kernel to import only the pi object from the NumPy package. Note that with this import method, the dot syntax is no longer required.
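As a quick check that the two import styles give access to the same object, the short sketch below imports NumPy both ways and compares the results. It uses nothing beyond the two import statements already shown.

```python
import numpy as np        # whole package, accessed through the np shortcut
from numpy import pi      # a single object, accessed by its bare name

print(np.pi)              # dot syntax via the np shortcut
print(pi)                 # no dot syntax needed
print(np.pi == pi)        # both names refer to the same value
```

Importing a single name keeps the code short, but the np. prefix makes it obvious where a function or constant comes from, which is helpful in longer programs.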

NumPy Arrays

One of the most useful tools in the NumPy package is the array. This stores a series of numerical data of the same type, similar to a list but highly optimised for mathematical operations. We can initialise a NumPy array with the following.

mass_numbers = np.array([112, 114, 115, 116, 117, 118, 119, 120, 122, 124])
print(mass_numbers)
[112 114 115 116 117 118 119 120 122 124]
isotopic_abundances = np.array([0.0097, 0.0066, 0.0034, 0.1454, 0.0768, 0.2422, 0.0859, 0.3258, 0.0463, 0.0579])
print(isotopic_abundances)
[0.0097 0.0066 0.0034 0.1454 0.0768 0.2422 0.0859 0.3258 0.0463 0.0579]

Above, we have defined two NumPy arrays: the first holds the mass numbers of the stable isotopes of tin and the second holds the abundance of each isotope. The efficiency of the NumPy array comes from the fact that all of the items in the array must be of the same type. We can see the type by inspecting the dtype attribute of the array.

print(mass_numbers.dtype)
int64
print(isotopic_abundances.dtype)
float64

The mass numbers are integers while the isotopic abundances are floating point numbers (the 64 refers to the number of bits of computer memory that each number occupies).
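Because every item must share one type, NumPy chooses a single dtype when the array is created. The sketch below (the array names are purely illustrative) shows that mixing integers and floats gives a floating point array, and that astype can be used to convert an existing array. The exact integer type reported may depend on the platform and NumPy version, so no particular integer width is assumed here.

```python
import numpy as np

only_integers = np.array([1, 2, 3])
mixed_numbers = np.array([1, 2.5, 3])    # one float among the integers

print(only_integers.dtype)               # an integer type (often int64)
print(mixed_numbers.dtype)               # float64: the integers were converted

# astype returns a new array with the requested type
integers_as_floats = only_integers.astype(float)
print(integers_as_floats.dtype)          # float64
```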

Arithmetic with NumPy arrays

The fact that the items in a NumPy array are all of the same type means that it is possible to perform arithmetic on them. For example, all of the items in an array can be multiplied by a single value.

np.array([0.1, 0.2, 0.3]) * 10.0
array([1., 2., 3.])

Or two NumPy arrays of the same length can operate on each other, item by item.

np.array([5, 10, 15]) + np.array([5.0, 0.0, -5.0])
array([10., 10., 10.])

These operations on every element of a NumPy array are called vector operations.

Using NumPy arrays for value-wise operations such as those shown here is very efficient and runs much faster than explicitly looping over the elements in a list.
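To make the comparison with lists concrete, the sketch below performs the multiplication from above twice, once with an explicit loop over a list and once as a vector operation, and checks that the results agree. It also shows that the + operator behaves differently for plain lists. The variable names are illustrative only.

```python
import numpy as np

values = [0.1, 0.2, 0.3]

# Explicit loop over a plain Python list
scaled_loop = []
for value in values:
    scaled_loop.append(value * 10.0)

# The same calculation as a single vector operation
scaled_array = np.array(values) * 10.0

print(np.allclose(scaled_loop, scaled_array))  # the two approaches agree

# With plain lists, + joins the lists rather than adding element by element
joined = [5, 10, 15] + [5.0, 0.0, -5.0]
print(joined)
print(len(joined))                             # six items, not three
```

The loop and the vector operation give the same numbers; the difference is that NumPy carries out the loop internally in compiled code, which is where the speed advantage comes from.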

NumPy functions

In addition to the power of the NumPy array (on which some of Python’s most impressive libraries are built), the NumPy library also provides a broad range of useful functions. For example, the np.log function that was introduced at the start differs from the math.log function introduced earlier: the former can operate on a whole NumPy array, while the latter cannot.

K = np.array([1.06, 3.8, 15.0, 45.44, 150.6])

np.log(K)
array([0.05826891, 1.33500107, 2.7080502 , 3.81639277, 5.01462732])
from math import log

log(K)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_2395/4007023428.py in <module>
      1 from math import log
      2 
----> 3 log(K)

TypeError: only size-1 arrays can be converted to Python scalars

The function from the math module will result in an error.
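The error arises because math.log expects a single number rather than an array. As a sketch of the difference, math.log can still be applied to one item at a time inside a loop (here a list comprehension), and the result matches what np.log produces in a single call.

```python
import math

import numpy as np

K = np.array([1.06, 3.8, 15.0, 45.44, 150.6])

# math.log works on one number at a time, so we must loop over the items
logs_from_loop = np.array([math.log(k) for k in K])

# np.log takes the whole array in one call
logs_from_numpy = np.log(K)

print(np.allclose(logs_from_loop, logs_from_numpy))  # the results agree
```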

Alongside these mathematical operations, the NumPy library also enables statistical operations on NumPy arrays. For example, the sum, mean and standard deviation of an array are easy to find.

np.sum(mass_numbers)
1177
np.mean(mass_numbers)
117.7
np.std(mass_numbers)
3.4942810419312296
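One detail worth knowing is that np.std divides by the number of items N by default, which gives the population standard deviation. If the sample standard deviation (dividing by N - 1) is wanted instead, the ddof argument can be set to 1. The sketch below compares the two for the mass_numbers array defined earlier.

```python
import numpy as np

mass_numbers = np.array([112, 114, 115, 116, 117, 118, 119, 120, 122, 124])

# Default: population standard deviation (divide by N)
population_std = np.std(mass_numbers)

# ddof=1: sample standard deviation (divide by N - 1)
sample_std = np.std(mass_numbers, ddof=1)

print(population_std)
print(sample_std)     # slightly larger than the population value
```

Which of the two is appropriate depends on whether the array holds the whole population of interest or only a sample drawn from it.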

Exercises:

  1. Using the mass_numbers and isotopic_abundances arrays above, evaluate the average mass number for tin.

  2. Rewrite or modify your code from the Loops Exercise to calculate the distances between each pair of atoms, using NumPy arrays to store the atom positions, and vector arithmetic to calculate the vectors between pairs of atoms.

Worked Example