Array Operations

In the previous section, we learned how to create and index NumPy arrays. Now we will look at how to perform calculations with arrays.

import numpy as np

Vector Arithmetic#

Because every element of a NumPy array has the same type, arithmetic can be applied to the whole array at once. For example, every item in an array can be multiplied by a single value.

list_of_numbers = [1, 9, 17, 22, 45, 68]

array_of_numbers = np.array(list_of_numbers)

print(array_of_numbers)

[ 1  9 17 22 45 68]

If we print a NumPy array, it looks much like a list. NumPy arrays, however, allow us to perform vector arithmetic where we perform the same operation on every element of an array, or on every pair of elements from two arrays.

print(array_of_numbers * 2)

[  2  18  34  44  90 136]

Using the standard mathematical operators such as * and + allows us to operate simultaneously on all elements of the array. If we run array_of_numbers * 2, we multiply every element of array_of_numbers by 2.

print(array_of_numbers + 10)

[11 19 27 32 55 78]

This is particularly useful for tasks like unit conversions:

# Convert temperatures from Kelvin to Celsius
temps_K = np.array([298, 310, 323, 335])
temps_C = temps_K - 273.15

print(f"Temperatures in K: {temps_K}")
print(f"Temperatures in °C: {temps_C}")

Temperatures in K: [298 310 323 335]
Temperatures in °C: [24.85 36.85 49.85 61.85]
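The contrast with plain Python lists is worth seeing once, because the same operator behaves quite differently there. The sketch below reuses list_of_numbers and temps_K from above; the names doubled_list and doubled_array are introduced here purely for illustration.

```python
import numpy as np

list_of_numbers = [1, 9, 17, 22, 45, 68]
array_of_numbers = np.array(list_of_numbers)

# For a list, * 2 repeats the list rather than doubling each item
doubled_list = list_of_numbers * 2
print(len(doubled_list))   # 12 items: the original six, twice

# For an array, * 2 doubles every element
doubled_array = array_of_numbers * 2
print(doubled_array)

# Subtracting a float from an integer array gives a float array
temps_K = np.array([298, 310, 323, 335])
print(temps_K.dtype, (temps_K - 273.15).dtype)
```

The last line also shows why the Celsius values above are printed with decimal places even though temps_K holds integers.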

Exercise#

Convert these pressures from atmospheres to pascals (1 atm = 101325 Pa):

pressures_atm = np.array([1.0, 1.5, 2.0, 0.5])

Element-wise Operations#

We can also add, subtract, divide and multiply arrays together. Each of these operations is carried out element-wise:

array_1 = np.array([1, 5, 7])
array_2 = np.array([3, 1, 2])

# array addition
print(array_1 + array_2)

[4 6 9]

# array subtraction
print(array_1 - array_2)

[-2  4  5]

# array multiplication
print(array_1 * array_2)

[ 3  5 14]

# array division
print(array_1 / array_2)

[0.33333333 5.         3.5       ]

array_3 = array_1 + array_2

print(array_1)
print(array_2)
print('-------')
print(array_3)

[1 5 7]
[3 1 2]
-------
[4 6 9]

Here, each element of array_3 is the sum of the elements at the same position in array_1 and array_2: the first element is 1 + 3 = 4, the second is 5 + 1 = 6, and the third is 7 + 2 = 9.

This is precisely how vector addition works, hence the term vector arithmetic.

Note how much simpler element-wise arithmetic is using NumPy arrays than with lists:

# With NumPy arrays
array_1 = np.array([1, 5, 7])
array_2 = np.array([3, 1, 2])
result = array_1 + array_2
print(f"NumPy: {result}")

# With lists (using a list comprehension)
list_1 = [1, 5, 7]
list_2 = [3, 1, 2]
result = [a + b for a, b in zip(list_1, list_2)]
print(f"Lists: {result}")

NumPy: [4 6 9]
Lists: [4, 6, 9]
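Scalar and element-wise operations can also be mixed in a single expression. As a minimal sketch, the kinetic energy E = ½mv² can be evaluated for several particles at once. The masses and speeds below are made-up illustrative values, not data from this section.

```python
import numpy as np

# Hypothetical values for three particles
masses = np.array([2.0, 4.0, 6.0])   # kg
speeds = np.array([3.0, 1.0, 2.0])   # m/s

# ** squares each speed; * multiplies element by element;
# 0.5 is applied to every element of the result
kinetic_energy = 0.5 * masses * speeds**2
print(f"Kinetic energies (J): {kinetic_energy}")   # 9, 2 and 12 J
```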

Exercise#

You’re diluting a set of solutions by different factors. Calculate the final concentrations:

initial_conc = np.array([0.5, 1.0, 2.0])  # mol/L
dilution_factor = np.array([2, 5, 10])     # fold dilution

NumPy Functions#

NumPy provides many useful functions for statistical and mathematical operations.

example_array = np.array([200, 220, 198, 181, 201, 156])

mean_of_example = np.mean(example_array)
sum_of_example = np.sum(example_array)
std_dev_of_example = np.std(example_array)

print(f'The mean = {mean_of_example}')
print(f'The sum = {sum_of_example}')
print(f'The standard deviation = {std_dev_of_example}')

The mean = 192.66666666666666
The sum = 1156
The standard deviation = 19.913702708325125

These operations can also be written as methods of arrays:

# These are equivalent
print(f"Using function: {np.mean(example_array)}")
print(f"Using method: {example_array.mean()}")

Using function: 192.66666666666666
Using method: 192.66666666666666

You can think of methods as functions that belong to a specific object. Calling a method on an object is equivalent to calling a function with that object as the first argument, so my_array.mean() is equivalent to np.mean(my_array).

Use whichever style you prefer; both give the same result.
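One detail to be aware of with np.std(): by default it computes the population standard deviation (dividing by N). For the sample standard deviation (dividing by N - 1), which is often what is wanted for a set of experimental replicates, pass ddof=1. A short sketch using example_array from above:

```python
import numpy as np

example_array = np.array([200, 220, 198, 181, 201, 156])

population_std = np.std(example_array)       # divides by N (default, ddof=0)
sample_std = np.std(example_array, ddof=1)   # divides by N - 1

print(f"Population standard deviation = {population_std:.3f}")
print(f"Sample standard deviation = {sample_std:.3f}")
```

The sample value is always slightly larger, and the difference shrinks as the number of measurements grows.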

Mathematical Functions#

NumPy also provides mathematical functions that work on arrays:

# Square root
distances_squared = np.array([4, 9, 16, 25])
distances = np.sqrt(distances_squared)
print(f"Distances: {distances}")

# Exponential
x = np.array([0, 1, 2])
exp_x = np.exp(x)
print(f"exp(x): {exp_x}")

# Natural logarithm
values = np.array([1, 2.718, 7.389])
ln_values = np.log(values)
print(f"ln(values): {ln_values}")

Distances: [2. 3. 4. 5.]
exp(x): [1.         2.71828183 7.3890561 ]
ln(values): [0.         0.99989632 1.99999241]
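Note that np.log() is the natural logarithm. For base-10 logarithms, NumPy provides np.log10(). As an illustration, pH can be calculated from hydrogen ion concentrations; the concentrations below are invented for the example.

```python
import numpy as np

# Hypothetical hydrogen ion concentrations in mol/L
h_conc = np.array([1e-3, 1e-7, 1e-10])

# pH = -log10([H+]), applied to every element at once
pH = -np.log10(h_conc)
print(f"pH: {pH}")
```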

Calculating Distances#

In chemistry, we often need to calculate distances between atoms. We can do this by combining vector arithmetic with mathematical functions.

# Two atomic positions
atom1 = np.array([0.0, 0.0, 0.0])
atom2 = np.array([1.0, 1.0, 0.0])

# Vector between atoms
vector = atom2 - atom1
print(f"Vector: {vector}")

# Distance (magnitude of vector)
distance = np.sqrt(np.sum(vector**2))
print(f"Distance: {distance:.3f} Å")

Vector: [1. 1. 0.]
Distance: 1.414 Å

This calculation uses Pythagoras’s theorem:

$$d = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2 + (z_2-z_1)^2}$$

NumPy provides a function to calculate the magnitude (length) of a vector directly:

# Using np.linalg.norm() (easier!)
distance = np.linalg.norm(vector)
print(f"Distance: {distance:.3f} Å")

Distance: 1.414 Å
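As a quick check that the explicit Pythagoras calculation and np.linalg.norm() agree, the sketch below uses two hypothetical positions chosen so that the answer is easy to verify by hand (a 3-4-5 triangle). The names atom_a, atom_b, by_formula and by_norm are illustrative only.

```python
import numpy as np

# Hypothetical positions chosen so the answer is easy to check by hand
atom_a = np.array([1.0, 2.0, 3.0])
atom_b = np.array([4.0, 6.0, 3.0])

vector = atom_b - atom_a                   # components 3, 4 and 0
by_formula = np.sqrt(np.sum(vector**2))    # sqrt(9 + 16 + 0)
by_norm = np.linalg.norm(vector)

print(f"Formula: {by_formula:.3f} Å, norm: {by_norm:.3f} Å")
print(np.isclose(by_formula, by_norm))     # True
```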

Exercise#

Calculate the O-H bond length in water:

oxygen = np.array([0.0, 0.0, 0.0])
hydrogen = np.array([0.96, 0.0, 0.0])

Finding Minima and Maxima#

NumPy also provides functions to find both the values and the positions of minima and maxima in arrays:

energies = np.array([145.2, 132.8, 128.3, 135.1, 142.7])

# Find the minimum and maximum values
min_energy = np.min(energies)
max_energy = np.max(energies)
print(f"Minimum energy: {min_energy} kJ/mol")
print(f"Maximum energy: {max_energy} kJ/mol")

# Find the INDEX of minimum and maximum
min_index = np.argmin(energies)
max_index = np.argmax(energies)
print(f"Minimum occurs at index {min_index}")
print(f"Maximum occurs at index {max_index}")

Minimum energy: 128.3 kJ/mol
Maximum energy: 145.2 kJ/mol
Minimum occurs at index 2
Maximum occurs at index 0

This is particularly useful when you have corresponding arrays. For example, finding which temperature gave the lowest energy:

# Experimental data: energy at different temperatures
temperatures = np.array([298, 310, 323, 335, 348])
energies = np.array([145.2, 132.8, 128.3, 135.1, 142.7])

# Find which temperature gave the minimum energy
min_index = np.argmin(energies)
optimal_temp = temperatures[min_index]
min_energy = energies[min_index]

print(f"Minimum energy: {min_energy} kJ/mol")
print(f"Occurs at temperature: {optimal_temp} K")

Minimum energy: 128.3 kJ/mol
Occurs at temperature: 323 K
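Two details about np.argmin() and np.argmax() are worth knowing: they are also available as array methods, and when the extreme value occurs more than once they return the index of its first occurrence. A brief sketch with made-up values follows; the comparison used to find every matching position is one possible approach, not the only one.

```python
import numpy as np

# Hypothetical energies where the minimum value appears twice
energies_with_tie = np.array([140.0, 128.3, 135.1, 128.3])

# Method form, equivalent to np.argmin(energies_with_tie)
first_min_index = energies_with_tie.argmin()
print(f"First minimum at index {first_min_index}")

# To find every position of the minimum, compare against the minimum value
all_min_indices = np.flatnonzero(energies_with_tie == energies_with_tie.min())
print(f"All minima at indices {all_min_indices}")
```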

Exercise#

You’ve measured reaction rates at different pH values:

pH_values = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
rates = np.array([0.12, 0.45, 0.89, 1.34, 0.95, 0.52, 0.18])  # μmol/s

Find the pH that gives the maximum reaction rate.

Common Pitfalls#

There are a few common mistakes to watch out for when working with NumPy arrays.

Shape Mismatches#

Arrays must have compatible shapes for element-wise operations:

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])

print(f"Shape of a: {a.shape}")
print(f"Shape of b: {b.shape}")

# This causes an error:
try:
    result = a + b
except ValueError as e:
    print(f"ValueError: {e}")

Shape of a: (3,)
Shape of b: (4,)
ValueError: operands could not be broadcast together with shapes (3,) (4,) 

Always check array shapes when debugging: print(array.shape)
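One way to catch this early is to compare shapes before combining arrays. The helper below, add_checked, is a hypothetical name introduced for this sketch and is not part of NumPy; it assumes you want arrays of exactly the same shape, as in the examples in this section.

```python
import numpy as np

def add_checked(a, b):
    """Add two arrays, raising a clear error if their shapes differ."""
    if a.shape != b.shape:
        raise ValueError(f"Shape mismatch: {a.shape} vs {b.shape}")
    return a + b

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])

try:
    add_checked(a, b)
except ValueError as e:
    print(f"ValueError: {e}")

# Trimming b to its first three elements makes the shapes match
print(add_checked(a, b[:3]))
```

Whether trimming is the right fix depends on the data; a shape mismatch often means a measurement is missing or duplicated, so it is worth finding out why the lengths differ.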

Accidental Type Conversion#

Mixing types can give unexpected results:

# If you accidentally include a string...
mixed = np.array([1, 2, '3', 4])
print(f"Array: {mixed}")
print(f"Data type: {mixed.dtype}")

# Now it's all strings! Mathematical operations won't work as expected

Array: ['1' '2' '3' '4']
Data type: <U21

try:
    np.mean(mixed)  # Raises an error. We can't compute a mean of strings.
except TypeError as e:
    print(f"TypeError: {e}")

TypeError: the resolved dtypes are not compatible with add.reduce. Resolved (dtype('<U21'), dtype('<U21'), dtype('<U42'))

Check data types if calculations seem wrong: print(array.dtype). A dtype beginning with <U means the array holds strings, and object means it holds arbitrary Python objects; in both cases the array is not numeric.
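If an array has ended up as strings but every entry is a valid number, it can be converted back with the astype() method. A minimal sketch using the mixed array from above (the name fixed is introduced here for illustration):

```python
import numpy as np

mixed = np.array([1, 2, '3', 4])   # dtype is a string type

# Convert every element to a float; this raises a ValueError
# if any entry cannot be interpreted as a number
fixed = mixed.astype(float)

print(f"Array: {fixed}")
print(f"Data type: {fixed.dtype}")
print(f"Mean: {np.mean(fixed)}")
```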

Summary#

You have learned how to:

Perform scalar operations on arrays (multiply, add, etc.)
Perform element-wise operations between arrays
Use NumPy functions for statistical operations (mean, std, sum)
Use mathematical functions (sqrt, exp, log)
Calculate distances between atoms using np.linalg.norm()
Find minima and maxima, and their positions, using np.min(), np.max(), np.argmin() and np.argmax()
Recognise and avoid common pitfalls

In the next section, you can test your understanding with exercises that combine these concepts.
