Lists

In our first lab, we encountered several distinct data types: int (integers), float (decimal point numbers), complex (complex numbers), str (strings) and bool (booleans). All of these types represent various categories of individual data: a single number, a single string of letters etc. Often, we are interested in collections of data, for example a set of temperatures at which a given experiment has been performed. Python accommodates this requirement by providing us with various data structures that allow us to collate related data into a single object. Lists are one of these data structures.

A list is an ordered collection of values. The values that comprise a list are usually referred to as the elements or items in that list. To create a list, we enclose the elements (separated by commas) we wish to include in a set of square brackets:

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]

Just like the other data types we have encountered so far, we can store lists in variables with the assignment operator =:

some_numbers = [1, 2, 3, 4, 5]
print(some_numbers)
[1, 2, 3, 4, 5]

The example above is a list of int objects, but lists can contain any type of data:

some_strings = ['Praise', 'The', 'Sun!']
print(some_strings)
['Praise', 'The', 'Sun!']

We can even mix different types together:

a_variety = [5, 'five', 5.0, 5.0 + 0j]
print(a_variety)
[5, 'five', 5.0, (5+0j)]

Or make a list of lists!

some_numbers = [1, 2, 3, 4, 5]
some_strings = ['Praise', 'The', 'Sun!']

list_of_lists = [some_numbers, some_strings]
print(list_of_lists)
[[1, 2, 3, 4, 5], ['Praise', 'The', 'Sun!']]

This latter example is also referred to as a nested list. Unsurprisingly, if we call the type function, we discover that there is a list type:

type(list_of_lists)
list

List indexing

Let’s say we have a list of strings, perhaps some of the first-row transition metals:

transition_metals = ['Vanadium', 'chromium', 'Manganese']

Okay, now consider that, for whatever reason, I only want the name of the first transition metal in the list - how can I access just the first element? This can be accomplished with list indexing:

transition_metals[0]
'Vanadium'

Or perhaps we only want the second element:

transition_metals[1]
'chromium'

Or else the third element:

transition_metals[2]
'Manganese'

As you have probably observed in the code above, the general syntax for list indexing is:

name_of_variable[index]

Where the index is an int that refers to which element we would like to access. Notice that we obtain the first element with index [0], not [1].

Important

Python, like the vast majority of programming languages, starts counting from zero. Don’t worry if this takes you a while to get used to; it is admittedly somewhat counterintuitive.

Okay, so our example list has 3 elements, which can therefore be accessed with indices 0 through 2. What if we try to index the list with an index greater than 2?

transition_metals[3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 transition_metals[3]

IndexError: list index out of range

Error

Here we have another new type of error: an IndexError. This one is relatively straightforward: the index we have given is “out of range”, meaning that it refers to an element that does not exist. In this case, the index 3 refers to the fourth element of the list, but transition_metals only has 3 items, so there is nothing for Python to return.

You might think that using a negative index might yield a similar error, and yet:

transition_metals[-1]
'Manganese'

What’s happening here is that the negative index counts elements from back to front. Index [0] referred to the first element of the list, index [-1] refers to the last element of the list.

transition_metals[-2]
'chromium'

As you can see above, we can continue to use increasingly negative indices to access more elements of transition_metals going from the end to the beginning, index [-2] giving us the second to last item.
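The relationship between negative and positive indices can be sketched as follows. This is only an illustration: the halogens list is a hypothetical example, the built-in len function (which returns the number of elements in a list) is used to obtain the list length, and the try/except construct is included purely so that the sketch runs to completion rather than stopping at the error.

```python
# Hypothetical example list, chosen only for illustration
halogens = ['Fluorine', 'Chlorine', 'Bromine', 'Iodine']
n = len(halogens)  # number of elements: 4

# Index -k picks out the same element as index n - k
last = halogens[-1]
also_last = halogens[n - 1]
print(last, also_last)  # Iodine Iodine

# The most negative valid index is -n, which is the first element
first = halogens[-n]
print(first)  # Fluorine

# Going one step further back is out of range, just like index n
went_out_of_range = False
try:
    halogens[-n - 1]
except IndexError as error:
    went_out_of_range = True
    print('IndexError:', error)  # IndexError: list index out of range
```

In other words, a list with n elements accepts indices from -n through n - 1, and anything outside that range gives an IndexError.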

Modifying lists

Something that we have neglected to mention thus far is that lists are mutable: they can be modified in various ways. You may have noticed previously that there is a typo in the second element of the list ('chromium' is missing its capital letter). We can fix that by assigning a new value to that element using the = operator:

print(transition_metals[1])
transition_metals[1] = 'Chromium'
print(transition_metals[1])
chromium
Chromium

Adding elements to a list

Aside from modifying pre-existing elements of a list, we might conceivably want to add new items to a list and make it longer. For example, what if we want to add another element to transition_metals? There are actually many ways you could do this, several of which rely on the methods associated with lists. A method is just a function that is associated with a particular object. When we use a function, we type the name of the function, followed by brackets enclosing our inputs to that function:

len(transition_metals)
3

Here we call the built-in len function, which gives us the length of the list i.e. how many elements there are. Calling a method is much the same as the above, except that it is associated with the list itself:

transition_metals.append('Iron')
print(transition_metals)
['Vanadium', 'Chromium', 'Manganese', 'Iron']

This example uses the append method, which adds the input to the end of the list. Notice that the syntax for calling the append method is largely the same as if it was a standalone function, except that it follows a dot after the object it is associated with. This can be generically rendered as:

name_of_object.method_associated_with_that_object(input_to_method)

As well as appending elements to the end of a list, we can also insert elements at any given index in the list:

transition_metals.insert(0, 'Titanium')
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron']

This is the first time we’ve encountered a function or method that takes two arguments (two inputs). The insert method requires you to specify firstly the index for where you want the new element to go (here we use 0 to insert a new element at the start of the list) and secondly the new element itself.
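As a small sketch of how insert behaves away from the start of a list, consider the example below. The noble_gases list is a hypothetical example and not part of the running transition_metals example.

```python
# Hypothetical example list, chosen only for illustration
noble_gases = ['Helium', 'Argon', 'Krypton']

# Insert at index 1: 'Argon' and 'Krypton' each move one place to the right
noble_gases.insert(1, 'Neon')
print(noble_gases)  # ['Helium', 'Neon', 'Argon', 'Krypton']

# Inserting at index len(noble_gases) puts the new element at the end,
# which has the same effect as calling append
noble_gases.insert(len(noble_gases), 'Xenon')
print(noble_gases)  # ['Helium', 'Neon', 'Argon', 'Krypton', 'Xenon']
```

The new element always ends up at the index you specify, and everything that was at that index or later shifts along by one.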

Both the append and insert methods allow us to add one new element to the list. We can use the extend method to add multiple elements to the end of a list:

more_transition_metals = ['Cobalt', 'Nickel']
transition_metals.extend(more_transition_metals)

print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel']

We can also accomplish this same task by use of the addition operator +, in a similar manner to how we can “add” strings together:

even_more_transition_metals = ['Copper', 'Zinc']
transition_metals = transition_metals + even_more_transition_metals

print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel', 'Copper', 'Zinc']
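The three approaches to adding elements are easy to mix up, so here is a side-by-side sketch. The list names and contents are hypothetical examples chosen for illustration.

```python
# Hypothetical example lists, chosen only for illustration
group_1 = ['Lithium', 'Sodium']
more = ['Potassium', 'Rubidium']

# append adds its input as ONE new element, so appending a list nests it
appended = ['Lithium', 'Sodium']
appended.append(more)
print(appended)  # ['Lithium', 'Sodium', ['Potassium', 'Rubidium']]

# extend adds each element of its input separately
extended = ['Lithium', 'Sodium']
extended.extend(more)
print(extended)  # ['Lithium', 'Sodium', 'Potassium', 'Rubidium']

# + builds a brand new list and leaves both of the originals untouched
combined = group_1 + more
print(combined)  # ['Lithium', 'Sodium', 'Potassium', 'Rubidium']
print(group_1)   # ['Lithium', 'Sodium']
```

Note that append and extend change the list they are called on, whereas + only has a lasting effect if you assign the result to a variable.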

Removing elements from a list

We now have a range of tools at our disposal for adding elements to a list, but how can we remove them instead? Well, for a start, there is a remove method:

transition_metals.remove('Copper')
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel', 'Zinc']

This removes the first element which matches the input argument from the list. There is also a pop method:

pop_example = transition_metals.pop(4)

print(pop_example)
print(transition_metals)
Iron
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']

This method removes the element at the index provided as an input (in this case 4: the fifth element). Notice that the pop method also returns the value of the removed element. In the example above, we have assigned the output of the pop method to a variable pop_example, which when printed, shows us the removed element.
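A few details of remove and pop are worth sketching with a short example. The readings list is hypothetical, and the try/except construct is used only so that the sketch keeps running after the error.

```python
# Hypothetical example list containing a repeated value
readings = [20, 25, 20, 30]

# remove deletes only the first matching element; the second 20 survives
readings.remove(20)
print(readings)  # [25, 20, 30]

# pop with no argument removes and returns the last element
last_reading = readings.pop()
print(last_reading)  # 30
print(readings)      # [25, 20]

# Asking remove for a value that is not in the list raises a ValueError
value_missing = False
try:
    readings.remove(99)
except ValueError:
    value_missing = True
print(value_missing)  # True
```

As a rule of thumb, remove is convenient when you know the value you want to delete, and pop when you know its position.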

List slicing

We are now familiar with accessing individual elements of a list:

print(transition_metals)
print(transition_metals[4])
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
Cobalt

But what if we want to obtain multiple elements at once? Python facilitates this task with list slicing:

transition_metals[1:5]
['Vanadium', 'Chromium', 'Manganese', 'Cobalt']

Here we have acquired 4 elements of transition_metals: the second, third, fourth and fifth items (a slice out of the full list). The first number in the square brackets above is 1: this is the index from which we want to start the slice. The second number (after a colon :) is 5, which indicates the index where we would like to stop the slice. Notice that the element at index 5 is not included in the slice:

transition_metals[5]
'Nickel'

In general, we can therefore take a slice out of a list with the following syntax:

name_of_list[start_index : end_index]

Where the element at the end_index is not included in the slice. You actually don’t have to provide both the start and end indices:

transition_metals[2:]
['Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
transition_metals[:4]
['Titanium', 'Vanadium', 'Chromium', 'Manganese']

What’s happening here is that by default, if you do not provide an end index, Python will assume you want the highest value possible. In other words, your slice will start from the provided start index and go all the way to the end of the list. Similarly, if you provide the end index but no start index, Python will assume that you want your slice to start from the beginning of the list, up to (but not including) the end index you have given. To summarise, you can read:

name_of_list[start_index:]

As ‘give me every element of name_of_list from start_index and onwards’ and you can read:

name_of_list[:end_index]

As ‘give me every element of name_of_list from the beginning up to (but not including) the element at end_index’.
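One consequence of the end index being excluded is that two slices sharing the same index fit together with no overlap and no gap. The sketch below illustrates this with a hypothetical lanthanides list and a hypothetical variable split_at.

```python
# Hypothetical example list, chosen only for illustration
lanthanides = ['Lanthanum', 'Cerium', 'Praseodymium', 'Neodymium', 'Promethium']
split_at = 2

front = lanthanides[:split_at]  # everything before index 2
back = lanthanides[split_at:]   # everything from index 2 onwards
print(front)  # ['Lanthanum', 'Cerium']
print(back)   # ['Praseodymium', 'Neodymium', 'Promethium']

# Joining the two halves with + recovers the original list
print(front + back == lanthanides)  # True

# Unlike indexing, an end index beyond the end of the list is not an
# error: the slice simply stops at the last element
tail = lanthanides[3:100]
print(tail)  # ['Neodymium', 'Promethium']
```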

We have not used this feature so far, but we can also provide a third number when we slice a list:

transition_metals[1:5:2]
['Vanadium', 'Manganese']

This third number is the step size. By default, the step size is 1:

transition_metals[1:5:1]
['Vanadium', 'Chromium', 'Manganese', 'Cobalt']

By increasing this value to 2, we retrieve every other element from index 1 up to but not including index 5. If we set the step size to 3, we would obtain every third element etc:

transition_metals[1:7:3]
['Vanadium', 'Cobalt']

Tip

What do you think will happen if you set the step size to a negative number? Make a new list in a Jupyter notebook and try it out!

It should be noted that list slicing offers another way to remove elements from a list:

print(transition_metals)

transition_metals = transition_metals[2:4]

print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
['Chromium', 'Manganese']

By assigning the slice to transition_metals, we have effectively replaced the full list with the slice, removing everything not captured by our specified start and end indices.
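It is the assignment that discards the unwanted elements, not the slice itself. A minimal sketch with a hypothetical alkali_metals list shows that taking a slice produces a new list and leaves the original as it was:

```python
# Hypothetical example list, chosen only for illustration
alkali_metals = ['Lithium', 'Sodium', 'Potassium', 'Rubidium', 'Caesium']

# Storing the slice under a different name leaves the original untouched
middle = alkali_metals[1:4]
print(middle)              # ['Sodium', 'Potassium', 'Rubidium']
print(len(alkali_metals))  # 5

# Assigning the slice back to the same name replaces the full list
alkali_metals = alkali_metals[1:4]
print(alkali_metals)       # ['Sodium', 'Potassium', 'Rubidium']
```

If you still need the full list later on, store the slice in a new variable instead of overwriting the original.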

Lists of lists of lists of lists of lists…

We saw earlier that we can create nested lists (lists containing more lists) in Python:

transition_metals = [['Iron', 'Cobalt', 'Nickel'], 
                     ['Ruthenium', 'Rhodium', 'Palladium']]

Note

Notice in the example above, the list has been split over multiple lines. Python allows some statements to be written like this to improve readability.

This list has 2 elements:

len(transition_metals)
2

Each of which is, in itself, another list. We know that we can access individual elements of a list via indexing:

transition_metals[0]
['Iron', 'Cobalt', 'Nickel']

But what if we want an individual element of a list within a list? Say that we want to obtain just the 'Cobalt' string. We could do this by assigning a new variable:

first_row_transition_metals = transition_metals[0]
first_row_transition_metals[1]
'Cobalt'

Here we have stored just the first list in transition_metals in a new variable called first_row_transition_metals. We can then index this new variable as normal to retrieve our desired element. More succinctly, Python allows this process to be condensed into one line:

transition_metals[0][1]
'Cobalt'

Reading this from left to right, we type the name of the list (transition_metals), and index this to retrieve only the first sublist (['Iron', 'Cobalt', 'Nickel']). We then provide another index in a second set of square brackets to index this sublist and obtain only the second element: 'Cobalt'.
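Chained indexing combines with the other tools we have seen, such as negative indices, assignment and the len function. The sketch below uses a hypothetical variable called metal_rows so as not to disturb the transition_metals example.

```python
# Hypothetical nested list, chosen only for illustration
metal_rows = [['Iron', 'Cobalt', 'Nickel'],
              ['Ruthenium', 'Rhodium', 'Palladium']]

# Negative indices work at every level: last element of the last sublist
last_metal = metal_rows[-1][-1]
print(last_metal)  # Palladium

# An element of a sublist can be replaced by assignment
metal_rows[0][2] = 'Ni'
print(metal_rows[0])  # ['Iron', 'Cobalt', 'Ni']

# len applied to a sublist counts the elements of that sublist only
row_length = len(metal_rows[1])
print(row_length)  # 3
```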

Mathematical interlude: the sum function

We have already become acquainted with several new functions and methods related to lists, but it’s worth noting one more: the sum function.

some_numbers = [1, 3, 13, 2, 33, 102, -36]
sum(some_numbers)
118

As illustrated by the example above, the sum function literally sums up the elements of a list. In this case we used a list of int objects, but of course this works with float objects too:

some_floats = [5.13, 6.93, 10.11, -20.67]
sum(some_floats)
1.4999999999999973

On the other hand, it doesn’t make sense to try and sum up:

some_numbers_and_strings = [1, 5.5, 'purple', 10]
sum(some_numbers_and_strings)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[48], line 2
      1 some_numbers_and_strings = [1, 5.5, 'purple', 10]
----> 2 sum(some_numbers_and_strings)

TypeError: unsupported operand type(s) for +: 'float' and 'str'

Here we have the same problem we encountered previously when trying to add a str and an int or a float: Python gives us a TypeError and essentially tells us: “This operation you are asking me to do doesn’t make any sense!”

In combination with the len function, we can use the sum function to calculate the mean of a list of numbers:

some_numbers = [1, 3, 13, 2, 33, 102, -36]

sum(some_numbers) / len(some_numbers)
16.857142857142858
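The same recipe applies to any list of numbers, with one caveat: an empty list has a length of zero, so the division fails. The sketch below uses hypothetical temperature values, and the try/except construct is there only so that the example runs to the end.

```python
# Hypothetical temperatures (in kelvin), chosen only for illustration
temperatures = [298.0, 310.5, 323.0, 335.5]

mean_temperature = sum(temperatures) / len(temperatures)
print(mean_temperature)  # 316.75

# For an empty list, sum gives 0 and len gives 0, so we would be
# dividing by zero and Python raises a ZeroDivisionError
no_data = []
mean_failed = False
try:
    sum(no_data) / len(no_data)
except ZeroDivisionError:
    mean_failed = True
print(mean_failed)  # True
```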

Multiple assignment

We have already seen that, just like other types of data, a list can be assigned to a variable:

some_numbers = [1, 2]

Sometimes, we want to assign the various elements of a list to separate variables. With respect to our example above, we might want to store 1 in one variable and 2 in a different variable, rather than storing the whole list in some_numbers. This can be accomplished with multiple assignment:

number_1, number_2 = some_numbers

print(number_1)
print(number_2)
1
2

The syntax for this process is relatively straightforward: we just provide as many variable names as there are elements in the list, separated by commas:

variable_1, variable_2, variable_3 = [element_1, element_2, element_3]

# This sets variable_1 = element_1, variable_2 = element_2 etc.
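The number of variable names really does have to match the number of elements. Here is a minimal sketch using a hypothetical coordinates list, with try/except included only so that the example carries on after the error:

```python
# Hypothetical example list, chosen only for illustration
coordinates = [1.5, -2.0, 0.25]

# Three names for three elements: each name receives one element, in order
x, y, z = coordinates
print(x, y, z)  # 1.5 -2.0 0.25

# Two names for three elements: Python raises a ValueError
mismatch = False
try:
    a, b = coordinates
except ValueError:
    mismatch = True
print(mismatch)  # True
```

The same error appears if there are more names than elements, so it is worth checking the length of the list with len if you are unsure.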

Exercise

1. Copy the following block of code into a Jupyter notebook:

colours = [['Red', 'Magenta'], 
           ['Cyan', 'Green'], 
           [['Yellow', 'Blue'], 'White']]

a) Use list indexing to print Magenta from the colours list.

b) Again, using list indexing, assign the element Red to a variable called colour_1, the element Green to a variable called colour_2, and the element Yellow to a variable called colour_3, so that the following line of code runs without error and displays a true statement:

print(f'{colour_1} + {colour_2} -> {colour_3}')

2. Create a list of strings corresponding to the second row of the periodic table and assign this to a variable.

a) By slicing this list, assign just the non-metals and metalloids to a new variable.

b) Slice your list of non-metals to obtain only the elements with even atomic numbers. Assign this subset to a new variable.

c) Using any of the methods described previously, add the non-metals and metalloids from the third row of the periodic table (with even atomic numbers) to the list you created in b).