Lists#
In our first lab, we encountered several distinct data types: int (integers), float (decimal point numbers), complex (complex numbers), str (strings) and bool (booleans). All of these types represent various categories of individual data: a single number, a single string of letters etc. Often, we are interested in collections of data, for example a set of temperatures at which a given experiment has been performed. Python accommodates this requirement by providing us with various data structures that allow us to collate related data into a single object. Lists are one of these data structures.
A list is an ordered collection of values. The values that comprise a list are usually referred to as the elements or items in that list. To create a list, we enclose the elements (separated by commas) we wish to include in a set of square brackets:
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
Just like the other data types we have encountered so far, we can store lists in variables with the assignment operator =:
some_numbers = [1, 2, 3, 4, 5]
print(some_numbers)
[1, 2, 3, 4, 5]
The example above is a list of int objects, but lists can contain any type of data:
some_strings = ['Praise', 'The', 'Sun!']
print(some_strings)
['Praise', 'The', 'Sun!']
We can even mix different types together:
a_variety = [5, 'five', 5.0, 5.0 + 0j]
print(a_variety)
[5, 'five', 5.0, (5+0j)]
Or make a list of lists!
some_numbers = [1, 2, 3, 4, 5]
some_strings = ['Praise', 'The', 'Sun!']
list_of_lists = [some_numbers, some_strings]
print(list_of_lists)
[[1, 2, 3, 4, 5], ['Praise', 'The', 'Sun!']]
This latter example is also referred to as a nested list. Unsurprisingly, if we call the type function, we discover that there is a list type:
type(list_of_lists)
list
List indexing#
Let’s say we have a list of strings, perhaps some of the first row transition metals:
transition_metals = ['Vanadium', 'chromium', 'Manganese']
Okay, now consider that, for whatever reason, I only want the name of the first transition metal in the list - how can I access just the first element? This can be accomplished with list indexing:
transition_metals[0]
'Vanadium'
Or perhaps we only want the second element:
transition_metals[1]
'chromium'
Or else the third element:
transition_metals[2]
'Manganese'
As you have probably observed in the code above, the general syntax for list indexing is:
name_of_variable[index]
Where the index is an int that refers to which element we would like to access. Notice that we obtain the first element with index [0], not [1].
Important
Python, and for that matter the vast majority of programming languages, start counting from zero. Don’t worry if this takes you a while to get used to; it is admittedly somewhat counterintuitive.
Okay, so our example list has three elements. What happens if we ask for index 3?
transition_metals[3]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[11], line 1
----> 1 transition_metals[3]
IndexError: list index out of range
Error
Here we have another new type of error: an IndexError. This one is relatively straightforward: the index we have given is “out of range”, meaning that it refers to an element that does not exist. In this case, index 3 would refer to a fourth element, but transition_metals only has three.
You might think that using a negative index might yield a similar error, and yet:
transition_metals[-1]
'Manganese'
What’s happening here is that the negative index counts elements from back to front. Index [0] referred to the first element of the list; index [-1] refers to the last element of the list.
transition_metals[-2]
'chromium'
As you can see above, we can continue to use increasingly negative indices to access more elements of transition_metals going from the end to the beginning, index [-2] giving us the second to last item.
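As a rough illustration of how negative and positive indices line up, the sketch below uses a small stand-in list called metals (a name chosen here for illustration, not the list from the text). It also uses the built-in len function, which returns the number of elements in a list.

```python
# A stand-in list used only for this illustration.
metals = ['Vanadium', 'Chromium', 'Manganese']

n = len(metals)  # number of elements: 3

# Index -1 and index n - 1 both refer to the last element.
last_by_negative = metals[-1]
last_by_positive = metals[n - 1]
print(last_by_negative, last_by_positive)  # Manganese Manganese

# Index -n is the most negative index that is still valid: the first element.
first_by_negative = metals[-n]
print(first_by_negative)  # Vanadium
```

Going one step further in either direction (index n or index -n - 1) would raise an IndexError, because no such element exists.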
Modifying lists#
Something that we have neglected to mention thus far is that lists are mutable: they can be modified in various ways. You may have noticed previously that there is a typo in the second element of the list ('chromium' is missing its capital letter). We can fix that by assigning a new value to that element using the = operator:
print(transition_metals[1])
transition_metals[1] = 'Chromium'
print(transition_metals[1])
chromium
Chromium
Adding elements to a list#
Aside from modifying pre-existing elements of a list, we might conceivably want to add new items to a list and make it longer. For example, what if we want to add another element to transition_metals? There are actually many ways you could do this, several of which rely on the methods associated with lists. A method is just a function that is associated with a particular object. When we use functions, we type the name of the function, followed by brackets enclosing our inputs to that function:
len(transition_metals)
3
Here we call the built-in len function, which gives us the length of the list, i.e. how many elements there are. Calling a method is much the same as the above, except that it is associated with the list itself:
transition_metals.append('Iron')
print(transition_metals)
['Vanadium', 'Chromium', 'Manganese', 'Iron']
This example uses the append method, which adds the input to the end of the list. Notice that the syntax for calling the append method is largely the same as if it were a standalone function, except that it follows a dot after the object it is associated with. This can be generically rendered as:
name_of_object.method_associated_with_that_object(input_to_method)
As well as appending elements to the end of a list, we can also insert elements at any given index in the list:
transition_metals.insert(0, 'Titanium')
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron']
This is the first time we’ve encountered a function or method that takes two arguments (two inputs). The insert method requires you to specify firstly the index for where you want the new element to go (here we use 0, the very start of the list), and secondly the element to be inserted (here 'Titanium').
Both the append and insert methods allow us to add one new element to the list. We can use the extend method to add multiple elements to the end of a list:
more_transition_metals = ['Cobalt', 'Nickel']
transition_metals.extend(more_transition_metals)
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel']
We can also accomplish this same task by use of the addition operator +, in a similar manner to how we can “add” strings together:
even_more_transition_metals = ['Copper', 'Zinc']
transition_metals = transition_metals + even_more_transition_metals
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel', 'Copper', 'Zinc']
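To make the difference between append, extend and + concrete, here is a minimal sketch using made-up variable names (group_a, group_b, extra and combined are chosen here for illustration). It shows what happens when a whole list is passed to each of them.

```python
# Two identical starting lists and a list of extra items (illustrative names).
group_a = ['Iron', 'Cobalt']
group_b = ['Iron', 'Cobalt']
extra = ['Nickel', 'Copper']

group_a.append(extra)  # adds the whole list as ONE new (nested) element
group_b.extend(extra)  # adds each element of extra individually

print(group_a)  # ['Iron', 'Cobalt', ['Nickel', 'Copper']]
print(group_b)  # ['Iron', 'Cobalt', 'Nickel', 'Copper']

# The + operator builds a new list and leaves both operands unchanged.
combined = ['Iron', 'Cobalt'] + extra
print(combined)  # ['Iron', 'Cobalt', 'Nickel', 'Copper']
print(extra)     # ['Nickel', 'Copper']
```

Note that append and extend modify the list in place, whereas + produces a new list that has to be assigned to a variable if we want to keep it.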
Removing elements from a list#
We now have a range of tools at our disposal for adding elements to a list, but how can we remove them instead? Well, for a start, there is a remove method:
transition_metals.remove('Copper')
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Iron', 'Cobalt', 'Nickel', 'Zinc']
This removes the first element which matches the input argument from the list. There is also a pop method:
pop_example = transition_metals.pop(4)
print(pop_example)
print(transition_metals)
Iron
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
This method removes the element at the index provided as an input (in this case 4). The pop method also returns the value of the removed element. In the example above, we have assigned the output of the pop method to a variable pop_example, which, when printed, shows us the removed element.
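The following sketch, using an illustrative list named halogens (not part of the original examples), shows two details of these methods: remove only deletes the first matching element, and pop can also be called with no index, in which case it removes and returns the last element.

```python
# An illustrative list containing a deliberate duplicate.
halogens = ['Fluorine', 'Chlorine', 'Fluorine', 'Bromine']

halogens.remove('Fluorine')  # only the FIRST 'Fluorine' is removed
print(halogens)  # ['Chlorine', 'Fluorine', 'Bromine']

last_item = halogens.pop()  # no index given: removes and returns the last element
print(last_item)  # Bromine
print(halogens)   # ['Chlorine', 'Fluorine']
```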
List slicing#
We are now familiar with accessing individual elements of a list:
print(transition_metals)
print(transition_metals[4])
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
Cobalt
But what if we want to obtain multiple elements at once? Python facilitates this task with list slicing:
transition_metals[1:5]
['Vanadium', 'Chromium', 'Manganese', 'Cobalt']
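Here we have acquired a slice of transition_metals: the second, third, fourth and fifth items. The first number in the square brackets above is the start index (1), and the second number (after the colon :) is the end index (5). The element at the end index itself is not part of the slice: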
transition_metals[5]
'Nickel'
In general, we can therefore take a slice out of a list with the following syntax:
name_of_list[start_index : end_index]
Where the element at the end_index is not included in the slice. You actually don’t have to provide both the start and end indices:
transition_metals[2:]
['Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
transition_metals[:4]
['Titanium', 'Vanadium', 'Chromium', 'Manganese']
What’s happening here is that by default, if you do not provide an end index, Python will assume you want the highest value possible. In other words, your slice will start from the provided start index and go all the way to the end of the list. Similarly, if you provide the end index but no start index, Python will assume that you want your slice to start from the beginning of the list, up to (but not including) the end index you have given. To summarise, you can read:
name_of_list[start_index:]
As ‘give me every element of name_of_list from start_index and onwards’ and you can read:
name_of_list[:end_index]
As ‘give me every element of name_of_list from the beginning up to (but not including) the element at end_index’.
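The three forms of slicing described so far can be compared side by side. This is a minimal sketch on a short stand-in list called letters (an illustrative name), chosen so that the indices are easy to count.

```python
# A short stand-in list: index 0 is 'a', index 4 is 'e'.
letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[1:4]    # start at index 1, stop before index 4
from_two = letters[2:]   # start at index 2, continue to the end
up_to_two = letters[:2]  # start at the beginning, stop before index 2

print(middle)     # ['b', 'c', 'd']
print(from_two)   # ['c', 'd', 'e']
print(up_to_two)  # ['a', 'b']

# Because the end index is excluded, the two halves join up with no overlap.
print(up_to_two + from_two == letters)  # True
```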
We have not used this feature so far, but we can also provide a third index when we slice a list:
transition_metals[1:5:2]
['Vanadium', 'Manganese']
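This latter index is a step size: with a step size of 2, as above, we take every second element between the start and end indices. By default, the step size is 1, which takes every element: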
transition_metals[1:5:1]
['Vanadium', 'Chromium', 'Manganese', 'Cobalt']
By increasing this value to 3, we take every third element instead:
transition_metals[1:7:3]
['Vanadium', 'Cobalt']
Tip
What do you think will happen if you set the step size to a negative number? Make a new list in a Jupyter notebook and try it out!
It should be noted that list slicing can be used as another method to remove elements from a list:
print(transition_metals)
transition_metals = transition_metals[2:4]
print(transition_metals)
['Titanium', 'Vanadium', 'Chromium', 'Manganese', 'Cobalt', 'Nickel', 'Zinc']
['Chromium', 'Manganese']
By assigning the slice to transition_metals, we have effectively replaced the full list with the slice, removing everything not captured by our specified start and end indices.
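It may help to see that slicing on its own does not change the original list; it is the assignment that replaces it. A small sketch, using illustrative variable names row and middle:

```python
# An illustrative list; slicing it produces a NEW list.
row = ['Titanium', 'Vanadium', 'Chromium', 'Manganese']

middle = row[1:3]
print(middle)  # ['Vanadium', 'Chromium']
print(row)     # ['Titanium', 'Vanadium', 'Chromium', 'Manganese'] (unchanged)

# Only when we assign the slice back to the same name is the full list replaced.
row = row[1:3]
print(row)     # ['Vanadium', 'Chromium']
```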
Lists of lists of lists of lists of lists…#
We saw earlier that we can create nested lists (lists containing more lists) in Python:
transition_metals = [['Iron', 'Cobalt', 'Nickel'],
['Ruthenium', 'Rhodium', 'Palladium']]
Note
Notice in the example above, the list has been split over multiple lines. Python allows some statements to be written like this to improve readability.
This list has two elements:
len(transition_metals)
2
Each of which is, in itself, another list. We know that we can access individual elements of a list via indexing:
transition_metals[0]
['Iron', 'Cobalt', 'Nickel']
But what if we want an individual element of a list within a list? Say that I want to obtain just the 'Cobalt' string. We could do this by assigning a new variable:
first_row_transition_metals = transition_metals[0]
first_row_transition_metals[1]
'Cobalt'
Here we have stored just the first list in transition_metals in a new variable called first_row_transition_metals. We can then index this new variable as normal to retrieve our desired element. More succinctly, Python allows this process to be condensed into one line:
transition_metals[0][1]
'Cobalt'
Reading this from left to right, we type the name of the list (transition_metals), and index this to retrieve only the first sublist (['Iron', 'Cobalt', 'Nickel']). We then provide another index in a second set of square brackets to index this sublist and obtain only the second element: 'Cobalt'.
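The same left-to-right reading extends to deeper nesting: each extra pair of square brackets indexes whatever the previous one returned. A minimal sketch with a made-up list called grid (an illustrative name) that is nested three levels deep:

```python
# An illustrative list nested three levels deep.
grid = [[1, 2, 3],
        [4, [5, 6], 7]]

second_row = grid[1]     # the second sublist
inner = grid[1][1]       # the second element of that sublist, itself a list
value = grid[1][1][0]    # the first element of the innermost list

print(second_row)  # [4, [5, 6], 7]
print(inner)       # [5, 6]
print(value)       # 5
```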
Mathematical interlude: the sum function#
We have already become acquainted with several new functions and methods related to lists, but it’s worth noting one more: the sum function.
some_numbers = [1, 3, 13, 2, 33, 102, -36]
sum(some_numbers)
118
As illustrated by the example above, the sum function literally sums up the elements of a list. In this case we used a list of int objects, but of course this works with float objects too:
some_floats = [5.13, 6.93, 10.11, -20.67]
sum(some_floats)
1.4999999999999973
On the other hand, it doesn’t make sense to try and sum up a list that mixes numbers and strings:
some_numbers_and_strings = [1, 5.5, 'purple', 10]
sum(some_numbers_and_strings)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[48], line 2
1 some_numbers_and_strings = [1, 5.5, 'purple', 10]
----> 2 sum(some_numbers_and_strings)
TypeError: unsupported operand type(s) for +: 'float' and 'str'
Here we have the same problem we encountered previously when trying to add a str and an int or a float: Python gives us a TypeError and essentially tells us: “This operation you are asking me to do doesn’t make any sense!”
In combination with the len function, we can use the sum function to calculate the mean of a list of numbers:
some_numbers = [1, 3, 13, 2, 33, 102, -36]
sum(some_numbers) / len(some_numbers)
16.857142857142858
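As a further sketch of the same calculation, the snippet below breaks the mean into its two ingredients. The temperatures list and its values are invented here purely for illustration.

```python
# Invented example data: a set of temperatures in kelvin.
temperatures = [298.0, 310.5, 325.0, 350.5]

total = sum(temperatures)  # add up all of the elements
count = len(temperatures)  # count how many elements there are
mean_temperature = total / count

print(total)             # 1284.0
print(count)             # 4
print(mean_temperature)  # 321.0

# Division with / always gives a float, even for a list of int objects.
print(sum([1, 2, 3]) / len([1, 2, 3]))  # 2.0
```

Be aware that this approach fails for an empty list, since len would be 0 and dividing by zero raises a ZeroDivisionError.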
Multiple assignment#
We have already seen that, just like other types of data, a list can be assigned to a variable:
some_numbers = [1, 2]
Sometimes, we want to assign the various elements of a list to separate variables. With respect to our example above, we might want to store 1 in one variable and 2 in a different variable, rather than storing the whole list in some_numbers. This can be accomplished with multiple assignment:
number_1, number_2 = some_numbers
print(number_1)
print(number_2)
1
2
The syntax for this process is relatively straightforward: we just provide as many variable names as there are elements in the list, separated by commas:
variable_1, variable_2, variable_3 = [element_1, element_2, element_3]
# This sets variable_1 = element_1, variable_2 = element_2 etc.
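The requirement that the number of variable names matches the number of elements can be checked with a short sketch. The point list and the variable names below are illustrative, and the try/except construct is used here only to catch the error so that the example runs to completion.

```python
# An illustrative list with three elements.
point = [1.5, -2.0, 0.25]

# Three names for three elements: this works.
x, y, z = point
print(x, y, z)  # 1.5 -2.0 0.25

# Two names for three elements: Python raises a ValueError.
mismatch_detected = False
try:
    a, b = point
except ValueError:
    mismatch_detected = True

print(mismatch_detected)  # True
```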
Exercise#
1. Copy the following block of code into a Jupyter notebook:
colours = [['Red', 'Magenta'],
['Cyan', 'Green'],
[['Yellow', 'Blue'], 'White']]
a) Use list indexing to print Magenta from the colours list.
b) Again, using list indexing, assign the element Red to a variable called colour_1, the element Green to a variable called colour_2, and the element Yellow to a variable called colour_3, so that the following line of code runs without error and displays a true statement:
print(f'{colour_1} + {colour_2} -> {colour_3}')
2. Create a list of strings corresponding to the second row of the periodic table and assign this to a variable.
a) By slicing this list, assign just the non-metals and metalloids to a new variable.
b) Slice your list of non-metals to obtain only the elements with even atomic numbers. Assign this subset to a new variable.
c) Using any of the methods described previously, add the non-metals and metalloids from the third row of the periodic table (with even atomic numbers) to the list you created in b).