Functions


In our previous labs, we have already been using functions: the print function, the type function, functions from the math library etc. So far, we’ve largely glossed over the details and have just introduced various functions as and when they become relevant. We are now going to dive headfirst into exactly what functions are, how they work and how you can write your own.

A function is a piece of reusable code that takes some inputs, manipulates them in a useful way, and (usually) returns some output. To call a function, we type the function’s name followed by a set of brackets in which we place the inputs:

message = 'Here we are calling the print function'

print(message)

Here we are calling the print function

The inputs to a function placed in the brackets are usually referred to as arguments. We could describe the line of code above as “calling the print function with message passed as the only argument”. In this case, we are calling a built-in function: the print function is made available to us in Python by default. We have also seen examples where we have to import the function we want to use (i.e. it is not built-in):

import math

math.cos(math.pi)

-1.0

Here we have imported the math library and used this to call the cos function with \(\pi\) as the only argument. We mentioned previously that functions usually return an output. Here, that output is -1.0: the value of \(\cos{\pi}\).

Defining your own functions

Now we’ve reminded ourselves of what functions are and what they do, let’s get on with writing some of our own!

def add(a, b):
    result = a + b
    
    return result

In this simple example, we have defined an add function which outputs the sum of two numbers. Let’s go line-by-line:

def add(a, b):

This first line starts with the def (short for define) keyword: this is how we tell Python that we want to define a new function.

def is followed by add: this is the name we want to give our new function.

Next comes a set of brackets (a, b). This is where we specify what arguments we want our function to take. Here we want to add two numbers together, so we specify two arguments: a and b.

Important

We are not specifying at this stage what a and b are: these are just the names for our arguments. Only when we call the function later will a and b take on actual values.

Just like if statements and for loops, a function definition line ends with a colon : and is followed by an indented block.

    result = a + b

The second line does the actual calculation we are interested in, adding the value of a to the value of b and storing this in a new variable result.

    return result

Finally, the last line starts with the return keyword. This is how we tell Python what the output of the function is. In this case, we want to return the sum of a and b, so we return our result variable.

Having defined our add function, we can call it just like any other function:

add(5, 10)

When we call our add function, Python goes and looks at its definition:

def add(a, b):
    result = a + b

    return result

The arguments we supplied were \(5\) and \(10\), so what Python does under the hood is:

# What Python is doing when we type add(5, 10)

a = 5
b = 10

result = a + b

And then result is returned as the output of the function call.
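As a minimal sketch of what "returned as the output" means in practice, the value handed back by add can be stored in a variable or passed straight into another call. The variable names total and nested_total are chosen here purely for illustration.

```python
def add(a, b):
    result = a + b

    return result

# Store the returned value in a variable for later use.
total = add(5, 10)
print(total)

# Use the returned value directly as an argument to another call.
nested_total = add(add(1, 2), 3)
print(nested_total)
```

The inner call add(1, 2) is evaluated first, and its output becomes the first argument of the outer call.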

It is worth emphasising at this point that we only need to run the function definition once. In other words, we do not need to run:

def add(a, b):
    result = a + b

    return result

every time we call the add function, only once beforehand.

All about arguments

Now that we have a general overview, let’s talk a bit more about the specific components of a function, starting with arguments.

In our previous example, it does not matter what order we supply our arguments in:

add(5, 10)

add(10, 5)

This is simply a reflection of the fact that addition is commutative, \(a + b = b + a\). The vast majority of functions do not have this property, for example \(a \div b \neq b \div a\):

def divide(a, b):
    result = a / b

    return result

divide(10, 5)

2.0

divide(5, 10)

0.5

This is why you will sometimes read about positional arguments: the position/order matters.

As opposed to positional arguments, we can also make use of keyword arguments:

def divide(a, b, round_to_int=False):
    result = a / b

    if round_to_int:
        return round(result)

    else:
        return result

Here we have added an example keyword argument to our divide function: round_to_int. Notice that this argument is followed by the assignment operator = and the False keyword: this indicates that round_to_int takes the value False by default.

Keyword arguments are optional:

divide(10, 3)

3.3333333333333335

This means that we can call the divide function with only the two positional and required arguments, and it will function just like before. If we also set the optional keyword argument:

divide(10, 3, round_to_int=True)

Our output is now rounded to the nearest integer as per the logic in the function definition.
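To make the effect of the keyword argument concrete, here is a short sketch that calls divide both ways and compares the results. The variable names unrounded and rounded are just illustrative.

```python
def divide(a, b, round_to_int=False):
    result = a / b

    if round_to_int:
        return round(result)

    else:
        return result

# Default behaviour: round_to_int is False, so we get a float back.
unrounded = divide(10, 3)

# Setting the keyword argument switches on the rounding branch.
rounded = divide(10, 3, round_to_int=True)

print(unrounded, rounded)
print(type(unrounded), type(rounded))
```

Notice that the rounded output is an int rather than a float, because the built-in round function returns an integer when called with a single argument.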

Note that keyword arguments (those with a default value) cannot come before positional arguments (those without one) in the function definition:

def divide(a, round_to_int=False, b):
    result = a / b

    if round_to_int:
        return round(result)

    else:
        return result

  Cell In[13], line 1
    def divide(a, round_to_int=False, b):
                                      ^
SyntaxError: parameter without a default follows parameter with a default

You can also imagine other errors that we might encounter depending on the arguments that a function expects compared to those that we actually pass to it:

divide('10', 5)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[19], line 1
----> 1 divide('10', 5)

Cell In[13], line 2, in divide(a, b, round_to_int)
      1 def divide(a, b, round_to_int=False):
----> 2     result = a / b
      4     if round_to_int:
      5         return round(result)

TypeError: unsupported operand type(s) for /: 'str' and 'int'

Here’s a familiar example from our previous labs. We have passed '10' as the first argument, which is then assigned to a. Our function definition instructs Python to evaluate a / b, which is impossible if a is a string: hence the TypeError. There are ways in which we can make our functions at least somewhat tolerant to bad user input, but we will come to this later.

Both the add and divide functions that we have written require two arguments, but there is nothing stopping us from writing functions that take more or fewer arguments. In fact, you can write functions that take zero arguments:

def zero_arguments():
    print('This function takes zero arguments!')

Even though there are no arguments, notice that the brackets are still required before the colon : in the function definition. The same goes for when we call zero_arguments:

zero_arguments()

This function takes zero arguments!

If we forget the brackets, Python does not call the function. Instead, it treats zero_arguments like any other variable and simply shows us the function it refers to:

zero_arguments

<function __main__.zero_arguments()>

Note

This is true for all functions, regardless of the number of arguments they take. To call a function, we must follow its name with a set of brackets (argument_1, argument_2 ...).
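The difference between naming a function and calling it can be sketched as follows. The name reference is a hypothetical variable introduced only for this illustration, and callable is a built-in function that reports whether something can be called.

```python
def zero_arguments():
    print('This function takes zero arguments!')

# Without brackets we only refer to the function; nothing is printed.
reference = zero_arguments
print(callable(reference))

# Adding the brackets actually calls it, and the message is printed.
reference()
```

Only the final line runs the code inside the function definition.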

Of course, if zero_arguments takes… zero arguments, then you can imagine what might happen if we try to pass some anyway:

zero_arguments(5)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 zero_arguments(5)

TypeError: zero_arguments() takes 0 positional arguments but 1 was given

What to `return`?

Functions use the return keyword to specify what the output should be. For example, our add function returns the result of adding a and b:

def add(a, b):
    result = a + b

    return result

What happens if we don’t return anything?

def add(a, b):
    result = a + b

add(5, 10)

The answer appears to be… nothing: running add(5, 10) produces no output at all. This is actually a little deceptive, as Python is returning something. We can see this if we print the output of the add function:

print(add(5, 10))

None

Or we could use the type function:

type(add(5, 10))

NoneType

If you do not specify what to return from a function, the default is to return None.

We have not encountered None before, but the idea is relatively simple. None is Python for “nothing” or “null”: it is a way of representing a lack of data. If our function doesn’t return anything, then None represents exactly that: it is the absence of data. We have actually already met a few functions that don’t return anything:

type(print('The print function does not return anything!'))

The print function does not return anything!

NoneType

It may seem strange, but the print function does not return anything, hence the type of its output is NoneType: it returns None. Notice the distinction between what is displayed on the screen using print versus what is actually returned. You can see this even more clearly using variables:

print_return_value = print('This will allow us to store whatever the print function returns!')

print(print_return_value)

This will allow us to store whatever the print function returns!
None
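The distinction between displaying a value and returning it matters as soon as we want to reuse a result. The sketch below contrasts two hypothetical functions, print_sum and return_sum, which are not part of the lab and are defined here only to illustrate the point.

```python
def print_sum(a, b):
    # Displays the sum on screen, but returns None.
    print(a + b)

def return_sum(a, b):
    # Hands the sum back to the caller without displaying anything.
    return a + b

printed = print_sum(2, 3)
returned = return_sum(2, 3)

# Only the returned value can be used in further calculations.
print(printed is None)
print(returned * 2)
```

Attempting printed * 2 instead would raise a TypeError, as None cannot be multiplied by a number.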

Our examples have always returned one output, but Python allows us to return as many outputs as we would like! Let’s say we want to write a function that calculates the sum of two numbers and the difference between them:

def add_and_subtract(a, b):
    addition = a + b
    subtraction = a - b

    return addition, subtraction

add_and_subtract(3, 9)

(12, -6)

As you can see above, add_and_subtract returns two values: addition and subtraction. The syntax for returning multiple values is simply to separate them by commas: return addition, subtraction.

When we return multiple values, they are output from the function as a tuple:

result = add_and_subtract(3, 9)
print(result)

type(result)

(12, -6)

tuple

Each element of this tuple can be accessed in all the ways we learnt about back in lab 2. It is relatively common in the context of functions returning multiple outputs to use multiple assignment to store each output in a variable:

addition, subtraction = add_and_subtract(3, 9)

print(addition, subtraction)

12 -6
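As an alternative to multiple assignment, the returned tuple can also be indexed directly. This is a small sketch reusing the add_and_subtract function from above.

```python
def add_and_subtract(a, b):
    addition = a + b
    subtraction = a - b

    return addition, subtraction

result = add_and_subtract(3, 9)

# Indexing picks out a single element of the returned tuple.
print(result[0])
print(result[1])

# len confirms how many values came back.
print(len(result))
```

Multiple assignment is usually easier to read, but indexing can be handy when only one of the outputs is needed.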

Scope

Returning again to our divide function, this time without the keyword argument:

def divide(a, b):
    result = a / b

    return result

Let’s talk about a and b. As discussed previously, these are the arguments to the divide function. When we call divide, the first argument we pass is assigned to a and the second is assigned to b.

divide(60, 10)

6.0

In other words, when we call divide, a and b are used as variables. When we ran divide(60, 10) above, Python implicitly ran:

a = 60
b = 10

So we would imagine that if we ran print(a), we should get \(60\), right?

print(a)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[22], line 1
----> 1 print(a)

NameError: name 'a' is not defined

As you can see, the answer is no. Not only do we not get \(60\), attempting to print(a) doesn’t work at all!

Error

This is an example of a NameError, which communicates that we have attempted to access a variable that does not exist.

But hang on, how can a not be defined if we have just run the divide function? This is a consequence of local scope:

a = 5
b = 10

print(a, b)

divide(60, 10)

print(a, b)

5 10
5 10

Hopefully the example above clears things up a bit. The idea of scope is simple: some variables are only defined (i.e. they only exist) within certain bounds. In the example above, we explicitly assign a = 5 and b = 10. We then call the divide function, which also assigns a and b, but to different values (here \(60\) and \(10\)). Nonetheless, if we print(a, b) before and after calling divide, we see that they have not changed! This is because the variables in the divide function are restricted to local scope. In other words, variables defined within functions are not defined outside of those functions.
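The same applies to any variable created inside a function, not only to its arguments. In the sketch below, the local variable is renamed quotient (a hypothetical name chosen so that it cannot clash with anything defined earlier), and a try/except block is used so that the NameError is displayed rather than halting the code.

```python
def divide(a, b):
    # quotient only exists while divide is running.
    quotient = a / b

    return quotient

answer = divide(60, 10)

try:
    print(quotient)
except NameError as error:
    # The local name is not visible out here in global scope.
    message = str(error)
    print(message)

print(answer)
```

The value survives only because it was returned and stored in answer, which lives in global scope.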

It is worth emphasising that the opposite is not true. Variables defined in global scope can indeed be accessed from within local scope:

global_scope = 10

def scope_example(a, b):
    result = (a - b) / global_scope

    return result

Here we have defined a scope_example function. Notice that it refers to the global_scope variable, which is not an argument to the function. Despite the fact that global_scope is not explicitly passed to scope_example:

scope_example(10, 5)

0.5

We find that it is able to access global_scope all the same.

It is generally a bad idea to mix up global and local variables. For example, if we change the value of global_scope:

global_scope = 'This is now a string.'

scope_example(10, 5)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[26], line 3
      1 global_scope = 'This is now a string.'
----> 3 scope_example(10, 5)

Cell In[24], line 4, in scope_example(a, b)
      3 def scope_example(a, b):
----> 4     result = (a - b) / global_scope
      6     return result

TypeError: unsupported operand type(s) for /: 'int' and 'str'

Our scope_example function suddenly stops working, as global_scope is now a string. This behaviour can lead to very troublesome bugs, as it is all too easy to accidentally change something in global scope that then propagates into any functions that use that variable in local scope.
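One possible way to sidestep this kind of bug is to pass the value in as an argument, so that the function depends only on what it is explicitly given. The sketch below assumes a hypothetical third argument named divisor, which is not part of the original function.

```python
def scope_example(a, b, divisor):
    # divisor is now local: the function no longer reads global scope.
    result = (a - b) / divisor

    return result

global_scope = 'This is now a string.'

# The global variable has no effect, as the divisor is passed explicitly.
print(scope_example(10, 5, 10))
```

With this version, reassigning global_scope cannot change the behaviour of scope_example.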

There is one situation in which it is more acceptable to rely on global scope:

R = 8.314

def ideal_gas_volume(p, n, T):
    return (n * R * T) / p

Here we define the variable R = 8.314 in global scope. We then access R in the local scope of the ideal_gas_volume function. Objectively, this is the same situation as our previous example with global_scope and the scope_example function. The key detail is that here R is (roughly) the gas constant, so we do not expect its value to change. If we access global scope from within a function, but we expect that the variables we are accessing will not change, then this shouldn’t cause us any unexpected problems, as we can rely on R always being 8.314.
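As a quick usage sketch, we can call ideal_gas_volume for one mole of gas. The units are an assumption made here rather than something stated above: with R in J/(mol K), pressure in Pa, amount in mol and temperature in K, the volume comes out in cubic metres. The conditions of 273.15 K and 101325 Pa are likewise just example inputs.

```python
import math

R = 8.314  # gas constant in J / (mol K), rounded as in the text

def ideal_gas_volume(p, n, T):
    # Assumed units: p in Pa, n in mol, T in K, giving a volume in m^3.
    return (n * R * T) / p

# One mole at 273.15 K and 101325 Pa (assumed example conditions).
volume = ideal_gas_volume(101325, 1, 273.15)
print(volume)

# Compare with the familiar value of roughly 0.0224 m^3 (22.4 L).
print(math.isclose(volume, 0.0224, rel_tol=1e-2))
```

Writing the name of a constant in capitals, as with R here, is a common Python convention for signalling that a global variable is not meant to be changed.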