Data Analysis using Jupyter Notebooks Part 2

Benjamin J. Morgan

Functions, in more detail

You have already seen, and used, several functions that are part of the standard Python library, or part of modules that you have loaded with import.

math.sqrt() is a function. It takes a single number as an input, and gives the square root of that number as an output:

import math

result = math.sqrt(4)

result
In [ ]:
 

numpy.min() is a function. It takes a list or array of numbers as an input, and gives the minimum value in that set as an output:

import numpy as np

a = np.array([4,6,8,2,4,1])

minimum = np.min(a)

minimum
In [ ]:
 

These two examples both correspond to equivalent mathematical functions, but Python functions can be much more varied.

np.array() is a function that creates an array. It takes a list of numbers as an input, and gives a numpy array containing those same numbers as an output.

import numpy as np

a = np.array([1,2,3,4,5])

a
In [ ]:
 

print() is a function that prints a string on the screen. It takes a string (or something that can be converted to a string, such as a number) as an input.

print("Yes, I am a function")
In [ ]:
 

A function takes a list of inputs and does something with them, before (possibly) returning an output. For example, the numpy.sum() function takes one input, a numpy array, adds its contents, and returns the result:

import numpy as np

a = np.array( [1,2,3,4] )

result = np.sum( a )

print( result )
In [ ]:
 

The inputs to a function are called arguments. We say that numpy.sum() takes one argument.

The output is called the return value. We say that numpy.sum() returns a single number.

Look again at the print() function. Passing a single string argument to print() causes it to be printed to the screen. But what is the return value of the print() function?

return_value = print('hello')

print( return_value )
In [ ]:
 

The first line, hello, is produced when we call the function print() with the argument 'hello', as you would expect. We also store the return value from the print() function in our variable return_value, and print this on the second line. This gives us the output None. None is a special Python value that represents the absence of a value. The print() function does not return a value, so our return_value variable is set to None, which tells you that it is empty.
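As a minimal sketch of how you might check for this in your own code, the snippet below tests whether a return value is None using the is operator. The variable name is_empty is made up for this illustration.

```python
# Calling print() displays 'hello' and hands back None.
return_value = print('hello')

# None is a single special object, so the usual test is "is None".
is_empty = return_value is None

print(is_empty)
```

The same check works for the return value of any function, not only print().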

Writing your own functions

In Python, you are not limited to the functions in imported modules: you can also write your own. Writing a function requires a function definition, which looks like

def my_function( arguments ):

    # do something with the arguments

    return a_return_value

A function definition begins with the keyword def, followed by the name you want to give your function. This can be any allowed variable name, so underscores and capitals are allowed; names beginning with numbers are not allowed.
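If you are unsure whether a name is allowed, one possible check is the built-in string method isidentifier(). The candidate names below are made up for illustration.

```python
# Candidate function names (hypothetical examples).
candidates = ["my_function", "MyFunction2", "_helper", "2nd_function", "my-function"]

for name in candidates:
    # isidentifier() checks the character rules only. Reserved words such as
    # "def" pass this check, but still cannot be used as function names.
    print(name, name.isidentifier())
```

The first three names are reported as valid. The last two are not: one begins with a digit, and the other contains a hyphen.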

The function name is followed by a pair of brackets, and then a colon, like this: function_name():. If the function takes any arguments as inputs, these are indicated by variable names inside the brackets: function_name( arg1, arg2, arg3 ):.

The colon indicates that the following lines will contain the actual function code. These lines are indented, and the end of the function is marked by the end of this indentation:

def my_function( arguments ):

    # inside the function

    # do something

    # still inside the function

# outside the function

# still outside the function
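The layout above can be sketched with a small runnable example. The function name describe and its variable names are hypothetical, chosen only to show which lines belong to the function.

```python
def describe(value):
    # These indented lines are inside the function.
    # They only run when the function is called.
    message = "The value is " + str(value)
    return message

# This line is not indented, so it is outside the function and runs straight away.
text = describe(42)
print(text)
```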

Finally, if a function returns a value, this is specified using the return keyword.
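A common slip is to calculate a result inside a function but forget the return line. The sketch below compares two hypothetical functions that differ only in this respect.

```python
def square_with_return(x):
    result = x * x
    return result      # the result is handed back to the caller

def square_without_return(x):
    result = x * x     # the result is calculated, but never handed back

print(square_with_return(3))     # prints 9
print(square_without_return(3))  # prints None
```

A function without a return statement still runs, but the value it hands back is None.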

Let us look at a simple example that adds two numbers together:

def add( a, b ):

    result = a + b

    return result
Use the following code cell to define the add() function.
In [ ]:
 

Now you can run (or call) the add() function by using it in the same way as you would math.sqrt() or print(), with the appropriate arguments inside the brackets:

add(2,3)
In [ ]:
 
Check that your add function behaves how you would expect with different pairs of numbers as arguments. What happens when you call `add()` with only one number argument? What if you call `add()` with more than two arguments?
In [ ]:
 

The operation of “adding” a and b is now contained within the add() function. Once this has been defined, you never need to write this calculation again, but instead can just call the function.
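To illustrate this kind of reuse, here is a short sketch that calls add() several times. The add() definition is repeated so that the example runs on its own. The variable names total and running_total are made up for this example.

```python
def add(a, b):
    result = a + b
    return result

# The return value of one call can be passed as an argument to another call.
total = add(add(1, 2), 3)

# The same function can be called repeatedly, for example inside a loop.
running_total = 0
for number in [10, 20, 30]:
    running_total = add(running_total, number)

print(total)          # prints 6
print(running_total)  # prints 60
```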

Write a subtract() function that takes two numbers as arguments, and subtracts the second from the first. Use the assert statements in the subsequent code cell to test that your function works correctly.
In [ ]:
 
In [ ]:
# Use these assert statements to test whether your function produces correct output.

assert subtract(4,2) == 2

assert subtract(-3,-5) == 2

Both of these functions are very simple toy examples, and could be replaced with the + and - operators. So why do we want to write functions?

As you write more complex code, functions are one way to keep it organised. Functions are self-contained: any variables you create or change inside a function are separate from variables outside that function, whether in your main code or in other functions. This lets you tackle complex problems by breaking them up into a sequence of smaller steps, describing each step with a function, and then calling each function in turn. If one function is not working as it is supposed to, you know that you only need to change code inside that function, instead of having to check every line in your code. Functions also let you reuse complex pieces of code multiple times without having to rewrite them, making your code easier to read and keep organised.
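The separation between variables inside and outside a function can be seen in a small sketch. The function double() and the values used here are hypothetical.

```python
def double(x):
    # This "result" exists only inside the function.
    result = 2 * x
    return result

result = 100         # a separate variable in the main code
doubled = double(5)  # calling the function does not change the outer "result"

print(result)   # prints 100
print(doubled)  # prints 10
```

Although both variables are named result, assigning to one inside double() leaves the one in the main code unchanged.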

The quadratic equation, $y=ax^2+bx+c$, has two roots: these are the values of $x$ that give $y=0$.
The quadratic formula gives these roots in terms of the coefficients $a$, $b$, and $c$:

\begin{equation} x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}. \end{equation}
Write a quadratic_roots() function that takes three numbers, $a$, $b$, and $c$ as arguments, and calculates **both** of the roots.
You can return multiple values using, e.g. return value_1, value_2.
Use the assert statements in the subsequent code cell to test that your function works correctly.
In [ ]:
 
In [ ]:
# Use these assert statements to test whether your function produces correct output.

assert 2.0 in quadratic_roots(1,1,-6)

assert -3.0 in quadratic_roots(1,1,-6)

assert 0.5 in quadratic_roots(2,-3,1)

assert 1.0 in quadratic_roots(2,-3,1)
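The assert statements above use the in keyword because a function that returns several values hands them back together as a tuple. As a sketch of how this works, using a hypothetical min_and_max() function rather than the exercise itself:

```python
def min_and_max(numbers):
    # Two values after "return" are packed into a single tuple.
    return min(numbers), max(numbers)

# Store the whole tuple in one variable...
both = min_and_max([4, 6, 8, 2])

# ...or unpack it into two separate variables.
smallest, largest = min_and_max([4, 6, 8, 2])

print(both)               # prints (2, 8)
print(smallest, largest)  # prints 2 8
print(8 in both)          # prints True
```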

A chemical example: an equilibrium_constant() function

As a simple example, you will now write a function to calculate an equilibrium constant from the concentrations of a set of products and reactants, and then use this function to calculate a Gibbs free energy for the reaction.

For this example we will consider the equilibrium constant $K$ for a reaction

\begin{equation} \mathrm{A} + \mathrm{B} \leftrightharpoons \mathrm{C} + \mathrm{D} \end{equation}

as

\begin{equation} K = \frac{\left[\mathrm{C}\right]\left[\mathrm{D}\right]}{\left[\mathrm{A}\right]\left[\mathrm{B}\right]} \end{equation}
Create a function named equilibrium_constant that takes four arguments, corresponding to the concentrations of reagents A, B, C, and D, and returns the equilibrium constant $K$.
In [ ]:
 
In [ ]:
# Use these assert statements to check that your function works correctly.

assert equilibrium_constant( 3, 4, 5, 6 ) == 2.5

assert equilibrium_constant( 1, 3, 8, 21 ) == 56.0

Now that you have a function for calculating an equilibrium constant, you can use this to calculate a Gibbs free energy for your reaction:

\begin{equation} \Delta G = -RT \ln(K) \end{equation}
Write a function named gibbs that takes two arguments, the temperature and the equilibrium constant, and returns the Gibbs free energy change $\Delta G$. You can get the value of the molar gas constant from the scipy.constants module.
from scipy.constants import R
In [ ]:
 
In [ ]:
# Use these assert statements to check that your function works correctly.

assert gibbs( 298, 1.0 ) == 0.0

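# Compare within a small tolerance: the exact digits depend on the value of R in your scipy version.
assert abs( gibbs( 198.0, 10.0 ) - ( -3790.66 ) ) < 0.01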
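When a function returns a floating-point number, testing it with an exact == comparison can be fragile, because rounding in the last few digits (or a slightly different value of a physical constant) changes the result. One common alternative, sketched here with made-up numbers, is math.isclose() from the standard library, which compares two numbers within a small tolerance.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because 0.1 + 0.2 is not stored as exactly 0.3.
print(total == 0.3)              # prints False

# Comparison within a small relative tolerance succeeds.
print(math.isclose(total, 0.3))  # prints True
```

The tolerance can be adjusted with the rel_tol and abs_tol arguments of math.isclose().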

Combining these two functions, you can now calculate the Gibbs free energy change for an equilibrium $\mathrm{A} + \mathrm{B} \leftrightharpoons \mathrm{C} + \mathrm{D}$ using two lines of code!

K = equilibrium_constant( 1.0, 2.0, 3.0, 4.0 )

delta_G = gibbs( 298.0, K )

print( delta_G )
In [ ]:
 

Because you have tested your equilibrium_constant() and gibbs() functions when you wrote them, you can now trust these to always give you the correct results (assuming you input the correct concentrations and temperatures).