Data Analysis with Jupyter Notebooks.

Tutorial 2 — Simple Calculations

Benjamin J. Morgan, University of Bath.

Simple calculations

One of the simplest forms of “code” that can be run in code cells is mathematical expressions:

1+2+3+4
In [ ]:
 
4*5/2
In [ ]:
 
2**4 - 2
In [ ]:
 

** is the “power” operator. This code calculates $2^4 - 2$.
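
As a small illustration of how ** combines with other operators, the sketch below shows that the power is evaluated before the subtraction unless brackets say otherwise. The variable names are chosen here only for illustration; print() and variables are both introduced later in this tutorial.

```python
# ** is evaluated before -, so this is (2**4) - 2
power_first = 2**4 - 2
print(power_first)        # 14

# brackets change the order: this is 2**(4 - 2)
brackets_first = 2**(4 - 2)
print(brackets_first)     # 4

# / always gives a decimal (floating point) result
quotient = 4*5/2
print(quotient)           # 10.0
```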

Perform the following calculations in the three cells below:

$1+1+2+3+5+8$

$1\times2\times3\times4\times5\times6$

$2^{10}$
In [ ]:
 
In [ ]:
 
In [ ]:
 

Running the cell below will check your results for these three calculations.

If any of these give errors, you can go back to edit your code, and re-run to check your changes.

In [ ]:
# This cell tests your answers from the three previous code cells.

# You do not need to edit it

assert ___ == 20

assert __ == 720

assert _ == 1024
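
The test cell relies on two features. In IPython and Jupyter, the names _, __, and ___ refer to the three most recent outputs (most recent first), and assert does nothing when its condition is true but stops with an AssertionError when it is false. Below is a minimal sketch of the assert behaviour. It uses a made-up variable, total, rather than the notebook's output history, so that it also runs outside Jupyter.

```python
total = 1 + 2 + 3 + 4

# the condition is true, so this line passes silently
assert total == 10

# the condition is false, so Python raises an AssertionError;
# it is caught here only so that the example can run to the end
try:
    assert total == 11
    outcome = "passed"
except AssertionError:
    outcome = "failed"

print(outcome)   # failed
```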

Mathematical functions and modules

In mathematics, a function converts one number into another number: $y=f(x)$.

In programming, a function is more general than this, and converts an input into an output. For example, if we want to calculate a square root, we can use the sqrt() function.

sqrt(4)
In [ ]:
 

This has given us an error:

NameError: name 'sqrt' is not defined.

Python has a lot of built-in commands (functions), and solving any particular problem will only require a small subset of these. To keep Python code efficient, only a minimal set of tools is available "out of the box". Other commands (such as mathematical functions) are collected in modules, which we can load to make them available in our notebook.

sqrt lives in the math module. We can load it like this:

from math import sqrt

sqrt(4)
Edit the last code cell to import the sqrt function from the math module, and re-run it.

Or we can import the entire math module:

import math

math.sqrt(4)
Edit the previous code cell again to calculate $\sqrt 4$ by importing the entire math module.

You can think of math.sqrt() as instructing the computer to “use the sqrt() function provided inside the math module”.
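
The two import styles can be compared side by side. This is only a sketch; root_a and root_b are illustrative names that do not appear elsewhere in this tutorial.

```python
# style 1: import just the function, then call it by its short name
from math import sqrt
root_a = sqrt(4)

# style 2: import the whole module, then call the function through it
import math
root_b = math.sqrt(4)

print(root_a, root_b)   # 2.0 2.0
```

Both styles give the same result. The second makes it clear where a function comes from, which can help when several modules are in use.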

The math module contains a large set of common mathematical functions, and the constants $\pi$ and $\mathrm{e}$ (the base of the natural logarithm).

from math import pi, sin, e, log

The log function calculates the natural logarithm, i.e. $\ln x$.

In [ ]:
 
pi
In [ ]:
 
sin( pi/2 )
In [ ]:
 
e # the base of the natural logarithm
In [ ]:
 
log(e) # the natural logarithm of e
In [ ]:
 
e**2
In [ ]:
 
Calculate the following:

$\cos(2\pi)$

$\ln(2\mathrm{e})-\ln(2)$

$\log_{10}(10)$

The math function for $\log_{10}$ is log10().
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
# This cell tests your answers from the three previous code cells.

# You do not need to edit it

from math import fabs

assert fabs( ___ - 1.0 ) < 1e-10

assert fabs( __ - 1.0 ) < 1e-10

assert fabs( _ - 1.0 ) < 1e-10
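
The checks above allow a small tolerance instead of testing for exact equality, because floating point results are often very slightly different from the exact mathematical answer. A short sketch, using $\sin(\pi)$ as an assumed example (it is not one of the exercises):

```python
from math import fabs, pi, sin

value = sin(pi)   # exactly 0 in mathematics, but a tiny non-zero number here

is_exact = (value == 0.0)
is_close = fabs(value - 0.0) < 1e-10

print(is_exact)   # False
print(is_close)   # True: close enough to 0
```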

Variables

Many of the code examples we have already seen produce some information. When each code cell is executed, if that code returns a result, this is printed directly underneath the corresponding code cell, next to Out[ ]:.

72/4
In [ ]:
 
print("Nothing is returned here!")
In [ ]:
 

print() is another function, like math.sqrt() or math.log(). Instead of performing a mathematical calculation, print() just prints whatever is inside the brackets. That is exactly what happened in this example: the text is printed to the screen, but the print() function does not return a value.
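
One way to see that print() does not return a value is to store whatever it gives back and inspect it. This sketch uses assignment with =, which is introduced later in this section, and the name returned is purely illustrative.

```python
returned = print("Nothing is returned here!")   # the text still appears

# print() handed back None, Python's "no value" object
print(returned)            # None
print(returned is None)    # True
```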

print() can print more than one value if these are separated by commas:

print("72/4 =", 72/4)
In [ ]:
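
As a hedged sketch of how the commas work in print(): each value is converted to text, and the values are joined with single spaces by default. The optional sep argument changes the text placed between them. The name quotient is illustrative only.

```python
quotient = 72/4

print("72/4 =", quotient)            # 72/4 = 18.0
print("72/4", quotient, sep=" is ")  # 72/4 is 18.0
```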
 

Import statements are another example of code that does not return anything.

from math import sqrt
In [ ]:
 

If you run a code cell and nothing appears underneath, the code ran okay (and hopefully did what you expected). Any output under a cell will be either printed text, the returned result, or an error.

Programmatic data analysis is useful partly because complex procedures with many steps can be written once and then performed identically every time the code is run against a new data set. To build up more sophisticated data analysis workflows, we often want to keep the result of one step to use in a later step. Storing a result in computer memory is called assigning a variable. A variable is just a name (a sequence of letters, numbers, and underscores) that labels the stored result. To access the stored value, we use the variable name to refer to the original result.

# calculate 2 + 3

2 + 3
In [ ]:
 
# calculate 2 + 3 and store the result in the variable `my_result`

my_result = 2 + 3
In [ ]:
 

Notice there is no return value printed to Out[ ]:.

Instead a variable my_result is created, and the value returned by the calculation is stored here.

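Variable names can be nearly anything, as long as the name is not one of Python's reserved keywords (e.g. for or if). Names of built-in functions such as print should also be avoided: Python allows you to reuse them, but doing so hides the original function. Two further limitations are that names cannot begin with a number (but can contain numbers), and they cannot contain spaces. Underscores are commonly used instead of spaces to keep the code readable.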

1st_result = 3 + 4
In [ ]:
 
this result = 5 + 6
In [ ]:
 
Fix the two previous code cells to use first_result and this_result so that they will run without errors.
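
Python itself can report whether a piece of text would be accepted as a variable name. The sketch below uses the string method isidentifier() and the standard keyword module. The candidate names are the ones from the cells above, plus for, which is a word reserved by Python.

```python
import keyword

print("first_result".isidentifier())   # True
print("1st_result".isidentifier())     # False: starts with a number
print("this result".isidentifier())    # False: contains a space
print("this_result".isidentifier())    # True

# "for" has the right form, but it is reserved by Python
print("for".isidentifier())            # True
print(keyword.iskeyword("for"))        # True
```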

To check the value stored in a variable we can just type the variable name, which returns the stored value.

my_result
In [ ]:
 

Note that we only get the last value returned if we have multiple lines of code.

my_result

this_result
In [ ]:
 

We can get round this by using the print function to print out the value stored in one or more variables:

print( my_result )

print( this_result )
In [ ]:
 

Variables can be used to store raw numbers, and can then be used for calculations.

the_number_six = 6

my_result + the_number_six
In [ ]:
 

Any code that uses variables may itself return a further result, which can be assigned to a new variable, and used later (and so on).

yet_another_variable = my_result + the_number_six

print( yet_another_variable )
In [ ]:
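
The same pattern extends to longer chains of steps. As a sketch with made-up numbers (radius and height are hypothetical inputs, not data from this tutorial), each line stores a result that the next line reuses:

```python
from math import pi

radius = 2.0
height = 3.0

base_area = pi * radius**2      # step 1: area of the circular base
volume = base_area * height     # step 2: reuse base_area for a cylinder volume

print("base area =", base_area)
print("volume =", volume)
```

Changing radius or height and re-running repeats every step identically with the new values.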
 

If you refer to a variable that has not yet been created you will get an error.

print( bananas )
In [ ]:
 
Edit the code cell above so that it prints "bananas" (the quotes are part of the new code), and re-run the cell.

Create three variables, $x$, $y$, and $z$, and use them to store the numbers $5$, $6$, and $7$. Using these variables, calculate $5+6+7$ and $(5+6)\times7$.
In [ ]:
# create your variables and store the numbers 5, 6, 7

◽◽◽

◽◽◽

◽◽◽
In [ ]:
# use the variables to calculate 5+6+7

◽◽◽
In [ ]:
# use the variables to calculate (5+6)×7

◽◽◽
In [ ]:
# This cell tests your answers from the three previous code cells.

# You do not need to edit it

assert (z,y,x) == (7,6,5)

assert __ == 18

assert _ == 77