The Jupyter Notebook offers an interactive interface to multiple programming languages (including Python and R) that can be viewed and manipulated in web browsers.
Jupyter notebooks can run code, and also store the code and a current snapshot of the output (graphics and text), together with notes (in markdown), in an editable document called a notebook (like this one you are currently reading).
Thus a Jupyter notebook is similar to a Mathematica or Maple notebook. The difference is that Jupyter is free and improving rapidly and, unlike Mathematica or Maple notebooks, it can support almost 100 programming languages (more on this below). Furthermore, Jupyter notebooks are saved on disk as JSON files (with a .ipynb extension), so they are stored in a fully open format and can be kept under version control.
Jupyter is a great tool for "literate programming", a software development style pioneered by (yes, again!) Donald Knuth. In his words:
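To make the "notebooks are just JSON" point concrete, here is a minimal sketch using only Python's standard `json` module. The notebook content below is hand-written for illustration (an assumption, not a file from this course), following the general layout of the version 4 notebook format; a real .ipynb file written by Jupyter will carry more metadata.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My notes"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# This string is what would be written to a .ipynb file on disk.
ipynb_text = json.dumps(notebook, indent=1)

# Reading it back needs nothing more than a JSON parser.
loaded = json.loads(ipynb_text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Because the file is plain text, tools such as git can track changes to it like any other source file.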
Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
In literate programming, human-friendly text is as important as code, making coding accessible to a larger proportion of people.
Just as importantly, literate programming helps address the generally poor reproducibility of scientific research by making your code and your methods more reproducible and reusable.
With Jupyter notebooks, you can explore quantitative ideas by combining snippets of computer code and human text. The main uses of Jupyter notebooks are:
Jupyter Notebooks are increasingly being used as a research tool in academia, government (e.g., NASA), and industry (e.g., IBM, Facebook, Microsoft and JP Morgan).
The Jupyter Notebook:
Brian Granger and Fernando Pérez created IPython (Interactive Python) as an implementation of Python for literate programming, taking advantage of improved web browser technologies (e.g., HTML5). IPython notebooks quickly gained popularity (they were a great alternative to Mathematica and Maple as well), and in 2013 the IPython team won a Sloan Foundation Grant to accelerate development of this technology. The IPython Notebook concept was then expanded to allow for additional programming languages, which became Project Jupyter (why "Jupyter"? See this). IPython itself is now focused on the interactive Python project per se, part of which is providing a Python kernel for Jupyter.
Your Jupyter notebooks will typically open in IPython as well, but you will not have access to Jupyter-only features (of course!).
Please see project Jupyter's instructions for installation. Also, see this.
See this for instructions for installing the Python and R kernels. Or you can go through here (and install additional language kernels if you want).
Again, see this for installing Jupyter extensions.
On Linux / Mac, Jupyter will start in the directory that you launch it from through the terminal. So in your terminal, first cd to your weekly coursework's code directory. Then simply launch Jupyter (Linux / Mac):
jupyter notebook
This will initialize a new server instance (watch your active terminal) and launch a web browser with the Jupyter interface, including a menu at the top.
Play around a bit. For starters,
$\star$ Create a new nb called MyFirstJupyterNb.ipynb using the menu in your code directory.
Once you have created the new nb, you can launch it next time by cd'ing to its location and then launching it by name:
jupyter notebook MyFirstJupyterNb.ipynb
tmux
You may notice that Jupyter locks up the terminal you launch it from until it is closed. This can be particularly frustrating, but there is an elegant solution. Enter tmux!
tmux is a "terminal multiplexer", allowing you to run terminal sessions independent of (or "detached" from) a terminal window. You can install tmux using apt:
sudo apt install tmux
To run Jupyter in a tmux session, start tmux using the command tmux. From here you can use the tmux terminal as you would your usual terminal session (so launch jupyter notebook as before); however, if at any time you decide that you want to do something else, you can detach.
To detach your terminal window from the session, use the shortcut ctrl+b, d (press ctrl+b, release it, then hit d). This returns you to the normal terminal, while your other session keeps running in the background.
To see which sessions are currently running, use
tmux ls
And to attach to the session:
tmux a
Note: you can attach to a session from a different terminal window or even terminal program!
To quit tmux entirely, just quit the terminal session with either ctrl+d or exit.
Why tmux?
While this is mostly just a quality-of-life suggestion, countless are the times that I have accidentally closed my terminal only to find that my Jupyter session has also died. By decoupling the window from the terminal session, this no longer happens.
Additionally, tmux is absolutely invaluable when running long jobs on a remote server. In this case you can simply disconnect from the tmux session on the remote server. Then you can log in later (and/or from a different computer) and reattach to the interactive session (using tmux a) to check progress. In combination with good logging for long jobs, this can make your life infinitely easier.
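As a rough illustration of what "good logging" for a long job might look like, here is a minimal sketch using Python's standard `logging` module. The logger name, the log file location and the `run_job` function are all hypothetical stand-ins; the idea is only that progress goes to a file you can inspect after reattaching to the session.

```python
import logging
import tempfile
from pathlib import Path

# Hypothetical log location; in practice you would pick a path in your project.
log_path = Path(tempfile.gettempdir()) / "long_job.log"

logger = logging.getLogger("long_job")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path, mode="w")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)


def run_job(n_steps):
    """Stand-in for a long computation that reports progress as it goes."""
    total = 0
    for step in range(1, n_steps + 1):
        total += step  # placeholder for the real work
        logger.info("finished step %d of %d", step, n_steps)
    return total


result = run_job(3)
handler.flush()
```

After reattaching, the timestamps in the log file show how far the job has progressed.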
The two main elements (and the associated syntax) you need to know to modify the content of a Jupyter nb are as follows. These will be entered into either text-only cells ("markdown cells") or code cells. To insert a new cell, click on the edge of an existing cell and hit A or B on your keyboard to insert a cell above or below, respectively. By default, Jupyter inserts a code cell. To convert it to a text cell, hit M immediately after inserting the cell; if you have clicked elsewhere in the meantime, click on the edge of the new cell and then hit M. You can now enter and modify either text or code:
Text is formatted using markdown and standard HTML. The markdown cheatsheet for GitHub is very handy. Give it a quick spin to get started with markdown.
You can insert inline or standalone equations by using standard $\LaTeX$ environments (and you can access more $\LaTeX$ functionality and environments using the LaTeX_envs extension). For example, how about the ecologist's beloved logistic growth equation:
$$
\frac{dN}{dt} = rN \left( 1-\frac{N}{K}\right)
$$
which will render as:
$$\frac{dN}{dt} = rN \left( 1-\frac{N}{K}\right)$$
Depending on your kernel, you will type the appropriate programming language code in these cells.
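For instance, with a Python kernel, a code cell could explore the logistic growth equation above numerically. The following is only a sketch: the parameter values, the starting population and the simple forward Euler scheme are illustrative assumptions, not part of the original material.

```python
def logistic_rate(N, r, K):
    """Right-hand side of dN/dt = r N (1 - N/K)."""
    return r * N * (1 - N / K)


# Illustrative values (assumptions): growth rate, carrying capacity, start size.
r, K = 1.0, 100.0
N = 10.0
dt = 0.01  # time step

# Forward Euler integration from t = 0 to t = 20 (2000 steps of size dt).
for _ in range(2000):
    N += dt * logistic_rate(N, r, K)

print(round(N, 2))
```

With these values the population should level off very close to the carrying capacity K.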
A full list of keyboard shortcuts can be displayed by clicking outside the current cell and hitting "h". The following are important keyboard shortcuts:
| Key | Command |
|---|---|
| A or B | insert cell above or below current cell, respectively |
| M | convert code or raw cell to text cell |
| Y | convert raw or text cell to code cell |
| R | convert code or text cell to raw cell |
| CTRL+ENTER | render (text) or evaluate (code) cell |
| D, D | delete cell |
| I, I | interrupt a running kernel (evaluation of a code cell) |
| X | cut selected cell(s) |
| SHIFT+V | insert cut cell(s) above selected cell |
| V | insert cut cell(s) below selected cell |
| SHIFT+M | merge selected cells, or current cell with cell below if only one cell selected |
For a full list of keyboard shortcuts, or to customize them, click anywhere outside your current cell and hit the H key. Or click on the "Help" menu item at the top.
Let's try running code in a code cell. Python first. Make sure that your kernel is Python using the dropdown Jupyter nb menu. Then try the following:
a = "this is python!"; print(a)
You can also run a script file that is saved on your computer by using the IPython %run command; basically, you have a fully functional IPython environment inside each code cell.
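In a notebook cell you would simply type `%run` followed by the path to your script. Since `%run` is an IPython magic rather than plain Python, the sketch below uses the standard library's `runpy.run_path` as a rough analogue to show the idea: a script file is executed and the names it defines become available afterwards. The script name and its contents are hypothetical.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny throwaway script (hypothetical name and content).
    script = Path(tmp) / "my_script.py"
    script.write_text("greeting = 'hello from a script'\n")

    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(script))

print(namespace["greeting"])
```

With `%run`, the variables defined by the script land directly in the notebook's namespace instead of in a returned dictionary.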
Now R. Switch the kernel to R by using the Kernel menu along the top of the Jupyter window.
Then try:
a <- "this is R!"; cat(a)
Plotting is straightforward. Again, let's try Python first. Switch back to the Python kernel. Then try:
import matplotlib.pyplot as p
import numpy as np  # NumPy provides arange and sin directly

x = np.arange(0, 5, 0.1); y = np.sin(x)  # x from 0 up to (not including) 5
p.plot(x, y); p.show()
Now let's try R. Switch to the R kernel, and try:
require(ggplot2)
library(repr) # to resize plot within jupyter - this package is part of IRKernel
options(repr.plot.width=3.3,repr.plot.height=2.5)
x <- seq(0, 5, 0.1); y <- sin(x)
qplot(x, y, geom = "line") # large figure
Exiting is easy. Just press CTRL+S (to save) and close the browser. The server will still be active in the bash terminal. You may then relaunch by typing the local URL (e.g., http://localhost:8888/notebooks/MyJupyterNbName.ipynb) in your web browser window, or by stopping the server with CTRL+C and then relaunching as you originally did.
That's the end of our intro to Jupyter.
Play around a bit. You can also try out the Data in Jupyter and Maths in Jupyter tutorials.