The Multilingual Quantitative Biologist

This repository contains the source code for The Multilingual Quantitative Biologist online book.

All code in this project was written in and tested with R 3.xx and Python 3.xx, though older language versions (including R 2.xx and Python 2.7) should work in most cases.

Notes on ScienceData integration
The text below and all the material in this directory were copied from the Multilingual Quantitative Biologist GitHub repository.

The book is best read in its HTML version via the link below, but it can also be read here. Note that some of the book's formatting is lost here. The notebooks from which the book is built are listed below, so that you can conveniently import them to your ScienceData account, from where you can run them directly in the browser.

Some, but not all, of the notebooks have been adapted to use ScienceData as the data store instead of the local filesystem. This is important because the notebooks run in ephemeral pods without persistent storage. Output files you want to keep must be copied to ScienceData (or elsewhere) before you shut down or destroy your pod. We've tried to make this as seamless as possible. Examples can be found in the notebooks 05-Python_I.ipynb, 08-Data_R.ipynb and Appendix-Data-Python.ipynb.

In general, the procedure is to read and write files to a folder, say tmp, in your home directory on your ScienceData home server, via HTTP. The endpoint for this is simply https://sciencedata/files/tmp. See "Utilities/python_file_management.ipynb" and "Utilities/sddk.ipynb" for details.
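
The read/write procedure described above can be sketched with the Python standard library alone. This is a minimal sketch, not the method used in the notebooks: it assumes the endpoint accepts a plain HTTP PUT for writing and a GET for reading, which the text here does not confirm, and the helper names (file_url, upload_request, download_request, send) are hypothetical. The notebooks under Utilities remain the reference.

```python
import urllib.parse
import urllib.request

# Endpoint quoted in the text above; "tmp" is the example folder name.
BASE_URL = "https://sciencedata/files/tmp"


def file_url(filename, base_url=BASE_URL):
    """Build the URL of a file inside the remote folder (name is URL-quoted)."""
    return base_url.rstrip("/") + "/" + urllib.parse.quote(filename)


def upload_request(filename, data):
    """Prepare (but do not send) a PUT request carrying `data` as bytes.

    Assumption: the server accepts PUT for writing files.
    """
    return urllib.request.Request(file_url(filename), data=data, method="PUT")


def download_request(filename):
    """Prepare (but do not send) a GET request for a remote file."""
    return urllib.request.Request(file_url(filename), method="GET")


def send(request):
    """Send a prepared request and return the response body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


# Example use from inside a pod (not executed here):
#   send(upload_request("results.csv", open("results.csv", "rb").read()))
#   content = send(download_request("results.csv"))
```

Separating request construction from sending makes it possible to inspect the URL and method before anything touches the network, which is handy when adapting the sketch to the real service.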

Usage

Building the book

If you'd like to develop and build the Multilingual Quantitative Biologist book, you should:

  • Clone this repository
  • cd to it and run pip install -r requirements.txt (it is recommended you do this within a virtual environment)
  • (Recommended) Remove the existing The Multilingual Quantitative Biologist/_build/ directory
  • Run jupyter-book build ... (more on this below)

A fully-rendered HTML version of the book will be built in _build/html/.

Hosting/Deploying the book online

The html version of the book is hosted on the gh-pages branch of this repo, which is then rendered on https://mhasoba.github.io/TheMulQuaBio/.

To deal with non-standard kernel dependencies (bash, R), we do not use a GitHub Actions workflow to automatically build and push the book to this branch on a push or pull request to master (something to look into in the future).

However, the workflow for building the book manually is not too onerous:

  • Make changes to the book's content on the master branch of this repository
  • Re-build the book with jupyter-book build content (assumes you are running the command from root directory of the repository)
  • Ensure that HTML has been built for each page of your book. There should be a collection of HTML files in the content/_build/html folder. Also load _build/html/notebooks/index.html and check that the book has been built (with navigation etc.) as expected.
  • Run ghp-import -n -p -f content/_build/html to deploy the book.

The last command will automatically push the latest build to the gh-pages branch. More information on this hosting process can be found here.

Typically after a few minutes the site should be viewable online at https://mhasoba.github.io/TheMulQuaBio. If not, check repository settings under Options -> GitHub Pages to ensure that the gh-pages branch is configured as the build source for GitHub Pages.
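
The manual workflow above (build, check the output, deploy) can be wrapped in a short script. The following is a minimal sketch under stated assumptions: the function names built_pages and build_and_deploy are hypothetical, the commands and paths are the ones quoted in the steps above, and the script is assumed to be run from the root directory of the repository. The check only confirms that HTML files exist; it does not replace opening the built book and inspecting the navigation.

```python
import subprocess
from pathlib import Path


def built_pages(html_dir="content/_build/html"):
    """Return the sorted relative paths of all built HTML pages.

    Raises if the build directory is missing or contains no HTML files.
    """
    root = Path(html_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"No build directory at {root}")
    pages = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.html"))
    if not pages:
        raise RuntimeError(f"No HTML pages found under {root}")
    return pages


def build_and_deploy(html_dir="content/_build/html"):
    """Re-build the book, verify that pages were produced, then deploy them."""
    # Same build command as in the workflow steps above.
    subprocess.run(["jupyter-book", "build", "content"], check=True)
    pages = built_pages(html_dir)
    print(f"{len(pages)} HTML pages built")
    # Same deploy command as in the workflow steps above.
    subprocess.run(["ghp-import", "-n", "-p", "-f", html_dir], check=True)
```

Calling build_and_deploy() stops before the deploy step if either the build command fails or no HTML pages are found, so a broken build is not pushed to the gh-pages branch.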

An example command to push all new changes to the git repository is:

git add -A && git commit -m "Commit message" && git push -u origin master

Please do not push changes for every little edit you make to the book (e.g., after fixing some typos). Push only significant changes. Remember, you can deploy the book (by pushing to the gh-pages branch using ghp-import as explained above) without pushing changes to the master branch.

Other tips:

  • Read the jupyter book; it is short and to the point and addresses all of the key tools and guidelines succinctly

  • In particular, if you want to remove a particular cell from the rendered book see this

  • If you want to remove a cell when exporting a Jupyter notebook (outside of Jupyter Book), say as html, add

    {
      "tags": [
        "remove_cell"
      ]
    }
    

    to the metadata of the cells that you want to remove from the output, and then run:

    jupyter nbconvert yournotebookname.ipynb --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags="['remove_cell']" --to html
    

    (html can be replaced with another export format, such as pdf). Read the nbconvert documentation for more info on exporting/converting.

Contributors

We welcome and recognize all contributions. You can see a list of current contributors in the contributors tab.

Also, note that:

  • The master branch of this repository is protected, so even users with write (push) access need to push changes on a branch and make a pull request (also, see this). New commits to a non-master branch after a pull request has been made will result in any pull requests from that non-master branch being discarded. Please read this for good practices for branching (and merging).
  • The solutions to the exercises in this book are in a separate, private git repo that students do not have access to. Ask Samraat (mhasoba@gmail.com) if you need access to that repository. Students will be provided the solutions when the time comes.
  • The results directory in content is populated when scripts are run, but its contents are not version controlled (all files in this directory are listed in .gitignore).

Credits

This project was created using the excellent open source Jupyter Book project, and was initiated using the executablebooks/cookiecutter-jupyter-book template.

Some chapter-specific credits:

  • The computing sections were originally inspired by, and many of the materials are based on Stefano Allesina's excellent
  • Most of the sections on Data Analysis and Basic Statistics were originally written by David Orme (d.orme@imperial.ac.uk).