Introduction to Jupyter Notebooks#

In this tutorial, we discuss some basic tasks to get your Jupyter notebooks up and running on your computer.

Spellchecker: LanguageTool Browser Extension#

The last thing you want in your Jupyter notebooks is typos. Jupyter notebooks have some spellchecker extensions, but installing them across different software environments can be problematic. For spellchecking, we instead recommend a free browser extension called LanguageTool. This extension checks for typos not only in your notebooks, but also in anything else you type within your browser as an extra bonus. Sweet!

How to check for Python and module versions#

Within your shell, you can issue the command:

> python --version

Sometimes Python’s executable is named “python3” instead, so you might need:

> python3 --version

Within the Jupyter notebook environment, to issue a system command, you will need to put an exclamation mark (“!”) in front of it as shown below, which will have the same effect:

!python3 --version
Python 3.11.9

To check the version number of a Python module, you can view its __version__ attribute as below.

import numpy as np
np.__version__
'2.0.0'
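
Not every module defines a __version__ attribute, so it can help to have a fallback. The sketch below is one possible approach using only the standard library (Python 3.8 or later); the helper names python_version_string and installed_version are our own and are not part of any library.

```python
import sys
from importlib import metadata


def python_version_string():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("Python:", python_version_string())
# "numpy" matches the example above; any installed package name works here.
print("numpy:", installed_version("numpy"))
```

Because installed_version reads the package metadata rather than importing the module, it also works for packages that are slow to import.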

How to read CSV files#

import pandas as pd

Assuming that your file is under a directory called data:

df = pd.read_csv("./data/breast_cancer_wisconsin.csv")
df.head()
mean_radius mean_texture mean_perimeter mean_area mean_smoothness mean_compactness mean_concavity mean_concave_points mean_symmetry mean_fractal_dimension ... worst_texture worst_perimeter worst_area worst_smoothness worst_compactness worst_concavity worst_concave_points worst_symmetry worst_fractal_dimension diagnosis
0 17.99 10.38 122.80 1001.0 0.11840 0.27760 0.3001 0.14710 0.2419 0.07871 ... 17.33 184.60 2019.0 0.1622 0.6656 0.7119 0.2654 0.4601 0.11890 M
1 20.57 17.77 132.90 1326.0 0.08474 0.07864 0.0869 0.07017 0.1812 0.05667 ... 23.41 158.80 1956.0 0.1238 0.1866 0.2416 0.1860 0.2750 0.08902 M
2 19.69 21.25 130.00 1203.0 0.10960 0.15990 0.1974 0.12790 0.2069 0.05999 ... 25.53 152.50 1709.0 0.1444 0.4245 0.4504 0.2430 0.3613 0.08758 M
3 11.42 20.38 77.58 386.1 0.14250 0.28390 0.2414 0.10520 0.2597 0.09744 ... 26.50 98.87 567.7 0.2098 0.8663 0.6869 0.2575 0.6638 0.17300 M
4 20.29 14.34 135.10 1297.0 0.10030 0.13280 0.1980 0.10430 0.1809 0.05883 ... 16.67 152.20 1575.0 0.1374 0.2050 0.4000 0.1625 0.2364 0.07678 M

5 rows × 31 columns

df.shape
(569, 31)
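
A relative path such as "./data/..." is resolved against the notebook's current working directory, which is a common source of "file not found" errors. The sketch below shows one way to check the location before reading the file. The helper resolve_data_file and its parameters are hypothetical names introduced for illustration only.

```python
from pathlib import Path


def resolve_data_file(filename, data_dir="data", base=None):
    """Return the absolute path to data_dir/filename, or raise if it is missing.

    `base` defaults to the current working directory, which is what a
    relative path such as "./data/..." is resolved against.
    """
    base_dir = Path(base) if base is not None else Path.cwd()
    path = (base_dir / data_dir / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Expected a CSV file at {path}")
    return path


# Intended use in a notebook cell (requires pandas and the CSV file from above):
#   df = pd.read_csv(resolve_data_file("breast_cancer_wisconsin.csv"))
print("Working directory:", Path.cwd())
```

If the error message points somewhere unexpected, the notebook was probably started from a different folder than the one that contains the data directory.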

Python Package Management#

We strongly recommend creating a virtual environment to avoid module version clashes. On a Mac:

The command below will create a folder called .venv which will host your virtual environment.

> python3 -m venv .venv

Activate it so that you can use it:

> source .venv/bin/activate

When done, simply deactivate your virtual environment:

> deactivate

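If you’re on Windows, the activation script lives under .venv\Scripts rather than .venv/bin (for example, .venv\Scripts\activate in Command Prompt); please look up the Python venv documentation for the details.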
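
To confirm from inside a notebook that the kernel is actually running in your virtual environment, you can compare two attributes of the sys module. This is a minimal sketch that assumes the environment was created with venv as above; the function name in_virtualenv is our own.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If this prints False in a notebook, the kernel is most likely using a different Python installation than the one you activated in your shell.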

To get a list of all the Python modules in your current environment, try pip list:

!pip list

To upgrade pip itself:

!pip install --upgrade pip

How to install multiple packages at once:

!pip install pandas matplotlib
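
One caveat: the pip that "!pip" finds on your system path is not guaranteed to belong to the same Python that runs your notebook kernel. A possible workaround, sketched below, is to call pip through the kernel's own interpreter. The helpers pip_command and run_pip are hypothetical names for illustration.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command line that runs under the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip with the given arguments and return its standard output as text."""
    result = subprocess.run(
        pip_command(*args), capture_output=True, text=True, check=True
    )
    return result.stdout


# Show the command instead of running it, so nothing is installed by accident.
print(pip_command("install", "pandas", "matplotlib"))
```

Calling print(run_pip("list")) would then mirror the pip list cell above, but for the kernel's own environment.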

The pipreqs Module#

In many cases, you will need to document the modules your project depends on, usually in a text file called requirements.txt.

We recommend the pipreqs module for this purpose, which is usually better than the common practice of pip freeze > requirements.txt. Unlike pip freeze, which lists everything installed in the environment, pipreqs builds its list from the import statements in your project's code. In particular, pipreqs will avoid listing the Jupyter notebook modules, which you won’t need if you just want to run the code elsewhere without Jupyter notebooks.

Simply install it via

> pip install pipreqs 

Next, save the list of modules your project uses to requirements.txt via

> pipreqs . --force

The --force option above overwrites any existing requirements.txt file. The dot means “this folder”. That is, you will need to run this command from your project folder, where your code and your virtual environment folder .venv are, so that pipreqs picks up the correct modules.

When you need to replicate your Python environment on a different machine, create a new virtual environment and install all the modules in your requirements.txt file as below:

> pip install -r requirements.txt

This way, all the modules you need for your project will be installed with the correct version numbers.
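
To double-check that the installed versions really match the file, you could compare the two from Python. The sketch below is a simplified illustration, not a full requirements parser: it only understands exact pins of the form name==version, and both function names are our own.

```python
from importlib import metadata


def parse_pinned_requirements(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # this sketch only understands exact pins
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (wanted, installed)} for pins that differ from what is installed."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Hypothetical file contents, for illustration only.
example = "numpy==2.0.0\n# a comment\n\npandas\n"
print(parse_pinned_requirements(example))
```

With a real file, you would pass the text of requirements.txt to parse_pinned_requirements and then call find_mismatches on the result; an empty dictionary means every pinned module is installed at the expected version.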