Introduction to Jupyter Notebooks#
In this tutorial, we discuss some basic tasks to get your Jupyter notebooks up and running on your computer.
Spellchecker: LanguageTool Browser Extension#
The last thing you want in your Jupyter notebooks is typos. Jupyter notebooks have some spellchecker extensions, but installing them across different software environments can be problematic. For spellchecking, we recommend a free browser extension called LanguageTool instead. As a bonus, this extension checks not only your notebooks for typos, but also anything else you type within your browser. Sweet!
How to check for Python and module versions#
Within your shell, you can issue the command:
> python --version
Sometimes Python’s executable is named “python3” instead, so you might need:
> python3 --version
Within the Jupyter notebook environment, you can issue a system command by adding an exclamation mark (“!”) in front of it, as shown below. This has the same effect as running the command in your shell:
!python3 --version
Python 3.11.9
To check the version number of a Python module, you can view its __version__ attribute as below.
import numpy as np
np.__version__
'2.0.0'
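As a complementary sketch, the same information can be read from within Python using only the standard library. The helper name installed_version is our own choice for illustration, not part of any library; it returns None when a package is not installed rather than raising an error.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Version of the Python interpreter running this notebook, as "major.minor.micro".
python_version = ".".join(str(part) for part in sys.version_info[:3])
print("Python:", python_version)

# This also works for packages that do not define a __version__ attribute.
print("numpy:", installed_version("numpy"))
```

Because this asks the interpreter that is running your notebook, it reports what the notebook kernel actually sees, which may differ from the python3 found on your shell's path.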
How to read CSV files#
import pandas as pd
Assuming that your file is under a directory called data:
df = pd.read_csv("./data/breast_cancer_wisconsin.csv")
df.head()
| | mean_radius | mean_texture | mean_perimeter | mean_area | mean_smoothness | mean_compactness | mean_concavity | mean_concave_points | mean_symmetry | mean_fractal_dimension | ... | worst_texture | worst_perimeter | worst_area | worst_smoothness | worst_compactness | worst_concavity | worst_concave_points | worst_symmetry | worst_fractal_dimension | diagnosis |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.80 | 1001.0 | 0.11840 | 0.27760 | 0.3001 | 0.14710 | 0.2419 | 0.07871 | ... | 17.33 | 184.60 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.11890 | M |
| 1 | 20.57 | 17.77 | 132.90 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | ... | 23.41 | 158.80 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.1860 | 0.2750 | 0.08902 | M |
| 2 | 19.69 | 21.25 | 130.00 | 1203.0 | 0.10960 | 0.15990 | 0.1974 | 0.12790 | 0.2069 | 0.05999 | ... | 25.53 | 152.50 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.2430 | 0.3613 | 0.08758 | M |
| 3 | 11.42 | 20.38 | 77.58 | 386.1 | 0.14250 | 0.28390 | 0.2414 | 0.10520 | 0.2597 | 0.09744 | ... | 26.50 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.17300 | M |
| 4 | 20.29 | 14.34 | 135.10 | 1297.0 | 0.10030 | 0.13280 | 0.1980 | 0.10430 | 0.1809 | 0.05883 | ... | 16.67 | 152.20 | 1575.0 | 0.1374 | 0.2050 | 0.4000 | 0.1625 | 0.2364 | 0.07678 | M |
5 rows × 31 columns
df.shape
(569, 31)
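If the file is not where you expect it, the read fails because the path does not exist. The sketch below checks the path first and lists which CSV files are actually present. The helper name find_csv is hypothetical, introduced here only for illustration, and it uses only the standard library.

```python
from pathlib import Path


def find_csv(data_dir, filename):
    """Return the path to a CSV file, or raise an error listing what is there."""
    path = Path(data_dir) / filename
    if not path.is_file():
        # Collect the CSV files that do exist to make the error message useful.
        available = sorted(p.name for p in Path(data_dir).glob("*.csv"))
        raise FileNotFoundError(
            f"{path} not found; CSV files in {str(data_dir)!r}: {available}"
        )
    return path


# Usage with the example above (commented out because it needs the data file):
# df = pd.read_csv(find_csv("data", "breast_cancer_wisconsin.csv"))
```

Relative paths such as ./data are resolved against the folder the notebook kernel was started in, so a failing read is often a sign that the notebook is running from a different folder than you assumed.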
Python Package Management#
We strongly recommend using a virtual environment to avoid module version clashes. On a Mac, the command below creates a folder called .venv, which will host your virtual environment.
> python3 -m venv .venv
Activate it so that you can use it:
> source .venv/bin/activate
When done, simply deactivate your virtual environment:
> deactivate
If you’re on Windows, the activation script is located at .venv\Scripts\activate instead; please see the Python venv documentation for the details.
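To confirm from inside a notebook that the kernel is really running in your virtual environment, one possible check is to compare sys.prefix with sys.base_prefix. This is a minimal sketch; the function name in_virtualenv is our own.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder (e.g. .venv),
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

Note that this describes the interpreter behind the notebook kernel. Activating .venv in a shell does not by itself change which interpreter an already running kernel uses.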
To get a list of all the Python modules in your current environment, try pip list:
!pip list
To upgrade pip itself:
!pip install --upgrade pip
How to install multiple packages at once:
!pip install pandas matplotlib
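One caveat: !pip runs whichever pip comes first on your shell's path, which is not necessarily the one that belongs to the interpreter behind your notebook. A hedged sketch of one workaround is to call pip through sys.executable. The helper names pip_command and run_pip are hypothetical, chosen here for illustration.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command line that uses the current interpreter's pip."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip for the current interpreter and raise an error if it fails."""
    subprocess.run(pip_command(*args), check=True)


# Show the equivalent of "!pip install pandas matplotlib" without running it.
print(" ".join(pip_command("install", "pandas", "matplotlib")))
# run_pip("install", "pandas", "matplotlib")  # uncomment to actually install
```

Recent versions of IPython also provide a %pip magic (for example, %pip install pandas matplotlib) that is meant to install into the environment of the running kernel.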
The pipreqs Module#
In many cases, you will need to compile a list of the modules your project depends on for documentation, usually in a text file called requirements.txt.
We recommend the pipreqs module for this purpose, which is usually better than the common practice of pip freeze > requirements.txt. In particular, pipreqs builds its list from the imports it finds in your project's code rather than from everything installed, so it avoids listing Jupyter notebook modules. This is what you want, as you won’t need these modules if you just want to run the code elsewhere without Jupyter notebooks.
Simply install it via
> pip install pipreqs
Next, save the list of modules your project uses to requirements.txt via
> pipreqs . --force
The --force option above overwrites any existing requirements.txt file. The dot means “this folder”. That is, you will need to run this command where your virtual environment folder .venv is, so that pipreqs picks up the correct modules.
When you need to replicate your Python environment on a different machine, create a new virtual environment and install all the modules in your requirements.txt file as below:
> pip install -r requirements.txt
This way, all the modules you need for your project will be installed with the correct version numbers.
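To double-check the result, the sketch below compares pinned entries of the form name==version against what is actually installed, using only the standard library. It assumes simple name==version lines and skips anything else; the function name check_requirements is our own.

```python
from importlib.metadata import PackageNotFoundError, version


def check_requirements(lines):
    """Compare pinned 'name==version' lines against what is installed.

    Returns a list of (name, wanted, installed) tuples, one per mismatch;
    installed is None when the package is missing.
    """
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name, wanted = (part.strip() for part in line.split("==", 1))
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            problems.append((name, wanted, installed))
    return problems


# Usage, assuming requirements.txt is in the current folder:
# with open("requirements.txt") as f:
#     print(check_requirements(f))
```

An empty list means every pinned module is installed at the requested version.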