
The Jupyter software suite is a powerful tool for pre-production data analysis and machine learning prototyping. However, when developing a software solution to real-world problems, I seek to apply good Python development practices from the get-go: one of these standards is setting up a proper development environment that ensures isolation, reproducibility, and frictionless CI/CD integration down the road.
That means I want to work within isolated Python environments, even in my Jupyter notebook / lab sandbox.
In my experience mentoring data science trainees through various machine learning and deep learning techniques and use cases, each requiring a dedicated Python environment, I have sometimes faced difficulties exposing my Python environments to Jupyter.
Below I explain the logic working under the hood, and provide a recipe to avoid common pitfalls.
Jupyter under the hood: the IPython and ipykernel packages
Jupyter is an umbrella term referring to a broad software suite which has been developed by the Jupyter community since 2014 [1]. It comprises a variety of tools for writing computer code, embedding documentation, building beautiful data visualizations and interacting with software components.
In the context of developing or running Python code, the main component doing the heavy lifting is IPython.
When starting a Jupyter client (e.g., a Jupyter notebook or the JupyterLab IDE), IPython sets up a connection [2] between this client and a Python kernel, essentially decoupling the client, which sends Python commands and consumes their results, from the Python kernel, which executes the code. This allows, for instance, several clients to be connected to the same underlying Python process (e.g., with jupyter console).
Setting up an optimized Jupyter ecosystem thus consists of installing Jupyter globally, i.e., in a user- or system-wide Python environment, and declaring a kernel in each of the environments one wishes to access from within a Jupyter component.
Python kernels are created via the ipykernel module:
python -m ipykernel install --user --name <KERNEL_NAME>
This command creates a JSON file holding the configuration of the kernel named <KERNEL_NAME>.
One can access the list of kernels via the jupyter CLI:
jupyter kernelspec list
which prints a list of kernels and the location of their configuration files. On my Mac, it looks like this:
Available kernels:
dl /Users/benroland/Library/Jupyter/kernels/dl
hf /Users/benroland/Library/Jupyter/kernels/hf
langchain /Users/benroland/Library/Jupyter/kernels/langchain
latentspaceexplorer /Users/benroland/Library/Jupyter/kernels/latentspaceexplorer
statsplotly /Users/benroland/Library/Jupyter/kernels/statsplotly
python3 /Users/benroland/.pyenv/versions/3.11.5/share/jupyter/kernels/python3
In the list above, we can see I have created several kernels whose configurations are stored in my user-scoped /Users/benroland/Library/ folder. python3 is the default kernel pointing to my user-wide pyenv-managed Python environment.
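To make the layout concrete, here is a minimal sketch that walks one kernels folder and reports every sub-folder holding a kernel.json. The function name list_kernelspecs is hypothetical, and the sketch only scans the single folder it is given, whereas jupyter kernelspec list looks in several locations (user, environment, and system-wide).

```python
from pathlib import Path


def list_kernelspecs(kernels_dir: Path) -> dict[str, Path]:
    """Map kernel names to their folders for every sub-folder holding a kernel.json."""
    if not kernels_dir.is_dir():
        return {}
    return {
        child.name: child
        for child in sorted(kernels_dir.iterdir())
        if (child / "kernel.json").is_file()
    }


if __name__ == "__main__":
    # User-scoped kernels folder on macOS, as in the listing above.
    # Assumption: adjust this path for your own operating system.
    user_kernels = Path.home() / "Library" / "Jupyter" / "kernels"
    for name, folder in list_kernelspecs(user_kernels).items():
        print(f"{name:<22}{folder}")
```

A folder without a kernel.json is ignored, which mirrors the fact that the kernel name is simply the name of the folder containing the configuration file.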
Kernel configurations are specified in a kernel.json file, which can be read from the terminal:
nano /Users/benroland/Library/Jupyter/kernels/statsplotly/kernel.json
The file looks like this:
{
  "argv": [
    "/Users/benroland/python_projects/statsplotly/.venv/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "statsplotly",
  "language": "python",
  "metadata": {
    "debugger": true
  }
}
In this example, the statsplotly kernel is launched by calling the ipykernel_launcher module from the virtual environment. This ensures that all libraries installed in this environment are available when loading the kernel.
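As a rough illustration of how such a file turns into a process, the sketch below reads a kernel.json and fills the {connection_file} placeholder, yielding the command line that would start the kernel. The helper name kernel_launch_command is hypothetical, and this is a simplification of what Jupyter does at launch, not its actual implementation.

```python
import json
from pathlib import Path


def kernel_launch_command(kernel_json: Path, connection_file: str) -> list[str]:
    """Return the argv of a kernel spec with the {connection_file} placeholder filled in."""
    spec = json.loads(kernel_json.read_text(encoding="utf-8"))
    # The first item of argv is the interpreter: it decides which environment
    # (and therefore which installed libraries) the kernel will see.
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]
```

The first element of the returned list is the one worth checking: if it points to the virtual environment's interpreter, the kernel runs with that environment's packages.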
Setting up a Jupyter kernel in a Python virtual environment
From a Poetry-managed environment
To manage my Python environments, I use Poetry, which I argue [3] is the go-to solution for Python development environments.
Setting up a Jupyter kernel within a Poetry-managed virtual environment is as easy as:
# Add ipykernel to your Poetry environment
poetry add ipykernel
If the solution won’t run inside Jupyter in production, it is even better to restrict that dependency to the “dev” group requirements:
# Add ipykernel to the "dev" requirements group
poetry add ipykernel --group dev
Then, to install the kernel:
# Install the kernel
poetry run python -m ipykernel install --user --name=<KERNEL_NAME>
This will make the Poetry environment available as a Jupyter kernel.
From a classic venv-managed environment
If one prefers sticking with Python's native venv module:
# Assuming the virtual env resides in a .venv directory
source .venv/bin/activate # Activate the environment
python -m ipykernel install --user --name=<KERNEL_NAME>
or equivalently:
# Call the ipykernel from the venv directly
.venv/bin/python -m ipykernel install --user --name=<KERNEL_NAME>
This will make the virtual environment available as a Jupyter kernel.
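The variants above differ only in which interpreter runs the ipykernel module. A small sketch makes this explicit by building the command from an interpreter path. The helper names are hypothetical; the flags are the ones used in the commands above.

```python
import subprocess
from pathlib import Path


def ipykernel_install_command(python_executable: Path, kernel_name: str) -> list[str]:
    """Build the kernel installation command for a given interpreter."""
    return [
        str(python_executable),
        "-m",
        "ipykernel",
        "install",
        "--user",
        f"--name={kernel_name}",
    ]


def install_kernel(python_executable: Path, kernel_name: str) -> None:
    """Run the installation; raises CalledProcessError if the command fails."""
    subprocess.run(ipykernel_install_command(python_executable, kernel_name), check=True)
```

For instance, install_kernel(Path(".venv/bin/python"), "mltoolbox") would register the environment living in .venv, provided ipykernel is installed there.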
Avoid being tricked by IPython
Running:
python -m ipykernel install ...
is equivalent to running:
ipython kernel install ...
Be careful, though! When ipython is called from a virtual environment where it is not installed, the command falls back to the user-level IPython installation and its Python executable, and hence configures a kernel pointing to the global Python environment.
This is illustrated by the command below:
# Calling IPython from an environment where it is not installed
poetry run ipython
which opens an IPython session in our console, printing:
/Users/benroland/.pyenv/versions/3.11.5/lib/python3.11/site-packages/IPython/core/interactiveshell.py:913: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
warn(
Python 3.11.5 (main, Nov 3 2023, 11:46:15) [Clang 15.0.0 (clang-1500.0.40.1)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.17.2 -- An enhanced Interactive Python. Type '?' for help.
IPython actually detects our intent and issues a warning about this situation.
Calling in our IPython console:
In [3]: import sys
...: sys.executable
confirms we are using the user-wide Python executable:
Out[3]: '/Users/benroland/.pyenv/versions/3.11.5/bin/python3.11'
Thus, using this executable to set up our kernel:
poetry run ipython kernel install --user --name=mltoolbox
will create the following kernel.json configuration file:
{
  "argv": [
    "/Users/benroland/.pyenv/versions/3.11.5/bin/python3.11",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "mltoolbox",
  "language": "python",
  "metadata": {
    "debugger": true
  }
}
unbeknownst to us, until we encounter a ModuleNotFoundError when trying to import the relevant libraries in a Jupyter notebook.
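A quick way to catch this situation before hitting an import error is to ask the running kernel which interpreter it uses. The sketch below is meant to be pasted into a notebook cell. The function name interpreter_report is hypothetical, and the virtual-environment test assumes an environment created with venv or a recent virtualenv, where sys.prefix differs from sys.base_prefix.

```python
import sys


def interpreter_report() -> dict[str, object]:
    """Describe the interpreter running this code (e.g. the current notebook kernel)."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix points to the environment folder,
        # while sys.base_prefix still points to the base installation.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


print(interpreter_report())
```

If the reported executable is the user-wide interpreter rather than the one in the project's virtual environment, the kernel was registered from the wrong Python.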
My recommendation is thus to call the ipykernel module directly, for example to create a kernel named mltoolbox:
poetry run python -m ipykernel install --user --name=mltoolbox
which, if ipykernel is missing from the environment, will raise:
/Users/benroland/python_projects/mltoolbox/.venv/bin/python: No module named ipykernel
early on, encouraging us to install ipykernel in our virtual environment.
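The same fail-early check can be scripted, for example before registering kernels for several projects in a row. This is a minimal sketch: the helper name has_module is hypothetical, and it simply asks the target interpreter to import the module in a subprocess.

```python
import subprocess
import sys


def has_module(python_executable: str, module: str) -> bool:
    """Return True if `module` can be imported by the given interpreter."""
    result = subprocess.run(
        [python_executable, "-c", f"import {module}"],
        capture_output=True,  # keep the traceback of a failed import off the console
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Check the interpreter running this script; pass the path of a
    # virtual environment's python to check that environment instead.
    print(has_module(sys.executable, "ipykernel"))
```

Because the import runs in the interpreter passed as argument, the answer reflects that environment and not the one the script itself runs in.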
Wrapping it up
A reliable procedure thus boils down to:
1. Making sure Jupyter is installed globally
In a shell:
jupyter --version
should print something like this in the console:
IPython : 8.17.2
ipykernel : 6.26.0
ipywidgets : 8.1.1
jupyter_client : 8.5.0
jupyter_core : 5.5.0
jupyter_server : 2.9.1
jupyterlab : 4.0.8
nbclient : 0.8.0
nbconvert : 7.10.0
nbformat : 5.9.2
notebook : 7.0.6
qtconsole : 5.4.4
traitlets : 5.13.0
2. Creating, activating, and installing ipykernel in your project virtual environment
with Poetry:
poetry init # Initialize a Poetry project (creates pyproject.toml)
poetry add ipykernel --group dev # Add the ipykernel module to the "dev" requirements group
with the venv module:
python -m venv .venv # Create the venv in .venv
source .venv/bin/activate # Activate the environment
pip install ipykernel # Add the ipykernel module to the environment
3. Creating a new Jupyter kernel inside your virtual environment
with Poetry:
poetry run python -m ipykernel install --user --name=<KERNEL_NAME>
with the venv module:
.venv/bin/python -m ipykernel install --user --name=<KERNEL_NAME>
4. Enjoying your isolated environment in Jupyter!
jupyter lab --MultiKernelManager.default_kernel_name=<KERNEL_NAME>
or for a notebook:
jupyter notebook --MultiKernelManager.default_kernel_name=<KERNEL_NAME>
The --MultiKernelManager.default_kernel_name option will preselect your newly created kernel when creating a new notebook. A little overkill never hurts.
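As a final safeguard, one can verify that a freshly registered kernel really points into the project environment rather than the global one. The sketch below is a hedged example: the function name kernel_uses_environment is hypothetical, and it only inspects the first item of argv in kernel.json.

```python
import json
import os
from pathlib import Path


def kernel_uses_environment(kernel_json: Path, env_dir: Path) -> bool:
    """Return True if the kernel's interpreter (argv[0]) lives inside env_dir."""
    spec = json.loads(kernel_json.read_text(encoding="utf-8"))
    # abspath normalizes the paths without following symlinks: the python
    # binary of a venv is often a symlink to the base interpreter, so
    # resolving it would hide the fact that it belongs to the environment.
    interpreter = Path(os.path.abspath(spec["argv"][0]))
    environment = Path(os.path.abspath(env_dir))
    return environment in interpreter.parents
```

Called with the kernel.json of the mltoolbox example above and the project's .venv folder, this check would return False for the misconfigured kernel, since its interpreter lives under the pyenv installation.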