AImageLab SRV

Using Python


When you use Python on AImageLab-SRV, you are working in a High Performance Computing (HPC) environment made up of computing nodes with different hardware configurations. This introduces complexities that do not arise on a local machine, so a setup that works on your own system may not carry over unchanged to the HPC environment.

Python on AImageLab-SRV

💡 Note: The anaconda module is loaded by default on AImageLab-SRV. You do not need to manually load it, but you should be aware of how to manage your Python environment effectively.

Python Setup

To check which Python interpreter you are using and its version, you can use:

which python
python --version
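
The same check can be made from inside Python, which may be useful in job scripts where the shell environment differs from your interactive session. This is a minimal sketch using only the standard library; the function name describe_interpreter is illustrative and not part of any AImageLab-SRV tooling.

```python
import platform
import sys


def describe_interpreter() -> dict:
    """Return the path, version, and install prefix of the running interpreter."""
    return {
        "executable": sys.executable,          # path of the interpreter actually running
        "version": platform.python_version(),  # version string, e.g. "3.11.5"
        "prefix": sys.prefix,                  # root directory of the active environment
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the reported prefix is not the environment you expected, the environment was probably not activated in the shell that launched Python.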

The default Anaconda environment is automatically available. If you need to install additional packages or use different versions, you can either create a Virtual Environment and install packages through Anaconda or pip, or use pip with the --user flag to install packages locally in your home directory.

Creating Virtual Environments with Anaconda

Creating virtual environments with Anaconda is straightforward and is the recommended approach:

  1. Create a Virtual Environment:

    conda create --name myenv python=3.x

    Replace myenv with your environment name and 3.x with the Python version you need.

  2. Activate the Environment:

    conda activate myenv

  3. Install Packages:

    Use conda install or pip install within the activated environment to add packages as needed.

Creating Virtual Environments with venv

If you prefer or need to use Python’s venv module, follow these steps:

Creating a Virtual Environment

python3 -m venv /path/to/your/venv_name
source /path/to/your/venv_name/bin/activate

Replace /path/to/your/venv_name with the desired location and name for your virtual environment. We recommend creating environments in your project directory to manage space efficiently.

Installing Packages

With the virtual environment active (activate it again if you have opened a new shell), use pip to install packages:

source /path/to/your/venv_name/bin/activate
pip install package_name

Note that certain packages requiring complex dependencies or GPU support might be better managed through Anaconda.
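
To confirm from within Python that a venv is active before installing or importing packages, one option is to compare sys.prefix with sys.base_prefix. The sketch below assumes a venv-style environment; the helper name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """True when running inside a venv-style environment.

    In a venv, sys.prefix points at the environment, while
    sys.base_prefix still points at the interpreter it was created from.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"virtual environment {state}: {sys.prefix}")
```

This check only detects venv-style environments. A Conda environment ships its own interpreter, so the two prefixes are equal there and the function returns False.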

Using Pip outside of a Virtual Environment

If you prefer to use pip without a Virtual Environment for installing packages, ensure you use the --user flag to install them in your home directory:

pip install --user package_name

Be aware that using pip outside of a managed environment (like Anaconda or venv) might lead to compatibility issues, and it is generally recommended to use Anaconda for package management.
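
To see where pip install --user places packages for the interpreter you are running, the standard-library site module can be queried. This is a minimal sketch; the function name user_install_location is illustrative.

```python
import site
import sys


def user_install_location() -> dict:
    """Report where `pip install --user` places packages for this interpreter."""
    user_site = site.getusersitepackages()
    return {
        "user_base": site.getuserbase(),          # base directory for --user installs
        "user_site": user_site,                   # site-packages directory under that base
        "enabled": bool(site.ENABLE_USER_SITE),   # False or None means it is ignored
        "on_sys_path": user_site in sys.path,     # whether imports will actually look there
    }


if __name__ == "__main__":
    for key, value in user_install_location().items():
        print(f"{key}: {value}")
```

Inside a venv the user site directory is normally disabled, so packages installed with --user are not visible there. This is one way the compatibility issues mentioned above can arise.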

Using Anaconda in SLURM Batch Jobs

A batch job submitted with sbatch runs in a non-interactive shell, where the conda command may not be initialized, so the environment must be activated inside the job script. First add the following line to your batch script to ensure Conda is initialized:

. /usr/local/anaconda3/etc/profile.d/conda.sh

Then, activate your desired environment with:

conda activate myenv
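
A job that silently runs in the wrong environment tends to fail late with a confusing import error, so it may help to check the environment at the top of the Python entry point. The sketch below reads CONDA_DEFAULT_ENV, which conda activate sets, and SLURM_JOB_ID, which SLURM sets inside a job. The function name check_job_environment and the expected name myenv are placeholders.

```python
import os
import sys


def check_job_environment(expected_env: str, environ=None) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    env = os.environ if environ is None else environ
    problems = []
    active = env.get("CONDA_DEFAULT_ENV")
    if active != expected_env:
        problems.append(f"expected Conda env {expected_env!r}, found {active!r}")
    if "SLURM_JOB_ID" not in env:
        problems.append("SLURM_JOB_ID is not set; not running inside a SLURM job?")
    return problems


if __name__ == "__main__":
    # Print warnings to stderr so they show up in the job's error log.
    for issue in check_job_environment("myenv"):
        print(f"warning: {issue}", file=sys.stderr)
```

Passing a plain dictionary as environ makes the check easy to try outside a job.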

Jupyter Kernels

To use a specific environment with Jupyter, you need to set up a kernel for it:

  1. Load Required Modules:

    module load anaconda

  2. Install ipykernel in the Environment:

    conda activate myenv; pip install ipykernel

    Replace myenv with your environment name.
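
To check which kernels Jupyter can currently see, the jupyter_client package (installed alongside Jupyter) can list them. This is a sketch that assumes jupyter_client is importable in the interpreter that runs it; the function name list_kernels is illustrative. If your environment does not appear, a common remedy is to register it explicitly from inside the environment with python -m ipykernel install --user --name myenv. Whether that step is required on AImageLab-SRV is not stated on this page.

```python
def list_kernels() -> dict:
    """Map kernel names to their spec directories, or {} if jupyter_client is missing."""
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return {}
    return KernelSpecManager().find_kernel_specs()


if __name__ == "__main__":
    for name, path in sorted(list_kernels().items()):
        print(f"{name}: {path}")
```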

Accessing Jupyter Notebooks

All nodes in AImageLab-SRV are behind a firewall, and Jupyter typically runs on a compute node rather than on a login node, so a notebook server cannot be reached directly from outside. To access Jupyter notebooks, follow these steps:

  1. Submit a Jupyter Job:

    Submit the job script that will run Jupyter on a compute node as you would any other job.

  2. Setup Port Forwarding:

    You need to set up port forwarding from your local machine to the node where Jupyter is running. First, find out which node is running Jupyter. This is usually shown in your job submission output or log files, and in the NODELIST column of squeue -u $USER.

  3. Establish an SSH Tunnel:

    Use SSH tunneling to forward a port from your local machine to the compute node where Jupyter is running. Replace compute-node with the actual node name:

    ssh -L 8888:compute-node:8888 your_username@ailb-login-02.ing.unimore.it

    or

    ssh -L 8888:compute-node:8888 your_username@ailb-login-03.ing.unimore.it

  4. Start Jupyter on the Compute Node:

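    After establishing the SSH tunnel, start Jupyter on the compute node where your job is running. Jupyter listens only on localhost by default, so it must bind to an address that the login node can reach for the tunnel to work:

    jupyter notebook --no-browser --ip=0.0.0.0 --port=8888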

  5. Access Jupyter Locally:

    Open a web browser and navigate to http://localhost:8888 to access your Jupyter notebooks. If you are asked for a token, use the one printed in Jupyter's startup output.
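
The tunnel command from step 3 can also be generated by the job itself and written to its log, which avoids looking up the node name by hand. The sketch below is one way to do that. The function name tunnel_command is illustrative, and it assumes that the hostname reported on the compute node is the name the login node resolves.

```python
import shlex
import socket


def tunnel_command(node: str, user: str, login_host: str,
                   local_port: int = 8888, remote_port: int = 8888) -> str:
    """Build the `ssh -L` command that forwards local_port to node:remote_port."""
    forward = f"{local_port}:{node}:{remote_port}"
    parts = ("ssh", "-L", forward, f"{user}@{login_host}")
    # Quote each argument so the printed command is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Run inside the job: the local hostname is the compute node to tunnel to.
    node = socket.gethostname()
    print(tunnel_command(node, "your_username", "ailb-login-02.ing.unimore.it"))
```

Copy the printed command to your local machine and run it there, replacing your_username with your account name.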

Python for AI and Machine Learning

For AI and machine learning packages such as PyTorch, Anaconda is preferred because it manages dependencies and compatibility for you. Installing these packages directly with pip also works.
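
Whichever installer you use, it is worth checking that the installed PyTorch build can actually see a GPU before launching a long job. The sketch below reports this and also runs when PyTorch is not installed; the function name gpu_report is illustrative.

```python
import importlib.util


def gpu_report() -> dict:
    """Summarize whether PyTorch is importable and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "gpu_count": 0}

    import torch  # imported lazily so the check also runs without PyTorch

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": available,
        "gpu_count": torch.cuda.device_count() if available else 0,
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

Run the check inside a job that has requested a GPU. On a node without one, cuda_available will be False even when the installation is correct.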

Common Issues

  • Jupyter Kernel Not Connecting: Ensure you created the kernel after activating the correct Anaconda environment or virtual environment. Verify that you have correctly set up port forwarding.

  • Running Out of Space When Downloading a Model from Hugging Face: You can pass a different cache directory to from_pretrained() through its cache_dir argument. You can also set the cache location in your job scripts, for example through the HF_HOME environment variable.
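
As an illustration of the cache-directory fix, the sketch below points the Hugging Face libraries at a different location through the HF_HOME environment variable. The function name use_cache_dir and the example path are placeholders; substitute a directory on storage where you have enough free space.

```python
import os
import tempfile
from pathlib import Path


def use_cache_dir(cache_root: Path) -> Path:
    """Create cache_root and point the Hugging Face libraries at it via HF_HOME."""
    cache_root.mkdir(parents=True, exist_ok=True)
    # Set this before importing transformers or huggingface_hub:
    # they read the variable when they are first imported.
    os.environ["HF_HOME"] = str(cache_root)
    return cache_root


if __name__ == "__main__":
    # Placeholder location, used here only so the example runs anywhere.
    cache = use_cache_dir(Path(tempfile.gettempdir()) / "hf_cache_example")
    print(f"HF_HOME={os.environ['HF_HOME']}")
    # With transformers installed, a per-call override would look like:
    #   from transformers import AutoModel
    #   model = AutoModel.from_pretrained("bert-base-uncased", cache_dir=str(cache))
```

Setting the variable once at the start of a job covers every later download. The cache_dir argument only affects the call it is passed to.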

If you encounter issues or have specific questions, please consult our detailed documentation or reach out for support.

Last updated: August 13, 2024