Deep Learning Model Demo

Session Overview

  • Goals
  • Running Deep Learning

Goals

  1. Become familiar with basic constructs for building a deep learning model in Python.
  2. Understand the difference between training and inference (see the sketch below).
  3. Become familiar with pulling and using pre-trained, off-the-shelf models.
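
To make the distinction in goal 2 concrete, here is a minimal PyTorch sketch; the toy model and random data are illustrative only, not part of the demos:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)   # toy model; the demos use real architectures
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Training: gradients flow and weights are updated.
model.train()
x, y = torch.randn(4, 10), torch.randn(4, 1)
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

# Inference: no gradients, weights stay frozen.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 10))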

Context:

  • Using NVIDIA GPUs (G4DN instances with CUDA; a quick check is shown below)

  • Using PyTorch

  • Off-the-shelf models come from Hugging Face and GitHub
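
Once a workspace is up, a quick way to confirm that PyTorch can see the GPU (standard PyTorch calls):

import torch

print(torch.cuda.is_available())       # should print True on a G4DN instance
print(torch.cuda.get_device_name(0))   # e.g. "Tesla T4"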

Running Deep Learning

  1. Insect Classification for Biodiversity Monitoring
  2. Aardvark for Weather Prediction
  3. Super Resolution for Downscaling


We are going to go in reverse order.

Running Deep Learning

  1. Basic architecture, training, and transfer-learning inference. (3: Super Resolution)
  2. Inference using pre-trained models for MLWP applications (Transformers!). (2: Aardvark)
  3. Off-the-shelf models for useful computer vision tasks. (1: Insect Classification)

Getting Started

  1. Go into Coder and create a new workspace from the ESDS GPU PyTorch template.

  2. Name the workspace your-initials-esds-dl.

  3. Choose at least 200 GB of storage.

  4. Keep the defaults for the rest. (4 vCPU, 16 GiB RAM, 1 NVIDIA T4 GPU; US East)

  5. Let your workspace spin up.

Setting up Environment

Once your workspace has deployed and you can access JupyterLab:


Go to the terminal and set up your AWS credentials.

In your home directory, create the file:

vi credentials

  1. Copy and paste your credentials from the SSO login page (the expected format is shown below).

  2. Make sure that the first line is [default].

  3. Save the file.

  4. Using sudo, move the file to the .aws directory:

sudo mv credentials ~/.aws/
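
The credentials file should look roughly like this (placeholder values, not real keys; SSO-issued credentials typically include a session token):

[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
aws_session_token = xxxxxxxx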

Cloning the Repo:

Clone the Module 4 GitLab repo to your home directory (an example command is shown below).

You should see a day-3 directory.
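
The clone will look something like this; use the actual URL from the course GitLab (the one below is a placeholder):

cd ~
git clone https://gitlab.example.com/esds/module-4.git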

Running the Demo

We are going to go in reverse order and start with:

3: Super Resolution for Downscaling


Taken and lightly adapted from PyTorch Examples.


  1. Navigate to the 03-super-resolution directory.

  2. Run uv sync to load the environment.

  3. Activate the environment: source .venv/bin/activate

  4. Run the training on the CPU first:

python main.py --upscale_factor 3 --batchSize 4 --testBatchSize 100 --nEpochs 30 --lr 0.001

  5. In a new terminal window, begin watching the GPU with:

watch -n 1 nvidia-smi

  6. Run the training again, this time on the GPU (a sketch of the model architecture follows these steps):

python main.py --upscale_factor 3 --batchSize 4 --testBatchSize 100 --nEpochs 30 --lr 0.001 --cuda

  7. Run the inference on a test image:

python super_resolve.py --input_image dataset/BSDS300/images/test/16077.jpg --model model_epoch_30.pth --output_filename test_out.png --cuda

  8. Run the inference on the provided persiann image:

python super_resolve.py --input_image persiann-08262024.jpg --model model_epoch_30.pth --output_filename persiann_out.png --cuda
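
The core trick in this example is sub-pixel convolution: the network predicts upscale_factor**2 output values per low-resolution pixel, then nn.PixelShuffle rearranges those channels into a higher-resolution image. A condensed sketch of the idea follows; layer sizes are illustrative and may differ from what main.py actually builds:

import torch
import torch.nn as nn

class SuperResolutionNet(nn.Module):
    """Sub-pixel CNN: convolve at low resolution, then shuffle up."""
    def __init__(self, upscale_factor):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.ReLU(),
            # Predict upscale_factor**2 values per low-res pixel...
            nn.Conv2d(32, upscale_factor ** 2, kernel_size=3, padding=1),
        )
        # ...then rearrange those channels into an (H*r, W*r) image.
        self.shuffle = nn.PixelShuffle(upscale_factor)

    def forward(self, x):
        return self.shuffle(self.body(x))

net = SuperResolutionNet(upscale_factor=3)
print(net(torch.randn(1, 1, 100, 100)).shape)   # torch.Size([1, 1, 300, 300])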

2: Aardvark

End-to-end data-driven weather prediction


From the abstract:

Aardvark Weather, an end-to-end data-driven weather prediction system, ingests observations and produces global gridded forecasts and local station forecasts. The global forecasts outperform an operational NWP baseline for multiple variables and lead times. The local station forecasts are skillful up to ten days lead time, competing with a post-processed global NWP baseline and a state-of-the-art end-to-end forecasting system with input from human forecasters. End-to-end tuning further improves the accuracy of local forecasts. Our results show that skillful forecasting is possible without relying on NWP at deployment time, which will enable the full speed and accuracy benefits of data-driven models to be realised.

Running Aardvark

Materials taken and adapted from the Aardvark Zenodo archive.


  1. Navigate to the 02-aardvark directory.

  2. Copy the necessary files from S3:

aws s3 cp s3://esds-mod4-demo/02-aardvark/ . --recursive
  3. Run uv sync to set up the environment.

  4. Activate the environment: source .venv/bin/activate

  5. Set up the notebook to talk to the environment:

python -m ipykernel install --user --name=coder --display-name="Python (Aardvark Environment)"

  6. Open notebooks/forecast_demo.ipynb.

  7. Ensure the Aardvark Environment kernel is selected (a quick check is shown after these steps).

  8. Have fun!
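
If the notebook cannot find the kernel, confirm that it registered (standard Jupyter command; coder should appear in the output, matching the install step above):

jupyter kernelspec list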

1: Insects!

Using Segment Anything and self-supervised learning to understand insect diversity (a Segment Anything sketch follows the steps below).


  1. Navigate to the 01-insect-images directory.

  2. Pull insect images from AWS S3:

mkdir insect-inputs
aws s3 cp s3://esds-mod4-demo/01-insect-inputs/ insect-inputs/ --recursive

  3. Run uv sync to load the environment.

  4. Activate the environment: source .venv/bin/activate

  5. Run the Python scripts in order and monitor the outputs.
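
For orientation, mask generation with Segment Anything looks roughly like this. It is a sketch against the public segment_anything API; the checkpoint filename and image path are placeholders, and the demo scripts may do this differently:

import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a SAM checkpoint (ViT-B shown; filename is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
sam.to("cuda")

# Propose masks for everything in the image, e.g. individual insects.
mask_generator = SamAutomaticMaskGenerator(sam)
image = cv2.cvtColor(cv2.imread("insect-inputs/example.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)   # list of dicts: 'segmentation', 'area', ...
print(len(masks), "candidate segments")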