Earth System Data Science in the Cloud

Session Overview

  • Welcome
  • Interim Check-In
  • Data Driven Science
  • What is Data Science?
  • Where We Are/Context
  • Course Goals and Objectives
  • Module Goals and Objectives
  • Course Logistics

Welcome To Module 2!

Interim Check-In


How did you apply Module 1 to your work?


What did you explore?

Where are you in your team environment?

This Module

Exploratory Data Analysis


…And Becoming Cloud Native

Data Driven Science

Data Science

Where We Are/Context

Course Goals & Objectives

  1. Make the Impossible Possible
  2. 10x Performance

Course Goals & Objectives

  • Conversant and practiced in developing cloud-based Earth system data science workflows.

  • Comfortable developing data science products.

  • Comfortable and practiced in working effectively on interdisciplinary teams.

  • Able to rapidly pick up new skills, tools, and techniques.

Principles & Practices

Module Goals & Objectives

By the end of this Module, you will be familiar with and conversant in the following areas:

  • Reproducible research tools, techniques, and tactics.
  • The key building blocks of cloud computing.
  • Data input/output.
  • The basics of data manipulation.

Module Goals & Objectives

Specifically, by the end of the Module, you will have accomplished the following:

  • Launched, committed to, and collaborated on development in a GitLab Repo.
  • Used Overleaf and linked Overleaf to GitLab.
  • Built and used a container.
  • Managed running containers in a cloud environment.
  • Managed credentials on AWS.
  • Deployed and used an S3 bucket and an EC2 instance.
  • Imported and exported data to, from, and across cloud resources.
  • Deployed code in parallel across a single compute instance.
  • Developed a team project research question.
  • Presented an exploratory data analysis of the data used to answer that research question.

Module Outline

Days:

  1. To the Cloud!
  2. The Cloud & Containers
    • Guest Speaker
  3. Layers
    • Containers, IO, Overleaf
  4. Parallel Computing
  5. Team Presentations and Next Steps

Team Project Outline

Days:

  1. Team Project Repo
  2. Team Project Check-In
    • NEED: Team Name
    • Idea Curation
    • EDA Development
  3. Team Project Work
    • Data Access
    • EDA Questions
  4. Presentation Practice
  5. Presentations

Team Project Presentations

  • 10 minutes (no more, can be less)
  • Team Name & Introductions
  • Research Question (START)
  • EDA Results
  • Everyone Participates

Session Overview

  • Welcome
  • What is Data Science?
  • What is Earth System Data Science in the Cloud?
  • Course Goals and Objectives
  • Module Goals and Objectives
  • Course Logistics

Course Logistics

Strategies for Success

This is a lot of information.

  • You would not be here if you could not handle it.
  • Be present.
  • You will not understand everything the first time. That is OK!
  • Keep a journal of topics to return to and explore further.
  • You will see each topic/idea at least three times, on separate days.
  • Ask questions.
  • Invest the time now…

Final Note

We made this course for you! We want your feedback!

Please reach out anytime on Slack or by email at dwillett@cicsnc.org or ggraham@cicsnc.org.

Team Pre-Project Assessment

Pre-Project Assessment