Earth System Data Science in the Cloud

Session Overview

  • Welcome
  • Interim Check-In
  • Data Driven Science
  • What is Data Science?
  • Where We Are/Context
  • Course Goals and Objectives
  • Module Goals and Objectives
  • Course Logistics

Welcome to Module 2!

Interim Check-In


How did you apply Module 1 to your work?


What did you explore?

Where are you in your team environment?

This Module

Exploratory Data Analysis


…And Becoming Cloud Native

Data Driven Science

Data Driven Science Timeline

Data Science

Data Science Venn Diagram: Domain Expertise, Stats & AI/ML, and Computer Science

Course Goals & Objectives

  1. Make the Impossible Possible
  2. 10x Performance

Course Goals & Objectives

  • Conversant and practiced in developing cloud-based Earth system data science workflows.

  • Comfortable developing data science products.

  • Comfortable and practiced in working effectively on interdisciplinary teams.

  • Able to rapidly pick up new skills, tools, and techniques.

Principles & Practices

Module Goals & Objectives

By the end of this Module, you will be familiar with and conversant in the following areas:

  • Reproducible Research Tools, Techniques, and Tactics.
  • The key building blocks of cloud computing.
  • Data Input/Output.
  • The basics of data manipulation.

Module Goals & Objectives

Specifically, by the end of this Module, you will have accomplished the following:

  • Launched, committed to, and collaborated on development in a GitLab repo.
  • Used Overleaf to develop publications.
  • Built and used a container.
  • Managed running containers in a cloud environment.
  • Managed credentials on AWS.
  • Deployed and used an S3 bucket and an EC2 instance.
  • Imported and exported data to, from, and across cloud resources.
  • Deployed code in parallel across a single compute instance.
  • Developed a team project research question.
  • Presented an exploratory data analysis of the data used to answer that research question.
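
As a rough illustration of the last item, a first exploratory pass over a small table might look like the sketch below. It uses only the Python standard library. The sample values, the column names (station, temp_c), and the summarize helper are all hypothetical, made up for this example; they are not course data or a prescribed workflow.

```python
import csv
import io
import statistics

# Hypothetical sample: made-up station temperatures, for illustration only.
SAMPLE_CSV = """station,temp_c
A,12.5
A,13.1
B,9.8
B,
B,10.4
"""


def summarize(csv_text, group_col, value_col):
    """Per-group count, missing count, mean, min, and max of a numeric column."""
    values = {}
    missing = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        group = row[group_col]
        values.setdefault(group, [])
        missing.setdefault(group, 0)
        raw = (row[value_col] or "").strip()
        if raw == "":
            # Track gaps explicitly rather than silently dropping them.
            missing[group] += 1
        else:
            values[group].append(float(raw))

    summary = {}
    for group, vals in values.items():
        summary[group] = {
            "count": len(vals),
            "missing": missing[group],
            "mean": statistics.mean(vals) if vals else None,
            "min": min(vals) if vals else None,
            "max": max(vals) if vals else None,
        }
    return summary


if __name__ == "__main__":
    for station, stats in summarize(SAMPLE_CSV, "station", "temp_c").items():
        print(station, stats)
```

Counting missing values alongside the basic statistics is a deliberate choice here: knowing where the data has gaps is often as useful in an early EDA as the summary numbers themselves.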

Module Outline

Days:

  1. To the Cloud!
  2. The Cloud & Containers
  3. Layers
  4. Parallel Computing
  5. Team Presentations and Next Steps

Team Project Outline

Days:

  1. Team Project Repo
  2. Team Project Check-In
    • NEED: Team Name
    • Idea Curation
    • EDA Development
  3. Team Project Work
    • Data Access
    • EDA Questions
  4. Presentation Practice
  5. Presentations

Team Project Presentations

  • 10 minutes (no more, can be less)
  • Team Name & Introductions
  • Research Question (START)
  • EDA Results
  • Everyone Participates

Session Overview

  • Welcome
  • What is Data Science?
  • What is Earth System Data Science in the Cloud?
  • Course Goals and Objectives
  • Module Goals and Objectives
  • Course Logistics

Course Logistics

Strategies for Success

This is a lot of information.

  • You would not be here if you could not handle it.
  • Be present.
  • You will not understand everything the first time. That is OK!
  • Keep a journal of topics to return to and explore further.
  • You will see each topic/idea at least 3 times on separate days.
  • Ask questions.
  • Invest the time now…

Final Note

We made this course for you! We want your feedback!

Please reach out anytime on Slack or at dwillett@cicsnc.org & ggraham@cicsnc.org.

Pre-Module Assessment
