All of Statistics
… in a half hour (give or take)
Foundational Language for Science
Counting
Bayesian = Probability of Hypothesis + Probability of Data
Frequentist = Probability of Data
Understanding: How does something work?
Prediction: What will happen? And…
Do we need to understand something in order to predict it?
Approach:
Apply a mechanistic model
Multivariate Regression
Differential Equations
Goal: Understand the relative weighting of parts and how they interact.
Result: An estimate of how a natural phenomenon works.
Example:
Research question: How does insolation drive sea surface temperature?
Frequentist Approach: Compare measurements of insolation and sea surface temperature and estimate a coefficient relating the two from the observational data alone.
Bayesian Approach: Start from a prior distribution for that coefficient, based on previous knowledge and studies of the relationship, then update it with new observational data to produce a posterior (updated) distribution of the coefficient.
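A minimal sketch of the two approaches side by side, under loud assumptions: the data are synthetic stand-ins (not real insolation or SST observations), the relationship is taken to be linear, the observation noise is treated as known, and the prior mean and spread standing in for "previous studies" are hypothetical numbers chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT real observations).
# Units are nominal: insolation in W/m^2, SST in deg C.
n = 200
insolation = rng.uniform(150.0, 350.0, size=n)
true_slope = 0.05  # hypothetical value, used only to generate the toy data
noise_sd = 1.0     # assumed known observation noise
sst = 10.0 + true_slope * insolation + rng.normal(0.0, noise_sd, size=n)

# Center both variables so the model reduces to y = slope * x + noise.
x = insolation - insolation.mean()
y = sst - sst.mean()

# Frequentist: least-squares point estimate of the slope and its standard error,
# using the data alone.
ols_slope = np.sum(x * y) / np.sum(x * x)
ols_se = noise_sd / np.sqrt(np.sum(x * x))

# Bayesian: Normal prior on the slope (hypothetical summary of earlier studies),
# combined with the same data through a conjugate Normal-Normal update.
prior_mean, prior_sd = 0.03, 0.02
prior_precision = 1.0 / prior_sd**2
data_precision = np.sum(x * x) / noise_sd**2
post_precision = prior_precision + data_precision
post_mean = (prior_precision * prior_mean + data_precision * ols_slope) / post_precision
post_sd = np.sqrt(1.0 / post_precision)

print(f"Frequentist slope: {ols_slope:.4f} (standard error {ols_se:.4f})")
print(f"Bayesian slope:    prior N({prior_mean}, {prior_sd}) -> posterior N({post_mean:.4f}, {post_sd:.4f})")
```

The frequentist result is a single estimate with an error bar; the Bayesian result is a distribution whose mean sits between the prior mean and the least-squares estimate, weighted by how informative each one is.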
Approach:
Apply a model and estimate its predictive performance (not necessarily its goodness of fit)
Pretty much any model
Including ‘Black Box’ Models
Goal: Reproducibly produce accurate and precise estimates of the desired outcome from given inputs.
Result: A tool to map inputs to outputs with known performance.
Example:
Research question: How does insolation drive sea surface temperature?
Build models on subsets of data that link insolation with sea surface temperature.
Measure the accuracy and precision with which each model links the two.
Keep the better models and return the chosen model as a mapping function, along with its performance metrics.
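The steps above can be sketched with k-fold cross-validation. This is one possible reading, not a prescribed method: the data are synthetic stand-ins, the candidate models are simple polynomials (any model, including a black box, could take their place), and "accuracy" and "precision" are interpreted here as the mean and spread of the held-out prediction errors. The helper names `held_out_errors` and `predict_sst` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (NOT real observations).
n = 300
insolation = rng.uniform(150.0, 350.0, size=n)
sst = 10.0 + 0.05 * insolation + rng.normal(0.0, 1.0, size=n)

# Standardize the input so polynomial fits stay well conditioned.
x_mean, x_std = insolation.mean(), insolation.std()
xs = (insolation - x_mean) / x_std

# Split the sample indices into 5 folds once, so every model sees the same subsets.
folds = np.array_split(rng.permutation(n), 5)

def held_out_errors(x, y, degree, folds):
    """Fit on all folds but one, predict the held-out fold, and collect the errors."""
    errors = np.empty(len(x))
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)
        errors[test] = np.polyval(coeffs, x[test]) - y[test]
    return errors

# Candidate models: constant (degree 0), linear (1), cubic (3).
results = {}
for degree in (0, 1, 3):
    err = held_out_errors(xs, sst, degree, folds)
    results[degree] = {
        "bias": err.mean(),                  # accuracy: average held-out error
        "spread": err.std(),                 # precision: scatter of held-out errors
        "rmse": np.sqrt(np.mean(err ** 2)),  # combined score used for selection
    }

# Keep the model with the lowest held-out RMSE and refit it on all the data.
best_degree = min(results, key=lambda d: results[d]["rmse"])
best_coeffs = np.polyfit(xs, sst, best_degree)

def predict_sst(new_insolation):
    """Mapping function from insolation to predicted SST for the chosen model."""
    scaled = (np.asarray(new_insolation, dtype=float) - x_mean) / x_std
    return np.polyval(best_coeffs, scaled)

print("chosen polynomial degree:", best_degree)
print("held-out metrics:", results[best_degree])
print("predicted SST at 250 W/m^2:", predict_sst(250.0))
```

The output is the tool described above: a function from inputs to outputs, returned together with performance numbers measured on data the model did not see during fitting.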
“All models are wrong, some are useful.”
Earth System Data Science in the Cloud