Jumping In!…
Use the Module 4 Repo
- Open Cloud9
- Clone the Module 4 Repo
- Navigate to 01-Multiple-Models
- Run run-jupyter.sh
Why Manage Multiple Models?
- Diversity of predictions: Different models capture different patterns in data.
- Robustness: A combination of models can be more robust to overfitting. (Or you can REALLY overfit your data…)
- Performance: Sometimes individual models are weak, but together they are strong.
Model Management
Let the computer do it for you!…
- Use scikit-learn Pipelines and swap out models.
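One way to swap models inside a pipeline is sketched below. It is a minimal illustration, not the Module 4 notebook code: the synthetic dataset, the step names ("scale", "model"), and the two candidate models are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for the course dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Preprocessing stays fixed; only the final "model" step changes.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hypothetical candidate models to compare.
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, estimator in candidates.items():
    pipe.set_params(model=estimator)  # replace the final step in place
    scores[name] = cross_val_score(pipe, X, y, cv=5).mean()

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Because the scaler lives inside the pipeline, it is refit on each training fold, so every candidate is scored under the same preprocessing.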
Model Assessment
What should we look for?
- In-sample (cross-validation) and out-of-sample performance
- Where does the model perform well (and where does it fail)?
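Both checks can be sketched in a few lines. This is a rough example on synthetic data (an assumption). It compares cross-validated accuracy on the training split with accuracy on a held-out test split, then uses a confusion matrix as a first look at where the model fails.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data stands in for the course dataset (assumption).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)

# Cross-validated accuracy, computed on the training split only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Out-of-sample accuracy on data the model never saw during fitting.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

print(f"CV accuracy:   {cv_scores.mean():.3f} (+/- {cv_scores.std():.3f})")
print(f"Test accuracy: {test_accuracy:.3f}")
print(cm)
```

A large gap between the two accuracy numbers suggests overfitting. The off-diagonal cells of the confusion matrix show which class the model confuses most often.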
Model Ensembling
Definition: Combining predictions from multiple models.
Methods:
- Bagging
- Boosting
- Stacking
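The three methods can be tried side by side with scikit-learn's built-in ensemble estimators. This is a sketch only: the synthetic data, the base learners, and the hyperparameters are assumptions for illustration, not tuned choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for the course dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

ensembles = {
    # Bagging: many trees fit on bootstrap samples, predictions combined by voting.
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=25, random_state=0
    ),
    # Boosting: trees fit in sequence, each correcting the previous ones' errors.
    "boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    # Stacking: a final model learns how to combine the base models' predictions.
    "stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
            ("logistic", LogisticRegression(max_iter=1000)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}

results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in ensembles.items()
}

for name, score in results.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```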
AutoML
- Simplifies model selection and hyperparameter tuning.
- Automatically handles preprocessing (and feature engineering)
- Tries many models
- Optimizes model ensembles
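To give a feel for the search that AutoML tools automate, here is a small hand-rolled analogue using scikit-learn's GridSearchCV. It is not an AutoML library. It searches over both the model and its hyperparameters, but does no feature engineering or ensemble optimization. The data, step names, and parameter grids are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for the course dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Each dict is a separate sub-grid: one model plus its own hyperparameters.
param_grid = [
    {
        "model": [LogisticRegression(max_iter=1000)],
        "model__C": [0.1, 1.0, 10.0],
    },
    {
        "model": [RandomForestClassifier(random_state=0)],
        "model__n_estimators": [25, 50],
        "model__max_depth": [3, None],
    },
]

# Try every model and hyperparameter combination with 3-fold CV.
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

best_model = search.best_params_["model"]
print(f"Best model: {type(best_model).__name__}")
print(f"Best mean CV accuracy: {search.best_score_:.3f}")
```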