HHS Python program Fall 2017

  • Build your skills in Python
  • Learn how to implement powerful data science methods
  • Create a capstone project to move your team forward

This course is part of the in-person / live-streaming delivery of the HHS Python program, and is not meant as a standalone, asynchronous course.

Python 8-Week Bootcamp Curriculum

Please note that this curriculum is subject to change.

October 24 – December 14

Week 1

Capstone target: Discuss potential capstone projects

I – Introduction to Python

October 24

  • Introduction to Python and IDEs
  • Variables, data types, operations, loops, conditionals, functions, list comprehensions
  • Data structures (tuples, lists, dictionaries)

II – Introduction to Data Mining and Visualization

October 26

  • Introduction to data mining with Python, pandas, and NumPy
  • Introduction to data visualization
  • Customizing graphs with matplotlib
  • Data visualization best practices

Week 2

Capstone target: Students will select their capstone project topic

I – Introduction to SQL and connecting Python to SQL

October 31

  • Overview of SQL and CRUD
  • Working with data using SQL statements
  • Working with SQLite in Python

II – Building Web Applications with Python

November 2

  • Introduction to web applications
  • Introduction to HTML and CSS
  • Building simple applications with Flask

Week 3

Capstone target: Students will identify and compile the data they need for their project

I – Introduction to Text Mining: Cleaning & Pre-processing

November 7

  • Applications and intuition of text mining
  • Capitalization
  • Punctuation
  • Stemming

II – Text Mining: A Bag of Words Approach

November 9

  • Putting it all together
  • Bag of words
  • n-grams
  • Word clouds

Week 4

Capstone target: Students will present preliminary data visualizations based on their capstone

I – Web Scraping and Automating Data Cleaning

November 14

  • Web scraping in Python
  • Applying web scraping to text analysis (tentative)
  • Building functions for automating data analyses and workflows

II – Unsupervised Machine Learning: Introduction to Clustering

November 16

  • Use cases and intuition of clustering
  • k-means clustering on multi-dimensional data
  • Evaluating the quality of clustering
  • Determining the right number of clusters to use
  • Pitfalls of clustering

Week 5

Capstone target: Students will continue to work on capstone project data analyses and visualizations

Note: there is only one lecture this week because Thursday, November 23, is Thanksgiving.

I – Fundamentals of Statistics

November 21

  • Mean, variance, standard deviation, quantiles
  • Covariance and correlation
  • R-squared
  • Normal distribution and bell curves
  • t-tests

Week 6

Capstone target: Students will give peer feedback on data analysis results and visualizations

Note: we have three lectures this week to make up for the lecture we lost to Thanksgiving.

I – Supervised Machine Learning: Introduction to Regression and Evaluation

November 28

  • Linear regression (slope, y-intercept, variable interactions)
  • Distribution of errors: Q-Q plot
  • Train/validation/test splits
  • R-squared and adjusted R-squared
  • p-values and t-tests
  • F-test and F-distribution

II – Supervised Machine Learning: Intermediate Regression

November 29

  • Multiple regression
  • Multicollinearity test
  • Heteroscedasticity test
  • Model selection: stepwise regression
  • Akaike Information Criterion (AIC)
  • Confidence intervals

III – Supervised Machine Learning: Introduction to Classification

November 30

  • Logistic Regression
  • Confusion matrices, misclassification rates, and overfitting
  • Baseline error rates and lift
  • ROC, AUC

Week 7

Capstone target: Students will finalize their presentations based on their conclusions from their analyses

I – Introduction to Classification: Decision Trees

December 5

  • Continuation of topics from November 30
  • Decision Trees

II – Intermediate Classification, part 1

December 7

  • Random Forests
  • Tuning Random Forests

Week 8

Capstone target: Students will present capstone projects to the class

I – Intermediate Classification, part 2

December 12

  • k-Nearest Neighbors
  • Naïve Bayes

II – Introduction to Neural Networks and Deep Learning (conceptual)

December 14

  • Brief non-technical lecture about neural networks
  • Neuron, Layer, Perceptron, Multilayer network
  • Demos of feedforward (FF), multilayer perceptron (MLP), and deep networks using TensorFlow
