Data 102: Data, Inference, and Decisions

UC Berkeley

This course develops the probabilistic foundations of inference in data science and builds a comprehensive view of the modeling and decision-making life cycle, including its human, social, and ethical implications. Topics include: frequentist and Bayesian decision-making, permutation testing, false discovery rate, probabilistic interpretations of models, Bayesian hierarchical models, basics of experimental design, confidence intervals, causal inference, robustness, Thompson sampling, optimal control, Q-learning, differential privacy, fairness in classification, recommendation systems, and an introduction to machine learning tools including decision trees, neural networks and ensemble methods.

This class is listed as DATA C102.

Prerequisites

While we are working to make this class widely accessible, we currently require the following prerequisites (or their equivalents):

  1. Principles and Techniques of Data Science: Data 100 covers important computational and statistical skills that will be necessary for Data 102.

  2. Probability: Data 140, EECS 126, STAT 134, IEOR 172, or Math 106. Data 140 and EECS 126 are preferred. These courses cover the probabilistic tools that will form the underpinning for the concepts covered in Data 102.

  3. Math: Math 54, Math 56, Math 110, both EE 16A and EE 16B, STAT 89a, or Physics 89. We will need some basic concepts like linear operators, eigenvectors, derivatives, and integrals to enable statistical inference and derive new prediction algorithms.