Data 102: Data, Inference, and Decisions
UC Berkeley
This course develops the probabilistic foundations of inference in data science and builds a comprehensive view of the modeling and decision-making life cycle in data science, including its human, social, and ethical implications. Topics include: frequentist and Bayesian decision-making, permutation testing, false discovery rate, probabilistic interpretations of models, Bayesian hierarchical models, basics of experimental design, confidence intervals, causal inference, robustness, Thompson sampling, optimal control, Q-learning, differential privacy, fairness in classification, and recommendation systems, as well as an introduction to machine learning tools including decision trees, neural networks, and ensemble methods.
This class is listed as DATA C102.
Offerings
- Fall 2024 (Ramesh Sridharan, Alexander Strang)
- Spring 2024 (Ramesh Sridharan, Alexander Strang)
- Fall 2023 (Ramesh Sridharan, Aditya Guntuboyina)
- Spring 2023 (Ramesh Sridharan, Eaman Jahani)
- Fall 2022 (Ramesh Sridharan, Jacob Steinhardt)
- Spring 2022 (Ramesh Sridharan, Nika Haghtalab)
- Fall 2021 (Ramesh Sridharan, Jacob Steinhardt)
- Spring 2021 (Ramesh Sridharan, Yan Shuo Tan)
- Fall 2020 (Michael Jordan, Jacob Steinhardt)
- Spring 2020 (Jacob Steinhardt, Moritz Hardt)
- Fall 2019 (Michael Jordan, Fernando Perez)
Prerequisites
While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites:
- Principles and Techniques of Data Science: Data 100 covers important computational and statistical skills that will be necessary for Data 102.
- Probability: Data 140, EECS 126, STAT 134, IEOR 172, or Math 106. Data 140 and EECS 126 are preferred. These courses cover the probabilistic tools that form the underpinning for the concepts covered in Data 102.
- Math: Math 54, Math 56, Math 110, both EE 16A and EE 16B, STAT 89a, or Physics 89. We will need some basic concepts, such as linear operators, eigenvectors, derivatives, and integrals, to enable statistical inference and derive new prediction algorithms.