Syllabus

About the Course

This course develops the probabilistic foundations of inference in data science. It builds a comprehensive view of the decision-making and modeling life cycle in data science, including its human, social, and ethical implications. Topics include: frequentist and Bayesian decision-making, permutation testing, false discovery rate, probabilistic interpretations of models, Bayesian hierarchical models, basics of experimental design, confidence intervals, causal inference, robustness, Thompson sampling, optimal control, Q-learning, differential privacy, fairness in classification, recommendation systems and an introduction to machine learning tools including decision trees, neural networks and ensemble methods.

This class is listed as Data 102.

Prerequisites

While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites:

  1. Principles and Techniques of Data Science: Data 100 covers important computational and statistical skills that will be necessary for Data 102.

  2. Probability: Data 140, EECS 126, STAT 134, IEOR 172, or Math 106. Data 140 and EECS 126 are preferred. These courses cover the probabilistic tools that will form the underpinning for the concepts covered in Data 102.

  3. Math: Math 54, Math 56, Math 110, both EE 16A and EE 16B, STAT 89a, or Physics 89. We will need some basic concepts like linear operators, eigenvectors, derivatives, and integrals to enable statistical inference and derive new prediction algorithms.

Please consult the Resources page for additional resources for reviewing prerequisite material.

Course Components

Lectures

Lectures will be held in-person Tuesdays and Thursdays from 12:30 PM - 2 PM in Li Ka Shing 245. Recordings will be made available on the course website within 24 hours.

Discussion

Discussion sections will be held on Wednesdays, led by your GSIs. These sections will cover important problem-solving skills that bridge the concepts in lectures with the skills you’ll need to apply the ideas on the homework and beyond.

A few weeks into the semester, we will start the Supplemental Sections. A typical Supplemental Section consists of prerequisite content you might’ve forgotten or missed from Data 100/140 and “catch up” content to reinforce material you may have missed in the previous week. Details will be posted on Ed. Attending these sections is optional.

Lab

Labs will be held on Mondays by GSIs. You will be working on lab assignments with your GSI in these sections. You can complete the assignments on your own time, but you are highly encouraged to attend lab sessions to work with your classmates and get help from the staff. Because of this, help with lab assignments will be limited on Ed and in office hours.

Homeworks

Homework assignments are released every other week on Fridays and due two Fridays after. These assignments are designed to help students develop an in-depth understanding of both the theoretical and practical aspects of ideas presented in lectures. They contain both math and coding tasks.

  • All homeworks must be submitted to Gradescope by their posted deadlines.
  • Each assignment will include detailed instructions on how to submit your work for grading. It is the student’s responsibility to read these carefully and ensure that their work is submitted correctly. Assignment accommodations will not be granted in cases where students have mis-submitted their work (for example, by submitting to the wrong portal, submitting only part of an assignment, or forgetting to select pages).
  • The primary forms of support for homeworks are office hours and Ed.

Vitamins

Vitamins are weekly short Gradescope assignments to check that you are keeping up with lectures. They will be released on Thursdays after lecture and due on Sundays.

Exams

There will be two midterms in this class:

  • Midterm I on February 27th, 7 PM - 9 PM
  • Midterm II on April 16th, 7 PM - 9 PM

There will not be a final exam.

All exams must be taken in-person. You must sit the midterms at the specified time: if you have a conflict, please contact course staff ASAP at data102@berkeley.edu. We will not accept any conflicts after the drop deadline. If you have extended time accommodations for tests from DSP, please make sure that you have enough available hours around the times of the regularly scheduled exams.

Final Project

At the end of the semester, you will apply the knowledge you learned in this class to a real-world dataset to complete a final project. You will be working in groups of 4.

More details will be announced on Ed closer to the end of term.

Grading Policies

Grading Scheme

Grades will be assigned using the following weighted components:

Category        Percentage   Details
Vitamins        5%           Drop 2 lowest scores
Homeworks       20%          No drop; 5 slip days
Labs            15%          Drop 2 lowest scores
Midterm 1       20%
Midterm 2       20%
Final project   20%
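
As a rough illustration of how these weights combine, here is a minimal Python sketch. It is not the official grade calculation: the function names are hypothetical, each score is assumed to be a fraction between 0 and 1, and assignments within a category are assumed to count equally. Slip days and late penalties are not modeled.

```python
from typing import Dict, List

# Weights and drop counts as listed in the grading table.
WEIGHTS: Dict[str, float] = {
    "vitamins": 0.05,
    "homeworks": 0.20,
    "labs": 0.15,
    "midterm1": 0.20,
    "midterm2": 0.20,
    "project": 0.20,
}
DROPS: Dict[str, int] = {"vitamins": 2, "labs": 2}


def category_average(scores: List[float], n_drop: int = 0) -> float:
    """Mean of the scores after removing the n_drop lowest ones."""
    kept = sorted(scores)[n_drop:]
    return sum(kept) / len(kept) if kept else 0.0


def course_score(scores: Dict[str, List[float]]) -> float:
    """Weighted sum of per-category averages, as a fraction in [0, 1]."""
    return sum(
        weight * category_average(scores[name], DROPS.get(name, 0))
        for name, weight in WEIGHTS.items()
    )


# Made-up scores for illustration only.
example = {
    "vitamins": [1.0, 1.0, 0.0, 0.0],
    "homeworks": [0.9, 0.8],
    "labs": [1.0, 0.5, 0.0, 0.0],
    "midterm1": [0.7],
    "midterm2": [0.8],
    "project": [0.9],
}
print(round(course_score(example), 4))
```

In this made-up example the two zero scores in vitamins and in labs are dropped, so the category averages are 1.0, 0.85, 0.75, 0.7, 0.8, and 0.9, and the weighted sum is 0.05 + 0.17 + 0.1125 + 0.14 + 0.16 + 0.18 = 0.8125.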

Grading Criteria

  • Homework will be graded on completion and correctness. No assignment may be dropped, but we have a slip day policy (see below).
  • Lab assignments will be graded on completion and correctness, but all test cases for autograded questions will be public. Your two lowest lab scores will be dropped.
  • When submitting assignments on Gradescope, you must match each page to the corresponding question on Gradescope. If you fail to do so, you may not receive credit for your work!
  • A grading rubric and more details regarding the final project will be released later in the semester.

Regrade Requests

  • After each assignment is graded, course staff will post the deadline for regrade requests for that assignment on Ed.
  • To ensure that our grading team is not overworked, regrade requests for each assignment must be submitted before the deadline (except in cases of emergencies).
  • Note: When you submit a regrade request, we will take a fresh look at the question, so it is possible that you will receive a grade lower than what you originally received.

Slip Days

Each student gets an extension budget of 5 total slip days. You can use the extension on homework assignments only (not lab assignments, vitamins, or the final project) during the semester. Some important notes on slip days:

  • Do not plan to use your slip days: we’re providing them for unforeseen circumstances.
  • Slip days are self-serve: we’ll apply them to your assignments automatically.
  • Slip days are full days, not hours. We round up, so if you are 1 hour late, then 1 slip day will be used. (Why? We’d rather you get some sleep and make an attempt to finish the assignment the next day instead of staying up to micromanage hours.)
  • After you have used your slip day budget, any assignment handed in late will be marked down 20% per day late (rounded up to the nearest integer number of days).
  • No assignment will be accepted more than 5 days late.
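
The rounding and penalty rules above can be sketched in Python as follows. This is one reading of the policy, not the staff's actual implementation: it assumes slip days are applied before any penalty, that each remaining late day costs 20 percentage points, and that the 5-day limit counts all late days including those covered by slip days. The function names are hypothetical.

```python
import math
from typing import Tuple

PENALTY_PER_DAY = 0.20  # deduction per late day not covered by a slip day
MAX_DAYS_LATE = 5       # submissions later than this are not accepted


def days_late(hours_late: float) -> int:
    """Round lateness up to whole days; on-time work counts as 0 days."""
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(hours_late: float, slip_days_left: int) -> Tuple[float, int]:
    """Return (score multiplier, slip days remaining) for one homework."""
    late = days_late(hours_late)
    if late > MAX_DAYS_LATE:
        # Too late to be accepted: no credit, and no slip days are spent.
        return 0.0, slip_days_left
    used = min(late, slip_days_left)
    uncovered = late - used
    multiplier = max(0.0, 1.0 - PENALTY_PER_DAY * uncovered)
    return multiplier, slip_days_left - used


# One hour late with a full budget: one slip day is used, no penalty.
print(apply_late_policy(1, 5))
```

Under these assumptions, a homework submitted 30 hours late by a student with one slip day left counts as 2 days late: the slip day covers one day and the other incurs a 20% deduction.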

Extenuating Circumstances

We recognize that our students come from varied backgrounds and have widely varying experiences. If you encounter extenuating circumstances at any time in the semester, please do not hesitate to let us know. The sooner we are made aware, the more options we have available to us to help you.

The Extenuating Circumstances Form is for any circumstances that cannot be resolved via slip days and drops. Within two business days of your filling out the form, a course staff member will reach out to you to provide a space for conversation and to arrange course or grading accommodations as necessary. For more information, please email data102@berkeley.edu.

We recognize that at times it can be difficult to manage your course performance, particularly in such a large course and under Berkeley’s high standards. Sometimes emergencies just come up (personal health emergency, family emergency, etc.). The Extenuating Circumstances Form is meant to lower the barrier to reaching out to us, as well as to build your independence in managing your academic career long-term. So please do not hesitate to reach out.

Note that extenuating circumstances do not extend to the following:

  • Logistical oversight, such as Datahub/Gradescope tests not passing, submitting only one portion of the homework, forgetting to save your notebook before exporting, submitting to the wrong assignment portal, or not properly tagging pages on Gradescope. It is the student’s responsibility to identify and resolve these issues in advance of the deadlines.
  • Workload-related issues. It is the student’s responsibility to manage their other coursework and extracurricular commitments. We will not grant accommodations for these cases; instead, please use drops or slip days to cushion these issues.
  • Requests made after the assignment deadlines. Please make sure to submit a request before the assignment is due.

Finally, simply submitting a request does not guarantee you will receive an extension. Even if your work is incomplete, please submit before the deadline so you can receive credit for the work you did complete.

DSP Accommodations

If you are registered with the Disabled Students’ Program (DSP), you can expect to receive an email from us during the first week of classes confirming your accommodations. Otherwise, email data102@berkeley.edu. DSP students who receive approved assignment accommodations will have a 2-day extension on homeworks and a 1-day extension on labs and vitamins. Please note that any extension, plus slip days, cannot exceed 5 days. DSP students can submit assignment extension accommodation requests via the Extenuating Circumstances Form.

You are responsible for reasonable communication with course staff. If you make a request close to the deadline, we cannot guarantee that you will receive a response before the deadline.

Collaboration and Academic Integrity

Data science is a collaborative activity. While you may talk with others about the homework, we ask that you write your solutions individually. If you do discuss the assignments with others, please include their names at the top of your notebook. Keep in mind that content from the homeworks and labs will likely be covered on both of the midterms. We will be following the campus policy on Academic Honesty, so be sure you are familiar with it.

We expect you, as a member of the Berkeley community, to follow the Berkeley Honor Code:

“As a member of the UC Berkeley community, I act with honesty, integrity, and respect for others.”

Waitlist

If you are on the waitlist, you should complete and submit all assignments as if enrolled: we will not offer any makeup assignments or extensions for waitlisted students.

For all other enrollment-related issues, please reach out to the Data Science advisors, as instructors and staff do not manage enrollment into the class.

Community Resources

Device Lending Options

Students can access device lending options through the Student Technology Equity Program (STEP).

Data Science Student Climate

Data Science Undergraduate Studies faculty and staff are committed to creating a community where every person feels respected, included, and supported. We recognize that incidents may happen, sometimes unintentionally, that run counter to this goal. There are many things we can do to try to improve the climate for students, but we need to understand where the challenges lie. If you experience a remark, or disrespectful treatment, or if you feel you are being ignored, excluded or marginalized in a course or program-related activity, please speak up. Consider talking to your instructor, but you are also welcome to contact Executive Director Christina Teller at cpteller@berkeley.edu or report an incident anonymously through this online form.

Community Standards

Ed is a formal, academic space. We must demonstrate appropriate respect, consideration, and compassion for others. Please be friendly and thoughtful; our community draws from a wide spectrum of valuable experiences. For further reading, please reference Berkeley’s Principles of Community and the Berkeley Campus Code of Student Conduct.