9.S915: Introduction to Probabilistic Programming (Fall 2016)

Introduces probabilistic programming, a computational formulation of probability theory. Covers how to formalize key ideas from probabilistic modeling and inference as probabilistic meta-programs, and provides hands-on probabilistic programming experience with Venture, an open-source research platform. Emphasizes practical AI-based techniques for probabilistic data analysis while also surveying applications to computer vision, robotics, and the exploration and modeling of complex databases in domains such as public health and neuroscience. Illustrates connections with other approaches to engineering and reverse-engineering intelligence.

Course Information

Lecturer: Vikash Mansinghka (vkm[at]mit.edu)
Teaching Assistants: Feras Saad (fsaad[at]mit.edu), Marco Cusumano-Towner (marcoct[at]mit.edu)
Administrative Assistant: Rax Dillon (rax[at]mit.edu)
Course Email: Please send questions for the course staff to 9s915-staff[at]mit.edu, and the right person will get back to you.
Schedule: Lectures TR 3:00-4:30PM in 46-5193, Recitations W 7-10PM in 1-115
Prerequisite: Permission of instructor. If you are interested but do not yet have permission, please come to the first lecture on September 8th.
Units: 3-0-9

Grading: 20% participation, 40% problem sets, 40% final project

Participation (lecture & recitation): 20%

Students are expected to attend all lectures; the problem sets do not cover all the critical material, and there is no written reference for much of the course. Students will also be expected to scribe 3 lectures, and to revise their scribe notes by merging them with those of other students scribing the same lectures; the final quality of the scribe notes will be considered as part of the participation grade.

Recitation is an optional clinic-style meeting where students get together in a cluster and work on problem sets and questions about course material. Vikash and/or his research assistants will be available for questions and support. Recitation participation is strongly encouraged, but you will not be penalized for not attending.

Problem Sets: 40%

There are three problem sets, due 10/13, 11/3, and 11/18. Each problem set is worth 13.3% of the total course grade. These problem sets emphasize (i) probabilistic programming skills, (ii) technical communication skills, and (iii) reading comprehension for both technical and non-technical background material.

Final Project: 40%

You will be responsible for producing a research workshop quality poster in four stages:

(i) Proposal for your final project topic and scope. You will discuss these in small groups with members of the MIT Probabilistic Computing Project during the week of 10/10. You will have the option of selecting a poster topic from a list or developing your own.

(ii) Stub poster outlining what you will demonstrate using synthetic results and figures, to enable in-depth feedback on the story and ensure project viability before the research itself is complete. You will hand this in by 11/22.

(iii) Poster presentation where you show a revised draft poster in class on 12/6.

(iv) Final poster must be turned in by 12/13.

This structure is designed to help you learn how to organize work on projects whose goals and formulation evolve as the work unfolds, and how to communicate preliminary results. Exemplary projects may lead to additional mentorship from MIT ProbComp members towards external presentation and/or publication, but this is not required.

Learning Outcomes

After taking this course, students will be able to:

Explore sparse databases and solve predictive modeling problems using BayesDB BQL and MML.
Diagnose model and inference quality in functional terms (e.g. predictive accuracy) and normative terms (e.g. how close to “fully Bayesian” the algorithmic process that produced the model was).
Use key invariants of Bayesian probability (chiefly, the Bayesian Occam's Razor and asymptotic concentration) to guide the testing and debugging of probabilistic programs.
Formulate simple textbook models from Bayesian statistics as generative programs in VentureScript.
Use inference programming to apply these models to real data and draw approximately Bayesian inferences.
Formulate core conceptual problems in computer vision and robotics in terms of probabilistic programming, and solve simple instances of them.
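
As a rough illustration of the two invariants named in the outcomes above, the following Python sketch compares a fixed fair-coin model with a more flexible Beta-Bernoulli model. It is not course material and does not use VentureScript or BayesDB; the function names and the choice of a coin-flip example are assumptions made only for this illustration. The Bayesian Occam's Razor shows up as the simpler model earning higher marginal likelihood on balanced data, and asymptotic concentration shows up as the posterior standard deviation shrinking as the number of flips grows.

```python
import math


def log_beta(a, b):
    # Log of the Beta function, computed via log-gamma for numerical stability.
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_evidence_fair(heads, tails):
    # Simple model: coin weight fixed at 0.5, so there are no free parameters.
    return (heads + tails) * math.log(0.5)


def log_evidence_flexible(heads, tails, a=1.0, b=1.0):
    # Flexible model: weight ~ Beta(a, b), integrated out analytically.
    # This is the marginal likelihood of one specific flip sequence.
    return log_beta(a + heads, b + tails) - log_beta(a, b)


def posterior_std(heads, tails, a=1.0, b=1.0):
    # Standard deviation of the Beta(a + heads, b + tails) posterior.
    a_post, b_post = a + heads, b + tails
    n = a_post + b_post
    return math.sqrt(a_post * b_post / (n * n * (n + 1)))


if __name__ == "__main__":
    for heads, tails in [(5, 5), (9, 1)]:
        diff = log_evidence_flexible(heads, tails) - log_evidence_fair(heads, tails)
        print(heads, tails, "log evidence (flexible minus fair):", diff)
    for n in [10, 100, 1000]:
        print(n, "flips, posterior std:", posterior_std(n // 2, n // 2))
```

A check of this kind can serve as a debugging aid: if an approximate inference program prefers the flexible model on balanced data, or its posterior fails to narrow as data accumulates, that suggests a bug in the model or the inference.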

Students will also (i) gain experience using the probabilistic programming framework to critically assess technical proposals for engineering and reverse engineering intelligence; (ii) get exposure to several core theoretical and conceptual issues in probabilistic programming and probabilistic computing; and (iii) learn about the research frontier in these fields.

Topics and Weekly Schedule

Fundamentals of probabilistic programming

9/8: Overview of the field
9/13: Lab session to provide students with necessary technical resources. Problem Set 0 available.
9/15: Building & querying models of sparse databases with BQL, SQL and MML.
9/20: Representing models and queries as generative probabilistic programs in VentureScript.
9/22: Measuring the quality of approximate inference both functionally and normatively. Problem Set 0 due 9/22. Problem Set 1 available.

Probabilistic data analysis

9/27: Types of data analysis: descriptive, exploratory, inferential, predictive, causal, and mechanistic.
9/29: Composable Generative Population Models (CGPMs): a computational abstraction for multivariate probabilistic models.
10/4: CrossCat: a generic Bayesian meta-CGPM for modeling statistical (sub)populations, Part 1
10/6: Lab session for Problem Set 1.
10/11: CrossCat: a generic Bayesian meta-CGPM for modeling statistical (sub)populations, Part 2
10/13: Implementing BQL SIMULATE, ESTIMATE, and INFER. Problem Set 1 due 10/13. Problem Set 2 available.

Probabilistic meta-programming

10/18: An overview of MML: a Bayesian meta-modeling language for sparse databases.
10/20: Representing inference strategies as probabilistic inference programs in VentureScript. [[NOTE: This lecture may be replaced with a lab session on PS2, with contents merged into 10/25 and 10/27 lectures.]]
10/25: Measuring the quality of approximate inference, continued, and Bayesian relaxation.
10/27: “Computationally universal” approximate probabilistic inference.

Applications to computer vision and robotics

11/1: 3D vision as probabilistic inverse graphics in VentureScript.
11/3: Lecture cancelled due to illness. Problem Set 2 due 11/3. Problem Set 3 available.
11/8: TBA
11/10: TBA

Advanced topics

11/15: TBA
11/17: TBA. Problem Set 3 due 11/18.
11/22: TBA. Final project poster stubs due.
11/24: No class, Thanksgiving.
11/29: TBA
12/1: TBA
12/6: Final project poster session.
12/8: TBA
12/13: TBA. Final projects due.

Relationships to Other Courses

Intended for graduate students who have already taken 9.014, 9.660[J], or a comparable graduate-level class covering topics from machine learning and AI. Because this course presents probabilistic concepts from a probabilistic programming point of view, there will be only minimal overlap in content with these courses.
Does not cover non-probabilistic learning algorithms or non-Bayesian learning theory. Students interested in these topics should take 9.014, 9.520J, or 6.036/6.867 instead.
Does not emphasize computational cognitive science; students interested in this topic should take 9.660[J] instead. Does illustrate connections between probabilistic programming and computational cognitive science.
Uses probabilistic programs and meta-programs, not mathematical formulae, to define models, data, priors, likelihoods, and Bayes’ Rule.
Lab assignments require students to do (i) hands-on data analysis using probabilistic programs that integrate discriminative learning with state-of-the-art semi-parametric Bayesian generative models and (ii) prototyping of generative probabilistic solutions to problems in 3D computer vision and robotics.
Includes introductory material on how to implement “computationally universal” approximate inference engines as probabilistic meta-programs, and to build the other core components of a probabilistic programming platform. This is critical background for students interested in contributing to research in probabilistic programming.
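
To make the item above about using programs rather than formulae more concrete, here is a minimal Python sketch, assuming a toy coin-weight problem. It is not VentureScript and not the course's implementation; the names `prior`, `simulate_data`, and `condition` are hypothetical. The prior and likelihood are written as sampling programs, and Bayes' Rule is written as a meta-program that takes those programs as arguments and returns approximate posterior samples by rejection sampling.

```python
import random


def prior(rng):
    # Prior as a program: draw a coin weight uniformly from a small grid.
    return rng.choice([0.1, 0.5, 0.9])


def simulate_data(weight, n_flips, rng):
    # Likelihood as a program: simulate n_flips flips and return the heads count.
    return sum(rng.random() < weight for _ in range(n_flips))


def condition(prior_prog, likelihood_prog, observed, n_flips, n_samples, rng):
    # Bayes' Rule as a meta-program: it takes other programs as inputs and
    # keeps only the prior draws whose simulated data match the observation.
    accepted = []
    while len(accepted) < n_samples:
        weight = prior_prog(rng)
        if likelihood_prog(weight, n_flips, rng) == observed:
            accepted.append(weight)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    samples = condition(prior, simulate_data, observed=9, n_flips=10,
                        n_samples=200, rng=rng)
    for w in [0.1, 0.5, 0.9]:
        print("weight", w, "posterior fraction:", samples.count(w) / len(samples))
```

Rejection sampling is used here only because it is the shortest correct way to express conditioning; it becomes impractical when exact matches with the observed data are rare.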

Notes

9/20: In class, Vikash discussed Anthony Lu's thesis, “Venture: an extensible platform for probabilistic meta-programming.”