CMU 15-799 - Spring 2022

Special Topics: Self-Driving Database Management Systems


Educational Objectives

This is a graduate-level course on the application of automation via machine learning to database management systems. This course has a heavy emphasis on programming projects. There are also readings assigned for each class. Upon successful completion of this course, the student should be able to:

  • Identify trade-offs among automation techniques and contrast alternatives for both on-line transaction processing and on-line analytical workloads.
  • Apply and customize state-of-the-art implementation techniques for single-node database management systems following modern coding practices.
  • Develop and justify design decisions for building tools, infrastructure, and system architectures to support autonomous operations.
  • Implement and evaluate database systems, with emphasis on providing experimental evidence for design decisions.
  • Interpret and comparatively criticize state-of-the-art research talks and papers, with emphasis on constructive improvements.

All programming projects will be completed in the NoisePage project for the PostgreSQL database management system.

Grading Scheme

This is a seminar course, so there are no exams. You will be graded on your participation in projects and presentations.

The final grade for the course will be based on the following weights:

Reading Assignments & Reviews

For each class, there is a set of assigned readings. Each student, unless presenting that day, is required to turn in a one-paragraph synopsis of the mandatory paper (denoted by the symbol on the course schedule). Students are encouraged to peruse the supplemental readings to enhance their knowledge about a particular topic, but this is not required. Students are allowed to miss reading review submissions for three classes during the semester. Late submissions will not be accepted without prior approval from the instructor.

Each review must include the following information:

  • An overview of the main idea and contributions (Three sentences).
  • Three strengths of the proposed method (One sentence each).
  • Three weaknesses of the proposed method (One sentence each).
  • The workloads that the authors used for their evaluation (One sentence).

Students must submit their synopsis using this Google Form before class begins. Late submissions will not be accepted without prior approval from the instructor.

WARNING: These reading reviews must be your own writing. You may not copy from the papers or other sources that you find on the web. Plagiarism will not be tolerated. See CMU's Policy on Academic Integrity for additional information.

Paper Presentations

Each student will choose at least two dates from the schedule and present the paper assigned on those days to the class. The talk should be an in-depth description and analysis of the assigned reading. The student should prepare to speak for 30 minutes, in a format similar to a conference presentation. Because it is the presenter's responsibility to teach the class about the paper, the presenter is expected to understand the key aspects of the material. It is therefore important to be prepared, which may require additional background reading. If you have questions about the content of your assigned papers, you should arrange to meet with the instructor well in advance of your talk date.

WARNING: It is acceptable for students to use information and content (e.g., images and graphics) found on the Internet, but the original source must be properly cited. No credit will be given for presentations without proper citations. See CMU's Policy on Academic Integrity for additional information.

Project #1 — PostgreSQL Auto Tuner

This is an individual project (i.e., no groups). Students will be provided with instructions and sample data sets to evaluate their implementation. Grading will be based on both correctness and performance.

  • Release Date: Jan 24, 2022
  • Due Date: Feb 28, 2022 @ 11:59pm

Project #2 — Self-Driving Infrastructure

Students will organize into groups of three to build infrastructure in the NoisePage project to support self-driving operations. Each group will choose a project that (1) is relevant to the materials discussed in class, (2) requires a significant programming effort from all team members, and (3) is unique (i.e., two groups may not choose the same project topic). We will discuss this in more depth during class, though students are encouraged to start thinking early about projects that interest them. If a group is unable to come up with its own project idea, the instructor will suggest interesting topics. The projects will vary in both scope and topic, but they must satisfy these criteria.

  • Release Date: Mar 02, 2022
  • Due Date: Apr 06, 2022 @ 11:59pm

Project #3 — Group Project

The last programming project of this course will combine the group projects into a single platform.

  • Release Date: Apr 06, 2022
  • Due Date: May 07, 2022 @ 11:59pm