CMU 15-799 - Spring 2022

Special Topics: Self-Driving Database Management Systems


[PRESENTATION] NoisePage Control Plane

NoisePage Control Plane

Students: He-Wei Lee, Kushagra Singh
Source Code:

This project forms the infrastructure of the NoisePage self-driving DBMS. It comprises three microservices: (i) a primary worker, which coordinates with the primary database instance to collect metrics and execute recommended tuning actions; (ii) an exploratory worker, which runs simulations and collects training data; and (iii) a control plane, which maintains global state and orchestrates the entire database tuning life cycle. The control plane is a persistent service, while the workers are ephemeral services launched as needed. Internally, the NoisePage control plane executes user-defined workflows (modelled as directed acyclic graphs) asynchronously, allowing users to plug in different approaches for tuning a target PostgreSQL database.
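As a rough illustration of executing a workflow modelled as a directed acyclic graph asynchronously, the following minimal sketch runs each step as soon as all of its upstream steps have finished. It is not the project's actual code: the step names (`collect_metrics`, `run_simulation`, `recommend`), the `run_workflow` helper, and the returned values are all hypothetical and exist only for this example.

```python
import asyncio
from graphlib import TopologicalSorter


async def run_workflow(tasks, deps):
    """Run async steps in dependency order, concurrently where the DAG allows.

    tasks: maps step name -> async callable taking a dict of upstream results.
    deps:  maps step name -> set of step names it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    results = {}
    running = {}  # asyncio.Task -> step name

    while sorter.is_active():
        # Launch every step whose dependencies are all complete.
        for name in sorter.get_ready():
            upstream = {d: results[d] for d in deps.get(name, ())}
            running[asyncio.create_task(tasks[name](upstream))] = name
        # Wait until at least one running step finishes, then unlock successors.
        done, _ = await asyncio.wait(running, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            name = running.pop(task)
            results[name] = task.result()
            sorter.done(name)
    return results


# Hypothetical steps standing in for worker interactions.
async def collect_metrics(upstream):
    await asyncio.sleep(0)
    return {"dead_tuples": 1200}


async def run_simulation(upstream):
    await asyncio.sleep(0)
    return {"candidates": ["create_index"]}


async def recommend(upstream):
    candidates = upstream["run_simulation"]["candidates"]
    return {"action": candidates[0], "based_on": upstream["collect_metrics"]}


TASKS = {
    "collect_metrics": collect_metrics,
    "run_simulation": run_simulation,
    "recommend": recommend,
}
DEPS = {
    "collect_metrics": set(),
    "run_simulation": set(),
    "recommend": {"collect_metrics", "run_simulation"},
}
```

Calling `asyncio.run(run_workflow(TASKS, DEPS))` runs the two independent steps concurrently and only then runs `recommend`. Swapping in a different tuning approach amounts to replacing entries in `TASKS` and `DEPS`.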

[PRESENTATION] Extended Workload Forecasting

Extended Workload Forecasting

Students: Yingjie Ling, Jia Qi Dong, Wan Shen Lim
Source Code:

This project aims to forecast the future workload from a point-in-time dump of the database state and the subsequent query log. Prior work has generally restricted itself to forecasting clustered arrival rates and/or reusing the original workload, which optimizes for the workload of the past. However, a truly self-driving database management system needs to anticipate the workload of the future. By using a mix of statistical and deep learning models, we aim to generate a forecast that can provide representative workloads to other self-driving components.

[PRESENTATION] GaRBAGE -- GlobAl Rule-Based Action Generation & Enumeration

GaRBAGE -- GlobAl Rule-Based Action Generation & Enumeration

Students: Ying (Jenny) Jiang, Deepayan Patra, Mike Xu
Source Code:

GaRBAGE is an action enumeration system that will guide the search and planning of actions in NoisePage. The space of all actions a self-driving DBMS can take to improve performance (e.g., building indexes, tuning knobs, adding materialized views) is far too vast to search exhaustively and can change as the schema evolves. Based on configurable "rules", GaRBAGE can restrict or expand the search space as needed, either to help generate training data for system models or to serve as input to the search and planning modules.
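The idea of rules that expand or restrict an action search space can be sketched as follows. This is only an illustrative sketch, not GaRBAGE's actual design: the `Action` type, the rule functions, and the schema representation (a mapping from table name to column names) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Action:
    kind: str        # e.g. "create_index" or "set_knob"
    target: str      # what the action applies to
    value: str = ""  # optional setting, used for knobs


Schema = Dict[str, List[str]]  # table name -> column names (assumed shape)
Rule = Callable[[Schema], Iterable[Action]]


def single_column_indexes(schema: Schema) -> Iterable[Action]:
    """Expanding rule: propose one index per column of every table."""
    for table, columns in schema.items():
        for column in columns:
            yield Action("create_index", f"{table}({column})")


def knob_values(knob: str, values: List[str]) -> Rule:
    """Expanding rule factory: propose each listed value for one knob."""
    def rule(schema: Schema) -> Iterable[Action]:
        for value in values:
            yield Action("set_knob", knob, value)
    return rule


def only_tables(rule: Rule, allowed: set) -> Rule:
    """Restricting wrapper: apply a rule to a subset of tables only."""
    def restricted(schema: Schema) -> Iterable[Action]:
        return rule({t: cols for t, cols in schema.items() if t in allowed})
    return restricted


def enumerate_actions(schema: Schema, rules: List[Rule]) -> List[Action]:
    """Apply every rule to the current schema, dropping duplicate actions."""
    seen = set()
    actions = []
    for rule in rules:
        for action in rule(schema):
            if action not in seen:
                seen.add(action)
                actions.append(action)
    return actions
```

For example, with a schema `{"orders": ["id", "customer_id"], "items": ["id"]}`, the rule list `[only_tables(single_column_indexes, {"orders"}), knob_values("work_mem", ["4MB", "64MB"])]` yields two index actions on `orders` and two knob actions. Because rules take the schema as input, re-running `enumerate_actions` after a schema change produces an updated action set.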

[PRESENTATION] Metrics Forecasting

Metrics Forecasting

Students: Dhruv Arya, Neville Chima, Kai Franz
Source Code:

This project explores the benefits of time-series forecasting of internal PostgreSQL metrics such as dead tuple count and table size. While there have been approaches that predict metrics based on forecasted workloads, this project attempts to predict future metrics solely on the basis of previously seen metric values, using the NeuralProphet time-series forecasting library.
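To make concrete what predicting a metric solely from its own history looks like, here is a deliberately simple stand-in that fits a least-squares linear trend to past observations and extrapolates it. It does not use NeuralProphet and is not the project's model; the function name `forecast_linear` and the sample values are hypothetical.

```python
def forecast_linear(history, steps):
    """Extrapolate a least-squares linear trend fitted to past metric values.

    history: evenly spaced past observations (e.g. dead tuple counts).
    steps:   number of future points to predict.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    # Observations are indexed 0..n-1, so the mean index is (n - 1) / 2.
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Future points continue the index sequence at n, n + 1, ...
    return [intercept + slope * (n + k) for k in range(steps)]
```

The only input is the metric's own past values; no workload forecast is involved. A library such as NeuralProphet would replace the straight-line fit with a richer model (for instance, one that captures seasonality), but the input/output shape of the problem stays the same.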