NoisePage Control Plane
Students: He-Wei Lee, Kushagra Singh
Source Code: https://github.com/cmu-db/noisepage-control
This project forms the infrastructure of the NoisePage self-driving DBMS. It comprises three microservices: (i) a primary worker, which coordinates with the primary database instance to collect metrics and execute recommended tuning actions, (ii) an exploratory worker, which runs simulations and collects training data, and (iii) a control plane, which maintains global state and orchestrates the entire database tuning life cycle. The control plane is a persistent service, while the workers are ephemeral services launched as and when required. Internally, NoisePage Control executes user-defined workflows (modelled as directed acyclic graphs) asynchronously, allowing users to plug in different approaches for tuning a target PostgreSQL database.
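The idea of running a workflow as a directed acyclic graph of asynchronous steps can be sketched as follows. This is a minimal illustration built on Python's asyncio and graphlib, not the project's actual code: the function `run_workflow`, the task signature, and the three example steps are all hypothetical names chosen for this sketch.

```python
import asyncio
from graphlib import TopologicalSorter
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical task type: an async callable that receives the results
# of its upstream tasks, keyed by task name.
Task = Callable[[Dict[str, Any]], Awaitable[Any]]


async def run_workflow(tasks: Dict[str, Task],
                       deps: Dict[str, List[str]]) -> Dict[str, Any]:
    """Run each task as soon as all of its dependencies have finished."""
    # Consuming static_order() raises graphlib.CycleError if deps is not a DAG.
    tuple(TopologicalSorter(deps).static_order())

    results: Dict[str, Any] = {}
    running: Dict[str, asyncio.Task] = {}

    async def run(name: str) -> None:
        upstream = deps.get(name, [])
        # Wait for every upstream task; independent tasks run concurrently.
        await asyncio.gather(*(running[d] for d in upstream))
        results[name] = await tasks[name]({d: results[d] for d in upstream})

    # All tasks are scheduled before any of them starts executing,
    # so `running` is fully populated when the first `run` body begins.
    for name in tasks:
        running[name] = asyncio.create_task(run(name))
    await asyncio.gather(*running.values())
    return results


# Hypothetical workflow steps, for illustration only.
async def collect_metrics(_: Dict[str, Any]) -> Dict[str, int]:
    await asyncio.sleep(0)  # stand-in for a call to a worker
    return {"dead_tuples": 120}


async def recommend(inputs: Dict[str, Any]) -> List[str]:
    metrics = inputs["collect_metrics"]
    return ["VACUUM"] if metrics["dead_tuples"] > 100 else []


async def apply_actions(inputs: Dict[str, Any]) -> int:
    return len(inputs["recommend"])


EXAMPLE_TASKS = {"collect_metrics": collect_metrics,
                 "recommend": recommend,
                 "apply_actions": apply_actions}
EXAMPLE_DEPS = {"collect_metrics": [],
                "recommend": ["collect_metrics"],
                "apply_actions": ["recommend"]}

if __name__ == "__main__":
    print(asyncio.run(run_workflow(EXAMPLE_TASKS, EXAMPLE_DEPS)))
```

Keeping the graph as plain data (`deps`) separate from the step implementations (`tasks`) is one way to make tuning approaches swappable: a different recommender can be dropped in without touching the executor.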
Extended Workload Forecasting
Students: Yingjie Ling, Jia Qi Dong, Wan Shen Lim
Source Code: https://github.com/cmu-db/noisepage-forecast
This project aims to forecast the future workload from a point-in-time dump of database state and the subsequent query log. Prior work has generally restricted itself to forecasting clustered arrival rates and/or reusing the original workload, which optimizes for the workload of the past. However, a truly self-driving database management system needs to anticipate the workload of the future. By using a mix of statistical and deep models, we aim to generate a forecast that can provide representative workloads to other self-driving components.
GaRBAGE -- GlobAl Rule-Based Action Generation & Enumeration
Students: Ying (Jenny) Jiang, Deepayan Patra, Mike Xu
Source Code: https://github.com/mkpjnx/noisepage-pilot/tree/main/action/generation
GaRBAGE is an action enumeration system that will guide the search and planning of actions in NoisePage. The space of all actions a self-driving DBMS can take to improve performance (e.g., building indexes, tuning knobs, adding materialized views) is far too vast to search exhaustively, and it can change as the schema evolves. Based on configurable "rules", GaRBAGE can restrict or expand the search space as needed, either to help generate training data for system models or to provide input to the search and planning modules.
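As a rough illustration of how configurable rules can widen or narrow an action space, the sketch below pairs generator rules (which propose candidate actions) with filter rules (which prune them). It is not GaRBAGE's actual rule format: `Table`, `index_rule`, and `enumerate_actions` are hypothetical names, and actions are represented simply as SQL strings.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Table:
    """Hypothetical minimal schema description."""
    name: str
    columns: Tuple[str, ...]


# A generator rule proposes candidate actions for a schema.
GeneratorRule = Callable[[Sequence[Table]], Iterator[str]]
# A filter rule returns True for actions that should stay in the search space.
FilterRule = Callable[[str], bool]


def index_rule(max_width: int) -> GeneratorRule:
    """Propose indexes over every column combination up to max_width columns."""
    def rule(schema: Sequence[Table]) -> Iterator[str]:
        for table in schema:
            for width in range(1, max_width + 1):
                for cols in combinations(table.columns, width):
                    yield f"CREATE INDEX ON {table.name} ({', '.join(cols)})"
    return rule


def enumerate_actions(schema: Sequence[Table],
                      generators: Sequence[GeneratorRule],
                      filters: Sequence[FilterRule] = ()) -> List[str]:
    """Apply all generator rules, drop duplicates, and keep actions passing every filter."""
    seen = set()
    actions: List[str] = []
    for generate in generators:
        for action in generate(schema):
            if action not in seen and all(keep(action) for keep in filters):
                seen.add(action)
                actions.append(action)
    return actions


EXAMPLE_SCHEMA = [Table("orders", ("id", "customer_id"))]
```

Raising `max_width` expands the space, while adding a filter such as `lambda a: "customer_id" in a` restricts it. Because the rules take the schema as input, re-running the enumeration after a schema change yields an updated action set.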
Metrics Forecasting
Students: Dhruv Arya, Neville Chima, Kai Franz
Source Code: https://github.com/kai-franz/metrics_forecasting
This project explores the benefits of time-series forecasting of internal PostgreSQL metrics such as dead tuple count and table size. While other approaches predict metrics from forecasted workloads, this project attempts to predict future metrics solely from previously observed metric values, using the NeuralProphet time-series forecasting library.
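To make "predicting a metric solely from its own history" concrete, here is a deliberately simple baseline: a least-squares linear trend fitted to past values and extrapolated forward. It is a stand-in written with the standard library only, not NeuralProphet and not the project's model; the function names and the sample dead-tuple counts are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of value = intercept + slope * t for t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return slope, intercept


def forecast_metric(values: Sequence[float], horizon: int) -> List[float]:
    """Extrapolate the fitted trend for the next `horizon` time steps."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + step) for step in range(horizon)]


# Hypothetical dead-tuple counts sampled at regular intervals.
EXAMPLE_DEAD_TUPLES = [100.0, 110.0, 120.0, 130.0]

if __name__ == "__main__":
    print(forecast_metric(EXAMPLE_DEAD_TUPLES, horizon=2))
```

A dedicated forecasting library can additionally capture seasonality and abrupt changes (for example, a dead tuple count dropping after a vacuum), which a straight-line fit cannot represent.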