15-799 :: Special Topics in Database Systems

Fall 2013

  • Instructor: Andy Pavlo
  • Time: Mon/Wed 12:00 - 1:20
  • Location: NSH 3002

Reading List

The highlighted papers in each section are the primary readings and should be emphasized in class.

Sep 11, 2013 - History of Database Systems + Distributed Databases

  1. M. Stonebraker, et al., What Goes Around Comes Around, Readings in Database Systems, 4th Edition, 2006
  2. D. DeWitt, et al., Parallel Database Systems: The Future Of High Performance Database Systems, Communications of the ACM, 1992
  3. A. Halevy, et al., The Unreasonable Effectiveness Of Data, IEEE Intelligent Systems, 2009
  4. M. Stonebraker, et al., Intel "Big Data" Science And Technology Center Vision And Execution Plan, SIGMOD Record, 2013

Sep 16, 2013 - Distributed Transactions

  1. P.A. Bernstein, et al., Concurrency Control In Distributed Database Systems, ACM Comput. Surv., 1981
  2. G. Samaras, et al., Two-Phase Commit Optimizations And Tradeoffs In The Commercial Environment, ICDE, 1993
  3. C. Mohan, et al., Transaction Management In The R* Distributed Database Management System, TODS, 1986
  4. B. Lampson, et al., A New Presumed Commit Optimization For Two Phase Commit, VLDB, 1992
  5. P. Helland, Life Beyond Distributed Transactions: An Apostate's Opinion, CIDR, 2007

Sep 18, 2013 - Consensus Protocols

  1. L. Lamport, Paxos Made Simple, ACM SIGACT News, 2001
  2. T. Chandra, et al., Paxos Made Live, PODC, 2007
  3. L. Lamport, The Part-Time Parliament, ACM TOCS, 1998
  4. H. Robinson, Consensus Protocols: Paxos, Online, 2009

Sep 30, 2013 - NoSQL I

  1. G. DeCandia, et al., Dynamo: Amazon's Highly Available Key-Value Store, SOSP, 2007
  2. B. Cooper, et al., PNUTS: Yahoo!'s Hosted Data Serving Platform, VLDB, 2008
  3. W. Vogels, Eventually Consistent, ACM Queue, 2009
  4. A. Lakshman, Cassandra - A Decentralized Structured Storage System, SIGOPS Operating Systems Review, 2010

Oct 02, 2013 - NoSQL II

  1. F. Chang, et al., Bigtable: A Distributed Storage System For Structured Data, OSDI, 2006
  2. J. Baker, et al., Megastore: Providing Scalable, Highly Available Storage For Interactive Services, CIDR, 2011
  3. M. Stonebraker, et al., SQL Databases v. NoSQL Databases, Communications of the ACM, 2010

Oct 14, 2013 - NewSQL I

  1. M. Stonebraker, et al., The End Of An Architectural Era: (It's Time For A Complete Rewrite), VLDB, 2007
  2. A. Thomson, et al., Calvin: Fast Distributed Transactions For Partitioned Database Systems, SIGMOD, 2012
  3. M. Aslett, How Will The Database Incumbents Respond To NoSQL And NewSQL?, 451 Group, 2010
  4. M. Stonebraker, New Opportunities For NewSQL, Communications of the ACM, 2012
  5. M. Stonebraker, et al., Ten Rules For Scalable Performance In 'Simple Operation' Datastores, Communications of the ACM, 2011
  6. A. Thomson, et al., The Case For Determinism In Database Systems, VLDB, 2010

Oct 16, 2013 - NewSQL II

  1. C. Curino, et al., Schism: A Workload-Driven Approach To Database Replication And Partitioning, VLDB, 2010
  2. A. Pavlo, et al., Skew-Aware Automatic Database Partitioning In Shared-Nothing, Parallel OLTP Systems, SIGMOD, 2012
  3. C. Curino, et al., Relational Cloud: A Database Service For The Cloud, CIDR, 2011
  4. A. Pavlo, et al., On Predictive Modeling For Optimizing Transaction Execution In Parallel OLTP Systems, VLDB, 2011

Oct 21, 2013 - Distributed Data Stores I

  1. J.C. Corbett, et al., Spanner: Google's Globally-Distributed Database, OSDI, 2012
  2. J. Shute, et al., F1: A Distributed SQL Database That Scales, VLDB, 2013
  3. M. Demirbas, Overview Of Spanner, Online, 2013

Oct 23, 2013 - Distributed Data Stores II

  1. L. Qiao, et al., On Brewing Fresh Espresso: LinkedIn's Distributed Data Serving Platform, SIGMOD, 2013
  2. N. Bronson, et al., TAO: Facebook's Distributed Data Store For The Social Graph, USENIX ATC, 2013

Oct 28, 2013 - Distributed Stream Processing

  1. T. Akidau, et al., MillWheel: Fault-Tolerant Stream Processing At Internet Scale, VLDB, 2013
  2. L. Abraham, et al., Scuba: Diving Into Data At Facebook, VLDB, 2013
  3. L. Neumeyer, et al., S4: Distributed Stream Computing Platform, ICDMW, 2010
  4. M. Zaharia, et al., Discretized Streams: An Efficient And Fault-Tolerant Model For Stream Processing On Large Clusters, HotCloud, 2012

Oct 30, 2013 - Alternative Data Storage & Models

  1. D. Abadi, et al., Column-Stores vs. Row-Stores: How Different Are They Really?, SIGMOD, 2008
  2. P.G. Brown, Overview Of SciDB: Large Scale Array Storage, Processing And Analysis, SIGMOD, 2010

Nov 04, 2013 - Data Warehouses I

  1. A. Pavlo, et al., A Comparison Of Approaches To Large-Scale Data Analysis, SIGMOD, 2009
  2. A. Abouzied, et al., HadoopDB: An Architectural Hybrid Of MapReduce And DBMS Technologies For Analytical Workloads, VLDB, 2009
  3. A. Thusoo, et al., Hive: A Warehousing Solution Over A MapReduce Framework, VLDB, 2009

Nov 06, 2013 - Data Warehouses II

  1. S. Melnik, et al., Dremel: Interactive Analysis Of Web-Scale Datasets, VLDB, 2010
  2. R. Xin, et al., Shark: SQL And Rich Analytics At Scale, SIGMOD, 2013

Nov 11, 2013 - Machine Learning Systems I

  1. G. Malewicz, et al., Pregel: A System For Large-Scale Graph Processing, SIGMOD, 2010
  2. M. Zaharia, et al., Resilient Distributed Datasets: A Fault-Tolerant Abstraction For In-Memory Cluster Computing, NSDI, 2012

Nov 13, 2013 - Machine Learning Systems II

  1. Y. Low, et al., Distributed GraphLab: A Framework For Machine Learning And Data Mining In The Cloud, VLDB, 2012
  2. A. Kyrola, et al., GraphChi: Large-Scale Graph Computation On Just A PC, OSDI, 2012
  3. Y. Low, et al., GraphLab: A New Parallel Framework For Machine Learning, UAI, 2010

Nov 18, 2013 - In Situ Data Processing

  1. I. Alagiannis, et al., NoDB: Efficient Query Execution On Raw Data Files, SIGMOD, 2012
  2. A. Abouzied, et al., Invisible Loading: Access-Driven Data Transfer From Raw Files Into Database Systems, EDBT, 2013

Nov 20, 2013 - OLTP/OLAP Hybrids

  1. A. Kemper, et al., HyPer: A Hybrid OLTP & OLAP Main Memory Database System Based On Virtual Memory Snapshots, ICDE, 2011
  2. V. Sikka, et al., Efficient Transaction Processing In SAP HANA Database: The End Of A Column Store Myth, SIGMOD, 2012
  3. J. Lee, et al., High-Performance Transaction Processing In SAP HANA, IEEE Data Engineering Bulletin, 2013
  4. J. Dittrich, et al., Towards A One-Size Fits All DB Architecture, CIDR, 2011
  5. M. Grund, et al., HYRISE: A Main Memory Hybrid Storage Engine, VLDB, 2010
  6. T. Mühlbauer, et al., ScyPer: Elastic OLAP Throughput On Transactional Data, DanaC, 2013

Nov 25, 2013 - Crowdsourcing

  1. M. Franklin, et al., CrowdDB: Answering Queries With Crowdsourcing, SIGMOD, 2011
  2. M. Stonebraker, et al., Data Curation At Scale: The Data Tamer System, CIDR, 2013
  3. A. Parameswaran, et al., CrowdScreen: Algorithms For Filtering Data With Humans, SIGMOD, 2012
  4. A. Marcus, et al., Human-Powered Sorts And Joins, VLDB, 2011

© Carnegie Mellon University