The highlighted papers in each section are the primary readings and should be emphasized in class.
Sep 11, 2013 - History of Database Systems + Distributed Databases
- M. Stonebraker, et al., What Goes Around Comes Around, Readings in Database Systems, 4th Edition, 2006
- D. DeWitt, et al., Parallel Database Systems: The Future Of High Performance Database Systems, Communications of the ACM, 1992
- A. Halevy, et al., The Unreasonable Effectiveness Of Data, IEEE Intelligent Systems, 2009
- M. Stonebraker, et al., Intel "Big Data" Science And Technology Center Vision And Execution Plan, SIGMOD Record, 2013
Sep 16, 2013 - Distributed Transactions
- P.A. Bernstein, et al., Concurrency Control In Distributed Database Systems, ACM Comput. Surv., 1981
- G. Samaras, et al., Two-Phase Commit Optimizations And Tradeoffs In The Commercial Environment, ICDE, 1993
- C. Mohan, et al., Transaction Management In The R* Distributed Database Management System, TODS, 1986
- B. Lampson, et al., A New Presumed Commit Optimization For Two Phase Commit, VLDB, 1993
- P. Helland, Life Beyond Distributed Transactions: An Apostate's Opinion, CIDR, 2007
Sep 18, 2013 - Consensus Protocols
- L. Lamport, Paxos Made Simple, ACM SIGACT News, 2001
- T. Chandra, et al., Paxos Made Live, PODC, 2007
- L. Lamport, The Part-Time Parliament, ACM TOCS, 1998
- H. Robinson, Consensus Protocols: Paxos, Online, 2009
Sep 30, 2013 - NoSQL I
- G. DeCandia, et al., Dynamo: Amazon's Highly Available Key-Value Store, SOSP, 2007
- B. Cooper, et al., PNUTS: Yahoo!'s Hosted Data Serving Platform, VLDB, 2008
- W. Vogels, Eventually Consistent, ACM Queue, 2009
- A. Lakshman, et al., Cassandra - A Decentralized Structured Storage System, SIGOPS Operating Systems Review, 2010
Oct 02, 2013 - NoSQL II
- F. Chang, et al., Bigtable: A Distributed Storage System For Structured Data, OSDI, 2006
- J. Baker, et al., Megastore: Providing Scalable, Highly Available Storage For Interactive Services, CIDR, 2011
- M. Stonebraker, et al., SQL Databases v. NoSQL Databases, Communications of the ACM, 2010
Oct 14, 2013 - NewSQL I
- M. Stonebraker, et al., The End Of An Architectural Era: (It's Time For A Complete Rewrite), VLDB, 2007
- A. Thomson, et al., Calvin: Fast Distributed Transactions For Partitioned Database Systems, SIGMOD, 2012
- M. Aslett, How Will The Database Incumbents Respond To NoSQL And NewSQL?, 451 Group, 2010
- M. Stonebraker, New Opportunities For NewSQL, Communications of the ACM, 2012
- M. Stonebraker, et al., Ten Rules For Scalable Performance In 'Simple Operation' Datastores, Communications of the ACM, 2011
- A. Thomson, et al., The Case For Determinism In Database Systems, VLDB, 2010
Oct 16, 2013 - NewSQL II
- C. Curino, et al., Schism: A Workload-Driven Approach To Database Replication And Partitioning, VLDB, 2010
- A. Pavlo, et al., Skew-Aware Automatic Database Partitioning In Shared-Nothing, Parallel OLTP Systems, SIGMOD, 2012
- C. Curino, et al., Relational Cloud: A Database Service For The Cloud, CIDR, 2011
- A. Pavlo, et al., On Predictive Modeling For Optimizing Transaction Execution In Parallel OLTP Systems, VLDB, 2011
Oct 21, 2013 - Distributed Data Stores I
- J.C. Corbett, et al., Spanner: Google's Globally-Distributed Database, OSDI, 2012
- J. Shute, et al., F1: A Distributed SQL Database That Scales, VLDB, 2013
- M. Demirbas, Overview Of Spanner, Online, 2013
Oct 23, 2013 - Distributed Data Stores II
- L. Qiao, et al., On Brewing Fresh Espresso: LinkedIn's Distributed Data Serving Platform, SIGMOD, 2013
- N. Bronson, et al., TAO: Facebook's Distributed Data Store For The Social Graph, USENIX ATC, 2013
Oct 28, 2013 - Distributed Stream Processing
- T. Akidau, et al., MillWheel: Fault-Tolerant Stream Processing At Internet Scale, VLDB, 2013
- L. Abraham, et al., Scuba: Diving Into Data At Facebook, VLDB, 2013
- L. Neumeyer, et al., S4: Distributed Stream Computing Platform, ICDMW, 2010
- M. Zaharia, et al., Discretized Streams: An Efficient And Fault-Tolerant Model For Stream Processing On Large Clusters, HotCloud, 2012
Oct 30, 2013 - Alternative Data Storage & Models
- D. Abadi, et al., Column-Stores vs. Row-Stores: How Different Are They Really?, SIGMOD, 2008
- P.G. Brown, Overview Of SciDB: Large Scale Array Storage, Processing And Analysis, SIGMOD, 2010
Nov 04, 2013 - Data Warehouses I
- A. Pavlo, et al., A Comparison Of Approaches To Large-Scale Data Analysis, SIGMOD, 2009
- A. Abouzied, et al., HadoopDB: An Architectural Hybrid Of MapReduce And DBMS Technologies For Analytical Workloads, VLDB, 2009
- A. Thusoo, et al., Hive: A Warehousing Solution Over A MapReduce Framework, VLDB, 2009
Nov 06, 2013 - Data Warehouses II
- S. Melnik, et al., Dremel: Interactive Analysis Of Web-Scale Datasets, VLDB, 2010
- R. Xin, et al., Shark: SQL And Rich Analytics At Scale, SIGMOD, 2013
Nov 11, 2013 - Machine Learning Systems I
- G. Malewicz, et al., Pregel: A System For Large-Scale Graph Processing, SIGMOD, 2010
- M. Zaharia, et al., Resilient Distributed Datasets: A Fault-Tolerant Abstraction For In-Memory Cluster Computing, NSDI, 2012
Nov 13, 2013 - Machine Learning Systems II
- Y. Low, et al., Distributed GraphLab: A Framework For Machine Learning And Data Mining In The Cloud, VLDB, 2012
- A. Kyrola, et al., GraphChi: Large-Scale Graph Computation On Just A PC, OSDI, 2012
- Y. Low, et al., GraphLab: A New Parallel Framework For Machine Learning, UAI, 2010
Nov 18, 2013 - In Situ Data Processing
- I. Alagiannis, et al., NoDB: Efficient Query Execution On Raw Data Files, SIGMOD, 2012
- A. Abouzied, et al., Invisible Loading: Access-Driven Data Transfer From Raw Files Into Database Systems, EDBT, 2013
Nov 20, 2013 - OLTP/OLAP Hybrids
- A. Kemper, et al., HyPer: A Hybrid OLTP & OLAP Main Memory Database System Based On Virtual Memory Snapshots, ICDE, 2011
- V. Sikka, et al., Efficient Transaction Processing In SAP HANA Database: The End Of A Column Store Myth, SIGMOD, 2012
- J. Lee, et al., High-Performance Transaction Processing In SAP HANA, IEEE Data Engineering Bulletin, 2013
- J. Dittrich, et al., Towards A One-Size Fits All DB Architecture, CIDR, 2011
- M. Grund, et al., HYRISE: A Main Memory Hybrid Storage Engine, VLDB, 2010
- T. Mühlbauer, et al., ScyPer: Elastic OLAP Throughput On Transactional Data, DanaC, 2013
Nov 25, 2013 - Crowdsourcing
- M. Franklin, et al., CrowdDB: Answering Queries With Crowdsourcing, SIGMOD, 2011
- M. Stonebraker, et al., Data Curation At Scale: The Data Tamer System, CIDR, 2013
- A. Parameswaran, et al., CrowdScreen: Algorithms For Filtering Data With Humans, SIGMOD, 2012
- A. Marcus, et al., Human-Powered Sorts And Joins, CIDR, 2013