Papers
A curated collection of research papers and blogs across systems, databases, and infrastructure.
-
Google File System ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/4446.pdf
-
MapReduce: Simplified Data Processing on Large Clusters ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/4449.pdf
-
Bigtable: A Distributed Storage System for Structured Data ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/4443.pdf
-
Colossus (successor to GFS) ↗
https://cloud.google.com/blog/products/storage-data-transfer/a-peek-behind-colossus-googles-file-system
-
Megastore: Providing Scalable, Highly Available Storage for Interactive Services ↗
https://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
-
Monarch: Google's Planet-Scale In-Memory Time Series Database ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/6348.pdf
-
Chubby: The Lock Service for Loosely-Coupled Distributed Systems ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/4444.pdf
-
Spanner: Google's Globally-Distributed Database ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/1974.pdf
-
Spanner, TrueTime and the CAP Theorem ↗
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45855.pdf
-
Dapper: A Large-Scale Distributed Systems Tracing Infrastructure ↗
https://static.googleusercontent.com/media/research.google.com/en//archive/papers/dapper-2010-1.pdf
-
Borg: Large-Scale Cluster Management at Google (EuroSys 2015) ↗
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf
-
Zanzibar: Google's Consistent, Global Authorization System ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/5068.pdf
-
Pregel: A System for Large-Scale Graph Processing ↗
https://15799.courses.cs.cmu.edu/fall2013/static/papers/p135-malewicz.pdf
-
Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google ↗
https://www.vldb.org/pvldb/vol14/p2986-sankaranarayanan.pdf
-
Napa: Progressive Partitioning for Parallelized Query Execution ↗
https://www.vldb.org/pvldb/vol16/p3475-sankaranarayanan.pdf
-
F1: A Distributed SQL Database That Scales ↗
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41344.pdf
-
Mesa: Data Warehousing System ↗
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42851.pdf
-
Google Firestore ↗
https://storage.googleapis.com/gweb-research2023-media/pubtools/7076.pdf
-
Amazon Aurora DB Architecture ↗
https://pages.cs.wisc.edu/~yxy/cs764-f20/papers/aurora-sigmod-17.pdf
-
DynamoDB: A NoSQL Database Service ↗
https://assets.amazon.science/33/9d/b77f13fe49a798ece85cf3f9be6d/amazon-dynamodb-a-scalable-predictably-performant-and-fully-managed-nosql-database-service.pdf
-
FoundationDB: A Distributed Unbundled Transactional Key Value Store ↗
https://www.foundationdb.org/files/fdb-paper.pdf
-
TikTok Monolith: Embedding in Real-Time ↗
https://arxiv.org/pdf/2209.07663
-
Gorilla: Time Series Database ↗
https://www.vldb.org/pvldb/vol8/p1816-teller.pdf
-
Cassandra: A Decentralized Structured Storage System ↗
https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
-
Memcache: In-Memory Cache at Facebook ↗
https://scontent.fdel29-1.fna.fbcdn.net/v/t39.8562-6/240873052_277412237132971_6278324660880331641_n.pdf