James Mickens
https://mickens.seas.harvard.edu/wisdom-james-mickens
- The writer of The Night Watch
1. Papers he likes:
Here are some of my favorite computer science papers, in no particular order.
“Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications” by Stoica et al: This paper describes how a set of nodes can distribute data across themselves in a way that enables efficient lookup, but does not require a centralized coordinator. This is one of the definitive papers on peer-to-peer systems, and it demonstrates the challenges of creating a large-scale application that cannot rely on a single, global coordinator. The Pastry paper by Rowstron and Druschel describes another influential system for peer-to-peer object location. If those papers arouse the fires of science in you, then you should read about the kinds of applications that can be built atop p2p routing layers (e.g., the PAST paper by Rowstron and Druschel, or the Pastiche paper by Cox et al).
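To make the "lookup without a coordinator" idea concrete, here is a minimal Python sketch of the key-to-node mapping that Chord builds on: nodes and keys are hashed onto the same identifier ring, and each key belongs to the first node at or after it (its successor). This is only an illustration under loud assumptions. The names `ident`, `Ring`, and `successor`, the 32-bit ring size, and the node names are all hypothetical, and the sketch cheats by keeping a sorted list of every node in one place. Real Chord nodes have no such global view; each one keeps a small routing table and forwards lookups across the ring in a logarithmic number of hops.

```python
import hashlib
from bisect import bisect_left

M = 32  # identifier bits; an assumed value chosen for illustration


def ident(name: str) -> int:
    """Hash a node name or a key onto the 2**M identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    """Toy ring with a global view of all nodes (real Chord has no such view)."""

    def __init__(self, node_names):
        # Sort nodes by their position on the ring.
        self.nodes = sorted((ident(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key: str) -> str:
        """Return the node responsible for key: the first node id >= the key id."""
        idx = bisect_left(self.ids, ident(key))
        # Wrap around to the first node if the key hashes past the last node.
        return self.nodes[idx % len(self.nodes)][1]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owner = ring.successor("some-file.txt")
```

One property worth noticing in this toy version: if a fifth node joins, the only keys that change owner are the ones the new node takes over, which is why this style of placement copes well with nodes coming and going.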
“MapReduce: Simplified Data Processing on Large Clusters” by Dean and Ghemawat: MapReduce is a popular programming model for performing big-data computations. When the paper first came out, everybody was like THIS IS AMAZING. There was some push-back from the database community, who said that MapReduce ignores decades of results from database research (see DeWitt and Stonebraker’s fantastically-titled blog post from 2008 entitled “MapReduce: A major step backwards”). People continue to debate about the best ways to analyze large data sets. For example, for some data sets representing complex graphs (e.g., the Twitter “X follows Y” graph), it may be faster to analyze the data on a single machine rather than on a distributed computing cluster that contains multiple machines (see Frank McSherry’s recent blog post). There’s also debate about the appropriate way to build key/value stores. If you’re stuck at the airport one day, type something like “MongoDB good or bad” into the Internet and see what happens. Answer: You’re at an airport, so the Internet connection sucks and nothing happens. You’ve stepped into my trap! Meanwhile, I’m at your house, subtly rearranging all of your pillows into non-optimal configurations.
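For readers who have not seen the programming model itself, here is a minimal single-process Python sketch of the map, group-by-key, and reduce steps, using word counting as the example. The function names `map_fn`, `reduce_fn`, and `map_reduce` and the sample documents are hypothetical, and nothing here is distributed; a real MapReduce system spreads these phases over many machines and handles scheduling and failures.

```python
from collections import defaultdict


def map_fn(doc):
    """Map phase: emit a (word, 1) pair for every word in one document."""
    for word in doc.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce phase: combine all values emitted for one key."""
    return word, sum(counts)


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every input, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)  # stands in for the shuffle step
    return dict(reduce_fn(key, values) for key, values in groups.items())


counts = map_reduce(["the cat sat", "the dog sat down"], map_fn, reduce_fn)
```

Because the user supplies only `map_fn` and `reduce_fn`, the same driver could compute other per-key aggregates, which is roughly the appeal of the model.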
“Hacking Blind” by Bittau et al: A deeply disturbing paper about how a malicious client can launch buffer overflow attacks on a server even if the attacker has no access to the server’s binary or source code, and even if the server uses stack canaries and address space randomization. Reading about these attack methods is like watching a documentary about those horrible goblin fish that live at the bottom of the ocean and use bioluminescence to spread evil. Even if you don’t think about them, they’re thinking about you.