John Wilkes
Location: Mountain View, CA
Personal Research Web Page: http://www.e-wilkes.com/john
Keywords: SLAs, service level agreements, autonomics, self-management, large-scale systems, storage, self-healing, self-*, cloud computing
Posted on: Thursday, May 19th, 2011
Broad Research Area: Networks / Operating Systems, Numerical/Scientific Computing / HPC / Data-Intensive Scalable Computing
Research Interests:
I have broad interests in large-scale distributed systems. I’d be delighted to collaborate with somebody on either of my two main activities:
1. Cluster management – Google’s term for the systems that allocate work to our fleet of computers, solving what are effectively large-scale bin-packing problems under a raft of constraints, failures, and uncertainty. We are pushing the envelope of sophistication and capability in managing clusters of machines, and I’m open to collaborating on a wide range of topics, including scheduling, failure modeling, configuration management, and a variety of automation approaches.
2. Using SLAs (Service Level Agreements) to improve our ability to delegate control to automated decision-making systems when they face trade-offs. Work here includes control mechanisms, control systems, feedback loops, and system design. SLAs need a way to specify the consequences of meeting or missing their objectives, and economic feedback mechanisms seem a particularly powerful way to achieve this.
As you might imagine, Google provides a great many really exciting opportunities to try these ideas out at significant scale, on real production systems, with real-world problems and requirements. It’s an ideal proving ground for real systems work in support of a wide range of uses, including cloud computing. We are also looking to publish more papers in these areas, so publication would be an explicit goal of supporting this program.
Please contact me if you are interested; I’d be happy to discuss opportunities. I’m based in Mountain View, California.
