Last week (Nov 13-19) was the annual Supercomputing (SC) conference. This year it was held in New Orleans, Louisiana. Cloud computing was featured by vendors and speakers throughout the conference. There were far too many cool products, talks, and papers to mention in a single post, but a few of the highlights that we are glad we caught in person include:
- In “Building the World’s Largest HPC Cloud,” two representatives from Platform Computing presented a large-scale cloud deployment being tested at the CERN laboratory. CERN is testing Platform ISF to run scientific jobs in a virtualized environment. Results included reports of launching several thousand VMs and a comparison of image distribution techniques.
- In “Virtualization for HPC,” members of the academic (Ohio State University, ORNL) and industrial (VMware, Univa UD, Deopli) communities shared their vision of a future for virtualization technologies in HPC. Topics discussed included proactive fault tolerance using migration, virtualized access to high-performance interconnects, and new hypervisor technologies designed for exascale computing.
- In “Low Latency, High Throughput, RDMA and the Cloud in Between,” representatives from Mellanox, Dell, and AMD discussed the advantages of cloud computing and highlighted the importance of reducing latency and increasing throughput for scientific communities. RDMA over Converged Ethernet (RoCE) was emphasized as a specific effort toward reducing latency in virtualized environments.
- The work in “Elastic Cloud Caches for Accelerating Service-Oriented Computations” demonstrated a dynamic and fast memory-based cache using IaaS resources, specifically for a geoinformatics cyberinfrastructure. The system responds to changes in demand by dynamically adding or removing IaaS nodes from the cache.
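To make the elastic cache idea from the last item a bit more concrete, here is a minimal sketch of the general technique, not the authors' implementation. It assumes a simple threshold policy on requests per node, and plain Python dicts stand in for IaaS instances. The class name `ElasticCache`, the `high` and `low` thresholds, and the `rebalance` method are all hypothetical names chosen for illustration.

```python
import hashlib


class ElasticCache:
    """Toy in-memory cache spread over a variable number of 'nodes'.

    Each node is a plain dict standing in for an IaaS instance; a real
    system would launch or terminate virtual machines instead.
    """

    def __init__(self, high=100, low=20):
        self.nodes = [{}]    # start with a single node
        self.high = high     # requests per node that trigger growth (assumed policy)
        self.low = low       # requests per node that trigger shrinkage (assumed policy)
        self.requests = 0    # lookups seen in the current monitoring interval

    def _node_for(self, key):
        # Stable hash so a key always maps to the same node for a given node count.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get(self, key):
        self.requests += 1
        return self._node_for(key).get(key)

    def put(self, key, value):
        self._node_for(key)[key] = value

    def _resize(self, count):
        # Collect every entry, change the node count, then redistribute.
        entries = [item for node in self.nodes for item in node.items()]
        self.nodes = [{} for _ in range(count)]
        for key, value in entries:
            self.put(key, value)

    def rebalance(self):
        """Call once per monitoring interval; returns the new node count."""
        load = self.requests / len(self.nodes)
        if load > self.high:
            self._resize(len(self.nodes) + 1)
        elif load < self.low and len(self.nodes) > 1:
            self._resize(len(self.nodes) - 1)
        self.requests = 0
        return len(self.nodes)
```

Calling `rebalance()` at a fixed interval grows the cache by one node when lookups per node exceed `high` and shrinks it by one when they fall below `low`. The modulo placement used here relocates most keys on every resize; consistent hashing is the usual way to reduce that movement.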
In addition to featuring in some great talks and sessions, cloud resources were also involved in a handful of demos and tutorials. In particular, Purdue demoed Springboard, a “hub” for working with NSF’s TeraGrid infrastructure. The hub provides a central point for researchers to collaborate and removes the need for researchers to rely strictly on the command line when interacting with the TeraGrid’s resources. Springboard also interfaces with the TeraGrid’s first cloud resource, Wispy, at Purdue. The National Center for Atmospheric Research (NCAR) and the University of Colorado at Boulder used 150 Amazon EC2 instances for the Linux Cluster Construction tutorial. The virtual machines were launched on demand on the morning of the tutorial and provided participants with a realistic software environment for configuring and deploying a Linux cluster using a variety of open source tools such as Open MPI, Torque, and Ganglia.
With cloud computing becoming ever more popular at SC, it would be cool to see an HPC challenge category for cloud computing, perhaps running on Amazon’s cluster compute resource, which just last week was officially included in the Top500 at number 231.