
When we think of science we don’t immediately think of quality assurance, and yet scientists have to run their codes somewhere: they need hardware, the hardware needs software, and the software needs to be operated reliably and efficiently. Enter the Quality Assurance (QA) team of TeraGrid, the most powerful open science resource.

Shava Smallen, the co-lead of the TeraGrid QA team, told me recently about their first venture into infrastructure clouds. The TeraGrid Science Gateway projects had been experiencing scalability problems with grid infrastructure. A potential solution emerged in the form of GRAM5; the scientists developed scalability tests to see if it solved their problem, but where could they run them? They tried Ranger, a top-of-the-line TeraGrid resource at the Texas Advanced Computing Center (TACC). But Ranger is a powerful resource, very much in demand for large scientific computations that cannot run elsewhere, and thus the QA team found itself with tests all ready to run but no resources to run them on.

Fortunately, FutureGrid, which includes several IaaS clouds, had just become available. The QA team quickly put together a virtual cluster, similar to a typical TeraGrid configuration but using GRAM5, and deployed it on the FutureGrid Nimbus cloud at the University of Florida. They ran, they scaled, they reported… While they were at it, they tested GridFTP for scalability as well, running data transfers between two Nimbus clouds at Florida and San Diego.

For the QA team, the ability to find resources on demand, and to find them without disrupting production scientific runs, was the key motivator for turning to cloud computing. It is not hard to imagine many other use cases in TeraGrid with similar requirements. For example, users could use virtual clusters, configured to represent the environment deployed on Ranger or other TeraGrid resources, as a development or debugging platform. In addition to providing on-demand availability, such resources would also give them root access, which is often important for debugging but not available on many TeraGrid resources. To leverage cloud computing for science we don’t have to wait to decide whether it will support all HPC applications: there are plenty of places where it can be useful now.