CyVerse Connects Researchers to Jetstream2

Sept. 7, 2022

Now in production for the national research community, Jetstream2 democratizes access to high performance computing for more academic communities throughout the U.S.

Beginning with CyVerse’s Atmosphere Cloud service in 2011 and continuing with Indiana University and Texas Advanced Computing Center’s Jetstream Cloud in 2016, thousands of U.S. researchers have gained access to open public research cloud computing. These resources complement the National Science Foundation’s (NSF) other major facilities for high performance computing (HPC) and high throughput computing (HTC). Jetstream2 is accessible from a laptop or tablet, allowing researchers to explore and understand immense amounts of data hosted on commercial cloud and national data repositories. The first Jetstream supported computation, experimentation, and teaching, and researchers in a wide range of fields have benefitted from its focus on usability and support.

Jetstream2, in which CyVerse is a collaborative partner, is the next generation of public cloud. The system is designed to be user-friendly for researchers who have limited experience with cloud computing. It also serves smaller communities with no direct access to such resources. Jetstream2 will provide eight petaFLOPS of virtual supercomputing power and 17 petabytes of storage to simplify data analysis, boost discovery, and broaden availability of AI resources. Like CyVerse, this system will help more students gain access to cyberinfrastructure resources, better equipping them to participate fully in the evolving STEM workforce.

In addition, the Jetstream2 cloud environment has other benefits:

  • Offers a broad range of hardware and services, including larger and faster storage systems, graphics processing units (GPUs), large memory nodes, virtual clusters, and much more;
  • Is easy to expand and reconfigure, and can support diverse modes of on-demand access;
  • Provides infrastructure for science gateways, scientific databases, and other “always-on” services as well as access to on-demand interactive computing and data analysis resources;
  • Provides a core services model for a practical approach to distributed cloud computing that will give academic institutions an incentive to invest their own funds in new advanced cyberinfrastructure facilities.

Today’s data-driven research community relies on a complex ecosystem of applications that involve multiple cloud-native technologies (Kubernetes, NATS, etc.) to power their analyses. The scale of these analyses often requires high-powered GPUs for machine learning model training and artificial intelligence applications.

“Deploying these complex software components alongside novel hardware for each team is time consuming, so Jetstream2 provides the perfect platform that supports automation, for example, Infrastructure as Code, to configure and customize environments for our diverse user communities in CyVerse,” said CyVerse Co-PI Nirav Merchant.

CyVerse has also been developing the next generation of cloud native tooling built on Kubernetes, Terraform, and event-driven services using Jetstream2. This technology, CyVerse CACAO, will enable researchers to leverage contemporary continuous analysis, GitOps, and MLOps workflows using multiple clouds.

CyVerse Co-PI Tyson Swetnam is an early adopter of Jetstream2. “Using Jetstream2’s extensible cloud infrastructure, we’ve successfully conducted workshops and training events this summer with our NSF AI Institute partners (AIIRA) for a Deep Learning Workshop series and at CompBio2022 Asia Workshop in Thailand this summer,” said Swetnam.

CyVerse is now onboarding several other collaborations and projects with its Jetstream2 start-up allocations, including:

  • CyVerse will use Jetstream2 to power its cloud native resource CACAO and Discovery Environment testing;
  • ESIIL, a recently awarded NSF ecology synthesis center, will be using Jetstream2 to power JupyterHubs for educational workshops, and for running the NASA ImgSPEC JupyterHub; and
  • The California eDNA project with UC Santa Cruz will use Jetstream2 to run large BLAST analyses of environmental DNA.

During the early operations period, the Jetstream2 team successfully integrated the resource into the NSF Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) project that launched on September 1st; ACCESS is the successor to the Extreme Science and Engineering Discovery Environment (XSEDE). Through recent funding in support of this transition, and in support of a successful NSF panel review, the Jetstream2 project now has spending authority for $14.5M of an anticipated $24.5M over the project lifetime.

“We intend Jetstream2 to be a democratizing force within the NSF ecosystem, allowing researchers and educators access to cutting-edge resources regardless of project scale,” said David Hancock, director of advanced cyberinfrastructure, University Information Technology Services, Indiana University.

About Jetstream2

The Jetstream2 project is led by Research Technologies, a division of University Information Technology Services (UITS) and a center in the Pervasive Technology Institute (PTI) at Indiana University (IU). Jetstream2’s primary cloud is at Indiana University Bloomington, with regional clouds at Arizona State University, the Cornell University Center for Advanced Computing, University of Hawaiʻi, and the Texas Advanced Computing Center in Austin, Texas. Additional partnerships with the University of Arizona/CyVerse, Johns Hopkins University, and University Corporation for Atmospheric Research (UCAR) will contribute to Jetstream2’s unparalleled usability and support for a broad range of scientific efforts. Learn more about Jetstream2.
