Objectives & Deliverables

New Features, New Science

CyVerse strives to create an innovative, comprehensive, generic, and foundational cyberinfrastructure (CI) in support of life science research. CyVerse develops CI that uniquely enables scientists across the diverse fields of the life sciences to address Grand Challenge questions in new ways. The project also aims to stimulate and facilitate cross-disciplinary research, to promote interaction between biology and computer science research, and to train the next generation of scientists in the use of CI in research and education.

This page is updated quarterly as project teams evaluate deliverables and milestones. Below are the most current deliverables.

Deliverables For Project Year 2 (2019-2020)

[1.1] General O&M: Continue efforts to operate and maintain the infrastructure.

[1.2] Data Store:

  • Extend the ELK stack to obtain consumption metrics, and provide platforms for depositing consumption data
  • Provide store-and-forward policies and documentation for their use
  • Implement failover policies between UA and TACC
  • Implement support for read-only copies
  • Provide guidance for workflow systems (Pegasus) for data placement
  • Provide utilities to validate connection and performance for end users
  • Improve monitoring and reporting on network performance and on resource consumption at the user and team levels
  • Provide integrated distributed cloud storage and remote resource servers (at DS layer)
  • Provide integrated distributed cloud storage and remote resource servers (at API layer)
  • Provide support for offline storage inclusion into catalog
  • Provide support for metadata and policies for high-throughput (HT) imaging, phenotyping, and sensor data
  • Extend Elasticsearch to support temporal and spatial indexes for sensor data
  • Provide support for Elasticsearch-based analysis (R, Python, etc.) with content in Data Store
  • Provide improved support for Machine Learning methods to locate and extract training data
  • Provide improved support for FAIR Data Commons (ontology, metadata templates, etc.)
  • Provide improved support for publishing and integration with external repositories (SRA, Hubs)

[1.3] Atmosphere:

  • Maintain usability and robustness
  • Provide instances and support for specific communities

[1.4.1] Discovery Environment:

  • Document the DE public API
  • Develop ontology-based metadata management
  • Improve DE architecture to support increased volume of jobs
  • Fix various minor bugs
  • Test for quality assurance on DE release
  • Author documentation on DE release

[1.4.2] BisQue:

  • Continue to update BisQue
  • Continue to support herbariums (NSF ADBC) to manage their image collections

[1.5] Science APIs:

  • Continue to improve support for cloud-native technologies
  • Implement new types of cloud-based remote storage

[2.1] External Collaborative Partnerships: Continue to nurture extended community partnerships, including:

  • Partner with Genomes to Fields project to support breeding and ecological research
  • Partner with Legume Federation
  • Partner with Chequamegon Heterogeneous Ecosystem Energy-balance Study Enabled by a High-density Extensive Array of Detectors (CHEESEHEAD19)
  • Assist with resource acquisition to meet the needs of their user communities
  • Share best practices for meeting the needs of their user communities
  • Provide dedicated support for users to adapt to the CyVerse CI
  • Evaluate synergies between ECS requests and overall CI requirements
  • Advertise computational capabilities and explore needs of existing and emerging projects
  • Assist in benchmarking and evaluating computational projects to determine best CI components for scalability
  • Assist in obtaining independent allocation on national cyberinfrastructures

[2.2] Large Scale Data:

  • Streamline and scale workflows for moving rangeland production sUAS data from the field to the database to the user in near real time
  • Streamline and scale workflows for 3D modeling from sUAS images to quantify rangeland production

[2.3] Tools and Workflows:

  • Integrate tools to support long-read sequencing technologies
  • Integrate tools to support non-coding RNA analysis
  • Deploy tools for RNA-Seq analysis
  • Deploy tools for metagenomic analysis
  • Deploy tools for genomic analysis
  • Develop the Visual Interactive Computing Environment (VICE)
  • Enable analysis with GPUs in DE
  • Enable machine learning technologies
  • Teach partners to integrate workflows in CyVerse
  • Write a peer-reviewed paper
  • Plan workshops
  • Scale up workflows to handle thousands of SRA entries for lncRNA identification (new NSF grant)

[2.3.4] Spatial Data Infrastructure:

  • Adopt projects and group interfaces
  • Expose shared spatial data for discovery
  • Write documentation for SDI releases
  • Create and release templates for user-created spatial data web apps

[2.4] Data Commons:

  • Publish large datasets quarterly for VertNet biodiversity data
  • Make Genomes to Fields data available for phenotype prediction on CyVerse and provide a stable identifier
  • Make sorghum high-throughput phenotyping datasets public in CyVerse, where they can be analyzed
  • Assist with specifying metadata requirements
  • Assist with data publishing through canonical repositories
  • Assist with data organization and formatting for data published through Data Commons
  • Specify requirements for enhanced ontology-based metadata management
  • Enhance metadata entry capabilities via API

[2.5] Documentation and Templates:

  • Develop documentation for using tools and workflows
  • Integrate example data for using tools and workflows
  • Make metadata templates
  • Share and develop best practices with community projects

[2.6] Adoption:

  • Provide Letters of Collaboration for researchers
  • Participate at domestic and international conferences (anticipate 2-4 talks/posters per year)
  • Monitor and respond to user tickets -- Limited Support
  • Provide interactive user support via chat (during business hours) -- Limited Support
  • Write and publish manuscripts on the CyVerse platform

[3.0] Training:

  • Collaborate on training with CyVerse Partners
  • Develop community-driven learning materials on unmet needs (integration of multiple data types, metadata management, and scaling analyses to cloud/HPC)
  • Provide training and support for developers
  • Provide introductory training to the CyVerse platforms
  • Deliver community engagement, collect user feedback, and develop collaborations at conferences and meetings
  • Collaborate with Open Science Training Efforts and Training Missions of CyVerse Partners
  • Provide training in basic computing and data management
  • Publish training materials and support virtual learning
  • Deliver in-depth training through the CyVerse Learning Institute

[4.0] Administration:

  • Execute responsibilities as Principal Investigator
  • Execute responsibilities as Executive Team
  • Execute responsibilities of Project Teams (CI, Science, Training)