CyVerse is Summer Fun for KEYS Students
High School Interns Dive into ML/AI and Data Management Projects
Keep Engaging Youth in Science (KEYS) is a University of Arizona summer program that trains high school students interested in developing STEM skills. Over seven weeks each summer, KEYS interns gain experience working on “immersive, real-world projects under the mentorship of U of A scientists.” This summer, two talented upperclass high school students were offered internships with CyVerse PI Nirav Merchant, Research Associate Professor Tyson Swetnam, and Senior Data Engineer Tony Edgin, working on projects that will benefit the CyVerse platform.
Tanmay Dewangan, a Paradise Valley High School senior, and Devan Patel, a BASIS Scottsdale senior, dove deeply into data and metadata management and chatbot customization, respectively. Under the mentorship and direction of Merchant, Swetnam, and Edgin, Tanmay and Devan gained some serious computational chops that will distinguish them from other incoming college freshmen after they graduate.
photo: N. Merchant
Working closely with Edgin, Tanmay explored deploying the Comprehensive Knowledge Archive Network (CKAN) on CyVerse as a replacement for its Data Commons. He started by manually moving existing datasets from the curated Data Commons to the CKAN deployment, but quickly learned to automate the process, using Python to interact with the CKAN Application Programming Interface (API) and the Discovery Environment API. Although he ran into numerous technical problems right off the bat (API integration issues, metadata inconsistencies, name length limitations), by week six Tanmay had improved data management practices with CKAN, integrated several APIs, and developed tools to streamline dataset migration and metadata standardization. “The best thing I learned was the importance of collaboration and effective data management in scientific research. While working under such awesome mentors, I gained valuable insights into how to organize and share research data in ways that enhance accessibility and foster collaboration among researchers. This experience not only improved my technical skills but also gave me a deeper appreciation of the collaborative nature of scientific research,” said Tanmay. View his scientific poster documenting his work here.
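For readers curious what that kind of automation can look like, here is a minimal Python sketch of two of the hurdles mentioned above: name length limits and inconsistent metadata. It is not Tanmay's code. The helper names (`ckan_slug`, `build_package`), the `KEY_ALIASES` mapping, and the sample record are all hypothetical. The sketch assumes CKAN's usual rule that a dataset name is 2 to 100 characters of lowercase letters, digits, hyphens, and underscores.

```python
import hashlib
import re

# Assumed CKAN limits for dataset names (check your CKAN version's docs).
CKAN_NAME_MIN = 2
CKAN_NAME_MAX = 100

# Hypothetical mapping from inconsistent source keys to CKAN field names.
KEY_ALIASES = {
    "title": "title",
    "name": "title",
    "description": "notes",
    "abstract": "notes",
    "author": "author",
    "creator": "author",
}


def ckan_slug(title: str, max_len: int = CKAN_NAME_MAX) -> str:
    """Derive a URL-safe dataset name from a human-readable title."""
    # Collapse every run of disallowed characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", title.lower()).strip("-")
    if len(slug) > max_len:
        # Truncate, then append a short hash so long titles stay distinct.
        digest = hashlib.sha1(title.encode("utf-8")).hexdigest()[:8]
        slug = slug[: max_len - 9].rstrip("-") + "-" + digest
    if len(slug) < CKAN_NAME_MIN:
        raise ValueError(f"cannot derive a valid name from {title!r}")
    return slug


def build_package(raw: dict, owner_org: str) -> dict:
    """Normalize one source record into a package_create-style payload."""
    fields = {}
    for key, value in raw.items():
        canonical = KEY_ALIASES.get(key.strip().lower())
        # Keep the first value seen for each canonical field.
        if canonical and canonical not in fields:
            fields[canonical] = str(value).strip()
    if "title" not in fields:
        raise ValueError("record has no usable title")
    return {"name": ckan_slug(fields["title"]), "owner_org": owner_org, **fields}


record = {"Title": "  Sonoran Desert LiDAR (2021) ", "Abstract": "Point clouds"}
print(build_package(record, owner_org="cyverse"))
```

A payload like this could then be sent to a CKAN site's `/api/3/action/package_create` endpoint with an API token in the `Authorization` header. The network call is left out here so the sketch runs on its own.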
photo: T. Swetnam
Devan (in banner photo above), drawn by the siren call of an Artificial Intelligence (AI) project, created a custom AI chatbot. He first integrated a generic open source chatbot into a website he created, then customized the chatbot's user interface (UI) and chat functionality. Under Swetnam’s steady guidance, Devan learned to troubleshoot issues such as maintaining chat history between pages and keeping the chatbot open or closed based on user preference. A major accomplishment was getting the chatbot to cite its sources properly and to open hyperlinks in a new tab. Devan gained proficiency with developer tools such as GitHub and GPT-4 technologies, using them to enhance chatbot features and to resolve issues like API key management and vector store management for OpenAI's Assistant. Devan's enthusiastic summary: "Interning with CyVerse was an amazing experience! Working under Dr. Swetnam, I had the opportunity to build an open source chatbot and deploy it on various websites. I can’t wait to take the skills I’ve learned to further my career in data science!" His scientific poster summarizing his chatbot project can be found here.
"Devan and Tanmay both impressed me with their ability to quickly ramp up their projects and generate working, functional code," said Edgin. "The disciplines of software development and data science are broadening rapidly, and soon everyone in these fields will be competent in both areas. If they pursue studies in one of these two fields, alongside another STEM field that excites them, they will likely have rewarding careers. Because of students like them," he beamed, "we at CyVerse are excited about what the future may hold."