Interactive High Performance Computing Portal

Background
The University of Technology Sydney asked Intersect to develop a fully interactive graphical environment for managing its High Performance Computing (HPC) facility, which consists of multiple clusters. The system needed to provide a real-time cluster monitoring interface, manage access based on user profiles, and display help documentation with advanced search functionality. Many users have little or no Unix experience, so the application is designed to let them interact with cluster nodes easily, reducing the learning curve by abstracting the underlying technical complexity. The web-based interface also allows administrators to manage, monitor, and submit computational workloads across multiple clusters. The goal was to simplify the user experience for researchers, students, and administrators while maintaining system security, performance, and scalability.
Challenge
The core challenges revolve around scalability and performance in a multi-user, real-time environment, combined with complex access control requirements. Specifically:
- Real-time Scalability for Monitoring: Ensuring the graphical environment can display real-time updates for hundreds of nodes to hundreds of concurrent users at peak times without performance degradation or lag. This demands highly efficient data streaming, aggregation, and rendering.
- Low-Latency Metadata Access: The system must be able to access and load metadata for nodes with very low latency. This is crucial for quick navigation, rapid drill-downs, and responsive user interactions, especially when users are exploring specific node details.
- Complex and Granular Access Control: Implementing a robust user access management system that applies multiple access filters and node limit filters, driven by regular expressions and user group affiliations. This ensures users only see and interact with resources they are authorized for, adding significant complexity to the filtering and data presentation logic.
- Advanced Documentation Search: Providing advanced search across the help documentation using full-text indexing.
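To make the full-text indexing challenge concrete, the sketch below builds a tiny inverted index in plain Python. It is an illustration only, not the portal's implementation: the function names (`tokenize`, `build_index`, `search`) and the sample documents are hypothetical, and a production system would typically rely on a database or search engine rather than an in-memory dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # Start from the first term's postings, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical help pages, used only to exercise the sketch.
help_pages = {
    "submit-jobs": "How to submit a batch job to the cluster",
    "gpu-nodes": "Requesting GPU nodes on the cluster",
    "ssh-keys": "Connecting with SSH keys",
}
help_index = build_index(help_pages)
```

With this index, a lookup touches only the postings for the query terms instead of scanning every document, which is the property that keeps search responsive as the documentation grows.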
Solution
The project was developed in collaboration with research computing experts, system administrators, and representatives from the University of Technology Sydney (UTS). Our engineers addressed the technical hurdles by optimizing data management and system architecture.
Key solutions included:
- Optimized Time-Series Data Management: To handle the substantial volume of real-time data, we implemented a database solution specifically optimized for storing and accessing large amounts of time-series data.
- Enhanced Query and Write Performance: The database was also chosen for its performance on complex queries and heavy write activity, ensuring high performance even under demanding conditions.
- Advanced Access Filtering: We built regular expression-based cluster access filters and node user limit controls to manage access for user groups, either on individual nodes or across multiple nodes in bulk.
- Secure and High-Performance CMS with Django: We implemented a help documentation content management system that allows front-end users to access documentation with advanced search capabilities.
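The regular expression-based access filtering described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the portal's actual code: `AccessRule`, `visible_nodes`, the node naming scheme, and the reading of a "node limit" as a cap on how many matching nodes a group may see are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    group: str         # user group the rule applies to (assumed field)
    node_pattern: str  # regex that must match the whole node name
    max_nodes: int     # assumed meaning: cap on nodes this rule exposes


def visible_nodes(user_groups, nodes, rules):
    """Return the nodes a user may see, in rule order, without duplicates."""
    allowed = []
    for rule in rules:
        if rule.group not in user_groups:
            continue  # rule belongs to a group the user is not in
        regex = re.compile(rule.node_pattern)
        matched = [name for name in nodes if regex.fullmatch(name)]
        # Apply the per-rule node limit after pattern matching.
        for name in matched[: rule.max_nodes]:
            if name not in allowed:
                allowed.append(name)
    return allowed


# Hypothetical cluster and rules, used only to exercise the sketch.
cluster_nodes = ["gpu-01", "gpu-02", "gpu-03", "cpu-01"]
access_rules = [
    AccessRule(group="students", node_pattern=r"cpu-\d+", max_nodes=5),
    AccessRule(group="researchers", node_pattern=r"gpu-\d+", max_nodes=2),
]
```

Because one pattern can cover many node names, a single rule grants or restricts access in bulk, while a pattern that names exactly one node gives per-node control.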
High Level Architecture
Our engineering team built a fully interactive graphical user interface that allows system administrators to monitor node workload, node health status, active and suspended user sessions, underused resources, and node system failures. Administrators can also set custom access controls and usage limits for clusters and nodes based on user groups or individual user profiles. Researchers get an intuitive interface for interacting with on-demand compute resources according to their access rights, along with documentation and support within the portal, backed by advanced search features. UTS played a key role in identifying data workflows and validating interface requirements. Regular feedback loops ensured the system met real user needs, and both stakeholders collaborated effectively to ensure the project’s success.
“Our collaboration with Intersect has been instrumental in delivering a future-ready iHPC portal that aligns with our vision for seamless, scalable and accessible research technology infrastructure.” Pascal Tampubolon, Head of Research Technology, UTS
How Intersect can empower your research:
Our Services
- Analytics Services: Delivering data science, advanced analytics, artificial intelligence, and statistical solutions to empower research endeavours.
- Technology Services: Offering digital storage systems, large-scale computing platforms, data services, software, and more to support research needs.
- Training & Education: Helping researchers develop essential skills through targeted programs.
- Research Support: Our Digital Research Analysts (DRAs) work with researchers to find the best technologies and apply their expertise effectively.
Membership
Membership involves a commitment to creating better research capability for the communal benefit of all members. Each member can draw on over 50 years of shared, on-campus experience in digital research technology, backed by research-specific technical specialists in IT systems, operations, and engineering. This suite of services is optimally delivered through an on-campus professional: a core component of Intersect membership.
Training
Intersect Australia offers hands-on training tailored to researchers and higher-degree research (HDR) students, covering a wide range of research-relevant digital tools and technologies, such as Excel, Python, R, SPSS, NVivo, REDCap and Qualtrics. These courses are available at beginner, intermediate, and advanced levels.
All members can access these training sessions. Details of all Intersect training courses across all member universities are available on the Intersect website.