LocalTime

Cloud management

What is LocalTime?

Boosting research productivity, increasing capability and focusing investment on research by leveraging National and State research cloud infrastructure and skills.

LocalTime offers research computing cloud management as a service, uniquely combined with discretionary investment in on-premise local hardware. This is ideal for organisations that prefer not to manage local cloud complexity. LocalTime is backed by the technical infrastructure and services of the Nectar Cloud, partly enabled by the Australian Government NCRIS program. This leverages a national investment of approximately $68M made since 2013, and delivers the benefits of the collective intellectual property and operational contributions of all major eResearch infrastructure providers in Australia.

Market-defining open source OpenStack technology binds all components together, offering a clear and sustainable software direction that will endure changes in underlying hardware or overlying operating systems and software. Researchers experience the service as self-serve computing, with hardware and operational concerns abstracted away.

The net result is a lower cost to operate, highly efficient utilisation, a faster time to market than the alternatives, and a level of proven technical reliability and staffing synergy beyond that of an exclusively insourced or exclusively outsourced arrangement.


How OpenStack-as-a-Service works

On-premise hardware is physically housed in locally managed racks at a local datacentre. This is configured as a logically allocated exclusive management zone within the Intersect Cell of the Nectar Cloud, referred to as the Local Zone.

This is analogous to, and technically very similar to, the OwnTime zone used to offer private cloud services to Intersect members and customers for thousands of research projects. The Local Zone is typically interconnected via a “SpaceTransit” dedicated link to the Intersect NSW eResearch Nexus to permit high performance networking and storage interconnect.

The Nexus is a convergence point for the national Nectar Cloud and Intersect services, and is the administrative aggregation point for OwnTime, OwnTime Local, Space interconnect, the NSW instance of the replicated Nectar Distributed Control Plane, the distributed support network and the Nectar National Control Plane.

A simplified mapping between logical and physical service colocation is depicted in this diagram:

Local Zone
Physical servers are built using an automated, network-based operating system installation. Installation and configuration of all OpenStack components is managed through a central Puppet server.

The standard server configuration is Ubuntu 16.04 LTS, the KVM hypervisor and two bonded 10GbE ports connected through to the Intersect Nectar VLANs. All server operating systems are managed by Intersect, with OpenStack configuration managed through the Puppet service.

Shared data can be implemented in a number of ways, using standard OpenStack volume, object, ephemeral and network storage, as detailed later.

Intersect NSW eResearch Nexus
Intersect Australia owns and operates the NSW Nectar node, incorporating OwnTime.intersect.org.au. Along with the co-located Space.intersect.org.au petascale data storage and hpc.Time terascale supercomputing, this cloud.Time infrastructure forms part of the NSW eResearch Nexus facility.

Intersect OwnTime zone
OwnTime is a logical zone within the Intersect Nectar Node that is used to offer private cloud services to Intersect members and customers. This can be thought of as a cross-organisational version of a Local Zone.

Nectar Distributed Control Plane
Each Nectar node has a local cell controller and related infrastructure. These systems manage and provide the various OpenStack services within the cell, such as scheduling of new virtual machines, volume storage, telemetry, logging, monitoring, local configuration and the interface to the National Control Plane. Most OwnTime Local administration occurs here.

Nectar National Control Plane
The National Control Plane provides core services to the whole Nectar Research Cloud. It receives messages from all Nectar Nodes; collates performance statistics, logs and telemetry; and provides central services such as the Dashboard, Keystone authentication, centralised logging and monitoring, the project allocation system, Swift object storage and the centralised Puppet server.

The Nectar Cloud Core Services Team is responsible for the core services, planning upgrades, managing the code repositories for the Nectar OpenStack packages and coordinating technical communication between all parties. Intersect works closely with this team and shares responsibilities with it.


National Leverage

Capacity and Scale

As demand grows, OwnTime Local subscribers can take immediate advantage of the significant capacity of the Nectar Research Cloud. A sudden surge in utilisation can be buffered when demand exceeds Local Zone supply, allowing hardware acquisition without downtime. Currently the Nectar Research Cloud has around 32,000 CPUs in use, with around 9,000 virtual machines across 7,000 projects; this has grown from 5,000 CPUs, 2,000 virtual machines and a few hundred projects in 2013.

Expanding a Local Zone by adding additional hardware is a straightforward process, and existing Intersect reporting and statistical tools make usage and utilisation tracking simple for ongoing capacity planning. Uniquely, OwnTime attributes resource consumption directly to the responsible Chief Investigator (CI), allowing research project spend to be monitored.
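
Usage data of this kind can also be pulled directly through the OpenStack APIs. The following is a minimal sketch using the Python openstacksdk; the cloud name, project name and reporting window are illustrative assumptions, not part of the OwnTime tooling itself:

    import datetime
    import openstack

    # Connect using credentials from clouds.yaml; the cloud name
    # "owntime-local" is a hypothetical example.
    conn = openstack.connect(cloud="owntime-local")

    # Look up a project by name (hypothetical name).
    project = conn.identity.find_project("genomics-pipeline")

    # Aggregate compute usage over a reporting window
    # (fields follow the Nova simple-usage report).
    usage = conn.get_compute_usage(
        project.id,
        start=datetime.datetime(2018, 1, 1),
        end=datetime.datetime(2018, 7, 1),
    )
    print("vCPU-hours:", usage["total_vcpus_usage"])
    print("RAM MB-hours:", usage["total_memory_mb_usage"])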

Software Reuse

OwnTime directly leverages the national investment in the Nectar Research Cloud, Nectar Virtual Labs and Tools, existing community images and official Nectar images.

A distributed team of support staff exists at each of the eight Nodes to manage and operate the Nectar Cloud. At present there are 15 Nectar-funded Virtual Labs, with more planned. These improve researcher efficiency through pre-configured virtual machines that support proven standard workflows and pipelines.

The image catalog currently comprises 18 official Nectar-supported images, representing popular operating systems such as Ubuntu, Debian, openSUSE, Fedora, CentOS and Scientific Linux. These are updated on a regular basis by the Core Services and Intersect Time teams.

There are over 500 public images created by other Nectar users. This community library yields a broad range of already-integrated software stacks.
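
As a minimal sketch of browsing that library programmatically with the Python openstacksdk (the cloud name and the name filter are assumptions for illustration):

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # List public images, filtering by a name substring; official Nectar
    # images follow their own naming convention, so adjust the filter.
    for image in conn.image.images(visibility="public"):
        if image.name and "ubuntu" in image.name.lower():
            print(image.id, image.name)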


Roles and Responsibilities

There are three primary entities: Researchers, Local IT and Intersect Operations. Between them, they participate and interact in four areas of responsibility: Researcher Experience, Local Capacity and Performance, OwnTime Sysadmin and Nectar Sysadmin.

A simplified overview of roles and responsibilities is illustrated:

This division of labour offers numerous advantages, chief among them being:

  • Local resources are able to focus their energies on closer-to-researcher activities, including application software configuration, researcher support, application reuse, and so forth.
  • Local autonomy over hardware capacity planning and performance management is absolutely retained.
  • Local resources remain exclusively available to researchers – there is no community tax from joining the Nectar Cloud.
  • Intersect resources already perform these functions for Nectar and OwnTime clouds, meaning instant familiarity, security and time to market.
  • Intersect prioritises automating commodity services, so local effort can focus on automating research services.

Local Researcher Experience

Researchers access cloud projects using either the standard Nectar Dashboard, a local portal or a custom portal similar to Intersect Launchpod. They launch virtual machines on demand via self-service, using a range of pre-defined virtual disk images containing specific sets of operating systems and applications. Interactive desktops and HPC-on-cloud can continue to be accessed through any existing local portal as desired.

Through the Nectar dashboard users can request the creation of a new project and request access to any existing organisational projects.

Researchers are easily able to deploy any of the Nectar Virtual Labs currently available. These are built by leading research groups and provide research-specific software, data management and workflows in an integrated environment. In addition to the pre-built Virtual Labs, a researcher or administrator can create a customised virtual machine from the Nectar Image Catalog.

The researcher or administrator can build their own image by starting with any catalog image and installing extra software. This permits the development of custom workflows or pipelines to analyse and validate results. The resulting image can be saved to the catalog to allow other researchers to reproduce the results at a later stage. These images can be shared with external researchers if required.
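
A minimal sketch of this snapshot step with the Python openstacksdk, assuming a running instance and image name that are purely illustrative:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Snapshot a running instance so other researchers can relaunch
    # the same software stack later (names are hypothetical).
    server = conn.get_server("rnaseq-worker")
    image = conn.create_image_snapshot("rnaseq-pipeline-v2", server, wait=True)
    print("Snapshot image id:", image.id)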

Advanced users can take advantage of the Heat orchestration service to create one or more integrated virtual machines from a template.
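
As a sketch, a Heat stack can be launched from a HOT template using the Python openstacksdk; the stack name, template file and parameter below are hypothetical:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Launch a stack from a HOT template on disk. "cluster.yaml" is a
    # hypothetical template describing two linked virtual machines;
    # key_name is passed through as a template parameter.
    stack = conn.create_stack(
        "analysis-cluster",
        template_file="cluster.yaml",
        rollback=True,
        wait=True,
        key_name="my-keypair",
    )
    print("Stack status:", stack["stack_status"])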

In addition to the web-based Dashboard, there are extensive APIs with clients in a variety of programming languages, and compatibility APIs support Amazon EC2 and S3 clients. This allows scripted creation and management of virtual machines, storage and security groups.
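
For instance, launching an instance through the API with the Python openstacksdk might look like the following sketch; the image, flavor, network, keypair and security group names are assumptions to be replaced with those visible in the Dashboard:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Boot a self-service virtual machine from a catalog image.
    server = conn.create_server(
        "demo-instance",
        image="Ubuntu 16.04 LTS (Xenial)",   # assumed catalog image name
        flavor="m1.small",                   # assumed flavor name
        network="private-net",               # assumed project network
        key_name="my-keypair",
        security_groups=["default"],
        wait=True,
    )
    print("Instance", server.name, "is", server.status)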

Local Capacity and Performance

Local IT is responsible for managing the capacity and performance of the exclusive Local Zone:

  • Managing the inventory of systems in the local portal;
  • Monitoring and managing locally owned hardware, including vendor service calls, and hardware repairs;
  • Capacity planning and hardware acquisition;
  • Installation of new hardware; and
  • Network management and configuration within the dedicated cloud zone.

Local IT can use either the Nectar Dashboard or the OpenStack command line to perform a range of OpenStack tasks including, but not limited to:

  • Creating Nectar project allocation requests;
  • Assigning researchers to projects;
  • Managing virtual machine images: creating, updating and sharing custom images between projects;
  • Launching and managing interactive desktop virtual machines;
  • Performance and monitoring of interactive desktop virtual machines; and
  • Running and managing HPC virtual machines, including performance and capacity monitoring.

Local IT or specific delegated users are able to control which users have access to each project. Critically, Local IT controls the resource quota for each project. Overcommit ratios for CPU and memory can be set for the zone, aggregate or individual server to provide flexibility in the way resources are shared.
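
A minimal sketch of these two administrative tasks with the Python openstacksdk, assuming admin credentials; the role, user and project names and the quota values are illustrative:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical admin cloud

    # Grant a researcher access to a project (hypothetical names).
    conn.grant_role("Member", user="jbloggs", project="genomics-pipeline")

    # Cap the project's compute quota so one project cannot starve
    # the Local Zone (values are illustrative; ram is in MB).
    conn.set_compute_quotas("genomics-pipeline",
                            cores=64, ram=262144, instances=16)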

Many “flavors” are provided to create virtual machines with different amounts of CPU, GPU, memory and disk space. In addition to the standard Nectar flavors, customised flavors can be created in consultation with Local IT, for example flavors with CPU, GPU, memory and disk space scaled to the physical hardware.
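
Creating such a customised flavor is an administrative task; a sketch with the Python openstacksdk, with all names and sizes illustrative:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical admin cloud

    # Create a flavor sized to the Local Zone's physical servers.
    flavor = conn.compute.create_flavor(
        name="local.xlarge",  # hypothetical flavor name
        vcpus=16,
        ram=65536,            # MB
        disk=200,             # GB root disk
    )
    print("Created flavor:", flavor.name)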

Local IT is responsible for handling first level support requests from researchers and escalating to Intersect Operations when required.

OwnTime Sysadmin

Intersect’s Operations Team manages the day-to-day tasks within the Intersect Nectar Node. This includes:

  • Offering level 2 support for Local IT Services;
  • Adding new compute nodes to OpenStack as required;
  • Performing OpenStack system software upgrades;
  • Patching and upgrading Host (as opposed to Guest) operating systems;
  • Billing and usage reporting; and
  • Monitoring of all OpenStack services and systems.

Nectar Sysadmin

The Intersect Operations team, augmented by the Nectar Cloud Core Services team and the nationally distributed support system, manages the OpenStack platform, including the hypervisor and operating system. This includes:

  • Managing Puppet configuration;
  • Managing operating system configuration;
  • Managing OpenStack configuration;
  • Monitoring and managing Nexus hardware, including vendor service calls and hardware repairs;
  • Managing Nagios configuration;
  • Configuring and managing networking;
  • Planning and scheduling upgrades;
  • Participating in the Nectar distributed helpdesk; and
  • Acting as the escalation point for OwnTime support.

Service Boundaries

LocalTime solutions leverage existing capabilities and expertise. They allow Local IT people to focus on the contents of their organisation’s virtual machine images and on managing their execution. Leveraging Intersect’s shared infrastructure and skills reduces both initial setup time and ongoing management effort.

Local eResearch support people are able to concentrate resources on researcher requirements, research project workflows, and data pipeline management. Training costs are reduced through use of a standard platform that researchers are already familiar with.

Inclusions

Intersect provides second level remote support services for all core software. This includes:

  • Escalation and management of incidents and problems that cannot be resolved by Local IT first level support;
  • Management of the OpenStack platform, including cell and zone configuration;
  • Configuring, patching, upgrading and troubleshooting the OpenStack platform, including security incident response;
  • Configuring, patching, upgrading and troubleshooting the host operating systems;
  • Access to Nectar-supported virtual machine images;
  • Access to the Launchpod deployment service;
  • Monthly usage accounting and dissection; and
  • Adding or removing compute nodes (hardware changes must be scheduled in advance).

Options

Optional services can be negotiated on top of standard OwnTime Local subscription fees:

  • Local Cinder Volume, Swift Object Storage and Space.intersect.org.au services are extremely site-specific and must be negotiated individually.
  • Training services are available through Learn.intersect.org.au courseware and trainers, generally requiring tailoring to organisational preferences.

Exclusions

The following services are generally excluded from OwnTime Local as they are typically performed by Local IT; however, Energy.intersect.org.au consulting services are available and negotiable depending on organisational requirements:

  • Hands-on deployment, management or maintenance of locally owned hardware;
  • Hands-on deployment, management or maintenance of virtual machine instances and Guest operating systems;
  • Hands-on deployment, management, configuration and maintenance of end-user Nectar tools and virtual laboratories;
  • Managing and maintaining unsupported Nectar Community virtual machine images;
  • Customisation or modification of OpenStack software packages for site-specific needs; and
  • Installation and support of non-standard software such as drivers, hypervisors and host operating systems that do not form part of the Nectar or OwnTime supported platforms.

Support Arrangements

The Intersect Operations Team provides level 2 support to Local IT Services for the OpenStack platform and core software. Standard business support hours are 9am to 5pm, Monday to Friday, except public holidays.

Process by which issues are raised

Local IT can report an issue by logging an incident at help.intersect.org.au or by sending an email to help@intersect.org.au; both routes feed into the help.intersect.org.au support portal.

Process by which issues are resolved

An Intersect support engineer will provide an initial response to Local IT, which may also involve contacting a representative at the Nectar Distributed Helpdesk to initiate triage.

Subsequently, Intersect will provide a second response to Local IT that includes working with them to establish the following:

  • Confirm that we understand the problem and can reproduce it.
  • If the solution is immediately obvious, a plan will be created outlining how the problem will be fixed and what is needed to implement it.
  • If the solution is not immediately obvious, an estimate of the time needed to investigate will be given.

An Intersect support engineer will then work through the issue, providing regular updates to Local IT. If the issue relates to core services, the Intersect support engineer will escalate the incident to the Nectar Core Services team.

Where the problem requires an upgrade or outage, a plan will be formed in conjunction with Local IT to ensure that disruption is minimised and risks are mitigated.

Maintenance and upgrades are managed in a coordinated manner with Nectar Core Services and the other Nectar Nodes. Upgrades are performed on an approximately six-monthly cycle, usually 3-4 months behind upstream OpenStack releases. Intersect Operations will advise the schedule of planned upgrades.



OpenStack Virtualisation

Hypervisor

As standard, the Nectar Cloud and OwnTime deploy Ubuntu Linux as the base Host operating system and KVM as the hypervisor.

KVM provides an efficient and well-proven PCI passthrough mode for GPUs, which is in routine use across OpenStack implementations, including the Nectar Cloud. Specific virtual machine templates and images are created to take advantage of the GPUs.

The modular nature of OpenStack Nova virtualisation allows new features to be adopted independently. It also provides for staggered upgrades, minimising disruption and downtime.

Storage

Storage within OpenStack is virtualised through a number of models.

Ephemeral disk exists for the lifetime of the virtual machine. In Local Zone servers this is backed by physical high-speed disks, typically SSDs, in each server.

Volume storage is provided through the OpenStack Cinder service. This creates software-defined block storage which can be attached to or detached from running virtual machines. Cinder is very widely supported, with an extensive hardware compatibility list.
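
A sketch of creating and attaching a volume with the Python openstacksdk (the instance and volume names are hypothetical):

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Create a 100 GB Cinder volume and attach it to a running instance.
    volume = conn.create_volume(size=100, name="results-scratch", wait=True)
    server = conn.get_server("rnaseq-worker")
    conn.attach_volume(server, volume, wait=True)
    # The guest then sees a new block device (e.g. /dev/vdb) to format
    # and mount.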

Swift is a highly available, distributed, eventually consistent object/blob store with an API similar to AWS S3. The Nectar Cloud implements a national Swift cluster, providing a distributed and robust storage model that can be accessed from anywhere, with hardware-agnostic integration.
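
A sketch of storing and listing objects with the Python openstacksdk (the container and object names are hypothetical):

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Upload a results file to the national Swift cluster.
    conn.create_container("project-results")
    conn.create_object("project-results", "run-42/summary.csv",
                       filename="summary.csv")

    # List everything stored in the container.
    for obj in conn.list_objects("project-results"):
        print(obj.name)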

Local Zones can interface storage hardware via the Cinder and Swift APIs to integrate local data capabilities, in addition to network-attached options such as Space.intersect.org.au, which is suitable for petascale research storage.

OpenStack Security Measures

Beyond standard network and operating system security measures, OpenStack is designed with security in mind. Secure protocols and practices are used throughout its applications.

Each individual virtual machine has an associated set of Security Groups attached to it. A Security Group consists of one or more rules, each with properties indicating ingress or egress, source address range and protocol. Security Groups are configured per project, so it is easy for a user to select from existing groups when launching a virtual machine.

The security group rules are implemented as local firewall rules in the hypervisor. Security groups are dynamic: modifying a group or adding a new group immediately changes the firewall rules applied at the hypervisor.
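
A sketch of defining such a group with the Python openstacksdk; the group name and CIDR below are placeholders:

    import openstack

    conn = openstack.connect(cloud="owntime-local")  # hypothetical cloud name

    # Create a per-project security group allowing inbound SSH from a
    # single campus range (the CIDR is a documentation placeholder).
    sg = conn.create_security_group("campus-ssh", "SSH from campus network")
    conn.create_security_group_rule(
        sg.id,
        direction="ingress",
        protocol="tcp",
        port_range_min=22,
        port_range_max=22,
        remote_ip_prefix="203.0.113.0/24",
    )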

In addition, host-based firewall rules can be configured inside the virtual machine if desired.

Access to the Nectar Research Cloud control services is managed with strict firewall policies and access controls.

When a volume is deleted, its contents are wiped before it is returned to the general storage pool. This prevents a new volume from inheriting any of the previous contents.

The Nectar Core Services and Node Support Teams monitor the official OpenStack vulnerability announcements as well as Ubuntu and other security sources. These teams coordinate and undertake upgrades, patches and rectification work when security issues arise.
