Building an integrated environment in which to select and perform analysis on corpora through a suite of pre-installed tools
Intersect has completed work on phase one of Alveo, the Human Communication Science (HCS) Virtual Laboratory, a project led by the University of Western Sydney that will connect HCS researchers with a range of tools for analysing corpora previously spread across various sites with varying degrees of accessibility. The project thus consolidates HCS resources and provides greatly enhanced opportunities for advances in Australian research in Human Communication Science.
Human Communication Science (HCS) is a broad-reaching interdisciplinary mix of research that spans speech, language and music. Nationally, the HCS community consists of 2000 researchers focusing on the manner in which humans communicate with each other, and with computers and machines via codified means — speech, text, music and sound. Internationally there are around 45,000 researchers working in the community.
HCS research encompasses areas such as speech science, computer science, behavioural science, linguistics, music cognition and musicology, and sonics and acoustics. The cross-disciplinary nature of the tools and corpora in the HCS vLab promises to facilitate research that will provide new insights into old problems, and involve novel combinations of old ideas to approach new problems.
NeCTAR has funded development of the Human Communication Science Virtual Laboratory (HCS vLab) to enable easy access to shared tools and data, and to overcome the resource and access limitations of individual desktop systems. The HCS vLab will allow a diverse range of researchers to access an amalgamation of existing data collections (corpora) and analytical tools generated by researchers in their community. It will bring significant processing and computational power to bear on the analysis of large-scale corpus data by sophisticated research tools.
The aims are to:
- give the Australian and international HCS communities access to data and analysis tools;
- afford new tool–corpus combinations and new emergent research output: projects, grant funding, doctoral theses, and publications;
- allow analysis and annotation results to be stored and shared, thus promoting collaboration between institutions and disciplines;
- improve scientific replicability by moving local and idiosyncratic desktop-based tools and data to an accessible, in-the-cloud environment that standardises, defines, and captures procedures and data output so that research publications can be supported by re-runnable, re-usable data and coded procedures (see, e.g., www.myexperiment.org/).
The lab is designed to make use of national infrastructure, including data storage, discovery and research computing services. It incorporates existing eResearch tools, adapted to work on shared infrastructure, with a data-discovery interface to connect researchers with data sets. These are orchestrated by a workflow engine with both web and command-line interfaces, so that technical and non-technical researchers alike can use it.
The Human Communication Science (HCS) Virtual Laboratory acknowledges funding from the NeCTAR project (http://www.nectar.org.au), an Australian Government project conducted as part of the Super Science initiative and financed by the Education Investment Fund, and co-funding from participant institutions – University of Western Sydney, Macquarie University, Australian National University, Flinders University, the Universities of Canberra, La Trobe, Melbourne, New England, New South Wales, Sydney, Tasmania, Western Australia; and peak bodies – ASSTA (Australasian Speech Science and Technology Association), AusNC (Australian National Corpus), and NICTA (National ICT Australia).
Visit Alveo at http://alveo.edu.au