
Science is becoming increasingly data-intensive, and this requires a new data-focused approach. The prime example is the Square Kilometre Array (SKA) project, one of the world’s latest large-scale global scientific endeavours, which will be co-hosted in Australia and is set to produce orders of magnitude more data than all of mankind’s past endeavours. A single SKA phase-1 science project, such as the HI survey, will produce derived data on the order of several terabytes per second, and the data rate in the second phase of the SKA project will be at least an order of magnitude greater.

Two related PhD projects are being offered in Data Intensive Astronomy, bridging the computer-science-focused, data-driven approach and its science applications. Data Intensive Science has become fundamental to delivering modern cutting-edge science, and ICRAR is a recognised world leader in its astronomical applications. One of the two positions focuses on the technical issues and the other on the astronomical requirements. The two projects will share much of their approach, although the deliverables will differ. These PhD projects will provide industry engagement and a unique training environment at the cutting edge of radio astronomy, computer science, and commercial business and scientific systems.

The first PhD project will focus on the computer science components of the problem. The work will involve two elements: 1) profiling basic algorithms to measure compute and other metrics, and creating data-slicing helper functions based on information derived from those metrics; 2) characterising the transitions between compute-intensive and I/O-intensive phases, since balancing these is central to getting the best performance. The student will be supervised primarily by Prof. Wicenec.
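As a rough illustration of the first element, the sketch below times a function and records its peak traced memory, then uses that measurement to choose chunk sizes. It is only a minimal sketch using the Python standard library: the names `profile` and `slice_sizes`, the toy workload, and the memory budget are all hypothetical and are not part of the project itself.

```python
import time
import tracemalloc


def profile(func, data):
    """Run func(data); return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(data)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


def slice_sizes(total_items, bytes_per_item, memory_budget):
    """Hypothetical helper: chunk sizes so each chunk fits the memory budget."""
    per_chunk = max(1, memory_budget // max(1, bytes_per_item))
    return [min(per_chunk, total_items - i)
            for i in range(0, total_items, per_chunk)]


# Toy workload (an assumption): sum of squares via an intermediate list.
samples = list(range(100_000))
result, elapsed, peak = profile(lambda xs: sum([x * x for x in xs]), samples)

# Derive a per-item memory cost from the measurement, then slice the data.
bytes_per_item = max(1, peak // len(samples))
chunks = slice_sizes(len(samples), bytes_per_item, memory_budget=1_000_000)
```

The same pattern extends to other metrics, such as bytes read or written per phase, which is one way the compute-intensive and I/O-intensive phases in the second element could be told apart.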

The second PhD project will work on some of the most extreme datasets observed in radio astronomy to date, which provide an ideal test bed for the data-driven paradigm. The data will come from the Australian SKA Pathfinder and from a deep HI pathfinder project called CHILES. The student will investigate the ideas and methods, demonstrating new approaches on frontier data products. The student will be supervised primarily by Dr. Dodson.

We are interested in hearing from potential candidates from any STEM background, as the range of skill sets required (and to be developed) cannot be limited to one traditional field of study. The candidate would join an active multi-disciplinary group with many opportunities for scientific and commercial cross-fertilisation.

 

SDP prototyping workflow, based on that for the deep HI project CHILES. The daily observations are split into many small frequency sub-bands that can be imaged and cleaned in parallel and then recombined into the final data product: an image cube covering redshifts between 0 and 0.5, which allows us to explore the local Universe in HI and discover the most distant HI galaxies (Fernández 2016).
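The split, process-in-parallel, and recombine pattern described in this caption can be sketched in a few lines. This is an illustrative toy, not the CHILES or SDP pipeline: the function names are hypothetical, a mean subtraction stands in for imaging and cleaning, and a local thread pool stands in for distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_subbands(channels, width):
    """Split a list of channel values into consecutive sub-bands."""
    return [channels[i:i + width] for i in range(0, len(channels), width)]


def process_subband(subband):
    """Placeholder for imaging and cleaning: subtract the sub-band mean."""
    mean = sum(subband) / len(subband)
    return [value - mean for value in subband]


def run_workflow(channels, width, workers=4):
    """Process each sub-band independently, then recombine in channel order."""
    subbands = split_subbands(channels, width)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so recombination is a
        # simple concatenation.
        processed = list(pool.map(process_subband, subbands))
    return [value for subband in processed for value in subband]


cube = run_workflow([1.0, 3.0, 2.0, 4.0, 10.0], width=2)
```

Because each sub-band is processed without reference to the others, the work scales out naturally; the only ordering constraint is at the recombination step.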


Image from a single velocity channel in an image cube made in a highly distributed fashion using Amazon Web Services cloud computing. The hydrogen line emission from a single galaxy is clearly visible, with flux levels around 1 mJy/beam. The inset spectrum shows the integrated flux across the galaxy as a function of frequency (in GHz), showing that the emission is limited to a fraction of a MHz. From the frequency, we calculate the distance to this galaxy to be 30 Mpc.
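The step from observed frequency to distance can be sketched as follows, under stated assumptions. The HI rest frequency (about 1420.406 MHz) is a physical constant; the Hubble constant of 70 km/s/Mpc, the low-redshift approximation d ≈ cz/H0, and the example observed frequency of 1.4105 GHz are assumptions chosen for illustration and are not values read from the figure.

```python
HI_REST_MHZ = 1420.405751768   # rest frequency of the HI 21 cm line
C_KM_S = 299_792.458           # speed of light in km/s
H0_KM_S_MPC = 70.0             # assumed Hubble constant


def redshift(observed_mhz):
    """Redshift from the observed frequency of the HI line."""
    return HI_REST_MHZ / observed_mhz - 1.0


def hubble_distance_mpc(observed_mhz):
    """Low-redshift distance estimate d = c * z / H0, in Mpc."""
    return C_KM_S * redshift(observed_mhz) / H0_KM_S_MPC


# Illustrative observed frequency (an assumption), in MHz.
z = redshift(1410.5)
distance = hubble_distance_mpc(1410.5)
```

With these assumed inputs the redshift is about 0.007 and the distance comes out at roughly 30 Mpc. The linear Hubble law is only adequate at such low redshifts; over the full 0 to 0.5 range of the cube a cosmological distance calculation would be needed.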
