Vinicius Santino Alves - AIMS@JCU

Vinicius Santino Alves

vinicius.santinoalves@jcu.edu.au

Recipient of an AIMS@JCU Scholarship

PhD
College of Science and Engineering

Machine learning approach to restoration, prediction and quality control of oceanographic data from IMOS Moorings.

The researcher gained experience in computational intelligence during his master's degree, where he studied intelligent systems, data mining and fuzzy set theory, as well as more fundamental subjects such as the theoretical foundations of computing, programming and data structures.

He has published two first-author articles related to his master's thesis at peer-reviewed international conferences, and has co-authored a paper in a high-impact international journal with his supervisor and co-supervisor. Together, these publications have more than 100 citations.

While preparing his thesis, he developed eight algorithm variations and compared them over a large number of datasets, using a statistically valid approach, to identify the most robust and fastest algorithm.

Machine learning approach to restoration, prediction and quality control of oceanographic data from IMOS Moorings.

2018 to 2022

Project Description

Reliable data on the state of the ocean and coastal areas are in growing demand. For example, the Australian National Mooring Network, as part of the Integrated Marine Observing System (IMOS), has measured physical and biological parameters at over 50 sites in Australian coastal waters for the past 10 years. The resulting data collection consists of a huge amount of time-series information across many variables. This project provides a candidate the opportunity to investigate new machine learning approaches to time-series analysis and their application to increasing the value of oceanographic data. The full collection of IMOS Moorings data will be available for use in training the new algorithms developed. Much of this data has already been flagged by heuristic quality control routines, and manually annotated by domain experts.

Project Importance

By enhancing the quality control of oceanographic data, the researcher expects to make IMOS Moorings data more reliable and better suited to analysis by domain experts.

Project Methods

For anomaly and outlier detection, the project will follow two main lines of investigation: unsupervised/semi-supervised techniques and supervised techniques. For supervised techniques, the data that have already been flagged by heuristic quality control routines and manually annotated by domain experts will be used to train existing anomaly detection techniques for time-series data. Based on the observed behaviour of the trained models, the methods applied may be further adjusted and improved to better fit the requirements of the problem at hand. New techniques may also be developed if needed.

Since the available flagged data may not be sufficient to train supervised anomaly detectors satisfactorily, techniques for semi-supervised and/or unsupervised learning will also be investigated. For semi-supervised learning, a possible line of investigation is to generalise methods of semi-supervised classification, such as one-class classification and label expansion techniques, to the time-series domain. For unsupervised learning, the student will have the opportunity to investigate novel unsupervised outlier detection techniques and/or adapt state-of-the-art techniques, such as outlier ensembles, to the time-series domain.
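As a rough illustration of unsupervised outlier flagging on a time series, the sketch below uses a sliding-window robust z-score based on the median and the median absolute deviation (MAD). This is a simple stand-in baseline, not one of the techniques the project names (one-class classification, outlier ensembles). The function name `rolling_mad_flags`, the window size, the threshold and the sample temperatures are all assumptions made for the example.

```python
from statistics import median


def rolling_mad_flags(values, window=5, threshold=3.5):
    """Flag points that deviate strongly from their local neighbourhood.

    Hypothetical helper for illustration only. Assumes `values` holds at
    least two numbers and `window` is at least 3, so that every point has
    at least one neighbour.
    """
    half = window // 2
    flags = []
    for i, x in enumerate(values):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        # Neighbours inside the window, excluding the point under test.
        neighbours = values[lo:i] + values[i + 1:hi]
        med = median(neighbours)
        mad = median([abs(v - med) for v in neighbours])
        # 1.4826 makes the MAD comparable to a standard deviation
        # when the data are roughly Gaussian.
        scale = 1.4826 * mad
        if scale == 0:
            # All neighbours identical: any different value is flagged.
            flags.append(x != med)
        else:
            flags.append(abs(x - med) / scale > threshold)
    return flags


# Made-up water temperatures (degrees C) with one obvious spike.
temps = [20.1, 20.3, 20.2, 35.0, 20.4, 20.2, 20.3]
flags = rolling_mad_flags(temps)
```

The median and MAD are used instead of the mean and standard deviation so that the spike itself does not inflate the local scale estimate of its neighbours.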

Once methods are in place that can successfully detect anomalies/outliers, data-driven predictive techniques such as neural networks can be investigated to cope with the detected anomalies and also for time-series forecasting of the oceanographic phenomena of interest.
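To make the forecasting step concrete, here is a minimal sketch that fits a first-order autoregressive model, x[t] = a + b * x[t-1], by ordinary least squares and iterates it forward. It is a deliberately simple baseline standing in for the data-driven predictors (such as neural networks) mentioned above. The function names `fit_ar1` and `forecast` and the sample series are assumptions made for the example.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1].

    Hypothetical helper for illustration. Assumes `series` has at least
    three values and that the lagged values are not all identical.
    """
    xs = series[:-1]   # lagged values x[t-1]
    ys = series[1:]    # targets x[t]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast(series, steps):
    """Iterate the fitted AR(1) model `steps` times past the last value."""
    a, b = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = a + b * last
        predictions.append(last)
    return predictions


# Made-up series that follows x[t] = 1 + 0.5 * x[t-1] exactly.
history = [10.0, 6.0, 4.0, 3.0, 2.5]
a, b = fit_ar1(history)
next_two = forecast(history, 2)
```

In practice the anomalies flagged earlier would be removed or corrected before fitting, since a single spike can distort the least-squares estimate.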

Project Results

This PhD project will contribute to the following tasks:
• Develop a forecasting methodology for extrapolating data on different timescales.
• Analyse and understand the relationships between different oceanographic, water quality and ecosystem parameters in the tropics.
• Investigate a machine learning approach to automate the quality control process of oceanographic data by flagging anomalies and outliers.
• Develop and implement an artificial intelligence technique for data gap interpolation.
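For the data gap interpolation task, a useful reference point is plain linear interpolation between the nearest valid samples, against which any learned method could be compared. The sketch below is such a baseline, not the artificial intelligence technique the project aims to develop. The function name `fill_gaps_linear`, the use of `None` for missing samples and the assumption of evenly spaced samples are all choices made for the example.

```python
def fill_gaps_linear(values):
    """Fill interior None gaps by linear interpolation.

    Hypothetical baseline for illustration. Assumes evenly spaced samples;
    leading and trailing gaps are left as None because there is no valid
    sample on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of consecutive valid samples.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Made-up record with a two-sample gap and missing end points.
record = [None, 1.0, None, None, 4.0, None]
repaired = fill_gaps_linear(record)
```

Linear interpolation ignores tides, seasonality and relationships between variables, which is where a learned model would be expected to do better on long gaps.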

Keywords

Biostatistics,
Communication / Education,
Economic development,
Management tools,
Marine planning,
Oceanography,
Quantitative marine science,
Sea level rise

Supervised By:

Ricardo Campello (JCU)

Paul Rigby (AIMS)

Ickjai Lee (JCU)

Oleg Makarynskyy (AIMS)