Artificial Intelligence

Collaborating Partner – University of Sydney

Dr Wanli Ouyang
Associate Professor – University of Sydney
Dr Ouyang received his Ph.D. from the Department of Electronic Engineering at the Chinese University of Hong Kong and is now an associate professor at the University of Sydney. He is one of the three most influential scholars in the field of artificial intelligence in Australia.

Dr Ouyang has been recognised as a best reviewer at ICCV and is a guest editor of IJCV, a senior member of IEEE, and exhibition chair of ICCV. He serves as a reviewer for TPAMI, IJCV, TOG, TIP, CVPR, ICCV, SIGGRAPH and other journals and conferences. As first author, he has published 7 articles in TPAMI and IJCV, and he has published more than 60 articles at first-class international conferences in computer vision and related fields, namely CVPR, ICCV and NIPS. His team participated in the ImageNet competition, placing second in image object detection in 2014, first in video object detection in 2015, and first in image/video object detection in 2016. The team also placed first in object detection in the 2018 COCO competition.

Project One – STEM Club
Koala Counting System
Project Description

This project aims to develop an AI-based koala counting system. Cameras mounted on drones will capture RGB+infrared images and videos. Image processing and artificial intelligence techniques will be developed to interpret these RGB+infrared images in order to localise and count koalas.

The deliverable system is composed of the following components:

  • 1) Drones, which are used to fly over the identified koala habitat.
  • 2) RGB+Infrared cameras on each drone to capture koala images.
  • 3) An AI computing device mounted on each drone.
  • 4) An AI algorithm used to locate and count koalas. The AI algorithm takes the visual data from the cameras as input and runs on the onboard computing devices. The resulting koala locations and counts are transmitted in real time.

Intelligent image processing and recognition, which automatically find the koalas captured by the cameras, are the key components of this proposal.

Deep learning is an emerging machine learning method for intelligent image processing. More specifically, the deep learning model takes as input the multi-modality data from the RGB+infrared cameras and outputs the location of each koala in the captured images. Once all koala locations have been identified, the deep model uses them to count the number of koalas.
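The step from koala locations to a koala count can be illustrated with a minimal sketch. It assumes the detector returns candidate boxes with confidence scores, and it uses a common post-processing recipe (score thresholding followed by greedy non-maximum suppression) so that one animal covered by several overlapping boxes is counted once. The `Detection` type, the `count_koalas` function, and both threshold values are hypothetical and chosen for illustration only; they are not the project's actual method.

```python
from typing import Iterable, List, NamedTuple


class Detection(NamedTuple):
    """One candidate koala box in pixel coordinates, with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_koalas(
    detections: Iterable[Detection],
    score_thr: float = 0.5,  # assumed confidence cut-off
    iou_thr: float = 0.5,    # assumed overlap limit for duplicate boxes
) -> int:
    """Count distinct detections after thresholding and greedy NMS."""
    confident = [d for d in detections if d.score >= score_thr]
    confident.sort(key=lambda d: d.score, reverse=True)
    kept: List[Detection] = []
    for det in confident:
        # Keep a box only if it does not heavily overlap a stronger kept box.
        if all(iou(det, k) < iou_thr for k in kept):
            kept.append(det)
    return len(kept)


example = [
    Detection(10, 10, 50, 50, 0.9),      # koala A
    Detection(12, 12, 52, 52, 0.8),      # duplicate box on koala A
    Detection(200, 200, 240, 240, 0.7),  # koala B
    Detection(300, 300, 320, 320, 0.2),  # low-confidence candidate
]
print(count_koalas(example))
```

In this example the second box overlaps the first strongly and is suppressed, and the last box falls below the score threshold, so two animals are counted.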

Proposed Solution

In this project, we propose to develop multi-modality deep learning techniques and the corresponding real-time implementations using light-weight cameras mounted on UAVs. For the system to be deployed for koala counting, accuracy and speed are critical issues in the design.

Accuracy measures how closely the system's counts match the actual koala populations across NSW. Multi-modality data provide complementary information that helps to improve the accuracy.
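Counting accuracy can be quantified in several ways. The sketch below shows two common ones, assuming that manually verified counts are available for a set of images or survey sites: the mean absolute error per image and the relative error of the total count. The function names and the sample numbers are hypothetical and are not results from this project.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Average per-image difference between predicted and true counts."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_total_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Error of the overall count as a fraction of the true total."""
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("true total must be positive")
    return abs(sum(predicted) - total_actual) / total_actual


# Made-up counts for four images, for illustration only.
predicted_counts = [3, 0, 5, 2]
true_counts = [4, 0, 5, 1]

print(mean_absolute_error(predicted_counts, true_counts))
print(relative_total_error(predicted_counts, true_counts))
```

In this made-up example the total count is exact even though two individual images are each off by one, which is why reporting both measures is useful.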

Our research team, led by Dr Ouyang, headed the CUImage team that won the ImageNet Large Scale Visual Recognition Challenge on object detection, which involves localising 200 object categories, including koala, elephant, zebra, and other animals.

The focuses of our solution are listed as follows:

Our proposed solution will use RGB imaging, which can easily reach a resolution of 1920 x 1080. It is well known that higher resolution generally leads to better accuracy in computer vision tasks such as counting. In addition, the higher resolution of RGB imaging allows the drone to fly at a greater distance from the koalas.


RGB images help to distinguish koalas from other animals. Thermal data provide only heat information, which makes it hard to tell different animals apart. In comparison, the RGB imaging used in our solution can significantly improve the accuracy of distinguishing koalas from other animals in trees.


Our solution will adopt Dr Ouyang’s new object detection methods, which won the ImageNet and COCO object detection challenges and perform much better than existing solutions on the market.
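The link between image resolution and flying distance in the first focus above can be made concrete with a simple ground-sampling calculation. The sketch below assumes a downward-pointing pinhole camera. The 60 degree field of view, the 0.4 m koala width, the 20-pixel minimum, and the 640-pixel comparison sensor are all hypothetical values chosen for illustration, not specifications of the project hardware.

```python
import math


def ground_sample_distance(altitude_m: float, image_width_px: int, hfov_deg: float) -> float:
    """Metres of ground covered by one pixel for a downward-pointing camera."""
    footprint_m = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    return footprint_m / image_width_px


def max_altitude(
    image_width_px: int,
    hfov_deg: float,
    target_width_m: float,
    min_pixels_on_target: int,
) -> float:
    """Highest altitude at which the target still spans the required pixels."""
    # The largest ground distance one pixel may cover.
    required_gsd = target_width_m / min_pixels_on_target
    return required_gsd * image_width_px / (2.0 * math.tan(math.radians(hfov_deg) / 2.0))


HFOV_DEG = 60.0        # assumed horizontal field of view
KOALA_WIDTH_M = 0.4    # assumed apparent width of a koala seen from above
MIN_PIXELS = 20        # assumed minimum width needed for reliable detection

alt_hd = max_altitude(1920, HFOV_DEG, KOALA_WIDTH_M, MIN_PIXELS)
alt_low = max_altitude(640, HFOV_DEG, KOALA_WIDTH_M, MIN_PIXELS)  # hypothetical lower-resolution sensor

print(f"1920 px wide sensor: up to {alt_hd:.1f} m")
print(f" 640 px wide sensor: up to {alt_low:.1f} m")
print(f"ratio: {alt_hd / alt_low:.1f}")
```

Under these assumptions the usable altitude scales linearly with the image width, so a sensor three times as wide in pixels can fly three times as high while keeping the same number of pixels on each animal.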

Project Two – STEM Club
Towards Structured Understanding of Private Educational Scenes
Project Description

This project aims to develop novel deep learning algorithms that understand the activities in classroom scenes, in support of next-generation education.

Expected outcomes of this project include theoretical advances on human pose estimation, object detection and tracking, and human action recognition in classrooms.

This project will enable more effective training for teaching staff, help teaching staff improve their teaching performance with analytic evidence, and eventually reshape the workflow of teaching procedures, bringing invaluable benefits to the quality and cost of education in Australia and worldwide.

This project will develop the first intelligent classroom scene understanding system, utilizing multi-view videos captured by privacy-preserving cameras.


The project opens a new opportunity to enhance classroom teaching and learning with data-driven, evidence-rich insights by harnessing the latest groundbreaking successes of computer vision and artificial intelligence techniques. Because video-based classroom observation is the most informative method for measuring the quality of teaching, this is critical to empowering the education sector with AI, improving the safety of interactive classrooms, and innovating the practices of STEM teaching. The potential market comprises the 9,500+ schools in Australia and many more around the world.

Advance the Knowledge Base. The outcomes of this project will advance the knowledge base in both computer vision and pedagogy. For computer vision, this project aims to solve a new and challenging problem: understanding complex classroom scenes. For pedagogy, the AI-based vision system in this project offers a new, unobtrusive, objective, and efficient way to observe classrooms, which is important for measuring the quality of teaching, monitoring the engagement of students, and updating and enriching the knowledge of classroom teaching and learning.

This project brings together AI researchers, education institutes, and teachers to further advance modern education. Its success will demonstrate the great potential of AI in the education sector and encourage the adoption of AI in other sectors. In particular, the techniques for automatically understanding classroom scenes can be further explored in other sectors, such as health and transportation.