Business Methods

Predictive Modelling, Data Science and Big Data

This course is an introduction to a range of fundamental skills, techniques and tools for those aspiring to become Data Scientists. These include Big Data, Machine Learning and Cloud Computing.

Data Science, Predictive Modelling and Big Data skills are of vital and growing importance in commercial, government and not-for-profit organisations. Those in the Management, Product, Risk and IT functions benefit from skills and literacy in this area.

This two-day course introduces a range of techniques as they are commonly used in business, and provides practical experience in their use.

Advanced R

This course is for R users, already applying the tool in real-world applications, who are looking for more efficient and powerful ways to:

  • Manipulate data and automate their analysis and research.
  • Develop R applications.
  • Speed up their R applications.
  • Make use of wider memory stores.

Advanced Machine Learning Masterclass

This course is for experienced machine-learning practitioners who are seeking to improve their skills and understanding of the field and to develop proficiency in building more accurate, efficient, and robust models. The aim of the course is to connect deeper theory to practice, so you can create faster, more accurate and appropriate models, and make more effective use of related techniques.

Ideal preparation for this course includes Presciient's course "Predictive Modelling, Data Science and Big Data," as well as work experience in the field, self-study, and participation in online predictive modelling tasks such as those offered by Coursera, Cloudera, and Kaggle. Participants should ideally be familiar with R, and have experience in using R for machine learning. Attendees should have some background in relevant statistics, as well as some practical experience with machine learning.

This course will cover advanced machine learning tools such as bagging, Lasso, elastic net, randomForest, gradient boosting, neural networks, and deep learning, giving students hands-on experience in using them.

The course will also cover key issues in modern machine learning: sparse data sets, and wide data sets with large numbers of categorical fields.

Matters vital to improved model accuracy, such as feature selection and feature generation, will be dealt with, along with methods to control overfitting in large data sets, including regularisation, dropout, and bagging.

The course will also provide experience with exploratory techniques for the investigation of large data sets, and discuss preprocessing techniques for atypical data sets such as network data.

Central to the course will be methods of error measurement and model selection, including k-fold cross-validation, out-of-time sampling, out-of-bag-based early stopping in boosting, regularisation, and advanced methods such as nested k-fold cross-validation. This section will also include a discussion of the theory of controlling overfitting, measuring model stability over time, and the benefits of robust, simple models. There will also be discussion of "the Curse of Dimensionality," which involves issues with high-dimensional spaces and time variation in the system being modelled.
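As an illustration of the k-fold cross-validation mentioned above, here is a minimal sketch in plain Python. It is not taken from the course materials: the function names (k_fold_indices, cross_validate, fit_mean, predict_mean) and the toy data are hypothetical, and the "model" simply predicts the training mean so that the example stays self-contained.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k shuffled, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, fit, predict, k=5, seed=0):
    """Return the mean squared error measured on each held-out fold."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        # Train on every observation that is not in the held-out fold
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores

# Hypothetical toy model: always predict the mean of the training targets
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
scores = cross_validate(xs, ys, fit_mean, predict_mean, k=5)
print(len(scores), sum(scores) / len(scores))
```

Each observation is held out exactly once, so the average of the fold scores estimates error on unseen data. Nested cross-validation would repeat this loop inside each training split in order to tune model settings.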

As well as supervised machine learning, the course will present advanced unsupervised learning methods for data exploration and outlier detection. These will include randomForest-based metric-independent outlier detection and clustering, as well as neural autoencoding (the basic building block of deep learning and an automated method of feature selection) and methods such as principal components analysis and singular value decomposition.

Participants will be introduced to a range of tools in R for enabling advanced machine learning, including:

  • randomForest for random forests
  • gbm for gradient boosting
  • glmnet for elastic net regularised generalised linear models
  • kernlab for support vector machines

In addition to tools in R, this course will, time permitting, introduce students to Vowpal Wabbit, an extremely fast and scalable machine learning tool; Theano, a GPU-enabled deep-learning library in Python; and the use of cloud-based tools to perform advanced machine learning.

Forecasting and Trend Analysis

Forecasting techniques are a key part of decision making, affecting planning at all levels: strategic, tactical and operational. Effective and accurate forecasting techniques are particularly important in a dynamic, changing and uncertain environment. They are indispensable for coordination, execution, and risk management. An understanding of the fundamentals of forecast assessment is vital for all managers and executives. Knowing how to calculate and build different kinds of forecasts is a related, specialised skill.

Examining "what has happened" is a necessary precursor to predicting "what will happen" in the future. Time series analysis is thus an important forecasting foundation.

This one-day course provides a grounding in both time series analysis and forecasting.

Introduction to Data Analysis for Australian Public Service (APS) Professionals

This is a course for professionals in the Australian Public Service, which introduces core concepts and skills in data analysis to those who are absolute beginners in the area. It is accessible for those with no experience of programming, no mathematics since high school, and no experience or training in data analysis.

The course combines basic theoretical and practical components, presented in a gentle, accessible way. The focus is on intuition, simple language, pictures, and experience—rather than formulas, mathematical jargon, and rote learning.

Introduction to Data Analysis for Absolute Beginners

This course introduces core concepts and skills in data analysis to those who are absolute beginners in the area. It is accessible for those with no experience of programming, no mathematics since high school, and no experience or training in data analysis.

It combines basic theoretical and practical components, presented in a gentle, accessible way. The focus is on intuition, simple language, pictures and experience rather than formulas, mathematical jargon, and rote learning.

Fundamentals of Data Analysis

This course introduces core concepts and skills in data analysis to those who are absolute beginners in the area. It is accessible for those with no experience of programming, no mathematics since high school, and no experience or training in data analysis.

It combines basic theoretical and practical components, presented in a gentle, accessible way. The focus is on intuition, simple language, pictures, and experience—rather than formulas, mathematical jargon, and rote learning.

Data Literacy for Executives

This course introduces core concepts and skills in data analysis for executives who are growing their analytical skills. It is accessible for those with no experience of programming, no mathematics since high school, and no experience or training in data analysis. It combines basic theoretical and practical components, presented in a gentle, accessible way. The focus is on intuition, simple language, pictures, and experience—rather than formulas, mathematical jargon, and rote learning.

Predictive Modelling and Advanced Analytics using R

This course is an introduction to a range of fundamental skills, techniques and tools for those aspiring to become Data Scientists. These include Big Data, Machine Learning and Cloud Computing. Data Science, Predictive Modelling and Big Data skills are of vital and growing importance in commercial, government and not-for-profit organisations.

Those in the Management, Product, Risk and IT functions benefit from skills and literacy in this area. This two-day course introduces a range of techniques as they are commonly used in business, and provides practical experience in their use.

Introduction to R and Data Visualisation

R is the most popular data mining and statistics package in the world, and it is free to use. It is also easy to use thanks to a range of intuitive graphical user interfaces for statistics, data mining, and interactive visualisation. It is used by a growing number of commercial and government organisations, and is also the tool of choice of elite data mining competition winners. R is open source, flexible, and customisable. Over 4,000 R packages are available as extensions to the base environment, constituting one of the largest and most up-to-date collections of cutting-edge analytics tools in the world. It is also one of the most visually spectacular and universally applicable data visualisation tools.

Data Analytics for Fraud and Anomaly Detection, Security and Forensics

This course introduces attendees to a range of data analysis methods for the detection of fraud, abuse and suspicious behaviour. The course provides key concepts along with hands-on practice with a range of readily available and free tools, including Microsoft Excel and R, a powerful open source data analysis tool.
