Data Science Training

Online Instructor-led (3 days)

Classroom (3 days)

Online Self-paced (24 hours)

Python Data Science Training Course Outline

Introduction to Python

Working with IPython

  • Launching IPython Shell and Jupyter Notebook
  • Keyboard Shortcuts in the IPython Shell
  • IPython Magic Commands
    • Pasting Code Blocks: %paste and %cpaste
    • Running External Code: %run
    • Timing Code Execution: %timeit
    • Help on Magic Functions: %magic and %lsmagic
  • IPython’s In and Out Objects
  • IPython and Shell Commands
  • Errors and Debugging
  • Profiling and Timing Code

Introduction to NumPy

  • Understand Data Types in Python
  • NumPy Arrays
  • Computation on NumPy Arrays: Universal Functions
  • Aggregations: Min, Max and more
  • Computation on Arrays: Broadcasting
  • Comparison, Boolean Logic, and Masks
  • Fancy Indexing
  • Sorting Arrays
  • NumPy’s Structured Array

Working with Pandas

  • Installing and Using Pandas
  • Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Concat and Append
  • Merge and Join
  • Aggregations and Grouping
  • Pivot Tables
  • Vectorised String Operations
  • Working with Time Series
  • eval() and query()

Visualisation with Matplotlib

  • Overview of Matplotlib
  • Two Interfaces
  • Simple Line Plots and Scatter Plots
  • Visualising Errors
  • Density and Contour Plots
  • Histograms, Binnings, and Density
  • Customising Plot Legends
  • Customising Colorbars
  • Multiple Subplots
  • Text Annotation
  • Customising Ticks
  • Customising Matplotlib: Configuration and Stylesheets
  • Three-Dimensional Plotting in Matplotlib
  • Geographic Data with Basemap
  • Visualisation with Seaborn

Prerequisites

There are no prerequisites for attending this course. However, a basic understanding of programming would be beneficial.

Audience

Anyone interested in having a career in Python can attend this course. This course is well-suited for:

  • Big Data and Analytics Professionals
  • Software Developers
  • Project and BI Managers
  • ETL Professionals

Python Data Science Training Course Overview

Python is a powerful, easy-to-use open-source language with extensive libraries for data manipulation and analysis. It is a multi-paradigm programming language that supports object-oriented, functional, and structured programming. This Python Data Science Training is designed to equip delegates with the Python programming knowledge needed for data science.

In this 3-day training, delegates will learn how to create arrays from scratch and from Python lists. Delegates will acquire a comprehensive knowledge of data manipulation with pandas. In addition, they will learn how to rearrange multi-indices, combine datasets, and work with time series. Delegates will also gain an understanding of simple line plots and scatter plots.

During this course, delegates will gain in-depth knowledge of how to visualise a three-dimensional function. They will also become familiar with histograms, binnings, and density, and learn how to customise plot legends and colorbars. On completion of this training, delegates will also be able to customise Matplotlib.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (4 days)

Classroom (4 days)

Online Self-paced (32 hours)

Advanced Data Science Certification Course Outline

Module 1: Python for Data Analysis – NumPy

  • Introduction to NumPy
  • NumPy Arrays
  • Aggregations
  • Computation on Arrays: Broadcasting
  • Comparison, Boolean Logic and Masks
  • Fancy Indexing
  • Sorting Arrays
  • NumPy’s Structured Arrays

Module 2: Python for Data Analysis – Pandas

  • Installing Pandas
  • Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Concat and Append
  • Merge and Join
  • Aggregations and Grouping
  • Pivot Tables
  • Vectorised String Operations
  • Working with Time Series

Module 3: Python for Data Visualisation – Matplotlib

  • Overview
  • Object-Oriented Interface
  • Two interfaces
  • Simple Line Plots and Scatter Plots
  • Visualising Errors
  • Contour Plots
  • Histograms, Binnings and Density
  • Customising Plot Legends
  • Customising Colour Bars
  • Multiple Subplots
  • Text Annotation
  • Three Dimensional Plotting

Module 4: Python for Data Visualisation – Seaborn

  • Installing Seaborn and Load Dataset
  • Plot the Distribution
  • Regression Analysis
  • Basic Aesthetic Themes and Styles
  • Distinguish between Scatter Plots, Hexbin Plots and KDE Plots
  • Use Boxplots and Violin Plots

Capstone 1: Retrieving, Processing and Visualising Data with Python

Module 5: Machine Learning

  • Introduction
  • Importance
  • Types
  • Working
  • Machine Learning Mathematics

Module 6: Natural Language Processing

  • Introduction
  • NLP Example
  • Advantages
  • NLP Applications

Module 7: Deep Learning

  • Introduction
  • Importance
  • Working

Module 8: Big Data

  • Big Data Analytics
  • State of Practice in Analytics
  • Main Roles for New Big Data Ecosystem
  • Phases of Data Analytics Lifecycle

Capstone 2: Machine Learning Applications in Retail, Hospitality, Education and Insurance Sectors

Module 9: Working with Data in R

  • Data Manipulation in R
  • Data Clean Up
  • Reading and Exporting Data
  • Importing Data
  • Charts and Graphs

Module 10: Regression in R

  • Regression Analysis
  • Linear Regression
  • Logistic Regression
  • Multiple Regression
  • Normal Distribution
  • Binomial Distribution

Capstone 3: Retrieving, Processing and Visualising Data with R

Module 11: Modelling Data in Power BI

  • Power BI Data Model
  • What are Relationships?
  • Viewing Relationships
  • Creating Relationships
  • Cardinality

Module 12: Shaping and Combining Data using Power BI

  • The Query Editor
  • Shaping Data and Applied Steps
  • Advanced Editor
  • Formatting Data
  • Transforming Data
  • Combining Data

Module 13: Interactive Data Visualisations

  • Page Layout and Formatting
  • Multiple Visualisation
  • Creating Charts
  • Using Geographic Data
  • Histograms

Capstone 4: Product Sales Analysis using Power BI

Prerequisites

There are no formal prerequisites for attending this Advanced Data Science Certification. However, prior knowledge of programming languages will be beneficial for delegates.

Audience

This Advanced Data Science Certification is suitable for anyone who wants to take their skills to the next level and add to their existing skillset. It is particularly beneficial for:

  • Data Scientists
  • Data Engineers
  • Business Analysts
  • Data Analysts
  • Data Architects
  • Machine Learning Engineers

Advanced Data Science Certification Course Overview

The Knowledge Academy’s 4-day Advanced Data Science Certification provides delegates with comprehensive knowledge of the basic to advanced concepts needed to become a Data Scientist. Delegates will learn concepts such as NumPy arrays, installing Pandas, the object-oriented interface, regression analysis, and machine learning mathematics. The training is conducted by highly experienced, professional trainers with years of experience in teaching Data Science courses.

In addition, delegates will learn the following essential concepts:

  • Working with time series
  • Three dimensional plotting
  • Installing Seaborn and load dataset
  • Phases of data analytics lifecycle
  • Shaping and combining data using Power BI

After attending this expert training, delegates will be able to operate on data in Pandas and work with time series. They will also be able to shape and combine data using Power BI and implement interactive data visualisations.

  • Delegate pack consisting of course notes and exercises
  • Courseware
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Probability and Statistics for Data Science Training Course Outline

The following modules will be covered during this Probability and Statistics for Data Science Course:

Basic Probability Theory

  • Probability Spaces
  • Conditional Probability
  • Independence

Random Variables

  • Definition
  • Discrete Random Variables 
  • Continuous Random Variables
  • Conditioning on an Event
  • Functions of Random Variables
  • Generating Random Variables
  • Proofs

Multivariate Random Variables

  • Discrete Random Variables
  • Continuous Random Variables
  • Joint distributions of Discrete and Continuous Variables
  • Independence
  • Functions of Several Random Variables
  • Creating Multivariate Random Variables
  • Rejection Sampling

Expectation

  • Expectation Operator
  • Mean and Variance
  • Covariance
  • Conditional Expectation
  • Proofs

Random Processes

  • Definition
  • Mean and Autocovariance Functions
  • Independent Identically-Distributed Sequences
  • Gaussian Process
  • Poisson Process
  • Random Walk

Convergence of Random Processes

  • Types of Convergence
  • Law of Large Numbers
  • Central Limit Theorem
  • Monte Carlo Simulation

Markov Chains

  • Time-Homogeneous Discrete-Time Markov Chains
  • Recurrence
  • Periodicity
  • Convergence
  • Markov-Chain Monte Carlo

Descriptive Statistics

  • Histogram
  • Sample Mean and Variance
  • Order Statistics
  • Sample Covariance
  • Sample Covariance Matrix

Frequentist Statistics

  • Independent Identically-Distributed Sampling
  • Mean Square Error
  • Consistency
  • Confidence Intervals
  • Nonparametric Model Estimation
  • Parametric Model Estimation
  • Proofs

Bayesian Statistics

  • Bayesian Parametric Models
  • Conjugate Prior
  • Bayesian Estimators

Hypothesis Testing

  • The Hypothesis-Testing Framework
  • Parametric Testing
  • Nonparametric Testing: The Permutation Test
  • Multiple Testing

Linear Regression

  • Linear Models
  • Least-Squares Estimation
  • Overfitting

Prerequisites

There are no formal prerequisites for attending this course.

Audience

Anyone interested in learning how to apply probability and statistics to data science can attend this course.

Probability and Statistics for Data Science Training Course Overview

Probability is a fundamental skill for success in the business world. This Probability and Statistics for Data Science training course is designed to acquaint delegates with the most fundamental concepts in the field of probability. The course will equip delegates with the knowledge of probability and statistics needed to tackle problems in business and data science.

The Knowledge Academy’s Probability and Statistics for Data Science training is crafted to equip delegates with a comprehensive understanding of complicated probabilistic concepts, including Bayesian probability, conditional probability, and probability distributions.

During this 2-day course, delegates will learn about discrete and continuous random variables. The course will teach delegates how to generate multivariate random variables. In addition, delegates will gain knowledge of Gaussian and Poisson processes. On completion of this training, delegates will be familiar with parametric and nonparametric testing.
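
As a small illustration of the conditional probability topics listed in the outline, the sketch below applies Bayes' rule to a hypothetical screening test. The function name `posterior` and all figures (1% prevalence, 95% sensitivity, 90% specificity) are assumptions invented for this example and are not taken from the course material.

```python
from fractions import Fraction

def posterior(prior, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' rule."""
    # P(positive and condition)
    true_pos = sensitivity * prior
    # P(positive and no condition)
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical figures, chosen only for illustration.
result = posterior(Fraction(1, 100), Fraction(95, 100), Fraction(90, 100))
print(result, float(result))
```

Exact fractions are used here so the arithmetic can be checked by hand; with these assumed figures, a positive result still leaves the condition unlikely because the prior is small.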

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Text Mining Training Course Outline

Introduction to Text Mining

  • What is Text Mining?
  • Text Mining Systems Architecture

Core Text Mining Operations

  • Text Mining Operations
  • Text Mining Query Languages

Text Mining Preprocessing Techniques

  • Task-Oriented Approaches

Categorisation

  • Text Categorisation Applications
  • Document Representation
  • Knowledge Engineering and Machine Learning Approach to TC
  • Using Unlabeled Data
  • Evaluating Text Classifiers

Introduction to Clustering

  • Clustering Tasks in Text Analysis
  • Clustering Algorithms
  • Clustering of Textual Data

Information Extraction (IE)

  • Define Information Extraction
  • IE Systems Architecture
  • Anaphora Resolution
  • IE Inductive Algorithms
  • Structural Information Extraction (IE)

Probabilistic Models for IE

  • Hidden Markov Models
  • Stochastic Context-Free Grammar
  • Maximal Entropy Modeling
  • Maximal Entropy Markov Models
  • Conditional Random Fields

Preprocessing Applications

  • HMM to Textual Analysis Applications
  • Using MEMM for IE
  • Applications of CRFs to Textual Analysis
  • Using SCFG Rules
  • Bootstrapping

Presentation-Layer Considerations

  • Browsing
  • Accessing Constraints and Simple Specification Filters at the Presentation Layer
  • Accessing the Underlying Query Language

Visualisation Approaches

  • Architectural Considerations
  • Text Mining Visualisation Approaches
  • Visualisation Techniques in Link Analysis

Introduction to Link Analysis

  • Automatic Layout of Networks
  • Paths and Cycles in Graphs
  • Centrality
  • Partitioning of Networks
  • Networks Pattern Matching

Prerequisites:

There are no prerequisites for attending this course.

Audience:

Anyone who wishes to develop their knowledge and skill-set can attend this course. This course is well-suited for:

  • Data Scientists and Analysts
  • UX Researchers
  • Machine Learning Engineers

Text Mining Training Course Overview

Text mining is a knowledge-intensive process in which a user interacts with a document collection over time with the help of a suite of analysis tools. A document collection can be any grouping of text-based documents. Text Mining seeks to extract valuable information from data sources by identifying and exploring patterns.

This course is designed to provide complete knowledge of text mining operations and preprocessing techniques. Delegates will get an understanding of the text categorisation problem. They will learn about significant algorithms to perform text categorisation. Also, they will learn how to use unlabelled data and evaluate text classifiers.

During this 2-day training, delegates will be equipped with the knowledge of clustering and Information Extraction (IE). Delegates will learn how to access constraints and simple specification filters at the presentation layer. Then, delegates will be introduced to hidden Markov models and maximal entropy Markov models. On completion of this training, delegates will be able to use MEMM for Information Extraction.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Keras Training for Data Scientists Course Outline

This Introduction to Keras Training will explore the following topics:

Introduction to Keras

  • Define Keras
  • Guiding Principles
  • Why Use Keras?

Exploring Models

  • Overview of Keras Models
  • Sequential
  • Model (Functional API)

Overview of Keras Layers

  • Core Layers
  • Convolutional Layers
  • Pooling Layers
  • Locally-Connected Layers
  • Recurrent Layers
  • Embedding Layers
  • Merge Layers
  • Advanced Activations Layers
  • Normalisation Layers
  • Noise Layers
  • Layer Wrappers
  • Create Your Own Keras Layers

Preprocessing

  • Sequence Preprocessing
  • Text and Image Preprocessing
  • Losses
  • Metrics and Optimisers
  • Activations
  • Usage of Callbacks
    • BaseLogger
    • TerminateOnNaN and ProgbarLogger
    • ModelCheckpoint
    • LearningRateScheduler
    • TensorBoard
  • Datasets
  • Applications
  • Keras Backend
  • Initialisers
  • Usage of Regularizers and Constraints
  • Model Visualisations
  • Scikit-Learn API

Prerequisites

There are no formal prerequisites for attending this course.

Audience

Anyone interested in learning about Keras can attend this one-day intensive course. This course is well-suited for:

  • Business Analytics Professionals
  • Anyone beginning with Machine Learning

Keras Training for Data Scientists Course Overview

Keras is an open-source neural network library written in Python and capable of running on top of CNTK, TensorFlow, or Theano. Keras was developed to enable fast experimentation and is extensively used by data scientists to architect neural networks for complex problems. Keras serves as a higher-level API, which means it acts as an interface to backends such as Theano and TensorFlow. A Keras model is compiled with a loss function and an optimiser, and trained with the fit function. Keras does not itself handle low-level operations such as building the computational graph or creating tensors and other variables; these are delegated to the backend engine.

This 1-day Introduction to Keras Training is designed to provide delegates with knowledge of Keras and its usage. Delegates will learn about different Keras layers such as core, convolutional, pooling, locally-connected, and recurrent layers. In addition, delegates will learn how to perform sequence, text, and image preprocessing. On completion of this intensive training, delegates will be able to use regularisers and constraints.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Pandas for Data Analysis Training Course Outline

The pandas: A Python Data Analysis Toolkit is a two-day course. The following is a brief synopsis of the topics that will be covered in this course.

Introduction and Installation

  • Define pandas
  • Installing pandas
  • Running Test Suite
  • Dependencies

Getting Started with pandas

  • Package Overview
  • Exploring pandas
  • Essential Basic Functionality
  • Introduction to Data Structures
  • Comparison with Other Tools

User Guide

  • IO Tools
  • Indexing and Selecting Data
  • MultiIndex/Advanced Indexing
  • Merge, Join, and Concatenate
  • Reshaping and Pivot Tables
  • Working with Text Data, Missing Data, and Categorical Data
  • Nullable Integer Data Type
  • Visualisations
  • Computational Tools
  • Group By: Split-Apply-Combine
  • Time Series and Time Deltas
  • Styling
  • Options and Settings
  • Enhancing Performance
  • Sparse Data Structures

pandas Ecosystem

  • Statistics and Machine Learning
  • Visualisation
  • IDE
  • Data Validation
  • Extension Data Types

Development Phase

  • Contributing to pandas
  • Internals
  • Extending pandas
  • Storing pandas DataFrame Objects in Apache Parquet Format
  • Themes in pandas Development

Prerequisites

There are no prerequisites to attend this course. However, a basic knowledge of programming would be beneficial.

Audience

Anyone who wishes to develop their knowledge and skillset in Python libraries can attend this course. This course is beneficial for those who want to make a career in data science and data analytics.

Pandas for Data Analysis Training Course Overview

Pandas is an open-source Python library that provides high-performance data structures and data analysis tools for the Python programming language. Python with pandas can be used in numerous fields, such as statistics, economics, and analytics. This course is designed to provide knowledge of how to quickly and easily analyse data with Python’s powerful pandas library.

In this 2-day course, delegates will gain comprehensive knowledge of pandas and data structures. Delegates will learn how to work with text data, missing data, and categorical data. In addition, they will get an understanding of merge, join, concatenate, reshaping and pivot tables.

During this course, delegates will become familiar with visualisation, IDEs, data validation, and extension data types. Delegates will learn how to store pandas DataFrame objects in Apache Parquet format. On completion of this course, delegates will have an understanding of the various themes in pandas development.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Predictive Analytics Training Course Outline

This Predictive Analytics Training course will explore the following areas:

Introduction to Predictive Analytics

  • Predictive Analytics
  • What is Business Intelligence?
  • Predictive Analytics vs Business Intelligence
  • Challenges in using Predictive Analytics

Setting Up the Problem

  • Predictive Analytics Processing Steps
  • Business Understanding
  • Defining Data for Predictive Modelling
  • Defining the Target Variable
  • Defining Success Measures for Predictive Models

Understanding the Data

  • Single and Multiple Variables
  • Data Visualisation
  • Histograms

Data Preparation

  • Variable Cleaning
  • Feature Creation

Itemsets and Association Rules

  • Parameter Settings
  • Data Organisation Techniques
  • Deploying Association Rules
  • Making Classification Rules from Association Rules

Descriptive Modelling

  • Principal Component Analysis
  • Clustering Algorithms

Interpreting Descriptive Models

Predictive Modelling

  • Decision Trees
  • Logistic Regression
  • Neural Networks
  • K-Nearest Neighbour
  • Naïve Bayes
  • Linear Regression

Predictive Models Assessment

Model Ensembles

  • The Wisdom of Crowds
  • Bias-Variance Tradeoff
  • Bagging and Boosting
  • Improving Bagging and Boosting
  • Interpreting Model Ensembles

Text Mining

  • Structured vs Unstructured Data
  • Text Mining Applications
  • Steps of Data Preparation
  • Features of Text Mining
  • Regular Expressions

Model Deployment

Prerequisites

There are no formal prerequisites for attending this course.

Audience

Anyone who needs to gain knowledge on predictive analytics can attend this course. This course is well-suited for data analysts and data scientists.

Predictive Analytics Training Course Overview

Predictive analytics is used for making predictions about unknown future events. It analyses current data using many techniques, including data mining, modelling, statistics, artificial intelligence, and machine learning. The patterns found in transactional and historical data can be used for identifying future risks and opportunities. Predictive analytics models capture relationships among many factors to assess the risk associated with a specific set of conditions and assign a score. Predictive analytics enables organisations to become forward-looking and proactive, predicting outcomes and behaviours based on the data.

This Predictive Analytics Training course will provide delegates with the knowledge of predictive analytics and its processing steps. Delegates will become familiar with variable cleaning and feature creation. This 2-day course will equip delegates with extensive knowledge of itemsets and association rules. Delegates will also become familiar with various predictive modelling techniques, including logistic regression, k-nearest neighbour, Naïve Bayes, and more.
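
To make the idea of assigning a score from a set of conditions concrete, here is a minimal sketch that passes a weighted sum of factors through the logistic function, the form used by logistic regression (one of the techniques listed in the outline). The function name `risk_score`, the factor names, the weights, and the customer record are all hypothetical, invented for illustration; in practice the weights would be estimated from historical data rather than chosen by hand.

```python
import math

def risk_score(features, weights, bias):
    """Return a score between 0 and 1 from a weighted sum of input factors."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    # The logistic function squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and customer record, invented for illustration only.
weights = {"missed_payments": 0.8, "years_as_customer": -0.3}
customer = {"missed_payments": 2, "years_as_customer": 5}
score = risk_score(customer, weights, bias=-1.0)
print(round(score, 3))
```

A positive weight pushes the score up as the factor increases, and a negative weight pulls it down, so the score can be read as a rough risk ranking under the assumed weights.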

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Introduction to Knime Analytics Platform Course Outline

Installing Knime Analytics Platform

  • Knime Analytics Platform Installation
  • Extensions and Integrations Installation
  • Update Knime Analytics Platform and Extensions
  • Update Sites

Introduction to Knime Analytics Platform

  • Nodes and Workflows
  • Building Workflows
  • Extensions and Integrations

Exploring Knime Workbench

  • Overview of Knime Workbench
  • Customising Knime Workbench
  • Configure Knime Analytics Platform
  • Knime Tables

Knime Extensions and Integrations

  • Define Knime Extensions and Integrations
  • Community Extensions
  • Partner Extensions

CSS Styling for JavaScript Views and Quickform Nodes

  • Description of CSS Classes
  • Classes by Node

Creating New Knime Extension

  • Setting Knime SDK
  • Creating New Knime Extension Project
  • Project Structure
  • Deploying Extension

Prerequisites

There are no prerequisites to attend this course.

Audience

Anyone who wishes to learn about the basic functionalities of the Knime analytics platform can attend this course.

Introduction to Knime Analytics Platform Course Overview

Knime Analytics Platform is open-source software for creating data science applications and services. With Knime, understanding data and designing data science workflows and reusable components is accessible to everyone. Knime Analytics Platform allows you to create visual workflows with an intuitive, drag-and-drop graphical interface, without any need for coding.

In this 1-day training course, delegates will learn how to install and update the Knime Analytics Platform and extensions. Delegates will gain knowledge of Knime workbench and Knime tables. In addition, they will get an understanding of Knime extensions and integrations.

During this training course, delegates will be equipped with knowledge of community and partner extensions. Delegates will learn how to create a new Knime extension project. By the end of this training, delegates will be able to set up the Knime SDK and deploy an extension.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor
  • Certificate of Completion 

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Data Mining Training Course Outline

Getting Started with Data Mining

  • What is Data Mining?
  • What Kinds of Data Can Be Mined?
  • Data Objects and Attribute Types
  • Data Visualisation
  • Measuring Data Similarity and Dissimilarity

Data Preprocessing

  • Data Cleaning and Data Integration
  • Data Reduction
  • Data Transformation and Data Discretization

Data Warehousing and Online Analytical Processing

  • Basic Concepts of Data Warehousing
  • Data Cube and OLAP
  • Design, Usage, and Implementation of Data Warehouse

Data Cube

  • Preliminary Concepts
  • Data Cube Computation Methods
  • Multidimensional Data Analysis in Cube Space

Mining Frequent Patterns, Associations, and Correlations

  • Frequent Itemset Mining Methods
  • Pattern Evaluation Methods

Advanced Pattern Mining

  • Pattern Mining in Multilevel and Multidimensional Space
  • Constraint-Based Frequent Pattern Mining
  • Mining High-Dimensional Data and Colossal Patterns
  • Mining Compressed or Approximate Patterns
  • Pattern Exploration and Application

Classification

  • What is Classification?
  • Decision Tree Induction
  • Bayes Classification Methods
  • Rule-Based Classification
  • Model Evaluation and Selection

Advanced Methods of Classification

  • Bayesian Belief Networks
  • Backpropagation
  • Classification Using Frequent Patterns
  • Lazy Learners
  • Genetic Algorithms, Rough Set Approach, and Fuzzy Set Approaches

Cluster Analysis

  • What is Cluster Analysis?
  • Partitioning and Hierarchical Methods
  • Density-Based and Grid-Based Methods

Advanced Cluster Analysis

  • Probabilistic Model-Based Clustering
  • Clustering High-Dimensional and Graph Data
  • Clustering with Constraints

Outlier Detection

  • Outlier Analysis
  • Outlier Detection Methods
  • Statistical and Proximity-Based Approaches
  • Clustering-Based and Classification-Based Approaches
  • Outlier Detection in High-Dimensional Data

Prerequisites

There are no formal prerequisites for attending this course. However, basic knowledge of the IT industry will be beneficial.

Audience

Anyone who is interested in learning data mining can attend this course. This course is best suited for IT managers aiming to improve data management and analysis techniques.

Data Mining Training Course Overview

Data mining is the method of detecting patterns in large data sets by making use of statistics, machine learning and database systems. It includes analysing large amounts of data and converting it into useful information. The insights gained from data mining can be used for fraud detection, marketing, scientific discovery, etc.

This Data Mining Training course will provide delegates with extensive knowledge of data mining. This course will cover the main concepts of data mining, including data objects, data visualisation, measuring data similarity, and data preprocessing. Delegates will also learn about data transformation and data discretization. Data warehousing and online analytical processing are also key topics of this course, including basic data warehousing concepts, data cubes, and OLAP.

In addition, this 2-day training course will cover mining frequent patterns, associations, and correlations, including pattern evaluation methods. Delegates will acquire knowledge of advanced pattern mining, which comprises constraint-based frequent pattern mining, mining high-dimensional data and colossal patterns, and pattern exploration and application. By the end of this course, delegates will have gained comprehensive knowledge of classification methods, cluster analysis, and outlier detection.
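
As a rough illustration of what mining frequent patterns means, the sketch below counts how often each pair of items occurs together across a handful of transactions and keeps the pairs that meet a minimum support count. This is a brute-force pair count written for clarity, not one of the frequent itemset mining algorithms a course would typically teach; the function name `frequent_pairs` and the shopping baskets are hypothetical, invented for this example.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per transaction
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_pairs(baskets, min_support=2)
print(pairs)
```

Raising `min_support` narrows the result to stronger patterns; with these four baskets, a threshold of 3 leaves no pairs at all.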

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

GIS Development Training Course Outline

Introduction to Geographic Information Systems (GIS)

  • What is GIS?
  • GIS Terminologies
  • Overview of ArcMap
  • Data Representations in GIS
  • Desktop GIS Software Packages
  • GIS Analyst Skills
  • ArcGIS Licensing and Authorisation
  • Installing ArcGIS Desktop

Basics of ArcGIS

  • Explore Data Using ArcMap
  • View and Change Layer Properties
  • Feature Classes and Attribute Tables
  • Select By Attribute and Calculate Geometry
  • Select By Location
  • Define Projections
  • Analyse Data with Geoprocessing Tools
  • Setting Environment Variables
  • Assess Spatial Relationships with Spatial Join Tool

Making Maps With Common Datasets

  • Common Datasets
  • Make Maps Using Layout View
  • Core Map Elements
  • Symbology: Changing How Data Looks
  • Setting Symbology in ArcGIS
  • Labelling Map Features
  • Making Map Books

Retrieving and Sharing Data

  • Using Metadata to Document Data Products
  • Sharing Data and Maps
  • Selecting Data Format
  • Joins and Relates

Prerequisites:

There are no formal prerequisites for attending this course.

Audience:

Anyone interested in gaining a fundamental knowledge of GIS can attend this course.

GIS Development Training Course Overview

A GIS (Geographic Information System) is a framework to gather, manage, and analyse data. GIS integrates several types of data, analyses spatial location, and organises layers of information into visualisations using maps and 3D scenes. GIS comes down to just the following four simple ideas:

  • Create geographic data
  • Manage it
  • Analyse it
  • Display it on a map

The Knowledge Academy’s GIS Development Training course is designed to provide comprehensive knowledge of the fundamentals of geographic information systems. This course explores the world of spatial analysis and cartography with GIS. Delegates will learn the basics of ArcGIS, a leading GIS software tool, and will gain an understanding of how GIS has developed from paper maps into globally integrated electronic software packages.

During this 1-day course, delegates will learn how to analyse data with geoprocessing tools and will learn about core map elements and symbology. The course is delivered by our expert instructors. Delegates will learn how to create and use map packages and how to upload packages to ArcGIS Online. After completing this training, delegates will be able to create layer files and packages.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Decision Tree Modeling Using R Training Course Outline

Introduction to Decision Tree

  • Decision Tree modeling objective
  • Anatomy of a Decision Tree
  • Gains from a Decision Tree (KS calculations)
  • Definitions related to Objective Segmentations

Data Design for Modelling

  • Historical window
  • Performance window
  • Decide Performance Window Horizon Using Vintage Analysis
  • General Precautions related to Data Design

Data Treatment before Modelling

  • Data Sanity Check-Contents
  • View
  • Frequency Distribution
  • Means / Uni-variate
  • Categorical Variable Treatment
  • Missing Value Treatment Guideline
  • Capping Guideline

Classification of Tree Development and Algorithm Details

  • Preamble to data
  • Installing R package and R studio
  • Developing first Decision Tree in R studio
  • Find Strength of the model
  • Algorithm Behind Decision Tree
  • How is a Decision Tree Developed?
  • First on Categorical Dependent Variable
  • GINI Method
  • Steps taken by software programs to learn the classification (develop the tree)

Industry Practice of Classification Tree - Development, Validation and Usage

  • Discussion on Project
  • Find Strength of the Model
  • Steps taken by the Software Program to Implement the Learning on Unseen Data
  • Learning More from a Practical Point of View
  • Model Validation and Deployment

Regression Tree and Auto Pruning

  • Introduction to Pruning
  • Steps of Pruning
  • Logic of Pruning
  • Understand K Fold Validation for Model
  • Implement Auto Pruning using R
  • Develop Regression Tree
  • Interpret the Output
  • How is it Different from Linear Regression?
  • Advantages and Disadvantages over Linear Regression
  • Another Regression Tree Using R

CHAID Algorithm

  • Key Features of CART
  • Chi-square Statistics
  • Implement Chi-square for Decision Tree Development
  • Syntax for CHAID using R, and CHAID vs CART

Other Algorithms

  • Entropy in the Context of Decision Tree
  • ID3
  • Random Forest Method
  • Using R for Random Forest Method

Prerequisites

Basic knowledge of the R programming language is required before attending this course.

Audience

This training course is suitable for anyone, particularly professionals and students who want to enter the analytics industry. It is also ideal for analytics professionals and data mining professionals.

Decision Tree Modeling Using R Training Course Overview

Decision tree modelling is a popular analytic technique that can be applied in various business fields, such as money lending, automotive, and telecom.

This 1-day Decision Tree Modeling Using R Certification course is designed to provide delegates with a solid understanding of concepts such as data treatment before modelling, frequency distribution, the algorithm behind a decision tree, how a decision tree is developed, the GINI method, the steps of pruning, ID3, and the random forest method. Starting from the fundamentals of decision trees, delegates will move on to more advanced topics, including data design for modelling, classification tree development and algorithm details, industry practice of classification trees (development, validation, and usage), K fold validation for the model, the CHAID algorithm, the syntax for CHAID using R, CHAID vs CART, and using R for the random forest method.

After attending this course, delegates will gain expertise in Decision Tree Modeling using the R programming language.
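The GINI method named in the outline can be illustrated with a short sketch. This is only an illustration under stated assumptions: it is written in Python rather than R, the function names (`gini_impurity`, `weighted_gini`, `best_threshold`) and the sample data are hypothetical, and it is not the course's own material or the exact procedure used by any particular R package. It shows the general idea of scoring a candidate split by the weighted Gini impurity of the two groups it creates.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(left, right):
    """Impurity of a split: each side's impurity weighted by its share of the rows."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try the midpoint between each pair of adjacent distinct values.

    Returns (threshold, impurity) for the split with the lowest weighted
    Gini impurity, or (None, impurity of the unsplit data) if no split helps.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini_impurity(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: an income figure and whether the customer defaulted.
incomes = [20, 25, 30, 60, 70, 80]
defaulted = ["yes", "yes", "yes", "no", "no", "no"]
print(best_threshold(incomes, defaulted))
```

A tree-building program would typically repeat this search over every candidate variable and then recurse into each resulting group; the sketch covers a single numeric variable only.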

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

PySpark Training Course Outline

Module 1: Introduction to PySpark

  • What is PySpark?
  • Environment
  • Spark Dataframes
  • Reading Data
  • Writing Data
  • Transforming Data
  • MLlib
  • Pandas UDFs
  • Best Practices

Module 2: Installation

  • Using PyPI
  • Using Conda
  • Using PySpark Native Features
  • Using Virtualenv
  • Using PEX
  • Manual Downloading
  • Installing from Source
  • Dependencies

Module 3: DataFrame

  • DataFrame Creation
  • Viewing Data
  • Selecting and Accessing Data
  • Applying a Function
  • Grouping Data
  • Getting Data In/Out
  • Working with SQL

Module 4: Setting Up a Spark Virtual Environment

  • Understanding the Architecture of Data-Intensive Applications
  • Understanding Spark
  • Understanding Anaconda
  • Setting Up the Spark Powered Environment
  • Setting Up an Oracle VirtualBox with Ubuntu
  • Building First App with PySpark
  • Virtualising the Environment with Vagrant
  • Moving to the Cloud

Module 5: Building Batch and Streaming Apps with Spark

  • Architecting Data-Intensive Apps
  • Connecting to Social Networks
  • Analysing the Data
  • Exploring the GitHub World
  • Previewing App

Module 6: Learning from Data Using Spark

  • Contextualising Spark MLlib in the App Architecture
  • Classifying Spark MLlib Algorithms
  • Spark MLlib Data Types
  • Machine Learning Workflows and Data Flows
  • Clustering the Twitter Dataset
  • Building Machine Learning Pipelines

Prerequisites  

There are no formal prerequisites for this PySpark Training course.

Audience 

This PySpark Training provided by The Knowledge Academy is ideal for anyone who wants to learn how to use PySpark to work with Apache Spark from Python.

PySpark Training Course Overview

PySpark is the Python interface for Apache Spark. It is used for conducting exploratory data analysis at scale, creating machine learning pipelines, and building ETLs for a data platform. PySpark supports Spark features such as Spark SQL, DataFrame, Streaming, MLlib, and Spark Core. Its benefits for users and organisations include code that is simple to write, error handling by the framework, and a range of useful algorithms. This PySpark Training is curated by industry experts to help individuals master the skills needed to use PySpark features in their day-to-day tasks and to pursue roles in multinational companies.

In this 1-day PySpark Training course, delegates will learn about using a Conda environment to ship their third-party Python packages by leveraging conda-pack. They will gain in-depth knowledge about using virtualenv to manage Python dependencies in their clusters by using venv-pack. Further, delegates will learn other crucial concepts, such as reading, writing, and transforming data, MLlib, using PyPI, Conda, PySpark native features, Virtualenv, and PEX, and connecting to social networks. An expert trainer with years of experience in teaching technical courses will conduct this training.

This training course will cover various essential concepts, such as:

  • Spark data frames
  • MLlib
  • Setting up a Spark virtual environment
  • Building batch and streaming apps with Spark
  • Exploring the GitHub world
  • Learning from data using Spark
  • Contextualising Spark MLlib in the app architecture

After attending this training course, delegates will be able to use conceptual frameworks for implementing the architecture of data-intensive applications in their organisations. They will also be able to harvest data, ensure its integrity, and prepare it for batch and streaming processing with Spark.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Data Science with R Training Course Outline

Module 1: Introduction to Data Mining

  • Data Science
  • Knowledge Discovery in Databases (KDD)
  • Model Types
  • Classification of Data Mining Methods
  • Applications
  • Challenges
  • R Programming Language
  • Basic Concepts, Definitions, and Notations
  • Tool Installation

Module 2: Introduction to R

  • Data Types
  • Basic Tasks
  • Control Structures
  • Functions
  • Scoping Rules
  • Iterated Functions
  • Console and Package Installation

Module 3: Types, Quality, and Data Pre-Processing

  • Categories and Types of Variables
  • Pre-Processing Processes
    • Data Cleansing
    • Data Unification
    • Data Transformation and Discretisation
    • Data Reduction
  • dplyr and tidyr Packages

Module 4: Summary Statistics and Visualisation

  • Measures of Position
  • Measures of Dispersion
  • Visualisation of Qualitative Data
  • Visualisation of Quantitative Data

Module 5: Classification and Prediction

  • Classification
  • Prediction
    • Classification Vs Prediction
    • Linear Regression
    • Learning Parameter
  • Overfitting and Regularisation
    • Overfitting
    • Model Regularisation
    • Linear Regression with Normalisation

Module 6: Clustering

  • Unsupervised Learning
  • Cluster
  • k-Means Algorithm
  • Hierarchical Clustering Algorithms
  • DBSCAN Algorithm

Module 7: Mining of Frequent Itemsets and Association Rules

  • Introduction
  • Apriori Algorithm
  • Frequent Itemsets Types
  • Positive and Negative Border of Frequent Itemsets
  • Association Rules Mining
  • Alternative Methods for Large Itemsets Generation
  • FP-Growth Algorithm
  • Arules Package

Module 8: Computational Methods for Big Data Analysis

  • Introduction to Hadoop
  • Advantages of Hadoop’s Distributed File System
  • Hadoop Users
  • Hadoop Architecture
  • Hadoop Cluster Architecture
  • Hadoop Java API
  • Lists, Loops, Generic Classes, and Methods

Prerequisites

There are no formal prerequisites for attending this Data Science with R Training course.

Audience

This course is intended for anyone who wants to learn about the methods of analysing data using the R programming language.

Data Science with R Training Course Overview

Data science is the study of vast amounts of data using current methodologies and tools to discover previously unknown patterns, derive valuable information, and make profitable business decisions. R is commonly used in data science as a data analysis tool and statistical software. R delivers extensive support for data wrangling, statistical modelling, and machine learning techniques to obtain insights. With the aid of data science, businesses can monitor, manage, and gather performance metrics to enhance decision-making throughout the firm. This training equips learners with the R programming skills needed for data science, helping organisations analyse data and make more informed strategic decisions. Expertise in data science and programming can also help individuals pursue more senior roles, expand their employment options, and raise their income.

This 2-day Data Science with R Training will provide delegates with a comprehensive knowledge of Data Science and how it works with R. Delegates will become familiar with data pre-processing, which ensures the quality of the data through cleaning and transformation. They will also get acquainted with summary statistics, which provide a condensed and effective representation of statistical data. This course will be delivered by a highly professional and skilled trainer with years of teaching experience, who will make every effort to help delegates put their data science skills into practice.

With this Data Science with R course, you can apply your data science skills across a range of businesses, helping them analyse data and make better business decisions.
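As a rough illustration of the summary statistics covered in Module 4 (measures of position and measures of dispersion), the sketch below computes a few of them. It is an assumption-laden example rather than course material: it uses Python's standard `statistics` module instead of R, and the function name `summarise` and the sample values are hypothetical.

```python
import statistics


def summarise(values):
    """Return common measures of position and dispersion for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points that divide the data into quarters.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        # Measures of position
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        # Measures of dispersion
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical sample data.
summary = summarise([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(f"{name}: {value}")
```

In R, comparable figures are available from base functions such as `summary()`, `sd()`, and `IQR()`.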

Course Objectives

  • To create social web pages for storing and managing a vast amount of data
  • To transform raw data into furnished data products by churning raw data
  • To develop new features and normalise data based on the existing ones
  • To analyse and construct insights from the data with data science
  • To collect the data for utilising outcomes on a more practical level
  • To find the data and group the objects with similar attributes

At the end of this training, delegates will be able to transform and visualise information using R programming and attain insights about future events. They will also be able to perform data wrangling and apply different functions to data frames using the dplyr package.

  • Delegate pack consisting of course notes and exercises
  • Courseware
  • Experienced Instructor

Not sure which course to choose?

Speak to a training expert for advice if you are unsure of what course is right for you. Give us a call on +61 272026926 or Enquire.

Data Science Training FAQs

Data science is the study of data. It involves developing methods to record, store, and analyse data in order to extract useful information effectively. Data science combines algorithms, tools, and machine learning principles to discover hidden patterns in raw data.
Yes, The Knowledge Academy’s Data Science courses follow a structure allowing delegates to start from scratch without worrying about any formal prerequisite.
Yes, the demand for data science skills is growing rapidly. It is a great time to become a data science professional and enter the job market by attending our wide range of Data Science courses.
The Knowledge Academy is a leading global training provider for Data Science Training.
The price for Data Science Training certification in Australia starts from AUD.

Why we're the go-to training provider for you

Best price in the industry

You won't find better value in the marketplace. If you do find a lower price, we will beat it.

Trusted & Approved

We are accredited by PeopleCert on behalf of AXELOS

Many delivery methods

Flexible delivery methods are available depending on your learning style.

High quality resources

Resources are included for a comprehensive learning experience.

"Really good course and well organised. Trainer was great with a sense of humour - his experience allowed a free flowing course, structured to help you gain as much information & relevant experience whilst helping prepare you for the exam"

Joshua Davies, Thames Water

Looking for more information on Data Science Training?