Data Science Training

Online Instructor-led (3 days)

Classroom (3 days)

Online Self-paced (24 hours)

Python Data Science Training Course Outline

Introduction to Python

Working with IPython

  • Launching IPython Shell and Jupyter Notebook
  • Keyboard Shortcuts in the IPython Shell
  • Special (Magic) Commands of IPython
    • Pasting Code Blocks: %paste and %cpaste
    • Running External Code: %run
    • Timing Code Execution: %timeit
    • %magic and %lsmagic
  • IPython’s In and Out Objects
  • IPython and Shell Commands
  • Errors and Debugging
  • Profiling and Timing Code

Introduction to NumPy

  • Understanding Data Types in Python
  • NumPy Arrays
  • Computation on NumPy Arrays: Universal Functions
  • Aggregations: Min, Max and more
  • Computation on Arrays: Broadcasting
  • Comparison, Boolean Logic, and Masks
  • Fancy Indexing
  • Sorting Arrays
  • NumPy’s Structured Arrays

Working with Pandas

  • Installing and Using Pandas
  • Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Concat and Append
  • Merge and Join
  • Aggregations and Grouping
  • Pivot Tables
  • Vectorised String Operations
  • Working with Time Series
  • eval() and query()

Visualisation with Matplotlib

  • Overview of Matplotlib
  • Two Interfaces
  • Simple Line Plots and Scatter Plots
  • Visualising Errors
  • Density and Contour Plots
  • Histograms, Binnings, and Density
  • Customising Plot Legends
  • Customising Colorbars
  • Multiple Subplots
  • Text Annotation
  • Customising Ticks
  • Customising Matplotlib: Configuration and Stylesheets
  • Three-Dimensional Plotting in Matplotlib
  • Geographic Data with Basemap
  • Visualisation with Seaborn

Prerequisites

There are no prerequisites for attending this course. However, a basic understanding of programming would be beneficial.

Audience

Anyone interested in having a career in Python can attend this course. This course is well-suited for:

  • Big Data and Analytics Professionals
  • Software Developers
  • Project and BI Managers
  • ETL Professionals

Python Data Science Training Course Overview

Python is a popular open-source language that is easy to use and has powerful libraries for data manipulation and analysis. It is a multi-paradigm programming language that supports object-oriented, functional, and structured programming. This Python Data Science Training is designed to equip delegates with the Python programming knowledge needed for data science.

In this 3-day training, delegates will learn how to create arrays from scratch and from Python lists. Delegates will acquire a comprehensive knowledge of data manipulation with pandas. In addition, they will learn how to rearrange multi-indices, combine datasets, and work with time series. Delegates will also gain an understanding of simple line plots and scatter plots.

During this course, delegates will gain in-depth knowledge of how to visualise a three-dimensional function. They will also become familiar with histograms, binnings, and density, and learn how to customise plot legends and colorbars. On completion of this training, delegates will be able to customise Matplotlib as well.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (4 days)

Classroom (4 days)

Online Self-paced (32 hours)

Advanced Data Science Certification Course Outline

Module 1: Python for Data Analysis – NumPy

  • Introduction to NumPy
  • NumPy Arrays
  • Aggregations
  • Computation on Arrays: Broadcasting
  • Comparison, Boolean Logic and Masks
  • Fancy Indexing
  • Sorting Arrays
  • NumPy’s Structured Arrays

Module 2: Python for Data Analysis – Pandas

  • Installing Pandas
  • Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Concat and Append
  • Merge and Join
  • Aggregations and Grouping
  • Pivot Tables
  • Vectorised String Operations
  • Working with Time Series

Module 3: Python for Data Visualisation – Matplotlib

  • Overview
  • Object-Oriented Interface
  • Two Interfaces
  • Simple Line Plots and Scatter Plots
  • Visualising Errors
  • Contour Plots
  • Histograms, Binnings and Density
  • Customising Plot Legends
  • Customising Colour Bars
  • Multiple Subplots
  • Text Annotation
  • Three Dimensional Plotting

Module 4: Python for Data Visualisation – Seaborn

  • Installing Seaborn and Loading a Dataset
  • Plot the Distribution
  • Regression Analysis
  • Basic Aesthetic Themes and Styles
  • Distinguish between Scatter Plots, Hexbin Plots and KDE Plots
  • Use Boxplots and Violin Plots

Capstone 1: Retrieving, Processing and Visualising Data with Python

Module 5: Machine Learning

  • Introduction
  • Importance
  • Types
  • Working
  • Machine Learning Mathematics

Module 6: Natural Language Processing

  • Introduction
  • NLP Example
  • Advantages
  • NLP Applications

Module 7: Deep Learning

  • Introduction
  • Importance
  • Working

Module 8: Big Data

  • Big Data Analytics
  • State of Practice in Analytics
  • Main Roles for New Big Data Ecosystem
  • Phases of Data Analytics Lifecycle

Capstone 2: Machine Learning Applications in Retail, Hospitality, Education and Insurance Sectors

Module 9: Working with Data in R

  • Data Manipulation in R
  • Data Clean Up
  • Reading and Exporting Data
  • Importing Data
  • Charts and Graphs

Module 10: Regression in R

  • Regression Analysis
  • Linear Regression
  • Logistic Regression
  • Multiple Regression
  • Normal Distribution
  • Binomial Distribution

Capstone 3: Retrieving, Processing and Visualising Data with R

Module 11: Modelling Data in Power BI

  • Power BI Data Model
  • What are Relationships?
  • Viewing Relationships
  • Creating Relationships
  • Cardinality

Module 12: Shaping and Combining Data using Power BI

  • The Query Editor
  • Shaping Data and Applied Steps
  • Advanced Editor
  • Formatting Data
  • Transforming Data
  • Combining Data

Module 13: Interactive Data Visualisations

  • Page Layout and Formatting
  • Multiple Visualisations
  • Creating Charts
  • Using Geographic Data
  • Histograms

Capstone 4: Product Sales Analysis using Power BI

Prerequisites

There are no formal prerequisites for attending this Advanced Data Science Certification. However, prior knowledge of programming languages will be beneficial for delegates.

Audience

This Advanced Data Science Certification is suitable for anyone who wants to take their skills to the next level and add to their existing skill set. However, it is particularly beneficial for:

  • Data Scientists
  • Data Engineers
  • Business Analysts
  • Data Analysts
  • Data Architects
  • Machine Learning Engineers

Advanced Data Science Certification Course Overview

The Knowledge Academy’s 4-day Advanced Data Science Certification provides delegates with comprehensive knowledge of the basic to advanced concepts needed to become a Data Scientist. Delegates will learn various concepts such as NumPy arrays, installing Pandas, the object-oriented interface, regression analysis, and machine learning mathematics. This training will be conducted by our highly experienced and professional trainers, who have years of experience in teaching Data Science training courses.

In addition, delegates will learn essential concepts such as:

  • Working with time series
  • Three dimensional plotting
  • Installing Seaborn and loading a dataset
  • Phases of data analytics lifecycle
  • Shaping and combining data using Power BI

After attending this expert training, delegates will be able to operate on data in Pandas and work with time series. They will also be able to shape and combine data using Power BI successfully and implement interactive data visualisations.

  • Delegate pack consisting of course notes and exercises
  • Courseware
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Probability and Statistics for Data Science Training Course Outline

The following modules will be covered during this Probability and Statistics for Data Science Course:

Module 1: Basic Probability Theory

  • Probability Spaces
  • Conditional Probability
  • Independence

Module 2: Random Variables

  • What are Random Variables?
  • Discrete Random Variables
  • Continuous Random Variables
  • Conditioning on an Event
  • Functions of Random Variables
  • Generating Random Variables

Module 3: Multivariate Random Variables

  • Introduction to Multivariate Random Variables
  • Discrete Random Variables
  • Continuous Random Variables
  • Joint Distributions of Discrete and Continuous Variables
  • Independence
  • Generating Multivariate Random Variables
  • Rejection Sampling

Module 4: Expectation

  • Expectation Operator
  • Mean and Variance
  • Covariance
  • Conditional Expectation

Module 5: Random Processes

  • Introduction to Random Processes
  • Mean and Autocovariance Functions
  • Independent Identically-Distributed Sequences
  • Gaussian Process
  • Poisson Process
  • Random Walk

Module 6: Convergence of Random Processes

  • Types of Convergence
  • Law of Large Numbers
  • Central Limit Theorem
  • Monte Carlo Simulation

Module 7: Markov Chains

  • Markov Property
  • Time-Homogeneous Discrete-Time Markov Chains
  • Recurrence
  • Periodicity
  • Convergence
  • Markov-Chain Monte Carlo

Module 8: Descriptive Statistics

  • Introduction to Descriptive Statistics
  • Types of Descriptive Statistics

Module 9: Frequentist Statistics

  • Mean Square Error
  • Consistency
  • Confidence Intervals
  • Nonparametric Model Estimation
  • Parametric Model Estimation
  • Maximum Likelihood

Module 10: Bayesian Statistics

  • Bayesian Parametric Models
  • Conjugate Prior
  • Bayesian Estimators

Module 11: Hypothesis Testing

  • Hypothesis-Testing Framework
  • Parametric Testing
  • Nonparametric Testing: The Permutation Test
  • Multiple Testing

Module 12: Linear Regression

  • Introduction to Linear Regression
  • Linear Models
  • Least-Squares Estimation

Prerequisites

There are no formal prerequisites for attending this Probability and Statistics for Data Science Training course.

Audience

This training is suitable for anyone who wants to learn how to apply probability and statistics in Data Science.

Probability and Statistics for Data Science Training Course Overview

Statistics is the mathematical field concerned with gathering, examining, interpreting, and presenting numerical data, while probability is concerned with the rules governing random events. Statistical methods depend primarily on probability theory to make estimates for further analysis. This training helps organisations apply ideas of randomness, prediction, expected value, and estimation with a more logical and mathematical approach. It aims to teach best practices in probability and statistics for analysing data and deriving meaningful insights from raw and unstructured data. Completing this Probability and Statistics for Data Science Training course will allow individuals to carry out data investigations and advance their careers in Data Science.

In this 2-day Probability and Statistics for Data Science Training course, delegates will gain a thorough understanding of applying probability theory and statistics in Data Science. During this training, they will learn how conditional probability is measured within a probability space using the intersection of events. They will also learn about the joint distributions of discrete and continuous random variables, which describe the probability of events involving more than one variable. Our highly skilled tutor with years of teaching experience will conduct this course and help delegates comprehend probability theory.
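
As an illustration of the conditional probability idea mentioned above, the following is a minimal sketch rather than course material. It assumes a sample space of two fair six-sided dice and two hypothetical events chosen only for the example (`is_sum_eight` and `first_is_even`), and computes P(A | B) as P(A ∩ B) / P(B).

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all ordered outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event (a predicate on outcomes) under a uniform measure."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))


def is_sum_eight(outcome):
    # Hypothetical event A: the two dice add up to eight.
    return outcome[0] + outcome[1] == 8


def first_is_even(outcome):
    # Hypothetical event B: the first die shows an even number.
    return outcome[0] % 2 == 0


def both(outcome):
    # Intersection of A and B.
    return is_sum_eight(outcome) and first_is_even(outcome)


p_b = prob(first_is_even)
p_a_and_b = prob(both)
# Conditional probability: P(A | B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b

print(f"P(A) = {prob(is_sum_eight)}, P(B) = {p_b}, P(A | B) = {p_a_given_b}")
```

Exact fractions are used here instead of floating-point numbers so that the intermediate probabilities can be checked by hand.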

Course Objectives

  • To represent and analyse uncertain phenomena using a framework
  • To quantify the outcome of the experiment as belonging to a specific event
  • To assign probabilities to each event of interest in an experiment
  • To become accustomed to Markov chains and different statistical types
  • To generate samples from the appropriate conditional distribution
  • To evaluate the occurrence of a particular event that influences another event

After completing this course, delegates will be able to apply probability theory in data science using random variables. They will also be able to define a valid probability measure on the power set of R and integrate a density to obtain probabilities for random variables.
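
To make the idea of integrating a density concrete, here is a small sketch under stated assumptions. The exponential density, the interval [0.5, 2.0], and the helper names `exponential_pdf` and `integrate` are illustrative choices, not part of the course outline. It approximates P(a ≤ X ≤ b) with the midpoint rule and compares the result with the known closed form.

```python
import math


def exponential_pdf(x, rate=1.0):
    """Density of an exponential random variable (assumed example distribution)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# P(0.5 <= X <= 2.0) obtained by integrating the density numerically.
p_numeric = integrate(exponential_pdf, 0.5, 2.0)
# Closed form for rate 1: exp(-a) - exp(-b).
p_exact = math.exp(-0.5) - math.exp(-2.0)

print(f"numeric = {p_numeric:.6f}, exact = {p_exact:.6f}")
```

The midpoint rule is used only because it is short and easy to verify; any standard quadrature method would serve the same purpose.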

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Text Mining Training Course Outline

Introduction to Text Mining

  • What is Text Mining?
  • Text Mining Systems Architecture

Core Text Mining Operations

  • Text Mining Operations
  • Text Mining Query Languages

Text Mining Preprocessing Techniques

  • Task-Oriented Approaches

Categorisation

  • Text Categorisation Applications
  • Document Representation
  • Knowledge Engineering and Machine Learning Approach to TC
  • Using Unlabelled Data
  • Evaluating Text Classifiers

Introduction to Clustering

  • Clustering Tasks in Text Analysis
  • Clustering Algorithms
  • Clustering of Textual Data

Information Extraction (IE)

  • Define Information Extraction
  • IE Systems Architecture
  • Anaphora Resolution
  • IE Inductive Algorithms
  • Structural Information Extraction (IE)

Probabilistic Models for IE

  • Hidden Markov Models
  • Stochastic Context-Free Grammar
  • Maximal Entropy Modeling
  • Maximal Entropy Markov Models
  • Conditional Random Fields

Preprocessing Applications

  • HMM to Textual Analysis Applications
  • Using MEMM for IE
  • Applications of CRFs to Textual Analysis
  • Using SCFG Rules
  • Bootstrapping

Presentation-Layer Considerations

  • Browsing
  • Accessing Constraints and Simple Specification Filters at the Presentation Layer
  • Accessing the Underlying Query Language

Visualisation Approaches

  • Architectural Considerations
  • Text Mining Visualisation Approaches
  • Visualisation Techniques in Link Analysis

Introduction to Link Analysis

  • Automatic Layout of Networks
  • Paths and Cycles in Graphs
  • Centrality
  • Partitioning of Networks
  • Networks Pattern Matching

Prerequisites

There are no prerequisites for attending this course.

Audience

Anyone who wishes to develop their knowledge and skill set can attend this course. This course is well-suited for:

  • Data Scientists and Analysts
  • UX Researchers
  • Machine Learning Engineers

Text Mining Training Course Overview

Text mining is a knowledge-intensive process in which a user interacts with a document collection over time with the help of a suite of analysis tools. A document collection can be any grouping of text-based documents. Text Mining seeks to extract valuable information from data sources by identifying and exploring patterns.

This course is designed to provide complete knowledge of text mining operations and preprocessing techniques. Delegates will get an understanding of the text categorisation problem. They will learn about significant algorithms to perform text categorisation. Also, they will learn how to use unlabelled data and evaluate text classifiers.

During this 2-day training, delegates will be equipped with the knowledge of clustering and Information Extraction (IE). Delegates will learn how to access constraints and simple specification filters at the presentation layer. Then, delegates will be introduced to hidden Markov models and maximal entropy Markov models. Post completion of this training, delegates will be able to use MEMM for Information Extraction.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Keras Training for Data Scientists Course Outline

This Introduction to Keras Training will explore the following topics:

Introduction to Keras

  • Define Keras
  • Guiding Principles
  • Why Use Keras?

Exploring Models

  • Overview of Keras Models
  • Sequential
  • Model (Functional API)

Overview of Keras Layers

  • Core Layers
  • Convolutional Layers
  • Pooling Layers
  • Locally-Connected Layers
  • Recurrent Layers
  • Embedding Layers
  • Merge Layers
  • Advanced Activations Layers
  • Normalisation Layers
  • Noise Layers
  • Layer Wrappers
  • Create Your Own Keras Layers

Preprocessing

  • Sequence Preprocessing
  • Text and Image Preprocessing
  • Losses
  • Metrics and Optimisers
  • Activations
  • Usage of Callbacks
    • BaseLogger
    • TerminateOnNaN and ProgbarLogger
    • ModelCheckpoint
    • LearningRateScheduler
    • TensorBoard
  • Datasets
  • Applications
  • Keras Backend
  • Initialisers
  • Usage of Regularizers and Constraints
  • Model Visualisations
  • Scikit-Learn API

Prerequisites

There are no formal prerequisites for attending this course.

Audience

Anyone interested in learning about Keras can attend this one-day intensive course. This course is well-suited for:

  • Business Analytics Professionals
  • Anyone beginning with Machine Learning

Keras Training for Data Scientists Course Overview

Keras is an open-source neural network library written in Python and capable of running on top of CNTK, TensorFlow, or Theano. Keras was developed to enable fast experimentation and is extensively used by data scientists to architect neural networks for complex problems. Keras can serve as a higher-level API, which means it can act as an interface for Theano, TensorFlow, etc. Keras compiles a model with loss and optimiser functions and runs the training process with the fit function. It does not handle low-level operations such as building the computational graph or creating tensors and other variables, as these are dealt with by the backend engine.

This 1-day Introduction to Keras Training is designed to provide delegates with knowledge of Keras and its usage. Delegates will learn about different Keras layers such as core layers, convolutional layers, pooling layers, locally-connected layers, recurrent layers, etc. In addition, delegates will learn how to perform sequence, text, and image preprocessing. On completion of this intensive training, delegates will be able to use regularisers and constraints.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Pandas for Data Analysis Training Course Outline

The pandas: A Python Data Analysis Toolkit is a two-day course. The following is a brief synopsis of the topics that will be covered in this course.

Introduction and Installation

  • Define pandas
  • Installing pandas
  • Running Test Suite
  • Dependencies

Getting Started with pandas

  • Package Overview
  • Exploring pandas
  • Essential Basic Functionality
  • Introduction to Data Structures
  • Comparison with Other Tools

User Guide

  • IO Tools
  • Indexing and Selecting Data
  • MultiIndex/Advanced Indexing
  • Merge, Join, and Concatenate
  • Reshaping and Pivot Tables
  • Working with Text Data, Missing Data, and Categorical Data
  • Nullable Integer Data Type
  • Visualisations
  • Computational Tools
  • Group By: Split-Apply-Combine
  • Time Series and Time Deltas
  • Styling
  • Options and Settings
  • Enhancing Performance
  • Sparse Data Structures

pandas Ecosystem

  • Statistics and Machine Learning
  • Visualisation
  • IDE
  • Data Validation
  • Extension Data Types

Development Phase

  • Contributing to pandas
  • Internals
  • Extending pandas
  • Storing pandas DataFrame Objects in Apache Parquet Format
  • Themes in pandas Development

Prerequisites

There are no prerequisites to attend this course. However, a basic knowledge of programming would be beneficial.

Audience

Anyone who wishes to develop their knowledge and skill set in Python libraries can attend this course. This course is beneficial for those who want to make a career in data science and data analytics.

Pandas for Data Analysis Training Course Overview

Pandas is an open-source Python library that provides high-performance data structures and data analysis tools for the Python programming language. Python with pandas can be used in numerous fields, such as statistics, economics, and analytics. This course is designed to provide knowledge of how to quickly and easily analyse data with Python’s powerful pandas library.

In this 2-day course, delegates will gain comprehensive knowledge of pandas and data structures. Delegates will learn how to work with text data, missing data, and categorical data. In addition, they will get an understanding of merge, join, concatenate, reshaping and pivot tables.

During this course, delegates will become familiar with visualisation, IDEs, data validation, and extension data types in the pandas ecosystem. Delegates will learn how to store pandas DataFrame objects in Apache Parquet format. On completion of this course, delegates will understand the various themes in pandas development.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Predictive Analytics Training Course Outline

This Predictive Analytics Training course will explore the following areas:

Introduction to Predictive Analytics

  • Predictive Analytics
  • What is Business Intelligence?
  • Predictive Analytics vs Business Intelligence
  • Challenges in using Predictive Analytics

Setting Up the Problem

  • Predictive Analytics Processing Steps
  • Business Understanding
  • Defining Data for Predictive Modelling
  • Defining the Target Variable
  • Defining Success Measures for Predictive Models

Understanding the Data

  • Single and Multiple Variables
  • Data Visualisation
  • Histograms

Data Preparation

  • Variable Cleaning
  • Feature Creation

Itemsets and Association Rules

  • Parameter Settings
  • Data Organisation Techniques
  • Deploying Association Rules
  • Making Classification Rules from Association Rules

Descriptive Modelling

  • Principal Component Analysis
  • Clustering Algorithms

Interpreting Descriptive Models

Predictive Modelling

  • Decision Trees
  • Logistic Regression
  • Neural Networks
  • K-Nearest Neighbour
  • Naïve Bayes
  • Linear Regression

Predictive Models Assessment

Model Ensembles

  • The Wisdom of Crowds
  • Bias-Variance Trade-off
  • Bagging and Boosting
  • Improving Bagging and Boosting
  • Interpreting Model Ensembles

Text Mining

  • Structured vs Unstructured Data
  • Text Mining Applications
  • Steps of Data Preparation
  • Features of Text Mining
  • Regular Expressions

Model Deployment

Prerequisites

There are no formal prerequisites for attending this course.

Audience

Anyone who needs to gain knowledge of predictive analytics can attend this course. This course is well-suited for data analysts and data scientists.

Predictive Analytics Training Course Overview

Predictive analytics is used for making predictions about unknown future events. It analyses current data using many techniques, including data mining, modelling, statistics, artificial intelligence, and machine learning. The patterns found in transactional and historical data can be used to identify future risks and opportunities. Predictive analytics models capture relationships among factors in order to assess the risk associated with a specific set of conditions and assign a score. Predictive analytics enables organisations to become forward-looking and proactive, predicting outcomes and behaviours based on the data.

This Predictive Analytics Training course will provide delegates with knowledge of predictive analytics and its processing steps. Delegates will become familiar with variable cleaning and feature creation. This 2-day course will equip delegates with extensive knowledge of itemsets and association rules. Delegates will also become familiar with various predictive modelling techniques, including logistic regression, k-nearest neighbour, Naïve Bayes, and more.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Introduction to Knime Analytics Platform Course Outline

Installing Knime Analytics Platform

  • Knime Analytics Platform Installation
  • Extensions and Integrations Installation
  • Update Knime Analytics Platform and Extensions
  • Update Sites

Introduction to Knime Analytics Platform

  • Nodes and Workflows
  • Building Workflows
  • Extensions and Integrations

Exploring Knime Workbench

  • Overview of Knime Workbench
  • Customising Knime Workbench
  • Configure Knime Analytics Platform
  • Knime Tables

Knime Extensions and Integrations

  • Define Knime Extensions and Integrations
  • Community Extensions
  • Partner Extensions

CSS Styling for JavaScript Views and Quickform Nodes

  • Description of CSS Classes
  • Classes by Node

Creating New Knime Extension

  • Setting Up the Knime SDK
  • Creating New Knime Extension Project
  • Project Structure
  • Deploying Extension

Prerequisites

There are no prerequisites to attend this course.

Audience

Anyone who wishes to learn about the basic functionalities of the Knime analytics platform can attend this course.

Introduction to Knime Analytics Platform Course Overview

Knime Analytics Platform is open-source software for creating data science applications and services. With Knime, understanding data and designing data science workflows and reusable components is accessible to everyone. Knime Analytics Platform allows you to create visual workflows with an intuitive, drag-and-drop graphical interface without the need for coding.

In this 1-day training course, delegates will learn how to install and update the Knime Analytics Platform and extensions. Delegates will gain knowledge of Knime workbench and Knime tables. In addition, they will get an understanding of Knime extensions and integrations.

During this training course, delegates will be equipped with knowledge of community and partner extensions. Delegates will learn how to create a new Knime extension project. By the end of this training, delegates will be able to set up the Knime SDK and deploy an extension.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor
  • Certificate of Completion 

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Data Mining Training Course Outline

Getting Started with Data Mining

  • What is Data Mining?
  • What Kinds of Data Can Be Mined?
  • Data Objects and Attribute Types
  • Data Visualisation
  • Measuring Data Similarity and Dissimilarity

Data Preprocessing

  • Data Cleaning and Data Integration
  • Data Reduction
  • Data Transformation and Data Discretization

Data Warehousing and Online Analytical Processing

  • Basic Concepts of Data Warehousing
  • Data Cube and OLAP
  • Design, Usage, and Implementation of Data Warehouse

Data Cube

  • Preliminary Concepts
  • Data Cube Computation Methods
  • Multidimensional Data Analysis in Cube Space

Mining Frequent Patterns, Associations, and Correlations

  • Frequent Itemset Mining Methods
  • Pattern Evaluation Methods

Advanced Pattern Mining

  • Pattern Mining in Multilevel and Multidimensional Space
  • Constraint-Based Frequent Pattern Mining
  • Mining High-Dimensional Data and Colossal Patterns
  • Mining Compressed or Approximate Patterns
  • Pattern Exploration and Application

Classification

  • What is Classification?
  • Decision Tree Induction
  • Bayes Classification Methods
  • Rule-Based Classification
  • Model Evaluation and Selection

Advanced Methods of Classification

  • Bayesian Belief Networks
  • Backpropagation
  • Classification Using Frequent Patterns
  • Lazy Learners
  • Genetic Algorithms, Rough Set Approach, and Fuzzy Set Approaches

Cluster Analysis

  • What is Cluster Analysis?
  • Partitioning and Hierarchical Methods
  • Density-Based and Grid-Based Methods

Advanced Cluster Analysis

  • Probabilistic Model-Based Clustering
  • Clustering High-Dimensional and Graph Data
  • Clustering with Constraints

Outlier Detection

  • Outlier Analysis
  • Outlier Detection Methods
  • Statistical and Proximity-Based Approaches
  • Clustering-Based and Classification-Based Approaches
  • Outlier Detection in High-Dimensional Data

Prerequisites

There are no formal prerequisites for attending this course. However, basic knowledge of the IT industry will be beneficial.

Audience

Anyone who is interested in learning data mining can attend this course. This course is best-suited for IT managers aiming to improve data management and analysis techniques.

Data Mining Training Course Overview

Data mining is the method of detecting patterns in large data sets by making use of statistics, machine learning and database systems. It includes analysing large amounts of data and converting it into useful information. The insights gained from data mining can be used for fraud detection, marketing, scientific discovery, etc.

This Data Mining Training course will provide delegates with extensive knowledge of data mining. This course will cover the main concepts of data mining, including data objects, data visualisation, measuring data similarity, and data preprocessing. Delegates will also learn about data transformation and data discretization. Data warehousing and online analytical processing will also be crucial concepts of this course, including basic data warehousing concepts, the data cube, and OLAP.

In addition, this 2-day training course will cover mining frequent patterns, associations, and correlations, including pattern evaluation methods. Delegates will acquire knowledge of advanced pattern mining, which comprises constraint-based frequent pattern mining, mining high-dimensional data and colossal patterns, and pattern exploration and application. By the end of this course, delegates will have gained comprehensive knowledge of classification methods, cluster analysis, and outlier detection.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

GIS Development Training Course Outline

Introduction to Geographic Information Systems (GIS)

  • What is GIS?
  • GIS Terminologies
  • Overview of ArcMap
  • Data Representations in GIS
  • Desktop GIS Software Packages
  • GIS Analyst Skills
  • ArcGIS Licensing and Authorisation
  • Installing ArcGIS Desktop

Basics of ArcGIS

  • Explore Data Using ArcMap
  • View and Change Layer Properties
  • Feature Classes and Attribute Tables
  • Select By Attribute and Calculate Geometry
  • Select By Location
  • Define Projections
  • Analyse Data with Geoprocessing Tools
  • Setting Environment Variables
  • Assess Spatial Relationships with Spatial Join Tool

Making Maps With Common Datasets

  • Common Datasets
  • Make Maps Using Layout View
  • Core Map Elements
  • Symbology: Changing How Data Looks
  • Setting Symbology in ArcGIS
  • Labelling Map Features
  • Making Map Books

Retrieving and Sharing Data

  • Using Metadata to Document Data Products
  • Sharing Data and Maps
  • Selecting Data Format
  • Joins and Relates

Prerequisites:

There are no formal prerequisites for attending this course.

Audience:

Anyone interested in gaining a fundamental knowledge of GIS can attend this course.

GIS Development Training Course Overview

A Geographic Information System (GIS) is a framework for gathering, managing, and analysing data. GIS integrates several types of data, analyses spatial location, and organises layers of information into visualisations using maps and 3D scenes. GIS comes down to just the following four simple ideas:

  • Create geographic data
  • Manage it
  • Analyse it
  • Display it on a map

The Knowledge Academy’s GIS Development Training course is designed to provide comprehensive knowledge of the fundamentals of geographic information systems. This course will explore the world of spatial analysis and cartography with GIS. Delegates will learn the basics of ArcGIS, the leading software tool, and will gain an understanding of how GIS has developed from paper maps into globally integrated software packages.

During this 1-day course, delegates will gain knowledge of how to analyse data with geoprocessing tools. In addition, they will learn about core map elements and symbology. Our expert instructors will share their extensive knowledge throughout the course. Delegates will learn how to create and use map packages, and how to upload packages to ArcGIS Online. After completing this training, delegates will be able to create layer files and packages.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Decision Tree Modeling Using R Training Course Outline

Introduction to Decision Tree

  • Decision Tree Modeling Objective
  • Anatomy of a Decision Tree
  • Gains from a Decision Tree (KS calculations)
  • Definitions related to Objective Segmentations

Data Design for Modelling

  • Historical Window
  • Performance Window
  • Decide Performance Window Horizon Using Vintage Analysis
  • General Precautions related to Data Design

Data Treatment before Modelling

  • Data Sanity Check-Contents
  • View
  • Frequency Distribution
  • Means / Uni-variate
  • Categorical Variable Treatment
  • Missing Value Treatment Guideline
  • Capping Guideline

Classification Tree Development and Algorithm Details

  • Preamble to data
  • Installing R Package and RStudio
  • Developing First Decision Tree in RStudio
  • Find Strength of the Model
  • Algorithm Behind Decision Tree
  • How is a Decision Tree Developed?
  • First on Categorical Dependent Variable
  • GINI Method
  • Steps taken by software programs to learn the classification (develop the tree)

Industry Practice of Classification Tree - Development, Validation and Usage

  • Discussion on Project
  • Find Strength of the Model
  • Steps taken by the Software Program to Implement the Learning on Unseen Data
  • Learning More from a Practical Point of View
  • Model Validation and Deployment

Regression Tree and Auto Pruning

  • Introduction to Pruning
  • Steps of Pruning
  • Logic of Pruning
  • Understand K Fold Validation for Model
  • Implement Auto Pruning using R
  • Develop Regression Tree
  • Interpret the Output
  • How is it Different from Linear Regression?
  • Advantages and Disadvantages over Linear Regression
  • Another Regression Tree Using R

CHAID Algorithm

  • Key Features of CART
  • Chi-square Statistics
  • Implement Chi-square for Decision Tree Development
  • Syntax for CHAID using R, and CHAID vs CART

Other Algorithms

  • Entropy in the Context of Decision Tree
  • ID3
  • Random Forest Method
  • Using R for Random Forest Method

Prerequisites

Basic knowledge of the R programming language is required before attending this course.

Audience

This training course is open to anyone, and is particularly suited to professionals and students who want to enter the analytics industry. It is also ideal for analytics professionals and data mining professionals.

Decision Tree Modeling Using R Training Course Overview

Decision tree modelling using R is a popular analytic technique that can be applied in business fields such as money lending, automotive, and telecom.

This 1-day Decision Tree Modeling Using R Certification course is designed to give delegates a solid understanding of concepts such as data treatment before modelling, frequency distribution, the algorithm behind a decision tree, how a decision tree is developed, the GINI method, the steps of pruning, ID3, and the random forest method. Starting from the fundamentals of decision trees, delegates will move on to more advanced topics, including data design for modelling, classification tree development and algorithm details, industry practice of classification trees (development, validation, and usage), K-fold validation for the model, the CHAID algorithm and its syntax in R, CHAID vs CART, and using R for the random forest method.

After attending this course, delegates will gain expertise in Decision Tree Modeling using the R programming language.
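
To give a flavour of the GINI method mentioned above, here is a minimal sketch of how Gini impurity can be computed for a node and for a candidate binary split. It is written in Python rather than R purely for illustration, and the function names `gini_impurity` and `weighted_gini` and the sample labels are assumptions made for this example, not part of the course material.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a binary split, weighting each branch by its size."""
    total = len(left) + len(right)
    return (len(left) / total) * gini_impurity(left) + (
        len(right) / total
    ) * gini_impurity(right)


# Hypothetical node with an even mix of two classes.
parent = ["yes", "yes", "no", "no"]
parent_score = gini_impurity(parent)

# A split that separates the classes perfectly has zero impurity.
split_score = weighted_gini(["yes", "yes"], ["no", "no"])
print(parent_score, split_score)
```

A tree-building algorithm that uses this criterion would compare the weighted impurity of each candidate split and prefer the one that reduces impurity the most.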

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

PySpark Training Course Outline

Module 1: Introduction to PySpark

  • What is PySpark?
  • Environment
  • Spark Dataframes
  • Reading Data
  • Writing Data
  • Transforming Data
  • MLlib
  • Pandas UDFs
  • Best Practices

Module 2: Installation

  • Using PyPI
  • Using Conda
  • Using PySpark Native Features
  • Using Virtualenv
  • Using PEX
  • Manual Downloading
  • Installing from Source
  • Dependencies

Module 3: DataFrame

  • DataFrame Creation
  • Viewing Data
  • Selecting and Accessing Data
  • Applying a Function
  • Grouping Data
  • Getting Data In/Out
  • Working with SQL

Module 4: Setting Up a Spark Virtual Environment

  • Understanding the Architecture of Data-Intensive Applications
  • Understanding Spark
  • Understanding Anaconda
  • Setting Up the Spark Powered Environment
  • Setting Up an Oracle VirtualBox with Ubuntu
  • Building First App with PySpark
  • Virtualising the Environment with Vagrant
  • Moving to the Cloud

Module 5: Building Batch and Streaming Apps with Spark

  • Architecting Data-Intensive Apps
  • Connecting to Social Networks
  • Analysing the Data
  • Exploring the GitHub World
  • Previewing App

Module 6: Learning from Data Using Spark

  • Contextualising Spark MLlib in the App Architecture
  • Classifying Spark MLlib Algorithms
  • Spark MLlib Data Types
  • Machine Learning Workflows and Data Flows
  • Clustering the Twitter Dataset
  • Building Machine Learning Pipelines

Prerequisites  

There are no formal prerequisites for this PySpark Training course.

Audience 

This PySpark Training provided by The Knowledge Academy is ideal for anyone who wants to learn how to use PySpark to work with Apache Spark from Python.

PySpark Training Course Overview

PySpark is the Python interface for Apache Spark. It is used for conducting exploratory data analysis at scale, creating machine learning pipelines, and building ETL processes for a data platform. PySpark supports Spark features such as Spark SQL, DataFrame, Streaming, MLlib, and Spark Core. Its benefits to users and organisations include code that is simple to write, error handling by the framework, and a range of useful algorithms. This PySpark Training is curated by industry experts to help individuals master the skills required to use PySpark features in their day-to-day tasks and to pursue opportunities in multinational companies.

In this 1-day PySpark Training course, delegates will learn how to use a Conda environment to export their third-party Python packages by leveraging conda-pack. They will gain in-depth knowledge of using virtualenv to manage Python dependencies in their clusters with venv-pack. Delegates will also learn other crucial concepts, such as reading, writing, and transforming data, MLlib, installation using PyPI, Conda, PySpark native features, Virtualenv, and PEX, and connecting to network servers. Our expert trainer, who has years of experience in teaching technical courses, will conduct this training.

This training course will cover various essential concepts, such as:

  • Spark data frames
  • MLlib
  • Setting up a Spark virtual environment
  • Building batch and streaming apps with Spark
  • Exploring the GitHub world
  • Learning from data using Spark
  • Contextualising Spark MLlib in the app architecture

After attending this training course, delegates will be able to use conceptual frameworks for implementing the architecture of data-intensive applications in their organisations. They will also be able to harvest the data, ensuring its integrity and preparing for batch and streaming data processing by Spark.

  • Delegate pack consisting of course notes and exercises
  • Manual
  • Experienced Instructor

Online Instructor-led (2 days)

Classroom (2 days)

Online Self-paced (16 hours)

Data Science with R Training Course Outline

Module 1: Introduction to R

  • What is R?
  • Features of R
  • R Installation
  • R Dashboard Overview
  • R Fundamentals
  • R Variables
  • R Datatypes
  • R Operators
  • R Conditional Statements
  • R Looping Statements
  • R Functions

Module 2: Data Structures in R

  • Introduction to Data Structure
  • Types of Data Structure in R
  • Vectors
  • Lists
  • Matrix
  • Arrays
  • DataFrames

Module 3: Working with Data in R

  • Introduction
  • Types of Files in R
  • Working with CSV Files
    • Operation with CSV Files
    • Reading CSV Files
    • Analysing CSV Files
    • Writing to CSV Files
  • Working with Excel Files
    • Operation with XLSX Files
    • Installing the Package
    • Reading XLSX Files
    • Writing to XLSX Files
  • Working with JSON Files
    • Operation with JSON Files
    • Installing the Package
    • Reading a JSON File
    • Converting a JSON File into a DataFrame

Module 4: Data Manipulation in R

  • What is Data Manipulation?
  • Installation of dplyr Package
  • Data Manipulation Operations in R

Module 5: Data Visualisation in R

  • What is Data Visualisation?
  • Working with Graphs and Plots in R

Module 6: Statistics in R

  • Introduction to Statistics
  • Descriptive Statistics
    • Measure of Central Tendency
    • Measure of Variability
  • Distributions in R

Module 7: Machine Learning

  • Introduction to Machine Learning in R
  • Types of Machine Learning in R
  • Supervised Learning in R
    • Classification
    • Regression
  • Unsupervised Learning in R
  • Reinforcement Machine Learning

Prerequisites

There are no formal prerequisites for attending this Data Science with R Training course.

Audience

This course is intended for everyone who wants to learn about the methods of analysing data using the R programming language.

Data Science with R Training Course Overview

Data science and the R programming language work together to analyse and manipulate data. R is a programming language used by data scientists to pre-process and examine data, build prediction models, perform statistical analysis, and generate data visualisations. This training will enable individuals to use R to transform raw data into valuable insights and effective recommendations. Individuals will also develop the ability to carry out sophisticated computations and statistical analysis using data structures such as data frames, matrices, and vectors. Proficiency in data science and programming can lead to more senior roles, wider career opportunities, and a higher income.

The Knowledge Academy's 2-day Data Science with R Training course provides delegates with in-depth knowledge of R programming and how to master it. During this training, they will learn how to work with vectors, lists, matrices, arrays, and data frames. They will also learn about data manipulation, which is the process of transforming and modifying data to make it more suitable for analysis. This course will be led by our highly skilled and knowledgeable trainer, who has years of teaching experience and will help delegates gain a complete understanding of the course content.

Course Objectives

  • To learn how to read, write, and analyse data from CSV, Excel, and JSON files
  • To install and use the dplyr package for efficient data manipulation operations
  • To attain knowledge of how to perform common data manipulation tasks
  • To learn how to explore supervised learning techniques such as classification
  • To apply machine learning techniques to solve real-world problems
  • To analyse data and derive insights from it with data science

At the end of this training, delegates will be able to manipulate and analyse data using data structures. They will also be able to perform various operations and conversions on different file formats.

  • Delegate pack consisting of course notes and exercises
  • Courseware
  • Experienced Instructor

Online Instructor-led (1 day)

Classroom (1 day)

Online Self-paced (8 hours)

Data Science and Blockchain Training Course Outline

Module 1: Introduction to Data Science

  • What is Data Science?
  • How Does Data Science Work?
  • Data Science Life Cycle
  • Roles and Responsibilities of a Data Scientist
  • Importance of Data Science
  • Data Science Applications
  • Business Intelligence Vs Data Science

Module 2: Blockchain Overview

  • Blockchain Technology
  • Why is Blockchain Important?
  • How Does Blockchain Work?
  • Decentralisation in Blockchain
  • Blockchain Uses
  • Blockchain Applications

Module 3: Implications of Blockchain in Data Science

  • Relationship Between Blockchain and Data Science
  • How Can Blockchain Help Big Data?
  • How Will Blockchain Enhance Data Science?

Module 4: Blockchain in Big Data Transformation

  • Introduction
  • What is a Blockchain, and How Does It Work?
  • Bringing Blockchain and Big Data Together
  • Ways Blockchain Transforms Big Data

Module 5: Blockchain Storage

  • What is Blockchain Storage?
  • What Will Blockchain Mean for Data Storage?
  • Data Flow Through a Blockchain
  • Blockchain Data Storage Solutions
  • Why Is Data Storage Shifting to the Blockchain?
  • Issues with Centralised Data Centres

Prerequisites

There are no formal prerequisites to attend this Data Science and Blockchain Training course.

Audience

This Data Science and Blockchain Training course is suitable for anyone who wants to make technical decisions about their data science architecture, analytical environments, and organisational insights delivery. This course is well-suited for:

  • Technical Leaders
  • Data Scientists
  • Analysts

Data Science and Blockchain Training Course Overview

Data science is the study of how to extract useful information from data for business decision-making, strategic planning, and other purposes by using cutting-edge analytics techniques and scientific principles. A blockchain is a digital ledger of transactions that is secured with cryptographic hashes and digital signatures to guarantee their integrity and authenticity. Data scientists are using blockchain technology to verify the accuracy of data and track it throughout the entire chain. This training will help learners to regulate interactions with various data segments as well as predict and validate data simultaneously. It will also help individuals gain the knowledge, skills, and experience required to enhance their career prospects.
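
The integrity property described above can be illustrated with a simplified hash chain, in which each block stores the hash of the block before it, so changing an earlier record invalidates everything after it. This is only a sketch under stated assumptions: the block layout, the function names `block_hash`, `build_chain`, and `is_valid`, and the sample records are hypothetical, and a real blockchain also involves digital signatures, consensus, and distribution across many nodes, all of which are omitted here.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """SHA-256 digest over a block's contents and its predecessor's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to the previous block through its hash."""
    chain = []
    prev_hash = GENESIS_PREV
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check that each link to the predecessor holds."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical transaction records.
chain = build_chain(["pay A 5", "pay B 3", "pay C 7"])
print(is_valid(chain))
```

Editing the data in any earlier block without recomputing every later hash makes `is_valid` return `False`, which is the sense in which the ledger lets data be verified and tracked along the chain.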

This 1-day Data Science and Blockchain Training course is designed to provide delegates with knowledge of blockchain that helps data scientists tackle a wide range of problems effectively. During this training course, they will gain a thorough understanding of the roles and responsibilities of a data scientist, the modelling techniques used to reach solutions, and the applications of blockchain. Delegates will learn about the relationship between blockchain and data science and how blockchain will enhance data science. Our highly experienced and professional trainer will conduct this training and provide delegates with the necessary understanding of how the field of data science can be enhanced with the application of blockchain technology.

Course Objectives

  • To learn how blockchain prevents information in records from being altered retrospectively
  • To analyse data and track transactions to make better decisions
  • To understand the entire process of gathering actionable insights from raw data
  • To speed up the work process and reduce the time taken to obtain and analyse data
  • To identify dangerous or fraudulent transactions and prevent fraud entirely
  • To identify trends, models, and threats through data production and exchange

At the end of this training course, delegates will be able to use blockchain technology to ensure the authenticity of data and track it at every point on the chain. They will also be able to consider all the relevant aspects in order to drive solutions to business problems.

  • Delegate pack consisting of course notes and exercises
  • Experienced Instructor

Not sure which course to choose?

Speak to a training expert for advice if you are unsure of what course is right for you. Give us a call on 01344203999 or Enquire.

Data Science Training FAQs

Data science is the study of data. It involves developing methods to record, store, and analyse data to extract useful information effectively. Data Science is a mixture of algorithms, tools, and machine learning principles to discover hidden patterns from the raw data.
Yes, The Knowledge Academy’s Data Science courses follow a structure allowing delegates to start from scratch without worrying about any formal prerequisite.
Yes, the demand for data science is growing rapidly. It is a great time to be a data science professional and to enter the job market by attending our wide range of Data Science courses.
The Knowledge Academy is the leading global training provider for Data Science Training.
The price for Data Science Training certification in the United Kingdom starts from £.

Why we're the go-to training provider for you

Best price in the industry

You won't find better value in the marketplace. If you do find a lower price, we will beat it.

Trusted & Approved

We are accredited by PeopleCert on behalf of AXELOS

Many delivery methods

Flexible delivery methods are available depending on your learning style.

High quality resources

Resources are included for a comprehensive learning experience.

"Really good course and well organised. Trainer was great with a sense of humour - his experience allowed a free flowing course, structured to help you gain as much information & relevant experience whilst helping prepare you for the exam"

Joshua Davies, Thames Water

Looking for more information on Data Science Training