The Hadoop Big Data Certification is a two-day course. The course content is divided into two sections: Hadoop and Big Data. The following outlines the topics that will be covered in each section.
This course is recommended for those needing to implement or enhance their big data environment. It is also suitable for anyone looking to advance their analytics career by building excellent foundational knowledge.
Typically, those attending are Project Managers and IT Managers, Database Administrators & Data Architects, Developers & SQL Developers, Data Scientists & Business Intelligence. This is not an exhaustive list.
There are no prerequisites required to attend this training course.
Hadoop is an open-source software platform for distributed computing. It facilitates the processing of big data sets across computer clusters. Hadoop imposes no format requirements on data, which makes it an economical solution for many organisations. Training as a Certified Specialist in Hadoop will therefore be an asset to any organisation.
By training as a Certified Specialist in Hadoop and Big Data, you will hone the knowledge and experience required to devise a Hadoop solution that satisfies your business requirements. On successful completion of this course, delegates will be able to allocate, distribute, and manage resources, and to monitor the Hadoop file system, job progress, and overall cluster performance.
This comprehensive two-day course will equip delegates with the skills required to install, configure, and navigate the Apache Hadoop platform. In addition, delegates will be able to build a Hadoop solution that is tailored to their specific business requirements. The emergence of large data sets brings with it fresh challenges, and it can be difficult to navigate uncharted territory. This course covers big data extensively, including the storage and processing of big data, the tools and techniques used to analyse it, how to develop a big data strategy, and how to implement a big data solution. As a Certified Specialist in Big Data Analytics, you will have the expertise and skills to build competitive strategies around data-driven insights.
The following is a brief synopsis of the topics that will be covered over the intensive one-day course.
The Hadoop Administration course is intended for IT professionals, cloud administrators, system administrators, and data engineers. However, this is not an exhaustive list.
There are no formal prerequisites for this course. However, it is recommended that delegates understand the basics of Hadoop and have some knowledge of large data sets before beginning this course.
This 1-day course delivers a detailed understanding of the Hadoop open-source framework. Hadoop Administration Training will assist delegates in working with Big Data. It will further aid delegates in using the information collected to improve business objectives, quality of products, and customer satisfaction.
This course focuses on managing, maintaining, and troubleshooting a Hadoop cluster, using admin and communication command tools, and commissioning and decommissioning nodes. Delegates will also familiarise themselves with the Hadoop Ecosystem, all within an intensive one-day training course.
The following is a brief synopsis of the topics that will be covered over the intensive one-day course.
This course is aimed at those who wish to become data architects, data analysts, or database engineers.
Prior to attending this course, proficient knowledge of database management systems and technologies (MapReduce, Hive, HDFS, Spark etc.) is expected of delegates.
Big Data Hadoop Architects are responsible for the development and deployment of applications on a large scale. In addition to this, they are tasked with preparing and creating Big Data systems. Delegates shall gain a thorough understanding of how to create a Hadoop solution that meets their business requirements. This comprehensive one-day course will equip delegates with the skills required to install, configure, and manage the Apache Hadoop platform. In addition, delegates will be able to build a Hadoop solution that is tailored to their specific business requirements.
The sudden development of large data sets brings with it fresh challenges, and it can be difficult to navigate uncharted territory. This course covers a wide range of big data architecture material, from real-time and batch processing, data formats and the data lifecycle, to the various database interfaces. Scalable applications are a critical topic covered within this module. Whether scaling up or down, being able to determine an application’s scalability accurately and efficiently is an essential skill for a proficient big data architect. Another important issue discussed during the course is Security and Privacy. The principles of security within the platform, alongside threats to privacy, will be examined in detail. In addition to these topics, selecting the technology that best suits current demands will be a key feature of the course syllabus.
The following is a brief synopsis of the topics that will be covered over the two-day course.
Typically, those attending are Project Managers and IT Managers, Database Administrators & Data Architects, Data Engineers, IT Systems Engineers, and Cloud Systems Administrators. This is not an exhaustive list.
It is highly recommended that delegates have a comprehensive understanding of Hadoop prior to attending.
This Big Data and Hadoop Solutions Architect training course is a two-day intensive course aimed at those who have a comprehensive understanding of Hadoop and wish to consolidate their knowledge of solutions architecture. This course has been formulated to help delegates become Solutions Architects, who are essential for businesses looking to integrate data from various sources in a limited time frame. A Big Data Hadoop Solutions Architect is responsible for identifying specific issues whilst handling large amounts of data. They are also expected to describe the structure and behaviour of the information whilst utilising Hadoop technology.
The Big Data and Hadoop Solutions Architect also organises how the Big Data environment ought to be developed, which includes requirement analysis, platform selection, and the design.
The course shall cover the processing and analysing of data, identifying the various behaviours of data, data visualisation and migration of data, Hadoop Clusters in detail, and the NoSQL database technologies.
A Big Data Hadoop Solutions Architect possesses a sought-after skill set that is invaluable to many organisations. The demand for Big Data Hadoop Solutions Architects has rocketed and continues to do so within the IT industry.
This 1-day course will help you develop your skills to become a successful Data Analyst. By taking this course, you will be able to study different types of data and turn them into a valuable source of information. You will also learn various approaches, including digital, technological and analytical techniques.
The Data Science Analytics certification has been designed for anyone who is interested in analysing data and identifying any improvements or issues.
There are no prerequisites for the Data Science Analytics course.
Data Science is a versatile area which combines scientific techniques, systems and processes to extract information from various forms of data. A Data Scientist uses the information collected to explore data sources such as revenues, testimonials and product information.
Handling data is becoming increasingly essential within a business. The Data Analytics with R certification will help delegates learn the fundamentals of this programming language and use it to perform various forms of data analysis. This 1-day course will also give delegates the skills to create data analysis tasks for themselves and enhance their skills when using “R”.
This course has been developed for those who are starting to use data analysis tools. The Data Analytics with R training is ideal for those who are interested in storing and managing data.
There are no prerequisites for taking this course, but it is recommended that delegates have a basic understanding of this programming language.
This 1-day course has been specifically designed for those who have no knowledge of the programming language and would like to expand their skills in this area. The course will help delegates gain the skills to become successful analytics professionals.
R is a programming language which is used for statistical computing and graphics. The open source tool has been used by many statisticians, data miners and data analysts to collect data to improve their products.
This course covers the following topics:
Understanding the Fundamentals of Big Data
The Big Data Analysis Lifecycle
Planning a Big Data Approach
Implementing a Big Data Approach
Storing Unstructured Information
Managing and Analysing Unstructured Information
This course has been designed for those who are interested in managing large quantities of data and creating long-term strategies for their business.
There are no prerequisites for the Big Data Analysis course.
As more and more businesses rely on data to make their decisions, the ability to critically analyse large datasets is more important than ever. Successful Big Data Analysis can provide an insight into activities and highlight opportunities to improve and expand, as well as identify issues which may prevent growth and affect profit.
Our 1-day Big Data Analysis training course provides a comprehensive introduction to this discipline, providing knowledge of the Big Data Analysis Lifecycle and how a Big Data approach can be planned and implemented.
Module 1: Overview of Data Analysis
Module 2: Introduction to Data Analysis with MS Excel
Module 3: Work with Range Names
Module 4: Introduction to Tables
Module 5: Cleaning Data with Text Functions
Module 6: Working with Date Formats and Time Formats
Module 7: Conditional Formatting
Module 8: Sorting and Filtering
Module 9: Subtotals and Quick Analysis
Module 10: Exploring Lookup Functions
Module 11: Working with Pivot Tables
Module 12: Data Visualisation and Validation
Module 13: Financial Analysis
Module 14: Multiple Sheets
Module 15: Formula Auditing
There are no formal requirements to attend this Data Analysis Training using MS Excel course, although some experience or basic knowledge of MS Excel can be beneficial for delegates.
This Data Analysis Training using MS Excel course is suitable for anybody who wants to learn to use Microsoft Excel for Data Analysis purposes.
Data Analysis is the process of arranging, modelling, and remodelling data to find information that supports business decision-making. Its primary purpose is to extract valuable information from data and to base decisions on that information. Microsoft Excel, commonly known as MS Excel, is a spreadsheet program that provides a wide range of commands, functions, and tools to save time and facilitate the Data Analysis process. Data Analysis is a critical component of Business Intelligence (BI) and data mining, and involves examining and evaluating a data set using analytical and logical reasoning. It is widely used across many fields by organisations worldwide. Data Analysis skills in MS Excel can benefit any individual’s professional growth and earning potential.
This 2-day Data Analysis Training using MS Excel will provide delegates with a comprehensive understanding of carrying out Data Analysis tasks with Microsoft Excel. The topics covered in this training include cleaning data, conditional formatting, lookup functions, and formula auditing. Delegates will also learn concepts such as data visualisation and validation, and financial analysis. Our highly skilled trainers are subject matter experts who are well versed in teaching Data Analysis and experienced in conducting this course.
What you will learn:
By the end of this training, delegates will be able to use MS Excel effectively for Data Analysis. They will become proficient in the Excel functions and tools used for data analysis, in data visualisation in Excel, and in running financial analysis.
If delegates want to learn Data Analysis using other tools and techniques, they can also opt for our Big Data Architecture Training, Big Data and Hadoop Solutions Architect, Data Science Analytics, Data Analytics with R, Big Data Analysis, and many more courses from our Big Data and Analytics Training section to fulfil their needs/requirements.
Introduction to Scala
Executing the Scala Code
Classes Concept in Scala
Case Classes and Pattern Matching
Concepts of Traits with Example
Scala Java Interoperability
Mutable Collections vs Immutable Collections
Use Case Bobsrockets Package
Spark Course Content
Introduction to Spark
Working with RDDs in Spark
Aggregating Data with Pair RDDs
Writing and Deploying Spark Applications
Spark RDD Persistence
Spark Streaming & MLlib
Spark SQL and Data Frames
Improving Spark Performance
Scheduling or Partitioning
Delegates should have basic knowledge of Java, databases, and query languages such as SQL.
This course is designed for those who want to build their career in Big Data. It is more suitable for:
Apache Spark is an open-source, fast cluster computing system used for analysing large amounts of data. Spark is widely used, including by many large companies around the world.
This 2-day Apache Spark and Scala Certification provides delegates with in-depth knowledge and practical skills to enhance their competence in Big Data Spark. During this training, delegates will gain an understanding of Spark and its ecosystem, Spark Streaming, Spark SQL, RDDs and Scala.
This course will cover the following concepts:
This course will be delivered by an industry-experienced instructor, who will provide comprehensive knowledge of the Scala programming language, YARN, HDFS, Sqoop, Flume, Spark GraphX and messaging systems such as Kafka. After completing this training, delegates will receive a certificate if they pass the exam.
Introduction to Apache Storm
Apache Storm Concepts
Apache Storm Workflow
Overview of Distributed Messaging System
Installing Apache Storm
Apache Storm Trident
Apache Storm Applications
Anyone who wishes to pursue a career in Big Data Analytics or learn to use Apache Storm can attend this course. This course is well-suited for:
There are no prerequisites for this course; however, an understanding of Java would be advantageous.
Apache Storm is an open-source data streaming framework. It enables the processing of large amounts of data using a fault-tolerant and horizontally scalable method. It is simple and can be used with any programming language.
This Apache Storm Training is designed to provide knowledge of how to use Apache Storm. Delegates will learn how to install Storm and create topologies, as well as how to use its workflow, cluster architecture and distributed messaging system. The course also looks at the Apache Storm Trident, including Topology and Tuples, Spout and Operations, and State Maintenance.
Module 1: Splunk Overview
Module 2: Splunk Search Processing Language
Module 3: Macros, Field Extraction, and Field Aliases
Module 4: Tags, Lookups, and Correlating Events
Module 5: Data Models, Pivot, and CIM
Module 6: Knowledge Managers and Dashboards in Splunk
Module 7: Splunk Licenses, Indexes, and Role Management
Module 8: Machine Data Using Splunk Forwarder and Clustering
Module 9: Advanced Data Input in Splunk
Module 10: Splunk’s Advanced .conf File and Diag
Module 11: Infrastructure Planning with Indexer and Search Head Clustering
Module 12: Troubleshooting in Splunk
Module 13: Advanced Deployment
Module 14: Advanced Splunk
There are no formal prerequisites for attending this Splunk Training course. However, a prior understanding of storing and retrieving data would be highly beneficial.
This course is designed by The Knowledge Academy for everyone who wants to grasp the essentials of Splunk. However, it will be particularly beneficial for:
Splunk is a popular software platform used to monitor, search, analyse, and visualise machine-generated data in real time. It captures, indexes, and correlates real-time data within a searchable container, generating alerts, graphs, visualisations, and dashboards. Splunk helps users to gather, store, and analyse data, allowing enterprises to act on the insights that data provides. This training will teach learners the appropriate use of Splunk and enable them to discover events using the Search Processing Language. Splunk helps in monitoring business metrics, making informed decisions, and creating a central repository for searching. Individuals with strong searching and analysis skills will be well placed for well-paying jobs in multinational corporations, where they will be able to use their Splunk expertise in day-to-day real-time activities.
In this 2-day Splunk Training course, delegates will enhance their expertise in how Splunk can be used to analyse and respond to issues in their businesses using operational intelligence. Delegates will learn about the basics of the Search Processing Language (SPL), which includes Boolean operators, syntax colouring, search language syntax, and search modes. They will also learn about macros, which package a search command so that it does not have to be rewritten each time, and how to create them using .conf files and Splunk Web. Our highly professional and skilled instructor, with years of teaching experience, will conduct this course and guide delegates from the fundamentals to the advanced concepts of Splunk.
After attending this training course, delegates will be able to create data models and recognise the patterns of product sales requests. They will also be able to enhance the GUI and real-time visibility in a dashboard to deliver the most up-to-date data on a wide range of performance metrics.
Domain 1: Data Analytics
Module 1: Introduction to Data Analytics
Module 2: Introduction to Statistical Analysis
Module 3: Data Wrangling with SQL
Module 4: Presto
Module 5: Feature Engineering
Capstone 1: Detect Credit Card Fraud Using Machine Learning
Domain 2: Business Analytics with Excel
Module 1: Introduction to Data Analysis with MS Excel
Module 2: Cleaning Data with Text Functions
Module 3: Sorting and Filtering
Module 4: Exploring Lookup Functions
Module 5: Introduction to Power Pivot and Formula Auditing
Module 6: DAX Variables and Formatting
Module 7: Introduction to Power Map
Module 8: Design a Dashboard Using Data Model
Capstone 2: Ecommerce Sales Dashboard in Excel
Domain 3: Programming Basics and Data Analytics with Python
Module 1: Python for Data Analysis – NumPy
Module 2: Python for Data Analysis – Pandas
Module 3: Python for Data Visualisation – Matplotlib
Module 4: Python for Data Visualisation – Seaborn
Capstone 3: Exploratory Data Analysis Using Python
Domain 4: Tableau Training
Module 1: Get Started
Module 2: Data Sources
Module 3: Worksheets
Module 4: Calculations
Module 5: Sort and Filters
Module 6: Tableau Charts
Capstone 4: Data Visualisation with Tableau
There are no formal prerequisites for attending this Advanced Data Analytics Certification.
This training certification is curated by The Knowledge Academy for everyone who wants to equip themselves with knowledge of Data Analytics, and for intermediate learners who want to take their skills to the next level. However, this course will be beneficial for:
Data Analytics is the scientific discipline that analyses raw data and uses that information to draw conclusions. Many of its techniques have been automated into algorithms and mechanical processes that turn raw data into a form suitable for human consumption. It is critical for optimising business performance and for implementing business models that reduce costs by storing large amounts of data and identifying efficient business practices. A Data Analyst is responsible for using automated tools, maintaining databases, and preparing final analysis reports. This training equips aspiring candidates with the essential skills of engineering and interpreting data in Python. Individuals with these skills and knowledge will be well placed to progress to more senior roles at multinational corporations.
In this 4-day Advanced Data Analytics Certification, delegates will gain a comprehensive knowledge of data analytics using various tools and programming languages and will learn techniques to master them. During this certification, delegates will learn about business analytics with Excel to gain business insights and brush up on their Microsoft Excel skills. They will also learn about programming basics and analytics with Python using the libraries NumPy, Pandas, Matplotlib, and Seaborn. Our highly skilled trainer, with years of teaching experience, will conduct this training certification and equip delegates with advanced data analysis and statistical methods.
After attending this training certification, delegates will be able to analyse and visualise the information by extracting it from raw data. They will also be able to create capstone projects such as credit card fraud detection using Machine Learning, an e-commerce sales dashboard in Excel, etc., to gain expertise in the Data Analytics field.
Introduction to Couchbase Server
Installing Couchbase Server
Couchbase Administration Console Basics
Developing with Couchbase
Anyone who wishes to gain knowledge on Couchbase can attend this course. This course is well-suited for:
There are no prerequisites for this course.
Couchbase Server is a distributed and scalable NoSQL document database, designed to allow the execution of fast create, store, update, and retrieval operations.
This Couchbase training course is designed to provide knowledge on the working of Couchbase Server, including installation and how to use the Administrative Console. The course also looks at how to develop for Couchbase and monitor and manage clusters.
Introduction to Big Data
Overview of Kafka
Reliable Data Delivery
Building Data Pipelines
Cross-Cluster Data Mirroring
Administering and Monitoring Kafka
Anyone who wishes to learn how to use Apache Kafka can attend this course. This course is ideal for:
There are no prerequisites for this course. However, knowledge of basic Java Programming would be beneficial.
Apache Kafka is a high-performance, real-time messaging system and open-source stream-processing platform that can process millions of messages per second. It is suitable for both online and offline message consumption. Apache Kafka integrates with Apache Storm and Spark for real-time streaming data analysis, minimising downtime and data loss.
This Apache Kafka Training Course is designed to help delegates to acquire skills to become a Kafka Big Data Developer. During this two-day comprehensive course, delegates will learn the skills required to administer and monitor Kafka, including how to take control of a Kafka cluster by configuring Kafka Producers, Consumers and streams. Delegates will also learn how to build data pipelines and applications with Kafka, as well as how to install Java, Zookeeper and Kafka Broker.
Introduction to Apache Spark
Apache Spark MLlib
Apache Spark Streaming
Apache Spark SQL
Apache Spark GraphX
Anyone who wishes to enhance their knowledge of Apache Spark can attend this course. This course is ideal for:
There are no prerequisites for this course. However, basic knowledge of SQL, databases, and query language will be beneficial.
Apache Spark is a framework for large-scale SQL, stream processing, batch processing, and machine learning. Its main feature is in-memory cluster computing, which enhances its processing speed. It can also handle both batch and real-time analytics and data processing workloads, as well as process data from different data repositories including NoSQL databases, the Hadoop Distributed File System (HDFS) and more.
This Apache Spark training course is designed to provide delegates with the skills and knowledge to become a successful Big Data and Spark Developer. The two-day course explores concepts including cluster design, cluster management and artificial neural networks, as well as how to install Docker, Titan and Databricks. It also looks at how to process graphs using GraphX.
Module 1: Big Data Analytics - Introduction
Module 2: Data Analytics Lifecycle
Module 3: Basic Data Analytic Methods Using R
Module 4: Introduction to Clustering
Module 5: Association Rules
Module 6: Regression
Module 7: Classification
Module 8: Time Series Analysis
Module 9: Text Analysis
Module 10: MapReduce and Hadoop
Module 11: In-Database Analytics
Anybody wishing to pursue a career in Big Data and Data Science can attend this course. This course is well-suited for:
No prerequisites are required for this course.
Data Science is a combination of programming, analytical, and business skills that enable the review, analysis and extraction of meaningful insights from raw data. Big Data creates new opportunities for organisations to derive insights and generate competitive advantage from information.
This Big Data Analytics and Data Science Integration course will help delegates to gain expertise in using Big Data and Data Science related technologies. It provides delegates with in-depth knowledge of how to design, develop and deploy data science and big data applications in the real world. Topics covered include the Data Analytics Lifecycle, Regression, Classification, Text Analysis and Database Analytics.
Introduction to Data Integration
Introduction to Talend Big Data Solutions
Working with Projects
Designing a Business Model
Hive in Talend
Designing a Job
Mapping Data Flows
Mapping Big Data Flows
Managing Metadata for Data Integration
Managing Metadata for Talend Big Data
Using SQL Templates
Anyone who wishes to use Talend for data integration can attend this course. This course is ideal for:
There are no prerequisites for this course. However, basic knowledge of Data Warehousing and SQL would be beneficial.
Talend is an open-source data integration platform. It combines data from multiple sources and ensures it can be moved quickly across to target systems to provide greater business insights. It offers various software and services for enterprise application integration, data management, data integration, cloud storage, data quality, and Big Data.
This Data Integration and Big Data using Talend course is designed to provide thorough knowledge of how to use Talend to address Big Data Integration and management challenges. The course will cover how to design and manage jobs, as well as how to create, import, open, delete and export projects. You will also learn how to manage metadata and use Talend SQL templates.
Introduction to Data Warehouse
Dimensions and Facts
Data Warehouse Architecture
Data Warehouse OLAP
Relational and Multidimensional OLAP
Data Warehouse Schemas
Horizontal and Vertical Partitioning
Introduction to Data Marting
System and Process Managers
Security and Backup
Tuning and Testing
Anybody wishing to gain good knowledge of data warehousing can attend this course. This course is best suited for:
There are no formal prerequisites for this course. However, an understanding of basic database concepts would be beneficial.
A data warehouse is a database that collects and stores a large amount of data from a diverse range of sources to allow analysis and provide insights.
This Data Warehouse training course will introduce delegates to the fundamental concepts of data warehousing, including architecture, modelling, delivery and system processes. Delegates will learn about the different terminology related to data warehousing, metadata concepts, schemas, and security.
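As an illustration of the dimension, fact and schema concepts listed above, the following is a minimal sketch of a star schema, assuming Python's built-in sqlite3 module as a stand-in for a real warehouse. The table names (dim_product, dim_date, fact_sales), the sample rows and the revenue_by_category helper are all hypothetical and are not taken from the course material.

```python
import sqlite3

# In-memory database used purely for illustration; a real warehouse would
# run on a dedicated platform.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of each event.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table stores measures and points at the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
-- Hypothetical sample data.
INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'), (2, 'Licence', 'Software');
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales VALUES (1, 1, 2, 2000.0), (2, 1, 5, 500.0), (1, 2, 1, 1000.0);
""")


def revenue_by_category(conn):
    """Aggregate the revenue measure by one attribute of the product dimension."""
    rows = conn.execute("""
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


print(revenue_by_category(conn))
```

The same pattern, a join from the fact table to a dimension followed by a GROUP BY, is the kind of query that OLAP-style analysis typically builds on.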
Introduction to ELK Stack
ELK in Production
Anybody who wishes to learn how to use the ELK stack can attend this course. Job titles this course is recommended for:
A basic understanding of the JSON data format, SQL and RESTful APIs will be helpful.
The ELK Stack is a combination of three open-source products: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server-side data processing pipeline that ingests data from various sources at the same time, transforms it and sends it to a stash. Kibana enables users to visualise Elasticsearch data with graphs and charts.
This ELK Stack Training course will provide delegates with a good understanding of Elasticsearch, Logstash and Kibana. Delegates will learn about Elasticsearch queries such as Boolean Operators, Fields, Ranges and URI Search. They will also gain knowledge of ELK elasticity and use cases. By the end of the course, you will understand how to use Elasticsearch, Logstash and Kibana, and how they can be used in business.
Introduction to Apache Impala
Concepts and Architecture
Planning Impala Deployment
Managing and Upgrading Impala
Impala SQL Language Reference
Using Impala Shell Command
Tuning Impala for Performance
Scalability Considerations for Impala
Partitioning for Impala Tables
Working of Impala with Hadoop File Formats
Use Impala to Query HBase Tables
Using Impala Logging
Anyone who wishes to gain expertise in Impala can attend this course. This course is ideal for:
No prerequisites are required for this course. However, basic knowledge of the principles of programming is advantageous.
Impala is a distributed, massively parallel processing (MPP) SQL query engine for processing enormous data volumes stored in a Hadoop cluster. Impala is released under the Apache License, and it runs on the open-source Apache Hadoop big data analytics platform.
This Hadoop Training Course with Impala is designed to equip delegates with comprehensive knowledge regarding Apache Impala. Delegates will learn how to install, manage and upgrade Impala as well as how to start Impala through Cloudera Manager and the command line. From here, the course will show you how to administer Impala, including managing security, tuning for performance, and troubleshooting.
Introduction to HBase
Overview of HBase Shell
HBase Admin API
Basics of HBase Tables
HBase Describe and Alter
HBase Exists and Shutting Down
Client API Basics
Overview of HBase Data
HBase Scan and Security Basics
Anyone who wishes to pursue a career in Big Data can attend this course. This course is beneficial for the following professionals:
There are no prerequisites for this course. However, knowledge of Hadoop architecture and APIs would be beneficial.
HBase is a non-relational database providing real-time read and write access to large datasets. It allows the storing of a huge amount of data in the form of a table. It scales linearly for handling huge datasets and combining data sources with different structures and schemas. It is natively integrated with Hadoop and works with other data access engines seamlessly through YARN.
This HBase Training is designed to provide thorough knowledge of HBase, including procedures to set up HBase on Hadoop file systems. Delegates will understand the different ways to interact with HBase Shell, how to connect to HBase with the help of Java, and how basic operations are performed on HBase by using Java. You will also become familiarised with HBase tables and perform various operations on those tables.
Anyone who wishes to elevate their knowledge regarding Informatica can attend this course. This course is ideal for:
There are no prerequisites for this course. However, basic knowledge of SQL will be beneficial.
Informatica PowerCenter is an enterprise ETL (extract, transform, and load) tool used to build enterprise data warehouses. It extracts data from source systems, transforms it according to business needs, and then loads it into a target data warehouse. It offers a wide range of features, such as integration of data from multiple systems, row-level operations on data, and scheduling of data operations.
This comprehensive course is specifically designed to provide knowledge of Informatica PowerCenter and its architecture. As well as installing PowerCenter, it covers the configuration of clients and repositories, workflow, target designer, and debugging.
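The extract, transform, and load pattern described above can be sketched in a few lines of plain Python. This is not Informatica PowerCenter, which is configured through its own client tools. The sample data, the `orders` table, and the rule of skipping rows with no amount are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
SOURCE_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply row-level rules (assumed here: skip rows with no amount, tidy names)."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # assumed business rule: incomplete orders are not loaded
        cleaned.append((int(row["order_id"]), row["customer"].title(), float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three stages against an in-memory SQLite database used as the target."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(SOURCE_CSV)), conn)
    return conn


if __name__ == "__main__":
    conn = run_etl()
    print(conn.execute("SELECT order_id, customer, amount FROM orders ORDER BY order_id").fetchall())
```

Keeping the three stages as separate functions mirrors how an ETL tool separates sources, transformations, and targets, so each stage can be changed or tested on its own.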
Module 1: Set up a Spark Virtual Environment
Module 2: Building Batch and Streaming Apps with Spark
Module 3: Juggling Data with Spark
Module 4: Data Using Spark
Module 5: Streaming Live Data with Spark
Module 6: Visualising Insights and Trends
This course is intended for:
There are no formal prerequisites for this course. However, basic knowledge of SQL and Python programming would be beneficial.
Apache Spark is an analytics engine for big data processing. It supports large-scale SQL queries, stream processing, batch processing, and machine learning. Spark’s main feature is its in-memory cluster computing, which increases application processing speed. It can handle both batch and real-time analytics and data processing workloads.
This Spark Training for Python Developers course is designed to provide knowledge of how to set up a virtual Spark environment. Delegates will learn how to install Spark and the Python Anaconda distribution, build batch and streaming apps using Spark, and explore data by using Blaze and Spark SQL.
Other topics covered include how to pre-process data for visualisation and how to create word clouds. By the end of this course, you will be able to build a reliable and scalable streaming app.
Introduction to Apache ORC
Using Apache ORC in Hive
Using Apache ORC in MapReduce
Using ORC Core
Apache ORC Tools
There are no formal prerequisites for attending this course.
Anyone who wishes to learn about Apache ORC can attend this course.
The Apache Software Foundation is a non-profit organisation that supports open-source software projects released under the Apache License. Apache ORC is a self-describing columnar file format enabling efficient querying and storage of data on Hadoop. It uses multi-version concurrency control to support ACID transactions. This Apache ORC Training is designed to equip delegates with a detailed knowledge of Apache ORC.
The Knowledge Academy’s Apache ORC Training will introduce delegates to ORC adapters and types. Delegates will gain knowledge of Apache ORC’s three levels of indexes. In addition, delegates will learn how to build Apache ORC. Delegates will become familiar with Hive DDL and configuration, including table and configuration properties.
During this 1-day course, delegates will learn how to read and write ORC files. Delegates will gain an understanding of how to send OrcStruct, OrcList, and OrcMap through the shuffle. This Apache ORC Training will fully prepare delegates to use the Apache ORC tools, both C++ and Java. After completing this training, delegates will be able to use the Java tools for meta, data, scan, convert, and JSON schema.
Module 1: Introduction to Apache Maven
Module 2: Dependencies
Module 3: Plugins
Module 4: Controlling the Build
Module 5: The Project Website
Module 6: The Maven Release Process
Module 7: Maven Tricks and Patterns
There are no formal prerequisites for this Apache Maven Training.
This Apache Maven Training is designed for anyone who wants to gain more knowledge about Apache Maven software. It is particularly beneficial for:
Apache Maven is one of the most popular build automation tools for Java projects. It is also a powerful project management tool based on the Project Object Model (POM). In this 2-day Apache Maven Training, delegates will learn how to solve problems related to software project builds and implement the Maven repository. From this training, delegates will also learn about:
Throughout this training, delegates will learn how to install and deploy a plugin and how to generate reports on code when developers run into problems. After completing this training, delegates will be able to create a project website and release Maven artifacts.
Speak to a training expert for advice if you are unsure of what course is right for you. Give us a call on +44 1344 203999 or Enquire.
Our training experts have compiled a range of course packages to complement a variety of categories in order to help fast-track your career. The packages consist of the best possible qualifications in each industry and allow you to purchase multiple courses at a discounted rate.
Big Data Analysis: $1095
Hadoop Administration Training: $1095
Microsoft Power BI Training: $1095
Total without package: $3285
Package price: $1995 (Save $1290)
Course was run very smoothly, Richard our trainer was extremely knowledgeable and delivered the course in a succinct fashion with a twist of humour thrown in.
The course was great and so was the trainer - brilliantly delivered and I would certainly recommend this course to my colleagues. Thanks to Richard for a great course and delivering it in a tough environment virtually.
Richard was very knowledgeable and explained well
You won't find better value in the marketplace. If you do find a lower price, we will beat it.
We are accredited by PeopleCert on behalf of AXELOS
Flexible delivery methods are available depending on your learning style.
Resources are included for a comprehensive learning experience.
"Really good course and well organised. Trainer was great with a sense of humour - his experience allowed a free flowing course, structured to help you gain as much information & relevant experience whilst helping prepare you for the exam"
Joshua Davies, Thames Water