Apache Spark and Scala Certification Training

Spark & Scala Certification Course Overview

This Spark certification training helps you master the essential skills of the Apache Spark open-source framework and the Scala programming language, including Spark Streaming, Spark SQL, machine learning programming, GraphX programming, and Spark shell scripting. You will also understand the role of Spark in overcoming the limitations of MapReduce.

What you'll learn

  • Learn Research
  • Collect Useful Data
  • Requirement Analysis Phase
  • Market-Competent Skills
  • Problem Solving Skills
  • Model Implementation Skills
  • Building or Developing Phase
  • Presenting and Testing Skills

100% Money Back Guarantee
No questions asked refund*

At IT Nuggets Online, we value the trust of our patrons immensely. If you feel that a course does not meet your expectations, we offer a 7-day money-back guarantee. Just send us a refund request via email within 7 days of purchase and we will refund 100% of your payment, no questions asked!

This course includes:

  • 7+ hours on-demand video
  • 7+ articles
  • 20+ downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of completion


Whether you work for a small company, a large corporation, or from home, a computer will be one of the first pieces of office equipment you’re going to need. Computers come in different forms, such as laptops and desktops. Computer skills are a valuable addition to any employee’s personal portfolio. Upskilling and polishing your computer literacy can greatly increase your desirability to employers. This is the perfect opportunity to take on roles you might not have previously considered. As an employer, motivating your employees to become computer literate will increase productivity and also stave off problems that can cost time and significant amounts of money.

Many companies have started to depend upon computerised technology to get work done, which is why computer skills have become increasingly important. Having the necessary basic computer knowledge will put you a step ahead of others. You’ll have a big advantage over those who aren’t computer literate. It’s for this specific reason that many schools and tertiary institutions encourage students to complete basic computer studies. Here are three reasons why being computer literate is beneficial in the workplace.

Helping professionals thrive, not just survive

Learning — Blended to Perfection



Training Options


  • Lifetime access to high-quality self-paced eLearning content curated by industry experts
  • 4 hands-on projects to perfect the skills learnt
  • 2 simulation test papers for self-assessment
  • 24x7 learner assistance and support


  • Everything in Self-Paced Learning, plus
  • 90 days of flexible access to online classes
  • Live, online classroom training by top instructors and practitioners
  • 24x7 learner assistance and support
Classes starting from:
  • Go With: Weekend Class


  • Blended learning delivery model (self-paced e-learning and/or instructor-led options)
  • Flexible pricing options
  • Enterprise grade Learning Management System (LMS)
  • Enterprise dashboards for individuals and teams
  • 24x7 learner assistance and support

Spark & Scala Course Curriculum


This Spark certification training is ideal for professionals aspiring to a career in real-time big data analytics, as well as analytics professionals, research professionals, IT developers and testers, data scientists, BI and reporting professionals, and students who want to gain a thorough understanding of Apache Spark.


Those wishing to take the Apache Spark certification training should have a fundamental knowledge of any programming language and a basic understanding of any database, SQL, and query language for databases. Working knowledge of Linux- or Unix-based systems is also beneficial.

Course Content

Lesson 00 - Course Overview
0.001 Introduction
0.002 Course Objectives
0.003 Course Overview
0.004 Target Audience
0.005 Course Prerequisites
0.006 Value to the Professionals
0.007 Value to the Professionals (contd.)
0.008 Value to the Professionals (contd.)
0.009 Lessons Covered
0.010 Conclusion
Lesson 01 - Introduction to Spark
1.001 Introduction
1.002 Objectives
1.003 Evolution of Distributed Systems
1.004 Need of New Generation Distributed Systems
1.005 Limitations of MapReduce in Hadoop
1.006 Limitations of MapReduce in Hadoop (contd.)
1.007 Batch vs. Real-Time Processing
1.009 Application of In-Memory Processing
1.010 Introduction to Apache Spark
1.011 Components of a Spark Project
1.012 History of Spark
1.013 Language Flexibility in Spark
1.014 Spark Execution Architecture
1.015 Automatic Parallelization of Complex Flows
1.016 Automatic Parallelization of Complex Flows-Important Points
1.017 APIs That Match User Goals
1.018 Apache Spark-A Unified Platform of Big Data Apps
1.019 More Benefits of Apache Spark
1.020 Running Spark in Different Modes
1.021 Installing Spark as a Standalone Cluster-Configurations
1.022 Installing Spark as a Standalone Cluster-Configurations
1.023 Demo-Install Apache Spark
1.024 Demo-Install Apache Spark
1.025 Overview of Spark on a Cluster
1.026 Tasks of Spark on a Cluster
1.027 Companies Using Spark-Use Cases
1.028 Hadoop Ecosystem vs. Apache Spark
1.029 Hadoop Ecosystem vs. Apache Spark (contd.)
1.030 Quiz
1.031 Summary
1.032 Summary (contd.)
1.033 Conclusion
Lesson 02 - Introduction to Programming in Scala
2.001 Introduction
2.002 Objectives
2.003 Introduction to Scala
2.004 Features of Scala
2.005 Basic Data Types
2.006 Basic Literals
2.007 Basic Literals (contd.)
2.008 Basic Literals (contd.)
2.009 Introduction to Operators
2.010 Types of Operators
2.011 Use Basic Literals and the Arithmetic Operator
2.012 Demo-Use Basic Literals and the Arithmetic Operator
2.013 Use the Logical Operator
2.014 Demo-Use the Logical Operator
2.015 Introduction to Type Inference
2.016 Type Inference for Recursive Methods
2.017 Type Inference for Polymorphic Methods and Generic Classes
2.018 Unreliability of the Type Inference Mechanism
2.019 Mutable Collection vs. Immutable Collection
2.020 Functions
2.021 Anonymous Functions
2.022 Objects
2.023 Classes
2.024 Use Type Inference, Functions, Anonymous Function, and Class
2.025 Demo-Use Type Inference, Functions, Anonymous Function, and Class
2.026 Traits as Interfaces
2.027 Traits-Example
2.028 Collections
2.029 Types of Collections
2.030 Types of Collections (contd.)
2.031 Lists
2.032 Perform Operations on Lists
2.033 Demo-Use Data Structures
2.034 Maps
2.035 Maps-Operations
2.036 Pattern Matching
2.037 Implicits
2.038 Implicits (contd.)
2.039 Streams
2.040 Use Data Structures
2.041 Demo-Perform Operations on Lists
2.042 Quiz
2.043 Summary
2.044 Summary (contd.)
2.045 Conclusion
Lesson 03 - Using RDD for Creating Applications in Spark
3.001 Introduction
3.002 Objectives
3.003 RDDs API
3.004 Features of RDDs
3.005 Creating RDDs
3.006 Creating RDDs-Referencing an External Dataset
3.007 Referencing an External Dataset-Text Files
3.008 Referencing an External Dataset-Text Files (contd.)
3.009 Referencing an External Dataset-Sequence Files
3.010 Referencing an External Dataset-Other Hadoop Input Formats
3.011 Creating RDDs-Important Points
3.012 RDD Operations
3.013 RDD Operations-Transformations
3.014 Features of RDD Persistence
3.015 Storage Levels of RDD Persistence
3.016 Choosing the Correct RDD Persistence Storage Level
3.017 Invoking the Spark Shell
3.018 Importing Spark Classes
3.019 Creating the SparkContext
3.020 Loading a File in Shell
3.021 Performing Some Basic Operations on Files in Spark Shell RDDs
3.022 Packaging a Spark Project with SBT
3.023 Running a Spark Project With SBT
3.024 Demo-Build a Scala Project
3.025 Build a Scala Project
3.026 Demo-Build a Spark Java Project
3.027 Build a Spark Java Project
3.028 Shared Variables-Broadcast
3.029 Shared Variables-Accumulators
3.030 Writing a Scala Application
3.031 Demo-Run a Scala Application
3.032 Run a Scala Application
3.033 Demo-Write a Scala Application Reading the Hadoop Data
3.034 Write a Scala Application Reading the Hadoop Data
3.035 Demo-Run a Scala Application Reading the Hadoop Data
3.036 Run a Scala Application Reading the Hadoop Data
3.037 Scala RDD Extensions
3.038 DoubleRDD Methods
3.039 PairRDD Methods-Join
3.040 PairRDD Methods-Others
3.041 Java PairRDD Methods
3.042 Java PairRDD Methods (contd.)
3.043 General RDD Methods
3.044 General RDD Methods (contd.)
3.045 Java RDD Methods
3.046 Java RDD Methods (contd.)
3.047 Common Java RDD Methods
3.048 Spark Java Function Classes
3.049 Method for Combining JavaPairRDD Functions
3.050 Transformations in RDD
3.051 Other Methods
3.052 Actions in RDD
3.053 Key-Value Pair RDD in Scala
3.054 Key-Value Pair RDD in Java
3.055 Using MapReduce and Pair RDD Operations
3.056 Reading Text File from HDFS
3.057 Reading Sequence File from HDFS
3.058 Writing Text Data to HDFS
3.059 Writing Sequence File to HDFS
3.060 Using GroupBy
3.061 Using GroupBy (contd.)
3.062 Demo-Run a Scala Application Performing GroupBy Operation
3.063 Run a Scala Application Performing GroupBy Operation
3.064 Demo-Run a Scala Application Using the Scala Shell
3.065 Run a Scala Application Using the Scala Shell
3.066 Demo-Write and Run a Java Application
3.067 Write and Run a Java Application
3.068 Quiz
3.069 Summary
3.070 Summary (contd.)
3.071 Conclusion
Lesson 04 - Running SQL Queries Using Spark SQL
4.001 Introduction
4.002 Objectives
4.003 Importance of Spark SQL
4.004 Benefits of Spark SQL
4.005 DataFrames
4.006 SQLContext
4.007 SQLContext (contd.)
4.008 Creating a DataFrame
4.009 Using DataFrame Operations
4.010 Using DataFrame Operations (contd.)
4.011 Demo-Run Spark SQL with a DataFrame
4.012 Run Spark SQL with a DataFrame
4.013 Interoperating with RDDs
4.014 Using the Reflection-Based Approach
4.015 Using the Reflection-Based Approach (contd.)
4.016 Using the Programmatic Approach
4.017 Using the Programmatic Approach (contd.)
4.018 Demo-Run Spark SQL Programmatically
4.019 Run Spark SQL Programmatically
4.020 Data Sources
4.021 Save Modes
4.022 Saving to Persistent Tables
4.023 Parquet Files
4.024 Partition Discovery
4.025 Schema Merging
4.026 JSON Data
4.027 Hive Table
4.028 DML Operation-Hive Queries
4.029 Demo-Run Hive Queries Using Spark SQL
4.030 Run Hive Queries Using Spark SQL
4.031 JDBC to Other Databases
4.032 Supported Hive Features
4.033 Supported Hive Features (contd.)
4.034 Supported Hive Data Types
4.035 Case Classes
4.036 Case Classes (contd.)
4.037 Quiz
4.038 Summary
4.039 Summary (contd.)
4.040 Conclusion
Lesson 05 - Spark Streaming
5.001 Introduction
5.002 Objectives
5.003 Introduction to Spark Streaming
5.004 Working of Spark Streaming
5.005 Features of Spark Streaming
5.006 Streaming Word Count
5.007 Micro Batch
5.008 DStreams
5.009 DStreams (contd.)
5.010 Input DStreams and Receivers
5.011 Input DStreams and Receivers (contd.)
5.012 Basic Sources
5.013 Advanced Sources
5.014 Advanced Sources-Twitter
5.015 Transformations on DStreams
5.016 Transformations on DStreams (contd.)
5.017 Output Operations on DStreams
5.018 Design Patterns for Using ForeachRDD
5.019 DataFrame and SQL Operations
5.020 DataFrame and SQL Operations (contd.)
5.021 Checkpointing
5.022 Enabling Checkpointing
5.023 Socket Stream
5.024 File Stream
5.025 Stateful Operations
5.026 Window Operations
5.027 Types of Window Operations
5.028 Types of Window Operations (contd.)
5.029 Join Operations-Stream-Dataset Joins
5.030 Join Operations-Stream-Stream Joins
5.031 Monitoring Spark Streaming Application
5.032 Performance Tuning-High Level
5.033 Performance Tuning-Detail Level
5.034 Demo-Capture and Process the Netcat Data
5.035 Capture and Process the Netcat Data
5.036 Demo-Capture and Process the Flume Data
5.037 Capture and Process the Flume Data
5.038 Demo-Capture the Twitter Data
5.039 Capture the Twitter Data
5.040 Quiz
5.041 Summary
5.042 Summary (contd.)
5.043 Conclusion
Lesson 06 - Spark ML Programming
6.001 Introduction
6.002 Objectives
6.003 Introduction to Machine Learning
6.004 Common Terminologies in Machine Learning
6.005 Applications of Machine Learning
6.006 Machine Learning in Spark
6.007 Spark ML API
6.008 DataFrames
6.009 Transformers and Estimators
6.010 Pipeline
6.011 Working of a Pipeline
6.012 Working of a Pipeline (contd.)
6.013 DAG Pipelines
6.014 Runtime Checking
6.015 Parameter Passing
6.016 General Machine Learning Pipeline-Example
6.017 General Machine Learning Pipeline-Example (contd.)
6.018 Model Selection via Cross-Validation
6.019 Supported Types, Algorithms, and Utilities
6.020 Data Types
6.021 Feature Extraction and Basic Statistics
6.022 Clustering
6.023 K-Means
6.024 K-Means (contd.)
6.025 Demo-Perform Clustering Using K-Means
6.026 Perform Clustering Using K-Means
6.027 Gaussian Mixture
6.028 Power Iteration Clustering (PIC)
6.029 Latent Dirichlet Allocation (LDA)
6.030 Latent Dirichlet Allocation (LDA) (contd.)
6.031 Collaborative Filtering
6.032 Classification
6.033 Classification (contd.)
6.034 Regression
6.035 Example of Regression
6.036 Demo-Perform Classification Using Linear Regression
6.037 Perform Classification Using Linear Regression
6.038 Demo-Run Linear Regression
6.039 Run Linear Regression
6.040 Demo-Perform Recommendation Using Collaborative Filtering
6.041 Perform Recommendation Using Collaborative Filtering
6.042 Demo-Run Recommendation System
6.043 Run Recommendation System
6.044 Quiz
6.045 Summary
6.046 Summary (contd.)
6.047 Conclusion
Lesson 07 - Spark GraphX Programming
7.001 Introduction
7.002 Objectives
7.003 Introduction to Graph-Parallel System
7.004 Limitations of Graph-Parallel System
7.005 Introduction to GraphX
7.006 Introduction to GraphX (contd.)
7.007 Importing GraphX
7.008 The Property Graph
7.009 The Property Graph (contd.)
7.010 Features of the Property Graph
7.011 Creating a Graph
7.012 Demo-Create a Graph Using GraphX
7.013 Create a Graph Using GraphX
7.014 Triplet View
7.015 Graph Operators
7.016 List of Operators
7.017 List of Operators (contd.)
7.018 Property Operators
7.019 Structural Operators
7.020 Subgraphs
7.021 Join Operators
7.022 Demo-Perform Graph Operations Using GraphX
7.023 Perform Graph Operations Using GraphX
7.024 Demo-Perform Subgraph Operations
7.025 Perform Subgraph Operations
7.026 Neighborhood Aggregation
7.027 mapReduceTriplets
7.028 Demo-Perform MapReduce Operations
7.029 Perform MapReduce Operations
7.030 Counting Degree of Vertex
7.031 Collecting Neighbors
7.032 Caching and Uncaching
7.033 Graph Builders
7.034 Vertex and Edge RDDs
7.035 Graph System Optimizations
7.036 Built-in Algorithms
7.037 Quiz
7.038 Summary
7.039 Summary (contd.)
7.040 Conclusion

Course Training Session FAQs

IT Nuggets Online is a progressive IT company engaged in creating eye-catching computer-based content in English for the benefit of students. We offer a learning process that blends text, visuals, animation, video clips, and sound to give students a complete learning experience.
Everyone can take online classes at times that suit their schedule.
Yes, you can enroll in more than one course.
After meeting the certification criteria explained by the instructor, which include quizzes and hands-on exercises, you will receive your certification.