
Apache Cassandra Certification Training

1.24K+ Learners

Stepleaf’s Apache Cassandra Certification Training is designed by professionals as per the industry requirements and demands. This Cassandra Certification Training helps you to master the concepts of Apache Cassandra including Cassandra Architecture, its features, Cassandra Data Model, and its Administration. Throughout the Cassandra course, you will learn to install, configure, and monitor Cassandra, along with its integration with other Apache frameworks like Hadoop, Spark, and Kafka.

  • Instructor Led Training
  • Real Time Projects
  • Guaranteed Job Interviews
  • Flexible Schedule
  • LifeTime Free Upgrade
  • 24x7 Support

Apache Cassandra Certification

Jul 25, Sat and Sun (6 weeks), weekend batch, 03:00 PM to 05:00 PM

Course Price at

$ 459.00

About Course

Cassandra is a distributed database from Apache that is highly scalable and designed to manage huge amounts of unstructured data. Apache Cassandra Certification Training covers Database Operations, Table Operations, Node Operations in a Cluster, Managing & Monitoring the Cluster, Backup/Restore, Performance Tuning, and Hosting a Cassandra Database on Cloud. You will also learn to integrate Cassandra with other Apache frameworks like Hadoop, Spark, and Kafka.


Course Objectives

This Apache Cassandra Training is designed by industry experts to help you master Apache Cassandra. The Cassandra Course offers:

  • In-depth knowledge of NoSQL databases, including features such as High Availability, Fault Tolerance, Fast Processing, and Scalability
  • Comprehensive knowledge of the Cassandra Database and its architecture
  • Capability to ingest data in Cassandra and perform various operations
  • Experience with Single & Multi-Node Cluster setup and different Node Operations using nodetool
  • Capability to Manage and Monitor the Cassandra Cluster
  • Knowledge of various Security and Backup features provided by Cassandra
  • Exposure to many real-life, industry-based projects
  • Case studies from diverse domains, covering banking, telecommunications, social media, and e-commerce

What are the skills that you will be learning with our Apache Cassandra Certification Training?

Apache Cassandra Certification Training will help you become a Cassandra expert. It will hone your skills by offering comprehensive knowledge of Cassandra and its internals, along with the hands-on experience required for solving real-time, industry-based big data projects.

During the Cassandra Training, you will be guided and trained by our expert instructors to:

  • Master the concepts of NoSQL database & understand where Cassandra is used
  • Understand the CAP theorem and Cassandra's history
  • Install and manage a single-node Cassandra cluster
  • Describe Apache Cassandra Architecture
  • Design and model applications for Cassandra
  • Learn about keyspaces and tables
  • Perform Cassandra Admin Operations for Managing a Cluster
  • Learn concepts related to Cassandra Performance Tuning
  • Implement Backup and Recovery Strategies for Cassandra
  • Host Cassandra on Cloud

Who should go for this training?

The market for Big Data analytics is growing across the world, and this strong growth pattern translates into a great opportunity for IT professionals. Cassandra, being highly available and extremely fast, is one of the widely used NoSQL databases.

Our Apache Cassandra Training helps you to grab this opportunity and accelerate your career. It is best suited for:

  • Big Data Developer / Administrator / Architect / Analyst / Engineer
  • Software Architect / Engineer / Developer
  • Solution Delivery Consultant
  • Senior BI / ETL Developer
  • NoSQL Big Data Developer


There are no prerequisites as such for the Apache Cassandra course. Knowledge of the Linux command line is preferred. Exposure to Java, database, or data-warehouse concepts is a plus, but certainly not mandatory.

Key Skills

Big Data, Cassandra, Data Model, Keyspace, Cassandra Architecture, Cassandra Database, Node Operations, Backup & Restore, Performance Tuning, Hosting Cassandra Database on Cloud


Course Contents


Apache Cassandra Certification Training Content

Goal: In this module you will get a brief introduction to Big Data and the problems it creates for traditional Database Management Systems like RDBMS. You will also learn how Cassandra solves these problems and understand Cassandra’s features.


Basic concepts of Cassandra


At the end of this module, you will be able to

  • Explain what Big Data is
  • List the limitations of RDBMS
  • Define NoSQL and its characteristics
  • Define CAP Theorem
  • Learn Cassandra
  • List the Features of Cassandra
  • Get a Tour of Stepleaf’s VM


  • Introduction to Big Data and Problems caused by it
  • 5V – Volume, Variety, Velocity, Veracity and Value
  • Traditional Database Management System
  • Limitations of RDBMS
  • NoSQL databases
  • Common characteristics of NoSQL databases
  • CAP theorem
  • How Cassandra solves these limitations
  • History of Cassandra
  • Features of Cassandra

Hands On: 

Stepleaf VM tour

Goal: In this module, you will learn about Database Model and similarities between RDBMS and Cassandra Data Model. You will also understand the key Database Elements of Cassandra and learn about the concept of Primary Key.
  • Data Modelling in Cassandra
  • Data Structure Design
At the end of this module, you will be able to
  • Explain what Database Modelling is and its features
  • Describe the different types of Data Models
  • List the differences between the RDBMS and Cassandra Data Models
  • Define Cassandra Data Model
  • Explain Cassandra Database Elements
  • Implement Keyspace Creation, Updating and Deletion
  • Implement Table Creation, Updating and Deletion
  • Introduction to Database Model
  • Understand the analogy between RDBMS and Cassandra Data Model
  • Understand following Database Elements: Cluster, Keyspace, Column Family/Table, Column
  • Column Family Options
  • Columns
  • Wide Rows, Skinny Rows
  • Static and dynamic tables
  • Creating Keyspace
  • Creating Tables

Goal: Gain knowledge of architecting and creating Cassandra Database Systems. In addition, learn about the complex inner workings of Cassandra such as Gossip Protocol, Read Repairs and so on.
• Cassandra Architecture
Objectives: At the end of this module, you will be able to:
• Explain the Architecture of Cassandra
• Describe the Different Layers of Cassandra Architecture
• Learn about Gossip Protocol
• Describe Partitioning and Snitches
• Explain Vnodes and how the read and write paths work
• Understand Compaction, Anti-Entropy and Tombstone
• Describe Repairs in Cassandra
• Explain Hinted Handoff
  • Cassandra as a Distributed Database
  • Key Cassandra Elements
  • Memtable
  • Commit log 
  • SSTables
  • Replication Factor
  • Data Replication in Cassandra
  • Gossip protocol – Detecting failures
  • Gossip: Uses
  • Snitch: Uses
  • Data Distribution
  • Staged Event-Driven Architecture (SEDA)
  • Managers and Services
  • Virtual Nodes: Write path and Read path
  • Consistency level
  • Repair
  • Incremental repair
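
The relationship between replication factor and consistency level listed above can be illustrated with a little arithmetic. The following is a minimal sketch in plain Python, not Cassandra code, and the function names are made up for illustration. It computes the quorum size for a given replication factor and checks whether a read and a write are guaranteed to touch at least one common replica (the R + W > RF rule of thumb).

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def replicas_overlap(read_replicas: int, write_replicas: int,
                     replication_factor: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                            # 2
print(replicas_overlap(q, q, rf))   # True: QUORUM reads see QUORUM writes
print(replicas_overlap(1, 1, rf))   # False: ONE/ONE may miss the latest write
```

With a replication factor of 3, QUORUM reads and writes each involve two replicas, so they always overlap, while reading and writing at ONE trades that guarantee for lower latency.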

Goal: In this module you will learn about Keyspace and its attributes in Cassandra. You will also create Keyspace, learn how to create a Table and perform operations like Inserting, Updating and Deleting data from a table while using CQLSH.
• Database Operations
• Table Operations
Objectives: At the end of this module, you will be able to:
• Describe Different Data Types Used in Cassandra
• Explain Collection Types
• Describe what CRUD operations are
• Implement Insert, Select, Update and Delete of various elements
• Implement Various Functions Used in Cassandra
• Describe Importance of Roles and Indexing
• Understand tombstones in Cassandra
• Replication Factor
• Replication Strategy
• Defining columns and data types
• Defining a partition key
• Recognizing a partition key
• Specifying a descending clustering order
• Updating data
• Tombstones
• Deleting data
• Using TTL
• Updating a TTL
• Create Keyspace in Cassandra
• Check Created Keyspace in System_Schema.Keyspaces
• Update Replication Factor of Previously Created Keyspace
• Drop Previously Created Keyspace
• Create A Table Using cqlsh
• Create A Table Using UUID & TIMEUUID
• Create A Table Using Collection & UDT Column
• Create Secondary Index On a Table
• Insert Data Into Table
• Insert Data into Table with UUID & TIMEUUID Columns
• Insert Data Using COPY Command
• Deleting Data from Table
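
The keyspace and table operations listed in this module might look roughly like the CQL below. This is a sketch, not the course's own lab script: the keyspace `demo_ks`, the table `events_by_user`, and its columns are hypothetical names chosen for illustration, and the `?` markers are bind placeholders meant for prepared statements in a driver. The statements are kept as Python strings so they can be printed or passed to a driver session.

```python
# Hypothetical names throughout: demo_ks, events_by_user, and all columns.
CQL_STATEMENTS = {
    "create_keyspace": (
        "CREATE KEYSPACE IF NOT EXISTS demo_ks "
        "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
    ),
    "check_keyspace": (
        "SELECT keyspace_name, replication FROM system_schema.keyspaces "
        "WHERE keyspace_name = 'demo_ks';"
    ),
    "raise_replication_factor": (
        "ALTER KEYSPACE demo_ks "
        "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};"
    ),
    # user_id is the partition key; event_id is a clustering column stored
    # newest-first. tags is a collection column.
    "create_table": (
        "CREATE TABLE IF NOT EXISTS demo_ks.events_by_user ("
        "user_id uuid, event_id timeuuid, tags set<text>, payload text, "
        "PRIMARY KEY ((user_id), event_id)"
        ") WITH CLUSTERING ORDER BY (event_id DESC);"
    ),
    # uuid() and now() generate UUID and TIMEUUID values; the row expires
    # after one day.
    "insert_with_ttl": (
        "INSERT INTO demo_ks.events_by_user (user_id, event_id, tags, payload) "
        "VALUES (uuid(), now(), {'login'}, 'signed in') USING TTL 86400;"
    ),
    # Rewriting a column with a new TTL replaces the old expiry.
    "update_ttl": (
        "UPDATE demo_ks.events_by_user USING TTL 3600 SET payload = ? "
        "WHERE user_id = ? AND event_id = ?;"
    ),
    # A delete writes a tombstone rather than removing data immediately.
    "delete_row": (
        "DELETE FROM demo_ks.events_by_user WHERE user_id = ? AND event_id = ?;"
    ),
    "drop_keyspace": "DROP KEYSPACE IF EXISTS demo_ks;",
}


def as_script(statements: dict) -> str:
    """Join the statements, in insertion order, one per line."""
    return "\n".join(statements.values())


print(as_script(CQL_STATEMENTS))
```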

Goal: Learn how to add nodes in Cassandra and configure nodes using the “cassandra.yaml” file. Use Nodetool to remove a node and to restore a node back into service. In addition, use the Nodetool repair command to learn why repair matters and how the repair operation works.
• Node Operations
Objectives: At the end of this module, you will be able to:
• Explain Cassandra Nodes
• Understand Seed Nodes
• Configure Seed Nodes using cassandra.yaml file
• Add/bootstrap a node in a Cluster
• Use Nodetool utility to decommission a node from the cluster
• Remove a Dead Node from a Cluster
• Describe the need to repair Nodes
• Use Nodetool repair command
• Cassandra nodes
• Specifying seed nodes
• Bootstrapping a node
• Adding a node (Commissioning) in Cluster
• Removing (Decommissioning) a node
• Removing a dead node
• Repair
• Read Repair
• What’s new in incremental repair
• Run a Repair Operation
• Cassandra and Spark Implementation
Hands On:
• Commissioning a Node
• Decommissioning a Node
• Nodetool Commands
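
The node operations in this module map onto a handful of nodetool subcommands. The sketch below only builds the command lines as argument lists and does not execute anything; the helper names are hypothetical, and the host ID passed to `remove_dead_node` would come from the output of `nodetool status` on a live node.

```python
from typing import List, Optional


def cluster_status() -> List[str]:
    """Show each node's state, load, and host ID."""
    return ["nodetool", "status"]


def decommission_this_node() -> List[str]:
    """Run on the node that is leaving; it streams its data to other nodes."""
    return ["nodetool", "decommission"]


def remove_dead_node(host_id: str) -> List[str]:
    """Run on a live node to drop a node that is down and will not return."""
    return ["nodetool", "removenode", host_id]


def repair(keyspace: Optional[str] = None,
           primary_range_only: bool = False) -> List[str]:
    """Build a repair command, optionally limited to the primary token range."""
    cmd = ["nodetool", "repair"]
    if primary_range_only:
        cmd.append("-pr")
    if keyspace:
        cmd.append(keyspace)
    return cmd


# demo_ks is a made-up keyspace name.
print(" ".join(repair("demo_ks", primary_range_only=True)))
```

Building argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` later without quoting issues.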

Goal: The key aspects to monitoring Cassandra are resources used by each node, response latencies to requests, requests to offline nodes, and the compaction process. Learn to use various monitoring tools in Cassandra such as Nodetool and JConsole in this module.
• Clustering
Objectives: At the end of this module, you will be able to:
• Describe the various monitoring tools available
• Implement nodetool utility to manage a cluster
• Use JConsole to monitor JMX statistics
• Understand OpsCenter tool
• Cassandra monitoring tools
• Logging
• Tailing
• Using Nodetool Utility
• Using JConsole
• Learning about OpsCenter
• Runtime Analysis Tools
Hands On:
• JMX and Jconsole
• OpsCenter

Goal: In this Module you will learn about the importance of Backup and Restore functions in Cassandra and Create Snapshots in Cassandra. You will learn about Hardware selection and Performance Tuning (Configuring Log Files) in Cassandra. You will also learn about Cassandra integration with various other frameworks.
• Performance tuning
• Cassandra Design Principles
• Backup and Restoration
Objectives: At the end of this module, you’ll be able to:
• Learn backup and restore functionality and its importance
• Create a snapshot using Nodetool utility
• Restore a snapshot
• Understand how to choose the right balance of the following resources: memory, CPU, disks, number of nodes, and network.
• Understand all the logs created by Cassandra
• Explain the purpose of different log files
• Configure the log files
• Learn about Performance Tuning
• Integration with Spark and Kafka
• Creating a Snapshot
• Restoring from a Snapshot
• RAM and CPU recommendations
• Hardware choices
• Selecting storage
• Types of Storage to Avoid
• Cluster connectivity, security and the factors that affect distributed system performance
• End-to-end performance tuning of Cassandra clusters against very large data sets
• Load balance and streams
Hands On:
• Creating Snapshots
• Integration with Kafka
• Integration with Spark
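
The snapshot and restore steps covered in this module can be outlined as follows. This is a sketch under assumptions: the keyspace, table, tag, and data directory are hypothetical, the on-disk table directory name carries an ID suffix that differs per installation, and the helpers only build commands and paths without running anything.

```python
from pathlib import PurePosixPath
from typing import List


def take_snapshot(tag: str, keyspace: str) -> List[str]:
    """Command that snapshots every table in one keyspace under a named tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def snapshot_dir(data_dir: str, keyspace: str, table_dir: str,
                 tag: str) -> PurePosixPath:
    """Directory where the snapshot's SSTable files are expected to appear.

    table_dir is the on-disk directory name (table name plus an ID suffix).
    """
    return PurePosixPath(data_dir) / keyspace / table_dir / "snapshots" / tag


def load_restored_sstables(keyspace: str, table: str) -> List[str]:
    """Command that makes a node load SSTables copied back into a table directory."""
    return ["nodetool", "refresh", keyspace, table]


# All values below are placeholders for illustration.
print(" ".join(take_snapshot("nightly", "demo_ks")))
print(snapshot_dir("/var/lib/cassandra/data", "demo_ks", "events-1234", "nightly"))
print(" ".join(load_restored_sstables("demo_ks", "events")))
```

A restore along these lines copies the files from the snapshot directory back into the table directory and then runs the refresh command, or restarts the node.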

Goal: In this Module you will learn about Design, Implementation, and on-going support of Cassandra Operational Data. Finally, you will learn how to Host a Cassandra Database on Cloud.
• Security
• Design Implementation
• On-going support of Cassandra Operational Data
Objectives: At the end of this module, you’ll be able to:
• Security
• Learn about DataStax
• Create an End-to-End Project using Cassandra
• Implement a Cassandra Database on Cloud
• Security
• Ongoing Support of Cassandra Operational Data
• Hosting a Cassandra Database on Cloud
Hands On:
• Hosting Cassandra Database on Amazon Web Services


Structure your learning and get a certificate to prove it.



What are the system requirements for this Cassandra Training?

The following are the requirements for the system to smoothly run the programs:

  • Minimum RAM required: 4GB (Suggested: 8GB)
  • Minimum Free Disk Space: 25GB
  • Minimum Processor: i3 or above
  • Operating System: 64-bit

How will I execute the practicals in this Cassandra Training?

For this Cassandra training, we will help you set up StepLeaf's Virtual Machine on your system with local access. Detailed installation guides for setting up the environment are provided in the LMS. If you have any doubts, the 24*7 support team will promptly assist you. The StepLeaf Virtual Machine can be installed on a Mac or Windows machine.

Participants' machines must support a 64-bit VirtualBox guest image.

Which projects are included in StepLeaf's Online Apache Cassandra Training Course? 

Case Study 1: Product Liking Functionality [Ecommerce]

Scenario -  

David is CEO of www.purhaseitnow.com. Currently, he is selling 300k products per day across multiple categories. There are thousands of sellers having millions of products, registered on the portal.

Soon David realizes that his sales are decreasing month over month because of the poor quality of products sold by some of the sellers. He decides to categorize the products so that the site can recommend good products to his customers, and asks his CTO, John, to develop this functionality.

John suggests that if customers can give feedback on the products they purchased in the form of likes and dislikes, the site can recommend well-liked products over other similar products.

John and the Product Manager have gathered the following requirements and decided to develop the feature using Agile methodology:


1. Get User Details by User Id

2. Get Product Details by Product Id

3. Get all products liked by User

4. Get Product liked by Multiple Users

John is aware of RDBMS only and has suggested database schema as follows: 


1. User

a. User Id

b. User Name

c. Address

2. Product

a. Product Id

b. Product Name

c. Product Description

3. User Product Likes

a. User Id (FK user table)

b. Product Id (FK product table)

c. Timestamp

Soon after, a huge amount of data accumulated in the last table, resulting in system imbalance. They tried to apply all optimization techniques but failed to overcome the issue.

After some digging, they realized that the last 2 queries were not performing well because:

1. Tables will be huge due to the large catalogue

2. Retrieving products/users will take more time

To solve this, they hired you because you have some experience with NoSQL databases. You must come up with a proper database selection and schema design.

Once you have finalized the design, you have to:

1. State which type of database you are opting for: RDBMS, NoSQL, or graph.

2. Explain why you selected that database.

3. Provide schema details along with the primary, partition, composite, and clustering keys.

Extension to the above problem:

4. Get all products liked by a user should also return product names

5. Get all user names who have liked any products
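
One possible starting point for this case study is a query-first Cassandra schema in which each access pattern gets its own table. The sketch below is only one way to approach it, not the expected answer: the table names `products_liked_by_user` and `users_by_liked_product`, the column types, and the choice to denormalize names into the like tables are all assumptions made for illustration. The CQL is carried in Python strings, with `?` as bind placeholders.

```python
# Hypothetical schema: one table per query, with names copied into the
# "like" tables so that queries 3 and 4 need no joins.
SCHEMA = [
    "CREATE TABLE users (user_id uuid PRIMARY KEY, user_name text, address text);",
    "CREATE TABLE products (product_id uuid PRIMARY KEY, "
    "product_name text, product_description text);",
    "CREATE TABLE products_liked_by_user ("
    "user_id uuid, product_id uuid, product_name text, liked_at timestamp, "
    "PRIMARY KEY ((user_id), product_id));",
    "CREATE TABLE users_by_liked_product ("
    "product_id uuid, user_id uuid, user_name text, liked_at timestamp, "
    "PRIMARY KEY ((product_id), user_id));",
]

# Each requirement is served by a single-partition read.
QUERIES = {
    "user_by_id": "SELECT user_name, address FROM users WHERE user_id = ?;",
    "product_by_id": (
        "SELECT product_name, product_description FROM products WHERE product_id = ?;"
    ),
    "products_liked_by_user": (
        "SELECT product_id, product_name FROM products_liked_by_user WHERE user_id = ?;"
    ),
    "users_who_liked_product": (
        "SELECT user_id, user_name FROM users_by_liked_product WHERE product_id = ?;"
    ),
}


def like_product_batch() -> str:
    """Logged batch that records one like in both denormalized tables."""
    return "\n".join([
        "BEGIN BATCH",
        "  INSERT INTO products_liked_by_user "
        "(user_id, product_id, product_name, liked_at) VALUES (?, ?, ?, ?);",
        "  INSERT INTO users_by_liked_product "
        "(product_id, user_id, user_name, liked_at) VALUES (?, ?, ?, ?);",
        "APPLY BATCH;",
    ])


print(like_product_batch())
```

Partitioning by `user_id` in one table and by `product_id` in the other keeps each lookup within a single partition, at the cost of writing every like twice.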

Case Study 2: DOMAIN: BANK 

Problem Statement: 

Our consulting firm has been retained by a major bank to help improve the scalability of their current infrastructure. Various systems generate a large volume of transaction logs. The current database, MySQL, is not able to handle all the logs. The firm also wants to run some aggregation jobs.

Key issues:

You must revamp the existing code and migrate the existing data.


You have been given the end points or log file paths where data is being produced.

The website has different pages, such as the search page, promotional page, and deal of the day page. You must use these logs and design a schema that can return request counts per day, for example:

1. Number of clicks on deal of the day page with Android device on 11 May 2017

2. Number of clicks on deal of the day page with iOS device on 11 May 2017

3. Number of clicks on home page with Chrome browser on 11 May 2017

4. Number of clicks on home page with Firefox browser on 11 May 2017
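
The daily click counts requested above amount to grouping log records by day, page, and either device or browser. The sketch below shows that grouping in plain Python on a few made-up records, and includes, as a string, a Cassandra counter table that could hold the same totals. The record fields, page and device labels, table name, and sample data are all assumptions for illustration, not part of the case study's data set.

```python
from collections import Counter
from datetime import date
from typing import Iterable, NamedTuple, Tuple


class Click(NamedTuple):
    day: date
    page: str
    device: str
    browser: str


def count_clicks(clicks: Iterable[Click]) -> Tuple[Counter, Counter]:
    """Return click counts keyed by (day, page, device) and (day, page, browser)."""
    by_device: Counter = Counter()
    by_browser: Counter = Counter()
    for c in clicks:
        by_device[(c.day, c.page, c.device)] += 1
        by_browser[(c.day, c.page, c.browser)] += 1
    return by_device, by_browser


# A counter table with the same key shape: (day, page) is the partition key
# and device is a clustering column. Hypothetical name and columns.
COUNTER_TABLE = (
    "CREATE TABLE clicks_by_page_device ("
    "day date, page text, device text, clicks counter, "
    "PRIMARY KEY ((day, page), device));"
)
INCREMENT = (
    "UPDATE clicks_by_page_device SET clicks = clicks + 1 "
    "WHERE day = ? AND page = ? AND device = ?;"
)

# Made-up sample records for 11 May 2017.
D = date(2017, 5, 11)
SAMPLE_CLICKS = [
    Click(D, "deal_of_the_day", "android", "chrome"),
    Click(D, "deal_of_the_day", "android", "firefox"),
    Click(D, "deal_of_the_day", "ios", "safari"),
    Click(D, "home", "desktop", "chrome"),
    Click(D, "home", "desktop", "firefox"),
    Click(D, "home", "android", "chrome"),
]

by_device, by_browser = count_clicks(SAMPLE_CLICKS)
print(by_device[(D, "deal_of_the_day", "android")])  # 2
print(by_browser[(D, "home", "chrome")])             # 2
```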

Case Study 3: Customer Help Desk Application 

Problem Statement: 

Model a Customer Help Desk application where customer complaints are logged and captured in a Cassandra column family. The Cassandra table HelpDesk shown in the following screenshot captures these details.

The columns CustomerId, TicketId, ActionTime constitute the Primary key. The column CustomerId becomes the Partition key. The records are stored in the descending order of TicketId, ActionTime. This is to make sure that the recent action details are accessible first.


1. Create a table HelpDesk as per the above requirement

2. Insert data into HelpDesk. For every record inserted, ActionTime should get the current timestamp.

3. Use the CQL command to display all the data in the specified format.

4. Write a range query to retrieve data between two specific dates and times, for example between 2017-11-12 19:14:00 and 2017-11-13 19:20:00.
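
A possible shape for the HelpDesk table and its queries is sketched below as CQL held in Python strings. The key columns follow the problem statement; the column types and the extra `details` column are assumptions. Because TicketId precedes ActionTime in the clustering order, this sketch restricts TicketId with an equality before applying the time range, which is one way to keep the query valid without filtering.

```python
# Column types and the `details` column are assumptions for illustration.
CREATE_TABLE = """\
CREATE TABLE IF NOT EXISTS helpdesk (
    customerid int,
    ticketid int,
    actiontime timestamp,
    details text,
    PRIMARY KEY ((customerid), ticketid, actiontime)
) WITH CLUSTERING ORDER BY (ticketid DESC, actiontime DESC);"""

# toTimestamp(now()) stamps each insert with the current time.
INSERT_ROW = (
    "INSERT INTO helpdesk (customerid, ticketid, actiontime, details) "
    "VALUES (?, ?, toTimestamp(now()), ?);"
)

SELECT_ALL = "SELECT * FROM helpdesk;"


def range_query(start: str, end: str) -> str:
    """Time-range query for one customer and ticket; ? are bind placeholders."""
    return (
        "SELECT * FROM helpdesk WHERE customerid = ? AND ticketid = ? "
        f"AND actiontime >= '{start}' AND actiontime <= '{end}';"
    )


print(CREATE_TABLE)
print(range_query("2017-11-12 19:14:00", "2017-11-13 19:20:00"))
```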

Case Study 4: Hotel Booking Application 

Problem Statement: 

Design a hotel room reservation application data model. Access available_rooms.csv file provided. The available_rooms.csv file contains a month’s worth of inventory for two small hotels with five rooms each.


1. Create a table available_rooms_by_hotel_date as per the requirement with hotel_id as the partition key, while date and room_number are clustering columns.

2. Bulk load the table available_rooms_by_hotel_date from available_rooms.csv.

3. Display all the records in available_rooms_by_hotel_date for a particular hotel_id (ex: AZ123) and room_number (ex: 101). Remember that both hotel_id and room_number are part of the composite primary key.

4. Display all records for a particular hotel between two specific dates, in descending order of date.

5. Write a UDF "is available" that returns 1 if a room is available and 0 otherwise. Call the UDF to display the results for table available_rooms_by_hotel_date.

6. Create UDF/UDFs to return the total available rooms.
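
The tasks above might translate into CQL along the following lines, shown here as Python strings. Several details are assumptions: the `is_available` boolean column, the column types, the CSV column order, and the UDF name `is_available_flag` are not given in the problem statement. `COPY` is a cqlsh command rather than CQL proper, user-defined functions have to be enabled in cassandra.yaml, and the room-number query skips the `date` clustering column, so it relies on `ALLOW FILTERING` on versions that support it.

```python
# Assumed columns: hotel_id, date, room_number, is_available.
CREATE_TABLE = """\
CREATE TABLE available_rooms_by_hotel_date (
    hotel_id text,
    date date,
    room_number smallint,
    is_available boolean,
    PRIMARY KEY ((hotel_id), date, room_number)
);"""

# cqlsh command; the column list must match the CSV layout.
BULK_LOAD = (
    "COPY available_rooms_by_hotel_date (hotel_id, date, room_number, is_available) "
    "FROM 'available_rooms.csv' WITH HEADER = true;"
)

BY_HOTEL_AND_ROOM = (
    "SELECT * FROM available_rooms_by_hotel_date "
    "WHERE hotel_id = 'AZ123' AND room_number = 101 ALLOW FILTERING;"
)


def by_date_range(hotel_id: str, start: str, end: str) -> str:
    """Rows for one hotel between two dates, newest date first."""
    return (
        "SELECT * FROM available_rooms_by_hotel_date "
        f"WHERE hotel_id = '{hotel_id}' AND date >= '{start}' AND date <= '{end}' "
        "ORDER BY date DESC;"
    )


# Java UDF mapping the boolean to 1 or 0 (hypothetical function name).
CREATE_UDF = """\
CREATE OR REPLACE FUNCTION is_available_flag (available boolean)
    RETURNS NULL ON NULL INPUT
    RETURNS int
    LANGUAGE java
    AS 'return available ? 1 : 0;';"""

# Summing the UDF result with the built-in sum() gives a total per hotel.
TOTAL_AVAILABLE = (
    "SELECT sum(is_available_flag(is_available)) "
    "FROM available_rooms_by_hotel_date WHERE hotel_id = 'AZ123';"
)

print(by_date_range("AZ123", "2016-01-05", "2016-01-12"))
```

Using the built-in `sum()` over the UDF is a shortcut for task 6; a user-defined aggregate would be the other route.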

Apache Cassandra Certification

StepLeaf’s Apache Cassandra Professional Certificate Holders work at 1000s of companies


All the instructors at StepLeaf are industry practitioners with a minimum of 10-12 years of relevant IT experience. They are subject matter experts and are trained by StepLeaf to provide an excellent learning experience to the participants.

We limit the number of participants in a live session to maintain quality standards, so participation in a live class without enrollment is unfortunately not possible. However, you can go through the sample class recording, which gives a clear insight into how the classes are conducted, the quality of the instructors, and the level of interaction in a class.
You will never miss a lecture at StepLeaf! You can choose either of the two options:
  • View the recorded session of the class available in your LMS.
  • You can attend the missed session, in any other live batch.

Yes, access to the course material is available for a lifetime once you have enrolled in the course.

With the advent of Big Data, unstructured data is increasing exponentially. Companies are moving towards NoSQL databases, where they can store large volumes of structured, semi-structured, and unstructured data with quick iteration and an agile structure. Companies such as Instagram, Netflix, GoDaddy, DataStax, reddit, ebay, Spotify & Starbucks are hunting for professionals with Cassandra certification. There is steep career growth for Cassandra certified professionals. If you are planning to make a career in NoSQL databases, now is the right time for you.
Apache Cassandra is used in most of the streaming applications that deal with unstructured data, such as Internet of Things (IoT), fraud detection applications, and recommendation engines. The demand for Apache Cassandra is growing rapidly. Cassandra is used by 40% of the Fortune 100, and this list is also expanding quickly, so an early start would open up lots of opportunities for you. Get Certified Get Ahead.