Build Pyspark and Spark SQL Applications on AWS EMR, Orchestrate using Step Functions, Manage EMR using Boto3 and more | Discount Coupon for Udemy Course
Last updated: 8/2022
Course Language: English
Course Caption: English
Course Length: 11:18:37 (40717 seconds)
Number of Lectures: 150
This course includes:
11.5 hours of on-demand video
1 article
Full lifetime access
Access on mobile and TV
Certificate of completion
Creating Clusters using AWS Elastic Map Reduce Web Console
Setup Remote Application Development using AWS Elastic Map Reduce (EMR) and Visual Studio Code
Develop and Validate Simple Spark Application using Visual Studio Code and AWS Elastic Map Reduce (EMR)
Deploy Spark Application as Step to AWS Elastic Map Reduce (EMR)
Manage AWS Elastic Map Reduce (EMR) based Pipelines using Boto3 and Python
Build End to End AWS Elastic Map Reduce (EMR) based Pipelines using AWS Step Functions
Develop Applications using Spark SQL on AWS EMR Cluster
Build State Machine or Pipeline using AWS Step Functions using Spark SQL Script on AWS EMR Cluster
Understand how to pass parameters to Spark SQL Scripts deployed on EMR
AWS Elastic Map Reduce (EMR) is one of the key AWS services used to build large-scale data processing solutions with Big Data technologies such as Apache Hadoop, Apache Spark, and Hive. In this course, you will learn AWS Elastic Map Reduce (EMR) by building end-to-end data pipelines with Apache Spark and AWS Step Functions.
Here is the detailed outline of the course.
First, you will learn how to Get Started with AWS Elastic Map Reduce (EMR) by using the AWS Web Console to create and manage EMR Clusters. You will also learn the key features of the Web Console, how to connect to the master node of the cluster, and how to validate the important CLI interfaces such as spark-shell, pyspark, and hive, as well as the hdfs and aws CLI commands.
Once you understand how to get started with AWS EMR, you will go through the details of Setting up a Development Cluster using AWS EMR. There are quite a few advantages to using AWS EMR Clusters for development purposes, and many enterprises do so.
After setting up a development cluster using AWS EMR, you will go through the Development Life Cycle of Spark Applications using the AWS EMR Development Cluster. You will use Visual Studio Code Remote Development on top of the AWS EMR Development Cluster to go through the details.
Once the development is done, you will go through the details of Deploying a Spark Application on an AWS EMR Cluster. You will build the zip file and understand how to run it using the CLI in both client and cluster deployment modes. You will also understand how to deploy the Spark application as a step on AWS EMR Clusters, and how to troubleshoot issues with Spark Applications by going through the relevant logs.
Typically we run Spark Applications programmatically. After going through the details of deploying Spark applications on AWS EMR Clusters, you will learn how to Manage AWS EMR Clusters using Python Boto3. You will learn not only how to create clusters programmatically but also how to deploy Spark Applications as Steps programmatically using Python Boto3.
End-to-end data pipelines on AWS EMR can be built using AWS Step Functions. Once you understand how to manage EMR Clusters using Python Boto3 and deploy Spark Applications on EMR Clusters with it, it is important to learn how to Build EMR-based Workflows or Pipelines using AWS Step Functions. You will learn how to create the cluster, deploy a Spark Application as a Step on the cluster, and then terminate the cluster as part of a basic pipeline or State Machine using AWS Step Functions.
You will also learn how to perform validations as part of State Machines by Enhancing the AWS EMR-based State Machine or Pipeline. As part of the validations, you will check whether the specified files already exist.
We can also build Data Processing Applications or Pipelines using Spark SQL on AWS EMR. First, you will learn how to design and develop solutions using a Spark SQL Script and how to validate it by running the appropriate commands with the relevant runtime arguments.
Once you understand the development process of implementing solutions using Spark SQL on AWS EMR, you will learn how to build a Data Pipeline using AWS Step Functions that deploys the Spark SQL Script on an EMR Cluster. You will also learn the concept of Boto3 Waiters to make sure the steps are executed in a linear fashion.
Who this course is for:
University Students who want to learn AWS Elastic Map Reduce to process heavy volumes of data with hands-on and real-time examples
Aspiring Data Engineers and Data Scientists who want to master building data pipelines using AWS Elastic Map Reduce for large-scale Data Processing
Experienced Application Developers who would like to explore how to build end-to-end Data Pipelines using Python and AWS Services such as AWS Elastic Map Reduce
Experienced Data Engineers who want to build end-to-end data pipelines using Python and AWS Elastic Map Reduce
Any IT Professional who is keen to deep dive into AWS Elastic Map Reduce (EMR) for heavyweight Data Processing
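To make the Boto3 part of the outline above more concrete, here is a minimal sketch of adding a Spark application as a step to an existing EMR cluster and then blocking on a Boto3 waiter until the step finishes. It is not taken from the course material. The function names, the S3 path, the region default, and the `CONTINUE` failure action are assumptions chosen for illustration. The Boto3 calls used (`add_job_flow_steps` and the `step_complete` waiter) are part of the EMR client.

```python
def build_spark_step(name, app_s3_uri, app_args=(), deploy_mode="cluster"):
    """Return an EMR step definition that runs spark-submit via command-runner.jar.

    The names and defaults here are illustrative assumptions, not course code.
    """
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", deploy_mode,
                app_s3_uri,
                *app_args,
            ],
        },
    }


def submit_and_wait(cluster_id, step, region_name="us-east-1"):
    """Add the step to a running cluster and wait until it completes."""
    # boto3 is imported lazily so build_spark_step works without boto3 installed.
    import boto3

    emr = boto3.client("emr", region_name=region_name)
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    step_id = response["StepIds"][0]
    # The waiter polls the step status, which keeps a script strictly sequential.
    emr.get_waiter("step_complete").wait(ClusterId=cluster_id, StepId=step_id)
    return step_id


if __name__ == "__main__":
    # Hypothetical bucket and argument; only the request payload is built here.
    step = build_spark_step("demo-spark-app", "s3://example-bucket/app.py", ["dev"])
    print(step["HadoopJarStep"]["Args"])
```

Calling `submit_and_wait` requires AWS credentials and a running cluster id, so the example only prints the step arguments. The same waiter pattern applies to `cluster_running` and `cluster_terminated` when creating or terminating clusters from a script.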
Course Content:
1 Lecture | 00:19
Introduction to Mastering AWS Elastic Map Reduce for Data Engineers
00:19
3 Lectures | 08:49
Overview of Powershell on Windows 10 or Windows 11
04:25
Install Visual Studio Code on Windows
02:44
Install Remote Development Extension Kit for Visual Studio Code
01:40
23 Lectures | 01:15:58
Planning of EMR Cluster
01:20
Create EC2 Key Pair
04:30
Setup EMR Cluster with Spark
05:59
Understanding Summary of AWS EMR Cluster
03:28
Review EMR Cluster Application User Interfaces
02:23
Review EMR Cluster Monitoring
01:46
Review EMR Cluster Hardware and Cluster Scaling Policy
01:16
Review EMR Cluster Configurations
02:11
Review EMR Cluster Events
02:21
Review EMR Cluster Steps
01:48
Review EMR Cluster Bootstrap Actions
02:03
Connecting to EMR Master Node using SSH
02:20
Disabling Termination Protection and Terminating the Cluster
01:41
Clone and Create New Cluster
03:37
Listing AWS S3 Buckets and Objects using AWS CLI on EMR Cluster
03:20
Listing AWS S3 Buckets and Objects using HDFS CLI on EMR Cluster
03:32
Managing Files in AWS s3 using HDFS CLI on EMR Cluster
04:51
Review Glue Catalog Databases and Tables
01:45
Accessing Glue Catalog Databases and Tables using EMR Cluster
05:45
Accessing spark-sql CLI of AWS EMR Cluster
04:11
Accessing pyspark CLI of AWS EMR Cluster
06:05
Accessing spark-shell CLI of AWS EMR Cluster
06:56
Create AWS EMR Cluster for Notebooks
02:50
15 Lectures | 58:38
Create bootstrap script for AWS EMR Cluster
04:37
Provision Elastic IP for Master Node of AWS EMR Cluster
03:27
Create AWS EMR for Development
03:52
Troubleshooting Issues related to Bootstrap of EMR Cluster
01:59
Fix Bootstrap Script for AWS EMR Cluster
03:51
Validate AWS EMR Cluster with Bootstrap Action with updated script
05:43
Setup Python Virtual Environment as part of VS Code Workspace
02:47
Getting Started with Boto3 to Manage AWS EMR Clusters
02:28
Setup boto3 to explore APIs to manage AWS EMR Clusters
02:47
Set AWS Profile using env file in Visual Studio Code
04:25
Get Cluster Details of AWS EMR Development Cluster using boto3
05:24
Getting Instance Id of the Master Node of AWS EMR Cluster using boto3
02:24
Getting Allocation Id of the Elastic Ip using AWS boto3
04:29
Associating Elastic Ip with AWS EMR Master Node using Boto3
03:39
Setup Notebook Environment for EMR Cluster using IAM User
06:46
16 Lectures | 01:36:13
Open Remote Window on AWS EMR Master Node using VS Code
04:32
Setup Workspace on AWS EMR Master using Git Repository
03:53
Best Practices and Advantages of using AWS EMR Cluster for Team Development
04:13
Install VSCode Extensions in remote Workspace for Python
04:58
Review Python and Pyspark details on EMR Cluster
03:26
Running Applications using local and yarn during development
02:39
Getting Started with Development of Spark Applications on EMR Cluster
07:05
Create Function for Spark Session
06:09
Upload Files to AWS s3 for the development using AWS EMR Cluster
04:51
Develop read logic for the Spark Application
10:05
Process Data Frame using Spark APIs
06:56
Write Data to Files using Spark APIs
08:20
Productionize the Code and setup required data sets for validation
05:59
Resize the AWS EMR Cluster using Web Console
07:19
Validate Changes to productionize the Application Code
08:49
Take the backup and terminate the cluster
06:59
10 Lectures | 43:59
Recreate the AWS EMR Cluster to deploy Spark Applications
03:21
Setup Code Repository on the AWS EMR Master Node
08:15
Resize the AWS EMR Cluster to validate application on larger data sets
04:38
Build Zip File for the Spark Application
03:54
Validate the Spark Application using zip file and client as deploy mode
06:02
Run Spark Application on EMR using Cluster Deployment Mode
03:42
Run Spark Application copied to s3 on EMR using Cluster Deployment Mode
03:49
Deploy Spark Application as Step to the AWS EMR Cluster
05:29
Setup Multiple Files to Manage AWS s3 Objects using State Machines
02:35
Validate Spark Application Deployed as Step on AWS EMR Cluster
02:14
11 Lectures | 59:39
Update Material related to Managing AWS EMR using Boto3
03:42
Create AWS EMR Cluster using AWS CLI Command
07:01
Manage AWS EMR Clusters using AWS CLI Commands
06:42
Overview of AWS boto3 to Manage AWS EMR Clusters
08:09
Overview of Run Job Flow API to create AWS EMR Cluster
06:17
Create AWS EMR Cluster or Job Flow Cluster using AWS Boto3
11:12
Prepare Data Sets to add Spark Application as Step to AWS EMR Cluster
02:42
Add Spark Application as Step to AWS EMR Cluster using Boto3
07:41
Exercise to add Spark Application as Step to EMR Cluster using boto3
02:28
Terminate the AWS EMR Cluster used for adding Steps
01:58
Exercise to Create AWS EMR Cluster with Steps for Spark Application
01:47
12 Lectures | 49:52
Review of Development Environment for AWS Step Functions and EMR
02:09
Quick Overview of Important Terms of AWS Step Functions
01:50
Getting Started with EMR based Pipeline using AWS Step Functions
06:27
Overview of AWS IAM Role associated with State Machine copy
01:26
Overview of Creating EMR Cluster using AWS Step Functions
05:52
Parameters to Create EMR Cluster using AWS Step Functions
02:52
Attach Permissions to Step Function Role to Create AWS EMR Cluster
04:58
Add Step to AWS EMR Cluster using AWS Step Function
08:35
Validate Adding Step to AWS EMR Cluster using Step Functions
03:59
Add Action to State Machine to Terminate the AWS EMR Cluster
06:41
Validate the execution of State Machine to run Spark Application on AWS EMR
02:35
Terminate AWS EMR Clusters Created to Validate State Machine copy
02:28
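The basic pipeline this section covers (create the cluster, add a Spark step, terminate the cluster) could be expressed as a state machine definition along the following lines. This is a hedged sketch, not the course's own definition: the cluster name, release label, instance types, IAM role names, and state names are placeholder assumptions. The `Resource` ARNs are the Step Functions service integrations for EMR.

```python
import json


def build_emr_pipeline_definition(app_s3_uri, release_label="emr-6.7.0"):
    """Return an Amazon States Language definition as a Python dict.

    All names, roles, and instance settings are illustrative placeholders.
    """
    return {
        "Comment": "Create an EMR cluster, run one Spark step, terminate the cluster",
        "StartAt": "CreateCluster",
        "States": {
            "CreateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
                "Parameters": {
                    "Name": "demo-pipeline-cluster",
                    "ReleaseLabel": release_label,
                    "Applications": [{"Name": "Spark"}],
                    "ServiceRole": "EMR_DefaultRole",
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "Instances": {
                        "KeepJobFlowAliveWhenNoSteps": True,
                        "InstanceFleets": [
                            {
                                "Name": "PrimaryFleet",
                                "InstanceFleetType": "MASTER",
                                "TargetOnDemandCapacity": 1,
                                "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}],
                            },
                            {
                                "Name": "CoreFleet",
                                "InstanceFleetType": "CORE",
                                "TargetOnDemandCapacity": 1,
                                "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}],
                            },
                        ],
                    },
                },
                # Keep the cluster details so later states can read the cluster id.
                "ResultPath": "$.cluster",
                "Next": "RunSparkStep",
            },
            "RunSparkStep": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId.$": "$.cluster.ClusterId",
                    "Step": {
                        "Name": "demo-spark-app",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", "--deploy-mode", "cluster", app_s3_uri],
                        },
                    },
                },
                "ResultPath": "$.step",
                "Next": "TerminateCluster",
            },
            "TerminateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
                "Parameters": {"ClusterId.$": "$.cluster.ClusterId"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_emr_pipeline_definition("s3://example-bucket/app.py")
    print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the Step Functions console as a starting point. Note that this simple chain has no `Catch` block, so a failed step would leave the cluster running. The state machine's IAM role also needs permissions for the EMR actions involved.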
18 Lectures | 01:19:30
Review the current state of AWS EMR based Pipeline or State Machine copy
00:41
Create State Machine using AWS Step Function to Validate s3 copy
03:03
Attach Policy with Permissions on AWS s3 to Step Function Role copy
03:00
Setup File in AWS s3 and Validate State Machine to list objects copy
03:11
Relationship between AWS Boto3 and Actions in Step Functions copy
03:41
Add State to Delete Object from AWS s3 copy
02:53
Fix Permissions and Run State Machine to Delete Object from AWS s3 copy
03:01
Passing Input to States in AWS Step Functions State Machine copy
05:54
Setup Multiple Files to Manage AWS s3 Objects using State Machines copy
02:35
Process AWS s3 Objects using Map in State Machine
08:30
Extract Key of AWS s3 Objects using Step Functions Pass
04:54
Add State to AWS Step Function Delete s3 Object
04:12
Develop AWS Lambda Function to customise State Machine Data
07:03
Add AWS Lambda Function to State Machine to Pass s3 Details for delete
09:11
Add Condition to State Machine to avoid Key Error on AWS s3 List Objects
06:19
Overview of Map Concurrency in State Machines of AWS Step Functions
03:33
Invoking AWS Step Function State Machine from Other State Machines
06:27
Overview of integration of s3 based State Machine with EMR State Machine
01:22
13 Lectures | 52:58
Taking back up of AWS Step Functions State Machines
02:13
Grant Permissions between AWS Step Functions State Machines via IAM Role
03:24
Update AWS Step Function State Machine with EMR to validate s3
05:14
Pass EMR Step Details to AWS Step Functions State
03:12
Validate AWS Step Function EMR based State Machine Execution
03:11
Run AWS Step Function State Machine to validate logic to delete AWS s3 Objects
01:11
Exercise to add validation of source s3 location in AWS Step Function StateMach
01:33
Update AWS Step Function State Machine to Validate Source s3 Location
04:59
Run AWS Step Function State Machine with source s3 Validation Logic
05:03
Develop AWS Lambda Function to check number of files in source s3
05:20
Attach Policy to State Machine Role to Invoke AWS Lambda Function
02:30
Run Updated State Machine to validate source count
11:58
Best Practices to Run AWS Step Functions State Machines
03:10
14 Lectures | 01:10:25
Setup AWS EMR Cluster to develop applications using Spark SQL
10:59
Setup Visual Studio Code Workspace using AWS EMR Master Node
04:18
Update PYTHONPATH to access Pyspark Libraries or Modules on AWS EMR Master Node
03:04
Setup Required Data Sets for Spark SQL
02:16
Upload Retail DB Files to AWS s3 using AWS CLI commands
02:52
Getting Started with Spark SQL and Temporary Views using Spark SQL on AWS EMR C
05:20
Create Spark SQL Temporary Views for Orders and Order Items
04:10
Join and Aggregate using Spark SQL on AWS EMR Cluster
07:33
Write Query Results back to AWS s3 using Spark SQL on AWS EMR Cluster
03:39
Develop Script using Spark SQL Commands
07:06
Parameterize Bucket Name in Spark SQL Script
05:35
Deploy Spark SQL Script in s3 and Run using CLI on AWS EMR Master Node
05:53
Deploy Spark SQL Script as Step on AWS EMR Cluster
05:49
Conclusion to Develop Spark SQL Applications on EMR Cluster
01:51
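As an illustration of the last few lectures in this section (parameterizing the bucket name and running the script from s3 as a step), the sketch below builds a `spark-sql` command that passes a variable to a script, plus the matching EMR step definition. The variable name `bucket_name`, the script path, and the function names are hypothetical. `-f` and `-d key=value` are options of the `spark-sql` CLI, and a script would typically refer to the value as `${bucket_name}`. Whether the course uses `-d` or another mechanism is an assumption.

```python
import shlex


def build_spark_sql_args(script_uri, variables):
    """Build the spark-sql argument list for a script with substitution variables."""
    args = ["spark-sql", "-f", script_uri]
    for key, value in variables.items():
        # Each -d key=value pair defines a variable the script can use as ${key}.
        args += ["-d", f"{key}={value}"]
    return args


def build_spark_sql_step(name, script_uri, variables):
    """Wrap the spark-sql command in an EMR step definition (command-runner.jar)."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": build_spark_sql_args(script_uri, variables),
        },
    }


if __name__ == "__main__":
    # Hypothetical script location and bucket name.
    args = build_spark_sql_args(
        "s3://example-bucket/scripts/daily_revenue.sql",
        {"bucket_name": "example-bucket"},
    )
    # Shows the equivalent command line for running on the master node.
    print(shlex.join(args))
```

The same argument list works both for a manual run on the master node and as the `Args` of a step, which keeps the CLI validation and the deployed step consistent.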
14 Lectures | 01:22:17
Create State Machine to Deploy Spark SQL Script on AWS EMR Cluster
03:14
Overview of Managing AWS EMR Clusters using Boto3
03:50
Overview of AWS boto3 to Manage AWS EMR Clusters
08:17
Create AWS EMR Job Flow Cluster using Python Boto3
07:03
Add Spark SQL Script as Step to AWS EMR Cluster using Boto3
06:21
Overview of AWS EMR Waiters using Python Boto3
06:23
Terminate AWS EMR Cluster using waiters and Python Boto3
06:41
Overview of AWS Step Functions State Machine to execute Spark SQL on EMR
01:06
Create State Machine using AWS Step Function to create EMR Cluster
08:07
Grant Permissions to State Machine via Role to Create AWS EMR Cluster
03:59
Add Spark SQL Script as Step to AWS EMR Cluster using AWS Step Functions
09:32
Add Terminate AWS EMR Cluster Step to AWS Step Functions State Machine
06:22
Pass AWS EMR Step Details as Input to State Machine at Execution Time
04:37
Validate Spark SQL Script Execution as AWS EMR Step using State Machine
06:45
Rating: 4.31 (84 course ratings)
1 star: 1/84
2 stars: 3/84
3 stars: 7/84
4 stars: 28/84
5 stars: 45/84
FAQ: Udemy Free course Most frequent questions and answers
Does Udemy offer Free Udemy coupons?
Yes. Udemy is one of the largest online education platforms, with a broad selection of video-on-demand courses and qualified instructors. At theprogrammingbuddy.club we curate the latest Udemy coupons, along with their expiry and the number of uses left.
How to get free Udemy courses?
There are two ways to get free Udemy courses:
Go to udemy.com and search for your desired course category. Then select free from the filter options.
You can also get paid courses for free if you have a coupon. Head to theprogrammingbuddy.club, where you can get a paid Udemy course for free every day.
How to get Udemy Certificates for free?
Udemy offers a certificate on completion of each course. To receive a certificate of completion from Udemy, you need to complete 100% of the course. There is a simple hack: you can open a video and jump along the timeline to complete a lecture.
To download the certificate from Udemy, you need to head over to your account on a desktop browser. Udemy certificates can't be accessed on the mobile app.
Do Udemy courses expire?
No, once you enroll, you will have lifetime access to the course. You can complete the course on your schedule.
Why are the Udemy instructors giving away free Udemy Coupons?
Every instructor has worked for hours on each of their courses. When new courses launch, instructors have no easy way to get them in front of an audience and collect feedback. So, instructors share free coupons for their courses to get feedback from students. We at theprogrammingbuddy.club work with these instructors to make their courses available to our buddies.
Is Udemy safe to use?
Yes, payments on Udemy are safe. It is no different from paying for other services on an application or website and entering your payment information before receiving your goods. Just be sure to keep your account secure and do not share your Udemy account.
Can Udemy courses get you a job?
Earning a skill is more valuable than earning a job these days. Skills are your most valuable asset. They can help you qualify for the jobs you want and get promoted to more advanced positions within your organization. Unfortunately, it is difficult for many people to balance taking courses with work and family obligations. We have had many students who took only Udemy courses and went on to start a job or begin freelancing with the skills they learned.