Data Analysis with Pandas and Python | Discount Coupon for Udemy Course
Analyze data quickly and easily with Python's powerful pandas library! All datasets included --- beginners welcome!
Bestseller
- 21.5 hours of on-demand video
- 40 articles
- Full lifetime access
- Access on mobile and TV
- Certificate of completion
- 7 additional resources
- Perform a multitude of data operations in Python's popular pandas library including grouping, pivoting, joining and more!
- Learn hundreds of methods and attributes across numerous pandas objects
- Possess a strong understanding of manipulating 1D, 2D, and 3D data sets
- Resolve common issues in broken or incomplete data sets
Student Testimonials:
The instructor knows the material, and has detailed explanation on every topic he discusses. Has clarity too, and warns students of potential pitfalls. He has a very logical explanation, and it is easy to follow him. I highly recommend this class, and would look into taking a new class from him. - Diana
This is excellent, and I cannot compliment the instructor enough. Extremely clear, relevant, and high quality - with helpful practical tips and advice. Would recommend this to anyone wanting to learn pandas. Lessons are well constructed. I'm actually surprised at how well done this is. I don't give many 5 stars, but this has earned it so far. - Michael
This course is very thorough, clear, and well thought out. This is the best Udemy course I have taken thus far. (This is my third course.) The instruction is excellent! - James
Welcome to the most comprehensive Pandas course available on Udemy! An excellent choice for both beginners and experts looking to expand their knowledge on one of the most popular Python libraries in the world!
Data Analysis with Pandas and Python offers 19+ hours of in-depth video tutorials on the most powerful data analysis toolkit available today. Lessons include:
- installing
- sorting
- filtering
- grouping
- aggregating
- de-duplicating
- pivoting
- munging
- deleting
- merging
- visualizing
- and more!
Why learn pandas?
If you've spent time in a spreadsheet software like Microsoft Excel, Apple Numbers, or Google Sheets and are eager to take your data analysis skills to the next level, this course is for you! Data Analysis with Pandas and Python introduces you to the popular Pandas library built on top of the Python programming language. Pandas is a powerhouse tool that allows you to do anything and everything with colossal data sets -- analyzing, organizing, sorting, filtering, pivoting, aggregating, munging, cleaning, calculating, and more!
I call it "Excel on steroids"!
Over the course of more than 19 hours, I'll take you step-by-step through Pandas, from installation to visualization! We'll cover hundreds of different methods, attributes, features, and functionalities packed away inside this awesome library. We'll dive into tons of different datasets, short and long, broken and pristine, to demonstrate the incredible versatility and efficiency of this package.
Data Analysis with Pandas and Python is bundled with dozens of datasets for you to use. Dive right in and follow along with my lessons to see how easy it is to get started with pandas!
Whether you're a new data analyst or have spent years (*cough* too long *cough*) in Excel, Data Analysis with pandas and Python offers you an incredible introduction to one of the most powerful data toolkits available today!
Who this course is for:
- Data analysts and business analysts
- Excel users looking to learn a more powerful software for data analysis
Course Content:
- Introduction to Data Analysis with Pandas and Python12:15
Welcome to Data Analysis with Pandas and Python. In this lesson, we
introduce the pandas library including its history and purpose
introduce Jupyter Notebook, the environment in which we'll be writing our code
explore sample Jupyter Notebooks to showcase some of the technology's features
The datasets for this course are available in a single pandas.zip file. Download and unpack the pandas.zip file in the directory of your choice.
- About Me00:56
Get to know a little about your instructor.
- Completed Course Files00:31
This lesson includes the completed Jupyter Notebooks that were created during the recording of the course.
- macOS - Download the Anaconda Distribution, our Python development environment04:21
The next batch of lessons focuses on installing and configuring the Anaconda distribution on a macOS machine. When downloading the distribution, choose the latest version of the language. It will have the greatest version number. In this lesson, we also discuss the differences between Python 2 and 3.
- macOS - Install Anaconda Distribution10:38
In this lesson, we install the Anaconda distribution on a macOS machine. The setup installs Python and over 100 of the most popular libraries for data science in a central directory on your computer. We also explore the Anaconda Navigator program, a visual application for interacting with Anaconda.
- macOS - Access the Terminal Application08:34
The Terminal is an application for issuing text-based commands to your macOS operating system. In this lesson, you'll learn two ways to access the Terminal. We also verify that Anaconda has been successfully installed and update the version of the conda environment manager.
- macOS - Create conda Environment and Install pandas and Jupyter Notebook13:07
In this lesson, we use the Terminal to create a new Anaconda environment and install Pandas (and some other libraries) within it. We also learn how to activate and deactivate conda environments, and update packages.
- macOS - Unpack Course Materials + The Start and Shutdown Process12:32
The course materials, a collection of datasets in .csv and .xlsx file formats, are available for download in a single zip file attached to this lesson. I strongly recommend following along with my tutorials by practicing the syntax on your end. In this lesson, we walk through the startup and shutdown process for a Jupyter Notebook session. We also execute our first line of Python code!
- Windows - Find Out if Your System is 32-bit or 64-bit01:16
Discover if your Windows operating system is running a 32-bit version or a 64-bit version of the OS. You'll need to memorize this number for the next lesson.
- Windows - Download and Install the Anaconda Distribution09:32
In this lesson, we'll download and install the Anaconda distribution for our Windows computers. Anaconda is a software bundle that includes Python and the conda environment manager.
- Windows - Create conda Environment and Install pandas and Jupyter Notebook18:12
Access the Command Prompt on a Windows machine. The prompt (also known as the command line) is used to interact with the computer with text-based commands. We'll use it to download additional Python libraries for the course and update all installed Anaconda libraries.
- Windows - Unpack Course Materials + The Startup and Shutdown Process13:13
In this lesson, we extract our .csv and .xlsx datasets, which are available in a single .zip file attached to this lesson. We also walk through the startup and shutdown process for a study session, which includes:
activating the correct Anaconda environment
launching the Jupyter Notebook application
opening and closing a Jupyter Notebook
shutting down the Jupyter server
- Intro to the Jupyter Notebook Interface09:51
In this lesson, we explore the Jupyter Notebook interface including the toolbars and menus. We also dive into the ways we can restart the Notebook in case of slowness or unresponsiveness.
- Cell Types and Cell Modes in Jupyter Notebook07:37
In this lesson, we learn about the two different modes (Edit Mode and Command Mode) within a Jupyter Notebook. Edit Mode modifies the contents of a single cell while Command Mode enables keyboard shortcuts that work on the Notebook as a whole.
- Code Cell Execution in Jupyter Notebook03:13
Learn the multiple keyboard shortcuts to execute code cells and Markdown cells. We'll also learn how Jupyter Notebook chooses what to output below a cell that has multiple commands.
- Popular Keyboard Shortcuts in Jupyter Notebook03:41
In this lesson, we practice using keyboard shortcuts to add and delete cells from the Jupyter Notebook. We also see how to access a helpful cheatsheet of available commands.
- Import Libraries into Jupyter Notebook08:31
In this lesson, we discuss how to use the import keyword to import libraries like pandas and numpy into a Jupyter Notebook. We also talk about the as keyword to assign an alias to an import, as well as the popular community aliases for pandas and numpy.
- Troubleshooting Issues with Jupyter Notebook00:14
This lecture walks you through fixing a possible issue with Autocomplete in Jupyter Notebook.
- Setup & Installation8 questions
Test your knowledge of the concepts introduced in this course section.
- Intro to the Python Crash Course03:34
In this lesson, we introduce the Python crash course, a quick bootcamp on the fundamentals of the Python programming language. This section is COMPLETELY OPTIONAL and is designed for students who have limited / no experience in coding.
- Comments03:18
A comment is a line of code that is ignored by the Python interpreter. You can use it to disable code, to leave documentation, or to write explanations.
- Basic Data Types10:47
This lesson introduces Python's basic data types including integers, floats, strings, Booleans, and the None object. All of these data structures are objects. An object is just a digital container for data.
- Operators15:35
In this lesson, we go back to grade school and practice using common mathematical operators for addition, subtraction, division, and multiplication. We also explore some special operators like modulo that you may not have used before.
- Variables07:48
A variable is a name assigned to an object in the program. It is subject to change; a variable can be reassigned to a different value throughout a program's execution.
- Declare Variables1 question
- Coding Exercise Solution: Declare Variables00:02
See the solutions to the coding exercise in the previous lesson.
- Built-in Functions10:42
A function is a recipe, a set of repeatable instructions for achieving some outcome. Functions accept inputs called arguments and produce an output called a return value. In this lesson, we practice invoking Python's built-in functions including len, str, float, int, and more.
- Built-in Functions1 question
- Coding Exercise Solution: Built-in Functions00:33
See the solutions to the coding exercise in the previous lesson.
- Custom Functions16:33
In this lesson, we define our own custom function to convert a temperature from Celsius to Fahrenheit. A custom function allows us to encapsulate a piece of repeatable business logic.
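As a rough sketch of what such a function might look like (the name convert_to_fahrenheit is our own choice for illustration, not necessarily the one used in the lesson):

```python
def convert_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


boiling = convert_to_fahrenheit(100)   # water boils at 100 degrees Celsius
freezing = convert_to_fahrenheit(0)    # water freezes at 0 degrees Celsius
```

Once the formula lives inside a function, we can reuse it with any input instead of retyping the arithmetic.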
- Custom Functions1 question
- Coding Exercise Solution: Custom Functions00:18
See the solutions to the coding exercise in the previous lesson.
- String Methods20:51
In this lesson, we introduce string methods like upper, lower, title, replace, and more. A string is an immutable object so it is incapable of change. These methods return new string objects.
- String Methods1 question
- Coding Exercise Solution: String Methods00:41
See the solutions to the coding exercise in the previous lesson.
- Lists13:17
A list is a mutable data structure for storing elements in order. In this lesson, we declare some sample lists of various data types and explore some common methods for list manipulation.
- Creating Lists1 question
- Coding Exercise Solution: Creating Lists00:25
See the solutions to the coding exercise in the previous lesson.
- Index Positions and Slicing15:56
Indexing or slicing refers to extracting one or more elements from a data structure like a string or a list. In this lesson, we introduce the square bracket syntax for indexing.
- Index Positions and Slicing1 question
- Coding Exercise Solution: Index Positions and Slicing00:23
See the solutions to the coding exercise in the previous lesson.
- Dictionaries15:22
A dictionary is a mutable collection of key-value pairs; values are looked up by key rather than by position (since Python 3.7, dictionaries also preserve insertion order). Dictionaries are ideal for creating associations or relationships between objects. In this lesson, we build up a dictionary representing a restaurant menu, then demonstrate how to add, remove, and modify key-value pairs within it.
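A minimal sketch of the add, modify, and remove operations on a menu dictionary. The dishes and prices are made up for illustration.

```python
# Keys are dish names, values are prices (sample data).
menu = {"burger": 8.99, "salad": 6.50}

menu["soup"] = 4.25                   # add a new key-value pair
menu["salad"] = 7.00                  # overwrite the value for an existing key
removed_price = menu.pop("burger")    # remove a key and get its value back
```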
- Creating Dictionaries1 question
- Coding Exercise Solution: Creating Dictionaries00:11
See the solutions to the coding exercise in the previous lesson.
- Completed Jupyter Notebook for this Section00:02
Download the completed Jupyter Notebook for the course section.
- Python Crash Course8 questions
Test your knowledge of the concepts introduced in this course section.
- Create Jupyter Notebook for the Series Module02:15
In this lesson, we create a new Jupyter Notebook for the Series section of the course. The pandas Series object is a one-dimensional labelled array that combines the best features of a Python list and a Python dictionary.
- Create A Series Object from a Python List12:41
A pandas Series can be created with the pd.Series() constructor method. In this lesson, we'll practice creating a few sample Series by feeding in Python lists as inputs to the constructor method.
- Create A Series Object from a Python Dictionary06:24
The pd.Series constructor method accepts a variety of inputs, including native Python objects. In this lesson, we'll create a Series from a Python dictionary. We'll also explore the differences between the Series and Python's built-in objects, and understand how the index operates in a Series.
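Here is a small sketch of both construction routes, using made-up sample values. A list produces a default numeric index, while a dictionary's keys become the index labels.

```python
import pandas as pd

# From a list: pandas assigns index labels 0, 1, 2.
flavors = pd.Series(["chocolate", "vanilla", "strawberry"])

# From a dictionary: the keys become the index labels.
prices = pd.Series({"burger": 8.99, "salad": 6.50})
```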
- Create a Series Object v2 1 question
- Coding Exercise Solution: Create a Series Object00:29
See the solutions to the coding exercise in the previous lesson.
- Intro to Methods06:54
In this lesson, we continue our exploration of methods on the Series object. We practice with the sum, mean, and product methods.
- Intro to Attributes09:48
Objects in pandas have attributes and methods. Methods actively interact with and modify the object while attributes return information about the object's state. In this lesson, we'll use the values, index, and dtype attributes on a Series object.
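The difference shows up in the syntax: methods are invoked with parentheses, attributes are not. A short sketch on a sample Series (the values are arbitrary):

```python
import pandas as pd

numbers = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

# Methods: invoked with parentheses.
total = numbers.sum()
average = numbers.mean()
product = numbers.product()

# Attributes: accessed without parentheses.
labels = numbers.index     # the index labels
raw = numbers.values       # the underlying array of values
kind = numbers.dtype       # the data type of the values
```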
- Attributes and Methods on a Series1 question
- Coding Exercise Solution: Attributes and Methods on a Series00:16
Explore the solution for the previous coding challenge.
- Parameters and Arguments19:06
Parameters are names for the inputs that a function/method will receive when invoked. Arguments are the values we provide for those parameters when the function/method is invoked. In this lesson, we'll learn the syntax of supplying arguments to parameters on pandas methods.
- Parameters and Arguments1 question
- Coding Exercise Solution: Parameters and Arguments00:39
See the solution to the coding exercise in the previous lesson.
- Import Series with the pd.read_csv Function18:00
The time has come to import our first datasets into our Jupyter Notebook! In this lesson, we use the pd.read_csv function to import datasets of Pokemon and Google stock prices. We also explore the squeeze method, which coerces an imported one-column DataFrame into a Series object.
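The sketch below imitates that workflow without the course files. It reads a tiny in-memory CSV (a stand-in for the real dataset, with a hypothetical column name) and squeezes the one-column DataFrame into a Series.

```python
import io

import pandas as pd

# A stand-in for a one-column CSV file on disk.
csv_text = "Name\nBulbasaur\nIvysaur\nVenusaur\n"

frame = pd.read_csv(io.StringIO(csv_text))   # read_csv returns a DataFrame
names = frame.squeeze("columns")             # one column -> Series
```

With a real file, the first argument to pd.read_csv would be the file path instead of the StringIO object.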
- Import Series with the read_csv Function1 question
- Coding Exercise Solution: Import Series with the read_csv Function00:28
See the solutions to the coding exercise in the previous lesson.
- Use the head and tail Methods to Return Rows from Beginning and End of Dataset07:07
In this lesson, we learn the head and tail methods for returning a specified number of rows from the beginning and end of a Series.
- The head and tail Methods1 question
- Coding Exercise Solution: The head and tail Methods00:46
See the solutions to the coding exercise in the previous lesson.
- Passing Series to Python Built-In Functions07:29
See how the Series interacts with Python's built-in functions including len, type, sorted, list, dict, max, and min. pandas works seamlessly with all of them.
- The sort_values Method04:46
In this lesson, we invoke the sort_values method to sort a Series in ascending or descending order.
- The sort_values Method1 question
- Coding Exercise Solution: The sort_values Method00:35
See the solutions to the coding exercise in the previous lesson.
- The sort_index Method04:28
Call the .sort_index() method on a pandas Series to sort it by the index instead of its values.
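A quick sketch contrasting the two sorting methods on a sample Series. Both return a new Series and leave the original untouched.

```python
import pandas as pd

stock = pd.Series([30, 10, 20], index=["c", "a", "b"])

by_value = stock.sort_values()                       # smallest value first
by_value_desc = stock.sort_values(ascending=False)   # largest value first
by_label = stock.sort_index()                        # ordered by index label
```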
- The sort_index Method1 question
- Coding Exercise Solution: The sort_index Method00:25
See the solution to the coding challenge in the previous lesson.
- Check for Inclusion with Python's in Keyword05:12
We can use Python's in keyword to check if a value exists in either the values or index of a Series. By default, pandas will search for the value in the Series index.
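To make the default concrete, here is a small sketch with sample data. Checking against the values requires going through the values attribute.

```python
import pandas as pd

stock = pd.Series([100, 200], index=["x", "y"])

label_found = "x" in stock            # searches the index
value_in_index = 100 in stock         # 100 is a value, not an index label
value_found = 100 in stock.values     # searches the values
```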
- Check for Inclusion with Python's in Keyword1 question
- Coding Exercise Solution: Check for Inclusion with Python's in Keyword00:49
See the answers to the coding challenge in the previous lesson.
- Extract Series Values by Index Position08:27
In this lesson, we walk through how to use square bracket notation to extract one or more Series values by their index position. The index position represents the order of the row within the Series.
- Extract Series Values by Index Label04:44
In this lesson, we explore how to use bracket notation to extract one or more values from a Series by their index labels.
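A sketch of both kinds of extraction, using made-up data. With the default numeric index, labels and positions coincide. With custom labels, this sketch uses the iloc accessor for positions, since recent pandas versions discourage relying on plain brackets for positional lookups in that case.

```python
import pandas as pd

# Default index: the labels 0, 1, 2, 3 match the row positions.
scores = pd.Series([88, 92, 79, 65])
first_score = scores[0]
two_scores = scores[[1, 2]]

# Custom string labels.
prices = pd.Series([8.99, 6.50, 4.25], index=["burger", "salad", "soup"])
salad_price = prices["salad"]               # one label
two_prices = prices[["burger", "soup"]]     # a list of labels
last_price = prices.iloc[-1]                # explicit position-based access
```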
- Extract Series Values by Index Position or Index Label1 question
- Coding Exercise Solution: Extract Series Values by Index Position or Index Label00:31
See the solutions to the coding exercise in the previous lesson.
- The get Method08:21
In this lesson, we explore an alternative approach to extracting one or more values from a Series by index position or index label. The get method accepts the key to search for in the index as well as a fallback value to return if the key is not found.
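A brief sketch of the fallback behavior with sample data. Without a fallback argument, a missing key yields None rather than raising an error.

```python
import pandas as pd

prices = pd.Series([8.99, 6.50], index=["burger", "salad"])

found = prices.get("salad")             # key exists
fallback = prices.get("pizza", 0.0)     # key missing, fallback returned
missing = prices.get("pizza")           # key missing, no fallback given
```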
- Overwrite a Series Value05:25
In this lesson, we use square bracket syntax to overwrite a Series value at a given index position/index label.
- The copy Method11:56
In this lesson, we discuss the differences between copies and views. We also learn how to use the copy method to create a distinct, independent clone of an existing pandas object.
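The key property of a copy can be sketched in a few lines: changing the clone leaves the original alone.

```python
import pandas as pd

original = pd.Series([1, 2, 3])
clone = original.copy()    # an independent object with the same data

clone.iloc[0] = 99         # modify the clone only
```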
- The inplace Parameter13:29
In this lesson, we learn about the inplace parameter, which "modifies" an object inplace. As we find out, this is a bit of an illusion as pandas still creates a copy of the original object behind the scenes.
- Math Methods on Series Objects04:24
In this lesson, we practice invoking common mathematical methods including count, sum, mean, and std on Series objects.
- Broadcasting04:00
In this lesson, we introduce the syntax for broadcasting, which allows us to apply the same mathematical operation to each Series value.
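For example, a single operator is enough to apply an operation to every value. The sketch below uses arbitrary sample numbers.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0])

plus_ten = values + 10        # adds 10 to each value
doubled = values * 2          # multiplies each value by 2
also_plus_ten = values.add(10)   # method equivalent of the + operator
```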
- Use the value_counts Method to See Counts of Unique Values within a Series06:20
In this lesson, we learn the value_counts method. It returns a Series that counts the number of the times each unique value occurs in a Series.
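A small sketch with made-up categories. The unique values become the index of the result, and the most frequent value comes first.

```python
import pandas as pd

types = pd.Series(["fire", "water", "fire", "grass", "fire", "water"])

counts = types.value_counts()
```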
- The value_counts Method1 question
- Coding Exercise Solution: The value_counts Method00:44
See the solutions to the coding exercise in the previous lesson.
- Use the apply Method to Invoke a Function on Every Series Value07:37
In this lesson, we use the apply method to invoke a function on every Series value. The method returns a new Series with the resulting calculations (i.e. return values).
- The map Method07:54
In this lesson, we use the map method to associate Series values to other values based on a mapping object. We practice with a Python dictionary and another Series object.
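To contrast the two methods, here is a sketch with sample data: apply runs a function on each value, while map looks each value up in a mapping object (a dictionary or another Series).

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])
squared = numbers.apply(lambda value: value ** 2)

sizes = pd.Series(["S", "M", "L"])
names = {"S": "small", "M": "medium", "L": "large"}

from_dict = sizes.map(names)               # dictionary as the mapping
from_series = sizes.map(pd.Series(names))  # Series as the mapping (index = keys)
```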
- Completed Jupyter Notebook for this Section00:02
Download the completed Jupyter Notebook for the course section.
- A Review of the Series Module7 questions
Review the pandas Series concepts you explored in this module with this action-packed quiz!
- Intro to DataFrames I Module09:48
In this section, we'll be playing around with the 2-dimensional DataFrame object. We define what dimensions are and also introduce the nba.csv dataset.
- Methods and Attributes between Series and DataFrames14:08
The Series and DataFrame objects share many attributes and methods in common. In this lesson, we'll review attributes like index, values, shape, and dtypes and see how they return different results depending on the object they're invoked on. We also introduce exclusive attributes like columns and hasnans.
- Differences between Shared Methods06:01
Series and DataFrames may share attributes and methods but they are still different objects. In this lesson, we'll see how methods like sum operate differently depending on the object they are called on.
- Select One Column from a DataFrame06:28
We can use two syntactical options to extract a single column from a DataFrame: dot syntax and square brackets. I prefer the square bracket approach because it works 100% of the time.
- Select One Column from a DataFrame1 question
- Coding Exercise Solution: Select One Column from a DataFrame00:28
See the solutions to the coding exercise in the previous lesson.
- Select Two or More Columns from a DataFrame03:50
In this lesson, we'll select two or more columns from a pandas DataFrame. We'll still need bracket syntax to extract but now we'll include a Python list to specify the specific columns we'd like to pull out. The result will be a new DataFrame.
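A sketch of both selections on a tiny stand-in for the nba dataset (the rows and column names here are invented). One column name in brackets gives a Series; a list of names gives a DataFrame.

```python
import pandas as pd

# Invented sample rows standing in for nba.csv.
nba = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Team": ["Team X", "Team Y"],
    "Salary": [5000000, 2000000],
})

names = nba["Name"]                # single column -> Series
subset = nba[["Name", "Salary"]]   # list of columns -> DataFrame
```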
- Select Two or More Columns from a DataFrame1 question
- Coding Exercise Solution: Select Two or More Columns from a DataFrame00:29
See the answers to the coding challenge in the previous lesson.
- Add New Column to DataFrame07:16
We can use square bracket syntax to add a new column on the right end of a DataFrame and populate it with values. In this lesson, we also introduce the insert method to add a column at a specified column index.
- Create New Column from Existing Column10:18
In this lesson, we apply mathematical operations to columns in our nba DataFrame and add the new Series as columns to the DataFrame.
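The column-adding techniques from this lesson and the previous one might be sketched as follows, on invented sample rows. Note that insert changes the DataFrame directly instead of returning a new one.

```python
import pandas as pd

nba = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Salary": [5000000, 2000000],
})

nba["League"] = "NBA"                     # new column at the right end
nba.insert(1, "Sport", "Basketball")      # new column at column position 1
nba["Salary in Millions"] = nba["Salary"] / 1000000   # derived from a column
```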
- A Review of the value_counts Method03:24
Refresh your memory on the value_counts Series method, which counts the number of times each unique value occurs within the Series.
- Drop DataFrame Rows with Null Values with the dropna Method08:48
Pandas marks null/absent values with the NaN designation. In this lesson, we'll learn how to delete rows with NaN values with the dropna method. We'll also modify the method arguments to customize which rows will be removed.
- Delete DataFrame Rows with Missing Values1 question
- Coding Exercise Solution: Delete DataFrame Rows with Missing Values00:33
See the solutions to the coding exercise in the previous lesson.
- Fill in Missing DataFrame Values with the fillna Method08:51
One alternative to dropping null values is populating them with a predefined value. In this lesson, we'll call the .fillna() method to accomplish this. We'll practice the method on both DataFrame and Series objects.
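A sketch of the two strategies for missing data, dropping and filling, on a small invented DataFrame. The how parameter of dropna controls whether a row is removed when any value is missing or only when all values are.

```python
import pandas as pd

staff = pd.DataFrame({
    "Name": ["Ann", "Bob", None],
    "Salary": [100.0, None, None],
})

no_gaps = staff.dropna()                  # drop rows with any missing value
not_empty = staff.dropna(how="all")       # drop rows where every value is missing
filled_salary = staff["Salary"].fillna(0) # replace missing salaries with 0
```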
- The astype Method I07:48
In this lesson, we convert the data types in our nba columns using the astype method. We also review how to overwrite a DataFrame column with a Series of new data values.
- The astype Method II09:37
In this lesson, we introduce the category data type, which is optimal when there is a small number of unique values in a Series.
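A short sketch of astype conversions, using sample values. The category type stores each unique value once, which is why it suits columns with few distinct values.

```python
import pandas as pd

teams = pd.Series(["Lakers", "Celtics", "Lakers", "Lakers", "Celtics"])
as_category = teams.astype("category")

# Converting floats to integers works the same way.
ages = pd.Series([25.0, 31.0]).astype(int)
```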
- The astype Method1 question
- Coding Exercise Solution: The astype Method00:32
See the solutions to the coding challenge in the previous lesson.
- Sort a DataFrame with the sort_values Method, Part I08:54
We can use the sort_values method to sort a DataFrame by one or more columns. In this lesson, we explore the method in depth including how to customize the sort order.
- Sort a DataFrame with the sort_values Method, Part II10:07
In this lesson, we'll explore additional parameters to the sort_values method to sort a DataFrame by values across multiple columns. We'll also cover how to specify different sort orders (ascending vs. descending) for different columns.
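As an illustration with invented data, passing lists to sort_values sorts by several columns, each with its own direction.

```python
import pandas as pd

players = pd.DataFrame({
    "Team": ["B", "A", "A", "B"],
    "Salary": [1, 2, 3, 4],
})

# Team ascending, then Salary descending within each team.
ordered = players.sort_values(["Team", "Salary"], ascending=[True, False])
```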
- The sort_values Method on a DataFrame v3 1 question
- Coding Exercise Solution: The sort_values Method on a DataFrame00:40
See the solutions to the coding exercise in the previous lesson.
- Sort DataFrame Index with the sort_index Method03:37
In this lesson, we review the sort_index method, which sorts the DataFrame by its index positions/labels.
- Rank Series Values with the rank Method06:07
We can rank Series values from lowest to greatest (or vice versa) with the rank method. In this lesson, we use this method to rank the salaries of the players in our nba dataset.
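A minimal sketch with sample salaries. With ascending=False, the largest value receives rank 1.

```python
import pandas as pd

salaries = pd.Series([50, 90, 70])

ranks = salaries.rank(ascending=False)   # ranks are returned as floats
```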
- Completed Jupyter Notebook for this Section00:02
Download the completed Jupyter Notebook for the course section.
- A Review of the DataFrames I Module3 questions
- This Module's Dataset + Memory Optimization15:51
In this lesson, we create the Jupyter Notebook for our new section, our second focusing on the 2D DataFrame object. The focus of this module is filtering data or, in other words, how we extract rows based on one or more conditions. We also introduce the employees.csv dataset that we'll be working with.
- Filter a DataFrame Based on A Condition12:57
In this lesson, we'll filter rows from the employees DataFrame based on a single condition. The logic involves creating a Boolean Series of True and False values, then passing it in square brackets after our DataFrame.
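The two-step logic can be sketched on a small invented stand-in for the employees dataset: build the Boolean Series first, then pass it in square brackets.

```python
import pandas as pd

# Invented sample rows standing in for employees.csv.
employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Team": ["Finance", "Legal", "Finance", "Sales"],
    "Salary": [90, 60, 50, 80],
})

in_finance = employees["Team"] == "Finance"   # Boolean Series
finance = employees[in_finance]               # rows where the mask is True
```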
- Filter a DataFrame Based on a Condition1 question
- Coding Exercise Solution: Filter a DataFrame Based on A Condition00:44
See the solutions to the coding exercise in the previous lesson.
- Filter DataFrame with More than One Condition (AND - &)04:41
In this lesson, we'll explore more complex row filtering based on multiple conditions. The syntax requires some additional symbols (&) to specify that we want to check the truthiness of multiple conditions.
- Filter DataFrame with More than One Condition (AND - &)1 question
- Coding Exercise Solution: Filter DataFrame with More than One Condition (AND)00:54
See the solutions to the coding exercise in the previous lesson.
- Filter DataFrame with More than One Condition (OR - |)08:35
In this lesson, we'll continue filtering rows from the DataFrame based on multiple conditions. However, this time we'll use a new symbol ( | ) to specify an OR check. This requires only one of the tested conditions to evaluate to True in order to include the row.
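A sketch of both combinations on invented sample rows. Each condition is wrapped in parentheses because & and | bind more tightly than comparison operators in Python.

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Team": ["Finance", "Legal", "Finance", "Sales"],
    "Salary": [90, 60, 50, 80],
})

in_finance = employees["Team"] == "Finance"
well_paid = employees["Salary"] >= 80
in_legal = employees["Team"] == "Legal"

both = employees[in_finance & well_paid]   # AND: every condition must hold
either = employees[in_legal | well_paid]   # OR: at least one must hold
```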
- Filter DataFrame with More than One Condition (OR - |)1 question
- Coding Exercise Solution: Filter DataFrame with More than One Condition (OR)00:54
See the solutions to the coding exercise in the previous lesson.
- Check for Inclusion with the isin Method06:17
One common problem in data analysis is extracting rows whose values fall within a collection of values. Instead of writing multiple OR statements, we can use the isin method and pass in a list of values to match against.
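For instance, on invented sample rows, one isin call replaces a chain of OR conditions:

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Team": ["Finance", "Legal", "Finance", "Sales"],
})

mask = employees["Team"].isin(["Legal", "Sales"])   # Boolean Series
matches = employees[mask]
```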
- Check for Inclusion with the isin Method1 question
- Coding Exercise Solution: Check for Inclusion with the isin Method00:49
See the solution to the previous coding challenge.
- Check for Null and Present DataFrame Values with the isnull and notnull Methods05:07
Call the .isnull() and .notnull() methods to create Boolean Series for extracting rows with null or non-null values. Both methods return a Boolean Series object, which can be passed within square brackets after the DataFrame to filter it.
- Check For Inclusion Within a Range of Values with the between Method06:51
In this lesson, we'll learn the between method, which extracts rows where a column value falls between a range of values. The between method returns a Boolean Series, which can be passed within square brackets after the DataFrame to filter it.
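A sketch with invented salaries. By default, both endpoints of the range are included.

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Salary": [90, 60, 50, 80],
})

in_range = employees["Salary"].between(60, 80)   # 60 and 80 both count
mid_earners = employees[in_range]
```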
- The between Method1 question
- Coding Exercise Solution: The between Method00:53
See the solution to the coding challenge in the previous lesson.
- Check for Duplicate DataFrame Rows with the duplicated Method09:05
Call the .duplicated() method to create a Boolean Series and use it to extract rows that have duplicate values. This is another example of a method that returns a Boolean Series object, which can be passed within square brackets after the DataFrame to filter it.
- Delete Duplicate DataFrame Rows with the drop_duplicates Method08:16
An alternative option to identifying duplicate rows and removing them through filtering is the .drop_duplicates() method. In this lesson, we'll invoke the method to remove rows with duplicate values in a DataFrame. We'll also provide custom arguments to modify how the method operates.
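The two approaches might be compared like this, on invented rows. The subset parameter limits the duplicate check to chosen columns, and keep decides which occurrence survives.

```python
import pandas as pd

roster = pd.DataFrame({
    "Team": ["A", "B", "A"],
    "Name": ["x", "y", "z"],
})

is_repeat = roster["Team"].duplicated()   # True for repeats after the first

first_kept = roster.drop_duplicates(subset=["Team"])               # keep first
last_kept = roster.drop_duplicates(subset=["Team"], keep="last")   # keep last
```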
- Identify and Count Unique Values with the unique and nunique Methods04:22
In this lesson, we introduce the unique and nunique methods, which extract the unique values and a count of the unique values in a Series.
- Intro to the DataFrames III Module + Import Dataset04:55
In this lesson, we introduce the third DataFrame-focused section of the course. The upcoming lessons cover how to:
set and reset an index in a DataFrame
retrieve DataFrame rows by index position or index label
set new values for one or more cells in the DataFrame
rename or delete rows or columns
extract a random sample of rows / columns
and more!
- Use the set_index and reset_index methods to define a new DataFrame index07:26
Pandas will default to assigning a data structure a numeric index starting at 0. In this lesson, we'll explore how we can use the set_index and reset_index methods to customize and reset the index labels of a DataFrame object.
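A sketch of the round trip on a tiny invented films table: set_index promotes a column to the index, and reset_index moves it back into a regular column.

```python
import pandas as pd

# Invented sample rows.
films = pd.DataFrame({
    "Film": ["Film A", "Film B"],
    "Year": [1999, 2005],
})

by_film = films.set_index("Film")    # the Film column becomes the index
restored = by_film.reset_index()     # back to the default numeric index
```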
- Retrieve Rows by Index Label with loc Accessor12:42
In this lesson, we'll use the loc[] accessor to retrieve DataFrame rows based on index label. We also look at providing multiple index labels within a list.
- Retrieve Rows by Index Position with iloc Accessor07:23
In this lesson, we'll use the .iloc[] accessor to retrieve DataFrame rows based on index position. We also look at providing multiple index positions within a list.
- Passing second arguments to the loc and iloc Accessors09:10
The .loc[] and .iloc[] accessors can take second arguments to specify the column(s) that should be extracted. In this lesson, we'll practice extracting movies from our dataset with this syntax.
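The accessor syntax from the last three lessons, sketched on invented data (the titles and ratings are placeholders): the first argument selects rows, the optional second argument selects columns.

```python
import pandas as pd

films = pd.DataFrame(
    {"Year": [1999, 2005, 2012], "Rating": [7.7, 7.8, 7.2]},
    index=["Film A", "Film B", "Film C"],
)

row = films.loc["Film B"]                     # one row by label
two_rows = films.loc[["Film C", "Film A"]]    # several rows by label
first_row = films.iloc[0]                     # one row by position

year = films.loc["Film B", "Year"]            # row label + column label
rating = films.iloc[0, 1]                     # row position + column position
```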
- Set New Value for a Specific Cell or Cells In a Row04:34
In this lesson, we'll discuss how to assign a new value to one cell in a DataFrame. We first extract the cell value by using the .ix[] method with a row and column argument, then reset its value with the assignment operator (=).
- Set Multiple Values in a DataFrame06:08
In this lesson, we explore how we can overwrite multiple values in a DataFrame by passing a Boolean Series to the loc accessor. We also discuss how we can accidentally overwrite values on a slice of data rather than the original DataFrame itself.
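A sketch of single-cell and multi-cell assignment on invented rows. It uses the loc accessor throughout, because the older .ix[] accessor was removed in pandas 1.0. Assigning through loc on the DataFrame itself also avoids writing to a temporary slice by accident.

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Team": ["Finance", "Legal", "Finance", "Sales"],
    "Salary": [90, 60, 50, 80],
})

# One cell: row label 1, column "Salary".
employees.loc[1, "Salary"] = 65

# Many cells: every row where the Boolean Series is True.
in_finance = employees["Team"] == "Finance"
employees.loc[in_finance, "Team"] = "Accounting"
```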
- Rename Index Labels or Columns in a DataFrame09:33
In this lesson, we invoke the rename method on a DataFrame to change the names of the index labels or column names. We can either combine the mapper and axis parameters, or target the columns and index parameters exclusively. In either case, we provide an argument of a dictionary where the keys represent the current label names and the values represent the desired label names.
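Both calling styles, sketched on invented data. Each call returns a new DataFrame.

```python
import pandas as pd

films = pd.DataFrame({"Year": [1999, 2005]}, index=["Film A", "Film B"])

# Style 1: mapper + axis.
via_mapper = films.rename(mapper={"Year": "Release Year"}, axis="columns")

# Style 2: the columns and index parameters.
via_columns = films.rename(columns={"Year": "Release Year"})
via_index = films.rename(index={"Film B": "Film Z"})
```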
- Delete Rows or Columns from a DataFrame07:29
In this lesson, we practice 3 different syntactical options to delete rows or columns from a DataFrame. They include the .drop() method, the .pop() method, and Python's built-in del keyword.
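The three options behave differently, as this sketch on arbitrary sample data tries to show: drop returns a new DataFrame, while pop and del change the existing one.

```python
import pandas as pd

table = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})

without_a = table.drop(columns=["A"])   # new DataFrame; table is unchanged
without_row = table.drop(index=[0])     # drop works on rows as well

popped = table.pop("B")                 # removes B from table and returns it
del table["C"]                          # removes C from table
```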
- Create Random Sample with the sample Method04:43
In this lesson, we'll call the .sample() method to pull out a random sample of rows or columns from a DataFrame. We'll specify the number of values to include by modifying the n parameter.
- Use the nsmallest / nlargest methods to get rows with smallest / largest values.05:36
There is a shortcut available to pull out the rows with the smallest or largest values in a column. Instead of sorting the rows and using the .head() method, we can call the .nsmallest() and .nlargest() methods. We'll dive into these methods and their parameters in this lesson.
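A sketch of the shortcut on invented rows. The first argument is the number of rows to return and the second is the column to rank by.

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cat", "Dan"],
    "Salary": [90, 60, 50, 80],
})

top_two = employees.nlargest(2, "Salary")
lowest = employees.nsmallest(1, "Salary")
```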
- Filter A DataFrame with the where method05:03
Sometimes, you'll want to retain the structure of the original DataFrame when you extract a subset. In this lesson, we'll call the .where() method to return a modified DataFrame that holds NaN values for all rows that don't match our provided condition.
- Filter A DataFrame with the query method09:07
Our filtration process so far has involved using official pandas syntax. In this lesson, I'll introduce the .query() method, an alternate string-based syntax for extracting a subset from a DataFrame.
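A brief sketch of the string-based syntax beside the Boolean-mask equivalent, using hypothetical data.

```python
import pandas as pd

# Hypothetical data for illustration.
films = pd.DataFrame(
    {
        "Film": ["Dr. No", "Goldfinger", "Thunderball", "Moonraker"],
        "Actor": ["Sean Connery", "Sean Connery", "Sean Connery", "Roger Moore"],
        "Year": [1962, 1964, 1965, 1979],
    }
)

# The condition is written as a string; column names act like variables.
via_query = films.query("Actor == 'Sean Connery' and Year > 1962")

# Equivalent filter with Boolean Series.
via_mask = films[(films["Actor"] == "Sean Connery") & (films["Year"] > 1962)]
```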
- A Review of the apply Method on a pandas Series Object05:53
In this review of a lesson from our Series Module, we'll call the .apply() method on a Series to apply a Python function on every value within it. This will act as a foundation for the next lesson, where we'll invoke the same method on a DataFrame.
- Apply a Function to every DataFrame Row with the apply Method06:49
The .apply() method applies a Python function on a row-by-row basis in a DataFrame. In this example, we'll create a custom ranking function for our films, then demonstrate how it can be applied to a DataFrame.
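One possible shape for such a ranking function is sketched below. The function name, thresholds, and data are assumptions for illustration, not the course's own code.

```python
import pandas as pd

# Hypothetical data for illustration.
films = pd.DataFrame(
    {
        "Film": ["Goldfinger", "A View to a Kill", "Casino Royale"],
        "Year": [1964, 1985, 2006],
        "Rating": [7.7, 6.3, 8.0],
    }
)

def rank_film(row):
    """Receive one row as a Series and return a label for it."""
    if row["Year"] < 1980 and row["Rating"] >= 7:
        return "Classic"
    if row["Rating"] >= 7:
        return "Great"
    return "Skip"

# axis="columns" hands the function one row at a time.
films["Verdict"] = films.apply(rank_film, axis="columns")
```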
- Create a Copy of a DataFrame with the copy Method07:05
The default bracket syntax extracts a component of the larger DataFrame. Any operations on that component will affect the larger DataFrame. If we want to separate the two objects, we can use the .copy() method, which creates an independent copy of a pandas object.
- Intro to the Working with Text Data Section06:09
Datasets can arrive with plenty of improperly formatted text data. The Working with Text Data section introduces the methods available in pandas to clean your data. In this introductory lesson, we create a Jupyter Notebook for this section and import a CSV file with public data on employees in the city of Chicago. We also optimize the DataFrame for speed and efficiency.
- Common String Methods - lower, upper, title, and len07:14
String methods in pandas require a .str prefix to work properly. In this lesson, we introduce 4 popular string methods on Series:
str.lower to convert a string's characters to lowercase
str.upper to convert a string's characters to uppercase
str.title to capitalize the first letter of every word in a string
str.len to return a count of the number of characters in a string
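The four methods above can be sketched on a small hypothetical Series:

```python
import pandas as pd

# Hypothetical department names for illustration.
departments = pd.Series(["water mgmnt", "POLICE", "fire dept"])

lowered = departments.str.lower()
uppered = departments.str.upper()
titled = departments.str.title()
lengths = departments.str.len()
```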
- Common String Methods1 question
- Coding Exercise Solution: Common String Methods00:40
See the solution to the previous lesson's coding challenge.
- Use the str.replace method to replace all occurrences of character with another08:07
The str.replace() method replaces a substring within a string with another value for all Series values. In this lesson, we use it to convert our Employee Annual Salary column to store numeric values instead of text ones.
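A minimal sketch of that conversion, assuming salary strings shaped like "$90,744.00" (the exact format in the course dataset may differ):

```python
import pandas as pd

# Hypothetical salary strings for illustration.
salaries = pd.Series(["$90,744.00", "$84,450.00"])

# regex=False treats "$" as a literal character rather than a regex anchor.
numeric = (
    salaries.str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(float)
)
```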
- Filter a DataFrame's Rows with String Methods06:43
In this lesson, we'll introduce the .str.contains(), .str.startswith(), and .str.endswith() methods. All three create a Boolean Series, which can be used to extract rows from a DataFrame. We'll also discuss case normalization to increase the accuracy of our results.
- More DataFrame String Methods - strip, lstrip, and rstrip04:31
In this lesson, we'll invoke the .str.strip() family of methods to remove leading and trailing whitespace from strings in a Series. The .str.lstrip() method removes whitespace from the left side (beginning) of a string, the .str.rstrip() method removes whitespace from the right side (end) of a string, and the .str.strip() method does both.
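The difference between the three methods is easiest to see on a padded string. A small sketch with a hypothetical value:

```python
import pandas as pd

# Hypothetical value with two spaces of padding on each side.
padded = pd.Series(["  POLICE  "])

left_stripped = padded.str.lstrip()   # leading whitespace removed
right_stripped = padded.str.rstrip()  # trailing whitespace removed
both_stripped = padded.str.strip()    # both sides removed
```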
- Invoke String Methods on DataFrame Index and Columns05:30
The past few lessons focused on calling string methods on the values in a column of our dataset. In this lesson, we'll familiarize ourselves with calling the same string methods on the index labels and column names of a DataFrame.
- Split Strings by Characters with the str.split Method08:41
Strings can often contain multiple pieces of information that are separated by a common delimiter. In this lesson, we'll introduce the .str.split() method, which can split a string value based on an occurrence of a user-specified value. This is equivalent to the Text to Columns feature in Microsoft Excel.
- More Practice with the str.split method on a Series06:01
In this lesson, we'll utilize additional parameters on the .str.split() method to modify its performance. We'll extract the first names of all the employees in our dataset, a slightly more challenging puzzle than the one in the previous lesson.
- Exploring the expand and n Parameters of the str.split Method07:00
In this lesson, we'll explore even more parameters on the .str.split() method. The expand parameter allows us to expand the generated Python list into DataFrame columns while the n parameter limits the total number of splits.
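A sketch of both parameters together, assuming names stored as "LAST, FIRST MIDDLE" (a hypothetical format chosen for illustration):

```python
import pandas as pd

# Hypothetical employee names for illustration.
names = pd.Series(["SMITH, JOHN A", "DOE, JANE"])

# n=1 stops after the first split; expand=True spreads the pieces
# across DataFrame columns instead of returning Python lists.
parts = names.str.split(", ", n=1, expand=True)

# First name: split the second column on spaces and take the first piece.
first_names = parts[1].str.split(" ").str.get(0)
```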
- Intro to the MultiIndex Module04:50
A DataFrame or Series can hold multiple levels or layers in its index. The object that stores this index is called a MultiIndex. In this lesson, we create a Jupyter Notebook for this section and explore a new bigmac.csv dataset.
- Create a MultiIndex on a DataFrame with the set_index Method10:36
In this lesson, we'll create a multi-layer MultiIndex on a DataFrame with the .set_index() method. The method can be passed a list instead of a string to transfer multiple columns to the index.
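A minimal sketch of passing a list to .set_index(). The column names and prices below are hypothetical stand-ins for the bigmac.csv data.

```python
import pandas as pd

# Hypothetical rows standing in for bigmac.csv.
bigmac = pd.DataFrame(
    {
        "Date": ["2016-01", "2016-01", "2015-07"],
        "Country": ["Argentina", "Australia", "Argentina"],
        "Price": [2.39, 3.74, 3.07],
    }
)

# A list moves several columns into the index, outermost level first.
indexed = bigmac.set_index(["Date", "Country"])

# Levels can then be read back by name or by position.
countries = indexed.index.get_level_values("Country")
dates = indexed.index.get_level_values(0)
```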
- Create a MultiIndex on a DataFrame1 question
- Coding Exercise Solution: Create a MultiIndex on a DataFrame00:39
See the solution to the previous lesson's coding challenge.
- Extract Index Level Values with the get_level_values Method04:20
The index attribute returns the underlying object that makes up the index of a DataFrame. In this lesson, we invoke the get_level_values method on the index to extract the values from one of its levels. We show how this can be done either by the layer's index position or by its name.
- Extract Index Level Values with the get_level_values Method1 question
- Coding Exercise Solution: Extract Index Level Values with the get_level_values M00:49
See the solution to the previous lesson's coding challenge.
- Change Index Level Name with the set_names Method04:15
The levels or layers of a MultiIndex can be changed. In this lesson, we'll call the .set_names() method on a MultiIndex object to rename its levels.
- The sort_index Method on a MultiIndex DataFrame08:24
In this lesson, we explore how the sort_index method operates on a MultiIndex DataFrame. We show how to sort all levels in the same order as well as how to vary up the sort order for different levels.
- Extract Rows from a MultiIndex DataFrame10:59
In this lesson, we review the familiar .loc[] and .iloc[] accessors for extracting rows from a MultiIndex DataFrame. We discuss how to package up multiple level values within a tuple to be more precise in communicating what we want to extract.
- Extract Rows from a MultiIndex DataFrame1 question
- Coding Exercise Solution: Extract Rows from a MultiIndex DataFrame00:47
See the solution to the previous lesson's coding challenge.
- The transpose Method on a MultiIndex DataFrame08:16
In this lesson, we invoke the transpose method on a MultiIndex DataFrame to swap its row and column axes. We then discuss how to use the loc accessor to extract a column from a MultiIndex column index.
- The swaplevel Method03:29
The swaplevel method swaps two levels within a MultiIndex. In this lesson, we practice moving around levels in the bigmac dataset. If the MultiIndex consists of only two levels, no additional arguments are required.
- The stack Method06:01
The .stack() method stacks an index from the column axis to the row axis. It essentially transfers the columns to the row index. In this lesson, we'll see a live example on our bigmac dataset.
- The unstack Method, Part 103:38
The .unstack() method does the exact opposite of the .stack() method. It moves an index level from the rows to the columns. In this lesson, we'll call the method without any arguments.
- The unstack Method, Part 206:09
In this lesson, we'll continue our exploration of the .unstack() method. We'll introduce the types of arguments it accepts, including positive integers, negative integers, and index level names.
- The unstack Method, Part 305:09
Multiple levels of the row-based MultiIndex can be shifted with the .unstack() method. In this lesson, we'll explore how to provide a list argument to the level parameter to move multiple layers at a time. We'll also introduce the fill_value parameter to plug in missing values in the resulting DataFrame.
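The level and fill_value arguments from the unstack lessons can be sketched on a small two-level DataFrame. The data is hypothetical, and the list form of level is not shown here.

```python
import pandas as pd

# Hypothetical two-level index standing in for the bigmac dataset.
index = pd.MultiIndex.from_tuples(
    [("2015", "Argentina"), ("2016", "Argentina"), ("2016", "Australia")],
    names=["Date", "Country"],
)
bigmac = pd.DataFrame({"Price": [3.07, 2.39, 3.74]}, index=index)

# Move the Country level to the columns by name; missing
# Date/Country combinations are filled with 0 instead of NaN.
wide = bigmac.unstack(level="Country", fill_value=0)

# The same level addressed by position (-1 is the innermost level).
wide_by_position = bigmac.unstack(level=-1, fill_value=0)
```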
- The pivot Method06:34
In this lesson, we'll reorganize the unique values in a DataFrame column as the column headers with the pivot method. This can be a particularly effective method for shortening the length of the DataFrame.
- Use the pivot_table method to create an aggregate summary of a DataFrame10:16
In this lesson, we'll emulate Excel's Pivot Table functionality with the pivot_table method. We'll explore the values, index, columns, and aggfunc parameters. We'll also discuss the variety of aggregation functions that we can use including sum, count, max, and min.
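A small sketch of the four parameters working together, using hypothetical sales data:

```python
import pandas as pd

# Hypothetical sales records for illustration.
sales = pd.DataFrame(
    {
        "Salesman": ["Bob", "Bob", "Dave", "Dave"],
        "City": ["NY", "LA", "NY", "NY"],
        "Revenue": [100, 200, 50, 75],
    }
)

# One row per Salesman, one column per City, each cell the summed Revenue.
summary = sales.pivot_table(
    values="Revenue", index="Salesman", columns="City", aggfunc="sum"
)
```

Combinations with no matching rows (Dave in LA here) come back as NaN.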
- Use the pd.melt method to create a narrow dataset from a wide one05:59
The pd.melt method effectively performs anti-pivot operations. In this lesson, we'll call the method on a DataFrame to convert its current data structure into a more tabular format. We'll also explore the optional parameters available to modify the resulting column names in the new DataFrame.
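To illustrate the wide-to-narrow conversion, here is a sketch with hypothetical quarterly figures. The var_name and value_name parameters control the names of the two generated columns.

```python
import pandas as pd

# Hypothetical wide dataset: one column per quarter.
wide = pd.DataFrame(
    {"Salesman": ["Bob", "Dave"], "Q1": [10, 20], "Q2": [30, 40]}
)

# id_vars stays as-is; every other column is folded into two columns.
narrow = pd.melt(
    wide, id_vars="Salesman", var_name="Quarter", value_name="Revenue"
)
```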
- The pd.melt Method1 question
- Coding Exercise Solution: The pd.melt Method00:54
See the solution to the previous lesson's coding challenge.
- Intro to the GroupBy Module07:42
The pandas DataFrameGroupBy object allows us to create groupings of data based on common values in one or more DataFrame columns. In this lesson, we'll set up a new Jupyter Notebook in preparation for this module.
- First Operations with groupby Object09:33
The GroupBy object does not offer us much of substance until we call a method on it. In this lesson, we'll call the .first(), .last(), and .size() methods on a GroupBy object to gain a better understanding of its internal data structure.
- Retrieve a group from a GroupBy object with the get_group Method03:47
The .get_group() method extracts a grouping from a GroupBy object. In this lesson, we'll practice pulling out a few groups from our companies dataset.
- Methods on the Groupby Object and DataFrame Columns08:41
Aggregation methods allow us to perform calculations on all groupings within a GroupBy object. In this lesson, we'll call some mathematical methods on the groups, including the .sum(), .mean(), and .max() methods.
- Grouping by Multiple Columns04:35
A GroupBy object does not have to be made up of values from a single column. In this lesson, we'll create a new GroupBy object based on unique value combinations from two of our DataFrame columns.
- The agg Method06:11
Certain situations may require different aggregation methods on different columns within our groupings. In this lesson, we'll invoke the .agg() method on our GroupBy object to apply a different aggregation operation to each inner column.
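A minimal sketch of per-column aggregation, assuming a hypothetical companies dataset with Sector, Revenue, and Profits columns:

```python
import pandas as pd

# Hypothetical company data for illustration.
companies = pd.DataFrame(
    {
        "Sector": ["Tech", "Tech", "Retail"],
        "Revenue": [10, 20, 5],
        "Profits": [1, 3, 2],
    }
)

# Dictionary keys are column names; values are the operation for each.
summary = companies.groupby("Sector").agg({"Revenue": "sum", "Profits": "mean"})
```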
- Iterating through Groups09:04
A standard Python for loop can be used to iterate over the groups in a pandas GroupBy object. In this lesson, we'll loop over all of our groupings to extract selected rows from each inner DataFrame. We'll append these rows to a running DataFrame and then view the final result.
- Intro to the Merging, Joining, and Concatenating Section04:51
Welcome to the Merging, Joining, and Concatenating section! In this module, we'll cover how to combine data from multiple DataFrames into one. In this lesson, we create a new Jupyter Notebook and introduce the 4 CSV files that we will be using.
- The pd.concat Method, Part 105:20
The pd.concat method concatenates two or more DataFrames together. The process is simple when the DataFrames have an identical structure (i.e. the same column names). In this lesson, we also explore how to replace the merged index with a newly generated one.
- The pd.concat Method, Part 207:06
In this lesson, we use the keys parameter on the pd.concat method to label each concatenated DataFrame with a unique identifier. This parameter yields a MultiIndex DataFrame where the outermost layer holds the keys and the innermost layer holds each DataFrame's original index values.
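Both concatenation lessons can be sketched with two hypothetical weekly DataFrames: ignore_index builds a fresh index, while keys produces a MultiIndex.

```python
import pandas as pd

# Hypothetical weekly data with identical structure.
week1 = pd.DataFrame({"Customer": [1, 2]})
week2 = pd.DataFrame({"Customer": [3, 4]})

# Replace the repeated 0, 1, 0, 1 index with a fresh 0..3 index.
flat = pd.concat([week1, week2], ignore_index=True)

# Label each source DataFrame; the keys form the outer index level.
labeled = pd.concat([week1, week2], keys=["Week 1", "Week 2"])

# A tuple addresses one row: outer key first, original index second.
first_of_week2 = labeled.loc[("Week 2", 0), "Customer"]
```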
- Inner Joins, Part 109:18
An inner join merges the values in two DataFrames based on common values across one or more columns. In this lesson, we'll explore the concept by merging on identical values in a single column.
- Inner Joins, Part 209:01
This lesson continues our exploration of the .merge() method. This time, we'll merge the values in two DataFrames based on common values in multiple columns. We'll also validate the data with some filtering.
- Outer Joins12:23
An outer join combines values that exist in either DataFrame into a central DataFrame. In this lesson, we'll invoke the .merge() method with a modified argument to the how parameter to perform an outer join on our weekly sales data sets.
- Left Joins09:19
A left join establishes one of the DataFrames as the base dataset for the merge. It attempts to find each value in another DataFrame and drag over that DataFrame's rows when there's a value match. In this lesson, we'll practice executing this join with the .merge() method.
- The left_on and right_on Parameters08:54
DataFrames may come equipped with different names for columns that represent the same data. In this lesson, we'll talk about how to utilize the left_on and right_on parameters to specify how to match values in differently named columns across two DataFrames.
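A sketch of matching differently named columns, combined with a left join. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data: the key is "Customer ID" on one side and "ID" on the other.
sales = pd.DataFrame({"Customer ID": [1, 2, 3], "Food ID": [10, 20, 10]})
customers = pd.DataFrame({"ID": [1, 2, 4], "Name": ["Ann", "Ben", "Cy"]})

# Left join: every sales row is kept; unmatched rows get NaN for Name.
left_joined = sales.merge(
    customers, how="left", left_on="Customer ID", right_on="ID"
)

# Inner join (the default): only rows with a match on both sides survive.
inner_joined = sales.merge(customers, left_on="Customer ID", right_on="ID")
```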
- Merging by Indexes with the left_index and right_index Parameters11:02
Our merges so far have involved matches based on common column values. In this lesson, we'll explore how to merge DataFrames based on common index labels.
- The .join() Method03:15
Call the .join() method, a simple way to combine the columns of two DataFrames side by side when they share the same index. This is a shortcut to a more explicit .merge() method.
- The pd.merge() Method03:06
Call the pd.merge() method on the pandas library to merge two DataFrames. This is an alternate syntax to calling the .merge() method directly on a DataFrame.
- Intro to the Working with Dates and Times Module04:17
The Working with Dates and Times section offers a review of Python's built-in datetime objects as well as a comprehensive introduction to similar tools in the pandas library. In this lesson, we set up our Jupyter Notebook and import Python's datetime module.
- Review of Python's datetime Module09:31
Python includes built-in date and datetime objects for working with dates and times. This lesson offers a review of how we can create these objects as well as some of the attributes (.year, .month, .day etc) that are available on them.
- The pandas Timestamp Object07:15
The pandas library includes its own Timestamp object to represent moments in time. In this lesson, we'll use the pd.Timestamp() constructor method with a variety of inputs (strings, date objects, datetime objects) to create some Timestamp objects.
- The pandas DateTimeIndex Object05:23
A DatetimeIndex is a pandas object for storing multiple Timestamp objects. In this lesson, we'll create a few DatetimeIndex objects from Python lists.
- The pd.to_datetime() Method11:11
The pd.to_datetime() method is a convenience method to convert various inputs to pandas-focused objects. In this lesson, we'll pass a variety of inputs (date objects, datetime objects, strings, lists) to the constructor method to see what it returns.
- Create Range of Dates with the pd.date_range() Method, Part 110:22
Over the course of the next three lessons, we'll call the pd.date_range() method to generate a DatetimeIndex of Timestamp objects. This constructor method includes 3 critical parameters (start, end, and periods); we need to provide 2 of these 3 for it to function. In this lesson, we'll see how the pd.date_range() method operates with arguments for the start and end parameters.
- Create Range of Dates with the pd.date_range() Method, Part 209:04
In this lesson, we'll see how the pd.date_range() method operates with arguments for the start and periods parameters. This approach creates a set number of dates beginning from a specific point.
- Create Range of Dates with the pd.date_range() Method, Part 307:50
In this lesson, we'll see how the pd.date_range() method operates with arguments for the end and periods parameters. This approach creates a set number of dates, proceeding backwards from a specified date point. We'll also continue our exploration of the freq parameter to vary the durations between each Timestamp.
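The three parameter pairings from these lessons can be sketched side by side. The dates are arbitrary, and only the day-based "D" frequency alias is used here.

```python
import pandas as pd

# start + end: every day between the two dates, inclusive.
start_end = pd.date_range(start="2016-01-01", end="2016-01-05")

# start + periods: a fixed number of dates moving forward.
start_periods = pd.date_range(start="2016-01-01", periods=3, freq="D")

# end + periods: a fixed number of dates counted backwards from the end,
# here spaced two days apart.
end_periods = pd.date_range(end="2016-01-10", periods=3, freq="2D")
```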
- The .dt Accessor07:29
The .dt accessor on a Series of Timestamp objects allows us to access specific datetime properties, much like the .str accessor allows us to call specific methods on a Series of strings. In this lesson, we'll explore popular attributes like .day, .weekday_name, and .month.
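A short sketch of the accessor on hypothetical dates. The .weekday_name attribute was removed in pandas 1.0, so this example assumes the .day_name() method as its replacement.

```python
import pandas as pd

# Hypothetical dates for illustration.
dates = pd.Series(pd.to_datetime(["2016-01-01", "2016-02-15"]))

days = dates.dt.day
months = dates.dt.month

# day_name() is a method, so it needs parentheses.
weekdays = dates.dt.day_name()
```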
- Install pandas-datareader Library03:33
Upcoming lessons rely on the pandas-datareader library to fetch financial datasets from Yahoo Finance. In this lesson, we'll install the pandas-datareader library.
On a Mac system, open the Terminal. On a Windows machine, look for the Anaconda Prompt from the Start Menu.
Once the application is open, run the following commands.
conda activate followed by your environment name (for example, conda activate pandas_playground)
conda install pandas-datareader
- Fixing API Errors in Next Lesson00:30
- Import Financial Data Set with pandas_datareader Library07:55
In this lesson, we use our pandas_datareader library to fetch stock data for Microsoft. The result arrives in a DataFrame object with a DatetimeIndex.
- Selecting Rows from a DataFrame with a DateTimeIndex12:25
Extracting rows from a DataFrame with a DatetimeIndex is no different than in previous sections. In this lesson, we review the familiar .loc and .iloc accessors. As a reminder, these methods use a pair of square brackets to target one or more rows by either index label or index position.
- Timestamp Object Attributes and Methods09:41
In this lesson, we'll explore some of the attributes and methods on a pandas Timestamp object. We'll also practice extracting similar information from a complete DatetimeIndex of Timestamps.
- The pd.DateOffset Object06:50
In this lesson, we'll use the pd.DateOffset object to add hours, days, weeks, months, and years to each value in a DatetimeIndex.
- Timeseries Offsets12:31
In this lesson, we'll explore how we can use timeseries offsets to arrive at specific datetime values (such as the end of the month or the start of the year).
- The Timedelta Object08:22
Over the next two lessons, we'll explore the pandas Timedelta object which represents durations. A Timedelta represents a distance of time while a Timestamp represents a specific moment in time.
- Timedeltas in a Dataset09:30
In this lesson, we'll create a Series of Timedelta objects by calculating the duration differences between two columns of Timestamps. Time difference operations can be easily performed with the subtraction ( - ) sign.
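As a sketch of that subtraction, assuming hypothetical order and delivery date columns:

```python
import pandas as pd

# Hypothetical order data for illustration.
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(["2016-01-01", "2016-01-10"]),
        "delivery_date": pd.to_datetime(["2016-01-05", "2016-01-12"]),
    }
)

# Subtracting two Timestamp columns yields a Series of Timedeltas.
orders["shipping_time"] = orders["delivery_date"] - orders["order_date"]

# The .dt accessor exposes the whole-day component of each duration.
shipping_days = orders["shipping_time"].dt.days
```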
- Intro to the Input and Output Section01:21
Create a Jupyter Notebook for the Input and Output module. This collection of lessons focuses on importing and exporting various file formats into Jupyter Notebook and pandas.
- Pass a URL to the pd.read_csv Method04:04
In this lesson, we pass a URL string argument to the pd.read_csv method to fetch a public dataset of baby names from New York City.
- Quick Object Conversions07:02
In this lesson, we review how we can convert a Pandas Series to native Python objects including lists, dictionaries and strings.
- Export CSV File with the to_csv Method05:26
Export a DataFrame to a .csv file with the .to_csv() method. The method includes several parameters to modify the output, including what columns will be included and whether the index will be attached.
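A minimal sketch of the columns and index parameters. To avoid writing a file, no path is passed, in which case .to_csv() returns the CSV text as a string. The data is hypothetical.

```python
import pandas as pd

# Hypothetical baby-name data for illustration.
names = pd.DataFrame(
    {"Name": ["Ava", "Liam"], "Gender": ["F", "M"], "Count": [120, 95]}
)

# Export only two columns and leave the index out of the output.
csv_text = names.to_csv(index=False, columns=["Name", "Count"])
```

Passing a filename as the first argument (for example "names.csv") writes the same content to disk instead.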
- Install xlrd and openpyxl Libraries to Read and Write Excel Files04:03
In this lesson, we install the xlrd and openpyxl libraries that are needed to work with Excel Workbooks in Pandas.
To do so, access the Terminal (Mac) or Anaconda Prompt (Windows) and execute the following commands.
conda activate pandas_playground
conda install xlrd openpyxl
- Import Excel File into pandas with the read_excel Method09:22
In this lecture, we explore how to import an Excel workbook into pandas. We discuss how we can target worksheets by index position or by sheet name, as well as how to import all of the worksheets in a workbook.
- Export Excel File with the to_excel Method07:46
In this lesson, we write multiple DataFrames to an Excel workbook by creating an Excel Writer object. The lesson also covers how to target specific columns in the export.
- Input and Output5 questions
Test your knowledge of the concepts introduced in the Input and Output module in this quiz!
- Intro to Visualization Section04:48
Welcome to the Visualization section! This collection of lessons focuses on the pandas library's integration with the matplotlib library for plotting. Throughout the upcoming lessons, we'll be using matplotlib to render various plots, charts, and graphs.
- Use the plot Method to Render a Line Chart07:55
In this lesson, we use the plot method on DataFrame and Series objects to render a line chart with matplotlib. We also explore what can go wrong when the scale of our stored values differs.
- Modifying Plot Aesthetics with matplotlib Templates04:44
In this lesson, we use matplotlib library's built-in templates to modify the aesthetic look of charts and graphs. The plt.style.available attribute can be used to reveal a list with all available template options.
- Creating Bar Graphs to Show Counts05:57
A bar graph is used to count the occurrences of separate, independent values. In this lesson, we generate a bar graph with the plot method and its kind parameter.
- Creating Pie Charts to Represent Proportions04:50
In this lesson, we draw a pie chart with matplotlib. Once again, we can invoke the plot method on a pandas data structure and pass a different argument to the kind parameter to modify the type of chart that renders.
- Visualization5 questions
Test your knowledge of the visualization concepts we introduced in this module!
- Introduction to the Options and Settings Module01:42
Create a new Jupyter Notebook for the module and load the requisite libraries. In this module, we'll be exploring how we can modify the display settings of the pandas library.
- Changing pandas Options with Attributes and Dot Syntax06:57
Access and modify pandas settings using attributes within pd.options.display. In this lesson, we'll modify the maximum number of rows and columns that appear in an outputted DataFrame.
- Changing pandas Options with Methods06:14
Use the get_option(), set_option(), reset_option(), and describe_option() methods to modify the settings of the pandas library. We explore these methods with the familiar max_rows and max_columns settings.
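A sketch of the option functions in sequence. It assumes the fully qualified option names (such as "display.max_rows"), since a short name can be ambiguous when several options share it.

```python
import pandas as pd

# Remember the current setting so it can be compared after the reset.
default_rows = pd.get_option("display.max_rows")

# Change the setting, then read it back.
pd.set_option("display.max_rows", 20)
changed_rows = pd.get_option("display.max_rows")

# The attribute syntax reaches the same settings.
pd.options.display.max_columns = 10

# Restore both options to their defaults.
pd.reset_option("display.max_rows")
pd.reset_option("display.max_columns")
restored_rows = pd.get_option("display.max_rows")

# pd.describe_option("display.max_rows") prints the option's documentation.
```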
- The precision Option03:10
Use the precision option to modify the number of digits that appear after the decimal point in a floating point number. This option does not modify the original values in the DataFrame.
- Conclusion01:39
Celebrate the completion of the pandas for Data Analysis course! Congratulations on becoming a better data analyst!
- Bonus!00:20
Explore the bonus content for the course!