Python | Pandas DataFrame.where()

Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.

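The Pandas where() method checks a data frame against one or more conditions and returns a result of the same shape. Values where the condition is True are kept; by default, values where it is False are replaced with NaN. When the condition is a boolean Series built from a single column, entire rows are kept or replaced.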
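As a minimal sketch of this default behaviour, the snippet below applies where() to a small made-up frame. The column names a and b and their values are placeholders for illustration only and have nothing to do with the CSV file used later.

```python
import pandas as pd

# Small illustrative frame (hypothetical values)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 1, 6]})

# Keep values greater than 2; everything else becomes NaN
result = df.where(df > 2)

# result holds a: NaN, NaN, 3.0 and b: 4.0, NaN, 6.0
print(result)

# where() returns a new object; the original frame is unchanged
print(df)
```

Note that the result has the same shape as the input: nothing is dropped, values are only masked.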

Pandas DataFrame.where() Function Syntax

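Syntax: DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None)

Older pandas releases also accepted errors, try_cast and raise_on_error. These parameters were deprecated and are no longer part of the signature in pandas 2.x.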

Parameters:

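  • cond: The condition to check the data frame against. A boolean Series or DataFrame, an array-like, or a callable that returns one.
  • other: Value used to replace entries where the condition is False. Can be a scalar, a Series or DataFrame, or a callable. Default is NaN.
  • inplace: Boolean value. If True, the data frame itself is modified and the method returns None.
  • axis: Alignment axis, if needed (rows or columns).
  • level: Alignment level, if needed.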
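The other parameter and a callable cond can be sketched as follows. This is only an illustration on a made-up frame; the names df, zero_filled and evens are assumptions introduced here, not part of the original examples.

```python
import pandas as pd

# Small illustrative frame (hypothetical values)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 1, 6]})

# Replace values that fail the condition with 0 instead of NaN
zero_filled = df.where(df > 2, other=0)
# zero_filled holds a: 0, 0, 3 and b: 4, 0, 6

# cond may also be a callable that receives the frame
# and returns a boolean frame; here odd values become -1
evens = df.where(lambda d: d % 2 == 0, other=-1)
# evens holds a: -1, 2, -1 and b: 4, -1, 6

print(zero_filled)
print(evens)
```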

The examples below read their data from a CSV file named nba.csv.

Python Pandas DataFrame.where() Examples

Below are some examples of Pandas DataFrame.where():

Pandas DataFrame.where() Single Condition Operation

In this example, rows with a particular Team name keep their values and all other rows are replaced by NaN using the .where() method.

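Python3

# import the pandas package
import pandas as pd

# load the data frame from the CSV file
data = pd.read_csv("nba.csv")

# sort the rows by team name
data.sort_values("Team", inplace=True)

# boolean Series: True where the team is "Atlanta Hawks"
# (named mask rather than filter, which would shadow the Python built-in)
mask = data["Team"] == "Atlanta Hawks"

# keep rows where mask is True; all values in the other rows become NaN
data.where(mask, inplace=True)

# display (in a notebook or interactive session; use print(data) in a script)
data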


Output

In the output, every row whose Team is not “Atlanta Hawks” is replaced with NaN. The rows are not removed, so the data frame keeps its original number of rows.
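Because nba.csv may not be at hand, the same row-level masking can be sketched on a tiny in-memory frame. The rows below are invented placeholders (the Name column, the player labels, and the second team are assumptions for illustration), and the comparison with dropna() and plain boolean indexing is an addition meant to show how where() differs from actually filtering rows out.

```python
import pandas as pd

# Hypothetical sample rows standing in for nba.csv
sample = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Atlanta Hawks", "Boston Celtics", "Atlanta Hawks", "Boston Celtics"],
    "Age": [23, 27, 30, 25],
})

# Boolean Series, one entry per row
mask = sample["Team"] == "Atlanta Hawks"

# where() keeps all four rows; non-matching rows are filled with NaN
masked = sample.where(mask)

# Dropping the all-NaN rows leaves only the matching rows (index 0 and 2)
kept = masked.dropna(how="all")

# Plain boolean indexing selects the same rows directly
selected = sample[mask]

print(masked)
print(kept)
print(selected)
```

where() is the better fit when the shape and index of the frame must be preserved; boolean indexing is usually simpler when the non-matching rows are not needed at all.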

Pandas DataFrame.where() with Multiple Columns and Conditions

In this example, the data is filtered on both Team and Age. Only rows where Team is “Atlanta Hawks” and Age is greater than 24 keep their values; all other rows are replaced by NaN.

Python3

# import the pandas package
import pandas as pd

# load the data frame from the CSV file
data = pd.read_csv("nba.csv")

# sort the rows by team name
data.sort_values("Team", inplace=True)

# boolean Series: True where the team is "Atlanta Hawks"
filter1 = data["Team"] == "Atlanta Hawks"

# boolean Series: True where the age is above 24
filter2 = data["Age"] > 24

# keep rows that satisfy both conditions; all other rows become NaN
data.where(filter1 & filter2, inplace=True)

# display (in a notebook or interactive session; use print(data) in a script)
data


Output

In the output, only the rows with Team “Atlanta Hawks” and Age above 24 keep their values; every other row is replaced with NaN.
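Combining conditions can be sketched on a small in-memory frame as well. The rows are invented placeholders (the Name column, the player labels, and the second team are assumptions for illustration). The sketch also shows two points the example above does not spell out: each comparison needs its own parentheses when written inline, because & and | bind more tightly than == and >, and | gives a logical OR.

```python
import pandas as pd

# Hypothetical sample rows standing in for nba.csv
sample = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Atlanta Hawks", "Boston Celtics", "Atlanta Hawks", "Boston Celtics"],
    "Age": [23, 27, 30, 22],
})

# AND: both conditions must hold (only Player C qualifies)
both = sample.where((sample["Team"] == "Atlanta Hawks") & (sample["Age"] > 24))

# OR: at least one condition must hold (everyone except Player D)
either = sample.where((sample["Team"] == "Atlanta Hawks") | (sample["Age"] > 24))

print(both)
print(either)
```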



Last Updated : 01 Dec, 2023