Pandas DataFrame.mean: Compute the Mean of DataFrame Values

Pandas DataFrame.mean

The DataFrame.mean method in pandas calculates the mean (average) of numerical values in a DataFrame along a specified axis. It is useful for summarizing data and performing statistical analyses.

Syntax

The syntax for DataFrame.mean is:

DataFrame.mean(axis=0, skipna=True, numeric_only=False, **kwargs)

Here, DataFrame refers to the pandas DataFrame on which the mean operation is applied.

Parameters

Parameter	Description
`axis`	Specifies the axis along which the mean is computed: `0` (default) – Compute the mean of each column (column-wise mean). `1` – Compute the mean of each row (row-wise mean). `None` – Compute a single mean over all values (pandas 2.0.0 and later).
`skipna`	A boolean that determines whether to exclude NA/null values: `True` (default) – Ignore missing values. `False` – Do not skip missing values; the result is NaN for any column or row that contains one.
`numeric_only`	If `True`, only numeric (int, float, boolean) columns are included in the calculation. Default is `False`.
`**kwargs`	Additional keyword arguments passed to the function.

Returns

Returns a Series of mean values, indexed by column labels for axis=0 or by row labels for axis=1. If axis=None (available from version 2.0.0), it returns a single scalar: the mean over all values in the DataFrame.
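The difference in return types can be checked directly. The following is a minimal sketch, assuming pandas 2.0 or later for the axis=None call; the DataFrame and variable names are illustrative only.

```python
import pandas as pd

# Illustrative data, chosen for this sketch
df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': [2, 4, 6, 8]
})

per_column = df.mean()        # Series indexed by column label
per_row = df.mean(axis=1)     # Series indexed by row label
overall = df.mean(axis=None)  # single scalar (assumes pandas >= 2.0)

print(type(per_column).__name__, list(per_column.index))
print(type(per_row).__name__, list(per_row.index))
print(overall)
```

On older pandas versions, axis=None is not expected to give a scalar, so the version assumption matters here.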

Examples

Compute Column-wise Mean

The default behavior computes the mean for each column.

Python Program

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': [2, 4, 6, 8]
})

# Compute mean for each column
column_mean = df.mean()
print(column_mean)

Output

A    25.0
B    20.0
C     5.0
dtype: float64
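To make the arithmetic explicit, the column-wise mean can be reproduced as each column's sum divided by its number of non-missing values. This is only a sketch for checking the result by hand; manual_mean is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': [2, 4, 6, 8]
})

# Sum of each column divided by its count of non-missing values
manual_mean = df.sum() / df.count()

print(manual_mean)
print(df.mean())
```

For column A this is (10 + 20 + 30 + 40) / 4 = 25.0, which matches the value reported by df.mean().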

Compute Row-wise Mean

Setting axis=1 computes the mean for each row.

Python Program

# Compute mean for each row (uses df from the previous example)
row_mean = df.mean(axis=1)
print(row_mean)

Output

0     5.666667
1    13.000000
2    20.333333
3    27.666667
dtype: float64
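Row means are often not whole numbers, so it can be convenient to round them and keep them next to the data they summarize. The sketch below is one way to do that; the column name row_mean and the variable df_with_mean are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': [2, 4, 6, 8]
})

# Row-wise means, rounded to two decimal places for display
row_mean = df.mean(axis=1).round(2)

# assign() returns a new DataFrame; the original df is left unchanged
df_with_mean = df.assign(row_mean=row_mean)
print(df_with_mean)
```

Computing the means before adding the new column keeps row_mean itself out of the calculation.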

Handling Missing Values

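By default (skipna=True), missing values are ignored and the mean is computed from the remaining values. With skipna=False, missing values are not skipped, so any column (or row, for axis=1) that contains one has a mean of NaN.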

Python Program

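# Copy df from the first example as float so column A can hold NaN
# without an implicit dtype change
df_with_nan = df.astype(float)
df_with_nan.loc[2, 'A'] = None  # Introduce a NaN value

# Compute column-wise mean without skipping NaNs
mean_with_nan = df_with_nan.mean(skipna=False)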
print(mean_with_nan)

Output

A     NaN
B    20.0
C     5.0
dtype: float64
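For comparison, the two skipna settings can be placed side by side. This is a minimal sketch; the DataFrame is rebuilt with np.nan so the block runs on its own, and the names skipped, propagated, and comparison are illustrative.

```python
import numpy as np
import pandas as pd

df_with_nan = pd.DataFrame({
    'A': [10, 20, np.nan, 40],
    'B': [5, 15, 25, 35],
    'C': [2, 4, 6, 8]
})

skipped = df_with_nan.mean()                  # skipna=True: the NaN is left out
propagated = df_with_nan.mean(skipna=False)   # the NaN carries into the result

# Put both results next to each other for comparison
comparison = pd.DataFrame({
    'skipna=True': skipped,
    'skipna=False': propagated
})
print(comparison)
```

With skipna=True, column A is averaged over its three remaining values, (10 + 20 + 40) / 3, rather than over four.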

Using `numeric_only=True`

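If a DataFrame contains non-numeric columns, setting numeric_only=True excludes those columns from the calculation. With the default numeric_only=False, pandas 2.0 and later typically raise a TypeError for a column of strings such as C in the example here.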

Python Program

df_mixed = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': ['X', 'Y', 'Z', 'W']
})

# Compute mean only for numeric columns
numeric_mean = df_mixed.mean(numeric_only=True)
print(numeric_mean)

Output

A    25.0
B    20.0
dtype: float64
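An alternative that is not part of DataFrame.mean itself is to select the numeric columns explicitly before averaging. The sketch below uses DataFrame.select_dtypes for that; it is one possible approach, and numeric_cols is a name introduced for illustration.

```python
import pandas as pd

df_mixed = pd.DataFrame({
    'A': [10, 20, 30, 40],
    'B': [5, 15, 25, 35],
    'C': ['X', 'Y', 'Z', 'W']
})

# Keep only the columns with a numeric dtype, then average them
numeric_cols = df_mixed.select_dtypes(include='number')
selected_mean = numeric_cols.mean()

print(list(numeric_cols.columns))
print(selected_mean)
```

Selecting columns first makes it visible which columns take part in the calculation, which can help when a DataFrame has many mixed-type columns.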

Summary

In this tutorial, we explored the DataFrame.mean method in pandas. Key takeaways include:

mean() computes the mean of each column by default (axis=0).
Setting axis=1 computes row-wise means.
Missing values are ignored by default (skipna=True); with skipna=False, any column or row that contains a missing value yields NaN.
numeric_only=True ensures only numeric columns are included.
Setting axis=None (available from pandas 2.0.0) computes the overall mean as a scalar.