
pandas variance – Compute Variance of Variables in DataFrame

January 13, 2022

To find the variance of a series or a column in a DataFrame in pandas, the easiest way is to use the pandas var() function.

df["Column1"].var()

You can also use the numpy var() function, but be careful: its default divisor differs from the pandas default. numpy divides by n (ddof=0), while pandas divides by n - 1 (ddof=1).

np.var(df["Column1"]) #Different result from default pandas function
np.var(df["Column1"],ddof=1) #Same result as default pandas function

When doing data analysis, the ability to compute different summary statistics, such as the mean or median of a variable, is very useful to help us understand the data. One such summary statistic which can be useful is the variance of a variable.

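The variance measures how spread out a set of values is: it is the average of the squared deviations from the mean. For a sample, the sum of squared deviations is commonly divided by n - 1 instead of n.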

Finding the variance of columns or a Series using pandas is easy. We can use the pandas var() function to find the variance of a column of numbers.

Let’s say we have the following DataFrame.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42] })

print(df)
# Output: 
    Name  Weight  Height
0    Jim  160.20   50.10
1  Sally  160.20   68.94
2    Bob  209.45   71.42
3    Sue  150.35   48.56
4   Jill  187.52   59.37
5  Larry  187.52   63.42

To get the variance of the column “Height”, we can use the pandas var() function in the following Python code:

print(df["Height"].var())

# Output:
90.15417666666664
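The same function also works on several columns at once. The following is a minimal sketch, assuming the same example data as above; the variable names column_variances and numeric_variances are only illustrative. It selects the numeric columns explicitly, and alternatively passes numeric_only=True so that the text column “Name” is skipped.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42]})

# Variance of each explicitly selected column (returns a Series indexed by column name)
column_variances = df[["Weight", "Height"]].var()

# Alternative: let pandas ignore non-numeric columns such as "Name"
numeric_variances = df.var(numeric_only=True)

print(column_variances)
print(numeric_variances)
```

Both calls return a Series with one variance per column, using the pandas default of ddof=1.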

Calculating the Variance of a Series with numpy

We can also find the variance of a series using the numpy var() function. Depending on the complexity of our code, it might be faster to use the numpy var() function.

Let’s say we have the same dataset as above.

To get the variance of the column “Height”, we can use the numpy var() function in the following Python code.

print(np.var(df["Height"]))

# Output:
75.12848055555554

As you can verify for yourself, this is a different result from the pandas var() function. The reason is that the default normalization differs between pandas and numpy. By default, pandas uses 1 delta degree of freedom (ddof=1): it divides the sum of squared deviations by n - 1, which gives an unbiased estimate of the variance of the population the sample was drawn from. numpy defaults to ddof=0 and divides by n.
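To make the difference concrete, here is a rough sketch that computes both versions by hand for the “Height” values. The names sum_sq, population_variance and sample_variance are hypothetical and only used for illustration; the only thing that changes between the two results is the divisor.

```python
import numpy as np
import pandas as pd

heights = pd.Series([50.10, 68.94, 71.42, 48.56, 59.37, 63.42], name="Height")

n = len(heights)

# Sum of squared deviations from the mean
sum_sq = ((heights - heights.mean()) ** 2).sum()

# Divide by n: matches the numpy default (ddof=0)
population_variance = sum_sq / n

# Divide by n - 1: matches the pandas default (ddof=1)
sample_variance = sum_sq / (n - 1)

print(population_variance)
print(sample_variance)
```

The first value should agree with np.var() and the second with the pandas var() function, up to floating point rounding.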

To get the same variance using both numpy and pandas, you need to pass ‘ddof=1’ to the numpy var() function.

print(np.var(df["Height"]))
print(np.var(df["Height"],ddof=1))
print(df["Height"].var())

# Output:
75.12848055555554
90.15417666666664
90.15417666666664

As you can see above, we received the same result from the code when we pass ‘ddof=1’ to the numpy var() function.
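The adjustment also works in the other direction. As a small sketch, assuming we want the population variance (divisor n) from pandas instead, we can pass ddof=0 to the pandas var() function. The variable names here are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42]})

# pandas with ddof=0 divides by n, like the numpy default
pandas_population = df["Height"].var(ddof=0)

# numpy default (ddof=0) on the underlying array
numpy_population = np.var(df["Height"].to_numpy())

print(pandas_population)
print(numpy_population)
print(np.isclose(pandas_population, numpy_population))
```

Which divisor is appropriate depends on whether the data is treated as a sample or as the whole population, so it is worth setting ddof explicitly whenever results from both libraries are compared.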

Hopefully this article has been helpful for you to understand how to find the variance of a variable within a column or Series using pandas.
