The Programming Expert

pandas Correlation – Find Correlation of Series or DataFrame Columns

January 4, 2022

To find the correlation between series or columns in a DataFrame in pandas, the easiest way is to use the pandas corr() function.

df["Column1"].corr(df["Column2"])

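If you want to compute the pairwise correlations between all numeric columns in a DataFrame, you can call corr() directly on the DataFrame. In pandas 2.0 and later, pass numeric_only=True if the DataFrame also contains non-numeric columns.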

df.corr()

You can also use the pandas corrwith() function to compute the correlation of the columns of a DataFrame with another Series.

df.corrwith(df2["Column"])

Finding the correlation between columns or Series using pandas is easy. We can use the pandas corr() function to find the correlation between two Series, or between every pair of numeric columns in a DataFrame.

Let’s say we have the following DataFrame.

import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42] })

print(df)
# Output: 
    Name  Weight  Height
0    Jim  160.20   50.10
1  Sally  160.20   68.94
2    Bob  209.45   71.42
3    Sue  150.35   48.56
4   Jill  187.52   59.37
5  Larry  187.52   63.42

To get the pairwise correlation between the columns “Weight” and “Height”, we can use the pandas corr() function in the following Python code:

print(df["Height"].corr(df["Weight"]))

# Output:
0.6754685833670168
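To see what this number means, here is a minimal sketch that recomputes the default (Pearson) coefficient by hand and compares it with corr(). The variable names dw, dh, manual_r and pandas_r are our own, chosen only for illustration.

```python
import numpy as np
import pandas as pd

weight = pd.Series([160.20, 160.20, 209.45, 150.35, 187.52, 187.52])
height = pd.Series([50.10, 68.94, 71.42, 48.56, 59.37, 63.42])

# Center each Series around its mean
dw = weight - weight.mean()
dh = height - height.mean()

# Pearson r: sum of products of the centered values, divided by the
# square root of the product of their sums of squares
manual_r = (dw * dh).sum() / np.sqrt((dw ** 2).sum() * (dh ** 2).sum())

# The same coefficient computed by pandas
pandas_r = height.corr(weight)

print(manual_r)
print(pandas_r)
```

Both values should agree up to floating-point rounding, and the order of the two Series does not matter because the coefficient is symmetric.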

The pandas corr() function allows us to compute a few different types of correlation: Pearson correlation (the default), Kendall Tau correlation, and Spearman rank correlation. You can also pass your own function, which should take two 1-D arrays and return a number.

To calculate the Kendall or Spearman coefficients, just pass method="kendall" or method="spearman" to the corr() function.

Note that SciPy must be installed to compute the kendall and spearman coefficients with Series.corr(). pandas imports SciPy internally, so you don't need an import statement for it in your own code.

df["Height"].corr(df["Weight"], method="pearson")
df["Height"].corr(df["Weight"], method="kendall")
df["Height"].corr(df["Weight"], method="spearman")
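As a sketch of the "pass your own function" option, the example below passes a callable as method. The function name spearman_by_ranks is hypothetical. It relies on the fact that Spearman correlation is Pearson correlation applied to the ranks of the values, so it can be written with pandas and NumPy alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42] })

def spearman_by_ranks(a, b):
    # pandas passes the two columns in as 1-D NumPy arrays
    rank_a = pd.Series(a).rank()  # tied values get the average rank
    rank_b = pd.Series(b).rank()
    # Pearson correlation of the ranks
    return np.corrcoef(rank_a, rank_b)[0, 1]

# Use the custom function in place of a method name
custom = df["Height"].corr(df["Weight"], method=spearman_by_ranks)

# Cross-check: rank first, then use the default Pearson method
via_ranks = df["Height"].rank().corr(df["Weight"].rank())

print(custom)
print(via_ranks)
```

The two printed values should match. The same pattern works for any statistic you can express as a function of two arrays.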

Calculating the Correlation between Multiple Columns in pandas

When analyzing a dataset, we often want to see the correlations between all variables. We can use the pandas corr() method to calculate the correlation over all columns.

Let’s say we have a DataFrame like the one above, with a couple of different “Weight” values and a new column “Age”.

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [130.54, 160.20, 209.45, 150.35, 117.73, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42],
                   'Age': [43,23,71,49,52,37] })

print(df)
# Output: 
    Name  Weight  Height  Age
0    Jim  130.54   50.10   43
1  Sally  160.20   68.94   23
2    Bob  209.45   71.42   71
3    Sue  150.35   48.56   49
4   Jill  117.73   59.37   52
5  Larry  187.52   63.42   37

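We can get the pairwise correlation coefficients for all numeric columns by calling the corr() function. In this case, the corr() function will return a correlation matrix. Because the DataFrame also has the text column “Name”, we pass numeric_only=True (available in pandas 1.5 and later). Older versions skipped non-numeric columns automatically, while pandas 2.0 and later raise an error without it.

print(df.corr(numeric_only=True))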

#Output:
          Weight    Height       Age
Weight  1.000000  0.666055  0.285006
Height  0.666055  1.000000  0.053793
Age     0.285006  0.053793  1.000000
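The correlation matrix is itself a DataFrame, so we can index and sort it like any other. The sketch below selects the numeric columns with select_dtypes(), which sidesteps the numeric_only question, and then reads values out of the matrix. The names numeric, corr_matrix, weight_height and weight_ranking are ours.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [130.54, 160.20, 209.45, 150.35, 117.73, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42],
                   'Age': [43, 23, 71, 49, 52, 37] })

# Keep only the numeric columns before computing correlations
numeric = df.select_dtypes(include="number")
corr_matrix = numeric.corr()

# Look up a single coefficient by row and column label
weight_height = corr_matrix.loc["Weight", "Height"]

# Rank the other columns by how strongly they correlate with Weight
weight_ranking = corr_matrix["Weight"].drop("Weight").sort_values(ascending=False)

print(weight_height)
print(weight_ranking)
```

Dropping the “Weight” entry before sorting removes the trivial correlation of 1.0 that every column has with itself.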

Finding Correlation with pandas corrwith() function

We can also use the pandas corrwith() function to calculate the correlation coefficient between each column of a DataFrame and a Series, or between the matching columns of two DataFrames.

Let’s say we have the same DataFrame from the previous example, along with a second DataFrame, df_new, and we’d like to see whether its values are correlated with the columns of the first.

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [130.54, 160.20, 209.45, 150.35, 117.73, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42],
                   'Age': [43,23,71,49,52,37] })

df_new = pd.DataFrame({'Test_Score':[90,87,92,96,84,79]})

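We can find the correlation between each numeric column of df and the “Test_Score” column of df_new using the pandas corrwith() function. As with corr(), we pass numeric_only=True (pandas 1.5 and later) so the text column “Name” is skipped.

print(df.corrwith(df_new["Test_Score"], numeric_only=True))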

#Output:
Weight   -0.016455
Height   -0.359045
Age       0.408819
dtype: float64
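corrwith() also accepts a whole DataFrame. In that case pandas aligns the two DataFrames and correlates each pair of columns that share a name. Here is a minimal sketch of that usage. The follow-up measurements in after are invented purely for illustration, and the names before, after and result are ours.

```python
import pandas as pd

before = pd.DataFrame({'Weight': [130.54, 160.20, 209.45, 150.35, 117.73, 187.52],
                       'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42] })

# Hypothetical follow-up measurements for the same six people (made-up values)
after = pd.DataFrame({'Weight': [128.0, 162.5, 205.1, 149.0, 120.3, 185.0],
                      'Height': [50.3, 69.0, 71.4, 48.9, 59.5, 63.4] })

# Columns are matched by name: Weight with Weight, Height with Height
result = before.corrwith(after)

print(result)
```

The result is a Series indexed by column name. Each entry is the same value you would get from calling corr() on the two same-named columns directly.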

Hopefully this article has been helpful for you to understand how to find the correlation coefficients between columns in a DataFrame or between Series using pandas.

Copyright © 2023 · The Programming Expert