
The Programming Expert

Solving All of Your Programming Headaches


Using pandas resample() to Resample Time Series Data

May 2, 2022

In Python, we can use the pandas resample() method to resample time series data in a DataFrame or Series object. Resampling is a technique which allows you to change the frequency of your time series data, either increasing it or decreasing it.

Let’s say we have the following time series data.

import pandas as pd
import numpy as np

df = pd.DataFrame({'time':pd.date_range(start='05-01-2022',end='06-30-2022', freq="D"), 'value':np.random.randint(10,size=61)})

print(df.head(10))

#Output:
        time  value
0 2022-05-01      2
1 2022-05-02      4
2 2022-05-03      7
3 2022-05-04      9
4 2022-05-05      6
5 2022-05-06      9
6 2022-05-07      2
7 2022-05-08      4
8 2022-05-09      2
9 2022-05-10      1

You can resample this daily data to monthly data with resample() as shown below.

df.set_index('time', inplace=True)

# 'M' is the month-end alias; pandas 2.2 and later prefer 'ME' and warn on 'M'
resampled_df = df.resample('M').mean()

print(resampled_df)

#Output: 
               value
time
2022-05-31  4.741935
2022-06-30  3.300000
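
If you would rather not change the index of your DataFrame, resample() also accepts an on argument naming the datetime column. The sketch below is a minimal illustration with made-up data (1 for every day in May, 2 for every day in June) so the results are easy to check by hand. It passes the pd.offsets.MonthEnd() offset object instead of an alias string, on the assumption that you want the same code to run whether your pandas version spells month-end as 'M' or 'ME'.

```python
import pandas as pd

# Hypothetical, deterministic data: 1 for every day in May, 2 for every day in June.
days = pd.date_range(start="2022-05-01", end="2022-06-30", freq="D")
df = pd.DataFrame({"time": days, "value": [1 if d.month == 5 else 2 for d in days]})

# on="time" tells resample() which column holds the timestamps, so the
# DataFrame keeps its default integer index.
# pd.offsets.MonthEnd() is the offset object behind the month-end alias.
monthly = df.resample(pd.offsets.MonthEnd(), on="time")["value"].agg(["mean", "sum"])

print(monthly)
```

With this data the monthly means are 1.0 and 2.0 and the monthly sums are 31 and 60, labeled 2022-05-31 and 2022-06-30.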

When working with time series data, the ability to change the frequency of the data can be very useful.

The Python pandas module gives us many great tools for working with time series data, and resample() is one of them. It groups your rows into time bins at a new frequency, and you then choose how to aggregate or fill each bin.

Increasing the frequency of your time series data, or upsampling, would be like taking monthly data and making it daily data. Upsampling creates time points which have no observed value, so you usually fill them in afterwards, for example with interpolation.

Decreasing the frequency of time series data, or downsampling, would be like taking daily data and aggregating it to monthly data, for example with a mean or a sum.
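
To make the two directions concrete, here is a small sketch using a made-up Series of four daily values. The values and variable names are only for illustration. It uses pd.Timedelta(hours=12) for the 12-hour rule so the spelling of the hourly alias does not matter.

```python
import pandas as pd

# Hypothetical series: four daily observations.
s = pd.Series(
    [10, 20, 30, 40],
    index=pd.date_range("2022-05-01", periods=4, freq="D"),
)

# Downsampling: fewer rows than the input; each 2-day bin is summed.
down = s.resample("2D").sum()

# Upsampling: more rows than the input; the new 12-hour slots have no data yet.
up = s.resample(pd.Timedelta(hours=12)).asfreq()

print(len(s), len(down), len(up))
print(up.isna().sum())
```

Here the four daily rows become two rows after downsampling and seven rows after upsampling, three of which are NaN until you fill them.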

The pandas documentation describes the many different ways you can use resample().

In the rest of this article, you’ll learn how to resample time series data in a few of the most common ways with the pandas resample() function.

How to Resample Time Series Data and Interpolate with the pandas resample() Function

One way we can use resample() is to increase the frequency of our time series data. Increasing the frequency of time series data is called upsampling. This is like taking monthly data and making it daily.

Let’s say we have the following data which has data points every 12 hours.

import pandas as pd
import numpy as np

df = pd.DataFrame({'time':pd.date_range(start='05-01-2022',end='05-31-2022', freq="12H"), 'value':np.random.randint(10,size=61)})

print(df.head(10))

#Output:
                 time  value
0 2022-05-01 00:00:00      1
1 2022-05-01 12:00:00      7
2 2022-05-02 00:00:00      9
3 2022-05-02 12:00:00      8
4 2022-05-03 00:00:00      9
5 2022-05-03 12:00:00      0
6 2022-05-04 00:00:00      6
7 2022-05-04 12:00:00      3
8 2022-05-05 00:00:00      7
9 2022-05-05 12:00:00      6

Let’s increase the frequency of our data to every 6 hours with resample(). First, we need to set the datetime column as the index. Then, we can increase the frequency of our data by passing “6H” to resample(). Note that pandas 2.2 and later spell the hourly alias in lowercase (“6h”) and warn on “6H”.

df.set_index('time', inplace=True)

resampled_df = df.resample("6H").mean()

print(resampled_df.head(10))

#Output:
                     value
time
2022-05-01 00:00:00    1.0
2022-05-01 06:00:00    NaN
2022-05-01 12:00:00    7.0
2022-05-01 18:00:00    NaN
2022-05-02 00:00:00    9.0
2022-05-02 06:00:00    NaN
2022-05-02 12:00:00    8.0
2022-05-02 18:00:00    NaN
2022-05-03 00:00:00    9.0
2022-05-03 06:00:00    NaN

As you can see, we’ve now added datapoints between the datapoints which previously existed, but the values for these datapoints are NaN.
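
The NaN values appear because no original observation falls into the new 6-hour bins, and the mean of an empty bin is NaN. As a rough sketch, assuming you only want the new time grid without any aggregation, asfreq() gives the same rows here. The example uses just the first three values from above and pd.Timedelta(hours=6) in place of the “6H” string.

```python
import pandas as pd

# First three values from the example above, spaced 12 hours apart.
df = pd.DataFrame(
    {"value": [1, 7, 9]},
    index=pd.date_range("2022-05-01", periods=3, freq=pd.Timedelta(hours=12)),
)

six_hours = pd.Timedelta(hours=6)  # same spacing as the "6H" rule

via_mean = df.resample(six_hours).mean()      # mean of an empty bin is NaN
via_asfreq = df.resample(six_hours).asfreq()  # reindex only, no aggregation

print(via_asfreq)
print(via_mean["value"].isna().sum())
```

Both results have five rows, with NaN at 06:00 and 18:00.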

To fill the NaN values, we have a few options. We can use the bfill() function, which will “back fill” each NaN value with the next valid value.

resampled_df = df.resample("6H").bfill()

print(resampled_df.head(10))

#Output:
                     value
time
2022-05-01 00:00:00      1
2022-05-01 06:00:00      7
2022-05-01 12:00:00      7
2022-05-01 18:00:00      9
2022-05-02 00:00:00      9
2022-05-02 06:00:00      8
2022-05-02 12:00:00      8
2022-05-02 18:00:00      9
2022-05-03 00:00:00      9
2022-05-03 06:00:00      0

You can also use ffill() to “forward fill” each NaN value with the previous valid value.
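
Here is a short sketch of ffill() on the first three values from the example above. The second call is an optional extra, assuming you want to cap how far a value is carried forward: the limit argument restricts the number of consecutive NaN values that get filled.

```python
import pandas as pd

# First three values from the example above, spaced 12 hours apart.
df = pd.DataFrame(
    {"value": [1, 7, 9]},
    index=pd.date_range("2022-05-01", periods=3, freq=pd.Timedelta(hours=12)),
)

# Forward fill: each new 6-hour slot repeats the last known value.
forward = df.resample(pd.Timedelta(hours=6)).ffill()
print(forward)

# Upsample to 3 hours, but carry each value forward by one step only;
# slots further away from an observation stay NaN.
limited = df.resample(pd.Timedelta(hours=3)).ffill(limit=1)
print(limited)
```

The first result holds the values 1, 1, 7, 7, 9. The second has nine rows, four of which remain NaN.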

If you want to use interpolation, you can call the pandas interpolate() function on the resampled data to fill the NaN values in the newly created time series.

Below is an example of linear interpolation after upsampling with the pandas resample() function.

resampled_df = df.resample("6H").interpolate(method="linear")

print(resampled_df.head(10))

#Output:
                     value
time
2022-05-01 00:00:00    1.0
2022-05-01 06:00:00    4.0
2022-05-01 12:00:00    7.0
2022-05-01 18:00:00    8.0
2022-05-02 00:00:00    9.0
2022-05-02 06:00:00    8.5
2022-05-02 12:00:00    8.0
2022-05-02 18:00:00    8.5
2022-05-03 00:00:00    9.0
2022-05-03 06:00:00    4.5
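
To see where these numbers come from, the sketch below rebuilds the first five values of the 12-hour data by hand and interpolates them. Each new 6-hour slot sits halfway between two observations, so linear interpolation gives the average of its two neighbors. The small DataFrame is typed in manually for illustration.

```python
import pandas as pd

# First five values from the 12-hour example above.
df = pd.DataFrame(
    {"value": [1, 7, 9, 8, 9]},
    index=pd.date_range("2022-05-01", periods=5, freq=pd.Timedelta(hours=12)),
)

interpolated = df.resample(pd.Timedelta(hours=6)).interpolate(method="linear")
print(interpolated)

# The 06:00 slot lies halfway between the 00:00 and 12:00 observations.
midpoint = (df["value"].iloc[0] + df["value"].iloc[1]) / 2
print(midpoint)  # 4.0
```

The interpolated values are 1.0, 4.0, 7.0, 8.0, 9.0, 8.5, 8.0, 8.5 and 9.0, matching the first rows of the output above.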

How to Resample Time Series Data and Aggregate Data with the pandas resample() Function

You can also use resample() to decrease the frequency of your time series data. Decreasing the frequency of your time series data is called downsampling, and is like going from daily data to monthly data.

Let’s say we have the same dataset from above with datapoints every 12 hours.

To resample this data and convert it to daily data, we can use resample() and pass “D” for days as the new frequency. Let’s also aggregate the resampled data and get the sum for each day.

Below is how you can downsample and aggregate time series data with the pandas resample() function.

resampled_df = df.resample('D').sum()

print(resampled_df.head(10))

#Output: 
            value
time
2022-05-01      8
2022-05-02     17
2022-05-03      9
2022-05-04      9
2022-05-05     13
2022-05-06      5
2022-05-07      9
2022-05-08     10
2022-05-09      8
2022-05-10      6
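
You are not limited to one aggregation per bin. As a sketch, assuming you want several daily statistics at once, agg() accepts a list of function names. The data below is the first ten values of the 12-hour example, typed in by hand.

```python
import pandas as pd

# First ten values from the 12-hour example above.
df = pd.DataFrame(
    {"value": [1, 7, 9, 8, 9, 0, 6, 3, 7, 6]},
    index=pd.date_range("2022-05-01", periods=10, freq=pd.Timedelta(hours=12)),
)

# One row per day, one column per aggregation.
daily = df["value"].resample("D").agg(["sum", "mean", "max"])
print(daily)
```

The sum column reproduces the first five daily totals shown above: 8, 17, 9, 9 and 13.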

Hopefully this article has been useful for you to learn how to resample time series data in Python with the pandas resample() function.


Copyright © 2023 · The Programming Expert