To find percentiles of a numeric column in a DataFrame, or the percentiles of a Series in pandas, the easiest way is to use the pandas **quantile()** function.

`df.quantile(0.25)`

You can also use the numpy **percentile()** function.

`np.percentile(df["Column"], 25)`

When working with data, we often want to calculate summary statistics to understand it better. Percentiles, or quantiles, are especially useful because they show how the data is distributed.

Finding the percentile of a given column, or the quantiles of all columns or rows in a DataFrame, is easy with pandas. We can use the pandas **quantile()** function to find quantile values for a single column of numbers or for an entire DataFrame.

Let’s say we have the following DataFrame.

```
import pandas as pd

df = pd.DataFrame({'Age': [43, 23, 71, 49, 52, 37],
                   'Test_Score': [90, 87, 92, 96, 84, 79]})
print(df)
# Output:
#    Age  Test_Score
# 0   43          90
# 1   23          87
# 2   71          92
# 3   49          96
# 4   52          84
# 5   37          79
```

To get the 50th percentile (the 0.5 quantile), or the median, for all columns, we can call the pandas **quantile()** function and pass 0.5.

```
print(df.quantile(0.5))
# Output:
# Age           46.0
# Test_Score    88.5
# Name: 0.5, dtype: float64
```

If we only want the percentile of one column, we can select that column and call the pandas **quantile()** function on the resulting Series:

```
print(df["Test_Score"].quantile(0.5))
# Output:
# 88.5
```
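The introduction mentions quantiles across rows as well as columns. As a minimal sketch of the row-wise case, the `axis=1` argument of **quantile()** computes the quantile across the columns of each row. Taking the median of an age and a test score is not meaningful in itself; the example only illustrates the mechanics, and the variable name `row_medians` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Age': [43, 23, 71, 49, 52, 37],
                   'Test_Score': [90, 87, 92, 96, 84, 79]})

# axis=1 computes the quantile across the columns of each row,
# instead of down each column (the default, axis=0)
row_medians = df.quantile(0.5, axis=1)
print(row_medians)
```

The result is a Series indexed by the original row labels, with one median per row.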

## Calculating Multiple Percentiles at Once with pandas

We can use the pandas **quantile()** function to calculate multiple percentiles at once. To calculate multiple quantiles, we pass a list of quantile values to the **quantile()** function.

Let’s say we have the same data from above. Let’s calculate the 25th, 50th and 75th percentiles of our data.

```
print(df.quantile([0.25,0.5,0.75]))
# Output:
#         Age  Test_Score
# 0.25  38.50       84.75
# 0.50  46.00       88.50
# 0.75  51.25       91.50
```
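Values such as 84.75 do not appear in the data because, by default, **quantile()** interpolates linearly between the two nearest sorted values. The sketch below reproduces the 25th percentile of "Test_Score" by hand. The helper `linear_quantile` is hypothetical, written only for this illustration, and it assumes the default linear interpolation.

```python
import math
import pandas as pd

def linear_quantile(values, q):
    """Hypothetical helper: quantile via linear interpolation on sorted values."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)   # fractional position in the sorted list
    lower = math.floor(pos)        # index of the value just below the position
    upper = math.ceil(pos)         # index of the value just above the position
    frac = pos - lower             # how far between the two neighbors we are
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

scores = [90, 87, 92, 96, 84, 79]

manual = linear_quantile(scores, 0.25)
from_pandas = pd.Series(scores).quantile(0.25)

print(manual)       # 84.75
print(from_pandas)  # 84.75
```

Here the position is 0.25 * 5 = 1.25, so the result lies a quarter of the way from 84 to 87 in the sorted scores.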

## Using numpy percentile to Calculate Percentiles in pandas DataFrame

We can also use the numpy **percentile()** function to calculate percentile values for the columns in our pandas DataFrames.

Let’s get the 25th, 50th, and 75th percentiles of the “Test_Score” column using the numpy **percentile()** function. We can do this easily in the following Python code. The difference here is that numpy expects percentiles on a scale from 0 to 100 rather than fractions between 0 and 1 (i.e. 50 instead of 0.50).

```
import numpy as np

print(np.percentile(df["Test_Score"], [25, 50, 75]))
# Output:
# [84.75 88.5  91.5 ]
```

As you can see above, these are the same values we received from the pandas **quantile()** function.
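To check the agreement programmatically, the following sketch converts the 0 to 100 percentiles into 0 to 1 quantiles and compares the two results. It assumes both libraries use their default linear interpolation; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [43, 23, 71, 49, 52, 37],
                   'Test_Score': [90, 87, 92, 96, 84, 79]})

percents = [25, 50, 75]

# numpy takes percentiles on a 0-100 scale
from_numpy = np.percentile(df["Test_Score"], percents)

# pandas takes quantiles on a 0-1 scale, so divide by 100
from_pandas = df["Test_Score"].quantile([p / 100 for p in percents])

# Compare with a tolerance, since both results are floating point
print(np.allclose(from_numpy, from_pandas.to_numpy()))  # True
```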

Hopefully this article has been helpful for you to understand how to find percentiles of numbers in a Series or DataFrame in pandas.
