The Programming Expert

Solving All of Your Programming Headaches

How to Convert pandas Column dtype from Object to Category

December 1, 2022

To convert a pandas DataFrame column from data type “object” to data type “category”, call the astype() method on the column and assign the result back.

import pandas as pd

df = pd.DataFrame({ "column": ["a","b","c","a","b","c","b","d"] }) 
 
print(df["column"].dtype)

df["column"] = df["column"].astype('category')

print(df["column"].dtype)

#Output:
object
category

When working with different types of data in pandas, being able to change the data type of a column easily is valuable.

One such case is converting a pandas column from data type “object” to data type “category”.

astype() converts a pandas column to the data type you pass it. It returns a new column rather than changing the existing one in place, so you need to assign the result back to the DataFrame.

Below is a simple example showing you how to convert the data type of a pandas column from “object” to “category”.

import pandas as pd

df = pd.DataFrame({ "column": ["a","b","c","a","b","c","b","d"] }) 
 
print(df["column"].dtype)

df["column"] = df["column"].astype('category')

print(df["column"].dtype)

#Output:
object
category
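If you need to convert more than one column, astype() also accepts a dictionary mapping column names to data types. The snippet below is a minimal sketch of that; the extra columns “other” and “value” are made up for illustration and are not part of the example above. It also prints the categories and the integer codes pandas keeps for the converted column.

```python
import pandas as pd

# "other" and "value" are hypothetical columns added for this sketch.
df = pd.DataFrame({
    "column": ["a", "b", "c", "a", "b", "c", "b", "d"],
    "other": ["x", "y", "x", "y", "x", "y", "x", "y"],
    "value": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Passing a dict converts only the listed columns; "value" keeps its dtype.
converted = df.astype({"column": "category", "other": "category"})

print(converted.dtypes)

# The unique values are stored once as categories...
print(list(converted["column"].cat.categories))  # ['a', 'b', 'c', 'd']

# ...and each row holds an integer code pointing at one of them.
print(list(converted["column"].cat.codes))  # [0, 1, 2, 0, 1, 2, 1, 3]
```

Because astype() returns a new DataFrame here, the original df is left unchanged until you assign the result back.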

Reducing Memory Usage with dtype Category Columns in pandas

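One of the main benefits of using “category” columns in pandas is that they can reduce the amount of memory your process uses.

The reason is that a categorical column stores each unique value (i.e. each category) only once, and then stores a small integer code per row that points to the matching category, instead of storing the full value in every row.

The example below shows how categorical data can reduce memory in pandas. Note that for an “object” Series, nbytes counts only the 8-byte references to the strings, not the strings themselves.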

import pandas as pd

s = pd.Series(["a","b","c","a","b","c","b","d"] * 1000) 
 
print(s.nbytes)
print(s.astype("category").nbytes)

#Output:
64000
8032
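Since nbytes does not count the string contents of an “object” column, a fuller comparison can be made with memory_usage(deep=True). The snippet below is a minimal sketch of that check. It passes dtype=object explicitly (an assumption made here so the Series is “object” regardless of pandas version) and does not quote exact byte counts, because those vary by platform and pandas version.

```python
import pandas as pd

# dtype=object is set explicitly so the comparison is object vs. category.
s = pd.Series(["a", "b", "c", "a", "b", "c", "b", "d"] * 1000, dtype=object)
as_category = s.astype("category")

# deep=True also counts the Python string objects the column refers to.
object_bytes = s.memory_usage(index=False, deep=True)
category_bytes = as_category.memory_usage(index=False, deep=True)

print(object_bytes, category_bytes)

# With only 4 categories, the per-row codes fit in a 1-byte integer.
print(as_category.cat.codes.dtype)  # int8
```

The saving depends on how many values repeat. A column where almost every value is unique gains little from being converted to “category”.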

Using groupby() When Working With Columns with dtype Category in pandas

One last thing I want to add to this post is something that I came across when I was performing some data analysis with pandas.

If you have categorical data and use the groupby() function to group your DataFrame, you should pass the “observed=True” option so that groupby() behaves the same as it does on data which has the data type “object”. With “observed=False”, which was the default when this article was written, groupby() returns a row for every combination of categories, including combinations that never occur in the data.

The example below shows how the “observed=True” option affects the output of groupby() in pandas.

import pandas as pd

df = pd.DataFrame({"animal_type":["dog","cat","dog","cat","dog","dog","cat","cat","dog"], 
                   "gender":["F","M","F","M","M","F","M","M","M"], 
                   "age":[1,2,3,4,5,6,7,8,9], 
                   "weight":[10,20,15,20,25,10,15,30,40]})

df["animal_type"] = df["animal_type"].astype('category')
df["gender"] = df["gender"].astype('category')

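# observed=False (the default when this article was written) keeps every category combination
print(df.groupby(["animal_type","gender"], observed=False)["age"].max())
# observed=True keeps only the combinations that appear in the data
print(df.groupby(["animal_type","gender"], observed=True)["age"].max())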

#Output:
animal_type  gender
cat          F         NaN
             M         8.0
dog          F         6.0
             M         9.0
Name: age, dtype: float64

animal_type  gender
dog          F         6
             M         9
cat          M         8
Name: age, dtype: int64
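To check the claim that “observed=True” matches the behavior on “object” data, you can run the same groupby() on an unconverted copy and compare the results. The snippet below is a minimal sketch of that comparison using the same animal data; the variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"animal_type": ["dog", "cat", "dog", "cat", "dog", "dog", "cat", "cat", "dog"],
                   "gender": ["F", "M", "F", "M", "M", "F", "M", "M", "M"],
                   "age": [1, 2, 3, 4, 5, 6, 7, 8, 9]})

# Keep df as-is and convert the grouping columns on a copy.
cat_df = df.astype({"animal_type": "category", "gender": "category"})
keys = ["animal_type", "gender"]

all_combos = cat_df.groupby(keys, observed=False)["age"].max()
observed_only = cat_df.groupby(keys, observed=True)["age"].max()
plain = df.groupby(keys)["age"].max()

print(len(all_combos))     # 4: includes the unobserved (cat, F) pair
print(len(observed_only))  # 3: only pairs that occur in the data

# The observed=True result holds the same groups and values as the plain groupby.
print(observed_only.to_dict() == plain.to_dict())  # True
```

Passing observed explicitly in both calls keeps the comparison independent of whichever default your pandas version uses.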

Hopefully this article has been useful for you to learn how to convert a pandas column from object to category in Python.
