• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar

The Programming Expert

Solving All of Your Programming Headaches

  • HTML
  • JavaScript
  • jQuery
  • PHP
  • Python
  • SAS
  • Ruby
  • About
You are here: Home / Python / How to Read Excel File from AWS S3 Bucket Using Python

How to Read Excel File from AWS S3 Bucket Using Python

February 5, 2023 Leave a Comment

To read an Excel file from an AWS S3 Bucket using Python and pandas, you can use the boto3 package to access the S3 bucket. After accessing the S3 bucket, you can use the get_object() method to get the file by its name. Finally, you can use the pandas read_excel() function on the Bytes representation of the file obtained by the io BytesIO() function.

import pandas as pd
import io
import boto3

s3c = boto3.client('s3', region_name="us-east-2",aws_access_key_id="YOUR AWS_ACCESS_KEY_ID",aws_secret_access_key="YOUR AWS_SECRET_ACCESS_KEY")
obj = s3c.get_object(Bucket="YOUR-BUCKET",Key="FILENAME")
df = pd.read_excel(io.BytesIO(obj["Body"].read()))

When working with different datasets and file types, the ability to easily read and work with these different datasets is useful.

One such case is if you have data in an AWS S3 bucket and you want to read it into your Python program.

You can use the boto3 package which allows you to create, configure and manage AWS services.

With boto3, you can access the data in an AWS S3 bucket.

To start, you need to connect to AWS. This is done by first using the boto3 client function. You should pass your access key and secret access key here to authenticate.

Next, we want to get the object in question. To get the file, you can use the boto3 get_object() function and pass the bucket name and file name to the Bucket and Key parameters, respectively.

Now that we have the file, we can read it.

The get_object() function returns a dictionary with a few different pieces of information but what we care about is the “body” of the object.

The pandas read_excel() function can read from a file path or a buffer. Therefore, to read the Excel file from the AWS S3 bucket, one solution would be to read the “body” of the object, convert it to bytes and then read it with read_excel().

Below shows the entire code of how to read an Excel file from an AWS S3 bucket.

import pandas as pd
import io
import boto3

s3c = boto3.client('s3', region_name="us-east-2",aws_access_key_id="YOUR AWS_ACCESS_KEY_ID",aws_secret_access_key="YOUR AWS_SECRET_ACCESS_KEY")
obj = s3c.get_object(Bucket="YOUR-BUCKET",Key="FILENAME")
df = pd.read_excel(io.BytesIO(obj["Body"].read()))

How to Read Pickle Files and CSV Files from AWS S3 Buckets in Python

If you want to read pickle files or read csv files from an AWS S3 Bucket, then you can follow the same code structure as above.

read_pickle() and read_csv() both allow you to pass a buffer, and so you can use io.BytesIO() to create the buffer.

Below shows an example of how you could read a pickle file from an AWS S3 bucket using Python and pandas.

import pandas as pd
import io
import boto3

s3c = boto3.client('s3', region_name="us-east-2",aws_access_key_id="YOUR AWS_ACCESS_KEY_ID",aws_secret_access_key="YOUR AWS_SECRET_ACCESS_KEY")
obj = s3c.get_object(Bucket="YOUR-BUCKET",Key="FILENAME")
df = pd.read_pickle(io.BytesIO(obj["Body"].read()))

For reading a csv file from an AWS bucket, the code has the same structure.

import pandas as pd
import io
import boto3

s3c = boto3.client('s3', region_name="us-east-2",aws_access_key_id="YOUR AWS_ACCESS_KEY_ID",aws_secret_access_key="YOUR AWS_SECRET_ACCESS_KEY")
obj = s3c.get_object(Bucket="YOUR-BUCKET",Key="FILENAME")
df = pd.read_csv(io.BytesIO(obj["Body"].read()))

How to Write Excel File to an AWS S3 Bucket Using Python

If you want to write an Excel file to an AWS S3 Bucket, then you can do something similar as we have done above, but now you will use the boto3 put_object() function.

To write an Excel file to an AWS S3 Bucket using Python and pandas, you can use the boto3 package to access the S3 bucket.

After accessing the S3 bucket, you need to create a file buffer with the io BytesIO() function. Then, write the Excel file to the file buffer with the pandas to_excel() function.

Finally, you can use the put_object() method to send the pickle file to a specified file location in the AWS S3 Bucket.

import pandas as pd
import io
import boto3

s3c = boto3.client('s3', region_name="us-east-2",aws_access_key_id="YOUR AWS_ACCESS_KEY_ID",aws_secret_access_key="YOUR AWS_SECRET_ACCESS_KEY")
excel_buffer = io.BytesIO()
df.to_excel(excel_buffer)
s3c.put_object(Body=excel_buffer.getvalue(),Bucket="YOUR-BUCKET",Key="FILENAME")

Hopefully this article has been useful for you to learn how to read an Excel file from an AWS S3 Bucket using Python and the pandas module.

Other Articles You'll Also Like:

  • 1.  Python tanh – Find Hyperbolic Tangent of Number Using math.tanh()
  • 2.  Python dirname – Get Directory Name of Path using os.path.dirname()
  • 3.  Factorial Program in Python Using For Loop and While Loop
  • 4.  How to Declare Variable Without Value in Python
  • 5.  Inverting Dictionary Variables in Python with Dictionary Comprehension
  • 6.  Convert String to Datetime in pandas with pd.to_datetime()
  • 7.  Using Python to Check if String Contains Only Letters
  • 8.  Using Python to Convert Float to Int
  • 9.  Python nth Root – Find nth Root of Number with math.pow() Function
  • 10.  How to Check if String Contains Uppercase Letters in Python

About The Programming Expert

The Programming Expert is a compilation of a programmer’s findings in the world of software development, website creation, and automation of processes.

Programming allows us to create amazing applications which make our work more efficient, repeatable and accurate.

At the end of the day, we want to be able to just push a button and let the code do it’s magic.

You can read more about us on our about page.

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Primary Sidebar

About The Programming Expert

the programming expert main image

Welcome to The Programming Expert. We are a group of US-based programming professionals who have helped companies build, maintain, and improve everything from simple websites to large-scale projects.

We built The Programming Expert to help you solve your programming problems with useful coding methods and functions in various programming languages.

Search

Learn Coding from Experts on Udemy

Looking to boost your skills and learn how to become a programming expert?

Check out the links below to view Udemy courses for learning to program in the following languages:

Copyright © 2023 · The Programming Expert · About · Privacy Policy