
The Programming Expert

Solving All of Your Programming Headaches


Writing a Table Faster to Word Document Using python-docx

March 16, 2021 12 Comments

The python-docx package lets us create Word documents from Python, so we can present our data and findings in an automated way.

Many times, we are working with data and want to output this data into a table.

Writing data to a table in a Word document with python-docx is not difficult, but the obvious approach becomes slow as the table grows, so it is worth making this step efficient.

Creating a Table in a Word Document using Python Docx Efficiently

Creating a table and filling the table with data in a Word document using Python Docx is easy.

To create a table, all we need to do is:

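import docx

ROWS, COLS = 3, 4  # example table size

doc = docx.Document()

# Add an empty table with the requested number of rows and columns
table = doc.add_table(rows=ROWS, cols=COLS)

doc.save("output_file_path.docx")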

When writing small tables to Word, we can loop through the table cell by cell and there will not be any performance problems.

Let’s say we are reading some data from an XLSX file and we want to output it in a table in a Word document.

If the data is small, we can easily output the table without problems.

import docx
import pandas as pd

data = pd.read_excel("some_data.xlsx")

doc = docx.Document() 

table = doc.add_table(rows=data.shape[0], cols=data.shape[1])

# Look up each cell by (row, column) and write the value as text
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        table.cell(i, j).text = str(data.values[i, j])

doc.save("output_file_path.docx")

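But the problem is that “table.cell(i,j)” is expensive: in current python-docx releases, every call rebuilds the list of all cells in the table before returning the one you asked for. The work therefore grows roughly with the square of the number of cells, which is very slow for large tables.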
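
To see why this matters, here is a toy model of the cost. It is not python-docx's real code: ToyTable and ToyCell are hypothetical stand-ins, written under the assumption that the full cell list is rebuilt on every lookup. Counting the rebuilds shows the difference between looking cells up one at a time and fetching the list once.

```python
class ToyCell:
    """Hypothetical stand-in for a table cell."""

    def __init__(self):
        self.text = ""


class ToyTable:
    """Hypothetical table whose cell list is rebuilt on every access."""

    def __init__(self, n_rows, n_cols):
        self.n_rows, self.n_cols = n_rows, n_cols
        self._store = [ToyCell() for _ in range(n_rows * n_cols)]
        self.rebuilds = 0  # how many times the full cell list was built

    @property
    def _cells(self):
        # Assumed cost: a fresh list of every cell on each access
        self.rebuilds += 1
        return list(self._store)

    def cell(self, i, j):
        # Row-major lookup that goes through the rebuilt list
        return self._cells[j + i * self.n_cols]


def fill_by_lookup(table):
    """Write every cell through table.cell(i, j)."""
    for i in range(table.n_rows):
        for j in range(table.n_cols):
            table.cell(i, j).text = f"{i},{j}"


def fill_from_cached_list(table):
    """Fetch the cell list once, then index into it."""
    cells = table._cells
    for i in range(table.n_rows):
        for j in range(table.n_cols):
            cells[j + i * table.n_cols].text = f"{i},{j}"


slow, fast = ToyTable(50, 4), ToyTable(50, 4)
fill_by_lookup(slow)
fill_from_cached_list(fast)
print(slow.rebuilds, fast.rebuilds)  # 200 1
```

Both tables end up with the same contents. Only the number of times the list was rebuilt differs, and each rebuild touches every cell.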

The more efficient way is to do the following:

import docx
import pandas as pd

data = pd.read_excel("some_bigger_data.xlsx")

doc = docx.Document() 

table = doc.add_table(rows=data.shape[0], cols=data.shape[1])

# Fetch the full list of cells once (private attribute, row-major order)
table_cells = table._cells

n_rows, n_cols = data.shape
values = data.values
for i in range(n_rows):
    for j in range(n_cols):
        table_cells[j + i * n_cols].text = str(values[i][j])

doc.save("output_file_path.docx")

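The key is to read table._cells once and keep the resulting list. It holds every cell in row-major order, so the cell at row i, column j sits at index j + i * (number of columns). Indexing into that list avoids rebuilding it for every cell, which is where the time goes in the slow version. Keep in mind that the leading underscore marks _cells as a private attribute, so it may change between python-docx versions.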

Hopefully this has helped you write a table from a DataFrame to a Word document faster, so that your processes are more efficient and take less time.

Let me know if you have any questions, and thank you for reading.


About The Programming Expert

The Programming Expert is a compilation of a programmer’s findings in the world of software development, website creation, and automation of processes.

Programming allows us to create amazing applications which make our work more efficient, repeatable and accurate.

At the end of the day, we want to be able to just push a button and let the code do its magic.

You can read more about us on our about page.

Comments

  1. Peter says

    November 18, 2021 at 7:50 am

    Hi,
    I’m facing a problem when using your code, and was wondering if you would help.
    I’m currently looping through 104 different dataframes, and want to place them in 104 different documents. My problem is that when I insert the data, .0 is added at the end of each cell. As an example, when I insert 542780, it gets formatted as 542780.0 in Word.
    Any idea what causes the issue?

    Kind regards Peter

    • Erik says

      November 23, 2021 at 8:33 am

      Hi Peter,

      Have you tried converting your data to strings when writing it to each document? If 542780 is not an integer in your dataframe, then it’s possible that when you write it, it will write it as a float, or with the .0 on the end.

      This will convert a column in your dataframe to string: df['DataFrame Column'] = df['DataFrame Column'].apply(str)
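
One possible cause of the trailing .0, assuming the DataFrame mixes integer and float columns: data.values returns a single NumPy array for the whole frame, so integer columns are upcast to float before str() is applied. The sketch below uses a small made-up DataFrame (the column names are hypothetical) and converts each column to strings on its own with astype(str) before writing.

```python
import pandas as pd

# Made-up data: one integer column and one float column
df = pd.DataFrame({"count": [542780, 12], "ratio": [0.5, 1.25]})

# .values builds one array for the whole frame, so the ints become floats
upcast_text = str(df.values[0, 0])

# Converting column by column keeps the integer formatting
as_text = df.astype(str)
per_column_text = str(as_text.iloc[0, 0])

print(upcast_text)      # 542780.0
print(per_column_text)  # 542780
```

If the column itself is stored as float (for example because it contains missing values), this alone will not remove the .0; casting that column to an integer type first is one option.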

  2. T says

    December 8, 2021 at 1:58 am

    Hey there, I faced a problem where my method of generating a massive table was extremely slow, and this looks like one of the solutions, but when I took a closer look at the code, I noticed that “table_cells” is not used anywhere. The line “table_cells = table._cells” occurs but doesn’t seem to serve a purpose, while seemingly being the central idea behind the optimization? Am I missing something?

    Thanks!

    • Erik says

      December 8, 2021 at 9:21 am

      Thanks for the comment T.

      I’ve edited the post with the correct code – the key is to use table._cells and I made a mistake.

      • T says

        December 8, 2021 at 7:56 pm

        Thanks for the quick reply/fix Erik, the post was very helpful.

  3. Runy says

    February 10, 2022 at 5:11 am

    This is exactly what I needed. Getting data from Excel and creating a table in Word.

    • Erik says

      February 10, 2022 at 8:38 am

      You’re welcome! Thanks for stopping by.

  4. Nina says

    March 23, 2022 at 4:04 am

    Hello, I have a problem when transferring my data from Excel to Docx: the index of my table disappears. Can you please help me fix it?

    • Erik says

      March 23, 2022 at 8:05 am

      If you want to have your index column be output, then you could convert the index to a column with reset_index(). Something like what’s below might be able to help you.

      df.reset_index(inplace=True)
      df = df.rename(columns = {'index':'new column name'})
  5. Ross says

    April 22, 2022 at 3:02 pm

    Hey man,

    You just shaved my program runtime from 8 minutes 39 seconds to 3 seconds.

    Thanks.

    • Erik says

      April 22, 2022 at 3:03 pm

      Wow! That’s incredible!! Glad we could help 🙂

  6. Mohd Hamza says

    August 18, 2022 at 1:49 pm

    WOW! That’s really helpful. Can you tell me how I can add the DataFrame header to the Word table?



Copyright © 2023 · The Programming Expert · About · Privacy Policy