The python-docx package lets us create Word documents from Python, so we can present our data and findings in an automated way.
Many times, we are working with data and want to output this data into a table.
Outputting data to a table in a Word document with Python is not difficult, but on large tables the obvious approach gets slow, so it pays to fill the table efficiently.
Creating a Table in a Word Document using Python Docx Efficiently
Creating a table and filling the table with data in a Word document using Python Docx is easy.
To create a table, all we need to do is:
import docx
doc = docx.Document()
table = doc.add_table(rows=ROWS, cols=COLS)
doc.save("output_file_path.docx")
When writing small tables to Word, we can loop through the rows and columns and set each cell directly without any noticeable performance problems.
Let’s say we are reading some data from an XLSX file and we want to output it in a table in a Word document.
If the data is small, we can easily output the table without problems.
import docx
import pandas as pd
# Read the spreadsheet into a DataFrame.
data = pd.read_excel("some_data.xlsx")
doc = docx.Document()
table = doc.add_table(rows=data.shape[0], cols=data.shape[1])
# Fill the table one cell at a time.
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        table.cell(i, j).text = str(data.values[i, j])
doc.save("output_file_path.docx")
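The problem is that every call to “table.cell(i,j)” makes python-docx rebuild its list of all the cells in the table before handing one of them back. The work per call grows with the size of the table, so filling a big table this way is very slow.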
The more efficient way is to do the following:
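import docx
import pandas as pd
data = pd.read_excel("some_bigger_data.xlsx")
doc = docx.Document()
table = doc.add_table(rows=data.shape[0], cols=data.shape[1])
# Build the list of cells once (_cells is a private python-docx attribute).
table_cells = table._cells
# Convert the DataFrame to an array once, outside the loop.
values = data.values
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        # Cells are stored row by row, so cell (i, j) sits at j + i * number of columns.
        table_cells[j + i * data.shape[1]].text = str(values[i, j])
doc.save("output_file_path.docx")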
The key is to use table._cells to “pop” out the cells from the table once, before the loop. The cells come back as one flat list in row order, so we can index into that list instead of calling table.cell() for every cell, and python-docx no longer rebuilds the cell list on each call. Keep in mind that _cells starts with an underscore, which means it is a private attribute and could change in a future version of python-docx.
Hopefully this helps you write a table from a dataframe to a Word document faster and cuts down the run time of your own programs.
Let me know if you have any questions, and thank you for reading.
Hi,
I’m facing a problem when using your code, and was wondering if you could help.
I’m currently looping through 104 different dataframes, and want to place them in 104 different documents. My problem is that when I insert the data, .0 is added at the end of each cell. For example, when I insert 542780, it gets formatted as 542780.0 in Word.
Any idea what causes the issue?
Kind regards Peter
Hi Peter,
Have you checked the type of that column in your dataframe? If 542780 is stored as a float rather than an integer, then str() will write it with the .0 on the end.
Assuming the column has no missing values, this will convert a column in your dataframe to integers and then to strings: df['DataFrame Column'] = df['DataFrame Column'].astype(int).astype(str)
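If we would rather not change the dataframe itself, another option is to format each value as we write it. Here is a small sketch under that assumption. The helper name format_cell is made up for this example. It only drops the decimal part when the float is a whole number, and leaves everything else as str() would write it.

```python
def format_cell(value):
    """Return the text for a cell, writing whole-number floats without '.0'.

    Hypothetical helper for illustration, not part of python-docx or pandas.
    """
    # NaN and infinity are not whole numbers, so they fall through to str().
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)
```

Inside the loop, we would then write format_cell(...) in place of str(...) when setting the text of each cell.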
Hey there, I faced a problem where my method of generating a massive table was extremely slow, and this looks like one of the solutions, but when I took a closer look at the code, I noticed that “table_cells” is not used anywhere. The line “table_cells = table._cells” occurs but doesn’t seem to serve a purpose, while seemingly being the central idea behind the optimization? Am I missing something?
Thanks!
Thanks for the comment T.
I’ve edited the post with the correct code – the key is to use table._cells and I made a mistake.
Thanks for the quick reply/fix Erik, the post was very helpful.
This is exactly what I needed. Getting data from excel and creating a table in Word.
You’re welcome! Thanks for stopping by.
Hello, I have a problem when moving my data from Excel to Docx: the index of my table disappears. Can you please help me fix it?
If you want your index column to be output, then you could convert the index to a regular column with reset_index() before writing the table, for example df = df.reset_index().
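Here is a short sketch of the reset_index() idea. The dataframe, the column name "sales" and the index name "region" are made up for this example.

```python
import pandas as pd

# Example data with a meaningful index (the names here are illustrative only).
df = pd.DataFrame({"sales": [10, 20]}, index=["north", "south"])
df.index.name = "region"

# reset_index() moves the index into an ordinary first column,
# so it gets written to the table like any other column.
with_index = df.reset_index()
```

After this, with_index has one more column than df, so the table needs to be created from the shape of with_index rather than df.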
Hey man,
You just shaved my program runtime from 8 minutes 39 seconds to 3 seconds.
Thanks.
Wow! That’s incredible!! Glad we could help 🙂
Wow! That’s really helpful. Can you tell me how I can add the dataframe header to the Word table?