The Programming Expert

Solving All of Your Programming Headaches


How to Iterate over Everything in Word Document using python-docx

November 24, 2021

Many times, when working with documentation, it would be helpful if we could use code to read, create and manipulate files to make processes more efficient.

In many organizations, Microsoft Word files are used for reporting and different processes, and from time to time, we need to update the data stored in these files.

Having to update these files manually can be a nightmare. With Python, we can write a program that does these manipulations for us, and save a lot of headache and time.

With the python-docx package, we can read and modify Word files from Python.

How to Iterate over Everything in Word Document using python-docx

The key to iterating over everything in a Word document with python-docx is the following function, taken from the python-docx GitHub issues section:

import docx
from docx.document import Document as _Document
from docx.text.paragraph import Paragraph
from docx.table import _Cell, Table
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P

def iter_block_items(parent):
    """
    Generate a reference to each paragraph and table child within *parent*,
    in document order. Each returned value is an instance of either Table or
    Paragraph. *parent* would most commonly be a reference to a main
    Document object, but also works for a _Cell object, which itself can
    contain paragraphs and tables.
    """
    if isinstance(parent, _Document):
        parent_elm = parent.element.body
        # print(parent_elm.xml)
    elif isinstance(parent, _Cell):
        parent_elm = parent._tc
    else:
        raise ValueError("something's not right")

    for child in parent_elm.iterchildren():
        if isinstance(child, CT_P):
            yield Paragraph(child, parent)
        elif isinstance(child, CT_Tbl):
            yield Table(child, parent)

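The function above yields each top-level paragraph and table in a Word document, in document order. It does not descend into tables on its own: to reach the paragraphs and nested tables inside a table, call the function again on each cell in the table's rows. We can then iterate over a given Word document like so: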

doc = docx.Document("/path/to/your/word.docx")

for block in iter_block_items(doc):
    if isinstance(block, Table):
        # this is a table
        # do something here
        pass
    else:
        # this is a paragraph
        # do something else here
        pass

Something that I find useful when working with Word documents is keeping track of the blocks around the current element. For example, I might want to keep track of the previous block so that if the previous block is something important, I can add styling or content around it.

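doc = docx.Document("/path/to/your/word.docx")

# there is no previous block before the first iteration
previous_block = None

for block in iter_block_items(doc):
    if isinstance(block, Table):
        # this is a table
        # do something here
        pass
    else:
        # this is a paragraph
        # do something else here
        pass
    # remember this block for the next iteration
    previous_block = block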
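
The previous-block bookkeeping can also be factored into a small generator so the loop body stays focused on the blocks themselves. This is a sketch in plain Python rather than anything python-docx provides: the name with_previous is hypothetical, and the strings below merely stand in for the blocks that iter_block_items would yield.

```python
def with_previous(items):
    """Yield (previous, current) pairs; previous is None for the first item."""
    previous = None
    for current in items:
        yield previous, current
        previous = current


# Strings stand in for real Paragraph and Table blocks in this illustration.
blocks = ["Heading", "Table", "Paragraph"]
pairs = list(with_previous(blocks))

for previous_block, block in pairs:
    if previous_block is None:
        print(f"{block} is the first block")
    else:
        print(f"{block} follows {previous_block}")
```

With a real document, the loop would read for previous_block, block in with_previous(iter_block_items(doc)), and the isinstance checks from the loops above go inside it.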

Hopefully, this helps you with automating a Microsoft Word document process using Python.

Copyright © 2023 · The Programming Expert