
The Programming Expert

Solving All of Your Programming Headaches


How to Iterate over Everything in Word Document using python-docx

November 24, 2021

When working with documentation, it is often helpful to use code to read, create, and manipulate files so that processes become more efficient.

In many organizations, Microsoft Word files are used for reporting and different processes, and from time to time, we need to update the data stored in these files.

Updating these files manually can be a nightmare. With Python, we can write a program that makes these changes for us and saves a lot of time and headaches.

Using python-docx, we can easily manipulate Word files using Python.

How to Iterate over Everything in Word Document using python-docx

The key to iterating over everything in a Word document using python-docx is the following function, which comes from the python-docx GitHub issues section:

import docx
from docx.document import Document
from docx.text.paragraph import Paragraph
from docx.table import _Cell, Table
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P

def iter_block_items(parent):
    """
    Generate a reference to each paragraph and table child within *parent*,
    in document order. Each returned value is an instance of either Table or
    Paragraph. *parent* would most commonly be a reference to a main
    Document object, but also works for a _Cell object, which itself can
    contain paragraphs and tables.
    """
    if isinstance(parent, Document):
        parent_elm = parent.element.body
        # print(parent_elm.xml)
    elif isinstance(parent, _Cell):
        parent_elm = parent._tc
    else:
        raise ValueError("something's not right")

    for child in parent_elm.iterchildren():
        if isinstance(child, CT_P):
            yield Paragraph(child, parent)
        elif isinstance(child, CT_Tbl):
            yield Table(child, parent)

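The code above yields each top-level paragraph and table in a Word document, in document order. Paragraphs and tables nested inside the cells of a table are not returned directly; to reach them, call the function again on each cell of the table. We can then iterate over a given Word document like so: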

doc = docx.Document("/path/to/your/word.docx")

for block in iter_block_items(doc):
    if isinstance(block, Table):
        # This is a table.
        # Do something here.
        pass
    else:
        # This is a paragraph.
        # Do something else here.
        pass
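Because a table's cells can hold their own paragraphs and tables, reaching truly everything means descending into each cell. Below is a minimal sketch of that idea, not part of python-docx or the original snippet. The names `walk_blocks` and `collect_paragraph_text` are hypothetical, and the sketch assumes the function passed as `iter_children` behaves like `iter_block_items` above. It detects tables by checking for a `rows` attribute so that it runs without importing python-docx; in real code, `isinstance(block, Table)` is the equivalent check.

```python
def walk_blocks(parent, iter_children):
    """Yield every block under *parent*, descending into table cells.

    *iter_children* is a callable (for example iter_block_items) that
    returns the direct paragraph and table children of a document or cell.
    """
    for block in iter_children(parent):
        yield block
        # Tables expose .rows and paragraphs do not.
        if hasattr(block, "rows"):
            for row in block.rows:
                for cell in row.cells:
                    # A cell is itself a container, so recurse into it.
                    yield from walk_blocks(cell, iter_children)


def collect_paragraph_text(parent, iter_children):
    """Return the text of every paragraph, including those inside tables."""
    return [
        block.text
        for block in walk_blocks(parent, iter_children)
        if not hasattr(block, "rows")  # skip the table objects themselves
    ]


# Assumed usage with the function above and a python-docx document:
# texts = collect_paragraph_text(doc, iter_block_items)
```

One caveat to check against your own files: for tables with merged cells, `row.cells` may return the same cell more than once, so text from merged cells can appear repeatedly.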

Something that I find useful when working with Word documents is keeping track of the blocks around the current element. For example, I might want to keep track of the previous block so that if the previous block is something important, I can add styling or content around it.

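doc = docx.Document("/path/to/your/word.docx")

# There is no previous block on the first iteration.
previous_block = None

for block in iter_block_items(doc):
    if isinstance(block, Table):
        # This is a table.
        # Do something here, using previous_block if needed.
        pass
    else:
        # This is a paragraph.
        # Do something else here.
        pass
    previous_block = block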
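The previous-block bookkeeping can also be moved into a small generator, which keeps the loop body free of tracking code. This is only a sketch: `with_previous` is a hypothetical helper name, and plain strings stand in for the paragraphs and tables so that the example runs without a Word file.

```python
def with_previous(items):
    """Yield (previous, current) pairs; previous is None for the first item."""
    previous = None
    for item in items:
        yield previous, item
        previous = item


# Plain strings stand in for the blocks of a document here.
pairs = list(with_previous(["heading", "table", "paragraph"]))
# pairs == [(None, "heading"), ("heading", "table"), ("table", "paragraph")]

# Assumed usage with a python-docx document and iter_block_items:
# for previous_block, block in with_previous(iter_block_items(doc)):
#     if previous_block is not None and isinstance(block, Table):
#         ...  # react to whatever came right before this table
```

Since the helper accepts any iterable, it works with `iter_block_items(doc)` as well as with a list of blocks collected earlier.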

Hopefully, this helps you with automating a Microsoft Word document process using Python.
