Highlight Text In PDF With Different Colors Using Python
About this Post
In this post I share a simple Python script that highlights text in a PDF with different colors.
Prerequisite
- Python 3 (current PyMuPDF releases do not support Python 2)
Python Package Required
- PyMuPDF (imported as fitz; install with: pip install PyMuPDF)
Sample File
Link - https://easyupload.io/2hiobb
Python Code
# PyMuPDF is imported under the name fitz
import fitz
from fitz.utils import getColor

# Read the PDF file as bytes and open it from memory
with open("sample.pdf", "rb") as f:
    data = f.read()
doc = fitz.open(stream=data, filetype="pdf")


# Highlight every occurrence of `text` in the document with the named color
def highlight(document, text, color_name):
    # Highlight annotations take their color from the "stroke" entry
    color = {"stroke": getColor(color_name)}
    # Loop through the pages one by one (the sample PDF has only one page)
    for page in document:
        # search_for returns a list of Rect objects, one per match.
        # Older PyMuPDF releases used camelCase names instead:
        # searchFor, addHighlightAnnot, setColors.
        for inst in page.search_for(text.strip()):
            # An annotation is an extra object added to the page (here, a highlight)
            annot = page.add_highlight_annot(inst)
            # Set the highlight color
            annot.set_colors(color)
            # Apply the change to the annotation
            annot.update()


# The sample PDF contains the words "red", "green", "blue", "pink", "yellow",
# "brown", "purple" and "orange". Each word is highlighted in its own color.
color_list = ["red", "green", "blue", "pink", "yellow", "brown", "purple", "orange"]
for val in color_list:
    highlight(doc, val, val)

# Save the output PDF
doc.save("new.pdf")
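The script above picks colors by name through getColor. PyMuPDF represents a color as an (r, g, b) tuple of floats between 0 and 1, so if you need a shade that has no name you can build the tuple yourself. Below is a minimal sketch of that idea; hex_to_rgb is a hypothetical helper written for this post, not part of PyMuPDF, and it assumes the color is given as a 6-digit hex string.

```python
def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' (or 'RRGGBB') to an (r, g, b) tuple of floats in 0..1."""
    value = hex_color.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a color like '#FFA500', got %r" % hex_color)
    # Each pair of hex digits is one channel in the range 0..255
    return tuple(int(value[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


# Assumed usage: this dict can stand in for the color dict built in highlight()
color = {"stroke": hex_to_rgb("#FFA500")}
```

With this in place, the color dict can be passed to the annotation in the same way as the named colors in the script.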