My daughter, an elementary school student, scans the math problems she got wrong and combines them into a PDF. When I try to print only the pages we need, though, the page numbers are often way off and I end up redoing the print job many times. For something I digitized on purpose, that is surprisingly analog. So I wanted to add page numbers to PDF files automatically. That should either prevent page-number mix-ups caused by human error or at least make them obvious right away.
However, finding a good way to number the pages of a PDF turned out to be surprisingly hard. I searched and tried a few web services, but some would not put the number where I wanted it, and others charge a fee (a monthly subscription, no less) when the PDF has many pages.
So I looked into whether I could number PDF pages with Python, which I have recently become comfortable with. It turned out that no PDF library covers this use case, at least not on its own.
This article was also posted on https://achiwa912.github.io/.
The best-known PDF libraries for Python seem to be PyPDF2 and pdfrw. They are good at merging multiple PDF files, splitting them apart, and reordering pages, but they do not seem to support the use case of "adding page numbers to an existing PDF file".
Upon further investigation, I found that the ReportLab library seemed to be able to number pages. https://www.blog.pythonlibrary.org/2013/08/12/reportlab-how-to-add-page-numbers/
The title of this web page is very promising, but the sample code raised doubts. It does not read an existing PDF file at all; it puts page numbers on newly generated PDF pages. That does not help here.
I also read through the manual. https://www.reportlab.com/docs/reportlab-userguide.pdf
Again, there was no explanation of how to read an existing PDF file.
Still, I kept searching without giving up, and finally found it. https://stackoverflow.com/questions/28281108/reportlab-how-to-add-a-footer-to-a-pdf-file
Stack Overflow, as expected! I love it as much as Qiita in Japan. According to the answer, it seems possible by combining ReportLab and pdfrw. There is one worrying note, though:
DISCLAIMER: Tested on Linux using as input file a pdf file generated by Reportlab. It would probably not work in an arbitrary pdf file.
In other words, it was tested only with a PDF file generated by ReportLab and probably will not work with arbitrary PDF files.
... What!? But this is the only lead I have. Let's try it.
Let's modify the sample code from the Stack Overflow page.
from reportlab.pdfgen.canvas import Canvas
from pdfrw import PdfReader
from pdfrw.toreportlab import makerl
from pdfrw.buildxobj import pagexobj
import sys
import os

# Expect exactly one argument: the PDF file to number.
if len(sys.argv) != 2 or not sys.argv[1].lower().endswith(".pdf"):
    print(f"Usage: python {sys.argv[0]} <pdf filename>")
    sys.exit(1)

input_file = sys.argv[1]
# Write the result next to the input, e.g. foo.pdf -> foo_pgn.pdf.
output_file = os.path.splitext(sys.argv[1])[0] + "_pgn.pdf"

# Read the existing PDF and wrap each page as a form XObject.
reader = PdfReader(input_file)
pages = [pagexobj(p) for p in reader.pages]

# The canvas uses ReportLab's default page size (A4).
canvas = Canvas(output_file)

for page_num, page in enumerate(pages, start=1):
    # Draw the original page, then the page number on top of it.
    canvas.doForm(makerl(canvas, page))

    footer_text = f"{page_num}/{len(pages)}"
    canvas.saveState()
    canvas.setStrokeColorRGB(0, 0, 0)
    canvas.setFont('Times-Roman', 14)
    canvas.drawString(290, 10, footer_text)
    canvas.restoreState()

    canvas.showPage()

canvas.save()
And when I ran it... huh, it just worked. For reference, the 7/88 at the bottom of the page is the page number this script added. What was that disclaimer about?
The script uses f-strings, so run it with Python 3.6 or later.
PDF library installation
pip install reportlab
pip install pdfrw
Save the above code as addpagenum.py (use any file name you like).
Run
python addpagenum.py <pdf_filename>
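The script writes its result next to the input file, with `_pgn` appended to the base name. As a small sketch of that naming rule, here it is pulled out into a helper; the name `output_name` and the sample file names are made up for illustration and are not part of the original script.

```python
import os


def output_name(input_file, suffix="_pgn"):
    """Return the output path the script would use for input_file.

    Hypothetical helper: mirrors the os.path.splitext line in the script.
    """
    base, _ext = os.path.splitext(input_file)
    return base + suffix + ".pdf"


if __name__ == "__main__":
    # A made-up file name, just to show the rule.
    print(output_name("math_mistakes.pdf"))  # prints math_mistakes_pgn.pdf
```

The original file is left untouched, so a numbered copy can be regenerated at any time.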
The script assumes A4 pages and draws the page number at the bottom center of the page.
Change the following lines as needed.
footer_text = f"{page_num}/{len(pages)}"
canvas.setFont('Times-Roman', 14)
canvas.drawString(290, 10, footer_text)
- Change footer_text to change the displayed content.
- To change where the page number is displayed, change x = 290 and y = 10.
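As one illustration of changing the displayed content, the sketch below builds a few alternative footer strings. The helper `make_footer` and its `style` names are hypothetical and not part of the original script; its return value would be assigned to `footer_text` inside the loop.

```python
def make_footer(page_num, total, style="fraction"):
    """Build the footer text for one page (hypothetical helper)."""
    if style == "fraction":
        # Same format as the original script, e.g. 7/88.
        return f"{page_num}/{total}"
    if style == "long":
        # Spelled-out form, e.g. Page 7 of 88.
        return f"Page {page_num} of {total}"
    if style == "plain":
        # Just the page number, e.g. 7.
        return str(page_num)
    raise ValueError(f"unknown footer style: {style}")
```

Because `drawString` starts the text at x, a longer footer extends further to the right of x = 290 and may need a smaller x to still look centered.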
On a ReportLab canvas, the coordinates (x = 0, y = 0) are the bottom-left corner of the page. To use a page size other than A4, such as Letter, specify it when creating the canvas object. See the ReportLab manual for details.
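Here is a minimal sketch of both points, assuming Letter-sized output. The `pagesize` argument of `Canvas` and the `drawCentredString` method are ReportLab features; the helper names `footer_anchor`, `make_canvas`, and `draw_footer` are made up for this example, and the page sizes are written out as rounded point values rather than imported from `reportlab.lib.pagesizes`.

```python
# Page sizes in points (1 point = 1/72 inch), as (width, height).
# Rounded values; reportlab.lib.pagesizes offers A4 and letter as constants.
A4 = (595.27, 841.89)
LETTER = (612.0, 792.0)


def footer_anchor(pagesize, bottom_margin=10):
    """Return (x, y) for a footer centered horizontally on the page.

    The origin (0, 0) is the bottom-left corner, so x is half the page
    width and y is the distance up from the bottom edge.
    """
    width, _height = pagesize
    return width / 2, bottom_margin


def make_canvas(output_file, pagesize=LETTER):
    """Create a canvas with an explicit page size instead of the default."""
    # Imported here so the helpers above work without ReportLab installed.
    from reportlab.pdfgen.canvas import Canvas
    return Canvas(output_file, pagesize=pagesize)


def draw_footer(canvas, text, pagesize):
    """Draw text centered at the bottom of the current page."""
    x, y = footer_anchor(pagesize)
    # drawCentredString centers the text on x, whereas drawString
    # starts the text at x.
    canvas.drawCentredString(x, y, text)
```

In the script, this would mean creating the canvas with `make_canvas(output_file)` and replacing the `drawString(290, 10, footer_text)` call with `draw_footer(canvas, footer_text, LETTER)`.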