Compare PDF output in Java for snapshot testing

When you develop an application that outputs PDFs such as forms, have you ever wanted to test the final PDF automatically, layout included? This article shows how to run regression tests that cover the layout by rendering two PDF files to images and comparing them.

Basic idea

Confirmation environment

Rendering the PDF to images

Rendering a PDF to images is easier than you might expect with Apache PDFBox (https://pdfbox.apache.org/). Because the goal is an automated regression test, the test fails immediately, without comparing any pixels, if the number of pages or the page size differs.

A higher rendering DPI produces higher-resolution images and therefore a more precise comparison, but it requires correspondingly more machine resources (CPU, memory).
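To get a feel for that trade-off, here is a rough sketch that estimates the pixel size and memory footprint of one rendered page. It assumes the default PDF unit of 1/72 inch and 4 bytes per pixel for an int-backed RGB image; the class and method names (DpiEstimate, pixels, approxBytes) are hypothetical, and the exact rounding PDFBox applies may differ slightly.

```java
class DpiEstimate {
    // Assumption: PDF user space uses 72 units (points) per inch by default.
    static final double POINTS_PER_INCH = 72.0;

    // Approximate pixel length of a page side given in points, rendered at the given DPI.
    public static int pixels(double points, int dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    // Approximate raster size in bytes, assuming 4 bytes per pixel.
    public static long approxBytes(double widthPt, double heightPt, int dpi) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * 4L;
    }

    public static void main(String[] args) {
        // A4 is roughly 595 x 842 points (rounded for this illustration).
        double widthPt = 595.0;
        double heightPt = 842.0;
        for (int dpi : new int[] {72, 144, 300}) {
            System.out.println(dpi + " dpi: "
                    + pixels(widthPt, dpi) + " x " + pixels(heightPt, dpi) + " px, about "
                    + approxBytes(widthPt, heightPt, dpi) + " bytes per image");
        }
    }
}
```

Under these assumptions an A4 page at 144 DPI comes out at about 1190 x 1684 pixels, roughly 8 MB per image, and the comparison holds two such images in memory for each page.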

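// Written against the PDFBox 2.x API (PDDocument.load) and JUnit 5 assertions.
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

static void assertPdfEquals(InputStream expected, InputStream actual) throws IOException {
    try (PDDocument doc1 = PDDocument.load(expected);
         PDDocument doc2 = PDDocument.load(actual)) {
        // Fail the test if the number of pages differs
        assertEquals(doc1.getNumberOfPages(), doc2.getNumberOfPages());

        PDFRenderer renderer1 = new PDFRenderer(doc1);
        PDFRenderer renderer2 = new PDFRenderer(doc2);
        for (int i = 0; i < doc1.getNumberOfPages(); i++) {
            BufferedImage image1 = renderer1.renderImageWithDPI(i, 144, ImageType.RGB);
            BufferedImage image2 = renderer2.renderImageWithDPI(i, 144, ImageType.RGB);

            // Also fail the test if the rendered page sizes differ
            assertEquals(image1.getWidth(), image2.getWidth());
            assertEquals(image1.getHeight(), image2.getHeight());

            // Compare the images and write a diff image to a temporary file;
            // the file path becomes the failure message if they do not match
            Path path = Files.createTempFile("diff-" + i + "-", ".png");
            try (OutputStream os = Files.newOutputStream(path)) {
                assertTrue(compareImage(image1, image2, os), path.toString());
            }
        }
    }
}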

Comparing the images

Comparing images is not particularly difficult as long as you only check for an exact match of RGB values, pixel by pixel. The point is not just to compare, but also to repaint the mismatched pixels in a highlight color so that the result is a diff image.

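// Both images must have the same dimensions (checked by the caller).
// Note that image1 is modified in place and becomes the diff image.
static boolean compareImage(BufferedImage image1, BufferedImage image2, OutputStream os) throws IOException {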
    boolean matched = true;
    for (int x = 0; x < image1.getWidth(); x++) {
        for (int y = 0; y < image1.getHeight(); y++) {
            int p1 = image1.getRGB(x, y);
            int p2 = image2.getRGB(x, y);
            //Pixels that match are left as they are, and pixels that do not match are changed to magenta.
            if (p1 != p2) {
                matched = false;
                image1.setRGB(x, y, Color.MAGENTA.getRGB());
            }
        }
    }
    //Output the difference image
    if (os != null) {
        ImageIO.write(image1, "png", os);
    }
    return matched;
}
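As a quick check of this behavior without any PDF involved, the following sketch runs the same comparison on two tiny in-memory images that differ in a single pixel. The class name CompareImageDemo and the helper blank are hypothetical; compareImage is repeated only so that the example compiles on its own.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.imageio.ImageIO;

class CompareImageDemo {
    // Same logic as compareImage in the article, repeated to keep this sketch self-contained.
    public static boolean compareImage(BufferedImage image1, BufferedImage image2, OutputStream os)
            throws IOException {
        boolean matched = true;
        for (int x = 0; x < image1.getWidth(); x++) {
            for (int y = 0; y < image1.getHeight(); y++) {
                if (image1.getRGB(x, y) != image2.getRGB(x, y)) {
                    matched = false;
                    image1.setRGB(x, y, Color.MAGENTA.getRGB());
                }
            }
        }
        if (os != null) {
            ImageIO.write(image1, "png", os);
        }
        return matched;
    }

    // Hypothetical helper: an all-white RGB image of the given size.
    public static BufferedImage blank(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                image.setRGB(x, y, Color.WHITE.getRGB());
            }
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage expected = blank(4, 4);
        BufferedImage actual = blank(4, 4);
        actual.setRGB(2, 1, Color.BLACK.getRGB()); // introduce a one-pixel difference

        ByteArrayOutputStream diff = new ByteArrayOutputStream();
        boolean matched = compareImage(expected, actual, diff);

        System.out.println("matched: " + matched);
        System.out.println("pixel (2,1) highlighted: "
                + (expected.getRGB(2, 1) == Color.MAGENTA.getRGB()));
        System.out.println("diff PNG size in bytes: " + diff.size());
    }
}
```

Because the first image is overwritten with the highlight color, keep a copy if you still need the original rendering after the comparison.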

Difference image output example

Compared with the expected PDF, the actual PDF has a date added to the heading, item 3 removed from the item list, and different subtotals and totals as a result of removing item 3. All of this can be read from the diff image.

Supplement

    //Matched pixels are black, unmatched pixels are white
    if (p1 == p2) {
        image1.setRGB(x, y, Color.BLACK.getRGB());
    } else {
        matched = false;
        image1.setRGB(x, y, Color.WHITE.getRGB());
    }
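The fragment above appears to be a drop-in replacement for the if block in compareImage that produces a black-and-white mask instead of a highlighted page. As one possible variation, the sketch below builds the same mask in a separate image so that neither input is modified; the class and method names (DiffMaskDemo, diffMask) are hypothetical and are not part of the original code.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

class DiffMaskDemo {
    // Returns a new image: black where the pixels match, white where they differ.
    // Assumes both images have the same dimensions.
    public static BufferedImage diffMask(BufferedImage image1, BufferedImage image2) {
        BufferedImage mask = new BufferedImage(
                image1.getWidth(), image1.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < image1.getWidth(); x++) {
            for (int y = 0; y < image1.getHeight(); y++) {
                boolean same = image1.getRGB(x, y) == image2.getRGB(x, y);
                mask.setRGB(x, y, same ? Color.BLACK.getRGB() : Color.WHITE.getRGB());
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(3, 3, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(3, 3, BufferedImage.TYPE_INT_RGB);
        b.setRGB(1, 1, Color.RED.getRGB()); // one differing pixel

        BufferedImage mask = diffMask(a, b);
        System.out.println("center is white: " + (mask.getRGB(1, 1) == Color.WHITE.getRGB()));
        System.out.println("corner is black: " + (mask.getRGB(0, 0) == Color.BLACK.getRGB()));
    }
}
```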


Summary

This article showed how to snapshot-test PDF output by converting it to images. With output such as PDF, there is a limit to how reliably people can confirm by eye that the data is not merely present but displayed in the correct layout.

With the method introduced here, you can detect regressions in the application you implement, and you will also notice unintended layout changes when you upgrade the PDF output library, so you can develop with more peace of mind.
