PDFs, images and text combined in real time
We were writing a ColdFusion application that allows users to claim expenses and add PDFs or images of receipts electronically. The client requested that a single PDF for each claim be created which included both the claim details and ALL electronic attachments.
Using <cfpdf> chewed memory and created huge files
ColdFusion has built-in PDF functionality, but we found that adding images AND merging other PDFs on the fly performed poorly and produced documents with bloated file sizes.
Using iText provided a solution
In the end we decided to drop into the underlying Java iText libraries and call them directly from ColdFusion. The results were vastly more performant and the file sizes much more manageable – resembling the sum of the files merged.
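For readers who have not called Java from ColdFusion before, here is a minimal sketch of the pattern. It assumes the `com.lowagie` iText classes are available on your ColdFusion classpath and that your engine supports `try`/`finally` in cfscript; the file path is hypothetical.

```cfml
<cfscript>
// Hypothetical path: point this at any PDF on your server.
pdfPath = expandPath("./sample.pdf");

// createObject() loads the Java class; init() calls its constructor.
reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init(pdfPath);
try {
    writeOutput("Pages: " & reader.getNumberOfPages());
} finally {
    // Release the underlying file even if something above throws.
    reader.close();
}
</cfscript>
```

The same `createObject("java", ...).init(...)` pattern is all that is needed to reach the other iText classes.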
Baseline test
We ran a simple comparison between cfpdf and iText using three reasonably sized files. The aim was to write code that ran as quickly as possible, used the least RAM, and created a file as close as possible to the ~4.6MB sum of the inputs.
sample files | size (KB) |
---|---|
invoice.cfm (pdf content in memory) | 137 |
sample.jpg | 2046 |
sample.pdf | 2537 |
Total of file sizes | 4720 |
Baseline ColdFusion RAM on server | 461K |
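
The 4,720 KB total is roughly 4.6MB (4720 / 1024). A rough way to run the same size check on your own files is sketched below. The paths are hypothetical, and the sketch assumes an engine that supports `for ... in` over arrays in cfscript.

```cfml
<cfscript>
// Hypothetical paths: substitute your own source files and merged output.
sourceFiles = [
    expandPath("./invoice.pdf"),
    expandPath("./sample.jpg"),
    expandPath("./sample.pdf")
];
mergedFile = expandPath("./new.pdf");

// getFileInfo() reports the size in bytes.
targetBytes = 0;
for (path in sourceFiles) {
    targetBytes += getFileInfo(path).size;
}
mergedBytes = getFileInfo(mergedFile).size;

writeOutput("Target: " & numberFormat(targetBytes / 1024) & " KB<br>");
writeOutput("Merged: " & numberFormat(mergedBytes / 1024) & " KB<br>");
// A ratio near 1.00 means the merged file is close to the sum of its parts.
writeOutput("Ratio: " & numberFormat(mergedBytes / targetBytes, "0.00"));
</cfscript>
```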
The results
I decided to run the code ten times for each solution to get some sample statistics. iText produced a file that matched the target size, used the least RAM, and was often four to five times faster than cfpdf.
metric | cfpdf | iText |
---|---|---|
file size | 12,368 KB | 4,720 KB |
most common processing time | ~3 secs | ~0.5 secs |
average time | 4723ms | 1280ms |
variation | 2805ms – 6910ms | 461ms – 4491ms |
RAM (baseline was 461K) | 540-604K (+80 to 140K) | 467-483K (+6 to 22K) |
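
One way to collect this kind of timing sample is sketched below. This is not the harness used for the numbers above; `benchmarkMerge` and the function passed into it are hypothetical names. The memory figure is the JVM heap in use at the end of the runs, which moves around with garbage collection, so treat it as indicative only.

```cfml
<cfscript>
// Hypothetical helper: run a merge function several times and summarise the timings.
function benchmarkMerge(required any mergeFn, numeric runs = 10) {
    var timings = [];
    var startedAt = 0;
    var i = 0;
    var runtime = createObject("java", "java.lang.Runtime").getRuntime();

    for (i = 1; i <= arguments.runs; i++) {
        startedAt = getTickCount();            // milliseconds
        arguments.mergeFn();
        arrayAppend(timings, getTickCount() - startedAt);
    }

    return {
        minMs = arrayMin(timings),
        maxMs = arrayMax(timings),
        avgMs = arrayAvg(timings),
        // JVM heap currently in use, in KB
        usedKB = (runtime.totalMemory() - runtime.freeMemory()) / 1024
    };
}
</cfscript>
```

Usage would look something like `stats = benchmarkMerge(mergeWithIText);`, where `mergeWithIText` is your own function wrapping the merge code.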
Show me the code
Below is the main function we wrote to achieve this:
As you can see we pass in an ORM Bean (FileManager) with our Expenses Claim Details. We loop through the files and add pages to our PDF one by one. Each image is a separate page scaled to the size of the image. PDFs are added one page at a time.
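To make the image sizing concrete before the full listing, here is a small arithmetic sketch. It assumes that iText's `scaleToFit` keeps the aspect ratio by applying the smaller of the width and height ratios, and that one image pixel maps to one point by default. The 2000 x 1500 image is a made-up example; 612 x 792 points (US Letter at 72 points per inch) are the limits used in the listing.

```cfml
<cfscript>
// Hypothetical image dimensions, for illustration only.
imgWidth  = 2000;
imgHeight = 1500;

maxWidth  = 612;  // 8.5 inches at 72 points per inch
maxHeight = 792;  // 11 inches at 72 points per inch

// Shrink by the tighter of the two ratios so the aspect ratio is preserved.
scale = min(maxWidth / imgWidth, maxHeight / imgHeight);

pageWidth  = imgWidth * scale;
pageHeight = imgHeight * scale;

writeOutput(numberFormat(pageWidth) & " x " & numberFormat(pageHeight) & " points");
</cfscript>
```

Here the width is the limiting dimension, so the page comes out at roughly 612 x 459 points. Note that the listing only triggers scaling when the image is wider than 612, so a tall, narrow image keeps its original size.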
Part 1: Create our reader/writer objects

    <cfscript>
    // get the path to the base PDF (results of invoice.cfm).
    local.claimPDF = FileManager.getPathToClaims() & 'invoice.pdf';
    // create a file to write to.
    local.FileToWrite = FileManager.getPathToClaims() & 'new.pdf';
    // The *reader* is a Java object containing our initial PDF invoice
    local.reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init(local.claimPDF);
    // The *reader* can tell us how many pages it has so we know what the next page number will be
    local.newPageNumber = local.reader.getNumberOfPages() + 1;
    // create a fileIO object so we can create our new PDF
    local.FileIO = createObject("java", "java.io.FileOutputStream").init(local.FileToWrite);
    // The *stamper* is initialised with both our existing claim PDF and our new, empty PDF object.
    local.stamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init(local.reader, local.FileIO);
    </cfscript>

Part 2: Loop through our files and add them

    <cfloop array="#local.ClaimBean.getFiles()#" index="local.FileBean">
        <!--- get each file: PDF or image? --->
        <cfset local.File = local.FileBean.getFilePath() />

For a PDF:

        <cfif local.FileBean.getFileExt() eq 'pdf'>
            <!--- create a new instance of our PDF reader and loop through the file adding one page at a time --->
            <cfset local.PDFObj = createObject("java", "com.lowagie.text.pdf.PdfReader").init(local.File) />
            <cfif not local.PDFObj.isEncrypted()>
                <!--- loop through all the pages in the PDF and add them one by one --->
                <cfloop from="1" to="#local.PDFObj.getNumberOfPages()#" index="local.i">
                    <cfset local.rectangle = createObject("java", "com.lowagie.text.Rectangle").init(
                        javacast("float", local.PDFObj.getPageSize(local.i).getWidth()),
                        javacast("float", local.PDFObj.getPageSize(local.i).getHeight())
                    ) />
                    <cfset local.stamper.insertPage(local.newPageNumber, local.rectangle) />
                    <cfset local.under = local.stamper.getUnderContent(local.newPageNumber) />
                    <cfset local.under.addTemplate(local.stamper.getImportedPage(local.PDFObj, local.i), 1, 0, 0, 1, 0, 0) />
                    <!--- a quick note about under / over content: "under" is like background content.
                          You could use this for background images and overlay text on top using "over" --->
                    <cfset local.newPageNumber++ />
                </cfloop>
            </cfif> <!--- end isEncrypted? --->

For an image:

        <cfelse> <!--- we have an image --->
            <!--- create a new instance of our image reader and scale our image/page appropriately --->
            <cfset local.imgObj = createObject("java", "com.lowagie.text.Image") />
            <cfset local.img = local.imgObj.getInstance(local.File) />
            <!--- scale down the image if needed --->
            <cfif local.img.getScaledWidth() gt 612>
                <cfset local.img.scaleToFit(612, 792) />
            </cfif>
            <cfset local.rectangle = createObject("java", "com.lowagie.text.Rectangle").init(
                javacast("float", local.img.getScaledWidth()),
                javacast("float", local.img.getScaledHeight())
            ) />
            <cfset local.stamper.insertPage(local.newPageNumber, local.rectangle) />
            <cfset local.content = local.stamper.getOverContent(local.newPageNumber) />
            <!--- now we assign the position to our image --->
            <cfset local.img.setAbsolutePosition(javacast("float", 0), javacast("float", 0)) />
            <!--- add our image to the existing PDF --->
            <cfset local.content.addImage(local.img) />
            <cfset local.newPageNumber++ />
        </cfif> <!--- end if is pdf/image --->
    </cfloop> <!--- end loop through file system --->

Flatten our final file and close the objects

    <cfset local.stamper.setFormFlattening(true) />
    <cfset local.stamper.close() />
    <cfset local.reader.close() />

The END;