BLOG – Redaction – part II
What You Can See and What You Can’t See
When redacting a PDF, remember that the file may hold content you can’t see, so to redact safely you must rely on your redaction tool to find and remove that hidden content as well.
PDF files can be described as a “container”, or as a “sandwich”, which is my preferred description because they can contain different layers of information. This notion of layers is described in a whitepaper recently published by Docs Corp, which can be downloaded here. The whitepaper discusses three types of PDF format to take into account when redacting a PDF.
OCR’d PDF Files
The PDF format I’m discussing here is a sort of “combo” PDF, created when an image-PDF is converted to a searchable PDF using an OCR process. OCR (“optical character recognition”) is software that can “read” the letters and spaces in an image file and create searchable text automatically. When this is applied to an image-PDF, the image remains and is what’s visible to the reader, but the searchable text is added as a new layer in the PDF.
The result is a PDF file containing an image layer and a text layer.
Removing Text and Image
Your redaction tool must remove both the area of the image marked for redaction and the corresponding searchable text. It is absolutely vital that you only use a tool with a true redaction annotation, since this is the only way to remove both the content you can see and the content you can’t. Other annotation marks simply cover what you can see and ignore what you cannot. Note that Acrobat Professional (not Standard) is the only edition of Adobe Acrobat with a true redaction tool, and only in the more recent versions.
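To make the difference concrete, here is a toy Python model of an OCR'd page. It is a sketch only, not a real PDF library: the names Word, Page, cover, redact and searchable_text are hypothetical and invented for this illustration, and a real redaction tool works on the actual PDF objects and also erases the pixels in the image.

```python
from dataclasses import dataclass, field

# Toy model only: Word, Page, cover, redact and searchable_text are
# hypothetical names, not part of any real PDF library.


@dataclass(frozen=True)
class Word:
    """One OCR'd word and its bounding box on the page."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Page:
    words: list                                       # hidden searchable text layer
    hidden_areas: list = field(default_factory=list)  # rectangles the reader can no longer see


def overlaps(word, rect):
    """True if the word's bounding box intersects rect = (x0, y0, x1, y1)."""
    rx0, ry0, rx1, ry1 = rect
    return word.x0 < rx1 and word.x1 > rx0 and word.y0 < ry1 and word.y1 > ry0


def cover(page, rect):
    """Ordinary annotation: hides the area visually, leaves the text layer alone."""
    page.hidden_areas.append(rect)


def redact(page, rect):
    """True redaction: hides the area AND deletes the text underneath it."""
    page.hidden_areas.append(rect)
    page.words = [w for w in page.words if not overlaps(w, rect)]


def searchable_text(page):
    """What search, copy and paste, or text extraction would return."""
    return " ".join(w.text for w in page.words)


if __name__ == "__main__":
    page = Page(words=[Word("SSN", 10, 10, 40, 20),
                       Word("123-45-6789", 45, 10, 120, 20)])
    box = (44, 8, 122, 22)           # rectangle drawn over the number

    cover(page, box)
    print(searchable_text(page))     # SSN 123-45-6789  (still searchable)

    redact(page, box)
    print(searchable_text(page))     # SSN
```

In this model the page looks identical to the reader after either operation. Only the true redaction changes what a search or a copy and paste would return.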
One option you may consider is to "flatten" your PDF file so that all content is reduced to a single layer. There are at least two approaches to flattening a PDF file, so email me if you're not sure how to do this. But you still need a true redaction tool to remove content from the area marked for redaction. Simply flattening is not a complete solution.
Ask your PDF content provider what capabilities their tool has, and then test it! (Email me for a description of a simple way to test whether redaction has worked.)
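One possible check, which is not necessarily the test I would describe by email, is to extract the text from the redacted file and search it for the terms you meant to remove. The Python sketch below assumes you have already extracted the text page by page with whatever tool you trust (copy and paste, or a text-extraction utility); the function names normalize and find_leaks are hypothetical and chosen for this example.

```python
def normalize(text):
    """Casefold and keep only letters and digits, so that
    '123-45-6789' and '123 45 6789' compare equal."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def find_leaks(page_texts, redacted_terms):
    """Return (page_number, term) pairs for every redacted term that
    still appears in the text extracted from the redacted file.

    page_texts     -- list of strings, one per page, already extracted
    redacted_terms -- the sensitive strings that should be gone
    """
    leaks = []
    for page_number, text in enumerate(page_texts, start=1):
        haystack = normalize(text)
        for term in redacted_terms:
            needle = normalize(term)
            # Skip terms that are empty after normalisation.
            if needle and needle in haystack:
                leaks.append((page_number, term))
    return leaks


if __name__ == "__main__":
    pages = ["Name: [redacted]", "SSN 123 45 6789 on file"]
    for page_number, term in find_leaks(pages, ["123-45-6789", "Jane Doe"]):
        print(f"page {page_number}: '{term}' is still in the text layer")
```

An empty result only shows that the terms are gone from the extracted text. It says nothing about the image layer or any other part of the file, so it complements a true redaction tool rather than replacing one.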
In a future blog post we’ll discuss some redaction best practices that should keep you on the straight and narrow.