Metadata - There Ought to be a Law

There seems to be much discussion lately about metadata. Some of the courts have raised new concerns about filings that may contain metadata - and law firm staff and practitioners alike are sometimes confused by the very topic. Perhaps The Sedona Conference and case law can give us guidance.

What is metadata? The usual answer to this question is "data about data," which is correct, but this definition fails to provide any context - and context is vital. From The Sedona Conference and case law we can develop a more precise definition:

Although metadata often is lumped into one generic category, there are at least several distinct types, including substantive (or application) metadata, system metadata, and embedded metadata. Sedona Principles 2d Cmt. 12a.

Substantive metadata, also known as application metadata, is "created as a function of the application software used to create the document or file" and reflects substantive changes made by the user. [Examples: track changes, prior edits, editorial comments, and data that instructs the computer how to display the fonts and spacing.]

System metadata "reflects information created by the user or by the organization's information management system." [Examples: the author, and the date and time a file was created or modified, as recorded by the operating system.]

Embedded metadata consists of "text, numbers, content, data, or other information that is directly or indirectly inputted into a [n]ative [f]ile by a user and which is not typically visible to the user viewing the output display" of the native file. Md. Protocol 27.
Examples include spreadsheet formulas, hidden columns, externally or internally linked files (such as sound files), hyperlinks, references and fields, and database information. Aguilar v. Immigration & Customs Enforcement Div., 255 F.R.D. 350, 354-55 (S.D.N.Y. 2008).

From the context of litigation, we have some concise guidance about what metadata really is.

PDF Documents and Metadata

What does all of this mean in the context of PDF documents?

Embedded metadata is not transferred when a source document is converted to PDF format. If you recall comments from an earlier blog post about how PDF files are created, you know that only visible content ends up in the PDF file when a source file is converted. The single exception might be text colored white in an attempt to hide it.

System metadata belongs to the document itself. The system metadata from a source document is not transferred when converting to PDF format, but the PDF document will have its own system metadata. Many PDF document solutions allow you to change or remove this metadata.
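
For readers who like to see this for themselves, here is a minimal sketch in Python of the idea that system metadata belongs to the file rather than to its content. It does not perform a real PDF conversion: the file names and the copy step are hypothetical stand-ins, and the only metadata examined is what the operating system records (size and last-modified time).

```python
import os
import tempfile
import time
from datetime import datetime, timezone


def system_metadata(path):
    """Return a few operating-system-level metadata values for a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical stand-ins for a source document and its converted output.
    source = os.path.join(workdir, "source.txt")
    output = os.path.join(workdir, "output.pdf")

    with open(source, "w") as f:
        f.write("visible content")

    # Backdate the source by one day so the two timestamps clearly differ.
    one_day_ago = time.time() - 86400
    os.utime(source, (one_day_ago, one_day_ago))

    # Placeholder for conversion: write the visible content to a new file.
    with open(source) as src, open(output, "w") as dst:
        dst.write(src.read())

    source_meta = system_metadata(source)
    output_meta = system_metadata(output)

    print("source modified:", source_meta["modified"])
    print("output modified:", output_meta["modified"])
```

The visible content of the two files is identical, yet the new file carries its own modification time rather than inheriting the one recorded for the source. A real PDF also stores metadata inside the file, which this sketch does not examine.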

Substantive metadata follows the "visible" rule. If you can see track changes in your Word document, then you will see them in the resulting PDF document. If you "accept all" so that track changes are not showing, then they are not contained in the resulting PDF document.

The objective of this commentary is to provide a bit more information about metadata and what to expect when converting source documents to PDF format.

The Editor