In 2005, Wichita TV station KSAS-TV received a mysterious package. On examination, it was determined to be from the BTK killer, who had begun a string of taunting communications fourteen years after his last contact. Among other things, the package contained a floppy disk.

He had included the floppy disk after being assured that it was a secure means of communication (note: it’s not). When the police obtained the disk, they scoured it for evidence. It had been wiped clean, but two key elements remained hidden in a file’s metadata: someone named “Dennis” was the last to have modified a document, and the document was associated with the phrase “Christ Lutheran Church.” Ten days later, after a few more dots were connected, Dennis Rader was arrested in connection with the ten BTK murders.

Thanks to the metadata found on that floppy disk, he is now serving ten life sentences.

Metadata – once one strips away the technical jargon – is just information about data (or, data about data). Whenever you create a document, make a phone call, send a text message, or perform any of countless other digital tasks, information about that action is logged: when the item was created, opened, last viewed, and so on. This information gives context to the data, and that context is what makes it useful to your case.
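
To make "data about data" concrete, here is a minimal Python sketch that reads a few of the fields a file system records about a file. The helper name `file_metadata` and the throwaway sample file are assumptions made for illustration; the exact fields available vary by operating system, and this is not a forensic collection method.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return a few filesystem metadata fields for the file at `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        # Timestamps are stored as seconds since the epoch; convert to UTC.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    sample_path = handle.name

metadata = file_metadata(sample_path)
for field_name, value in metadata.items():
    print(f"{field_name}: {value}")

os.remove(sample_path)
```

None of these values are part of the file's contents, yet each one says something about how and when the file was used.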

Metadata is so pervasive in our technology that you constantly use it without realizing it. When Pandora, Spotify, or Apple Music curates playlists tailored to your tastes, it’s using metadata; when you type a search term to find an email you sent five months ago, you’re using metadata; even opening a Word document and finding where you last saved it relies on metadata.

Everyday use of metadata is like an iceberg – you see the tip, but far more lies below the surface. To understand how much metadata can be pulled from the smallest of sources, just examine the metadata of a tweet: the number of metadata fields can easily exceed the number of characters in the tweet itself – around 150 points of information unrelated to the subject of the tweet’s message.

While there are innumerable types of metadata, NISO (National Information Standards Organization) proposes three basic classes of metadata: descriptive, administrative, and structural. To illustrate, we’ll take a look at some high-level metadata of a photo.

Descriptive Metadata:

As the name indicates, this is the part of metadata that describes the data. The most visible of the three types, it records details such as who generated the data, its title, keywords, etc.

Administrative Metadata:

This contains the technical details about the respective files, such as intellectual property rights, file format, color schemes, when it was created, preservation activities, etc. Usually, anything you’d need to actively dig into the data to view would be categorized here.

Structural Metadata:

This allows data to be grouped and related to similarly relevant data, creating a searchable network database. Besides tagging data in relation to each other, it allows for structures such as a table of contents, indexes, etc. to be included.
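
The three classes can be sketched as a small Python data structure for a single photo. Every class name, field name, and value below is hypothetical and chosen only for illustration; NISO does not prescribe this layout, and real photo formats store these details in their own ways.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Describes the content: what it is and who made it."""
    title: str
    creator: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class AdministrativeMetadata:
    """Technical and rights details needed to manage the file."""
    file_format: str
    created: str
    rights: str


@dataclass
class StructuralMetadata:
    """How this item relates to other items."""
    collection: str
    sequence_number: int


@dataclass
class PhotoRecord:
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata
    structural: StructuralMetadata


# Hypothetical values for one photo.
photo = PhotoRecord(
    descriptive=DescriptiveMetadata("Office entrance", "J. Doe", ["office", "exterior"]),
    administrative=AdministrativeMetadata("JPEG", "2015-03-02T09:15:00", "All rights reserved"),
    structural=StructuralMetadata("Site visit photos", 3),
)


def matches_keyword(record, keyword):
    """Search on descriptive metadata, as a document review tool might."""
    return keyword.lower() in (k.lower() for k in record.descriptive.keywords)


print(matches_keyword(photo, "Office"))
print(photo.structural.collection, photo.structural.sequence_number)
```

Keeping the classes separate mirrors how they are used: descriptive fields drive search, administrative fields support preservation, and structural fields place the item within a larger set.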

Metadata is typically used to bolster evidence in a case, to help refute it, and to provide structure during a document review. Handing over evidence and ESI is the basis for production, so we’ll look at a couple of examples of how cases have involved metadata.

Example 1:

In 2012, an infrastructure firm ran a security audit to look for software which would allow hackers entry into the firm’s network. While they didn’t find any malicious programs running, they did find something else of note – a software engineer (making a six-figure salary) was outsourcing his work to a Chinese tech firm and spending his workdays surfing the web and buying things on eBay. By all accounts the work he submitted from the Chinese firm was top quality, but – as one can imagine – his employers were less than pleased.

It was through the metadata of the work he submitted that the security audit was able to find that the work was not coming from within the company, but instead was being brought in from Shenyang, China.

Example 2:

From 2008 to 2010, Facebook experienced unprecedented growth in users, expanding from fewer than 145 million to over 600 million. At that time, Paul Ceglia brought a lawsuit against Facebook, claiming 84% ownership of the company. The suit seemed plausible, as Mark Zuckerberg had done work for Ceglia leading up to the creation of Facebook. However, Zuckerberg denied any connection between his work with Ceglia and what became a company worth hundreds of billions of dollars.

In early 2011, much of Ceglia’s counsel abandoned him after examining the metadata in his evidence – email exchanges that had been altered, contracts that had been backdated, and more. Computer forensic analysis of the metadata determined not only that he did not have a case, but that he had committed fraud, for which he was prosecuted. After escaping house arrest and fleeing the country, he was most recently found in Ecuador, where he is fighting extradition.

Properly managing metadata is crucial to avoid a situation where evidence can be called into question. Metadata is quite fragile, and much like Schrödinger’s cat, merely observing it can alter it: simply opening a file, for example, can update its last-accessed timestamp.

  • A critical place to start in securing your metadata is by not doing it yourself – too often an upstart lawyer will think they know how to properly collect and preserve, and do just the opposite. Securing a professional data forensic collector is essential to ensure that your data is properly handled.
  • Keep a record of your preservation methods. Assume any process will be called into question, and be prepared to give a solid defense of your standard procedures.
  • Be as transparent as possible. Accusations of spoliation are most easily avoided by developing a culture of rigorous preservation accountability.
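
As one illustration of record keeping, the sketch below computes a SHA-256 hash of a file and stores it in a log entry, so that a later copy can be checked against the value recorded at collection. The function names, the log fields, and the collector name are hypothetical, and this is not a substitute for professional forensic tooling: among other things, reading a file this way may itself update its access time on some systems.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preservation_entry(path, collected_by):
    """Build one log entry describing what was collected, by whom, and when."""
    return {
        "file": os.path.basename(path),
        "sha256": sha256_of(path),
        "collected_by": collected_by,  # hypothetical field
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Throwaway file standing in for a collected document.
with tempfile.NamedTemporaryFile("wb", delete=False) as handle:
    handle.write(b"evidence")
    sample_path = handle.name

entry = preservation_entry(sample_path, "Hypothetical Collector")
print(json.dumps(entry, indent=2))

# Any later change to the contents produces a different hash.
with open(sample_path, "ab") as stream:
    stream.write(b"!")
changed = sha256_of(sample_path) != entry["sha256"]
print("contents changed since collection:", changed)

os.remove(sample_path)
```

A matching hash shows only that the contents are unchanged; it says nothing about how the file was handled, which is why the log entry also records who collected it and when.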