In this guide, we’ll walk through what metadata is, how it can help you with requests, some sample requests, and ways to assist a public agency with the request.

🤔 What is metadata?

Metadata (or metainformation) is “data that provides information about other data”,[1] but not the content of the data, such as the text of a message or the image itself.[2]

Metadata is not strictly bound to one category, as it can describe a piece of data in many other ways.

There are many distinct types of metadata, including:

  • Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords.
  • Structural metadata – metadata about containers of data. It indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials.[3]
  • Administrative metadata[4] – the information to help manage a resource, like resource type, permissions, and when and how it was created.[5]
  • Reference metadata – the information about the contents and quality of statistical data.
  • Statistical metadata[6] – also called process data, may describe processes that collect, process, or produce statistical data.[7]
  • Legal metadata – provides information about the creator, copyright holder, and public licensing, if provided.
Source:  https://en.wikipedia.org/wiki/Metadata

🤷‍♂️ Why make a metadata request?

As the description implies, we are looking at data about data. One may ask: “If I am looking for data, why would I want data about the data, though?”

Great question. Digital systems hold a plethora of information, which makes their content very dense. Metadata requests are a way to home in on, and precisely inquire about, specific aspects of a dataset.

For instance, if a public agency is using a database you are unfamiliar with, making a public records request for the schema of the database (an example of structural metadata) makes it much easier to request the underlying data.

Below is an example of a database schema for MediaWiki (explore the schema online here):

Once you have the schema, you can make a precise request for the data contained within particular tables. And, if you know SQL (or the query language the database utilizes), you can even write the queries for the agency to execute.
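
To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `permit` table, its columns, and the sample rows are all hypothetical stand-ins for whatever schema an agency actually releases; the point is only that a known schema lets you name exact columns, a table, and a bounded date range.

```python
import sqlite3

# HYPOTHETICAL schema, standing in for one an agency might release.
SCHEMA = """
CREATE TABLE permit (
    permit_id INTEGER PRIMARY KEY,
    applicant TEXT NOT NULL,
    issued_on TEXT NOT NULL,   -- ISO 8601 date, e.g. 2023-05-02
    status    TEXT NOT NULL
);
"""

# The kind of query a requester could hand to the agency to execute:
# named columns, one table, a bounded date range.
QUERY = """
SELECT permit_id, applicant, issued_on
FROM permit
WHERE status = 'approved'
  AND issued_on BETWEEN '2023-04-01' AND '2023-07-31'
ORDER BY issued_on;
"""


def run_demo():
    """Build the hypothetical table in memory and run the query against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO permit VALUES (?, ?, ?, ?)",
        [
            (1, "Applicant A", "2023-03-15", "approved"),  # outside the date range
            (2, "Applicant B", "2023-05-02", "approved"),  # matches
            (3, "Applicant C", "2023-06-20", "denied"),    # wrong status
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The agency's real database will likely use a different engine and SQL dialect, so treat the query text as a starting point to adapt rather than something to submit verbatim.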

If you still need additional insight on a particular digital system an agency employs, I recommend further research by:

  • Other direct requests about the system to the same agency or another one using it, such as requesting a manual
  • Requests for IT expenditure records to see their vendors and proceed accordingly to obtain information about the software, databases, etc. they use

📧 Metadata in email

Emails, and really any communication medium, are a valuable and crucial data source for the Public to understand how Public Agencies think, plan, operate, and execute on behalf of the Public.

As time has gone on, it has become more difficult to obtain emails through public records requests for various reasons, including the expansion of what is considered reasonably particular.

Thankfully, this is where metadata can help.

What many people do not realize is that emails contain a tremendous amount of metadata. Some of this data is made visible and is familiar to all of us. Below are the pieces of metadata that are rendered in our email clients to create what we think of as an email:

  • To
  • From
  • CC
  • BCC
  • Subject
  • Date Sent
  • Attachments
  • Message Body
  • Priority

The documents that guide how email should be structured are Request for Comments (RFC) 5322, Internet Message Format, and RFC 6854, Update to Internet Message Format to Allow Group Syntax in the “From:” and “Sender:” Header Fields.

Some email servers, plugins, and related software may implement these slightly differently, and may add further pieces of metadata, usually in the headers of the email.
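
As a rough illustration of how much of an email is header metadata, the sketch below uses Python's standard email package to pull the familiar fields out of a raw message and to list any extra headers. The message, addresses, and the `X-Mailer` value are invented for the example.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# INVENTED sample message; real messages carry many more headers.
RAW = (
    "From: Jane Smith <jane.smith@example.gov>\n"
    "To: John Smith <john.smith@example.gov>, clerk@example.gov\n"
    "Cc: Jacqueline Smith <jacqueline.smith@example.gov>\n"
    "Subject: Water rate study\n"
    "Date: Mon, 03 Apr 2023 09:15:00 -0400\n"
    "Message-ID: <abc123@example.gov>\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "\n"
    "Body text is not part of a metadata request.\n"
)

# Header names treated as the familiar fields (compared in lowercase).
STANDARD = {"from", "to", "cc", "bcc", "subject", "date"}


def addresses(msg, field):
    """Return only the email addresses found in one address header."""
    return [addr for _name, addr in getaddresses(msg.get_all(field, []))]


def extract_metadata(raw):
    """Collect header metadata from a raw message, ignoring the body."""
    msg = message_from_string(raw)
    return {
        "from": addresses(msg, "From"),
        "to": addresses(msg, "To"),
        "cc": addresses(msg, "Cc"),
        "bcc": addresses(msg, "Bcc"),
        "subject": msg.get("Subject", ""),
        "date": parsedate_to_datetime(msg["Date"]).isoformat(),
        # Anything beyond the familiar fields, e.g. server or plugin headers.
        "other_headers": sorted(k for k in msg.keys() if k.lower() not in STANDARD),
    }


if __name__ == "__main__":
    for key, value in extract_metadata(RAW).items():
        print(key, "=", value)
```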

🧐 Why should you make metadata requests for e-mail?

Benefits

  • Easier lift, overall, for the agency to fulfill
  • Very little text content to examine from a legal perspective
  • Lower costs for review, if any is necessary
  • Like reading the abstract of a paper rather than the entire paper
  • Larger menu to help drive subsequent requests

Risks

  • The agency, if poorly staffed on the IT front, may initially lack the technical skills to fulfill the request (but can be trained and worked with to succeed in releasing the records related to your request)
  • May prompt arguments about whether metadata is public in nature
  • May prompt contention that producing the metadata means creating a new record

👣 Basic Steps for an E-mail Metadata Request

  • Identify, if possible, the e-mail system used by the agency
    • This is important if you need to help them with the data extraction or need to push back against a denial.
  • Make the specific metadata request
  • Follow up with any technical guidance as necessary, and with rebuttals to any friction
  • Fight for the records
  • (Eventually) Enjoy your results, which give you an easy way to follow up on full e-mails

🔍 Identify the email system used by the agency

Start with an e-mail you have received from the government agency you want to make a request to. The header information of that e-mail can tell you which services they use.
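
One way to read those headers without squinting at raw text is sketched below. It scans the `Received` and `X-` headers of a saved message for a few vendor hints. The hint strings are illustrative assumptions, not an authoritative fingerprint list, and the sample message (including its hostnames) is made up, so treat a match as a lead to confirm rather than proof.

```python
from email import message_from_string

# ASSUMED, illustrative hints only; extend or correct these for your case.
HINTS = {
    "Microsoft Exchange / Microsoft 365": ("x-ms-exchange", "outlook.com"),
    "Google Workspace / Gmail": ("x-google", "google.com"),
}

# MADE-UP sample with hypothetical hostnames.
SAMPLE = (
    "Received: from EXAMPLE01.namprd00.prod.outlook.com (2001:db8::1)\n"
    " by EXAMPLE02.namprd00.prod.outlook.com with HTTPS;\n"
    " Mon, 3 Apr 2023 13:15:02 +0000\n"
    "X-MS-Exchange-Organization-AuthAs: Internal\n"
    "From: clerk@example.gov\n"
    "To: requester@example.org\n"
    "Subject: Re: Records request\n"
    "\n"
    "Body\n"
)


def guess_mail_system(raw):
    """Return the names of systems whose hints appear in routing headers."""
    msg = message_from_string(raw)
    haystack = []
    for name, value in msg.items():
        lowered = name.lower()
        # Only look at routing and vendor-specific headers, not From/To/Subject.
        if lowered == "received" or lowered.startswith("x-"):
            haystack.append(lowered + ": " + value.lower())
    text = "\n".join(haystack)
    return sorted(
        system
        for system, needles in HINTS.items()
        if any(needle in text for needle in needles)
    )


if __name__ == "__main__":
    print(guess_mail_system(SAMPLE))
```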

🧱 Basic Structure for an Email Metadata Request

  • Be explicit about the fields you are seeking
    • Common fields:  To, From, CC, BCC, Subject, Date
  • Share that you are NOT seeking the bodies of the emails
    • Including an example of what information you are looking for is helpful for agencies that may be unfamiliar with your request.
  • Specify that you want the data in a machine-readable format
  • Ask for a timeline of release
  • Offer to work with their IT Department if needed
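
If it helps to show an agency what "machine readable" means for the fields listed above, the following sketch writes one record as CSV and as JSON using only Python's standard library. The record itself is invented for illustration.

```python
import csv
import io
import json

# The common fields named in a typical metadata request.
FIELDS = ["To", "From", "CC", "BCC", "Subject", "Date"]

# INVENTED record, for illustration only.
RECORD = {
    "To": "utility@example.gov",
    "From": "jane.smith@example.gov",
    "CC": "john.smith@example.gov",
    "BCC": "",
    "Subject": "Water rate study",
    "Date": "2023-04-03T09:15:00-04:00",
}


def to_csv(records):
    """Render records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Render the same records as a JSON array of objects."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    print(to_csv([RECORD]))
    print(to_json([RECORD]))
```

Either output can be pasted into a request as an example of the shape you are asking for.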

💡 Example Request

Good day Public Agency,

I would like to request the email metadata (not the bodies of the emails, but the To, From, CC, BCC, Subject, Date) for the Public Utility email address (utility@gov.gov).

The metadata can be limited to emails where John Smith, Jane Smith, or Jacqueline Smith (direct or alias email addresses) appears in the To, From, CC, or BCC field, for the months of April, May, June, and July 2023.

The information should be provided in a machine-readable format such as CSV, JSON, or XML.

It would be appreciated if you can provide a timeline for the release of these records as well.

Thank you.

🏁 Sample of Results

And here are some sample results. This particular set looks like it was copied directly from Outlook, since the From and To fields list both email addresses and display names.

This is one known issue with exporting directly from Outlook.
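
If you receive results in that mixed form, a small clean-up step can separate names from addresses. The sketch below assumes recipients in a cell are separated by semicolons and written as `Name <address>`; that layout is an assumption about the export, so check it against the file you actually receive.

```python
def split_recipients(cell):
    """Split one exported cell into a list of name/email pairs.

    ASSUMES entries are separated by ';' and look like
    'Jane Smith <jane.smith@example.gov>', a bare address, or a bare name.
    """
    people = []
    for part in cell.split(";"):
        part = part.strip()
        if not part:
            continue
        if part.endswith(">") and "<" in part:
            # 'Name <address>' form: split on the last '<'.
            name, _sep, address = part[:-1].rpartition("<")
            people.append({"name": name.strip().strip('"'), "email": address.strip().lower()})
        elif "@" in part:
            # Bare address with no display name.
            people.append({"name": "", "email": part.lower()})
        else:
            # Display name only; the address was not exported.
            people.append({"name": part, "email": ""})
    return people


if __name__ == "__main__":
    sample = "Jane Smith <Jane.Smith@example.gov>; clerk@example.gov; John Smith"
    for person in split_recipients(sample):
        print(person)
```

Entries that come back with an empty `email` value are the ones worth raising with the agency, since the display name alone may not identify the account.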

⚙️ Technical Guidance for Supplying the Record

🖥️ Methods: Outlook, IMAP Connection and CSV Export

  • Add any IMAP-supported email account to Outlook
  • Sometimes a mailbox may be extremely large and require adjustments to the default settings
  • Works with any mail server that connects to Outlook
  • Export emails directly from Outlook to CSV, one folder at a time
  • Limitations: the export does not include a date, and an email address may resolve to a Display Name
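
Where the Outlook export falls short, the same IMAP connection can be scripted. The sketch below is one possible alternative, not the method described above: it uses Python's imaplib to fetch only the requested header fields (including Date) without downloading message bodies, and writes them to CSV. The host, credentials, folder, and output path are placeholders that the agency's IT staff would supply, and a very large mailbox would likely need batching that is not shown here.

```python
import csv
import imaplib
from email import message_from_bytes

FIELDS = ["From", "To", "Cc", "Bcc", "Subject", "Date"]
# Ask the server for the named header fields only; PEEK avoids marking mail as read.
FETCH_SPEC = "(BODY.PEEK[HEADER.FIELDS (FROM TO CC BCC SUBJECT DATE)])"


def header_row(raw_headers):
    """Turn one block of raw header bytes into a flat dict of the fields."""
    msg = message_from_bytes(raw_headers)
    # Collapse folded (multi-line) header values into single lines.
    return {field: " ".join(str(msg.get(field, "")).split()) for field in FIELDS}


def fetch_rows(host, user, password, folder="INBOX"):
    """Read header metadata for every message in one folder (read-only)."""
    rows = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(folder, readonly=True)
        _status, data = conn.search(None, "ALL")
        for num in data[0].split():
            _status, parts = conn.fetch(num, FETCH_SPEC)
            for part in parts:
                # Header bytes arrive as the second item of a tuple.
                if isinstance(part, tuple):
                    rows.append(header_row(part[1]))
    return rows


def write_csv(rows, path):
    """Write the collected rows to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# Example call (PLACEHOLDER values, not run here):
# write_csv(fetch_rows("imap.example.gov", "user", "password"), "metadata.csv")
```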

🖥️ Methods: Exchange Server

🖥️ Methods: Google Mail

🖥️ Methods: Google Mail with Vault Access

Google provides a wonderful reference for search parameters that can be leveraged inside Vault.

Once the Vault query is finished, the Public Agency can simply download the metadata.csv file that contains all of the relevant metadata related to the search.

This is by far the easiest way to generate the metadata you need.
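
If the exported file contains more columns or rows than the request covers, it could be trimmed before release with a few lines of Python. This is only a sketch: the column names below are assumptions, and the inline sample data is invented rather than taken from a real Vault export, so check the names against the header row of the actual metadata.csv.

```python
import csv
import io

# ASSUMED column names; confirm against the real file's header row.
WANTED = ["From", "To", "CC", "BCC", "Subject", "DateSent"]
ADDRESS_COLUMNS = ["From", "To", "CC", "BCC"]

# INVENTED sample data, not a real export.
SAMPLE = (
    "From,To,CC,BCC,Subject,DateSent,Labels\n"
    "jane.smith@example.gov,utility@example.gov,,,Rate study,2023-04-03,Inbox\n"
    "vendor@example.com,utility@example.gov,,,Invoice,2023-04-04,Inbox\n"
)


def trim_metadata(src, dst, participants):
    """Copy only WANTED columns for rows that involve a named participant.

    src and dst are open text file objects; returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    missing = [col for col in WANTED if col not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found in export: {missing}")
    writer = csv.DictWriter(dst, fieldnames=WANTED, lineterminator="\n")
    writer.writeheader()
    needles = [p.lower() for p in participants]
    kept = 0
    for row in reader:
        people = " ".join((row.get(col) or "") for col in ADDRESS_COLUMNS).lower()
        if any(needle in people for needle in needles):
            writer.writerow({col: row.get(col) or "" for col in WANTED})
            kept += 1
    return kept


if __name__ == "__main__":
    out = io.StringIO()
    count = trim_metadata(io.StringIO(SAMPLE), out, ["jane.smith@example.gov"])
    print(count, "row(s) kept")
    print(out.getvalue())
```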

Below is a sample of a metadata.csv file:

👀 Visual Guide of Google Vault Exporting Flow

📚 References