While you may not be familiar with the term “enterprise search,” if you have recently submitted a broad FOIA request for email records, you are likely familiar with the challenges agencies face in searching for records across vast electronic repositories. On May 6, 2021, OGIS hosted a webinar with the Centers for Disease Control and Prevention (CDC) FOIA program leadership and IT Project Manager to demonstrate that agency’s approach to searching its email and electronic systems and to share best practices for requesters. A recording of the webinar is available on NARA’s YouTube Channel; the presentation slides and transcript are also available on the OGIS website.
CDC FOIA officials explained that while enterprise searching allows the FOIA program to search across multiple electronic locations using a single search query, there are still limitations to the tools they use. The key to a successful FOIA request is how a requester defines the request. For instance, identifying particular custodian mailboxes is not mandatory, but may be very helpful. Additionally, providing a narrow time frame and limiting the number of keywords and/or key phrases may assist the CDC in processing the request in a timely manner.
While enterprise searches are fast and efficient, one drawback is the sheer volume of records they return. The CDC uses deduplication and containment tools when conducting enterprise searches to ensure that unique records are captured within the scope of potentially responsive records, while duplicative and extraneous records are identified and removed.
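To give a rough sense of what deduplication involves, the Python sketch below keeps the first copy of each message and sets later copies aside. It is an illustration only and does not represent the CDC’s tools: the record fields (`sender`, `subject`, `body`, `custodian`) and the function names `fingerprint` and `deduplicate` are assumptions made for this example.

```python
import hashlib


def fingerprint(record):
    """Hash the fields that define 'the same message' in this sketch.

    The field names are hypothetical; a real system would choose its own.
    """
    # Normalize case and whitespace so trivial differences do not defeat matching.
    parts = [
        record.get("sender", "").strip().lower(),
        record.get("subject", "").strip().lower(),
        " ".join(record.get("body", "").split()).lower(),
    ]
    # Join with a separator character that is unlikely to appear in the fields.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def deduplicate(records):
    """Return (unique, duplicates), keeping the first copy of each message."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = fingerprint(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates
```

In this sketch, the same email found in two custodians’ mailboxes yields one unique record and one duplicate, which is one reason a search across many mailboxes can return far more raw hits than distinct records.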
The CDC described three categories of enterprise searches:
- “Low” intensity: the scope of the request is clear with no ambiguity in keywords, custodian mailboxes are defined with few participants, and the date range is narrow;
- “Medium” intensity: requires more complex searching such as Boolean search terms, and FOIA staff need to ensure the keywords are returning relevant records;
- “High” intensity: very complex searches with multiple keywords and phrases, many custodian mailboxes with an unknown number of recipients, and records that include many attachments.
CDC FOIA staff offered the following tips to ensure successful enterprise search results:
- Have a well-defined scope and be precise in what you seek. Split searches into different line items and explain where you expect keywords to appear (subject; body; within so many words of a specific key phrase).
- Limit keywords or suggest phrase searching. Keep in mind that the keywords you suggest may be different from those used by agency staff. The context of the records you seek is often more helpful than keywords. Numerous and generic keywords will dilute search results.
- When possible, limit the number of custodians, the timespan, and the types of responsive records. Currently, the agency is unable to conduct enterprise searches for large groups of people, and providing numerous custodians will complicate the search. If you are unsure which custodians may have the records you seek, CDC FOIA staff will consult with CDC subject matter experts to obtain additional context for the records.
- Discuss the scope of your request with the FOIA Public Liaison (FPL) and analyst assigned to your request, and be up front as to what you seek. For example, if you do not want attachments, let the analyst or FPL know so those records can be excluded from the responsive records.
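The first tip mentions asking for keywords “within so many words of a specific key phrase.” Search tools often call this a proximity search. The Python sketch below shows the general idea so requesters can see what such a condition means in practice. It is not the CDC’s search tool: the function names `tokenize` and `within_n_words`, the simple word splitting, and the choice to count the words separating the two matches are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_n_words(text, keyword, phrase, n):
    """Return True if at most `n` words separate `keyword` and `phrase` in `text`."""
    words = tokenize(text)
    key = tokenize(keyword)
    target = tokenize(phrase)
    if not key or not target:
        return False

    def starts(seq):
        # Start indexes of every occurrence of the token sequence `seq`.
        return [i for i in range(len(words) - len(seq) + 1)
                if words[i:i + len(seq)] == seq]

    for k in starts(key):
        for p in starts(target):
            # Count the words between the two matches, in either order.
            if k < p:
                gap = p - (k + len(key))
            else:
                gap = k - (p + len(target))
            # Overlapping matches give a negative gap; treat them as adjacent.
            if max(gap, 0) <= n:
                return True
    return False
```

For example, in the sentence “The vaccine distribution plan was revised after the advisory committee met,” the word “revised” is two words away from the phrase “advisory committee,” so a three-word window matches it, while “vaccine” is six words away and does not match.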
The CDC FOIA program has been proactive in communicating with its stakeholders through this venue. OGIS is happy to help any other agency FOIA program host similar events. If you are interested, please email us at firstname.lastname@example.org. We look forward to hearing from you!