Serious companies, companies in regulated industries, and companies dealing with sensitive information often must manage their data carefully. Data that is stored improperly can be a massive liability, even if nothing happens relating to it. Whether you have legal mandates to retain data for a certain amount of time or a mandate to get rid of it after some period has elapsed, data retention policies can be a critical part of your file and data management.
How can you manage your data retention policies if you’re using Google Drive to store, manage, and collaborate on files? Does Google offer any specific tools or services to help?
Let’s dig in and unearth all the essential details.
What Are Data Retention Policies?
First, it is necessary to know what data retention policies are and why they often exist.
A data retention policy is essentially just a statement that says, “we hold X data for Y amount of time.”
Anything older than that period can either be moved to secondary, tertiary, or backup storage or deleted. A good data retention policy specifies what happens to the data, not just when something happens.
Data retention policies are necessary for a variety of reasons. For one, they’re essential for managing limited storage space across an organization.
Even though sufficiently large organizations can get unlimited storage from Google, you still don’t want to manage unlimited data or sort through it to find something you know exists but can’t quite remember what it’s called or where it is.
Data retention is sometimes legally mandated or regulated by industry agencies; the specifics vary by industry and regulatory body. A company operating within a specific space might be required to store data for a certain number of years before it can delete it; alternatively, it might be required to delete data after a certain amount of time has passed.
With Google Drive, data retention policies are based on two different types of controls:
- The first type of control is the Retention Policy, which an administrator sets. As discussed, this decides how long files and folders will remain available for users to access and when files and folders are removed from Google Drive.
- The second type of control is called a Preservation Policy. This policy protects vital records or other sensitive data from being accidentally changed or deleted. Preservation Policies control when a file or folder cannot be modified or deleted.
Note: Data retention policies do not affect data stored outside Google Drive, such as email or documents stored in other cloud services. Additionally, retention policies do not protect against malicious activities like hacking or unauthorized access.
Data Retention Isn’t The Same as Backups
Data retention is a specific set of policies and data management practices that govern how long data is retained and what happens to it when it ages out, and at intermediate steps along the way. Critically, it’s worth recognizing that data retention doesn’t mean backups and storage. A backup is a copy of data stored in a secondary location to restore the data if the primary copy is corrupted, deleted, or otherwise broken or lost. Data retention policies govern both operational data and archived or backed-up data.
The long and the short of it is that you may need to purge data after it reaches a certain age; alternatively, you might not be required to, but you may want to ensure that old, unused data doesn’t clutter up your storage and backup systems.
Data retention policies come with plenty of benefits.
- You can set them to ensure compliance with any regulations and laws.
- They cut back on storage costs associated with storing unnecessary data.
- They help ensure that all of your data is immediately relevant, and you don’t risk digging up and working on outdated files and data.
- They help protect you and your customers, workers, and vendors from legal discovery and other potential issues that can crop up if you’re storing unnecessary data.
- Data you don’t have can’t be leaked in the event of a data breach.
Creating a data retention policy isn’t hard; enforcing it can be much more challenging.
How to Structure a Data Retention Policy
Creating a data retention policy is a team effort and will involve numerous people from different parts of the company. Most notably, it should involve key stakeholders like executives, someone from the legal team familiar with the legal and compliance regulations that govern your business, and someone from IT who can help implement the data retention policy directly.
The first step is to pin down any regulatory and legal requirements. Violating these requirements doesn’t just breach industry rules; it can also earn your company fines or other penalties.
Next, determine your flow. For example, imagine you need to keep a specific kind of data for seven years, but you know that the data is refreshed every year. You can create annual archives so that only the most recent and relevant data is “active.” Anything from years 2-7 is stored in an archive, and anything that ages out of that archive is deleted.
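To make that flow concrete, here is a minimal sketch of the seven-year example in Python. The function names and the one-year "active" window are assumptions made for illustration; your own thresholds would come from your policy.

```python
from datetime import date

ACTIVE_YEARS = 1      # assumption: data is refreshed annually
RETENTION_YEARS = 7   # assumption: the seven-year requirement from the example


def years_old(created: date, today: date) -> int:
    """Whole years elapsed between the creation date and today."""
    years = today.year - created.year
    # Subtract one if this year's anniversary has not arrived yet.
    if (today.month, today.day) < (created.month, created.day):
        years -= 1
    return years


def classify(created: date, today: date) -> str:
    """Return 'active', 'archive', or 'delete' for a record."""
    age = years_old(created, today)
    if age < ACTIVE_YEARS:
        return "active"
    if age < RETENTION_YEARS:
        return "archive"
    return "delete"
```

With these assumed thresholds, a record created four years ago lands in the archive, and one that has passed its seventh anniversary is flagged for deletion.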
Define policies for different data types, as well. For example, confidential customer information will be subject to different regulations than employee financial data, which in turn will be subject to different policies than business process data. Your data retention policy is really a series of policies governing different types and formats of data.
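One way to picture "a series of policies" is as a lookup table keyed by data type. The sketch below is illustrative only: the type names, retention periods, and actions are hypothetical placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPolicy:
    data_type: str
    retain_days: int
    on_expiry: str  # "archive" or "delete"


# Hypothetical values for illustration only; real periods come from
# your legal and compliance review.
POLICIES = {
    p.data_type: p
    for p in (
        RetentionPolicy("customer_confidential", 365 * 7, "delete"),
        RetentionPolicy("employee_financial", 365 * 4, "archive"),
        RetentionPolicy("business_process", 365 * 2, "delete"),
    )
}


def policy_for(data_type: str) -> RetentionPolicy:
    """Look up the policy for a data type, failing loudly if none exists."""
    try:
        return POLICIES[data_type]
    except KeyError:
        raise ValueError(f"No retention policy defined for {data_type!r}") from None
```

Raising an error for an unknown type is a deliberate choice here: data with no defined policy is a gap you want to notice, not silently ignore.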
At this point, you need to know who will be responsible for implementing and enforcing compliance, the tools they’ll use to audit and review your data, and more. All of this can vary depending on your ecosystem and the kinds of data you handle.
Once you’ve drafted your policy and gotten it approved, all you need to do is implement it.
Does Google Drive Offer Data Retention Tools?
Yes – sort of.
Google Drive itself does not offer data retention tools. However, Google does, through a separate-but-related tool called Google Vault.
Google Vault is a cross-Google tool to manage data stored not just in Google Drive but several other Google services across Workspace. It offers the ability to set data retention policies, place holds on specific files or ranges of files that need to be exempted temporarily or permanently, search through and export files and data, and so on.
Here’s what Vault covers:
- Gmail – Vault manages retention, holds, search, and export and has specific policies regarding tertiary Gmail messages, such as those sent in confidential mode, those generated by other Google services like Calendar, draft messages, and more. You can read the full Gmail policies here.
- Drive – Vault manages retention, holds, search, and export and has specific policies regarding files and items stored in Drive, such as those generated by Docs, Sheets, Slides, Forms, Sites, Drawings, Google Meet meetings, poll logs, Jamboard files, and even encrypted files. You can read more here.
- Groups – Vault manages retention, holds, search, and export and has specific policies regarding conversation history and archiving, message availability and indexing, and more.
- Chat – Vault manages retention, holds, search, and export, with special rules covering whether history is enabled, message attachment storage, and external messages. Of particular note is that rich text formatting isn’t maintained. Find out more here.
- Hangouts – Vault uses Chat rules for retention and holds and Drive for search and export.
- Voice for Workspace – Vault manages retention, holds, search, and export, and has special rules depending on your account type; for example, Voice users have to have a specific license to use Vault, and some data isn’t covered, like transcripts involving a multi-user phone number. You can read more here.
- Sites – Vault manages retention through site-specific rules, holds via Drive rules, search, and export through Drive as well.
Note: One thing to mention here is that if you delete a user, all data associated with that user will be deleted as well, even if data retention would otherwise require you to keep it for a period beyond that date. As such, a good practice is to suspend users until their data ages out instead.
If all of this sounds needlessly complicated, know that Vault controls pretty much everything except for edge cases, data that is controlled by third-party apps or services instead of native Workspace apps, and data that generally wouldn’t be archived anyway. Vault also defaults to specific rules if some data falls under multiple categories.
It all makes more sense once you dig into it.
One special note to make is that Google Vault only covers data stored and managed on Google services.
That means if you have data stored offline or on user devices, and use Google Drive as a backup and storage solution, Vault will only handle the data stored on Drive. The local user copies will need to be controlled individually. Moreover, even if Vault deletes expired data from Drive, Vault won’t re-delete it if a user’s sync restores it. You will need tools and management to handle retention and deletion for non-Google data sources as well, so it all works together.
What Does Google Vault Cost?
Google Vault is available as a free tool to any account using Google Workspace through Business, Education, or Enterprise editions. Users of free non-Workspace accounts, Workspace for Nonprofits, and Workspace Starter plans don’t have access to Vault.
If you have a compatible edition, you can also purchase Vault licenses as an add-on by paying $5 per month for each additional user.
You can read more about their pricing tiers and whether or not it’s worth the upgrade over here.
How to Set Data Retention Policies in Google Vault
How can you actually set rules in Vault to cover your retention, hold, and deletion policies?
The first step is what we outlined above: actually developing your policies. Until you know what your policies are, you can’t set automated rules to handle them. Once you’re ready to build automated policies, you can visit vault.google.com to get started.
Here’s how you create a retention policy.
- Sign in to Google Vault using a super admin account.
- Click on Retention.
- Choose “custom rules” at the top to create new rules.
- Pick a service for the rules to apply to, like Drive.
- Determine and choose whether the policy will apply to specific shared drives, all shared drives, or all drives within a specific organizational unit (and pick the unit or drives).
- Choose your retention period and what action to take once that period elapses. You can choose between purging permanently deleted items or purging all items in the Drive that expire.
- Save the rule.
You can create numerous rules, including rules that overlap. The shortest duration for a file is the one that will govern when it is deleted.
Note: Remember that holds take priority over retention rules; they’re meant to put a hold on the deletion order until the data is reviewed or released. Holds usually go hand-in-hand with audits, so they’re an essential tool.
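The relationship between holds and retention rules can be sketched as a small decision function. This is a conceptual model of the precedence described above, not Vault's actual implementation; the function name and the choice of the last-modified date as the starting point are assumptions.

```python
from datetime import date, timedelta


def should_purge(last_modified: date, retain_days: int,
                 on_hold: bool, today: date) -> bool:
    """Return True only if the retention period has elapsed and no hold applies."""
    if on_hold:
        # A hold blocks deletion regardless of how old the item is.
        return False
    return today >= last_modified + timedelta(days=retain_days)
```

An item that is years past its retention period still returns False while it is on hold, which is why releasing holds after an audit matters as much as placing them.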
What are “Matters”?
When creating data retention rules, you’ll see a menu labeled Matters. What are they?
A “Matter” is Google’s term for a container for the holds, searches, and exports related to a specific eDiscovery project. For example, if a lawsuit requires discovery to pull all data between specific dates or related to certain users, you can create searches and filters to find that data and manage them through a single Matter. Specifically, a Matter includes saved search queries, holds, accounts, exports, and an audit log of user activity relating to it.
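If it helps to think of a Matter as a data structure, the sketch below models the pieces listed above. It is purely a mental model with hypothetical class and method names; it is not how Vault stores Matters, and it does not call any Google API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class Matter:
    name: str
    saved_queries: List[str] = field(default_factory=list)
    holds: List[str] = field(default_factory=list)
    accounts: List[str] = field(default_factory=list)
    exports: List[str] = field(default_factory=list)
    # Each audit entry is (timestamp, user, action).
    audit_log: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def _log(self, user: str, action: str) -> None:
        self.audit_log.append((datetime.now(timezone.utc), user, action))

    def add_hold(self, user: str, hold_name: str) -> None:
        """Record a hold in this Matter and note who added it."""
        self.holds.append(hold_name)
        self._log(user, f"added hold {hold_name}")

    def save_query(self, user: str, query: str) -> None:
        """Save a search query in this Matter and note who saved it."""
        self.saved_queries.append(query)
        self._log(user, f"saved query {query}")
```

The point of the model is that everything tied to one eDiscovery project, including the record of who did what, lives in one place.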
The idea behind matters is to make it easier for organizations to ensure that all the data related to their projects is organized, preserved, and accessible when needed. With matters, organizations can also easily comply with their legal requirements regarding data retention.
When creating a matter in Google Vault, users can choose from matter-level retention and item-level retention. Matter-level retention applies to all items in the matter, while item-level retention allows users to specify different retention periods for individual items; this flexibility gives organizations more control over their data retention policies.
Essentially, any time something requires you to interface with your data outside the normal flow of setting retention rules, it can be contained in a single Matter to keep it together. You can read more about how these Matters work here.
Using Filerev for Retention Policy Enforcement
You may have noticed that not all Google account holders have access to Vault. What if you want to set up and manage data retention policies but you can’t access Vault to do it yourself?
In these cases, I recommend Filerev. While the app doesn’t have automatic retention rule enforcement, it does have the ability to sort your files by date and data type, so you can run searches to find data that has aged out of your policies and purge it through Filerev directly. It’s a very useful feature and one of many ways that Filerev can help you manage your data throughout a Google account.
Filerev is also useful in other ways, even if you already have access to Vault. I highly recommend giving it a try, so you can see what it can do for you. If you have any questions, feel free to ask me. I’m happy to help!
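For readers who would rather script this kind of age-based search themselves, the Google Drive API accepts a search string (the `q` parameter of `files.list`) that can filter by modification time and file type. The helper below only builds that string; the function name is hypothetical, and actually running the search would still require an authenticated Drive API client, which is outside the scope of this sketch.

```python
from datetime import datetime, timezone
from typing import Optional


def build_expired_files_query(cutoff: datetime,
                              mime_type: Optional[str] = None) -> str:
    """Build a Drive search string for non-trashed files modified before cutoff.

    `cutoff` must be timezone-aware; it is converted to UTC, which is what
    Drive assumes when a timestamp carries no offset.
    """
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    clauses = [f"modifiedTime < '{stamp}'", "trashed = false"]
    if mime_type is not None:
        clauses.append(f"mimeType = '{mime_type}'")
    return " and ".join(clauses)
```

Treat the results of any such search as candidates to review, not as a list to delete blindly, especially if some of those files may be subject to a hold.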