Protect your sensitive data with Microsoft Purview

ALIF Consulting
Jan 19, 2023

Updated: May 21, 2024

Implement Microsoft Purview Information Protection (formerly Microsoft Information Protection) capabilities to help you discover, classify, and protect sensitive information wherever it lives or travels.

These information protection capabilities give you the tools to know your data, protect your data, and prevent data loss.

Use the following sections to learn more about the available capabilities and how to get started with each one.

Know your data

To understand your data landscape and identify sensitive data across your hybrid environment, use the following capabilities:

Capability	What problems does it solve?	Get started
Sensitive information types	Identifies sensitive data by using built-in or custom regular expressions or a function. Corroborative evidence includes keywords, confidence levels, and proximity.	Customize a built-in sensitive information type.
Trainable classifiers	Identifies sensitive data by using examples of the data you're interested in rather than identifying elements in the item (pattern matching). You can use built-in classifiers or train a classifier with your content.	Get started with trainable classifiers
Data classification	A graphical identification of items in your organization that have a sensitivity label, a retention label, or have been classified. You can also use this information to gain insights into your users’ actions on these items.	Get started with Content Explorer Get started with Activity Explorer

Sensitive Information Types

Microsoft Purview provides three ways of identifying items so that they can be classified:

manually by users
automated pattern recognition, like sensitive information types
machine learning

Sensitive information types (SITs) are pattern-based classifiers. They identify sensitive items by detecting information such as social security, credit card, or bank account numbers. Microsoft provides many preconfigured SITs, and you can also create your own.
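
As a rough illustration of how a pattern-based classifier combines a regular expression with corroborative keywords, proximity, and confidence levels, here is a minimal Python sketch. The pattern, keyword list, window size, and the `detect_ssn` function are all hypothetical; they are not the definitions Microsoft Purview uses for its built-in SITs.

```python
import re

# Hypothetical pattern for an SSN-like number; built-in SITs are more elaborate.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Hypothetical corroborative keywords.
KEYWORDS = ("ssn", "social security")
# Assumed proximity window: characters searched on each side of a match.
PROXIMITY = 50


def detect_ssn(text):
    """Return (matched_text, confidence) pairs for SSN-like patterns."""
    results = []
    lowered = text.lower()
    for match in SSN_PATTERN.finditer(text):
        start = max(0, match.start() - PROXIMITY)
        end = min(len(text), match.end() + PROXIMITY)
        window = lowered[start:end]
        # A keyword near the match is corroborative evidence and raises confidence.
        has_keyword = any(keyword in window for keyword in KEYWORDS)
        results.append((match.group(), "high" if has_keyword else "low"))
    return results


print(detect_ssn("Employee SSN: 123-45-6789"))
print(detect_ssn("Order ref 987-65-4321 shipped"))
```

The first call reports the match with high confidence because "SSN" appears nearby; the second reports low confidence because only the pattern matched.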

Sensitive information types are used in

Trainable Classifiers

This categorization method is well suited to content that isn't easily identified by either the manual or the automated pattern-matching method. A classifier identifies an item based on what the item is, not on elements that are in the item (pattern matching). It learns how to identify a type of content by looking at hundreds of examples of the content you're interested in identifying.
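
To make the idea of learning from examples concrete, the following Python sketch trains a toy naive Bayes text classifier on a handful of labeled samples. This is only a stand-in for the concept: the `TinyClassifier` class and the sample data are hypothetical, and nothing here reflects how Microsoft Purview trainable classifiers are implemented, which also need far more examples.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase whitespace-separated words."""
    return text.lower().split()


class TinyClassifier:
    """Toy multinomial naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen in training
        self.doc_counts = Counter()  # label -> number of training examples

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokenize(text))

    def predict(self, text):
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in tokenize(text):
                # Laplace smoothing so unseen words do not zero out the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = TinyClassifier()
classifier.train([
    ("invoice total amount due payment", "invoice"),
    ("invoice number payment terms net 30", "invoice"),
    ("experience education skills references", "resume"),
    ("work experience skills and education history", "resume"),
])
print(classifier.predict("payment due on this invoice"))
print(classifier.predict("skills and work history"))
```

The classifier never looks for a fixed pattern; it scores new text against what it saw in the examples, which is why the quality and number of training samples matter.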

Where you can use classifiers

Classifiers are available to use as a condition for:

Office auto-labeling with sensitivity labels
Auto-apply retention label policy based on a condition
Communication compliance
Sensitivity labels can use classifiers as conditions; see Apply a sensitivity label to content automatically.
Data loss prevention

Types of classifiers

Pre-trained classifiers - Microsoft has created and pre-trained multiple classifiers that you can start using without training them. These classifiers appear with the status Ready to Use.
Custom trainable classifiers - If you have content identification and categorization needs that extend beyond what the pre-trained classifiers cover, you can create and train your own classifiers.

Data Classification

As a Microsoft 365 administrator or compliance administrator, you can evaluate and then tag content in your organization to control where it goes, protect it no matter where it is, and ensure that it is preserved and deleted according to your organization's needs. You do this through the application of sensitivity labels, retention labels, and sensitive information type classification. There are various ways to do the discovery, evaluation, and tagging, but the end result is that you may have many documents and emails tagged and classified with one or both of these labels.

After you apply your retention labels and sensitivity labels, you'll want to see how the labels are being used across your tenant and what is being done with those items. The data classification page provides visibility into that body of content, specifically:

the number of items that have been classified as a sensitive information type and what those classifications are
the top applied sensitivity labels in both Microsoft 365 and Azure Information Protection
the top-applied retention labels
a summary of activities that users are taking on your sensitive content
the locations of your sensitive and retained data

You also manage these features on the data classification page:

Protect your data

Capability	What problems does it solve?	Get started
Sensitivity labels	A single labeling solution across apps, services, and devices to protect your data as it travels inside and outside your organization. Example scenarios: manage sensitivity labels for Office apps; encrypt documents and emails; apply and view labels in Power BI. For a comprehensive list of supported scenarios for sensitivity labels, see the Get Started documentation.	Get started with sensitivity labels
Azure Information Protection unified labeling client	For Windows computers, extends labeling to File Explorer and PowerShell, with additional features for Office apps if needed.	Azure Information Protection unified labeling client administrator guide
Double Key Encryption	Ensures that, under all circumstances, only your organization can decrypt protected content, or helps you meet regulatory requirements to hold encryption keys within a geographical boundary.	Deploy Double Key Encryption
Office 365 Message Encryption (OME)	Encrypts email messages and attached documents that are sent to any user on any device so only authorized recipients can read emailed information. Example scenario: Revoke email encrypted by Advanced Message Encryption	Set up new Message Encryption capabilities
Service encryption with Customer Key	Protects against viewing of data by unauthorized systems or personnel, and complements BitLocker disk encryption in Microsoft data centers.	Set up Customer Key for Office 365
SharePoint Information Rights Management (IRM)	Protects SharePoint lists and libraries so that when a user checks out a document, the downloaded file is protected and only authorized people can view and use it according to the policies that you specify.	Set up Information Rights Management (IRM) in the SharePoint admin center
Rights Management connector	Provides protection only, for existing on-premises deployments that use Exchange or SharePoint Server, or file servers that run Windows Server and File Classification Infrastructure (FCI).	Steps to deploy the RMS connector
Information protection scanner	Discovers, labels, and protects sensitive information that resides in data stores that are on-premises.	Configuring and installing the information protection scanner
Microsoft Defender for Cloud Apps	Discovers, labels, and protects sensitive information that resides in data stores that are in the cloud.	Discover, classify, label, and protect regulated and sensitive data stored in the cloud
Microsoft Purview Data Map	Identifies sensitive data and applies automatic labeling to content in Microsoft Purview Data Map assets. These include files in storage such as Azure Data Lake and Azure Files, and schematized data such as columns in Azure SQL DB and Azure Cosmos DB.	Labeling in Microsoft Purview Data Map
Microsoft Information Protection SDK	Extends sensitivity labels to third-party apps and services. Example scenario: Set and get a sensitivity label (C++)	Microsoft Information Protection (MIP) SDK setup and configuration

Sensitivity Labels

Sensitivity labels from Microsoft Purview Information Protection let you classify and protect your organization's data while making sure that user productivity and their ability to collaborate aren’t hindered.

What a sensitivity label is

When you assign a sensitivity label to content, it's like a stamp that's applied and is:

Customizable. Specific to your organization and business needs, you can create categories for different levels of sensitive content in your organization. For example, Personal, Public, General, Confidential, and Highly Confidential.
Clear text. Because a label is stored in clear text in the metadata for files and emails, third-party apps and services can read it and then apply their protective actions if required.
Persistent. Because the label is stored in metadata for files and emails, the label stays with the content, no matter where it's saved or stored. The unique label identification becomes the basis for applying and enforcing policies that you configure.
Encryption. Encryption is an important part of your file protection and information protection strategy. Sensitivity labels can be used to encrypt documents and emails.
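
Because the label is stored as clear-text metadata, a third-party app can read it without decrypting anything. The Python sketch below assumes the file's custom properties have already been extracted into a dictionary, and that label properties follow a `MSIP_Label_<GUID>_<Field>` naming pattern. Treat that key format, the sample GUID, and the `read_labels` helper as assumptions for illustration, and check the current Microsoft documentation before relying on them.

```python
import re

# Assumed key format: MSIP_Label_<36-character GUID>_<Field>
LABEL_KEY = re.compile(
    r"^MSIP_Label_(?P<label_id>[0-9a-fA-F-]{36})_(?P<field>\w+)$"
)


def read_labels(properties):
    """Group label-related custom properties by label ID."""
    labels = {}
    for key, value in properties.items():
        match = LABEL_KEY.match(key)
        if match:
            labels.setdefault(match["label_id"], {})[match["field"]] = value
    return labels


# Hypothetical properties, as they might be extracted from a document.
sample_properties = {
    "MSIP_Label_00000000-0000-0000-0000-000000000001_Enabled": "true",
    "MSIP_Label_00000000-0000-0000-0000-000000000001_Name": "Confidential",
    "Author": "Example User",
}
print(read_labels(sample_properties))
```

An app could then base its own protective action, such as blocking an upload, on the label name or ID it finds.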

Prevent Data Loss

To help prevent the accidental oversharing of sensitive information, use the following capabilities:

Capability	What problems does it solve?	Get started
Microsoft Purview Data Loss Prevention	Helps prevent the unintentional sharing of sensitive items.	Get started with the default DLP policy
Endpoint data loss prevention	Extends DLP capabilities to items that are used and shared on Windows 10 computers.	Get started with Endpoint data loss prevention
Microsoft Compliance Extension	Extends DLP capabilities to the Chrome browser	Get started with the Microsoft Compliance Extension
Microsoft Purview data loss prevention on-premises scanner (preview)	Extends DLP monitoring of file activities and protective actions for those files to on-premises file shares and SharePoint folders and document libraries.	Get started with Microsoft Purview data loss prevention on-premises scanner (preview)
Protect sensitive information in Microsoft Teams chat and channel messages	Extends some DLP functionality to Teams chat and channel messages	Learn about the default data loss prevention policy in Microsoft Teams (preview)

Data Loss Prevention

Organizations have sensitive information under their control, such as financial data, proprietary data, credit card numbers, health records, or social security numbers. To help protect this sensitive data and reduce risk, they need a way to prevent their users from inappropriately sharing it with people who shouldn't have it. This practice is called data loss prevention (DLP).

In Microsoft Purview, you implement data loss prevention by defining and applying DLP policies. With a DLP policy, you can identify, monitor, and automatically protect sensitive items across:

Microsoft 365 services such as Teams, Exchange, SharePoint, and OneDrive
Office applications such as Word, Excel, and PowerPoint
Windows 10, Windows 11, and macOS (three latest released versions) endpoints
non-Microsoft cloud apps
on-premises file shares and on-premises SharePoint

DLP detects sensitive items by using deep content analysis, not just a simple text scan. Content is analyzed for primary data matches to keywords, by the evaluation of regular expressions, by internal function validation, and by secondary data matches that are in proximity to the primary data match. Beyond that, DLP also uses machine learning algorithms and other methods to detect content that matches your DLP policies.
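
The layered analysis described above can be sketched in Python: a regular expression finds a primary match, a validation function (here the Luhn checksum used for payment card numbers) rejects false positives, and nearby keywords serve as secondary evidence. The pattern, keyword list, window size, and function names are hypothetical and greatly simplified; this is not the DLP engine's actual logic.

```python
import re

# Hypothetical primary pattern: 16 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Hypothetical secondary evidence keywords.
SECONDARY_KEYWORDS = ("visa", "card", "expires", "cvv")
# Assumed proximity window, in characters, around the primary match.
WINDOW = 100


def luhn_valid(number):
    """Validate a digit string with the Luhn checksum."""
    digits = [int(char) for char in number if char.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text):
    """Return validated card-like matches with nearby supporting keywords."""
    findings = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        if not luhn_valid(match.group()):
            continue  # function validation failed: treat as a false positive
        start = max(0, match.start() - WINDOW)
        end = match.end() + WINDOW
        evidence = [k for k in SECONDARY_KEYWORDS if k in lowered[start:end]]
        findings.append({"match": match.group(), "evidence": evidence})
    return findings


print(find_card_numbers("Visa card 4111 1111 1111 1111 expires 12/30"))
print(find_card_numbers("Tracking 1234 5678 9012 3456"))
```

The second string matches the pattern but fails the checksum, so it produces no finding. A plain text scan would have flagged both.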