MIP: Notes from the field

1. Introduction

Is information protection critical or crucial to an organization?

For most of us, the answer seems to be an obvious YES. However, when it comes to directly investing in information protection mechanisms, the discussion often turns to “Should I invest in information protection tools this year, or can it wait until next year?”. This is possibly because information protection failures are often a result of the failure of broader security controls.

You might have heard of the Cyber Kill Chain framework, in which a malicious actor goes through different phases such as reconnaissance, weaponization, delivery, and exploitation. Typically, the original intention of the intruders behind such attacks is data exfiltration. With the rapid increase in the number of security incidents worldwide, would an appropriate information protection solution reduce the severity of such an intrusion?

I guess we will never know the answers to some of these questions. But it is safe to say that the adverse impact of these attacks will be far less if your sensitive data is identified, encrypted, and protected against loss.

Building a comprehensive information protection solution is often a journey and, depending on where an organization is in this journey, it could be a multi-year, multi-phased program.


It is also important to note that in addition to the components discussed in this article, elements such as Identity & Access Management (IAM) and device management controls should also be part of your information protection strategy.

2. Where are your crown jewels?

These days, due to their existing Information Security policies and/or compliance requirements, most organizations are conscious of the need to protect their data. However, they often struggle to build a complete, up-to-date, and clear picture of what data they want to protect; in other words, of where their “crown jewels” are located.

It is also important to note that a “one-size-fits-all” approach doesn’t work for information protection; you need an information protection model that works for you and your organization.

Most of my current engagements require me to work on data classification and protection, but very few on data discovery. However, in an ideal world, I recommend that organizations start with a data discovery phase to “take stock” of the data lying around. The results can then be used to create a data classification model that suits the organization.

Data discovery


To help my customers discover their data, I recommend using a combination of tools that together provide an integrated view and can cover the majority of their data repositories.

This requires organizations to have a clear view of where their data is currently stored. It also needs careful planning to determine which tools are appropriate for each data store.

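I recommend the following for proactive scanning (see also the summary table in Section 5):

  • On-premises files: AIP scanner
  • Office 365 repositories: Office 365 auto-labeling
  • 3rd party cloud apps: MCAS File policies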

It is also recommended to leverage the features below for added visibility of your sensitive information:

  • Analytics Overview (Know your data): Overview of how the labels are being used across your tenant and what is being done with those items
  • Content and Activity explorer: Additional details on the content and activity

The same technologies can be used for auto-classification.

Data classification


One of the first questions that I usually ask my customers on every engagement is “what is your main driver for enabling MIP?”. Most of the time, “protecting data” is their answer. They are not wrong, but what they often miss is the other key advantage that MIP brings – assigning value to your information in a way that humans and systems can understand and handle (e.g. apply protection).

Designing data classification for an organization can become a complicated and endless exercise. However, I recommend the following to make this process easier.

  • Start with what already exists – Known classification models already exist and can be leveraged as a baseline while defining the model for your organization.
    • Microsoft has also recently released a white paper to help organizations create their data classification framework.
  • Make it simple – Keeping your data classification simple is extremely important when defining your model. When end-users apply classification manually, the more complex the model is, the more confused they will be about which classification is right for their data. This often leads to information remaining unclassified or, at worst, being incorrectly classified. As an example, I usually recommend having a maximum of 4 to 6 top-level sensitivity labels.
  • User training – Even with a simple classification taxonomy, end-user adoption of the solution could be low. This is often because, even though end-users understand how sensitive the information is, they may not understand what a label implies (e.g. encryption, mails being blocked etc.). Unexpected implications of classification labels can lead to a poor user experience and frustration, which in turn leads to poor adoption of the solution. You must therefore plan end-user education, training, and awareness campaigns as part of your rollout. In our experience, awareness campaigns driven through posters, emails etc. are more effective than formal classroom-based training.
  • Designing your taxonomy: Top-label = sensitivity; sublabels = scope – MIP has a notion of optional sub-labels nested inside a top label (see the picture below). I always recommend designing a classification taxonomy in which top labels assign sensitivity to the information and optional sub-labels scope the expected recipients of this information. This ensures you keep visibility of where your valuable assets exist across your IT environment. In addition, these sub-labels can be used for other specific scenarios, such as exemption from security controls (e.g. no DLP, no protection).
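
The top-label/sub-label split can be sketched as a small data model. This is a minimal illustration, not MIP's actual label schema: the `Label` class, the label and sub-label names, and the `MAX_TOP_LABELS` cap of 6 are assumptions chosen to match the recommendations in the list above.

```python
from dataclasses import dataclass, field

# Assumption: cap taken from the rule of thumb of at most 4 to 6 top-level labels.
MAX_TOP_LABELS = 6


@dataclass
class Label:
    """A top-level sensitivity label with optional scope sub-labels."""
    name: str                                            # sensitivity, e.g. "Confidential"
    priority: int                                        # higher number = more sensitive
    sublabels: list[str] = field(default_factory=list)   # scope, e.g. "Partners"


def validate_taxonomy(labels: list[Label]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(labels) > MAX_TOP_LABELS:
        problems.append(
            f"{len(labels)} top-level labels; keep it to {MAX_TOP_LABELS} or fewer"
        )
    priorities = [label.priority for label in labels]
    if len(set(priorities)) != len(priorities):
        problems.append("two labels share the same priority")
    return problems


# Hypothetical taxonomy: every name here is illustrative only.
TAXONOMY = [
    Label("Public", 0),
    Label("General", 1),
    Label("Confidential", 2, ["Internal only", "Partners", "No protection"]),
    Label("Highly Confidential", 3, ["Named recipients", "Project team"]),
]

if __name__ == "__main__":
    print(validate_taxonomy(TAXONOMY))  # prints [] for this taxonomy
    for label in sorted(TAXONOMY, key=lambda item: item.priority):
        print(label.name, "->", label.sublabels or "(no sub-labels)")
```

Keeping sensitivity in the top label and scope in the sub-label means that a report grouped by top label still shows where the most valuable information sits, whatever scope was chosen.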

Finally, the recommendations below should also be considered while defining your data classification model.

  1. Assign every sensitive information type to a label and configure MIP automatic/recommended classification. This ensures that appropriate classification occurs at the time of creation, as users work in their Office applications.
  2. Ensure all emails and all new or existing documents that are opened and saved are classified using default or mandatory labeling. I recommend “mandatory labeling” for Office documents and “default label” for Outlook. Forcing users to classify every email they send during the day might be annoying, so a “default label” should be more acceptable to them. I would expect fewer documents than emails to be created or edited daily, and therefore suggest “mandatory labeling” for documents.
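
The two recommendations above can be sketched in a few lines: detected sensitive information types drive a recommended label, and a default label or mandatory labeling decides what happens when nothing else applies. Everything in this sketch is hypothetical (the regular expressions, the info type and label names, the function names, and the choice to let the user's selection win over the recommendation). Real sensitive information type detection is considerably more sophisticated than these toy patterns.

```python
import re
from typing import Optional

# Hypothetical patterns, for illustration only.
INFO_TYPE_PATTERNS = {
    "employee_id": re.compile(r"\bEMP-\d{6}\b"),
    "passport_number": re.compile(r"\b[A-Z]{2}\d{7}\b"),
}

# Hypothetical assignment of each info type to a label, plus label priorities.
INFO_TYPE_TO_LABEL = {
    "employee_id": "Confidential",
    "passport_number": "Highly Confidential",
}
LABEL_PRIORITY = {"General": 1, "Confidential": 2, "Highly Confidential": 3}


def detect_info_types(text: str) -> set:
    """Return the names of all info types whose pattern occurs in the text."""
    return {name for name, pattern in INFO_TYPE_PATTERNS.items() if pattern.search(text)}


def recommend_label(text: str) -> Optional[str]:
    """Recommend the most sensitive label implied by the detected info types."""
    labels = [INFO_TYPE_TO_LABEL[t] for t in detect_info_types(text)]
    return max(labels, key=LABEL_PRIORITY.__getitem__, default=None)


def resolve_label(user_choice: Optional[str], text: str, *,
                  default: Optional[str], mandatory: bool) -> str:
    """Pick the final label: user choice, else recommendation, else default.

    With mandatory labeling and no default, an unlabeled item is rejected,
    which models the user being prompted to choose a label.
    """
    label = user_choice or recommend_label(text) or default
    if label is None and mandatory:
        raise ValueError("a label must be selected before saving")
    return label or "Unlabeled"


if __name__ == "__main__":
    # Email-style behavior: a default label, no prompt.
    print(resolve_label(None, "Lunch at noon?", default="General", mandatory=False))
    # Document-style behavior: mandatory labeling, content drives the recommendation.
    print(resolve_label(None, "Passport AB1234567 attached", default=None, mandatory=True))
```

The first call falls back to the default label, and the second picks up the label tied to the passport pattern. A document with no sensitive content and no user choice would raise an error under mandatory labeling, which is the "prompt the user" case.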

3. Preventing Data Leakage


Data Loss Prevention (DLP) is often the ultimate goal organizations want to achieve in an information protection effort. Frequently, this is also the most complex one to implement.

Key questions to consider:

  • What data must be considered and protected from data loss?
  • Which channels could the data be lost through (mails, files, Teams conversations)?
  • Which groups of users handle the data that you want to protect from data loss (groups, domains, external entities)?

Once you have answers for these, several layers of DLP should be implemented.

It can be complex to draft and implement a DLP strategy based only on sensitive information types, without context. For example, you might decide that sharing employee IDs is not permitted unless they are shared with a specific partner organization, while passport numbers must not be shared at all. Compliance requirements may require organizations to create policies/rules to protect every sensitive information type, which eventually makes DLP mechanisms complex to implement and monitor. Therefore, DLP technologies should leverage sensitivity labels as the first (but not the only) line of defence. In addition, when encryption is applied with sensitivity labels, it also acts as your last layer of protection.
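
The employee ID and passport number example can be written down as a small rule table, which shows how quickly context (here, the recipient's domain) enters the picture. The sketch below is illustrative only: the `InfoTypeRule` structure, the domain `partner.example`, and the function name are assumptions and do not correspond to any actual DLP policy syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoTypeRule:
    """Sharing rule for one sensitive info type (hypothetical structure)."""
    info_type: str
    allowed_domains: frozenset = frozenset()  # empty = may not be shared at all


# Hypothetical rules mirroring the example in the text.
RULES = {
    "employee_id": InfoTypeRule("employee_id", frozenset({"partner.example"})),
    "passport_number": InfoTypeRule("passport_number"),
}


def is_sharing_allowed(detected_types: set, recipient: str) -> bool:
    """Allow sharing only if every detected info type permits the recipient's domain.

    Info types without a rule are ignored here; a stricter design could
    block them instead.
    """
    domain = recipient.rsplit("@", 1)[-1].lower()
    return all(
        domain in RULES[info_type].allowed_domains
        for info_type in detected_types
        if info_type in RULES
    )


if __name__ == "__main__":
    print(is_sharing_allowed({"employee_id"}, "bob@partner.example"))      # True
    print(is_sharing_allowed({"employee_id"}, "eve@other.example"))        # False
    print(is_sharing_allowed({"passport_number"}, "bob@partner.example"))  # False
```

Every additional info type adds another row and another set of exceptions to maintain, which is why label-based rules are attractive as the first line of defence.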

DLP based on sensitivity labels is probably the most agile method, but this would require all information to be correctly classified. Being realistic and pragmatic, we all know that this cannot be guaranteed. For this reason, DLP based on sensitive info types would still be required.

As you will be using both, you also need to ensure that the actions you take when a sensitive information type is detected remain aligned with the actions you take for the corresponding labels.
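
One way to keep the two approaches aligned is to check the configuration itself. The following sketch assumes you have exported your policies by hand into three plain dictionaries (info type to label, label to action, info type to action). These structures, the action names, and the function name are hypothetical and do not come from any Microsoft API.

```python
def find_misaligned_actions(info_type_to_label, label_actions, info_type_actions):
    """Return (info_type, info_type_action, label_action) for every mismatch.

    All three arguments are plain dicts (hypothetical policy export):
      info_type_to_label: info type name -> label it is assigned to
      label_actions:      label name     -> DLP action for that label
      info_type_actions:  info type name -> DLP action for that info type
    """
    mismatches = []
    for info_type, label in info_type_to_label.items():
        label_action = label_actions.get(label)
        type_action = info_type_actions.get(info_type)
        if type_action != label_action:
            mismatches.append((info_type, type_action, label_action))
    return mismatches


# Illustrative data only.
INFO_TYPE_TO_LABEL = {
    "employee_id": "Confidential",
    "passport_number": "Highly Confidential",
}
LABEL_ACTIONS = {"Confidential": "warn", "Highly Confidential": "block"}
INFO_TYPE_ACTIONS = {"employee_id": "warn", "passport_number": "warn"}

if __name__ == "__main__":
    for info_type, type_action, label_action in find_misaligned_actions(
        INFO_TYPE_TO_LABEL, LABEL_ACTIONS, INFO_TYPE_ACTIONS
    ):
        print(f"{info_type}: info type rule says {type_action!r}, "
              f"label rule says {label_action!r}")
```

With the sample data, the passport number rule is reported because content detected by info type would only trigger a warning, while the same content carrying its label would be blocked.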

DLP based on Sensitivity labels

Emails: I usually recommend that my customers use Exchange Transport Rules (ETR) to block emails based on sensitivity labels.

Sharing from cloud repositories: Currently, only Microsoft Cloud App Security (MCAS), using file policies, has the capability to restrict sharing with external users based on the sensitivity label.

Note: Microsoft is building a Unified DLP platform based on M365 DLP, which will become sensitivity-label aware. Once it is released, DLP mechanisms based on sensitivity labels should be migrated to M365 DLP.

DLP based on sensitive info types

Exchange Online, SharePoint Online, OneDrive for Business and Teams: I recommend using O365 DLP for sensitive info types in Office 365.

3rd party cloud apps: MCAS File policies can be leveraged.

It is important to note that it takes additional time for the Office Data Loss Prevention (DLP) policy to scan the content and apply rules to help protect sensitive content. If external sharing is turned on, sensitive content could be shared and accessed by guests before the Office DLP rule finishes processing. To prevent guests from accessing newly added files until at least one Office DLP policy has scanned the content of the file, it is also recommended to enable some of the advanced capabilities in SharePoint Online, such as “Sensitive by Default”.

When DLP technologies and sensitivity labels work in conjunction, they enable the most powerful DLP approach (e.g. block sharing of sensitive information, except if classified as Confidential).

4. Adapt your information protection strategy to new ways of work (BYOD, remote working)


In the old world, remote access using a VPN or a similar solution was always there. However, in the new world, the majority of us take our work laptops home, or even use BYOD and work from home (in most cases without using a VPN), so the definition of a meaningful boundary within which your data resides is blurred. Therefore, the traditional “protect your castle” mindset needs to be adapted to reflect the new reality of life – remote working and boundaryless data.

Based on the paradigm shift of “Identity as the new perimeter” and “Zero Trust”, we should think about how to establish an information protection model for the new ways of working – people using shared computers, working from coffee shops/home/untrusted public networks etc. As you focus on protecting your data (both at rest and in-transit), planning and implementing identity protection (such as MFA for 100% of your employees), leveraging Secure Score to provide intelligent insights, and protecting your endpoints (such as using Advanced Threat Protection solutions, device management using Intune etc.) should also be considered and planned for.
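
The “identity as the new perimeter” idea can be illustrated with a small access decision function. The signals (`mfa_passed`, `device_managed`), the label names, and the decision order are assumptions made for illustration. The three outcomes mirror the access levels listed in the summary table in Section 5 (full access, web-only access, block access). This is a conceptual sketch and not how Conditional Access policies are actually configured.

```python
from enum import Enum


class Access(Enum):
    FULL = "full access"
    WEB_ONLY = "web-only access (no download, print, or sync)"
    BLOCK = "block access"


# Hypothetical: labels considered too sensitive for unmanaged devices.
BLOCK_ON_UNMANAGED = {"Highly Confidential"}


def decide_access(mfa_passed: bool, device_managed: bool, container_label: str) -> Access:
    """Decide access to a labeled container from identity and device signals."""
    if not mfa_passed:
        return Access.BLOCK        # identity is the first gate
    if device_managed:
        return Access.FULL         # trusted identity on a trusted device
    if container_label in BLOCK_ON_UNMANAGED:
        return Access.BLOCK        # too sensitive for an unmanaged device
    return Access.WEB_ONLY         # unmanaged device, lower sensitivity


if __name__ == "__main__":
    print(decide_access(True, True, "Highly Confidential").value)
    print(decide_access(True, False, "Confidential").value)
    print(decide_access(True, False, "Highly Confidential").value)
```

Note that the network location never appears in the decision: only who the user is, what device they are on, and how sensitive the container is.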

But do you stop there? Do you need to be watchful of those curious people craning forward to look at what you have on your screen in the coffee shop? I will leave that up to you to figure out.

5. Summary

Solution components discussed in the previous sections can be summarized as follows:

Use case: Scan and classify on-premises files
  Components involved:
    • AIP Scanner
    • AIP Client
  Expected behavior:
    • AIP scanner – Time to scan depends on the number and size of files + number and performance of scanners.
    • AIP client – Only scans documents opened within Office clients where the AIP add-in is installed.

Use case: Scan and classify cloud files
  Components involved:
    • Office 365 auto-labeling for Office 365 repositories
    • MCAS File Policies for 3rd party cloud apps
  Expected behavior:
    • Office 365 auto-labeling – Run in simulation mode at least 24h before applying labels.

Use case: Protect sensitive data from being shared with unauthorized recipients (emails, SharePoint/OneDrive links, Teams chats)
  Components involved:
    • Sensitive by default (SBD) enabled in SharePoint Online
    • Sensitivity label with encryption
    • Exchange Transport Rules (ETR)
    • Office 365 DLP
    • MCAS File Policies
    • MIP site and group settings in Office 365
  Expected behavior:
    • SBD prevents external sharing of files until the file is scanned by Office 365 DLP.
    • Encryption of the sensitivity label prevents the consumption of information by any unauthorized user, regardless of where the data resides.
    • ETR prevents mail exchanges based on the recipients and the sensitivity of the mail.
    • Office 365 DLP centralizes DLP mechanisms of Exchange, SharePoint, OneDrive, and Teams to prevent sharing sensitive information.
    • Prevent sharing based on sensitivity labels and info types using MCAS.
    • Prevent guest access invitations on containers (O365 groups, SPO sites, Teams channels) with sensitivity labels applied.

Use case: Protect sensitive information from being exfiltrated to unauthorized/unmanaged resources (endpoints, mobile/cloud apps)
  Components involved:
    • Intune MAM
    • MCAS Session policies
    • Conditional Access
    • MIP site and group settings in Office 365
  Expected behavior:
    • Protect sensitive information on Windows endpoints.
    • Protect sensitive information on managed mobile applications.
    • Prevent downloading sensitive information on untrusted endpoints.
    • Control access of untrusted devices to “containers” based on the sensitivity label assigned (full access; web-only access without download, print, or sync options; or block access).

Table 2 – Information protection Solution Summary

Thanks for reading and we hope you find this useful!