Cyber back to school: Microsoft Token Theft Unveiled

Introduction

I am thrilled to participate in the Cyber Back to School initiative hosted during cyber awareness month! This session is all about Primary Refresh Token VS Access Token stealing in Microsoft Entra ID, and will show the practical countermeasures for each of them. I preferred to write a blog post for this instead of a PowerPoint deck since there are a lot of technical details and references I want to cover.

Refresh VS Access token

Before we can go into how PRT and Access token stealing work, we need to cover the basics of what the difference is between these two tokens. This is vital to understand why certain countermeasures only work for one specific scenario.

Access tokens

As explained in the Microsoft documentation, access tokens are a type of security token designed for authorization, granting access to specific resources on behalf of an authenticated user. These tokens contain information (called 'claims') which will eventually determine whether a user is authorized to access a specific resource.

Each access token has three parts:

  • Header - Provides information on how to validate the token.
  • Payload - Contains the token claims about the user and application that is attempting to call a service.
  • Signature - Raw material used to validate the token.

The parts are separated by periods (.) and each one is Base64URL-encoded. An example of an access token can be found below, where the red part is the header, the blue part is the payload, and the green part is the signature:

Using a website such as jwt.ms, we can view what is inside this access token:
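The same inspection can be done locally. Below is a minimal sketch, assuming Python 3.9 or newer and only the standard library. The helper names and the sample token are made up for illustration, and the code only decodes the token; it does not validate the signature.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64URL segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    """Encode bytes as unpadded Base64URL, the way JWT segments are written."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_jwt_unverified(token: str) -> tuple[dict, dict, bytes]:
    """Split a JWT into header, payload and raw signature WITHOUT validating it."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)


# Made-up token, built here only so the example is self-contained.
sample_token = ".".join([
    b64url_encode(json.dumps({"typ": "JWT", "alg": "RS256"}).encode()),
    b64url_encode(json.dumps({"aud": "https://graph.microsoft.com", "amr": ["pwd", "mfa"]}).encode()),
    b64url_encode(b"not-a-real-signature"),
])

header, payload, signature = decode_jwt_unverified(sample_token)
print(header["alg"], payload["aud"], payload["amr"])
```

Because the header and payload are only encoded and not encrypted, anyone holding the token can read its claims this way.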

The blue part being the 'payload' is the part that contains the claims that are interesting for an attacker. A couple of interesting claims include:

  • aud - Identifies the intended audience of the token (client ID or resource URI)
  • iat - Timestamp that indicates when the authentication of this token occurred
  • exp - Timestamp on or after which the token must not be accepted for processing
  • amr - Identifies the authentication method of the subject of the token
  • azp - The application ID of the client using the token. The application can act as itself or on behalf of a user. The ID typically represents an application object in Entra ID, but can also represent a service principal object.
  • azpacr - Indicates the authentication method of the client (0 for a public client, 1 for a client ID and secret, 2 for a certificate)
  • scp - The set of scopes exposed by the application for which the client application has requested consent (only for user tokens)
  • roles - The set of permissions exposed by the application that the requested user has been given permissions to call.
  • sub - The principal associated with the token.

For the full list of claims that Microsoft Entra ID issues in its access tokens, see the access token claims reference in the Microsoft documentation.

The amr claim

One of the interesting claims of an access token is the amr claim. This claim contains the authentication method that was used, and tells you if the user for example used MFA to authenticate. In the example below, you can see that I authenticated to a service using my password and MFA.

💡
The application can use this claim to check whether MFA was performed. Conditional Access does not use it, since the access token is only issued after Conditional Access has been passed.
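An application-side check of this claim could look like the sketch below. It assumes the payload has already been decoded into a dictionary and that the amr claim is a list of strings such as "pwd" and "mfa". The function name is hypothetical.

```python
def used_mfa(payload: dict) -> bool:
    """Return True when the token's amr claim lists 'mfa'."""
    return "mfa" in payload.get("amr", [])


# Made-up payloads for illustration.
password_and_mfa = {"amr": ["pwd", "mfa"]}
password_only = {"amr": ["pwd"]}

print(used_mfa(password_and_mfa))
print(used_mfa(password_only))
```

In a real application this check only makes sense after the token's signature, issuer and audience have been validated.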

The access claims

For Entra ID access tokens, there are 4 claims which can be used to provide access:

  • roles - Contains the set of permissions that the calling user or application has been given permission to call. When using client-credential flow these permissions are used in place of user scopes. For user tokens, this set of values contains the assigned roles of the user on the target application.
  • groups - Provides the object IDs that represent the group memberships of the subject.
  • scp - Tells you the scopes to which the subject has access.
  • wids - Contains the tenant-wide roles assigned to the user.

Depending on these claims, the access token will provide access to certain resources or permissions. Below you can find an example of the wids claim of a normal user:

While the below wids claim is one from a Global Administrator:

Both claims have the 'b79fbf4d-3ef9-4689-8143-76b194e85509' ID, which is the default for all users. But when you search for the '62e90394-69f5-4237-9190-012177145e10' ID in the Microsoft documentation, you will find that this is the ID for the Global Administrator role:
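To make the comparison concrete, here is a small sketch that looks for the Global Administrator ID in a decoded payload. The two IDs are the ones quoted above. The function name and the sample payloads are made up for illustration.

```python
# Role template IDs quoted in the text above.
DEFAULT_USER_WID = "b79fbf4d-3ef9-4689-8143-76b194e85509"
GLOBAL_ADMIN_WID = "62e90394-69f5-4237-9190-012177145e10"


def is_global_admin(payload: dict) -> bool:
    """Return True when the wids claim contains the Global Administrator ID."""
    return GLOBAL_ADMIN_WID in payload.get("wids", [])


# Made-up payloads mirroring the two examples above.
normal_user = {"wids": [DEFAULT_USER_WID]}
global_admin = {"wids": [GLOBAL_ADMIN_WID, DEFAULT_USER_WID]}

print(is_global_admin(normal_user))
print(is_global_admin(global_admin))
```

This is also how an attacker can quickly judge the value of a stolen token: the privileged role is visible in plain sight once the payload is decoded.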

Access token lifetime

An access token has a variable default lifetime of between 60 and 90 minutes (75 minutes on average). This variation improves service resiliency, and prevents hourly spikes in traffic to Entra ID. This token lifetime is important to keep in mind, as there are a lot of misconceptions regarding these lifetimes.
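The lifetime of a specific token can be read from its own claims. The sketch below assumes a decoded payload whose iat and exp claims are Unix timestamps in seconds, and treats the difference between them as the lifetime. The function name and sample values are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def token_lifetime(payload: dict) -> timedelta:
    """Lifetime of a token, taken as the gap between its iat and exp claims."""
    return timedelta(seconds=payload["exp"] - payload["iat"])


# Made-up payload: issued at 09:00 UTC with a 75 minute lifetime.
issued = datetime(2024, 10, 1, 9, 0, tzinfo=timezone.utc)
sample = {
    "iat": int(issued.timestamp()),
    "exp": int((issued + timedelta(minutes=75)).timestamp()),
}

lifetime = token_lifetime(sample)
in_default_range = timedelta(minutes=60) <= lifetime <= timedelta(minutes=90)
print(lifetime, in_default_range)
```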

The figure below shows an example of the Entra ID OAuth 2.0 authorization code grant flow, including the following steps:

  1. Request of a new token using the OAuth bearer or refresh token
  2. Return of a new access and refresh token
  3. Calling the destined service with access token in authorization header
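Steps 1 and 3 can be sketched as follows. The example only builds the request data and sends nothing over the network. The tenant name, client ID and token placeholders are hypothetical, and the body uses the standard OAuth 2.0 refresh token grant parameters against the v2.0 token endpoint.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical values, used only for illustration.
TENANT = "contoso.onmicrosoft.com"
CLIENT_ID = "00000000-0000-0000-0000-000000000000"
TOKEN_ENDPOINT = f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0/token"


def refresh_request_body(client_id: str, refresh_token: str, scope: str) -> str:
    """Step 1: form-encoded body that redeems a refresh token at the token endpoint."""
    return urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "refresh_token": refresh_token,
        "scope": scope,
    })


def authorization_header(access_token: str) -> dict:
    """Step 3: the access token travels to the service as a bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


body = refresh_request_body(CLIENT_ID, "<refresh-token>", "https://graph.microsoft.com/.default")
print(TOKEN_ENDPOINT)
print(parse_qs(body)["grant_type"])
print(authorization_header("<access-token>"))
```

Note that only step 1 talks to the identity platform. The header built for step 3 goes straight to the service.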

Here you can see that the access token is ultimately presented directly to the application and no longer passes through the identity platform. This means that once the access token has been issued, you have no control over it. As Microsoft Learn explains, "Access tokens can be a security concern if access must be revoked within a time that is shorter than the lifetime of the token".

There are a lot of misconceptions out there when talking about conditional access sign-in frequency. Some people think that when they configure a sign-in frequency policy of 1 hour, the user will always be prompted to re-authenticate every hour. This is not always the case, since the maximum time between credential prompts is the token lifetime (which ranges between 60 and 90 minutes) plus the sign-in frequency interval. Below you can see an example where the total time between two credential prompts is 149 minutes, because there was a sign-in triggered before the Sign-in frequency threshold was exceeded:

💡
max access token lifetime + SIF = max. time between authentication prompts (90 min + 60 min = 150 min)
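The arithmetic in this note can be written out as a tiny helper. This is a sketch of the formula only, with a hypothetical function name, using the 90 minute upper bound of the default access token lifetime.

```python
from datetime import timedelta

MAX_ACCESS_TOKEN_LIFETIME = timedelta(minutes=90)


def max_time_between_prompts(sign_in_frequency: timedelta) -> timedelta:
    """Worst case: a token issued just before the SIF threshold lives its full lifetime."""
    return MAX_ACCESS_TOKEN_LIFETIME + sign_in_frequency


worst_case = max_time_between_prompts(timedelta(minutes=60))
print(worst_case)
```

With a one hour sign-in frequency this gives 150 minutes, which is why the 149 minutes observed in the example above is within expectations.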

Device claims?

Notice that an access token does not contain a claim representing the device of the user. This will be important to note for the end of the blog post 😄.

Refresh tokens

A refresh token is used to obtain a new access token and refresh token pair when the current access token expires. Refresh tokens can also be used to acquire access tokens for other resources, since they are bound to a combination of user and client rather than to a resource or tenant. This is important, because it explains later in this blog post why refresh tokens are so valuable. A refresh token is encrypted, and only the Microsoft identity platform can read it.

The Primary Refresh Token (PRT)

The Primary Refresh Token is a key artifact in Microsoft Entra authentication. It is a JSON Web Token (JWT) that enables single sign-on (SSO) across applications used on a device.

So what does it contain? Since a PRT is an opaque blob to the client, you cannot see what is inside it. Microsoft mentions that the PRT contains the same claims which can be found in a refresh token, but includes some device-specific claims such as:

  • Device ID - Used to determine authorization for Conditional Access based on the device state or compliance.
  • Session key - Encrypted symmetric key used to prove possession of the PRT when it is used to obtain tokens for applications.
💡
Aha! This is an important one to note down. The Device ID claim in the PRT is the one Conditional Access uses to check the device state and compliance during an access token request.

When is a PRT used? Device registration is a prerequisite for PRTs to be used and thus being able to do device-based authentication to Entra ID. For Microsoft Entra Joined and Hybrid Joined devices, the Microsoft Entra CloudAP plugin is the primary authority of the PRT, while for Microsoft Entra Registered devices, the Microsoft Entra WAM plugin is the primary authority of the PRT (since Windows logon isn't happening with the Microsoft Entra account).

When CloudAP is the primary authority of the PRT, the PRT gets renewed every 4 hours. When WAM is the primary authority, the PRT gets renewed during access token requests (whether silent or with user interaction). During PRT renewals, Conditional Access policies are not evaluated; they are only evaluated when the PRT is used to request access tokens.

As explained in the Microsoft docs, a PRT can get an MFA claim in two scenarios:

  • Sign in with Windows Hello for Business
  • MFA during WAM interactive sign-in

When an MFA-based PRT is used to request access tokens, the MFA claims get transferred to those tokens.

💡
Another important note here is that the MFA claim in the PRT can come from either of these two sources: a Windows Hello for Business sign-in, or MFA during a WAM interactive sign-in!

Token lifetime

Refresh tokens have a longer lifetime than access tokens. The default lifetime for most refresh tokens is 90 days, except for single-page apps where it is 24 hours. For every use of a refresh token, the token replaces itself with a fresh one.

As explained in the Access token lifetime part of this blog post, the effective lifetime of the refresh token can be limited by using Conditional Access sign-in frequency. With it, Conditional Access checks, each time the refresh token is presented to request a new access token, whether the user needs to sign in interactively again. Keep in mind the note below: the sign-in frequency interval does not always equal the time between interactive sign-ins.

💡
max access token lifetime + SIF = max. time between authentication prompts (90 min + 60 min = 150 min)

Sign-in frequency works best with applications that implement OAuth2 or OIDC protocols according to the standard, as long as they don't drop their own cookies and are redirected back to Microsoft Entra ID for authentication on a regular basis. In the past, sign-in frequency only applied to first-factor authentication, while it now applies to the second factor as well.

On Microsoft Entra joined and hybrid joined devices, unlocking the device or signing in interactively refreshes the PRT every 4 hours. For a PRT with an existing MFA claim to satisfy SIF and grant access, the time between its last refresh timestamp and the current timestamp must fall within the interval set in the SIF policy.
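The timestamp comparison described above can be sketched as a simple check. This is an illustrative model and not how Entra ID implements it. The function name, the 8 hour policy value and the inclusive boundary are assumptions.

```python
from datetime import datetime, timedelta, timezone


def prt_satisfies_sif(last_refresh: datetime, now: datetime, sign_in_frequency: timedelta) -> bool:
    """True when the PRT was refreshed recently enough to satisfy the SIF policy."""
    return now - last_refresh <= sign_in_frequency


now = datetime(2024, 10, 1, 12, 0, tzinfo=timezone.utc)
sif = timedelta(hours=8)  # hypothetical policy value

recently_refreshed = prt_satisfies_sif(now - timedelta(hours=3), now, sif)
stale = prt_satisfies_sif(now - timedelta(hours=9), now, sif)
print(recently_refreshed, stale)
```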

On Microsoft Entra Registered devices, unlock or sign-in does not satisfy the SIF policy because the user is not accessing the device with a Microsoft Entra account. For these devices, Microsoft Entra WAM plugin refreshes the PRT during native application authentication.

Revoking tokens

Let's say that a user is compromised and you want to revoke the user's tokens. This can be done by clicking the 'Revoke sessions' button in Entra ID, or by using the PowerShell command below:

Revoke-MgUserSignInSession -UserId $User.Id

But remember when we mentioned earlier that once an access token has been issued, you have no control over it? The same applies when revoking tokens. Revoking sessions in Entra ID revokes the refresh tokens in the background, meaning the attacker cannot get a new access token, but any access token that was already issued remains valid until it expires.
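The effect of revocation can be illustrated with a toy model. The class and function names below are hypothetical, and the model assumes a resource that only checks the expiry of the access token (in other words, no continuous access evaluation).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class IssuedTokens:
    """Toy model of the tokens a client holds after signing in."""
    access_token_expires: datetime
    refresh_token_revoked: bool = False


def can_call_api(tokens: IssuedTokens, now: datetime) -> bool:
    """The resource only looks at the access token's expiry, not at revocation."""
    return now < tokens.access_token_expires


def can_get_new_access_token(tokens: IssuedTokens) -> bool:
    """The identity platform refuses to redeem a revoked refresh token."""
    return not tokens.refresh_token_revoked


issued_at = datetime(2024, 10, 1, 9, 0, tzinfo=timezone.utc)
tokens = IssuedTokens(access_token_expires=issued_at + timedelta(minutes=75))

# An admin revokes the user's sessions 10 minutes after the token was issued.
tokens.refresh_token_revoked = True

print(can_get_new_access_token(tokens))
print(can_call_api(tokens, issued_at + timedelta(minutes=30)))
print(can_call_api(tokens, issued_at + timedelta(minutes=80)))
```

In this model the stolen access token keeps working for the remaining 65 minutes after revocation, and access only stops once the token expires and cannot be renewed.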

Another caveat to keep in mind is that you cannot revoke tokens for B2B users in the resource tenant; this needs to be done in the home tenant of the B2B user.

💡
This means that when a B2B user is compromised, you need to disable the account instead of trying to revoke the sessions.

PRT VS Access token stealing

During my time as a consultant, I noticed that a lot of organizations mix up the two concepts of access token stealing and PRT stealing. This results in organizations implementing countermeasures that do not apply to the scenario they are worried about, which gives a false sense of protection. Therefore, I wanted to set things straight and explain the differences for each scenario.

Access token stealing

Before going into what I call 'Access token stealing', I would like to refer to the Adversary-in-the-Middle (AiTM) Entra ID Attack and Defense playbook. This is a great community project created by Sami Lamppu and Thomas Naunheim which covers a lot of Entra ID related attacks and how you can prevent and detect them!

During access token stealing, an adversary tricks a user into visiting a malicious website that proxies the real Microsoft sign-in page. When the user enters credentials in the Microsoft login page through the malicious website, the attacker is able to steal not only the password, but also the access token that is issued.

As explained before, once an attacker gets an access token, you have no control over it anymore. The attacker can now use the access token to access an application for the remaining lifetime of the access token (between 60 and 90 minutes).

Mitigate access token stealing

There are multiple mitigations you can configure to prevent access token stealing as explained in the Azure Attack and Defense playbook. Below is a summary of the most common mitigations:

  • Require device state
  • Microsoft profile in Global Secure Access
  • Entra ID token protection in conditional access
  • Continuous access evaluation
  • Phishing-resistant MFA
    • Passkeys
    • Windows Hello for Business
    • Certificate-based authentication
  • Entra ID Protection
  • Defender for Cloud Apps session proxy

Let's talk a bit more about the first four mitigations.

Require device state

In most blog posts about AiTM token stealing, you will find that requiring a compliant device is a mitigation against this attack. While that is true, requiring any device state helps as well. If you, for example, require a device to be registered, the AiTM attack is prevented too. This is because the Cloud Authentication Provider (CloudAP) and Web Account Manager (WAM) plugins, which are responsible for requesting PRTs and for using PRTs to get access tokens, only do so against Microsoft URIs (remember that PRTs are the refresh tokens that carry the Device ID). So when an AiTM site proxies your request, the PRT of your device is not used. The request therefore lacks the Device ID and is blocked by Conditional Access, since the source device is not known. More info can be found in the Microsoft docs and in the AiTM playbook.

Microsoft profile in Global Secure Access

With the Microsoft profile in Global Secure Access, you can protect your users by requiring a 'compliant network location' in Conditional Access. What this does in the background is add the Microsoft SSE points of presence (PoPs) as trusted named locations, which can then be used in your Conditional Access policies.

Once configured, AiTM attacks will not succeed since the AiTM site is not able to tunnel the request through the Microsoft SSE backbone.

💡
An interesting thing I noticed is that the destination FQDNs of the application do not need to be acquired by the Microsoft profile for compliant network restrictions to work. Since the access token request is sent to the 'Entra ID and MSGraph' endpoints (login.microsoftonline.com), any application integrated with Entra ID Conditional Access can be used to require compliant networks, since the login.microsoftonline.com FQDN is included in the Microsoft profile.

In the example below I created a conditional access policy to block access to a third-party application Atlassian, except when the connection comes from a compliant network:

This is an example of how compliant network enforcement works for third-party applications, as long as they are SSO integrated with Entra ID.

Entra ID token protection

Token protection creates a cryptographically secure tie between the token and the device (client secret) it's issued to. Without the client secret, the bound token is useless. When a user registers a Windows 10 or newer device in Microsoft Entra ID, their primary identity is bound to the device. What this means: a policy can ensure that only bound sign-in session (or refresh) tokens, otherwise known as Primary Refresh Tokens (PRTs), are used by applications when requesting access to a resource. As a result, non-registered devices that do not have a PRT (like the AiTM proxy) will not be able to get an access token.

Be aware that there are proofs of concept (PoCs) that can evade Entra ID token protection using device code phishing. To remediate this, it is recommended to block unknown or unsupported device platforms, or to block the device code flow in Conditional Access.

Before configuring the policy, be aware of the prerequisites and limitations this brings.

Continuous access evaluation

With continuous access evaluation (CAE), you can make sure that when a 'critical event' such as a token revocation happens, the access token (over which you had limited control, if you remember) gets revoked. This makes sure that the user is denied access within minutes instead of an hour or more. Before you can use this, you need CAE-capable clients and applications, which are listed in the Microsoft documentation.

More information on CAE can be found on Microsoft Learn.
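One way to see whether a given access token was requested by a CAE-capable client is to look for the client capability claim. The sketch below assumes a decoded payload and relies on the xms_cc claim carrying the value cp1, as described in the Microsoft guidance for CAE-enabled APIs. The function name is hypothetical, and accepting both a string and a list for the claim value is a defensive assumption.

```python
def client_supports_cae(payload: dict) -> bool:
    """Return True when the xms_cc claim advertises the cp1 client capability."""
    value = payload.get("xms_cc", [])
    # Assumption: tolerate either a single string or a list of capability strings.
    capabilities = [value] if isinstance(value, str) else list(value)
    return any(str(item).lower() == "cp1" for item in capabilities)


# Made-up payloads for illustration.
cae_capable = {"xms_cc": ["CP1"]}
not_cae_capable = {"aud": "https://graph.microsoft.com"}

print(client_supports_cae(cae_capable))
print(client_supports_cae(not_cae_capable))
```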

PRT stealing

What I call 'PRT stealing' is described in the AzureAD-Attack-Defense playbook as 'ReplayOfPrimaryRefreshToken'. It is an attack containing various scenarios which all come down to one thing: being able to tamper with or replay the Primary Refresh Token to request new access tokens. For these attack scenarios, it is important to keep in mind that the attacker needs to have some kind of execution capability on the endpoint device. I wrote a blog post about one of these attack scenarios in the past as well, which you can find here.

The danger of PRT stealing

The main danger of PRT stealing is that the attacker is able to act as the device. Remember when we talked about access tokens not having a device claim, which is why requiring a device state mitigates an AiTM access token attack? Since the PRT does carry the Device ID, the attacker can now request new access tokens using the PRT and act as the Entra ID registered, hybrid joined, joined, or compliant device! This means that device state Conditional Access policies will not mitigate the risk of PRT stealing!

Another reason why a PRT is so valuable is that it is not bound to a resource or tenant, but rather to a combination of user and client. This means that a PRT can be used to request different types of access tokens for different resources. In practice, there are some things that need to be in order before this can be done, which I explain a bit more here.

Mitigate PRT token stealing

When looking at the AzureAD-Attack-Defense playbook, you will see that the mitigations are mainly device-based, or are about detecting possible token misuse after the attack scenario has completed. This is the key point here. To protect against this scenario, you need to harden your endpoints and make sure that you follow up on the alerts that Defender XDR raises. Once the PRT has been stolen and is being used to request new access tokens, you will have a hard time knowing that it happened. More information on specific mitigations can be found in my previous blog post.

Conclusions

To summarize this blog post, I wanted to note down some conclusions and important statements to remember.

  • Access tokens are used to access an application, while refresh tokens are used to request new access tokens once they expire. An access token has a variable lifetime between 60 and 90 minutes.
  • An AiTM attack captures the access token via the AiTM proxy server. But since the request for the access token does not contain the PRT (and thus the device claim), an AiTM proxy server is not able to request an access token when device state is required in Conditional Access.
  • When an adversary can steal a PRT from a device, the adversary is able to request new access tokens using the PRT and any MFA claims within it.
  • You can use multiple mitigating controls against AiTM access token stealing, but PRT stealing is far harder to protect against. Protecting against PRT stealing is mainly done on the device itself, or by monitoring for suspicious behavior after the attack has succeeded.
  • Once an access token is issued, you have limited control over it. Revocation and sign-in frequency are enforced on the refresh token, which allows attackers to keep using the access token for as long as its lifetime lasts. Continuous Access Evaluation can help solve this, but is limited to CAE-capable clients and applications.