Correlating Defender for Endpoint and Global Secure Access Logs

Introduction

If you are working with Microsoft security solutions, you might have heard of the new kid on the block: Microsoft Global Secure Access (GSA). Being a blue teamer myself, I wondered how we can use this new Security Service Edge solution - and specifically its Internet Access logs - to improve our detections.

During my research I found that these logs are especially interesting when we correlate them with Microsoft's EDR solution, Microsoft Defender for Endpoint (MDE). If you want to learn how to do this, keep reading.

Log export

Before we can correlate the MDE and GSA logs, they need to be stored somewhere we can query them. The correlation can be done in either Microsoft Sentinel or Defender XDR. In both cases, the GSA NetworkAccessTraffic logs have to be ingested into Microsoft Sentinel first. To do this, navigate to entra.microsoft.com and go to Identity > Monitoring & Health > Diagnostic Settings. Here you can choose to send the logs to the Log Analytics workspace used by Microsoft Sentinel.
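Once the diagnostic setting is saved, a quick way to confirm that the logs are arriving is a small query against the NetworkAccessTraffic table. This is only a sketch: the one-hour lookback is an arbitrary assumption, and it can take a while before the first events show up.

```kql
// Check whether GSA traffic logs are arriving in the workspace.
// Assumption: a 1-hour lookback is enough to see recent activity.
NetworkAccessTraffic
| where TimeGenerated > ago(1h)
| summarize Events = count(), LastEvent = max(TimeGenerated)
```

If Events stays at 0, it is worth re-checking the diagnostic setting and the selected workspace.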

Sentinel correlation

If you want to do the correlation in Microsoft Sentinel, the second step is to ingest the DeviceNetworkEvents table into Sentinel. Be aware that this can become very costly, since this table contains a lot of data that will be stored! Go to the Microsoft Defender XDR data connector and enable the DeviceNetworkEvents table.

💡
Be aware that this option is more costly compared to doing the correlation in Defender XDR

Defender XDR correlation

If you want to do the correlation in Defender XDR, you can use the Unified SOC Platform of Defender XDR. To do this, you need to connect a Microsoft Sentinel workspace to Defender XDR via Settings > Microsoft Sentinel > Workspaces.

Correlating GSA and MDE

While investigating how to correlate the logs between MDE and GSA, I found that there is no unified ID we can use to match a transaction log in GSA to the related connection log in MDE. Because of this, I wanted to correlate the logs based on the '5-tuple':

  • SourcePort
  • SourceIP
  • DestinationPort
  • DestinationIP
  • Protocol

Unfortunately, this was not as easy as I thought it would be.

Challenges

In this section I will explain the challenges of trying to correlate MDE and GSA logs based on a 5-tuple, and how I handled them.

Source IP

The first challenge with the 5-tuple correlation is that the SourceIP is not the same for GSA and MDE. When looking at the Source IP in GSA, we get the public IP Address of the device, while in MDE we get the private IP Address of the device.

GSA Source IP
MDE Source IP

To fix this, we will correlate on Device ID instead. You can find more on this in the section below.

Device ID

Even though we need to correlate on Device ID instead of Source IP, the Device ID in the GSA logs is not the same as the one in the MDE logs. GSA uses the Entra ID Device ID, while MDE uses the Defender XDR Device ID.

MDE Device ID
GSA Device ID
MDE and GSA Device ID Difference

Luckily, the mapping between the Entra ID Device ID and the Defender XDR Device ID can be found in the DeviceInfo table!
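As a minimal sketch of that mapping, the query below lists the most recent Entra ID Device ID known for each MDE device. The output column names (MdeDeviceId, EntraDeviceId, LastSeen) are my own labels for illustration and do not exist in the table.

```kql
// Latest known Entra ID <-> MDE Device ID mapping per device.
DeviceInfo
| where isnotempty(AadDeviceId)
// Keep the most recent record for every MDE device
| summarize arg_max(Timestamp, AadDeviceId, DeviceName) by DeviceId
| project MdeDeviceId = DeviceId, EntraDeviceId = AadDeviceId, DeviceName, LastSeen = Timestamp
```

Devices that are onboarded in MDE but not registered in Entra ID have an empty AadDeviceId and are filtered out here, so they cannot be correlated this way.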

Destination IP

The Destination IP Addresses in GSA and MDE are also not the same. This is because, once the GSA client is installed, MDE records the IP Address of the GSA Point-of-Presence as 'RemoteIP' instead of the real IP Address of the destination.

Destination IP in GSA
Destination IP in MDE
Destination IP in GSA Diagnostic Logs

To fix this, we will be correlating on the Destination FQDN instead, since this is the same in both logs and is related to the Destination IP.
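To see this effect in your own MDE data, a rough sketch like the one below counts how many distinct FQDNs sit behind each RemoteIP. The 7-day lookback and the top 10 cut-off are arbitrary assumptions. Addresses that front a large number of unrelated FQDNs are likely GSA Point-of-Presence addresses, although CDN or proxy addresses can show the same pattern.

```kql
// Find remote IPs that front many different FQDNs in MDE logs.
// Assumptions: 7-day lookback, top 10 results, 5 sample FQDNs per IP.
DeviceNetworkEvents
| where Timestamp > ago(7d)
| where isnotempty(RemoteUrl)
| summarize DistinctFqdns = dcount(RemoteUrl), SampleFqdns = make_set(RemoteUrl, 5) by RemoteIP
| top 10 by DistinctFqdns desc
```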

Source Port

The fourth challenge is that the Source Ports in MDE and GSA cannot be reliably mapped to each other. When comparing the Source Ports in MDE and GSA, I could never find the same ports.

Source Ports in MDE
Source Ports in GSA

As an optional alternative, we can correlate on the Source Process Name instead. This is not really the same as correlating on Source Port and is therefore not ideal, but it is better than nothing 😄.

The query

To summarize the query, we basically correlate on the following properties instead of the 5-tuple:

  • Device ID (via table join with DeviceInfo)
  • Destination FQDN
  • Destination Port
  • Protocol
  • Initiating Process

Below you can find the KQL query:

💡
This query will soon be added to the following GitHub repo: https://github.com/HybridBrothers/Hunting-Queries-Detection-Rules

let gsa_events = NetworkAccessTraffic
    // Join DeviceInfo to get MDE DeviceID
    | join kind=inner ( 
        DeviceInfo
        | distinct DeviceId, AadDeviceId
    ) on $left.DeviceId == $right.AadDeviceId
    // Remove Entra Device ID from GSA logs
    | project-away DeviceId
    // Rename MDE Device ID to DeviceId column
    | project-rename DeviceId = DeviceId1;
// Get all MDE network events
DeviceNetworkEvents
// Get HTTP details if HTTP connection is logged
| extend HttpStatus = toint(todynamic(AdditionalFields).status_code),
    BytesIn = toint(todynamic(AdditionalFields).response_body_len),
    BytesOut = toint(todynamic(AdditionalFields).request_body_len),
    HttpMethod = tostring(todynamic(AdditionalFields).method),
    UrlHostname = tostring(todynamic(AdditionalFields).host),
    UrlPath = tostring(todynamic(AdditionalFields).uri),
    UserAgent = tostring(todynamic(AdditionalFields).user_agent),
    HttpVersion = tostring(todynamic(AdditionalFields).version)
// Join GSA logs
| join kind=inner gsa_events on 
    DeviceId,
    $left.RemoteUrl == $right.DestinationFqdn,
    $left.RemotePort == $right.DestinationPort,
    $left.Protocol == $right.TransportProtocol,
    $left.InitiatingProcessFileName == $right.InitiatingProcessName
| project-rename TimeGeneratedGsa = TimeGenerated1, TimestampMde = Timestamp
| project-away Type, TenantId, TimeGenerated, TenantId1, Type1, DeviceId1, AadDeviceId
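
One thing the join above does not do is limit matches in time, so an MDE event can be paired with a GSA event for the same device, FQDN, port, and process that was logged hours earlier or later. A possible refinement is to append a time-window filter to the end of the query. The 60-second window below is an illustrative assumption and not a tested value, so tune it against your own data.

```kql
// Append to the end of the query above.
// Assumption: matching MDE and GSA events are logged within 60 seconds of each other.
| where abs(datetime_diff('second', TimestampMde, TimeGeneratedGsa)) <= 60
```

Filtering both tables on the same time range before the join also helps keep the join small.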

What are the benefits?

You might ask yourself what the benefits of correlating MDE and GSA events are. Let's go over them here.

Detection benefits

If you do detection engineering, you know that the more information is available the better. With MDE, we have detailed information on the process that generated specific network events, while we have more detailed information on HTTP Headers in GSA logs. When combining this together we can:

  • Try to reduce the Benign-Positive rate of our detections.
  • Try to create better detections using the HTTP Headers.

Hunting and investigation benefits

One of the biggest benefits we found in the SOC during threat hunting and investigations is the extra context on how a user ended up on a (possibly) malicious website via the browser. When you only have MDE logs, the only thing you can see is which process connected to the website (for example msedge.exe), but not how the user got there.

If we wanted to know how the user got there, we had to create a browser history dump to find out whether another webpage forwarded the user to the malicious site.

But when combining the GSA logs with MDE, we can now see the source website via the HTTP ReferrerHeader field.

This helps a lot, since we have to perform far fewer manual actions on the device for this scenario during investigations or hunting cases.
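
As a minimal sketch of this scenario, the query below looks up which referrers led to a given domain in the GSA logs. The domain malicious.example is a hypothetical placeholder, and the query assumes the referrer is logged for the connection, which will not always be the case.

```kql
// Hypothetical placeholder: replace with the domain under investigation
let suspicious_fqdn = "malicious.example";
NetworkAccessTraffic
| where DestinationFqdn == suspicious_fqdn
// Keep only events where a referrer was recorded
| where isnotempty(ReferrerHeader)
| project TimeGenerated, DeviceId, InitiatingProcessName, DestinationFqdn, ReferrerHeader
| sort by TimeGenerated desc
```

Note that DeviceId here is the Entra ID Device ID, so the DeviceInfo mapping is still needed to pivot to the device in MDE.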

Conclusion and nuances

The conclusion of this blog post is that the GSA and MDE logs combined can be very powerful in a couple of scenarios. Even though the correlation is not 'easily' done, we can work around the challenges.

If you want to learn more about the logging capabilities and nuances of both solutions, I recommend reading one of my previous blog posts here: https://hybridbrothers.com/analyzing-mde-network-inspections/