Correlating Defender for Endpoint and Global Secure Access Logs

Introduction
If you work with Microsoft security solutions, you have probably heard of the new kid on the block: Microsoft Global Secure Access (GSA). Being a blue teamer myself, I asked myself how we can use this new Security Service Edge solution, and specifically its Internet Access logs, to make our detections better.
During my research I found that these logs are especially interesting when we correlate them with Microsoft's EDR solution, Microsoft Defender for Endpoint (MDE). If you want to learn how to do this, keep reading.
Log export
Before we can correlate the logs between MDE and GSA, we need to store them somewhere. We can do the correlation either in Microsoft Sentinel or in Defender XDR. Even if you want to do the correlation in Defender XDR, you have to ingest the GSA NetworkAccessTraffic logs into Microsoft Sentinel first. To do this, navigate to entra.microsoft.com and go to Identity > Monitoring & Health > Diagnostic Settings. There you can choose to send the logs to your Microsoft Sentinel Log Analytics workspace.

Sentinel correlation
If you want to do the correlation in Microsoft Sentinel, the second step is to ingest the DeviceNetworkEvents table into Sentinel. Be aware that this can become very costly, since this table produces a lot of data that will be stored! To enable it, go to the Microsoft Defender XDR data connector and select the DeviceNetworkEvents table.
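Before enabling the table, it can help to get a rough feel for the volume. The following is a minimal sketch, assuming you run it in Defender XDR advanced hunting, where the time column is Timestamp. It only counts events per day over the last week and does not calculate the actual ingestion cost.

```kql
// Rough volume check: DeviceNetworkEvents per day over the last 7 days.
// Event counts are only a proxy for ingestion size and cost.
DeviceNetworkEvents
| where Timestamp > ago(7d)
| summarize Events = count() by bin(Timestamp, 1d)
| order by Timestamp asc
```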

Defender XDR correlation
If you want to do the correlation in Defender XDR, you can use the Unified SOC Platform of Defender XDR. To do this, connect a Microsoft Sentinel workspace to Defender XDR via Settings > Microsoft Sentinel > Workspaces.

Correlating GSA and MDE
While investigating how to correlate the logs between MDE and GSA, I found that there is no shared ID we can use to match a transaction log in GSA with the related connection log in MDE. Because of this, I wanted to correlate the logs based on the '5-tuple':
- SourcePort
- SourceIP
- DestinationPort
- DestinationIP
- Protocol
Unfortunately, this was not as easy as I thought it would be.
Challenges
In this section I will explain the challenges of trying to correlate MDE and GSA logs based on a 5-tuple, and how I handled them.
Source IP
The first challenge with the 5-tuple correlation is that the Source IP is not the same in GSA and MDE. In GSA, the Source IP is the public IP address of the device, while in MDE it is the private IP address of the device.


To fix this, we will correlate on Device ID instead. You can find more on this in the section below.
Device ID
Even though we need to correlate on Device ID instead of Source IP, the Device ID in the GSA and MDE logs is not the same either. GSA logs use the Entra ID Device ID, while MDE logs use the Defender XDR Device ID.



Luckily, the mapping between the Entra ID Device ID and the Defender XDR Device ID can be found in the DeviceInfo table!
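As a small sketch of that mapping, the following query lists both IDs side by side. It uses the DeviceId and AadDeviceId columns of DeviceInfo, the same columns the correlation query further down relies on. The isnotempty filter is an assumption on my side, meant to skip devices that have no Entra ID Device ID recorded.

```kql
// Map the Defender XDR Device ID (DeviceId) to the Entra ID Device ID (AadDeviceId)
DeviceInfo
| where isnotempty(AadDeviceId)
| distinct DeviceId, AadDeviceId
```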
Destination IP
The Destination IP addresses in GSA and MDE are also not the same. This is because, once GSA is installed, MDE records the IP address of the GSA Point-of-Presence as 'RemoteIP' instead of the real IP address of the destination.



To fix this, we will be correlating on the Destination FQDN instead, since this is the same in both logs and is related to the Destination IP.
Source Port
The fourth challenge is that the Source Port is not reliable to work with. When comparing the Source Ports in MDE and GSA, I could never find matching ports.


Optionally, we can correlate on the Source Process Name instead. This is not really the same as correlating on Source Port and is therefore not ideal, but it is better than nothing 😄.
The query
To summarize the query, we basically correlate on the following properties instead of the 5-tuple:
- Device ID (via table join with DeviceInfo)
- Destination FQDN
- Destination Port
- Protocol
- Initiating Process
Below you can find the KQL query:
let gsa_events = NetworkAccessTraffic
    // Join DeviceInfo to get MDE DeviceID
    | join kind=inner (
        DeviceInfo
        | distinct DeviceId, AadDeviceId
    ) on $left.DeviceId == $right.AadDeviceId
    // Remove Entra Device ID from GSA logs
    | project-away DeviceId
    // Rename MDE Device ID to DeviceId column
    | project-rename DeviceId = DeviceId1;
// Get all MDE network events
DeviceNetworkEvents
// Get HTTP details if HTTP connection is logged
| extend HttpStatus = toint(todynamic(AdditionalFields).status_code),
    BytesIn = toint(todynamic(AdditionalFields).response_body_len),
    BytesOut = toint(todynamic(AdditionalFields).request_body_len),
    HttpMethod = tostring(todynamic(AdditionalFields).method),
    UrlHostname = tostring(todynamic(AdditionalFields).host),
    UrlPath = tostring(todynamic(AdditionalFields).uri),
    UserAgent = tostring(todynamic(AdditionalFields).user_agent),
    HttpVersion = tostring(todynamic(AdditionalFields).version)
// Join GSA logs
| join kind=inner gsa_events on
    DeviceId,
    $left.RemoteUrl == $right.DestinationFqdn,
    $left.RemotePort == $right.DestinationPort,
    $left.Protocol == $right.TransportProtocol,
    $left.InitiatingProcessFileName == $right.InitiatingProcessName
| project-rename TimeGeneratedGsa = TimeGenerated1, TimestampMde = Timestamp
| project-away Type, TenantId, TimeGenerated, TenantId1, Type1, DeviceId1, AadDeviceId
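One thing the query above does not do is restrict matches in time, so an MDE event can pair with every GSA event that shares the same device, FQDN, port, protocol and process. A possible refinement, which is not part of the original query, is to keep only pairs whose timestamps are close together. The sketch below illustrates this on two hypothetical sample rows. The 60-second tolerance is an assumption, and the right value depends on how far apart both products log the same connection.

```kql
// Hypothetical sample rows standing in for the output of the correlation query
let correlated = datatable(DeviceId:string, RemoteUrl:string, TimestampMde:datetime, TimeGeneratedGsa:datetime)
[
    "device-a", "example.com", datetime(2024-01-01 10:00:05), datetime(2024-01-01 10:00:07),
    "device-a", "example.com", datetime(2024-01-01 10:00:05), datetime(2024-01-01 14:30:00)
];
// Assumed tolerance, tune this for your environment
let max_gap_seconds = 60;
correlated
// Absolute time difference between the MDE and GSA record
| extend GapSeconds = abs(datetime_diff("second", TimestampMde, TimeGeneratedGsa))
| where GapSeconds <= max_gap_seconds
```

In this sample only the first row survives the filter, since the second pair is hours apart. To apply the same idea to the real query, the extend and where lines would go after the project-rename step that creates TimestampMde and TimeGeneratedGsa.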
What are the benefits?
You might ask yourself what the benefits are of correlating MDE and GSA events. Let's go over them here.
Detection benefits
If you do detection engineering, you know that the more information is available, the better. MDE gives us detailed information on the process that generated a specific network event, while GSA logs give us more detailed information on HTTP headers. Combining the two, we can:
- Try to reduce the Benign-Positive rate of our detections.
- Try to create better detections using the HTTP Headers.

Hunting and investigation benefits
One of the biggest benefits we found at the SOC during threat hunting or investigations is getting context on how a user reached a (possibly) malicious website via the browser. When you only have MDE logs, the only thing you can see is which process connected to the website (for example msedge.exe), but not how the user got there.

If we wanted to know how the user got there, we had to create a browser history dump and check whether another webpage forwarded the user to the malicious site.

But when combining the GSA logs with MDE, we can now see the source website via the HTTP ReferrerHeader field.

This helps a lot, since we need far fewer manual actions on the device for this scenario during investigations or hunting cases.
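As a minimal sketch of that hunting scenario, the query below looks up how users arrived at a given domain using the GSA logs alone. The domain example.com and the 7-day lookback are placeholders. DestinationFqdn, InitiatingProcessName and ReferrerHeader appear earlier in this post, while the UserPrincipalName column is assumed to be present in the NetworkAccessTraffic table in your workspace.

```kql
// Placeholder, replace with the domain under investigation
let suspicious_domain = "example.com";
NetworkAccessTraffic
| where TimeGenerated > ago(7d)
| where DestinationFqdn =~ suspicious_domain
// Only keep requests that carry a referrer
| where isnotempty(ReferrerHeader)
| project TimeGenerated, UserPrincipalName, InitiatingProcessName, DestinationFqdn, ReferrerHeader
| order by TimeGenerated asc
```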
Conclusion and nuances
The conclusion of this blog post is that the logs of GSA and MDE together can be very powerful in a couple of scenarios. Even though the correlation is not 'easily' done, we can work around the challenges.
If you want to learn more about the logging capabilities and nuances of both solutions, I recommend reading one of my previous blog posts: https://hybridbrothers.com/analyzing-mde-network-inspections/