
Token Theft After DBSC: What Identity Detection Actually Looks Like in 2026

Refresh token theft has been the dominant identity attack pattern for three years running, and the industry’s answer — Device Bound Session Credentials, or DBSC — is finally shipping in stable Chrome and Edge channels through 2025 and into 2026. Microsoft’s token protection (the conditional access feature, not the marketing umbrella) builds on the same TPM-backed primitives. Google’s been quieter but has the same machinery in production for Workspace sessions. So far so good.

Except the posture most shops will end up with through 2026 is partial coverage at best, and the detection burden does not go away — it shifts. If you read the spec and assume token theft is solved, you will miss the cases that matter, because the cases that matter are exactly the ones DBSC doesn’t cover yet: non-browser clients, legacy OAuth flows, the long tail of SaaS apps that haven’t adopted the binding, and the gap between the browser binding the session and the IdP enforcing the binding.

This is a defender’s problem, not an architecture diagram problem. So let’s talk about what the detection actually looks like once you turn it on, what the first-round tuning has to fix, and where your environment changes the answer.

The mechanism, briefly

The attack pattern hasn’t really changed since 2022. An attacker phishes a session, an infostealer scrapes cookies and refresh tokens from a workstation, or a malicious browser extension exfiltrates the same. The stolen artifact is replayed from attacker infrastructure. The IdP, lacking proof that the original device is presenting the token, issues a fresh access token. Game over for that identity until somebody notices.

DBSC closes the cookie-replay path for browser sessions by binding the session cookie to a key that lives in the TPM (or the platform secure enclave). When the browser refreshes the session, it proves possession of that key. An attacker copying the cookie jar gets nothing useful, because the cookie alone won’t refresh.

That’s the win. It’s a real win. But notice what it doesn’t cover: the refresh token issued to a native desktop or mobile client, the OAuth flow used by a CLI tool, the legacy SAML session that some line-of-business app still wants, and — this is the one teams underestimate — the period between when the user authenticates and when the binding gets established, which on a fresh sign-in is a window long enough for an AitM proxy to be useful.
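
For concreteness, the handshake sketched in the WICG draft looks roughly like the exchange below. Header names and parameters come from the draft spec and may change before it finalizes; the paths and values are placeholders, not a real implementation.

HTTP/1.1 200 OK
Set-Cookie: session=abc123; Max-Age=600; Secure; HttpOnly
Sec-Session-Registration: (ES256);challenge="nonce-1";path="/dbsc/register"

  ...browser mints a keypair in the TPM and registers it by POSTing a signed JWT...

POST /dbsc/refresh HTTP/1.1
Sec-Session-Response: <JWT over the server's challenge, signed with the TPM-held key>

HTTP/1.1 200 OK
Set-Cookie: session=def456; Max-Age=600; Secure; HttpOnly

The short Max-Age is the point: the cookie is disposable, and only a device holding the private key can get it re-issued.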

Microsoft’s token protection in Entra ID is similar in spirit but only enforces the binding when the resource and client both support it. The conditional access policy has a “requires compatible client” mode, and the list of compatible clients is shorter than the docs make it look. Outlook desktop, current builds, yes. Teams, mostly. The forty SaaS apps your business federates to via OIDC? Depends entirely on whether the relying party honors the tbid claim, and most don’t. (The 2024 docs were optimistic about this. The 2025 reality was less so. I’d check the current compatibility matrix before committing to a policy.)

What detection actually looks like

Assume Entra ID as the IdP and Sentinel or Splunk as the SIEM, because that’s the common case. The signals you care about live in SigninLogs and AADNonInteractiveUserSignInLogs. The non-interactive table is the one that matters for token replay — interactive sign-ins are where MFA happens, non-interactive is where refresh tokens get cashed in.

The field set worth knowing:

  • TokenProtectionStatusDetails.SignInSessionStatus — tells you whether the session was bound. Values are roughly bound, unbound, notApplicable. The notApplicable bucket is bigger than you’d hope.
  • AuthenticationProtocol — oAuth2, wsFederation, saml20, deviceCode. Device code flow is the one to watch; it’s the modern phishing channel.
  • OriginalRequestId — correlates a refresh chain back to the original interactive sign-in. Useful when you want to ask “where did this session start?”
  • IPAddress plus AutonomousSystemNumber — ASN shifts mid-session are the cleanest replay signal you have, assuming the user isn’t on a flaky mobile carrier.
  • DeviceDetail.TrustType — AzureAD, Hybrid, Workplace, or empty. Empty on a non-interactive refresh for a corp user is a flag.
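
With those fields, sizing the notApplicable problem is a single aggregation — assuming your tenant populates TokenProtectionStatusDetails, and treating the exact JSON property casing as something to verify against your own logs before you trust the numbers:

AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(7d)
| where ResultType == 0
| extend BindStatus = tostring(parse_json(TokenProtectionStatusDetails).SignInSessionStatus)
| summarize Refreshes = count(), Users = dcount(UserPrincipalName) by BindStatus, AppDisplayName
| order by Refreshes desc

The apps at the top of the unbound and notApplicable buckets are your adoption backlog.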

The detection that earns its keep is the one looking for a non-interactive sign-in whose ASN does not match the ASN of the originating interactive sign-in, within the lifetime of the refresh token. Concretely, in KQL against Sentinel:

AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| join kind=inner (
    // The interactive sign-in that started the refresh chain; both tables
    // carry OriginalRequestId, so it serves directly as the join key
    SigninLogs
    | where TimeGenerated > ago(7d)
    | where ResultType == 0
    | project OriginalRequestId, OrigIP = IPAddress, OrigASN = AutonomousSystemNumber
) on OriginalRequestId
| where IPAddress != OrigIP
| where AutonomousSystemNumber != OrigASN

That’s the shape, not the finished rule. The finished rule needs an exclusion for known mobile carrier ASNs (users on cellular roam constantly), an exclusion for your VPN egress, and probably a confidence weighting based on geographic distance between the two IPs. The raw rule will fire on every road warrior with an iPhone.
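
One way to bolt those exclusions onto the front of the rule above — the ASNs and IPs here are placeholders, not a vetted list — is a pair of let bindings:

let CarrierASNs = dynamic([20057, 21928, 22394]);              // example mobile-carrier ASNs; maintain your own list
let VpnEgressIPs = dynamic(["203.0.113.10", "203.0.113.11"]);  // your VPN concentrator egress, tagged once
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| where AutonomousSystemNumber !in (CarrierASNs)
| where IPAddress !in (VpnEgressIPs)
// ...then the join back to the interactive sign-in, as in the base rule

In production you’d move both lists into watchlists, so tuning doesn’t require editing the analytic itself.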

Expected volume in a 5,000-seat shop, ballpark: low thousands of hits per day untuned, dropping to tens once you’ve carved out carrier ASNs, dropping again to single digits per week once you add a geo-velocity filter and exclude the known SaaS-to-SaaS hop patterns (Salesforce calling back through its own ASN to refresh a delegated token, that kind of thing). If your tuning isn’t taking you down by two orders of magnitude in the first two weeks, the rule is wrong, not the environment.

Where the false positives come from

The noise sources, ranked roughly by how much pain they cause:

Mobile carriers shift ASNs between cell towers. A user driving down I-95 will hit three or four ASNs in an hour. You cannot detect on raw ASN change; you have to detect on ASN change into a category you don’t expect — datacenter ASNs, residential proxy networks, hosting providers.
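
That category test is just membership in a list you curate. The ASNs below (Amazon, Microsoft, DigitalOcean, Vultr) are a seed for illustration, not a complete inventory:

let DatacenterASNs = dynamic([16509, 8075, 14061, 20473]);  // hosting/cloud ASNs — extend from your threat-intel feed
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| where AutonomousSystemNumber in (DatacenterASNs)
| project TimeGenerated, UserPrincipalName, IPAddress, AutonomousSystemNumber, AppDisplayName

A refresh landing in this bucket is worth an alert on its own, before any join.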

Corporate VPN concentrators with split tunneling generate sessions that originate on the user’s residential IP and refresh through the corporate egress, or vice versa, depending on which app is in scope. The fix is environment-specific: either tag your VPN egress IPs and exclude them, or use the NetworkLocationDetails.networkNames field if your tenant is populated with named locations. (Named locations are underused. Populate them.)

SaaS-to-SaaS delegation. When app A calls app B on behalf of the user via OAuth 2.0 on-behalf-of, the refresh shows up in the user’s sign-in logs originating from app A’s infrastructure. This looks exactly like token theft until you learn the pattern, at which point you maintain an allowlist of ResourceDisplayName plus origin ASN pairs. The allowlist will rot — apps move infrastructure — and somebody on the SOC will have to own it.
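
A sketch of that allowlist as a leftanti join — the app names and ASNs are made-up placeholders, and in practice the list belongs in a watchlist the SOC owner can edit without redeploying the rule:

let SaaSDelegationAllowlist = datatable(ResourceDisplayName: string, OriginASN: int) [
    "Example CRM", 14618,     // placeholder pairs; replace with observed app/ASN combinations
    "Example ITSM", 16509
];
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0
| join kind=leftanti SaaSDelegationAllowlist
    on ResourceDisplayName, $left.AutonomousSystemNumber == $right.OriginASN

The leftanti join keeps only the refreshes that don’t match a known delegation pattern, which is exactly the residue you want a human to look at.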

Guest users from federated tenants. Their refresh sessions originate from wherever their home tenant says they do, which is often nowhere useful. Exclude UserType == Guest from the high-confidence rule and run a separate lower-priority rule against guests, because guest token theft is a real problem and you don’t want to suppress it entirely.
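
The split is a one-line predicate; the point is two rules at two severities, not one rule with guests suppressed. A sketch of the guest companion, with an arbitrary starting threshold you’d tune:

// The member rule adds: | where UserType != "Guest"
// The guest companion runs at lower severity, summarized to keep volume reviewable:
AADNonInteractiveUserSignInLogs
| where TimeGenerated > ago(24h)
| where ResultType == 0 and UserType == "Guest"
| summarize Refreshes = count(), DistinctASNs = dcount(AutonomousSystemNumber) by UserPrincipalName
| where DistinctASNs > 3    // starting threshold — tune against your guest population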

Environment assumptions that change the answer

Flat versus tiered identity matters more than people admit. In a flat Entra tenant where everyone’s a member user and admin roles are assigned directly, your detection has to treat every account as equally important and you’ll drown in volume. In a tiered model with PIM-elevated admin accounts in a separate scope, the high-fidelity detection is the one on the admin scope — token replay against a Global Admin is a five-alarm event, token replay against an intern is a ticket.

FedRAMP High changes the math because the residual risk of an unbound session is much higher and the compensating controls (conditional access requiring compliant device, sign-in frequency reduced to four hours, no persistent browser sessions on unmanaged endpoints) are non-negotiable. In that environment, you can be more aggressive: any non-interactive sign-in from an ASN outside your approved list is a block, not an alert.

Hybrid AD with on-prem federation through ADFS or Entra Connect’s pass-through auth introduces a second token issuance path that does not participate in DBSC at all. If you still have ADFS in 2026, you are accepting that the token protection story has a hole the size of your federation farm, and the detection has to compensate. That means full AU-2 audit on ADFS itself, claim transformation rules under CM-3 change control, and a separate analytic for ADFS-issued tokens being replayed against cloud resources.

The BYOD case is the one nobody wants to talk about. DBSC requires a key in a hardware-backed keystore. On a personal Android device with an unlocked bootloader, or a Mac without FileVault and Secure Enclave properly configured, the binding is software-only and an attacker with local code execution defeats it. If your conditional access policy says “require token protection” and your device compliance policy is permissive about BYOD, you are bound to a key that doesn’t actually protect anything. The fix is to require managed devices for sensitive scopes (AC-3, AC-6) and accept that BYOD users get a downgraded session lifetime.

Control mapping

The pieces of 800-53 this work touches:

  • IA-2(1), IA-2(2) — MFA for privileged and non-privileged; binding is the layer below
  • IA-5(1) — authenticator management; session keys are authenticators
  • AC-12 — session termination on anomaly
  • AU-2, AU-6 — logging the sign-in and refresh events, correlating them
  • SC-8, SC-23 — transmission integrity, session authenticity; DBSC sits here
  • SI-4(4) — inbound/outbound anomaly detection on the auth path
  • CM-7 — restricting which auth protocols are even allowed; turn off device code where you can

The one I’d push hardest on is CM-7. Most of the device-code phishing campaigns that have flourished since 2023 work because device code flow is on by default and nobody scoped it down. There is a conditional access template for blocking it; use it, with exceptions for the two or three legitimate use cases (PowerShell on jump hosts, mostly) scoped to specific service principals.
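
Before scoping the block, inventory actual usage — the AuthenticationProtocol field from the earlier field list makes this cheap:

SigninLogs
| where TimeGenerated > ago(30d)
| where AuthenticationProtocol == "deviceCode"
| summarize SignIns = count() by UserPrincipalName, AppDisplayName, ResourceDisplayName
| order by SignIns desc

Thirty days of results is your exception list; everything not on it gets blocked.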

What most teams get wrong the first time

They turn on token protection in audit mode, see the notApplicable bucket dwarfing everything else, conclude the feature is broken, and turn it off. It’s not broken — it’s telling you which of your apps and clients haven’t adopted the binding, which is the inventory you needed anyway. Leave it in audit, treat the notApplicable count as a backlog metric, and work the list down by app owner.

The second mistake is writing the detection against SigninLogs only and missing the non-interactive table entirely. Refresh-token replay does not show up in SigninLogs. It shows up in AADNonInteractiveUserSignInLogs, which has a separate retention setting and a separate ingestion cost line item, and which I have seen disabled in cost-conscious tenants. If it’s disabled, you are blind to the attack you most want to see.
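
A sanity check worth running before trusting any detection built on that table — isfuzzy keeps the query alive even if the non-interactive table is missing entirely:

union isfuzzy=true
    (SigninLogs | summarize Latest = max(TimeGenerated) | extend TableName = "SigninLogs"),
    (AADNonInteractiveUserSignInLogs | summarize Latest = max(TimeGenerated) | extend TableName = "AADNonInteractiveUserSignInLogs")
| project TableName, Latest, AgeHours = datetime_diff('hour', now(), Latest)

If the second row is missing or stale, fix ingestion before writing another rule.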

Third mistake: trusting RiskLevelDuringSignIn as the only signal. Entra’s risk engine is decent at the obvious cases and weak at the slow ones. A patient attacker who replays a token from a residential proxy in the same metro as the victim will not trip the built-in risk detection most of the time. You need the ASN-shift correlation as a layered control, not as a replacement for risk-based CA.

DBSC and token protection are real improvements. They are not a finished story, and the operational work — tuning the detections, owning the allowlists, killing off device code where you don’t need it, getting BYOD off the sensitive scopes — is what actually moves your posture. The protocol does the easy part. The hard part is still yours.