Stolen Session Tokens Walk Past Your MFA: Detecting and Killing AiTM Replay in Entra

By AutoCypher · 04 Jun 2026

The strong-MFA story has a hole in it, and the hole is the token. You can deploy number matching, you can kill SMS, you can push everyone to the Authenticator app, and an adversary-in-the-middle (AiTM) proxy will still walk a stolen session cookie past all of it. The MFA challenge completes legitimately, with the real user tapping approve, and the attacker harvests the resulting session artifact and replays it. From their infrastructure. No second prompt. The user did everything right and the account is compromised anyway.

This is not new in 2026, but the kits are commodity now. Tycoon 2FA, Mamba 2FA, the Rockstar/NakedPanda family — these are rented, not built, and they reverse-proxy the real Microsoft login page so the victim sees a genuine login.microsoftonline.com flow with a valid cert. The proxy sits in the middle, relays the MFA challenge, and captures the session cookie that Entra issues on success. What you are defending against is not a password problem and not really an MFA problem. It is a session integrity problem, and the controls that fix it are different from the ones most shops reached for first.

What the token actually is, and why replay works

When a user completes interactive sign-in to Entra, the issued artifact depends on the client. Browser sessions get an authentication cookie (the ESTS cookie). Native clients on Entra-joined Windows get a Primary Refresh Token, which is a different and harder target because the PRT is cryptographically bound to the device’s TPM. The AiTM kits go after the browser session, because the browser session is, by default, a bearer token. Whoever holds it, uses it. There is no proof-of-possession check tying that cookie to the machine it was issued to.

That is the entire problem in one sentence: a bearer token does not care who is presenting it.

So the attacker imports the cookie into their own browser context, hits the same resource, and Entra sees a valid, unexpired session. Conditional Access already evaluated at issuance and passed. Unless something re-evaluates the session — and unless that re-evaluation can see that the presenter changed — the replay sails through. This is why “we require MFA on everything” is a necessary control and an insufficient one. MFA gates issuance. It does nothing about what happens to the token afterward.

What the replay looks like in the logs

If you are pulling Entra SigninLogs into Sentinel, or shipping them to Splunk via Event Hub and the Azure add-on, the signal is a correlation, not a single event. (If you’re on the add-on, check that properties.authenticationDetails is actually surviving the JSON flattening, because the nested array gets mangled more often than people notice.)
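One way to check the flattening concern is to sample forwarded events and confirm the nested array is still a list of objects. The sketch below assumes each event arrives as a JSON string with a top-level properties object; the function name and both sample payloads are illustrative, not the add-on's actual output.

```python
import json

def auth_details_intact(raw_event: str) -> bool:
    """Return True if properties.authenticationDetails is a non-empty list of objects."""
    event = json.loads(raw_event)
    if not isinstance(event, dict):
        return False
    props = event.get("properties")
    if not isinstance(props, dict):
        return False
    details = props.get("authenticationDetails")
    # Assumed failure shapes: the array collapsed to a string, or dropped entirely.
    return (isinstance(details, list)
            and len(details) > 0
            and all(isinstance(d, dict) for d in details))

# Illustrative payloads only.
good = json.dumps({"properties": {"authenticationDetails": [
    {"authenticationMethod": "Password", "succeeded": True}]}})
mangled = json.dumps({"properties": {
    "authenticationDetails": "[{authenticationMethod=Password}]"}})
```

Running this over a sample of forwarded events and counting the failures gives a quick read on whether the downstream detection has the fields it needs.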

You are looking for two sign-ins that share a session but disagree on context. The legitimate interactive sign-in: MFA satisfied, a known device, a residential or corporate IP, a normal user agent. Then, minutes to hours later, a non-interactive or token-refresh event reusing the same session, from a different ASN, often a hosting provider’s range, frequently with a user agent that does not match anything the user owns.

The fields that carry the weight:

AutonomousSystemNumber: the cheap tell. A jump from the user’s carrier or corporate ASN to a VPS provider (DigitalOcean, OVH) or a residential-proxy ASN on a shared session is the thing worth alerting on.
UserAgent: mismatch between the interactive sign-in and the subsequent token use. Attackers often don’t bother spoofing it well.
SessionId / CorrelationId: what ties the two events together. Without correlating on the session, you’re just doing impossible-travel, which is noisier and slower.
RiskEventTypes: Entra’s own anomalousToken and tokenIssuerAnomaly detections land here when Identity Protection catches it. Treat these as a feed into your own logic, not as the whole answer, because they fire late and they miss the quiet ones.

The naive version of this detection is impossible-travel on its own, and it will bury you. A salesperson connects through the corporate VPN in Virginia, then their phone falls back to LTE in a different metro, and you’ve got a geo-velocity alert that means nothing. In a 10,000-seat shop, expect raw impossible-travel to throw something in the high hundreds per day before tuning. The session-correlated version — same session, ASN change into hosting space, UA mismatch — gets you down to something a SOC analyst can actually work, single to low double digits daily, and most of what remains is real or at least worth a phone call.
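The session-correlated version can be sketched in a few lines. This is a minimal illustration, assuming sign-in events have already been normalized into dicts with session_id, time (epoch seconds), interactive, asn, and user_agent keys; those key names and the HOSTING_ASNS set are hypothetical, and the ASN values come from the range reserved for documentation.

```python
from collections import defaultdict

# Hypothetical hosting/VPS ASN set; in practice this comes from an enrichment feed.
HOSTING_ASNS = {64500, 64501}

def find_replay_candidates(events):
    """Pair each session's first interactive sign-in with later events whose context disagrees."""
    by_session = defaultdict(list)
    for e in events:
        by_session[e["session_id"]].append(e)

    hits = []
    for session_id, evs in by_session.items():
        evs.sort(key=lambda e: e["time"])
        anchor = next((e for e in evs if e["interactive"]), None)
        if anchor is None:
            continue  # no interactive sign-in to compare against
        for e in evs:
            if e["time"] <= anchor["time"]:
                continue
            asn_jump = e["asn"] != anchor["asn"] and e["asn"] in HOSTING_ASNS
            ua_mismatch = e["user_agent"] != anchor["user_agent"]
            if asn_jump and ua_mismatch:
                hits.append((session_id, anchor["asn"], e["asn"]))
    return hits

events = [
    {"session_id": "s1", "time": 0, "interactive": True, "asn": 64496, "user_agent": "Edge/Windows"},
    {"session_id": "s1", "time": 900, "interactive": False, "asn": 64500, "user_agent": "python-requests"},
    {"session_id": "s2", "time": 0, "interactive": True, "asn": 64496, "user_agent": "Edge/Windows"},
    {"session_id": "s2", "time": 600, "interactive": False, "asn": 64497, "user_agent": "Edge/Windows"},
]
print(find_replay_candidates(events))  # [('s1', 64496, 64500)]
```

Session s2 changes ASN but stays out of hosting space and keeps its user agent, so it is not flagged. That is the VPN-to-LTE case the naive detection would have alerted on.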

The first round of tuning, and where the false positives live

The noise in this detection comes from three predictable places, and you will not get ahead of it in the lab.

Corporate egress and VPN. Split-tunnel VPN clients, cloud secure-web-gateways (Zscaler, Netskope), and Microsoft’s own service-to-service token refreshes all generate ASN changes on a live session that are completely benign. The fix is a named-locations allowlist plus a carve-out for Microsoft’s published service tags. This is tedious and it goes stale — someone adds a new SASE PoP and your false-positive rate ticks back up until the allowlist catches up. Budget for that maintenance; it is not set-and-forget.
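A minimal sketch of the allowlist check, using only the standard library. The ranges are placeholders from the reserved documentation blocks, standing in for your named locations and whatever service ranges you load from published sources; the function name is hypothetical.

```python
import ipaddress

# Placeholder CIDRs (documentation ranges), not real corporate or Microsoft ranges.
TRUSTED_RANGES = [ipaddress.ip_network(c) for c in ("192.0.2.0/24", "198.51.100.0/25")]

def is_trusted_egress(ip: str) -> bool:
    """True if the source IP falls inside any allowlisted egress range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_RANGES)
```

Keeping the ranges in one loadable list, rather than scattered across query text, makes the stale-allowlist problem easier to spot: when a new PoP appears, there is a single place to update and a single place to review.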

Mobile and roaming. Carrier-grade NAT, mobile IP rotation, and Wi-Fi-to-cellular handoffs will change a user’s apparent IP, and often the ASN, mid-session. Outlook mobile is the usual culprit. You can suppress on known mobile carrier ASNs, but be careful: residential-proxy services that attackers buy sometimes egress through the same carrier ranges, so a blanket carrier allowlist hands the adversary a blind spot. I would scope the suppression to first-party mobile app client IDs rather than to the ASN alone.
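The scoped suppression might look like the following. Both sets are hypothetical placeholders (the app ID is deliberately fake), and the event keys asn and app_id are assumed names for whatever your pipeline emits.

```python
# Hypothetical placeholders; populate from your own tenant and enrichment data.
MOBILE_CARRIER_ASNS = {64510, 64511}
FIRST_PARTY_MOBILE_APP_IDS = {"00000000-0000-0000-0000-00000000a001"}

def suppress_mobile_roaming(event: dict) -> bool:
    """Suppress only when a carrier ASN AND a known first-party mobile app are both present."""
    return (event["asn"] in MOBILE_CARRIER_ASNS
            and event["app_id"] in FIRST_PARTY_MOBILE_APP_IDS)
```

Requiring both conditions means traffic from a residential proxy that happens to egress through a carrier range still reaches the detection unless it also presents as one of the listed apps.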

Legitimate multi-device. A user signed in on a laptop and a desktop and a phone produces concurrent sessions that look, structurally, a little like replay. The discriminator is the UA-plus-ASN-plus-device-ID combination, not any one field. If you alert on session reuse without requiring at least two of those to disagree, the detection is dead on arrival from FP volume.
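The two-of-three requirement reduces to a small counting function. This is a sketch under assumed field names (ua_family, asn, device_id); it treats a missing device ID as a disagreement, which is a judgment call you may want to revisit for your own data.

```python
def disagreement_count(a: dict, b: dict) -> int:
    """Count how many of the three discriminating fields differ between two events."""
    fields = ("ua_family", "asn", "device_id")
    return sum(1 for f in fields if a.get(f) != b.get(f))

def looks_like_replay(a: dict, b: dict, minimum: int = 2) -> bool:
    """Require at least `minimum` fields to disagree before treating reuse as suspicious."""
    return disagreement_count(a, b) >= minimum

laptop = {"ua_family": "edge", "asn": 64496, "device_id": "dev-1"}
roamed = {"ua_family": "edge", "asn": 64510, "device_id": "dev-1"}   # only the ASN moved
replayed = {"ua_family": "other", "asn": 64500, "device_id": None}   # everything moved
```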

One value to anchor on: I’d start the alert requiring an ASN change into a hosting/VPS or known-proxy category combined with a UA family mismatch on the same SessionId inside a rolling window — call it 60 minutes — and tune the window from there. Sixty is a starting guess, not gospel; if your token lifetimes are long you’ll want it wider, and a wider window means more correlation work for whatever’s running the query.
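As a rough sketch of that starting rule for a single pair of events: the user-agent bucketing here is deliberately crude, the ASN category set is a hypothetical stand-in for a real enrichment lookup, and the event keys are assumed names. A real deployment would use a proper UA parser.

```python
from datetime import datetime, timedelta

HOSTING_OR_PROXY_ASNS = {64500, 64501}   # hypothetical category lookup
WINDOW = timedelta(minutes=60)           # starting guess from the text, meant to be tuned

def ua_family(user_agent: str) -> str:
    """Crude family bucket. Order matters: Edge and Chrome UAs also contain 'Safari/'."""
    ua = user_agent.lower()
    for token, family in (("edg/", "edge"), ("chrome/", "chrome"),
                          ("firefox/", "firefox"), ("safari/", "safari")):
        if token in ua:
            return family
    return "other"

def should_alert(anchor: dict, later: dict, window: timedelta = WINDOW) -> bool:
    """Same session, inside the window, ASN moved into hosting/proxy space, UA family changed."""
    same_session = anchor["session_id"] == later["session_id"]
    in_window = timedelta(0) < later["time"] - anchor["time"] <= window
    asn_into_hosting = (later["asn"] != anchor["asn"]
                        and later["asn"] in HOSTING_OR_PROXY_ASNS)
    ua_differs = ua_family(anchor["user_agent"]) != ua_family(later["user_agent"])
    return same_session and in_window and asn_into_hosting and ua_differs

EDGE_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
           "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0")
t0 = datetime(2026, 6, 1, 9, 0)
anchor = {"session_id": "s1", "time": t0, "asn": 64496, "user_agent": EDGE_UA}
replay = {"session_id": "s1", "time": t0 + timedelta(minutes=45),
          "asn": 64500, "user_agent": "python-requests/2.31"}
```

Passing the window as a parameter keeps the tuning decision out of the rule itself, so widening it for long token lifetimes is a one-argument change.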

Token Protection and CAE — what they fix and what they don’t

The real remediation is to stop issuing bearer tokens that anyone can replay. Microsoft’s answer is token protection (sometimes still called token binding in older docs), which cryptographically binds the session token to the device’s secure key material so a stolen copy is useless off-box. Good. Necessary. Deploy it.
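To make the bearer-versus-bound distinction concrete, here is a toy model. It is not Entra's protocol: it uses a shared-secret HMAC as a stand-in for proof of possession, whereas a real implementation would rely on device-held asymmetric keys. All names are hypothetical.

```python
import hashlib
import hmac
import secrets

def validate_bearer(token: str, issued: set) -> bool:
    # A bearer check only asks whether the token was issued, not who presents it.
    return token in issued

def sign_proof(device_key: bytes, token: str, nonce: bytes) -> bytes:
    # The presenter proves possession of the key by MACing the token plus a server nonce.
    return hmac.new(device_key, token.encode() + nonce, hashlib.sha256).digest()

def validate_bound(token: str, nonce: bytes, proof: bytes, bound_keys: dict) -> bool:
    key = bound_keys.get(token)
    if key is None:
        return False
    expected = hmac.new(key, token.encode() + nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, proof)

device_key = secrets.token_bytes(32)     # never leaves the device in this toy model
token = secrets.token_urlsafe(16)
issued = {token}
bound_keys = {token: device_key}         # server-side record of the binding
nonce = secrets.token_bytes(16)
attacker_key = secrets.token_bytes(32)   # attacker stole the token, not the key
```

With the bearer check, the stolen token validates no matter who sends it. With the bound check, a proof made with attacker_key fails even though the token itself is genuine.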

But read the coverage fine print before you tell leadership the problem is solved. As of the current rollout, token protection for sign-in session tokens covers a narrow set: Entra-joined or hybrid-joined Windows 10/11, with specific desktop apps — Teams, OneDrive, the Office desktop clients — under supported versions. Browser-based sessions are the gap, and browser-based sessions are exactly what the AiTM kits steal. So you can have token protection enabled and still be exposed on the precise vector you bought it for, if your users live in the browser. macOS and mobile coverage trail further behind. Check the current support matrix against your actual device fleet before you scope the Conditional Access policy, because a token-protection CA policy in block mode will lock out everything it can’t bind, and that’s a help-desk event.

Continuous Access Evaluation is the other half. CAE lets resource providers (Exchange Online, SharePoint, Teams, Graph) re-evaluate access mid-session and revoke near-instantly on critical events: disabled account, password reset, an admin-triggered revocation, a detected risk change. Without CAE you are living with access-token lifetimes that can run an hour, which is an hour of free rein after you’ve already noticed the compromise. With CAE, revocation propagates in roughly minutes.
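The difference is easy to put numbers on. The sketch below computes how long a stolen access token remains usable after you detect the compromise; the 60-minute lifetime follows the figure above, while the 5-minute CAE delay is an assumed illustration of "roughly minutes", not a documented value.

```python
from datetime import datetime, timedelta

def exposure_after_detection(issued_at, detected_at, token_lifetime, cae_delay=None):
    """Time a stolen access token stays usable after the compromise is noticed."""
    remaining = max(issued_at + token_lifetime - detected_at, timedelta(0))
    if cae_delay is None:
        return remaining              # no CAE: wait for the token to expire
    return min(remaining, cae_delay)  # CAE: whichever ends access first

issued = datetime(2026, 6, 1, 9, 0)
detected = issued + timedelta(minutes=10)
lifetime = timedelta(minutes=60)

without_cae = exposure_after_detection(issued, detected, lifetime)
with_cae = exposure_after_detection(issued, detected, lifetime,
                                    cae_delay=timedelta(minutes=5))  # assumed delay
```

Here without_cae comes to 50 minutes and with_cae to 5. The second figure only applies to traffic where both the client and the resource are CAE-capable.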

The caveat: CAE only works where the client supports it and the resource supports it. CAE-capable clients hitting CAE-enabled services get the fast path; everything else falls back to standard token lifetimes and you’ve gained nothing for that traffic. And CAE’s IP-based location enforcement assumes Entra and the resource provider agree on the client’s source IP, which breaks the moment a downstream proxy or a misconfigured SASE egress rewrites it. If you turn on strict-location CAE without auditing your egress IP visibility first, you will generate spurious revocations and a lot of angry tickets.

So the honest posture: token protection where the fleet supports it, CAE everywhere it’s available, phishing-resistant FIDO2 to make the initial AiTM proxy step harder in the first place, and the session-correlation detection above to catch what slips through the coverage gaps. None of these alone closes it. The detection exists precisely because the preventive controls have holes.

Control mapping

NIST SP 800-53 control	Relevance
IA-2 / IA-2(1)	MFA at issuance; phishing-resistant authenticators raise the cost of the AiTM relay step
AC-12	Session termination — CAE is the operational expression of near-real-time session revocation
SC-23	Session authenticity; token protection binds the session to the device key, which is directly on point
SI-4	The SIEM detection logic itself — anomalous session reuse monitoring
AU-6	Correlated review of sign-in logs across the issuance and replay events
CA-7	Continuous monitoring of the identity plane as an ongoing control, not a point-in-time check
AC-2(12)	Account monitoring for atypical usage — the ASN/UA anomaly is the atypical usage

SC-23 is the one people skip in their RMF crosswalk because session authenticity reads like a transport-layer concern. It isn’t, here. The token is the session, and binding it is the control.

The shops that get this right stop treating MFA as the finish line. They instrument issuance and they instrument what happens after, because the attacker’s whole game is the gap between the two. If your only identity detection is failed-login spikes and impossible travel, you are watching the door and ignoring the window someone already climbed through.