ToolShell six months on: SharePoint on-prem detection that holds up in the index

If you ran SharePoint Server on-prem in July 2025 and your incident channel did not light up, either you were extremely lucky or your IIS logs weren’t being parsed. The ToolShell chain — CVE-2025-53770 and CVE-2025-53771, layered on top of the partial fixes for CVE-2025-49704 and CVE-2025-49706 from the July Patch Tuesday — was the kind of bug class that shows up in a vendor advisory marked “exploitation detected” and then proceeds to do exactly that, at scale, before most shops have a maintenance window scheduled.

It’s worth revisiting now, not because the CVEs are new, but because the detection content most teams stood up in week one was wrong in interesting ways. The IOC lists aged badly. The first-generation Sigma rules over-fit on spinstall0.aspx as a filename. And the part nobody quite wants to talk about — rotating SharePoint MachineKeys across a farm without breaking custom solutions — got skipped at a lot of sites that otherwise patched on time.

So. What ToolShell actually was, what the durable detections look like, and where the noise comes from.

The shape of the bug

ToolShell is a deserialization chain reached through an authentication-bypass trick against /_layouts/15/ToolPane.aspx. The short version: a crafted POST with a Referer header pointing at /_layouts/SignOut.aspx would slip past the auth check that the July patch tried to introduce, hand the request to the ToolPane handler, and from there a tampered __VIEWSTATE blob got deserialized server-side with the farm’s MachineKey. If the attacker had the MachineKey — and the first stage of the chain was designed specifically to leak it — the ViewState payload signed cleanly and ran as w3wp.exe.

This is the part that matters for defenders six months later: the MachineKey is the persistent compromise primitive, not the webshell. If your remediation playbook ended at “applied the out-of-band patch and deleted spinstall0.aspx,” you patched the door and left the key on the mat. Microsoft’s own guidance was explicit about rotating MachineKeys and restarting IIS after patching, and it was the step most commonly skipped in the writeups that surfaced through August.

Affected products were SharePoint Server Subscription Edition, 2019, and 2016. SharePoint Online (the M365 tenant) was never in scope — different code path, different auth stack. If your farm is hybrid with on-prem federation into Entra, the on-prem side is the exposure; the cloud side is not.

What the telemetry actually looks like

There are three places the activity surfaces, in roughly this order of usefulness.

IIS logs. This is where the initial access leaves the cleanest fingerprint. You’re looking for POSTs to ToolPane.aspx with a Referer field set to a SignOut.aspx URL on the same host. In a healthy farm, that combination is essentially never legitimate — ToolPane.aspx is reached from inside an authenticated edit session, not from the signout page. In Splunk, if you have the IIS TA installed and the cs_referer field parsed correctly, the working query is roughly:

index=iis sourcetype="ms:iis:auto" cs_uri_stem="/_layouts/15/ToolPane.aspx" cs_method=POST cs_referer="*SignOut.aspx*"
| stats count by c_ip, s_ip, cs_uri_query, sc_status

A couple of caveats the docs are quiet about. First, the IIS TA’s default field extractions in older versions truncate cs_referer if the URL is long — check that the field is populated before you trust the rule. Second, some load balancers (the F5 ASM in particular, in certain virtual-server configs) will strip the Referer header before it ever reaches the SharePoint front-end, which means your edge logs and your IIS logs disagree about whether the header was present. Tune on the IIS side, not the edge.

Expected volume in a clean farm: zero. If you’re seeing more than a handful of hits per day after the first week of patching, the noise source is almost always either a SharePoint health probe written by someone who copied the URL out of a Stack Overflow post, or an internal pen test that nobody told the SOC about. Both are findable. Neither is a tuning problem; both are process problems.
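For teams that want to run the same check against raw log files rather than the index, here is a minimal Python sketch of the IIS rule. It assumes W3C-format logs with a #Fields: header that includes cs-method, cs-uri-stem, c-ip and cs(Referer); the sample lines, host name and addresses are synthetic. Tolerating a trailing slash on the URI stem is an added precaution, not something the Splunk query does.

```python
from typing import Dict, Iterable, Iterator, List


def parse_w3c(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry, keyed by the names in the #Fields: header."""
    fields: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            # Skip rows that do not match the header (truncated or corrupt lines).
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def is_toolpane_signout_post(entry: Dict[str, str]) -> bool:
    """True for a POST to ToolPane.aspx whose Referer mentions SignOut.aspx."""
    stem = entry.get("cs-uri-stem", "").lower().rstrip("/")
    referer = entry.get("cs(Referer)", "").lower()
    return (
        entry.get("cs-method", "").upper() == "POST"
        and stem == "/_layouts/15/toolpane.aspx"
        and "signout.aspx" in referer
    )


# Synthetic sample: one matching POST, one unrelated GET.
SAMPLE = """#Software: Microsoft Internet Information Services 10.0
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status
2025-07-19 03:12:45 10.0.0.5 POST /_layouts/15/ToolPane.aspx - 443 - 203.0.113.7 Mozilla/5.0 https://sp.example.test/_layouts/SignOut.aspx 200
2025-07-19 03:13:02 10.0.0.5 GET /_layouts/15/start.aspx - 443 - 10.0.0.9 Mozilla/5.0 - 200
"""

hits = [e for e in parse_w3c(SAMPLE.splitlines()) if is_toolpane_signout_post(e)]
for hit in hits:
    print(hit["date"], hit["time"], hit["c-ip"], hit["sc-status"])
```

If cs(Referer) is missing from the #Fields: line, the function quietly returns False for every entry, which is the offline version of the truncated-field caveat above: confirm the field is logged before trusting an empty result.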

Process telemetry on the SharePoint front-ends. This is where Sysmon (or EDR equivalent) earns its keep. The signature is w3wp.exe spawning a child that has no business being a child of an IIS worker process — cmd.exe, powershell.exe, csc.exe, or anything that looks like reconnaissance (whoami, net.exe, nltest). The Sysmon Event ID 1 query, again Splunk-shaped:

index=sysmon EventCode=1 ParentImage="*\\w3wp.exe"
  Image IN ("*\\cmd.exe","*\\powershell.exe","*\\csc.exe","*\\cscript.exe","*\\wscript.exe","*\\whoami.exe","*\\net.exe","*\\nltest.exe")
| table _time, host, ParentCommandLine, Image, CommandLine, User

This is also where the first round of tuning gets bloody. w3wp.exe legitimately spawns csc.exe whenever ASP.NET compiles a page for the first time after an app pool recycle. On a farm with custom solutions, that fires a lot, especially right after patching, when every app pool has been restarted and the compiled-page cache is cold. The fix is not to exclude csc.exe wholesale; it’s to exclude only those csc.exe runs where the command line points into the Temporary ASP.NET Files path and the process runs as a known SharePoint app-pool service account. Anything else is suspicious.

PowerShell as a child of w3wp.exe should be zero in a sanely configured farm. If you see it and the operator says “oh, that’s the monitoring agent,” the monitoring agent is wrong and someone should fix it. Don’t whitelist it; fix it.
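The csc.exe carve-out is easier to review when written out as a predicate. The following Python sketch applies it to Sysmon Event ID 1 records that have already been parsed into dicts. The account name CONTOSO\svc-spapppool is a placeholder, and treating the event's User field as the app-pool identity is an assumption that holds only when the child inherits the worker process token.

```python
import ntpath
from typing import Iterable, Mapping

# Children of w3wp.exe worth alerting on (mirrors the query above).
SUSPECT_CHILDREN = {"cmd.exe", "powershell.exe", "csc.exe", "cscript.exe", "wscript.exe"}

# Placeholder: replace with the farm's real app-pool service accounts.
KNOWN_SERVICE_ACCOUNTS = {r"CONTOSO\svc-spapppool"}


def is_suspicious_w3wp_child(
    event: Mapping[str, str],
    service_accounts: Iterable[str] = KNOWN_SERVICE_ACCOUNTS,
) -> bool:
    """Return True when a w3wp.exe child should raise an alert."""
    parent = ntpath.basename(event.get("ParentImage", "")).lower()
    child = ntpath.basename(event.get("Image", "")).lower()
    if parent != "w3wp.exe" or child not in SUSPECT_CHILDREN:
        return False
    if child == "csc.exe":
        cmdline = event.get("CommandLine", "").lower()
        user = event.get("User", "").lower()
        accounts = {a.lower() for a in service_accounts}
        # Benign only if BOTH conditions hold: ASP.NET temp path and a known account.
        benign = "\\temporary asp.net files\\" in cmdline and user in accounts
        return not benign
    # cmd, powershell and the script hosts are never expected under w3wp.exe.
    return True
```

Both conditions must hold for the exclusion to apply, so a csc.exe run from the right path under an unknown account still alerts, as does a known account compiling from somewhere else.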

File creation in the LAYOUTS folder. The original IOC was spinstall0.aspx dropped into C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\TEMPLATE\LAYOUTS\. By August, the filename had drifted: spinstall1, info3, random eight-char strings, and at least one campaign that used a Unicode homoglyph in the filename to evade naive string-match rules. The durable detection is not the filename; it’s any .aspx file written into the LAYOUTS folder by a process other than the SharePoint installer or the Windows Update servicing stack. In practice that means Sysmon Event ID 11, scoped to that path, with an exclusion for TrustedInstaller.exe and the SharePoint patch installer. That’s the rule that ages well.
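A filename-agnostic version of that rule can be sketched in Python against parsed Sysmon Event ID 11 records. The field names TargetFilename and Image follow the Sysmon schema. The second entry in the allowlist is a made-up stand-in, because the image name of the SharePoint patch installer is not given here and has to be filled in per environment.

```python
import ntpath
from typing import Iterable, Mapping

LAYOUTS_DIR = (
    r"C:\Program Files\Common Files\Microsoft Shared"
    r"\Web Server Extensions\16\TEMPLATE\LAYOUTS"
)

# TrustedInstaller.exe comes from the text above; the second name is hypothetical.
ALLOWED_WRITERS = {"trustedinstaller.exe", "sharepoint-patch-installer.exe"}


def is_unexpected_layouts_aspx(
    event: Mapping[str, str],
    allowed_writers: Iterable[str] = ALLOWED_WRITERS,
) -> bool:
    """True when any .aspx lands under LAYOUTS from a non-allowlisted process."""
    target = ntpath.normpath(event.get("TargetFilename", "")).lower()
    layouts_prefix = LAYOUTS_DIR.lower() + "\\"
    writer = ntpath.basename(event.get("Image", "")).lower()
    allowed = {name.lower() for name in allowed_writers}
    return (
        target.startswith(layouts_prefix)  # includes subfolders of LAYOUTS
        and target.endswith(".aspx")
        and writer not in allowed
    )
```

Because the check keys on the directory and extension rather than the name, a homoglyph or random-string filename matches exactly as spinstall0.aspx would.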

Where the false positives actually come from

Three sources, in descending order of how much pain they cause.

Custom SharePoint solutions deployed via PnP PowerShell or the older stsadm.exe will, depending on how they were authored, write files into LAYOUTS at deployment time. If your shop has an active SharePoint dev team, expect the LAYOUTS-write rule to fire during scheduled deploys. Carving out the deploy window by change ticket is the cleanest answer; whitelisting the PnP module by its code-signing certificate thumbprint is the next-cleanest; whitelisting by parent process name is the laziest and the one that gets abused.

SCCM/MECM and similar configuration-management agents running on the SharePoint hosts will occasionally spawn processes under contexts that look adjacent to w3wp.exe in process-tree visualizations even when they aren’t strict children. Most EDRs handle this correctly. A few don’t, and the resulting parent-process attribution noise is hard to debug because it looks real. Check the actual ParentProcessGuid field if your EDR exposes it, not the rendered tree in the console.
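One way to check attribution by GUID instead of by rendered tree is sketched below in Python. It assumes Sysmon-style process-creation records with ProcessGuid, ParentProcessGuid and Image fields; the GUID values in the usage example are invented.

```python
import ntpath
from typing import Dict, Iterable, Mapping


def build_process_index(events: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map each ProcessGuid to the image path recorded when the process started."""
    return {
        e["ProcessGuid"]: e["Image"]
        for e in events
        if "ProcessGuid" in e and "Image" in e
    }


def is_strict_child_of(
    event: Mapping[str, str],
    process_index: Mapping[str, str],
    parent_name: str = "w3wp.exe",
) -> bool:
    """True only if ParentProcessGuid resolves to a process whose image is parent_name."""
    parent_image = process_index.get(event.get("ParentProcessGuid", ""))
    if parent_image is None:
        return False  # parent start event not in the data set: cannot confirm
    return ntpath.basename(parent_image).lower() == parent_name.lower()


# Invented records: a worker process, an agent, and a shell started by the agent.
events = [
    {"ProcessGuid": "{guid-w3wp}", "Image": r"C:\Windows\System32\inetsrv\w3wp.exe"},
    {"ProcessGuid": "{guid-agent}", "Image": r"C:\Windows\CCM\CcmExec.exe"},
    {"ProcessGuid": "{guid-cmd}", "ParentProcessGuid": "{guid-agent}",
     "Image": r"C:\Windows\System32\cmd.exe"},
]
index = build_process_index(events)
print(is_strict_child_of(events[2], index))
```

Returning False when the parent's start event is missing is a deliberate choice for triage, since it avoids confirming a relationship the data cannot support; a hunting workflow might prefer to surface those cases as unknown instead.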

Health-check tooling — the in-house kind, written years ago, that POSTs to assorted SharePoint endpoints to make sure they return 200 — is the one that catches teams by surprise. The script doesn’t know about ToolShell; it was written in 2018 and has been running on a Tuesday cron ever since. It usually shows up as a single source IP hammering a list of URLs with no Referer set, which is its own tell. If your IIS rule keys on Referer=SignOut.aspx, it won’t fire on this. If you generalized the rule to “any unusual POST to ToolPane,” it will, and the on-call will spend an hour finding the cron job.
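The health-check tell described above (one source, many URLs, never a Referer) can be pulled out of parsed IIS entries with a few lines of Python. This is a rough sketch: the entries are dicts keyed by W3C field names, and the threshold of three distinct URLs is an arbitrary starting point, not a recommended value.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Set


def probable_health_checkers(
    entries: Iterable[Mapping[str, str]],
    min_distinct_urls: int = 3,  # arbitrary threshold; tune per environment
) -> List[str]:
    """Source IPs that POST to several URLs and never send a Referer."""
    urls_by_ip: Dict[str, Set[str]] = defaultdict(set)
    sent_referer: Set[str] = set()
    for entry in entries:
        if entry.get("cs-method", "").upper() != "POST":
            continue
        ip = entry.get("c-ip", "")
        urls_by_ip[ip].add(entry.get("cs-uri-stem", "").lower())
        # IIS writes "-" for an absent header.
        if entry.get("cs(Referer)", "-") not in ("-", ""):
            sent_referer.add(ip)
    return sorted(
        ip
        for ip, urls in urls_by_ip.items()
        if len(urls) >= min_distinct_urls and ip not in sent_referer
    )
```

The output is a list of candidates to match against known cron jobs and monitoring hosts, not a verdict; an attacker scanning endpoints without a Referer would look the same.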

The MachineKey problem

Key rotation is where the remediation story gets uncomfortable. The official guidance (patch, rotate MachineKeys by generating new keys with Set-SPMachineKey and deploying them across the farm with Update-SPMachineKey, then restart IIS) is correct and also operationally fraught on farms with custom solutions that pin to specific key material, or that use ViewState MAC validation in ways the original solution authors didn’t fully document. The failure mode is that rotation succeeds, IIS restarts, and then a subset of pages throw ViewState validation errors until the affected app pools are individually recycled or the solution is redeployed.

The shops that handled this well staged the rotation: rotate in a non-production farm first, watch for ViewState errors in the ULS logs (ASP.NET ViewState event category), fix the solutions that break, then rotate prod. The shops that handled it badly rotated in prod on a Saturday, took the help-desk calls on Monday, and rolled back. Rolling back a MachineKey rotation when you suspect prior compromise is the worst possible outcome — you’re back to a known-leaked key on a host you can’t trust.

If you’re reading this and you patched ToolShell but never rotated the keys, treat that farm as suspect until you have evidence to the contrary. The MachineKey is small, exfiltrates in a single HTTP response, and is trivially reused from anywhere on the internet against your front-ends until rotated.

Control mapping

The 800-53 alignment is straightforward, but two of the mappings are worth calling out because they’re the ones audits tend to skim past.

Control  Relevance
SI-2     Flaw remediation: the out-of-band patches, plus the follow-on key rotation that the patch alone doesn’t perform
SI-4     System monitoring: IIS, Sysmon, and ULS log ingestion into the SIEM with the rules above
SI-7     Software/information integrity: MachineKey is the integrity anchor for ViewState; rotation is an SI-7 action, not just an SI-2 one
SC-12    Key management: the MachineKey is cryptographic material under SC-12, and your inventory of “keys in scope” probably didn’t include it before this
CM-7     Least functionality: disable layouts pages and endpoints that the farm doesn’t actually need
AU-6     Log review: the IIS rule above is the load-bearing one
RA-5     Vulnerability scanning: internal scanners need a ToolShell check that exercises the bypass, not just a banner-grab on the SharePoint version

The SC-12 mapping is the one most ISSOs hadn’t thought about before this CVE. If your key inventory for the system boundary lists TLS certs and database TDE keys but not the ASP.NET MachineKeys for each app, that gap is worth closing during the next assessment cycle regardless of ToolShell.

What to actually do this week

If you still have on-prem SharePoint and you haven’t gone back and audited the post-patch state, the order of operations is: confirm the July and out-of-band patches are applied on every front-end (Get-SPProduct -Local and compare versions across the farm — drift is common); confirm MachineKeys have been rotated since the patch and that the rotation timestamp postdates the patch install; run the IIS log query above against at least the last 180 days of retention, because the active exploitation window predates a lot of teams’ detection content; check LAYOUTS for any .aspx not owned by a SharePoint or Windows installer process.
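The first two checks in that list reduce to simple comparisons once the facts are collected. The Python sketch below shows the comparisons only. The FrontEnd record, the build labels and the timestamps are all hypothetical, and how you obtain the patch and rotation times (installer logs, change tickets, timer-job history) is site-specific and not covered here.

```python
from datetime import datetime
from typing import Dict, NamedTuple, Optional, Set


class FrontEnd(NamedTuple):
    """Hypothetical per-server record gathered during the audit."""
    build: str             # product build reported on this server
    patched_at: datetime   # when the out-of-band patch was installed


def build_drift(servers: Dict[str, FrontEnd]) -> Set[str]:
    """Return the set of distinct builds if the farm disagrees, else an empty set."""
    builds = {s.build for s in servers.values()}
    return builds if len(builds) > 1 else set()


def rotation_postdates_patch(
    servers: Dict[str, FrontEnd],
    key_rotated_at: Optional[datetime],
) -> bool:
    """True only if keys were rotated after the LAST front-end was patched."""
    if key_rotated_at is None or not servers:
        return False
    return key_rotated_at > max(s.patched_at for s in servers.values())


# Placeholder data: WFE2 was patched late, after the keys were rotated.
farm = {
    "WFE1": FrontEnd("build-A", datetime(2025, 7, 21, 22, 0)),
    "WFE2": FrontEnd("build-B", datetime(2025, 8, 2, 1, 30)),
}
print(build_drift(farm))
print(rotation_postdates_patch(farm, datetime(2025, 7, 22, 9, 0)))
```

Comparing against the latest patch time rather than the earliest is the point of the second function: a rotation done while one front-end was still unpatched leaves a window in which the new key could have leaked too.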

And if you find evidence of prior MachineKey theft — anomalous ViewState validation errors in ULS logs from before the patch, unexpected files in LAYOUTS, w3wp.exe spawning shells in the parent-process history — assume the farm was compromised, rotate keys again, and treat the box like an IR engagement rather than a cleanup task. The cost of being wrong in the other direction is too high.

The broader lesson here is one defenders already knew but ToolShell made expensive: a deserialization bug that leaks the signing key isn’t a single CVE, it’s a credential disclosure with a remote-code-execution side effect. Patch the bug, but inventory and rotate the secret it leaked. The patch alone is not the fix.

Sources