CRXcavator is dead! It started flaking out in late 2023, was effectively gone by 2024, and Cisco never said a word about it.
For nearly five years it was the answer when somebody on a security team asked "how do we tell if this Chrome extension is safe?" So what do you do now?
Now, extension scanning is our thing at Am I Being Pwned, so we're obviously biased here, but we're also not competing with CRXcavator because the service doesn't work anymore.
So what was CRXcavator?
CRXcavator launched out of Duo Labs (Duo Security's research arm, post-Cisco-acquisition) in February 2019. A pile of AWS Lambda functions, embarrassingly parallel, walked the entire Chrome Web Store every three hours, pulled each CRX, unpacked it, and ran it through a battery of static checks. The output was a public risk score you could query by extension ID, no account needed. It was free, beta its entire life, and the source was never released, which is part of why nothing has cleanly replaced it. When the Lambda account got switched off, the code went with it.
The risk score was a straight additive sum: permissions + CSP + RetireJS. No normalisation, no weighting, no ceiling. Permissions had per-API risk weights (tabs is small, <all_urls> is huge). CSP scored each directive from "tightly restricted" to "not defined at all", with the catch that the extension CSP only governs the extension's own pages, not the content scripts injected into every site you visit, and almost nobody bothered customising it anyway. RetireJS flagged known-vulnerable libraries.
The team behind it knew exactly what was wrong with this. The score measured risk surface, not malicious intent, which meant a password manager with legitimately broad permissions would rank higher than a tightly-scoped credential stealer targeting a single bank. That was the wall it was always going to hit.
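To make that ranking problem concrete, here's a minimal sketch of an additive score. The weights, the CSP and RetireJS numbers, and the bank.example host are all made up for illustration; CRXcavator's real values aren't reproduced here.

```python
# Hypothetical per-permission weights, for illustration only.
PERMISSION_WEIGHTS = {
    "storage": 1,
    "tabs": 5,
    "https://bank.example/*": 10,  # a single host pattern
    "<all_urls>": 50,              # every site the user visits
}


def additive_risk_score(permissions, csp_score, retirejs_score):
    """Straight sum of the three components: no normalisation, no weighting, no ceiling."""
    permission_score = sum(PERMISSION_WEIGHTS.get(p, 0) for p in permissions)
    return permission_score + csp_score + retirejs_score


# A legitimate password manager needs broad access to fill forms everywhere.
password_manager = additive_risk_score(
    ["storage", "tabs", "<all_urls>"], csp_score=10, retirejs_score=0
)

# A credential stealer scoped to one bank declares almost nothing.
bank_stealer = additive_risk_score(
    ["https://bank.example/*"], csp_score=10, retirejs_score=0
)

print(password_manager, bank_stealer)
```

With these made-up weights the password manager scores higher than the stealer, purely because it declares more. Nothing in the sum looks at what either extension does with its access.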
It still caught real things while it was running, mind you. The biggest moment was in February 2020, when independent researcher Jamila Kaya used CRXcavator to dig into a cluster of suspicious extensions and ended up uncovering an ad-fraud and data-exfiltration ring that, working with Jacob Rickerd and Google's threat analysis team, expanded to 500+ malicious extensions affecting around 1.7 million users. Google pulled the lot.
The thing that actually made CRXcavator sticky inside enterprises was the operational layer around it: the Gatherer companion extension that fed fleet inventory back into the dashboard, and a public API that spawned a small ecosystem of open-source clients and Splunk SOAR connectors before the lights went out.
Where the methodology started running out of road
Here's the stuff a static, code-at-rest scanner was never going to catch:
Supply chain compromise. In December 2024 the Cyberhaven attack hit 35+ extensions and around 2.6 million users. The malicious update used the exact same permissions as the legitimate version, because the attackers just pushed a new version through a stolen OAuth token. A scanner that only looks at the manifest sees zero change. The diff is entirely in the JavaScript.
Obfuscated exfiltration. When Stylish came back from the dead with a "verified publisher" badge, the actual payload was four layers of base64 plus a columnar transposition cipher plus AES-256-CBC with a hardcoded key. The permissions are exactly what you'd expect for a styling extension. The behaviour is full browsing-history exfiltration to a data company. RetireJS doesn't see this. Nothing static really does.
"Manifest looks fine, code is rotten." The MultiPassword CVSS 8.3 I disclosed earlier this year (1M+ users) had a totally clean manifest. The exploit was a content script that trusted a postMessage from the page's main world, so any site could iterate through the user's saved credentials and pull plaintext passwords and TOTP codes. No permissions check would have caught it, because nothing about the permissions was wrong.
Permission scoring measures declared intent, the stuff the developer ticks in the manifest. The same set of permissions can power a perfectly nice ad blocker or a credential stealer depending on what the JavaScript decides to do that morning. A permission score never touches what the code is actually doing, and that's the gap that's been getting people pwned.
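The Cyberhaven-style case is easy to sketch. Assume two unpacked versions of the same extension, each represented as a dict of declared permissions plus file contents (a hypothetical shape chosen for this example, not any real tool's format). Comparing permissions reports nothing; comparing file hashes points straight at the changed script.

```python
import hashlib


def file_hashes(files):
    """Map each file path to the SHA-256 hex digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def compare_versions(old, new):
    """Report whether permissions changed and which files differ between versions."""
    old_hashes = file_hashes(old["files"])
    new_hashes = file_hashes(new["files"])
    all_paths = old_hashes.keys() | new_hashes.keys()
    changed = sorted(p for p in all_paths if old_hashes.get(p) != new_hashes.get(p))
    return {
        "permissions_changed": set(old["permissions"]) != set(new["permissions"]),
        "changed_files": changed,
    }


# Hypothetical legitimate release.
v1 = {
    "permissions": ["storage", "cookies"],
    "files": {"background.js": b"console.log('ok');", "popup.js": b"render();"},
}

# Hypothetical hijacked update: same manifest permissions, different code.
v2 = {
    "permissions": ["storage", "cookies"],
    "files": {"background.js": b"sendCookiesSomewhere();", "popup.js": b"render();"},
}

report = compare_versions(v1, v2)
print(report)
```

A hash diff only tells you that something changed, not whether the change is malicious. It does tell you where to start reading, which a manifest-only comparison can't.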
How we actually do it
The whole product is built on one premise: don't trust the manifest, read the code.
Every extension we ingest gets deobfuscated and decompiled, then run inside an instrumented sandbox where an agent drives it the way a real user would, capturing the actual network egress, DOM interactions and message-passing under load. We follow the message-passing between content scripts, background workers and injected iframes, and trace where every external request actually comes from. The output is a model of what the extension is doing, not what it claims it might.
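As a rough illustration of the egress-tracing idea (not our actual pipeline), assume a sandbox run produces a list of request records with a url and an initiator field. Both the record shape and the hostnames below are hypothetical. Grouping by hostname and dropping the hosts you'd expect leaves the destinations worth a closer look, along with which part of the extension contacted them.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def unexpected_egress(requests, expected_hosts):
    """Group captured requests by hostname and keep only hosts not in expected_hosts.

    Returns a dict mapping each unexpected hostname to the sorted list of
    extension components (initiators) that contacted it.
    """
    by_host = defaultdict(set)
    for req in requests:
        host = urlsplit(req["url"]).hostname
        if host:
            by_host[host].add(req["initiator"])
    return {
        host: sorted(initiators)
        for host, initiators in by_host.items()
        if host not in expected_hosts
    }


# Hypothetical trace from one sandbox session of a styling extension.
trace = [
    {"url": "https://styles.example/api/themes", "initiator": "background"},
    {"url": "https://collector.example/v1/u?d=abc", "initiator": "content_script"},
    {"url": "https://collector.example/v1/u?d=def", "initiator": "background"},
]

findings = unexpected_egress(trace, expected_hosts={"styles.example"})
print(findings)
```

An expected-hosts set is itself a judgment call, so treat the output as a lead for a reviewer rather than a verdict.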
An LLM pipeline reasons over the deobfuscated source and runtime traces. The things I most want to catch are the things no published signature has seen yet, which is the kind of problem an LLM is genuinely good at: weird egress destinations, content-script-to-page leaks, encoded payloads being assembled at runtime.
Then a researcher checks anything serious before it lands in a customer dashboard. LLMs hallucinate, and a wrong "this extension is stealing passwords" claim is much worse than a slow one. That bit of the pipeline costs a lot, and we keep doing it anyway.
On top of the scanning pipeline there's a fleet side: an extension you push to your team via invite link (or, soon, via MDM) that enrols every device into continuous monitoring. The extension itself only collects extension IDs, version numbers and permissions. No browsing data, no page content, no cookies, by design. Anything else would make us the kind of extension we'd be writing about.
What CRXcavator could see vs what we can see
Both tools do the basic static checks (permission scoring, library CVEs via RetireJS, declared CSP egress). The differences are in what each one can see beyond that.
| Capability | CRXcavator | Am I Being Pwned |
|---|---|---|
| Actual runtime egress destinations | No | Yes |
| Catches supply chain compromise (Cyberhaven) | No | Yes |
| Continuous fleet monitoring | Allowlist via Gatherer | Yes (every device, every update) |
| Human-verified findings | No | Yes |
| Browser support | Chrome (Edge / Firefox limited) | Chrome and Chromium-based |
| Status | Defunct (gone by 2024) | Alive and well |
Things this approach has actually caught
Methodology arguments are cheap, so here are the receipts from the last six months:
- MultiPassword CVSS 8.3 (1M+ users). A bug in how the content script and the credential dropdown iframe talked to each other meant any site sharing a country-code eTLD could exfiltrate usernames, passwords, URLs and TOTP codes from any other login on the same eTLD. No user interaction. Manifest looked totally fine.
- Stylish is back, back again (2M+ users at peak). Verified publisher badge. "Featured" extension. Sending every URL you visit to a data company through four layers of obfuscation.
- WhatRuns. Scraping AI chat contents and browsing URLs.
- StayFocusd. Exfiltrating your complete browsing history.
- Coupert (3M users). Exfiltrating search results.
Undisclosed findings got us added to Belgium's Centre for Cybersecurity (CCB) Wall of Fame in March 2026. Many of the private disclosures we're still waiting on vendors to fix are more impactful than anything we've published.
A permission-based scanner would have given several of these a clean or near-clean score. The permissions were not the problem in any of them. Password managers have genuine uses for scripting, Stylish does actually need access to every hostname you visit, and WhatRuns needs access to tell you what's running. The problem isn't which permissions an extension has; it's how they're being used.
If you're still linking to crxcavator.io
Update the wiki. Update the procurement doc. Update the onboarding checklist. Those scores are now over two years stale, and the methodology was already running out of road on the day the server stopped responding.
CRXcavator was free as a public web lookup. We do roughly the same thing from inside the browser via a free Chrome extension you can install and scan with right now, no account needed. For organisations, fleet monitoring is $3/seat/month and whitelist monitoring is $1/extension/month, with a 14-day trial that doesn't need a card.
If you want the short version of "what should we use instead", the honest options today are:
- Us, if you want the LLM and code-analysis approach at a fair price and more reliable results than CRXcavator. Scan your extensions.
- Palo Alto's Prisma Browser Extension (or the full Prisma Browser), which absorbed Koi Security and its "Wings" engine when PANW bought them for ~$400M in February 2026. Worth a look if you're already in the PANW ecosystem, but the pricing is quote-based and aimed at 10k+ employee orgs.
- CrowdStrike Falcon Exposure Management, if you're already running Falcon. Check whether the extension module covers what you need before buying anything standalone.
CRXcavator was good for its time, and the methodology gave the industry a baseline to argue about. Then the threats moved past the manifest, and at that point a static scanner has nowhere left to look. The honest answer to "is this extension safe?" in 2026 is "what is it doing right now," and that takes reading the code, running it, and having an agent dynamically interact with it in a sandbox, alongside human review for critical environments.
That's what we built. And we're alive and well.
