Why Permission Scoring Fails

Permission scoring is the dominant approach to Chrome extension security. It is also largely useless for finding the threats that actually matter.

This is not theoretical. We have the data.

The numbers

We tested this directly. The full data is in our Spin AI comparison, but the headline is this: across 2,534 extensions, the median risk score for extensions we rated CRITICAL was lower than for extensions we rated CLEAN. We found no meaningful correlation between permission-based score and actual risk.

This is not a shot at Spin AI. We are highlighting a structural problem with permission-based risk scoring.

Why permission scoring sounds reasonable

The logic is intuitive: an extension that requests <all_urls>, cookies, tabs, and scripting is clearly more dangerous than one that requests only activeTab. More permissions = more attack surface = higher score.
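That logic can be sketched in a few lines. The weights, the cap of 100, and the function name below are hypothetical, chosen only to illustrate the approach; they are not taken from any real scanner.

```python
# Hypothetical weights for illustration only; not any real scanner's values.
PERMISSION_WEIGHTS = {
    "<all_urls>": 40,
    "cookies": 20,
    "scripting": 20,
    "tabs": 15,
    "activeTab": 5,
}
DEFAULT_WEIGHT = 5  # assumed weight for any permission not listed above


def permission_score(manifest: dict) -> int:
    """Sum the weights of every requested permission, capped at 100."""
    requested = list(manifest.get("permissions", []))
    requested += manifest.get("host_permissions", [])
    total = sum(PERMISSION_WEIGHTS.get(p, DEFAULT_WEIGHT) for p in requested)
    return min(total, 100)


broad = {
    "permissions": ["cookies", "tabs", "scripting"],
    "host_permissions": ["<all_urls>"],
}
narrow = {"permissions": ["activeTab"]}

print(permission_score(broad))   # 95
print(permission_score(narrow))  # 5
```

The only input is the manifest. Nothing in the function reads the extension's JavaScript.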

This was how CRXcavator worked. It was how almost every scanner built before 2023 worked. It gave the industry a baseline to argue about, and it catches some things - a new extension from an unknown publisher requesting every permission available is genuinely suspicious.

The problem is the specific class of threat it cannot catch, which happens to be the most dangerous one.

The fundamental problem

Permissions describe what an extension is allowed to do. They say nothing about what it does.

The same set of permissions can power a tool that respects your privacy or one that exfiltrates everything you type. The manifest is identical. The code is not.

Three examples from our own research:

Stylish - 2M users, Featured, Verified Publisher

Stylish is a CSS customisation extension. It legitimately needs access to every site you visit to apply styles. <all_urls> is not suspicious in this context - it is required for the product to work.

What a permission scanner sees: an extension with broad host access and a large, established user base. Probably fine.

What we found: every URL you visit, sent to a data company on every page load, wrapped in five layers of encoding and encryption - URL encoding, double base64, JSON stringify, a columnar transposition cipher, and AES-256-CBC with a hardcoded key. It runs silently, with no user-visible indicator.

Spin AI scored Stylish at 33 (low risk). We rated it CRITICAL.

MultiPassword - 1M users

MultiPassword is a password manager. A password manager legitimately needs scripting access to fill credentials into login pages. It legitimately needs storage access. Its permissions are exactly what you would expect.

What a permission scanner sees: normal permissions for a password manager. Nothing to flag.

What we found: a postMessage vulnerability where any site sharing a country-code eTLD with another site could iterate through and exfiltrate usernames, passwords, TOTP codes, and URLs - with no user interaction. The manifest was completely clean. The vulnerability was entirely in how the extension handled messages from untrusted origins.

CVSS 8.3. We disclosed it and they patched it. Their Spin AI risk score went up after the patch. We think this was because the extension was taken down from the Chrome Web Store for about a month.
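To make this class of bug concrete, here is a generic sketch of how an origin check can go wrong for country-code suffixes. It is not MultiPassword's code and is not presented as the actual root cause. The function names and the example hosts are hypothetical, and the "last two labels" comparison is an assumed mistake chosen for illustration.

```python
from urllib.parse import urlsplit


def naive_site(origin: str) -> str:
    """Flawed: treats the last two host labels as the site."""
    host = urlsplit(origin).hostname or ""
    return ".".join(host.split(".")[-2:])


def flawed_is_trusted(sender_origin: str, expected_origin: str) -> bool:
    """Compares only the naive site, so co.uk matches co.uk."""
    return naive_site(sender_origin) == naive_site(expected_origin)


def strict_is_trusted(sender_origin: str, expected_origin: str) -> bool:
    """Safer: require scheme, host, and port to match exactly."""
    a, b = urlsplit(sender_origin), urlsplit(expected_origin)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


# Hypothetical hosts used only for this example.
bank = "https://bank.example.co.uk"
attacker = "https://attacker.co.uk"

print(flawed_is_trusted(attacker, bank))  # True
print(strict_is_trusted(attacker, bank))  # False
```

Both versions would run under exactly the same manifest, which is why a permission scanner cannot tell them apart.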

uBlock Origin - 16M users

uBlock Origin is the most widely used open-source ad blocker. It is developed in the open, audited regularly, and trusted by the security community. It is about as clean as extensions get.

What a permission scanner sees: <all_urls>, tabs, webRequest, webRequestBlocking - an extension that can intercept and modify any web request, read every URL you visit, and inject scripts into any page. By permission count, this is about as high as it gets.

Spin AI scored it 51 (medium risk) - higher than Stylish, higher than Coupert, roughly the same as the extensions we observed collecting data.

Why this isn't fixable

You cannot solve this by adding more signals to the permission model. We tried - 212 features, gradient boosting, everything we could extract without running the code. It explained 28% of variance on the training set and scored negative R² on validation. The fundamental issue is that the information required to detect malicious intent - what the code actually does at runtime - is not in the manifest at all.
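For readers unfamiliar with a negative R²: it means the model's predictions were worse than always guessing the average. A minimal sketch of the calculation follows. The numbers are toy values picked to make the arithmetic obvious, not our dataset, and the function name is ours for illustration.

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Toy values for illustration only; not drawn from our data.
actual = [1.0, 2.0, 3.0, 4.0]
always_mean = [2.5, 2.5, 2.5, 2.5]  # a "model" that always predicts the mean
misleading = [4.0, 3.0, 2.0, 1.0]   # a model that gets the ordering backwards

print(r_squared(actual, always_mean))  # 0.0
print(r_squared(actual, misleading))   # -3.0
```

A score of zero is the do-nothing baseline. Anything below it means the features actively misled the model on data it had not seen.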

An attacker building a malicious extension does not need unusual permissions. They need whatever permissions their cover story requires, and then they need the JavaScript to do something different from what the cover story implies. A styling tool that exfiltrates URLs. A productivity timer that sends your browsing history to a data broker. An ad blocker with the infrastructure to scrape AI chats when the server tells it to.

The threat lives in the code and in the runtime behaviour. Permission scoring looks at neither.

What actually works

We deobfuscate the extension source and run it in an instrumented sandbox. A static analysis agent reads the code and forms specific, falsifiable claims about what the extension does. A dynamic analysis agent runs the extension and tries to prove those claims in a live environment - watching what leaves the device, what gets read, what gets sent where. Network captures are the evidence layer.
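The claim-then-verify loop can be sketched as follows. This is a simplified illustration, not our pipeline: the class names, fields, and hosts are hypothetical, and the check only looks for a visited URL in plain or base64 form. A real check would have to replay whatever decoding steps the static analysis identified, including any encryption.

```python
import base64
from dataclasses import dataclass


@dataclass
class CapturedRequest:
    """One outbound request from the network capture (hypothetical shape)."""
    host: str
    body: bytes


@dataclass
class Claim:
    """A falsifiable statement produced by static analysis."""
    description: str
    destination_host: str


def encodings_of(value: str) -> list[bytes]:
    """The value as raw bytes, base64, and double base64."""
    raw = value.encode()
    once = base64.b64encode(raw)
    return [raw, once, base64.b64encode(once)]


def claim_is_supported(
    claim: Claim, visited_url: str, capture: list[CapturedRequest]
) -> bool:
    """True if any request to the claimed host carries the visited URL."""
    needles = encodings_of(visited_url)
    return any(
        req.host == claim.destination_host
        and any(needle in req.body for needle in needles)
        for req in capture
    )


# Hypothetical claim and capture for illustration.
url = "https://example.com/private/page"
claim = Claim("sends visited URLs off-device", "collector.example")
capture = [
    CapturedRequest("cdn.example", b"GET styles"),
    CapturedRequest("collector.example", b"d=" + encodings_of(url)[2]),
]

print(claim_is_supported(claim, url, capture))  # True
```

The point of the design is that a claim either shows up in the capture or it does not. A finding that rests on observed traffic is harder to dispute than a score.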

This is how we found Stylish the second time, after it returned with a Verified Publisher badge. It is how we found the MultiPassword vulnerability. It is how we identified eight extensions with a combined 7 million users scraping AI chat content, most of them with Featured and Verified badges.

That is before counting the remote code execution bugs we will publish once they are patched, found in extensions from legitimate, trusted vendors who simply made serious mistakes.

None of those findings came from looking at permissions.

If you want to see what your extensions are actually doing, get in contact - free for individual extensions, fleet monitoring available within 48 hours.