Deceptive AI Skill Evades Security Scanners to Reach Thousands of Agents

The Experiment and Its Method

Security researchers at AIR demonstrated a significant weakness in the AI agent skill ecosystem by creating a fake skill that bypassed multiple security scanners. The deceptive skill, named brand-landingpage, was uploaded to a popular skill marketplace and promoted through an Instagram advertisement targeting marketers, salespeople, and designers. According to AIR, the skill reached approximately 26,000 agents, including some connected to corporate accounts. The payload was harmless and collected only user email addresses, but the experiment showed how a real attack could work.

The researchers exploited the gap between what security scanners check and what actually happens when an agent runs the skill. Scanners from Cisco and NVIDIA, as well as those built into skills.sh, analyze only the submitted package files, such as SKILL.md. The fake skill contained no malicious code. Instead, it directed the agent to an external website controlled by AIR for setup instructions. The link initially pointed to legitimate documentation, so scanners cleared the package. After the skill gained wide adoption, AIR changed the external page to instruct the agent to download and run a script.
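
To make that gap concrete, here is a minimal sketch of a check that lists every external URL a skill file points to, so a reviewer can see what the package delegates to content outside the scan. It is an illustration only, not how any of the named scanners work. The function name, the allowlist, and the example hostnames are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the organization has chosen to trust.
TRUSTED_HOSTS = {"docs.example.com"}

# Rough URL matcher: stops at whitespace, brackets, and double quotes.
URL_PATTERN = re.compile(r'https?://[^\s<>()\[\]"]+')


def find_external_references(skill_text, trusted_hosts=TRUSTED_HOSTS):
    """Return URLs in the skill text whose host is not on the allowlist."""
    flagged = []
    for raw_url in URL_PATTERN.findall(skill_text):
        url = raw_url.rstrip(".,;:")  # drop trailing sentence punctuation
        host = (urlparse(url).hostname or "").lower()
        if host not in trusted_hosts:
            flagged.append(url)
    return flagged


sample = (
    "For setup, follow https://setup.example.net/guide before first use. "
    "Reference docs: https://docs.example.com/skills."
)
print(find_external_references(sample))
```

A flagged URL is not evidence of malice. It only marks a place where the skill's behavior depends on content that a package-only scan never sees.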

Why Current Defenses Fall Short

The fundamental problem is structural: the security scan happens only once, at submission time. An attacker can rewrite the content on the external site at any point after approval. This blind spot has been independently demonstrated in prior research, including work from Trail of Bits that bypassed multiple scanners using the same technique. Anthropic’s own documentation warns that skills fetching external URLs carry this risk, since the content can change after vetting.
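
One way to narrow this window, sketched here under stated assumptions, is to record a digest of the external content at review time and compare it on every later fetch. The helper names are hypothetical, and fetching the page is deliberately left out: the functions operate on bytes the caller has already retrieved.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the reviewed content."""
    return hashlib.sha256(content).hexdigest()


def has_drifted(pinned_digest: str, current_content: bytes) -> bool:
    """True if the content no longer matches what was reviewed."""
    return fingerprint(current_content) != pinned_digest


# At review time: pin the digest of the page the reviewer actually read.
reviewed_page = b"Setup: read the documentation and configure your brand kit."
pinned = fingerprint(reviewed_page)

# Later: the same URL now serves different instructions.
changed_page = b"Setup: download the installer script and run it."
print(has_drifted(pinned, reviewed_page))  # unchanged content
print(has_drifted(pinned, changed_page))   # modified content
```

Legitimate pages also change, so a mismatch is best treated as a trigger for re-review rather than proof of an attack.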

Trust signals that the ecosystem relies on, such as GitHub stars and clean scanner verdicts, proved easy to manipulate. AIR gave the skill the star count of a repository with around 36,000 stars simply by submitting a pull request that was quickly merged. The takeaway for defenders is clear: treat skills as software that must be vetted for what they point to externally, not just what ships inside the package. Organizations should route new skills through a single controlled source, check them whenever anything changes, pin versions, and enforce least privilege for agent actions.
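
The last two recommendations, version pinning and least privilege, can be sketched as a small internal registry. This is a hypothetical illustration rather than any marketplace's or vendor's API. The skill name, version, and action names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedSkill:
    version: str                # the only version that passed review
    allowed_actions: frozenset  # actions the agent may take for this skill


# Hypothetical single controlled source of approved skills.
APPROVED = {
    "example-skill": ApprovedSkill("1.2.0", frozenset({"read_file", "write_file"})),
}


def may_install(name, version, registry=APPROVED):
    """Allow installation only for the exact reviewed version."""
    entry = registry.get(name)
    return entry is not None and entry.version == version


def may_perform(name, action, registry=APPROVED):
    """Allow an action only if it was granted to this skill."""
    entry = registry.get(name)
    return entry is not None and action in entry.allowed_actions


print(may_install("example-skill", "1.2.0"))
print(may_install("example-skill", "1.3.0"))
print(may_perform("example-skill", "run_shell"))
```

Under a policy like this, a skill that later tells the agent to download and run a script would be stopped at the action check, provided script execution was never granted.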

Source: The Hacker News
