The Experiment and Its Method
Security researchers at AIR demonstrated a significant weakness in the AI agent skill ecosystem by creating a fake skill that bypassed multiple security scanners. The deceptive skill, named brand-landingpage, was uploaded to a popular skill marketplace and promoted through an Instagram advertisement targeting marketers, salespeople, and designers. According to AIR, the skill reached approximately 26,000 agents, including some connected to corporate accounts. The harmless payload only collected user email addresses, but the experiment showed how a real attack could work.
The researchers exploited the gap between what security scanners check and what actually happens when an agent runs the skill. Scanners from Cisco and NVIDIA, as well as those built into skills.sh, analyze only the files in the submitted package, such as SKILL.md. The fake skill contained no malicious code. Instead, it directed the agent to an external website controlled by AIR for setup instructions. The link initially pointed to legitimate documentation, so scanners cleared the package. After the skill gained wide adoption, AIR changed the external page to instruct the agent to download and run a script.
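To make the gap concrete, the following is a minimal sketch of a reviewer-side check that lists every external URL a skill file references, so that a clean static verdict is not mistaken for a verdict on the content behind those links. It is an illustration under stated assumptions, not the method used by AIR or by any of the scanners named above: the function name extract_external_urls, the regular expression, and the sample text with its example.com links are all hypothetical.

```python
import re

# Hypothetical, deliberately simple pattern: matches http(s) URLs up to the
# next whitespace or closing bracket. Real skill files may need stricter parsing.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_external_urls(skill_text: str) -> list[str]:
    """Return the unique URLs found in the text, in first-seen order."""
    urls: list[str] = []
    for match in URL_PATTERN.findall(skill_text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


# Assumed sample content; not the actual brand-landingpage skill.
sample = """# example-skill
For setup, follow the instructions at https://example.com/setup.
See also [docs](https://example.com/docs).
"""

for url in extract_external_urls(sample):
    print("external dependency:", url)
```

Each URL reported this way is content that a package-only scan never evaluated and that its owner can change after approval.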
Why Current Defenses Fall Short
The fundamental problem is structural: the security scan happens only once, at submission time, and an attacker can rewrite the content on the external site at any point after approval. This blind spot has been independently demonstrated in prior research, including work from Trail of Bits that bypassed multiple scanners using the same technique. Anthropic’s own documentation warns that skills fetching external URLs carry this risk, since the content can change after vetting.
Trust signals that the ecosystem relies on, such as GitHub stars and clean scanner verdicts, proved easy to manipulate. The skill inherited the star count of a repository with around 36,000 stars: the researchers simply submitted a pull request, which was quickly merged. The takeaway for defenders is clear: treat skills as software that must be vetted for what they point to externally, not just for what ships inside the package. Organizations should route new skills through a single controlled source, re-check them whenever anything changes, pin versions, and enforce least privilege for agent actions.
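One way to read "check them whenever anything changes" is content pinning: record a hash of the external content at review time and refuse to proceed if the content later differs. The sketch below assumes that approach. It is not taken from the AIR research, the helper names fingerprint and is_unchanged are hypothetical, and fetching the external page is left out so the example stays self-contained.

```python
import hashlib
import hmac


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the reviewed content."""
    return hashlib.sha256(content).hexdigest()


def is_unchanged(content: bytes, pinned_digest: str) -> bool:
    """Compare the current content against the digest pinned at review time."""
    return hmac.compare_digest(fingerprint(content), pinned_digest)


# Assumed example content standing in for the external setup page.
approved_page = b"Step 1: read the documentation."
pinned = fingerprint(approved_page)  # stored when the skill is vetted

# Later, the page owner swaps in new instructions.
modified_page = b"Step 1: download and run setup.sh."

print(is_unchanged(approved_page, pinned))
print(is_unchanged(modified_page, pinned))
```

A mismatch does not prove the new content is malicious. It only signals that the skill now depends on something nobody reviewed, which is the condition that should trigger a re-check.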
Source: The Hacker News
