Sandbox Architecture and Initial Access
Anthropic’s Claude Cowork, a Windows application that gives non-technical users access to AI code execution, runs that code inside a multi-layered Linux sandbox. The sandbox combines Hyper-V isolation, a signature-checked named pipe, and namespace restrictions to contain any malicious code that runs inside the virtual machine. Researchers at Armadin found that the primary defense against unauthorized access was an Authenticode signature check guarding the named pipe that carries communication between the host and the sandboxed VM.
Privilege Escalation via DLL Sideloading
The research team discovered that claude.exe loads USERENV.dll from its own application directory before searching the system directories. By crafting a malicious DLL named USERENV.dll that exports the required function GetUserProfileDirectoryW, the researchers achieved code execution inside a legitimate, signed Anthropic binary. Because their code ran inside that signed process, it passed the named pipe’s identity verification without triggering any alarms, giving them a foothold inside the protected environment.
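The load-order problem described above can be checked for defensively. The following is a minimal sketch, not a tool from the research: it lists DLLs in an application directory whose names also exist in a system directory, since those are the files that would win when the application directory is searched first. The function name find_shadowing_dlls and both directory arguments are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import List


def find_shadowing_dlls(app_dir: str, system_dir: str) -> List[str]:
    """Return DLL file names in app_dir that also exist in system_dir.

    Names are compared case-insensitively, as Windows file names are.
    A match is only a hint that the application-directory copy would be
    loaded first; it does not prove the file is malicious.
    """
    # Collect the lowercased names of every DLL in the system directory.
    system_names = {
        entry.name.lower()
        for entry in Path(system_dir).iterdir()
        if entry.is_file() and entry.suffix.lower() == ".dll"
    }
    # Report application-directory DLLs that collide with a system name.
    return sorted(
        entry.name
        for entry in Path(app_dir).iterdir()
        if entry.is_file()
        and entry.suffix.lower() == ".dll"
        and entry.name.lower() in system_names
    )
```

On a real Windows host, the second argument would typically be the System32 directory. Any name the function returns deserves a signature check of its own, because a valid signature on the host executable says nothing about the libraries it loads.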
Complete Sandbox Escape
With code execution established, the researchers reverse-engineered the RPC protocol used for VM management. They found that the spawn method accepted an isResume parameter that was forwarded directly to the VM’s sdk-daemon without validation. Setting isResume to true skipped the step that creates a new unprivileged user, so commands could be executed as any specified user, including root. This chain of vulnerabilities effectively neutralized every layer of Anthropic’s sandbox protections, demonstrating that privilege boundaries in AI agent tools remain fragile once an attacker gains initial local access.
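The isResume flaw is an instance of trusting a caller-supplied flag. The sketch below shows one way a daemon could validate such a request. It is an assumption-laden illustration, not Anthropic’s code or its fix: SpawnRequest, SpawnRejected, resolve_spawn_user, the session table, and the sandbox-N user naming are all hypothetical. The idea is that a resume is honored only for a session the daemon itself created, and only for the user recorded for that session.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SpawnRequest:
    """Hypothetical shape of a spawn RPC request."""
    session_id: str
    user: str
    is_resume: bool


class SpawnRejected(Exception):
    """Raised when a spawn request fails validation."""


def resolve_spawn_user(req: SpawnRequest, sessions: Dict[str, str]) -> str:
    """Return the user a command may run as, or raise SpawnRejected.

    `sessions` maps session_id to the unprivileged user that the daemon
    created for that session.
    """
    if not isinstance(req.is_resume, bool):
        raise SpawnRejected("isResume must be a boolean")

    if req.is_resume:
        # A resume is valid only for a session the daemon already knows.
        recorded = sessions.get(req.session_id)
        if recorded is None:
            raise SpawnRejected("no such session to resume")
        # The caller may not substitute a different user, such as root.
        if req.user != recorded:
            raise SpawnRejected("user does not match the recorded session")
        return recorded

    if req.session_id in sessions:
        raise SpawnRejected("session already exists")
    # New session: ignore the requested user, allocate an unprivileged one.
    new_user = f"sandbox-{len(sessions) + 1}"
    sessions[req.session_id] = new_user
    return new_user
```

With this check, a request that sets isResume to true and names root is rejected unless root was the user the daemon recorded for that session, which the new-session path never does.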
Source: Cyber Security News
