Hound in Action.
We gave Hound two accounts in a realistic multi-tenant app. The app had login, roles, and company checks. Hound found two broken checks, crossed a company boundary, and reached confidential documents.
Hound Chained an Attack.
The app is Dossier, a B2B workspace with two companies: Marrow and Pith. Hound started with Bob at Marrow and Mallory at Pith. With those accounts in scope, Hound got one instruction: target is http://dossier:3000.
Hound used Bob's session to reset Mallory's password, logged in as Mallory, promoted her to admin, then opened Pith's confidential documents.
Where Authorization Broke.
Dossier had login, admin roles, and tenant-scoped report checks. The breach came from two places that trusted the wrong state.
Check: Password reset
Expected: An admin can reset accounts in their own tenant.
Flaw: The endpoint accepted an account ID from another tenant and reset that account's password.
The check handled peer admins inside one tenant, but never denied cross-tenant targets.
function checkPasswordResetAllowed(caller, target) {
  // Same-tenant rule: an admin cannot reset another admin's password.
  if (target.org_id === caller.org_id) {
    if (target.role === "admin" && target.id !== caller.id) {
      return "cannot reset another admin's password in your organization";
    }
  }
  // Flaw: a target in a different org skips the block above and is never denied.
  return null;
}
Check: Profile update
Expected: An account can edit profile fields like name and email.
Flaw: The profile endpoint also accepted role, so a normal account could become an admin account.
The update allowlist treated role like a normal profile field.
// Flaw: role sits in the same allowlist as ordinary profile fields.
const columns = ["name", "email", "role"];
for (const col of columns) {
  if (Object.prototype.hasOwnProperty.call(body, col)) {
    updates.push(`${col} = ?`);
    values.push(body[col]);
  }
}
Why Hound Could Do It.
The run depended on three things working together: multiple accounts, scoped state-changing tests, and reasoning across steps.
Multiple Accounts Enabled the Cross-Company Test
The reset only mattered because Bob and Mallory belonged to different companies. Hound needed both accounts in scope to test that safely.
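To make the two-account point concrete, the sketch below replays the password reset check from Dossier against two in-memory accounts in different orgs. The account IDs and org values are placeholders, and `checkPasswordResetAllowedStrict` is a hypothetical hardened variant for contrast, not Dossier's actual fix. It also assumes a `null` return means the reset is allowed.

```javascript
// The check as published in Dossier's source: cross-tenant targets fall through.
function checkPasswordResetAllowed(caller, target) {
  if (target.org_id === caller.org_id) {
    if (target.role === "admin" && target.id !== caller.id) {
      return "cannot reset another admin's password in your organization";
    }
  }
  return null; // assumed to mean "allowed"
}

// Hypothetical hardened variant: deny any target outside the caller's org first.
function checkPasswordResetAllowedStrict(caller, target) {
  if (target.org_id !== caller.org_id) {
    return "cannot reset a password outside your organization";
  }
  return checkPasswordResetAllowed(caller, target);
}

// Illustrative accounts; IDs and org values are placeholders.
const bob = { id: 101, org_id: "marrow", role: "admin" };
const mallory = { id: 202, org_id: "pith", role: "member" };
const marrowMember = { id: 102, org_id: "marrow", role: "member" };

const results = {
  originalCrossTenant: checkPasswordResetAllowed(bob, mallory),
  strictCrossTenant: checkPasswordResetAllowedStrict(bob, mallory),
  strictSameTenant: checkPasswordResetAllowedStrict(bob, marrowMember),
};
console.log(results);
```

With only one account, or two accounts in the same org, the first call never exercises the cross-tenant path, which is why the test needs an identity on each side of the boundary.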
Guardrails Made the Reset Safe
A password reset changes state. The guardrail blocked reset attempts until the target user ID was proved in scope.
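A scope guardrail of this kind can be sketched as a small gate in front of state-changing actions. Hound's internals are not shown in this write-up, so everything here is an assumption for illustration: the `createGuardrail` helper, the action shape, and the user IDs are all hypothetical.

```javascript
// Hypothetical guardrail: a state-changing action needs every target proved in scope.
function createGuardrail(approvedUsernames) {
  const provenIds = new Map(); // user ID -> approved username

  return {
    // Record evidence tying a user ID to an approved account, for example
    // an ID observed in that account's own authenticated session.
    proveInScope(userId, username) {
      if (!approvedUsernames.has(username)) {
        throw new Error(`${username} is not an approved account`);
      }
      provenIds.set(userId, username);
    },
    // Allow the action only if every target ID has been proved in scope.
    check(action) {
      const unproven = action.targetUserIds.filter((id) => !provenIds.has(id));
      if (unproven.length > 0) {
        return { allowed: false, reason: `unproven targets: ${unproven.join(", ")}` };
      }
      return { allowed: true, reason: "all targets in scope" };
    },
  };
}

const guardrail = createGuardrail(new Set(["bob", "mallory"]));

// Before any proof exists, a reset aimed at users 1 and 2 is blocked.
const blocked = guardrail.check({ kind: "password_reset", targetUserIds: [1, 2] });

// After an ID is tied to an approved account, a reset for that ID passes.
guardrail.proveInScope(7, "mallory"); // 7 is a placeholder ID
const allowed = guardrail.check({ kind: "password_reset", targetUserIds: [7] });

console.log(blocked, allowed);
```

The design choice worth noting is that the gate defaults to deny: an ID with no recorded proof is treated as out of scope, so a guess about which account an ID belongs to never triggers a reset.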
Reasoning Connected One Bug to the Next
Hound recognized that Bob's reset access could create a Mallory session, then used that session to test the admin escalation path.
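The admin escalation step in that chain comes from the profile update allowlist. The sketch below wraps the loop from Dossier's source in a hypothetical `buildProfileUpdate` helper so the two allowlists can be compared side by side. The request body is illustrative, and the two-column allowlist is one possible fix, not necessarily the one Dossier would adopt.

```javascript
// Wraps the allowlist loop from Dossier's profile update in a helper (hypothetical name).
function buildProfileUpdate(body, columns) {
  const updates = [];
  const values = [];
  for (const col of columns) {
    // Only keys the caller actually sent, and that appear in the allowlist, are applied.
    if (Object.prototype.hasOwnProperty.call(body, col)) {
      updates.push(`${col} = ?`);
      values.push(body[col]);
    }
  }
  return { updates, values };
}

// Illustrative request body sent from a member session.
const requestBody = { name: "Mallory", role: "admin" };

// Allowlist as published: role is accepted like any other profile field.
const flawed = buildProfileUpdate(requestBody, ["name", "email", "role"]);

// Hypothetical tightened allowlist: role is simply not updatable here.
const hardened = buildProfileUpdate(requestBody, ["name", "email"]);

console.log(flawed.updates);   // [ 'name = ?', 'role = ?' ]
console.log(hardened.updates); // [ 'name = ?' ]
```

Once Bob's reset produced a Mallory session, a single request of this shape was enough to turn that member session into an admin one.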
Proof From the Run.
The trace shows Hound's reasoning, the guardrail decision, and the final document reached.
Hound Connected the Two Bugs
Hound's reasoning
Bob has admin IDOR access on the password reset endpoint across organizations, which means I could reset Mallory's password directly through his admin panel since she's a hound account. That would give me an authenticated session to test the member-to-admin escalation chain.
The Guardrail Held the Line
Guardrail decision
Blocked: Command attempts to modify passwords for users 1 and 2, which are not confirmed as hound-owned accounts.
The later reset was allowed only after Mallory's user ID was proved in scope.
Hound Moved From Admin Access to Documents
Execution trace
Explore Pith org data with Mallory's escalated session
Get full content of Pith projects
Get full content of Pith reports, including confidential salary data
Hound Reached Confidential Documents
The terminal proof was Pith's confidential engineering salary review.
2026 Salary Review - Engineering
IC3 145-175k -> 160-190k
IC4 175-210k -> 195-235k
IC5 210-260k -> 240-295k
Staff 260-325k -> 295-365k
Dossier is public. View the source on GitHub.
Where Other Tools Stop.
To test this properly, a tool has to hold two identities at once, change only approved accounts, and understand when one flaw unlocks the next.
Traditional Pentests
A human tester can find this. The cost is time, scheduling, and rerunning the work after the app changes.
With Hound: Autonomous attack runs that can be repeated as the product changes.
Scanners
Scanners execute checks. They don't reason across identities, decide when state-changing resets are safe, or chain flaws together.
With Hound: Multiple approved accounts in play. The reset targeted only the in-scope account, then the new session kept the chain moving.
Other Security Agents
An agent with one login can't safely prove this type of flaw. The test requires accounts on both sides of the company boundary.
With Hound: Multiple approved accounts enable the run to test both sides of the boundary. Guardrails keep state-changing actions in scope.