Role assignments multiply faster than you expect.
Here is how we went from structured permissions to chaos, and how we fixed it.
It Starts With One Exception
You build a clean RBAC model.
Groups for teams. Roles at the right scope. Least privilege enforced.
Then someone needs access for a demo. Just this once.
You add a direct role assignment to their account.
You plan to remove it later.
You forget.
Every Exception Becomes Permanent
The demo user never gets removed.
Another developer needs temporary access for debugging. You grant it.
A contractor needs read access for a week. You grant it.
A service principal needs permissions for a migration. You grant it.
None of them get cleaned up.
Six months later, you have:
- individual user accounts with direct assignments
- orphaned service principals with Contributor access
- groups whose purpose no one remembers
- assignments at multiple scopes for the same identity
You cannot untangle it without breaking something.
The Scope Problem Makes It Worse
Azure RBAC has inheritance.
Roles assigned at the subscription level cascade to all resource groups and resources.
That sounds convenient. Until someone grants Contributor at the subscription level for a task that needed access to one resource group.
Now that identity has far more access than necessary.
And because it works, no one fixes it.
We had identities with:
- Contributor at the subscription level
- Reader at the resource group level (redundant)
- specific roles at the resource level (also redundant)
The effective permissions were impossible to audit.
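To illustrate how this kind of overlap can be surfaced, here is a minimal Python sketch. It assumes each assignment is a dictionary with `principalId`, `roleDefinitionName`, and `scope` keys, as in the JSON that `az role assignment list` prints. The helper names and the sample data are hypothetical. It only compares scope paths, so it cannot see inheritance from management groups, and it does not check whether the broader role really includes the narrower one. Treat the result as a review list, not as proof of redundancy.

```python
def is_ancestor(broad: str, narrow: str) -> bool:
    """True when `narrow` sits strictly below `broad` in the scope path."""
    b = broad.rstrip("/").lower()
    n = narrow.rstrip("/").lower()
    return n != b and n.startswith(b + "/")


def shadowed_assignments(assignments):
    """Return (narrow, broad) pairs held by the same principal.

    A pair means the principal has one assignment at a scope and another
    at an ancestor of that scope, so the narrow one deserves a review.
    """
    pairs = []
    for narrow in assignments:
        for broad in assignments:
            same_principal = narrow["principalId"] == broad["principalId"]
            if same_principal and is_ancestor(broad["scope"], narrow["scope"]):
                pairs.append((narrow, broad))
    return pairs


# Hypothetical sample data for illustration only.
SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"
sample = [
    {"principalId": "app-1", "roleDefinitionName": "Contributor", "scope": SUB},
    {"principalId": "app-1", "roleDefinitionName": "Reader",
     "scope": SUB + "/resourceGroups/rg-demo"},
    {"principalId": "app-2", "roleDefinitionName": "Reader",
     "scope": SUB + "/resourceGroups/rg-demo"},
]

for narrow, broad in shadowed_assignments(sample):
    print(narrow["principalId"], narrow["roleDefinitionName"],
          "is shadowed by", broad["roleDefinitionName"], "at", broad["scope"])
```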
No One Owns the Cleanup
Developers grant access when they need it.
They do not remove access when they are done.
Security teams see the sprawl, but they do not know which assignments are still needed.
Operations teams assume someone else is managing it.
So no one does.
We had assignments that were three years old for people who had left the company.
The Break-Fix Cycle
Eventually, something breaks.
A service stops working. Logs show “Access Denied.”
Someone adds Contributor to fix it quickly.
It works. The incident closes.
No one investigates why the original permissions were not sufficient.
No one cleans up the overly broad role after the fix.
The sprawl gets worse.
How We Audited What We Had
We had to inventory everything.
We exported every role assignment in the subscription using the Azure CLI:
az role assignment list --all
We analyzed:
- who has access
- what roles they have
- at what scope
- when it was granted
- by whom
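The five questions above map onto fields in the exported JSON. The sketch below is illustrative only. It assumes the export contains `principalName`, `principalType`, `roleDefinitionName`, `scope`, `createdOn`, and `createdBy`. Field names can vary between CLI versions, so check them against your own output. The inline sample stands in for the real export file, and its values are made up.

```python
import json
from collections import Counter

# Stand-in for the exported file; all values are hypothetical.
EXPORT = """
[
  {"principalName": "alice@example.com", "principalType": "User",
   "roleDefinitionName": "Contributor",
   "scope": "/subscriptions/0000",
   "createdOn": "2022-01-10T09:00:00+00:00", "createdBy": "admin-1"},
  {"principalName": "team-data", "principalType": "Group",
   "roleDefinitionName": "Reader",
   "scope": "/subscriptions/0000/resourceGroups/rg-data",
   "createdOn": "2023-05-02T14:30:00+00:00", "createdBy": "admin-2"}
]
"""


def scope_level(scope: str) -> str:
    """Classify a scope path by how broad it is."""
    parts = [p for p in scope.split("/") if p]
    if not parts:
        return "root"
    if parts[0].lower() == "providers":
        return "management group"
    if len(parts) == 2:
        return "subscription"
    if len(parts) == 4:
        return "resource group"
    return "resource"


def inventory(assignments):
    """Reduce each assignment to who, what role, what scope, when, by whom."""
    return [
        {
            "who": a.get("principalName") or "(unknown principal)",
            "role": a["roleDefinitionName"],
            "level": scope_level(a["scope"]),
            "granted_on": a.get("createdOn"),
            "granted_by": a.get("createdBy"),
        }
        for a in assignments
    ]


rows = inventory(json.loads(EXPORT))
levels = Counter(row["level"] for row in rows)
for row in rows:
    print(row)
print(dict(levels))
```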
We found:
- 60% of assignments were at broader scopes than necessary
- 30% of assignments were for identities that no longer existed
- 15% of assignments were duplicates at different scopes
These categories overlap, which is why the percentages add up to more than 100.
The numbers were worse than we expected.
How We Cleaned It Up
We could not just delete things.
We built a process:
- identify all assignments at subscription level
- determine if the scope could be reduced
- check sign-in activity for user accounts
- verify service principals are still in use
- contact owners for group-based assignments
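The checklist above can be sketched as a small triage function. This is an illustration under assumptions, not the original tooling. The `last_activity` mapping is hypothetical input, because sign-in data is not part of the role assignment export and has to come from your identity provider's sign-in logs. The 90-day threshold and the wording of each action are also made up for the example.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=90)  # hypothetical threshold, tune to taste


def is_subscription_scope(scope: str) -> bool:
    """True for a scope of the form /subscriptions/<id> and nothing deeper."""
    parts = [p for p in scope.split("/") if p]
    return len(parts) == 2 and parts[0].lower() == "subscriptions"


def next_step(assignment, last_activity, now):
    """Suggest a review action for one assignment."""
    if assignment["principalType"] == "Group":
        return "contact group owner"
    seen = last_activity.get(assignment["principalId"])
    if seen is None or now - seen > STALE_AFTER:
        return "no recent activity: confirm with owner, then remove"
    if is_subscription_scope(assignment["scope"]):
        return "check whether scope can be reduced"
    return "keep"


# Hypothetical sample data for illustration only.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
last_activity = {
    "u1": datetime(2024, 5, 20, tzinfo=timezone.utc),
    "sp1": datetime(2023, 1, 1, tzinfo=timezone.utc),
}
sample = [
    {"principalId": "u1", "principalType": "User", "scope": "/subscriptions/0000"},
    {"principalId": "sp1", "principalType": "ServicePrincipal",
     "scope": "/subscriptions/0000/resourceGroups/rg-app"},
    {"principalId": "g1", "principalType": "Group", "scope": "/subscriptions/0000"},
]

for a in sample:
    print(a["principalId"], "->", next_step(a, last_activity, now))
```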
We moved:
- individual user assignments to groups
- broad scope assignments to narrower scopes
- custom roles to built-in roles where possible
We deleted:
- assignments for deleted accounts
- assignments for unused service principals
- redundant assignments
We reduced our total role assignments by 45%.
Nothing broke.
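One way to keep deletions safe is to generate the commands first and review them before running anything. The sketch below does that for assignments whose identity no longer exists. It rests on an assumption you should verify in your own export: that a deleted principal shows up with an empty `principalName`. The sample IDs are hypothetical. The script only prints `az role assignment delete --ids ...` lines and never calls the CLI itself.

```python
import shlex


def is_orphaned(assignment) -> bool:
    """Assumption: a deleted principal has an empty or missing principalName."""
    return not assignment.get("principalName")


def delete_commands(assignments):
    """Build one reviewable delete command per orphaned assignment."""
    return [
        "az role assignment delete --ids " + shlex.quote(a["id"])
        for a in assignments
        if is_orphaned(a)
    ]


# Hypothetical sample data for illustration only.
BASE = "/subscriptions/0000/providers/Microsoft.Authorization/roleAssignments/"
sample = [
    {"id": BASE + "aaaa", "principalName": "alice@example.com"},
    {"id": BASE + "bbbb", "principalName": ""},
]

# Dry run: print the commands for review instead of executing them.
for command in delete_commands(sample):
    print(command)
```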
What We Enforce Now
We changed the rules:
- all role assignments go through infrastructure as code
- no direct assignments except for break-glass scenarios
- groups are mandatory for user access
- workloads use managed identities instead of manually managed service principals where possible
- each assignment is recorded with an owner, a purpose, and an expiration date
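Rules like these are easier to keep when a check runs against the assignment definitions before they are applied. The following sketch shows the idea. The entry schema (`principal_type`, `owner`, `purpose`, `expires`, `break_glass`) is hypothetical and would need to match however your infrastructure-as-code repository describes assignments.

```python
from datetime import date

REQUIRED_FIELDS = ("owner", "purpose", "expires")


def violations(entry, today):
    """Return a list of rule violations for one assignment definition."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not entry.get(name)]
    is_user = entry.get("principal_type") == "User"
    if is_user and not entry.get("break_glass", False):
        problems.append("direct user assignment: use a group")
    expires = entry.get("expires")
    if expires and date.fromisoformat(expires) < today:
        problems.append("expired")
    return problems


# Hypothetical sample data for illustration only.
today = date(2024, 6, 1)
good = {"principal_type": "Group", "owner": "team-data",
        "purpose": "read pipeline logs", "expires": "2024-12-31"}
bad = {"principal_type": "User", "owner": "alice",
       "purpose": "", "expires": "2024-01-31"}

print(violations(good, today))
print(violations(bad, today))
```

A check like this can run in the pull request pipeline so that a missing owner or an expired entry blocks the merge.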
We also built automation:
- quarterly reviews of all role assignments
- alerts for assignments at subscription scope
- automatic removal of assignments for disabled accounts
- reports of assignments older than 12 months
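Two of these checks, the subscription-scope alert and the age report, can be sketched directly against the exported assignments. This is a simplified illustration. It assumes each entry has `id`, `scope`, and an ISO 8601 `createdOn` timestamp, treats 12 months as 365 days, and uses made-up sample data. Scheduling and alert delivery are left out.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=365)  # "older than 12 months", approximated


def is_subscription_scope(scope: str) -> bool:
    """True for a scope of the form /subscriptions/<id> and nothing deeper."""
    parts = [p for p in scope.split("/") if p]
    return len(parts) == 2 and parts[0].lower() == "subscriptions"


def created_on(assignment) -> datetime:
    """Parse createdOn, accepting a trailing Z as UTC."""
    return datetime.fromisoformat(assignment["createdOn"].replace("Z", "+00:00"))


def review_report(assignments, now):
    """Group assignment IDs by the check that flagged them."""
    return {
        "subscription_scope": [
            a["id"] for a in assignments if is_subscription_scope(a["scope"])
        ],
        "older_than_12_months": [
            a["id"] for a in assignments if now - created_on(a) > MAX_AGE
        ],
    }


# Hypothetical sample data for illustration only.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample = [
    {"id": "ra-1", "scope": "/subscriptions/0000",
     "createdOn": "2022-03-01T08:00:00+00:00"},
    {"id": "ra-2", "scope": "/subscriptions/0000/resourceGroups/rg-app",
     "createdOn": "2024-04-15T08:00:00Z"},
]

report = review_report(sample, now)
for check, ids in report.items():
    print(check, ids)
```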
It is not perfect. But it stopped the chaos.
The Real Problem Is Process, Not Tools
Azure RBAC works fine.
The problem is how we used it:
- granting access quickly without thinking about scope
- never removing access
- not using groups
- not documenting why access was granted
The tools cannot fix that. Only process can.
We learned that sprawl is not a technical problem.
It is a discipline problem.
Now we treat every role assignment as debt.
You can add it. But you have to justify it. And you have to clean it up.
That mindset stopped the sprawl.