Why permission models rot over time.

Azure RBAC feels simple at first.

Assign a role. Pick a scope. Move on.

The problems show up later.

RBAC Accumulates History

Permissions tend to grow, not shrink.

Temporary access becomes permanent. Emergency grants never get revisited. Roles pile up across scopes.

Over time, no one remembers why access exists. They only remember that removing it feels risky.

I have audited Azure subscriptions where people had role assignments from three jobs ago. Former contractors still had Contributor access years after their contracts ended. Service principals created for one-off migrations still had Owner access to production.

The problem is not malice. It is inertia.

Granting access is a clear action with immediate value. Someone needs to do something. You grant them access. They can do the thing. Success.

Removing access is murky. Will removing this role assignment break something? Is this person still using it? Is there a runbook or automation that depends on it? The safest answer is to leave it alone. So it stays.

The Myth of “We’ll Clean It Up Later”

Later rarely comes.

RBAC cleanup requires:

  • understanding dependencies
  • coordinating teams
  • accepting short-term risk

So it gets postponed. Again and again.

Role Granularity Cuts Both Ways

Fine-grained roles feel safer. They are also harder to manage.

Broad roles feel convenient. They are also harder to audit.

There is no perfect choice. Only tradeoffs.

Azure has built-in roles like Reader, Contributor, and Owner. They are broad. Everyone understands them. But they often grant more permissions than necessary.

Custom roles let you define exactly what actions are allowed. You can create a role that allows starting and stopping VMs but not deleting them. Or a role that allows reading secrets from Key Vault but not modifying them.
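As a rough illustration of the first example, here is a sketch that builds a custom role definition in the JSON shape Azure uses for role definitions. The role name and the subscription ID are placeholders invented for this example. The action strings are standard Microsoft.Compute operations, but check them against your own requirements before using anything like this.

```python
import json

# Placeholder subscription ID, made up for illustration.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"


def vm_operator_role(subscription_id):
    """Build a custom role that can start and stop VMs but not delete them."""
    return {
        "Name": "VM Operator (example)",  # hypothetical role name
        "IsCustom": True,
        "Description": "Start, stop, and restart virtual machines. No delete.",
        "Actions": [
            "Microsoft.Compute/virtualMachines/read",
            "Microsoft.Compute/virtualMachines/start/action",
            "Microsoft.Compute/virtualMachines/powerOff/action",
            "Microsoft.Compute/virtualMachines/deallocate/action",
            "Microsoft.Compute/virtualMachines/restart/action",
        ],
        # The delete action is simply never listed, so it is not granted.
        "NotActions": [],
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }


role = vm_operator_role(SUBSCRIPTION_ID)
print(json.dumps(role, indent=2))
```

Saved to a file, JSON like this can be passed to `az role definition create --role-definition`. Keeping the definition in version control also gives each custom role a reviewable history.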

Custom roles sound great until you have dozens of them. Each one needs to be maintained. Each one adds cognitive load. Each one requires documentation to explain when to use it instead of the built-in roles.

We tried both extremes. Too many custom roles became unmanageable. Everyone just used Contributor because understanding the custom roles took too long. Too few custom roles meant granting overly broad access.

The middle ground was defining a small set of custom roles for common patterns and using built-in roles everywhere else. Maybe five to ten custom roles total, each with a clear name and purpose.

Change Is the Hard Part

RBAC works best when access patterns are stable.

They rarely stay that way. Teams change. Services evolve. Environments multiply.

Permissions lag behind reality.

That is when rot sets in.

Team reorganizations are the worst. Someone moves from team A to team B. They get access to team B’s resources. But they still have access to team A’s resources because no one remembered to revoke it.

Service evolution creates similar problems. A service that used to need write access to a database now only needs read access. But the role assignment still grants Contributor. No one bothers to downgrade it because it works and changing it feels risky.

New environments multiply the problem. You spin up a new staging environment. You copy the RBAC configuration from production. Now twice as many role assignments exist. When someone leaves, you have to remember to remove them from both environments.

Treating RBAC as a static configuration leads to rot. Treating it as a living thing that needs regular pruning is the only way to keep it healthy.

What Helped Us

We improved RBAC by:

  • scoping roles narrowly
  • documenting intent
  • reviewing access regularly
  • tying permissions to workload identities, not people
  • accepting that removal is part of the job

RBAC stayed imperfect, but it stayed understandable.

Narrow scoping means granting access at the resource group level instead of the subscription level whenever possible. It means using Reader plus a few targeted permissions instead of Contributor. It takes more thought up front, but it reduces the blast radius of mistakes.
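One way to make narrow scoping checkable is to classify each assignment's scope string and flag broad roles granted at a wide scope. The sketch below is an assumption about how such a check might look, not a description of our actual tooling. The `BROAD_ROLES` set and the function names are invented for illustration.

```python
# Roles treated as "broad" in this sketch; adjust to taste.
BROAD_ROLES = {"Owner", "Contributor", "User Access Administrator"}
WIDE_LEVELS = {"management_group", "subscription"}


def scope_level(scope):
    """Classify an Azure scope string by how much it covers."""
    parts = [p.lower() for p in scope.strip("/").split("/") if p]
    if parts[:3] == ["providers", "microsoft.management", "managementgroups"]:
        return "management_group"
    if not parts or parts[0] != "subscriptions":
        return "unknown"
    if len(parts) == 2:
        return "subscription"
    if len(parts) == 4 and parts[2] == "resourcegroups":
        return "resource_group"
    # Anything deeper is treated as a single resource.
    return "resource"


def is_too_broad(role_name, scope):
    """True when a broad role is granted at subscription level or above."""
    return role_name in BROAD_ROLES and scope_level(scope) in WIDE_LEVELS


print(is_too_broad("Contributor", "/subscriptions/0000"))
print(is_too_broad("Contributor", "/subscriptions/0000/resourceGroups/rg-api-prod"))
```

The first call reports a subscription-wide Contributor grant as too broad. The second, scoped to a single resource group, passes.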

Documenting intent means recording why each role assignment exists. Role assignments do not support tags, but they do have a description field, and the rationale can also live next to the assignment in your infrastructure code. “Grants team-api Contributor to rg-api-prod for deployment pipeline.” When someone reviews the RBAC configuration six months later, they understand the purpose without having to ask around.
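A small check can surface assignments that have no stated intent. This sketch assumes the assignments were exported as a list of JSON objects with a `description` key, as recent versions of `az role assignment list` produce. The sample data and the function name are made up for illustration.

```python
def undocumented(assignments):
    """Return assignments whose description is missing or blank."""
    return [a for a in assignments if not (a.get("description") or "").strip()]


# Invented sample data in the shape of an exported assignment list.
sample = [
    {
        "principalName": "team-api",
        "roleDefinitionName": "Contributor",
        "scope": "/subscriptions/0000/resourceGroups/rg-api-prod",
        "description": "Grants team-api Contributor to rg-api-prod for deployment pipeline.",
    },
    {
        "principalName": "migration-sp",
        "roleDefinitionName": "Owner",
        "scope": "/subscriptions/0000",
        "description": None,
    },
]

for a in undocumented(sample):
    print(f"No stated intent: {a['principalName']} has "
          f"{a['roleDefinitionName']} on {a['scope']}")
```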

Regular reviews became a quarterly ritual. We export all role assignments, group them by principal, and challenge anything that looks excessive or outdated. It takes a few hours every three months, but it prevents years of accumulated cruft.
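The grouping step of a review like this can be sketched in a few lines. The code below assumes an export with `principalName`, `principalId`, `roleDefinitionName`, and `scope` fields, for example from `az role assignment list --all --output json`. The sample data and the function names are invented for illustration.

```python
from collections import defaultdict


def group_by_principal(assignments):
    """Map each principal to its list of (role, scope) grants."""
    grouped = defaultdict(list)
    for a in assignments:
        # Fall back to the object ID when no display name is present.
        key = a.get("principalName") or a.get("principalId", "unknown")
        grouped[key].append((a.get("roleDefinitionName", "?"), a.get("scope", "?")))
    return dict(grouped)


def review_report(assignments):
    """Produce a plain-text report, one block per principal."""
    lines = []
    for principal, grants in sorted(group_by_principal(assignments).items()):
        lines.append(f"{principal}: {len(grants)} assignment(s)")
        for role_name, scope in sorted(grants):
            lines.append(f"  {role_name} on {scope}")
    return lines


# Invented sample data standing in for a real export.
sample_export = [
    {"principalName": "alice@example.com", "principalId": "1",
     "roleDefinitionName": "Reader", "scope": "/subscriptions/0000"},
    {"principalName": "alice@example.com", "principalId": "1",
     "roleDefinitionName": "Contributor",
     "scope": "/subscriptions/0000/resourceGroups/rg-team-a"},
    {"principalName": "", "principalId": "9f3c",
     "roleDefinitionName": "Owner", "scope": "/subscriptions/0000"},
]

grouped = group_by_principal(sample_export)
report = review_report(sample_export)
print("\n".join(report))
```

Seeing every grant for one principal side by side makes leftovers easier to spot. A principal with no resolvable name, like the third entry in the sample, is worth a closer look.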

Tying permissions to managed identities instead of user accounts helps with automation and service access. A managed identity does not go on vacation or leave the company. The access is tied to the workload, not the person who set it up.

Accepting removal as part of the job was the cultural shift. Revoking access is not mean. It is maintenance. We celebrate cleanup the same way we celebrate new features.

Final Thought

Azure RBAC is not hard to set up.

It is hard to maintain.

Treating it as a living system is the only way it stays healthy.
