Our CI/CD Pipeline Started Simple and Still Got Complicated

Complexity always finds a way in. Our CI/CD pipeline started with a few scripts. Build the code. Run the tests. Deploy the artifact. It worked well. For a while. Then requirements accumulated. Quietly. Complexity Did Not Arrive All at Once. No one decided to make the pipeline complicated. It happened incrementally: add a security scan, support another environment, introduce feature flags, handle hotfixes, add approvals, support rollback. Each change made sense in isolation. ...

June 5, 2025 · 4 min · Jose Rodriguez

Designing Systems People Can Actually Support

Systems are often designed for correctness and performance. Supportability comes later. If at all. That ordering is backwards. Supportability Is a Feature. If a system cannot be understood under pressure, it is incomplete. Supportable systems have: clear boundaries, predictable behavior, obvious ownership, simple failure modes, and visible state. These are design choices. Clear boundaries mean you know where one service ends and another begins. You can reason about dependencies. You can isolate failures. You can answer “is this my problem or someone else’s” quickly. ...

May 25, 2025 · 6 min · Jose Rodriguez

Alert Fatigue Is an Organizational Problem

Alert fatigue is often blamed on tooling. Too many alerts. Bad thresholds. Noisy systems. Those things matter, but they are symptoms. Alert fatigue is an organizational problem. Alerts Reflect What the Org Cares About. Every alert encodes a value judgment. This matters. This is urgent. Someone should wake up. When everything is urgent, nothing is. Organizations that cannot agree on priorities produce alert storms. The system is only reflecting that confusion. ...

May 10, 2025 · 6 min · Jose Rodriguez

Why Runbooks Failed Us

We invested heavily in runbooks. We documented scenarios. We listed steps. We added screenshots. And during incidents, people still asked for help. Runbooks Assume the Wrong Thing. Most runbooks assume that the problem is known. “If X happens, do Y.” Real incidents rarely look like that. Instead, people are asking: what changed, what is failing, whether this is new or known, who else is affected. Runbooks start too late. The typical runbook structure is procedural. It starts with a symptom and walks through resolution steps. “If CPU is high, restart the service. If that does not work, check for memory leaks. If memory is fine, scale horizontally.” ...

April 25, 2025 · 5 min · Jose Rodriguez

On Call Is a Product of Architecture

On call is often treated like a staffing problem. Who is rotating. How often pages fire. Which alerts wake people up. Those details matter, but they are not the root cause. On call quality is largely a product of architecture. Architecture Decides Who Gets Woken Up. Every architectural decision carries operational weight. Synchronous dependencies increase blast radius. Tight coupling turns small failures into outages. Hidden retries create noisy cascades. Poor isolation spreads pain across services. ...

April 10, 2025 · 5 min · Jose Rodriguez

Secrets Are Configuration, Not Infrastructure

Where Key Vault belongs and where it does not. Secrets often get treated like infrastructure. They are stored with infra. Managed by infra. Reviewed with infra. That is usually a mistake. Why Secrets Feel Like Infrastructure. Secrets feel permanent. They feel critical. They feel risky. So they end up bundled with infrastructure decisions. But secrets change more often than infrastructure. They also belong closer to applications. Infrastructure teams often manage Key Vault because it lives in Azure alongside virtual networks, storage accounts, and databases. It gets deployed with Terraform or Bicep. It has firewall rules and access policies. It looks and feels like infrastructure. ...

March 25, 2025 · 4 min · Jose Rodriguez

Azure RBAC Is Easy Until You Need to Change It

Why permission models rot over time. Azure RBAC feels simple at first. Assign a role. Pick a scope. Move on. The problems show up later. RBAC Accumulates History. Permissions tend to grow, not shrink. Temporary access becomes permanent. Emergency grants never get revisited. Roles pile up across scopes. Over time, no one remembers why access exists. They only remember that removing it feels risky. I have audited Azure subscriptions where people had role assignments from three jobs ago. Former contractors still had Contributor access years after their contracts ended. Service principals created for one-off migrations still had Owner access to production. ...

March 10, 2025 · 4 min · Jose Rodriguez

Giving Engineers Access Without Creating a Security Incident

Practical IAM, not zero trust theater. Access control often swings between two extremes. Everything open. Everything locked down. Neither works. Why Overly Restrictive IAM Fails. When access is too hard to get: engineers work around it, secrets get shared, permissions creep quietly, reviews become rubber stamps. Security that blocks work does not create safety. It creates shadow systems. I have seen this pattern repeat across multiple teams. Access requests take days or weeks to get approved. The approval process requires three levels of sign-off, none of which understand the technical need. Engineers get frustrated and find workarounds. ...

February 20, 2025 · 4 min · Jose Rodriguez

Key Vault Is Not a Dumping Ground

How secret sprawl happens and how to stop it. Key Vault feels deceptively simple. If something is sensitive, put it in the vault. Problem solved. That logic is how secret sprawl starts. How the Vault Becomes a Junk Drawer. It usually begins with good intentions. A new service needs a secret. A developer adds it to Key Vault. Permissions are granted. Everyone moves on. Repeat this enough times and suddenly: ...

February 5, 2025 · 4 min · Jose Rodriguez

Managed Identity Solved Problems We Did Not Know We Had

The quiet upgrade most teams underestimate. When we first adopted Managed Identity, it felt incremental. No big architecture change. No dramatic security announcement. Just fewer secrets. What surprised us was not what it replaced. It was what it quietly removed. The Problems We Thought We Had. Before Managed Identity, most of our security conversations focused on symptoms: rotating credentials, expiring secrets, leaked connection strings, confusing access reviews. We assumed these were the core problems. ...

January 15, 2025 · 4 min · Jose Rodriguez