Billing Support × Billing Engineering
Aligning on a move from case-by-case escalations to incident-based management · Daniel Anselmo (Global Lead, Billing Customer Incident Support) & Stijn Zweegers (Senior Billing Engineering Manager)
With new leadership over the FINTECH team, we opened a joint review of the billing escalation process — moving from a rough, individual understanding to a concrete plan for reducing unnecessary escalations and improving alignment between Customer Incident Support (CSUP) and Billing Engineering.
Today, cases are escalated individually even when the workaround or root cause is already known. That consumes capacity better spent on root-cause fixes and innovation. An efficient earlier model (Jira epics + tables of contents for bulk fixes) was discontinued for unknown reasons.
flowchart LR
subgraph CUR["CURRENT — case-by-case"]
direction TB
A1["Customer cases"] --> A2["Individual escalation<br/>per case"]
A2 --> A3["Billing Eng repeats<br/>an already-known fix"]
A3 --> A4["Hundreds of duplicate<br/>escalations per issue"]
end
subgraph TGT["TARGET — incident-based"]
direction TB
B1["Customer cases"] --> B2["Confirm issue once"]
B2 --> B3["Incident record +<br/>public status page"]
B3 --> B4["RCA ticket: root cause,<br/>fix ETA, data cleanup, severity"]
B4 --> B5["Customers informed →<br/>fewer new cases"]
end
CUR -.->|"shift focus"| TGT
classDef cur fill:#fce8e6,stroke:#ea4335
classDef tgt fill:#e6f4ea,stroke:#34a853
class A1,A2,A3,A4 cur
class B1,B2,B3,B4,B5 tgt
Once an issue is confirmed, shift from individual escalations to an incident with public status updates — keeping customers informed and preventing further case creation. CSUP effort refocuses on investigation, status, and root-cause identification.
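The routing decision described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Incident`, `route_case`, the `issue_key` field, and the status page URL are hypothetical names, not part of any existing CSUP or Billing Engineering tooling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Incident:
    issue_key: str                  # hypothetical identifier for a confirmed issue
    status_page_url: str            # hypothetical link to the public status update
    linked_cases: List[str] = field(default_factory=list)


def route_case(case_id: str, issue_key: Optional[str],
               incidents: Dict[str, Incident]) -> str:
    """Link a case to an open incident if its issue is already confirmed;
    otherwise fall back to an individual escalation."""
    incident = incidents.get(issue_key) if issue_key else None
    if incident is None:
        # Unknown or unconfirmed issue: still needs individual investigation.
        return "escalate"
    # Confirmed issue: record the case on the incident, no new escalation.
    incident.linked_cases.append(case_id)
    return "link_to_incident"
```

With this shape, each additional case for a confirmed issue adds one entry to the incident's case list rather than one more escalation for Billing Engineering.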
A formal policy distinguishing a bug from an incident: cosmetic/visual issues do not warrant an incident; anything blocking subscriptions or payments does. This keeps incident volume at a sustainable level.
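As a starting point for the policy draft, the two rules stated above could be written as a small decision function. The field names are hypothetical, and the `needs_triage` outcome is an assumption: the policy as described here does not yet say how to treat issues that are neither cosmetic nor blocking.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    blocks_subscriptions: bool   # hypothetical flag set during case investigation
    blocks_payments: bool        # hypothetical flag set during case investigation
    cosmetic_only: bool          # purely visual, no functional impact


def classify(issue: Issue) -> str:
    """Apply the bug-vs-incident rules: blocking issues become incidents,
    cosmetic issues stay bugs."""
    if issue.blocks_subscriptions or issue.blocks_payments:
        return "incident"
    if issue.cosmetic_only:
        return "bug"
    # Assumption: the middle ground is undefined in the policy, so flag it
    # for a human decision instead of guessing.
    return "needs_triage"
```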
Every root-cause bug gets a dedicated RCA ticket (code issue, fix timeline, data cleanup, severity). Prioritization weights:
Customer size · Hybrid / enterprise · Customer age · Total impact
So critical enterprise clients (e.g. Notion) are addressed first. Escalation count alone has proven an incomplete metric.
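One possible shape for the prioritization framework is a weighted score over the four factors. The sketch below is illustrative only: the weight values, the 0 to 1 normalization, the ten-year cap on customer age, and the `RcaTicket` fields are all assumptions, since only the factors themselves have been agreed.

```python
from dataclasses import dataclass

# Hypothetical weights: the factors come from the framework, the values do not.
WEIGHTS = {"size": 0.3, "segment": 0.3, "age": 0.1, "impact": 0.3}


@dataclass
class RcaTicket:
    key: str
    customer_size: float          # assumed normalized to 0..1
    enterprise_or_hybrid: bool    # segment flag
    customer_age_years: float
    total_impact: float           # assumed normalized to 0..1
    escalation_count: int         # kept for reference, deliberately not scored


def priority(ticket: RcaTicket) -> float:
    """Weighted sum of the four factors; higher means fix sooner."""
    age = min(ticket.customer_age_years / 10.0, 1.0)  # assumed cap at 10 years
    segment = 1.0 if ticket.enterprise_or_hybrid else 0.0
    return (WEIGHTS["size"] * ticket.customer_size
            + WEIGHTS["segment"] * segment
            + WEIGHTS["age"] * age
            + WEIGHTS["impact"] * ticket.total_impact)


def rank(tickets):
    """Return tickets ordered from highest to lowest priority."""
    return sorted(tickets, key=priority, reverse=True)
```

Under these assumed weights, a ticket affecting a large enterprise customer with 3 escalations ranks above a ticket affecting a small customer with 200, which is the behavior escalation count alone cannot produce.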
A safe, controlled UI for the team to action accounts (e.g. clearing billing flags) without exposing high-risk commands. Enables autonomous resolution, reduces engineering load, and speeds resolution times.
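A common way to build this kind of controlled surface is an allowlist: the UI can only trigger actions that have been explicitly registered, and everything else is rejected. The sketch below assumes a dictionary-shaped account record and adds an audit log as an extra safeguard; `SAFE_ACTIONS`, `run_action`, and the field names are hypothetical, not an existing internal API.

```python
from typing import Callable, Dict, List, Tuple

Account = dict  # hypothetical stand-in for an account record


def clear_billing_flag(account: Account) -> Account:
    """Return a copy of the account with its billing flag cleared."""
    return {**account, "billing_flag": False}


# Only actions registered here can be triggered from the UI.
SAFE_ACTIONS: Dict[str, Callable[[Account], Account]] = {
    "clear_billing_flag": clear_billing_flag,
}


def run_action(name: str, account: Account, actor: str,
               audit_log: List[Tuple[str, str, str]]) -> Account:
    """Run an allowlisted action and record who ran it on which account."""
    action = SAFE_ACTIONS.get(name)
    if action is None:
        # High-risk or unknown commands are never reachable from this path.
        raise PermissionError(f"action not allowlisted: {name}")
    audit_log.append((actor, name, account["id"]))  # assumed audit requirement
    return action(account)
```

Adding a new capability then means reviewing and registering one function, rather than widening access to the underlying commands.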
CSUP currently lacks visibility into the subscription lifecycle, Ninja Panel, and Stripe, so the team has to hunt down individuals for answers. Documentation, training, and system access are needed to operate as a true frontline engineering team.
| Action | Owner |
|---|---|
| Refine the escalation → incident transition | CSUP + Eng (w/ Leslie) |
| Draft the bug-vs-incident policy | Joint |
| Build prioritization framework (size, segment, age, impact) | Billing Eng |
| Continue Town classification & Goa anomaly detection | Billing Eng |
| Scope a controlled CSUP account-actioning UI | Joint |
| Address CSUP tooling, documentation & training gaps | CSUP |
Fewer duplicate escalations, faster resolution, customers proactively informed, and engineering freed to fix root causes — with CSUP operating as an empowered frontline team rather than a relay for repetitive escalations.