A Tale of Accidental Architecture: How 50 Lines Became a Black Friday Disaster

Let me tell you about Sarah.

This is a fictional story. But I bet you’ll recognize it.

I’ve seen this pattern play out across different companies, different teams, different tech stacks. The details change. The progression doesn’t.

Week 1: The Perfect Start

Sarah’s building a notification system for an e-commerce platform.

First requirement: send an email when someone places an order.

Simple. She writes one function. Webhook comes in, format the email, hit SMTP, done.

The whole thing is maybe 50 lines. It works perfectly. Code review approves it. It ships.
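For concreteness, here is roughly what a Week 1 handler like this might look like. It's a sketch, not Sarah's actual code: the payload fields, the function names, the sender address, and the SMTP host are all assumptions made up for illustration.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "localhost"        # assumed; a real deployment would read this from config
SENDER = "orders@example.com"  # hypothetical sender address


def format_order_email(order: dict) -> EmailMessage:
    """Build the confirmation email from a (hypothetical) order webhook payload."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = order["customer_email"]
    msg["Subject"] = f"Order {order['order_id']} confirmed"
    msg.set_content(
        f"Hi {order['customer_name']},\n\n"
        f"Thanks for your order. Total: ${order['total']:.2f}\n"
    )
    return msg


def handle_order_placed(order: dict, smtp_factory=smtplib.SMTP) -> None:
    """Webhook comes in, format the email, hit SMTP, done."""
    msg = format_order_email(order)
    # smtp_factory is injectable only so the sketch can be exercised without a mail server.
    with smtp_factory(SMTP_HOST) as smtp:
        smtp.send_message(msg)
```

Formatting, transport, and configuration all live in one place. At this size, that's a feature.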

Sarah’s thinking: “It’s just one notification type. I’ll add proper abstraction when we actually need it.”

You’ve thought this too. So have I.

Nothing wrong with it. Week 1, this is the right call.

Week 3: The First Copy-Paste

Product team loves the email notifications. Now they want SMS for order shipments.

Mike picks up the ticket.

He opens Sarah’s code. Sees the pattern. Makes sense. He follows it.

New handler. Receives the shipment webhook. Formats the SMS message. Connects to Twilio. Sends it.

He copies some of Sarah’s email formatting logic because customers should see consistent information. Has to adjust it for the 160-character SMS limit, but the core logic is the same.

Mike’s thinking: “There’s some duplication with the email code, but SMS is different enough that abstracting it would be premature. It’s only two notification types.”

Deadline is tomorrow. This ships.

Still nothing catastrophically wrong here. Two types, small duplication, it’s manageable.

Right?

Week 5: User Preferences

Customers start complaining.

“I don’t want SMS notifications.”

“Why am I getting emails for every status change?”

Sarah adds user preferences. Creates a database table. Updates her email handler to check if the user wants that particular notification before sending.

The handler triples in size.

Query the database. Check multiple preference flags. Handle the case where preferences don’t exist yet. Default values. Edge cases.
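The lookup itself is small. A minimal sketch, assuming a hypothetical `notification_preferences` table with one row per user, channel, and event, and assuming that a missing row means "enabled":

```python
import sqlite3


def wants_notification(db: sqlite3.Connection, user_id: int, channel: str, event: str) -> bool:
    """Return True if the user should receive this notification.

    Users with no stored preference fall back to the default (enabled).
    """
    row = db.execute(
        "SELECT enabled FROM notification_preferences "
        "WHERE user_id = ? AND channel = ? AND event = ?",
        (user_id, channel, event),
    ).fetchone()
    if row is None:
        return True  # assumed default: opted in until the user says otherwise
    return bool(row[0])
```

The trouble isn't this function. It's that the same check, with slightly different defaults, ends up pasted inline into every handler.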

Sarah’s thinking: “This is getting messy, but the deadline is tomorrow and this works. I’ll refactor it next sprint.”

I cannot tell you how many times I’ve heard “next sprint.”

(Spoiler: next sprint never comes.)


Week 7: Two Ways to Do Everything

Mike needs to add notifications for order cancellations and delivery confirmations.

He realizes hardcoding message bodies isn’t going to scale.

So he builds a template system. Creates a templates directory. Writes a simple renderer. Updates his handlers to load templates, populate data, send.

It’s actually pretty clean.

Meanwhile, Sarah’s handlers still use string formatting. She doesn’t know Mike built a template system. Mike didn’t announce it in Slack. It just… exists now.

The codebase now has two different ways of generating notification content.

Sarah finds out later. Thinks: “I should probably switch to Mike’s templates… but my code is working and I’m slammed with other features.”

And she is. Three new features this sprint. No time to refactor working code.

Week 9: The Third Approach

Emma joins the team.

First task: add Slack notifications for the support team when high-value orders come in.

She opens the notification code. Finds Sarah’s inline approach. Finds Mike’s templates. Neither makes sense for Slack.

Slack needs structured JSON payloads, not formatted text.

So Emma does what any good engineer would do: she creates a “proper solution”.

Notification service class. Methods for each notification type. Handles destination-specific formatting internally. Clean. Testable. Well-designed.
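The shape of such a class might look like the sketch below. The `Order` fields, method names, and injected transports are assumptions for illustration; the only external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_name: str
    total: float


class NotificationService:
    """One method per notification type; channel formatting stays internal."""

    def __init__(self, send_slack, send_email):
        # Transports are injected so the class can be tested without a network.
        self._send_slack = send_slack
        self._send_email = send_email

    def notify_high_value_order(self, order: Order) -> None:
        self._send_slack(self._format_slack(order))

    def notify_order_placed(self, order: Order, to: str) -> None:
        self._send_email(to, self._format_text(order))

    @staticmethod
    def _format_text(order: Order) -> str:
        # Plain text for channels like email or SMS.
        return f"Order {order.order_id} for {order.customer_name}: ${order.total:.2f}"

    @staticmethod
    def _format_slack(order: Order) -> str:
        # Slack wants structured JSON, not formatted text.
        return json.dumps(
            {"text": f"High-value order {order.order_id} (${order.total:.2f})"}
        )
```

Callers say what happened. The service decides how each destination hears about it.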

She shows it to the team in standup.

Mike: “That’s nice, but I don’t have time to refactor my SMS code right now. Maybe later.”

Sarah: “I like it, but my code has been running in production for months. If it ain’t broke…”

Emma’s service class gets used for Slack notifications. Nothing else changes.

Now there are three ways to send notifications.


Week 12: The Chaos Compounds

Product wants:

  • Push notifications for the mobile app
  • Digest emails (daily order summaries)
  • Ability to snooze notifications

Three developers. Three features. Same week.

Each one discovers the existing fragmentation. Each one makes their own call.

Developer A tries to extend Sarah’s inline approach. Adds push notification logic directly in the handler.

Developer B uses Mike’s templates but creates a new template format because the existing one doesn’t support digest layouts.

Developer C tries to use Emma’s service class but realizes it doesn’t handle scheduling or snoozing. So they add that logic directly in their handler instead.

The notification preferences table is now being updated by five different code paths.

Each developer added their own columns because they didn’t realize others had added similar fields. One stores preferences as JSON. Another uses boolean columns. Another created a separate preferences table with foreign keys.

I’ve seen this code review happen. Every PR gets approved. Every piece of code works.

Nobody did anything wrong.

And yet.




Week 15: Customer Complaints

Support tickets start flooding in.

“I’m getting duplicate notifications.”

“I disabled email but I’m still getting them.”

“I’m not getting notifications at all for important orders.”

Sarah investigates. Opens the codebase.

Six different code paths handle notifications. Some check preferences before sending. Some check during sending. Some don’t check at all because the developer assumed another layer was handling it.

She finds the bug. It’s in her original email handler. The preference check is wrong.

She fixes it. Deploys.

Three other notification types break. They were relying on her buggy behavior.


The team’s verdict on fixing it properly: “We need to stop and refactor everything first, or we’ll just make it worse.”

Management: “We don’t have time for a refactor. Just fix the bugs.”

Week 17: The Template Nightmare

Marketing wants to update email designs. New brand guidelines.

The developer assigned to this opens the codebase.

Templates are everywhere.

Some in a /templates directory. Some hardcoded as strings. Some in the database. Some fetched from an external CMS that one developer integrated without telling anyone.

There’s no single source of truth.

Worse: the data passed to templates is completely inconsistent.

Email templates expect order objects with certain fields. SMS templates expect a flattened structure. Push notifications expect a completely different format.

One design change requires touching dozens of files.

The developer estimates: “Two weeks, maybe three.”

Marketing: “It’s just a design update. How is that two weeks?”

Week 20: Performance Crisis

Black Friday.

The system crashes.


Investigation reveals: notification handlers are opening new database connections for every single notification sent.

Some handlers properly close connections. Some don’t.

Connection pools exhausted. Some handlers retry failed sends immediately and indefinitely, amplifying the problem during the outage. One handler spawns a new thread for each notification but never limits concurrency.

The server runs out of memory processing a batch of 10,000 order confirmations.
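None of these failure modes is exotic. A sketch of the opposite choices, a fixed-size worker pool and a bounded retry with exponential backoff, fits in a few lines. The pool size, the delays, and the function names are assumptions for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8              # assumed cap; tune to what the provider tolerates
BACKOFF_SECONDS = (1, 2, 4)  # wait before each retry, then give up


def send_with_backoff(send, notification, sleep=time.sleep) -> bool:
    """Try once, then retry after each backoff delay. Returns True on success."""
    for delay in (0, *BACKOFF_SECONDS):
        if delay:
            sleep(delay)
        try:
            send(notification)
            return True
        except Exception:
            continue  # a real handler would log the error here
    return False


def send_batch(send, notifications, sleep=time.sleep) -> list:
    """Send a batch through a fixed-size pool; return the notifications that failed."""
    notifications = list(notifications)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        results = list(pool.map(lambda n: send_with_backoff(send, n, sleep), notifications))
    return [n for n, ok in zip(notifications, results) if not ok]
```

A batch of 10,000 confirmations now means at most eight sends in flight, and a dead provider costs four attempts per message instead of an infinite loop. What to do with the returned failures is a separate decision, but at least it's made in one place.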

Different developers made different assumptions about error handling.

Some swallow errors and only log them. Some retry with exponential backoff. Some fail fast. Some store failed notifications in one database table for retry. Others use a different table. One developer integrated a third-party queue system that nobody else knew existed.

Notifications are getting lost between these systems.

I’ve been on calls where the CTO asks: “How many notification systems do we have?”

Nobody can answer.

Week 24: The Audit

Compliance team asks a simple question:

“Can you show us a record of all notifications sent to customer X in the past 90 days?”

The team cannot answer this.

Notification logs are scattered everywhere.

Some handlers log to stdout. Some to files. Some to a database table. Some don’t log at all.

The log formats are completely different. Some include the full message content. Some just log “notification sent” without details. There’s no correlation between the notification and the triggering event.

The auditor asks: “How do you ensure notifications contain required legal disclosures?”

Each template was created independently. Some include required legal text. Some don’t. There’s no centralized enforcement.

I’ve seen this audit happen. Teams spend weeks reconstructing logs manually.
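With a single delivery log, the auditor's first question is one query. A minimal sketch using SQLite; the table layout, column names, and helper functions are assumptions, not a prescribed schema:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS notification_log (
    id             INTEGER PRIMARY KEY,
    correlation_id TEXT NOT NULL,   -- ties the send to the triggering event
    customer_id    TEXT NOT NULL,
    channel        TEXT NOT NULL,   -- email, sms, push, slack
    status         TEXT NOT NULL,   -- sent or failed
    body           TEXT NOT NULL,   -- full message content
    sent_at        TEXT NOT NULL    -- ISO 8601, always UTC
)
"""


def log_delivery(db, correlation_id, customer_id, channel, status, body, sent_at=None):
    """Record one delivery attempt, success or failure."""
    sent_at = sent_at or datetime.now(timezone.utc)
    db.execute(
        "INSERT INTO notification_log "
        "(correlation_id, customer_id, channel, status, body, sent_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (correlation_id, customer_id, channel, status, body,
         sent_at.isoformat(timespec="seconds")),
    )


def notifications_for(db, customer_id, days=90, now=None):
    """Everything logged for one customer in the last `days` days, oldest first."""
    now = now or datetime.now(timezone.utc)
    # All timestamps are UTC in one format, so string order matches time order.
    cutoff = (now - timedelta(days=days)).isoformat(timespec="seconds")
    return db.execute(
        "SELECT sent_at, channel, status, correlation_id FROM notification_log "
        "WHERE customer_id = ? AND sent_at >= ? ORDER BY sent_at",
        (customer_id, cutoff),
    ).fetchall()
```

The schema is boring on purpose. The value comes from every handler writing to it, which is exactly what the team in the story never agreed on.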

The Breaking Point

VP of Engineering asks for a simple feature:

“Add an unsubscribe link to all emails.”

The team estimates: Three weeks.

The VP is shocked.


“It’s just adding a link. How is that three weeks of work?”

The tech lead explains:

“We have seven different code paths that send emails. Each uses a different templating system. Some render templates on the server. Some fetch them from external systems. Some are hardcoded strings. We need to update each one individually, ensure the unsubscribe logic is consistent across all of them, add tracking for unsubscribe events, update the preferences system to handle unsubscribes properly, and test everything thoroughly because there’s no centralized testing strategy.”

Three weeks. For a link.

The VP asks the obvious question: “How did it get this bad?”

What Went Wrong?

Here’s the thing that kills me about this story.

Nobody made a catastrophically bad decision.

Sarah’s Week 1 implementation was appropriate. Mike’s template system was a reasonable improvement. Emma’s service class was a genuine attempt to bring order.

Every single developer was trying to do good work under deadline pressure.

The problem wasn’t the individual decisions.

It was the absence of a shared architectural vision.

Without clear boundaries and layers, each developer made reasonable local optimizations that created global chaos.

The “I’ll refactor it later” moments never came because there was never a good time to stop feature development.

The “let’s standardize this” conversations happened but never resulted in action because no one had time to migrate existing code.

The codebase evolved organically.

And organic growth without structure doesn’t produce a garden. It produces a weed-infested lot.

“But This Is Just a Communication Problem”

You might be thinking: the real issue was that developers didn’t communicate.

If Sarah and Mike had talked, they wouldn’t have ended up with two different ways of generating content. If Emma had socialized her service class better, others would have adopted it.

Better standups. Better code reviews. Better documentation. That’s what was missing, not architecture.

This is seductive because it’s partially true.

But here’s why it misses the point: architecture IS communication.




It’s the most important form of communication for technical decisions.

Think about what actually happened in the story.

The team DID communicate. Mike showed his template system in code review. Emma presented her service class and got positive feedback. They had a meeting in Week 11 trying to align on standards.

The communication happened.

What didn’t happen was turning those conversations into durable, enforceable decisions.

This is the key difference:

Conversation says “we should probably do X.”

Architecture says “X is how we do things here, and here’s where it lives.”


“Conversation is ephemeral. Architecture is the artifact that persists after the meeting ends.”


When a new developer joins and asks “where should notification logic go?”, the answer shouldn’t require scheduling a meeting or hunting through Slack history.

It should be obvious from looking at the codebase.

Communication without architecture leads to the problem Emma faced. She built something good. People agreed it was good. And then… nothing changed.

Without architectural decisions being explicitly made (“from now on, all notifications go through NotificationService”), the good idea just becomes another option in an increasingly fragmented codebase.

Good communication can prevent chaos. But it can’t survive bad processes.

When developers are under deadline pressure, working on different features, joining the team at different times, communication will have gaps.

Architecture is the safety net for when communication fails.

It’s the shared context that makes it possible to work somewhat independently without creating complete divergence.

So yes, the team in our story could have communicated better.

But the solution isn’t “communicate more.”

It’s “communicate the architecture and make it stick.”

Document where things belong. Make architectural decisions explicit. Enforce them in code review. Build structure that persists beyond any individual conversation.
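One way to make a decision stick is to turn it into a check that fails the build. The sketch below assumes a hypothetical rule ("only `notification_service.py` may import `smtplib`") and scans a source tree for violations using the standard `ast` module:

```python
import ast
from pathlib import Path

# Assumed rule: only the notification service may talk to transports directly.
RESTRICTED_MODULES = {"smtplib"}
ALLOWED_FILES = {"notification_service.py"}


def find_violations(root: Path) -> list[str]:
    """Return 'path: module' for every restricted import outside the allowed files."""
    violations = []
    for path in sorted(root.rglob("*.py")):
        if path.name in ALLOWED_FILES:
            continue
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            for name in names:
                # Compare the top-level package, so "smtplib.foo" also counts.
                if name.split(".")[0] in RESTRICTED_MODULES:
                    violations.append(f"{path.relative_to(root)}: {name}")
    return violations
```

Wrapped in a test that asserts `find_violations(...)` returns an empty list, this runs on every PR. The new developer doesn't need to have been in the meeting. The failing check tells them where email sending lives.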

Because at the end of the day, you can have all the Slack channels and standups and retros you want.

Without a shared architectural foundation, you’re just having the same conversations over and over while the codebase continues to fragment.


What Should Have Happened in Week 1

Sarah should have spent 30 minutes writing this:

# Notification System Architecture

## Where Things Live
- All notification logic → services/notification_service.py
- Templates → templates/ directory (Jinja2 format)
- Preference checks → services/preference_service.py
- Delivery logging → notification_log table

## How to Add a New Notification Type
1. Add template to templates/
2. Add method to NotificationService
3. Log delivery attempt (success or failure)
4. Add tests to test_notification_service.py

## Error Handling
- Retries: up to 3, with exponential backoff (1s, 2s, 4s)
- Failed sends → dead_letter_queue table
- All errors logged with correlation ID

## Preferences
- Check preferences BEFORE sending (not during)
- Default: all notifications enabled
- Unsubscribe → set all preferences to false

That’s it. 30 minutes of work. Would have saved months of chaos.


When Mike added SMS in Week 3, he would have known where to put it. When Emma added Slack in Week 9, she would have followed the existing pattern. When three developers worked simultaneously in Week 12, they would have made consistent decisions.

Not because they communicated better. Because the architecture communicated for them.
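To show how little code those rules imply, here is a sketch of the single send path the document describes: check preferences before sending, send, log every attempt. The class and method names are assumptions, and the in-memory sets and list stand in for the real preference store and log table.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    """In-memory stand-in for preference_service; default is everything enabled."""
    disabled: set = field(default_factory=set)      # (user_id, channel) pairs
    unsubscribed: set = field(default_factory=set)  # user_ids opted out of everything

    def allows(self, user_id: str, channel: str) -> bool:
        return user_id not in self.unsubscribed and (user_id, channel) not in self.disabled

    def unsubscribe(self, user_id: str) -> None:
        # Same effect as setting every preference to false.
        self.unsubscribed.add(user_id)


@dataclass
class NotificationService:
    preferences: Preferences
    transports: dict                         # channel name -> callable(user_id, body)
    log: list = field(default_factory=list)  # stand-in for the notification_log table

    def send(self, user_id: str, channel: str, body: str) -> str:
        # Preferences are checked BEFORE sending, in exactly one place.
        if not self.preferences.allows(user_id, channel):
            status = "skipped"
        else:
            try:
                self.transports[channel](user_id, body)
                status = "sent"
            except Exception:
                status = "failed"
        # Every attempt is logged, whatever the outcome.
        self.log.append((user_id, channel, status))
        return status
```

Adding SMS, Slack, or push means registering one more entry in `transports`. The preference check and the logging come along for free, because there is only one door.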


The Pattern You’ve Seen Before

I’ve seen this exact pattern play out at least a dozen times.

Different companies. Different tech stacks. Different teams. Different features.

The pattern is always the same.

Week 1: Clean, working code.

Week 3: Small duplication appears.

Week 7: Multiple approaches emerge.

Week 12: Chaos compounds.

Month 6: Simple changes take weeks.

The timeline varies. Sometimes it happens faster (AI accelerates it). Sometimes slower (a disciplined team delays it). But without architecture, the destination is always the same.


Then AI Showed Up and Made Everything 10x Worse

Everything I just described? It’s been happening for decades.

Slow burn. Predictable. Manageable if you catch it early.

Then 2024 happened.

AI coding assistants went mainstream. And they turned architectural decay from a slow burn into a wildfire.

AI Replicates. It Doesn’t Invent.

Here’s what changed.

When Mike needed to add SMS in Week 3, he opened Sarah’s code. Looked at it. Made a decision. Maybe he copied the pattern. Maybe he tried something different.

But he thought about it.

Now imagine Mike has Cursor. Or Copilot. Or Claude Code.

He types: # Add SMS notification for shipments

The AI looks at the codebase. Sees Sarah’s pattern. Instantly replicates it.

Code appears. Mike reviews it. Looks good. Ships.

He never even saw the architectural decision being made.

The AI made it for him. Based on what already existed.


“AI doesn’t just copy your code. It copies your architecture. Even the accidental parts.”


The Speed and Scale Just Exploded

Remember Week 12? Three developers, three features, three different approaches emerging over a week?

With AI, that’s Tuesday.

Developer A asks AI for push notifications. AI sees Sarah’s inline handler. Copies it.

Developer B asks AI for digest emails. AI sees Mike’s templates. Copies those.

Developer C asks AI for snoozing. AI sees Emma’s service class. Copies that.

All three features ship the same day.

But it’s not just faster. It’s bigger.

Pre-AI: maybe 50-200 lines of code per developer per day.

With AI: 500-2000 lines in the same time.

That’s roughly 10x more code implementing patterns, creating variations, spreading duplication.

You have two ways of checking preferences? AI propagates both. Three error handling approaches? AI replicates all three. Every inconsistency becomes a seed that AI plants everywhere.

The notification system that took Sarah’s team 20 weeks to become unmaintainable?

With AI, you can get there in 4.

AI Can’t See What You Didn’t Write Down

Here’s the fundamental problem.

AI is incredible at implementation. It can write clean, working code. It follows patterns. It handles edge cases.

But it cannot architect.

It can’t look at your codebase and think: “Wait, this is getting fragmented. We should consolidate these patterns.”

It can’t say: “I see three different approaches here. Which one should I follow?”

It just… picks one. Based on similarity to what you’re asking for.

If your architecture is accidental, AI accelerates the accident.

The Old Advice Is Now Dangerous

The advice used to be: “Don’t over-architect small projects. Start simple. Refactor when you need to.”

That advice just became dangerous.

With AI, “small projects” don’t stay small. They explode.

By the time you realize you need to refactor, you have 10x more code to untangle.

The window between “clean start” and “architectural debt crisis” collapsed.

Week 1 decisions matter more than ever.

You can’t afford to defer architecture anymore.

But Here’s the Good News

The same force that amplifies chaos can amplify order.

AI replicates good patterns just as enthusiastically as bad ones.

If you write that architecture document in Week 1. If you establish clear boundaries. If you make the “right way” obvious.

AI will follow it.

Consistently. Every single time. Across every feature.

It will use your NotificationService. It will follow your template structure. It will implement your error handling exactly as specified.

At scale. At speed. Without deviation.

The chaos multiplier becomes a consistency multiplier.

But only if you give it something consistent to multiply.


“AI doesn’t make architecture optional. It makes it mandatory.”


This is why the next post matters even more now.

I’ll show you how to set up that architectural foundation before you start generating code with AI.

How to make the right patterns so obvious that AI can’t help but follow them.

How to turn AI from an architectural time bomb into an architectural enforcement mechanism.


What’s Next

In the next post, I’ll show you how to build that architectural foundation.

Not some enterprise framework. Not over-engineered complexity.

The simple, practical structure that prevents this chaos.

We’ll rebuild this exact notification system with clear boundaries, testable code, and patterns that guide developers toward consistency instead of fragmentation.

You’ll see:

  • Where things live (and why)
  • How to test without infrastructure
  • How to make architectural decisions stick
  • How AI helps instead of amplifying chaos

Until then, look at your codebase.

What week are you on?


Have you lived through this story? I’d love to hear about it. Find me on LinkedIn.
