Misconceptions your team might have during The Big Rewrite

Disclaimer: I enjoy my project and this is still a work in progress. This rant reflects my experience, not my employer’s views. Despite these troubles, I think we’re coming along nicely. 2026 edit: only after staying at the same job for this long have I finally untangled many of the issues discussed here. This entire arc of my life has been a real adventure in undoing early ‘architecture’ decisions.

I joined a team doing the big rewrite in 2018. I’d worked on the project before and knew it well. To break its “limitations,” we started v2. My team has been good; my tech lead excels at architecture. He loves creating new concepts and abstractions. I prefer leveraging existing code over inventing new stuff. We have different instincts here.

Now, here’s what has happened during the big rewrite so far, with 4 people on the team and 2 years in:

Persistent confusion about sources of slowness in our app

claim: “it’s only slow because devtools is open”
cold water: maybe, but it’s a red herring. code should work with devtools open. the claim: devtools adds instrumentation that slows it down. the evidence: none, just stated during a 3-hour planning call.
claim: “it’s only slow because we’re using a development build of react, try a production build”
cold water: production builds help, but not if you’re constantly rerendering all components every millisecond during scroll. we suffered from this. it creeps back without constant vigilance because you can’t test for it. then one day devtools shows rendering twice per frame: signature of unnecessary setState.
claim: “it’s slow because we are hogging the main thread all the time”
cold water: we built a complex webworker framework. offloading CPU work helps, but v2 went too far. we bet on OffscreenCanvas, which was not supported across browsers at the time (Firefox and Safari had not shipped it). our webworker bundles contain all the main-thread code, making them huge and slow to load. everything is an RPC call now, so the worker concept is deeply entrenched throughout the codebase, making it hard to reason about.
claim: “it’s slow because there are processes that haven’t been aborted spinning in the background, so we must build out an intensive AbortController thing that touches the entirety of all our code including sending abort signals across the RPC boundary in hopes that a locked up webworker will respond to this”
cold water: v1 had no aborting and worked fine. I can’t find evidence that aborting helps v2. if the new code struggles without it, something else is wrong. Team disagrees. No clear answer. So we keep passing AbortSignals through every function parameter, forever.
claim: “it’s slow because we are not multithreading”
cold water: so we try multi-threading, but then different webworkers each download the same data and parse it separately, which means more resources spent, more network IO, more slowness…
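
To make the duplicate-download problem concrete, here is a minimal sketch of request de-duplication. It is not our actual code: `Loader`, `dedupe`, `fakeLoad`, and the example URL are all hypothetical names made up for illustration. The idea is that repeated requests for the same URL share one in-flight promise instead of downloading and parsing again.

```typescript
// Hypothetical loader type: anything that turns a URL into parsed data.
type Loader<T> = (url: string) => Promise<T>;

// Wraps a loader so that concurrent and repeated requests for the same URL
// share one promise instead of downloading and parsing the data again.
function dedupe<T>(load: Loader<T>): Loader<T> {
  const inFlight = new Map<string, Promise<T>>();
  return (url: string): Promise<T> => {
    const existing = inFlight.get(url);
    if (existing !== undefined) {
      return existing;
    }
    const pending = load(url).catch((err: unknown) => {
      // Forget failed requests so a later call can retry.
      inFlight.delete(url);
      throw err;
    });
    inFlight.set(url, pending);
    return pending;
  };
}

// Usage with a fake loader that counts how often it is really invoked.
let downloads = 0;
const fakeLoad: Loader<string> = async (url) => {
  downloads += 1;
  return `parsed:${url}`;
};

const loadOnce = dedupe(fakeLoad);
const first = loadOnce("https://example.invalid/data.bin");
const second = loadOnce("https://example.invalid/data.bin");
```

Both calls return the same promise and the fake loader runs once. The catch is that this map lives in one thread's memory. Each webworker has its own copy, so this only helps if every request for a given resource is routed through the same worker (or the main thread), which is exactly the coordination that naive multi-threading skips.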

Persistent confusion about what our users’ needs are

claim: there should be per-track scroll bars
cold water: doing this leads to many scrolls-within-scrolls on the page, which makes the page itself very hard to scroll
claim: the old search indexing system is “bad”
cold water: yes, it is a bit slow, but is it really THE critical problem we face? likely not. bioinformatics people run a data pipeline. it takes a couple of days. if it’s really an issue, use Elasticsearch
claim: our users are “stupid” so they need to have every single thing GUI editable.
cold water: interesting idea, but the GUI is monstrously complex and hasn’t simplified anything for users. the historical customization modes were admin based and text-config based
claim: our users “do not like modal popups” so we put everything in a tiny sidedrawer that barely fits the content
cold water: now everything is in a tiny side drawer, which is a major constraint on user interface design
claim: we should cater to obscure or unclear “user stories”, like displaying the exact same region twice on the screen at once, because “someone will want to do this”
cold water: this causes a lot of extra logical weirdness in the app that has unclear benefit in the long run

On-going misdirections

problem: we are not catering to emerging user needs, such as breaking our large app into components that can be reused. instead we are going whole hog on a large monolith and treating it as a giant hammer that will solve everyone’s problems. in reality, our users are also programmers who could benefit from smaller, componentized versions of our code
problem: there is confusion about “what our competitors have”. one day my team asked “alright, so if we just do that, then we will have everything that product X has?” and I had to be clear: no! product X has a really complex system of its own that we could never hope to replicate. we likely don’t have even 20% of what X has, and we are slower and less scalable than X. we do have our own strengths that still make us compelling
the fundamental challenge is making architectural decisions without a clear understanding of the actual problems we’re trying to solve

But what does all this imply?

there’s persistent confusion about the challenges, the architecture, user needs, and design goals. that’s the real problem.

2020-06-03
