The starting point: inheriting a live, fragile codebase
Inheriting a production codebase from another team is always a risk. At Netrisk, the risk was concrete: multiple business-critical PHP portals — insurance comparison, banking, telecoms — that generated significant daily revenue, with deployment processes that were manual, unreproducible, and genuinely scary to run. The previous team had delivered features; they had not invested in the engineering foundation that makes confident delivery possible.
Our mandate was clear: stabilise delivery, build confidence in releases, and enable the growing product roadmap to move faster without accumulating further technical debt.
What we did — and what it took
- Reproducible development environments. The first step was eliminating "works on my machine" as a category of problem. Vagrant-based local environments that mirrored production topology removed a whole class of integration surprises.
- CI/CD with quality gates. We built GitLab pipelines that ran automated tests, static analysis, and code quality checks on every commit. Deployments became one-click, auditable, and reversible.
- Test coverage, written retroactively. Writing tests for untested legacy code is slow, unglamorous work. We did it systematically: highest-risk paths first, covering the business logic that actually mattered for daily revenue.
- Monitoring and observability. You cannot improve what you cannot see. We set up error tracking and performance monitoring to make production behaviour visible and give the team early warning of regressions.
- Team scaling. As the delivery foundation became trustworthy, we grew the team to 10+ developers, each able to ship independently without risk of breaking things for the others.
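The quality-gate pipeline described above can be sketched as a minimal GitLab CI configuration. This is an illustrative shape, not the actual Netrisk pipeline — stage names, tool choices (PHPUnit, PHPStan) and the `deploy.sh` script are assumptions:

```yaml
# .gitlab-ci.yml — illustrative quality-gate pipeline (not the real Netrisk config)
stages:
  - test
  - analyse
  - deploy

phpunit:
  stage: test
  script:
    - composer install --no-interaction
    - vendor/bin/phpunit

static-analysis:
  stage: analyse
  script:
    - vendor/bin/phpstan analyse src

deploy-production:
  stage: deploy
  script:
    - ./deploy.sh production   # hypothetical deploy script; must be reversible
  when: manual                 # "one-click": a human presses the button
  environment:
    name: production           # GitLab records who deployed what, when
  only:
    - main
```

The `when: manual` plus `environment` combination is what turns deployment into something one-click and auditable: every release is a pipeline run with an author, a timestamp, and a commit to roll back to.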
What we learned
The biggest insight from Netrisk was deceptively simple: the bottleneck in legacy modernisation is almost never the technology. It is trust — or the lack of it. Developers do not refactor code they do not trust. Product teams do not accelerate roadmaps on platforms they cannot deploy reliably. Every improvement we made was ultimately in service of building that trust.
The second lesson: test-driven stabilisation beats feature-freezing. The temptation when inheriting a fragile system is to stop shipping new features until the foundation is solid. In practice that is almost never acceptable commercially. The sustainable approach is to add test coverage incrementally, around and ahead of new features, rather than as a separate initiative.
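In practice, incremental coverage of legacy code usually starts with characterization tests: record what the code currently does before touching it, so refactoring has a safety net. A minimal sketch in Python (the pricing function and its values are invented for illustration; the real work was on PHP portals):

```python
# Characterization test: lock in the current behaviour of untested legacy
# code before refactoring. Function, names and values are illustrative.

def legacy_premium(age: int, base: float) -> float:
    """Stand-in for an undocumented legacy pricing rule."""
    surcharge = 1.5 if age < 25 else 1.0
    return round(base * surcharge, 2)

def test_legacy_premium_characterization():
    # These expectations were captured from the code's current output,
    # not from a spec — they pin down behaviour, right or wrong.
    assert legacy_premium(22, 100.0) == 150.0
    assert legacy_premium(40, 100.0) == 100.0
```

The point of a characterization test is deliberately modest: it does not assert that the behaviour is correct, only that it has not changed — which is exactly the guarantee you need to refactor around and ahead of new features.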
What we would do differently today
1. AI-assisted test generation for legacy code
Writing retroactive tests for untested legacy code is the most labour-intensive part of stabilisation work. Today, AI tools can generate a meaningful first draft of test cases for a given function or endpoint in seconds — not a substitute for understanding the code, but a significant acceleration. On a project the size of Netrisk, that is hundreds of engineering hours recovered.
2. AI-assisted codebase onboarding
One of the most expensive activities when inheriting a large codebase is onboarding new engineers — explaining what undocumented code does, mapping the data flows, understanding the business rules embedded in legacy logic. Today we would use AI as an interactive documentation layer: engineers can ask "what does this function do?" and get a useful first answer to verify against the code itself, dramatically shortening the time from hire to productive contribution.
3. Faster environment standardisation
Setting up reproducible development environments took significant effort in 2018. Today, with Docker as a universal baseline and AI available to generate Dockerfiles, compose configurations and debug environment issues, the same work takes a fraction of the time.
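The modern equivalent of those Vagrant environments is a compose file a new engineer can start with one command. A sketch of the shape, assuming a typical PHP-plus-MySQL portal — image versions, service names and credentials are illustrative, not the real stack:

```yaml
# docker-compose.yml — illustrative reproducible dev environment
services:
  app:
    image: php:8.2-apache
    volumes:
      - ./src:/var/www/html   # live-edit application code
    ports:
      - "8080:80"
    depends_on:
      - db
  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example   # local development only
      MYSQL_DATABASE: portal
```

Because the topology lives in version control, "works on my machine" stops being a category of problem: everyone's machine runs the same definition.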
4. Earlier investment in observability
We would instrument from day one more aggressively — not just error tracking, but structured logging, distributed tracing, and dashboards that made the system's behaviour legible to both engineers and product. The earlier you can see production, the earlier you can act on what you see.
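Structured logging of the kind described above can start very small: emit one JSON object per event, so dashboards and alerts query fields instead of grepping strings. A minimal sketch using Python's standard `logging` module (the logger name and event fields are illustrative):

```python
import json
import logging
import sys
import time

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "event": record.getMessage(),
            # Structured fields attached by callers via `extra=`
            **getattr(record, "fields", {}),
        }
        return json.dumps(payload)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("portal")
log.addHandler(handler)
log.setLevel(logging.INFO)

# One queryable event per business action, not a free-text sentence:
log.info("quote_generated", extra={"fields": {"product": "car", "duration_ms": 142}})
```

The design choice that matters is the machine-readable shape, not the library: once every event carries a `level`, an `event` name and typed fields, deployment dashboards and regression alerts become queries rather than parsing exercises.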
What we would keep the same
- The "trust first" philosophy. Technical improvements that do not build trust in the delivery pipeline are wasted effort. Every decision should be evaluated through the lens of: does this make our releases more reliable and reversible?
- Incremental test coverage alongside features. Not a rewrite, not a feature freeze — steady, measurable improvement of the test suite, week by week.
- Close communication with the product team. Stabilisation work is invisible to business stakeholders if you do not make it visible. Regular reporting on deployment frequency, lead time, and incident rates turns engineering work into business language.
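The reporting in the last bullet can be automated directly from deployment records. A sketch of the two simplest metrics, deployment frequency and mean lead time, assuming a hypothetical record format with commit and deploy timestamps (the data is invented for illustration):

```python
from datetime import datetime
from statistics import mean

# Each record: when a change was committed and when it reached production.
# Timestamps are illustrative sample data.
deploys = [
    {"committed": datetime(2018, 5, 1, 9, 0), "deployed": datetime(2018, 5, 1, 15, 0)},
    {"committed": datetime(2018, 5, 2, 10, 0), "deployed": datetime(2018, 5, 3, 10, 0)},
    {"committed": datetime(2018, 5, 4, 8, 0), "deployed": datetime(2018, 5, 4, 12, 0)},
]

def lead_time_hours(records) -> float:
    """Mean hours from commit to production deploy."""
    return mean((r["deployed"] - r["committed"]).total_seconds() / 3600
                for r in records)

def deploys_per_week(records, weeks: float) -> float:
    """Deployment frequency over the observed period."""
    return len(records) / weeks

print(f"mean lead time: {lead_time_hours(deploys):.1f} h")        # prints 11.3 h
print(f"frequency: {deploys_per_week(deploys, 1):.1f} per week")  # prints 3.0 per week
```

Numbers like these, tracked week over week, are what turn invisible stabilisation work into a trend a business stakeholder can read.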
Relevance today
Most companies running software at scale have at least one Netrisk-like situation in their portfolio: a system that works, generates revenue, and is quietly accruing technical debt that will eventually slow everything down. The work of stabilisation is unsexy but commercially critical.
With AI tools available today, the timeline for that work compresses significantly. If you have a legacy system that needs this kind of attention, let's talk.