What to unit test and what not to in legacy applications

I am working on a team supporting a legacy application. Unit test coverage is close to zero.

Anyway, we would like to start adding unit tests, but then there is a question of what to unit test.

Option 1: Every method of every class. This is a huge amount of work, and I question the value of it. Some methods seem to do trivial things, and testing them seems pointless.

Option 2: Only business rules. This is kind of the school I favor, but in a legacy app the documentation is poor to non-existent, or flat-out wrong. A real question is even figuring out whether something is a business rule vs. a simple implementation detail. Furthermore, is this kind of coverage complete enough?

What are some other options? When looking at a method, what do you look for to either say, “Yes, this method needs a unit test” or “No, it is not worth the effort / time to unit test.”

Are there any good resources on this topic? I have tried googling this, but I keep finding stuff about TDD, which is fine for NEW code but does not touch the issue of legacy code.

I really hate the current mainstream idea of unit testing where a unit is a single function. It makes the tests extremely brittle, makes refactoring much harder than it needs to be (especially non-trivial refactors, where you end up changing test and application code simultaneously), and just has not been good for ROI in my experience.

I’ve much more recently changed my testing habits to sit in the middle: integration-style tests, but without external components (external APIs and data stores are mocked out, but everything else runs inline). This allows for testing actual business rules (your Option 2) from an outsider’s perspective, so the tests resemble real-life scenarios, like “user logs in, creates a thread, other user logs in, verify other user can see thread”. I’ve had some really conceptually complicated scenarios mapped out in 20-ish-line automated tests that allow for extremely heavy refactoring without touching any test code at all, with all internal integration points verified.
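To make that concrete, here’s a minimal sketch of what one of those scenario tests might look like in Java with JUnit 5. Every application class in it (ForumService, ThreadRepository, InMemoryThreadRepository, Session) is an invented name, not anything from a real codebase; the only thing faked is the data store behind the repository.

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class ForumScenarioTest {

    // Only the data store is faked (an in-memory repository
    // implementation); the service layer and everything else run inline.
    @Test
    void otherUserCanSeeNewlyCreatedThread() {
        ThreadRepository repository = new InMemoryThreadRepository();
        ForumService forum = new ForumService(repository);

        // User logs in and creates a thread...
        Session alice = forum.logIn("alice", "correct-password");
        long threadId = forum.createThread(alice, "Hello world");

        // ...other user logs in; verify the other user can see the thread.
        Session bob = forum.logIn("bob", "also-correct");
        assertTrue(forum.visibleThreads(bob).stream()
                .anyMatch(t -> t.id() == threadId));
    }
}
```

Because the test only talks to the service’s public API, everything behind that API can be restructured freely without the test changing.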

That being said, it requires a good amount of infrastructure, and it’s extremely hard to bolt that onto a legacy app that wasn’t designed for it. So while I think business rules are what should be tested, that may be easier said than done.

One idea is to not worry about what should already have been tested and instead focus on testing from here on out. So instead of trying to determine up front how tests should be created, just let the tests come organically and figure it out from there. If a new feature is being added or a bug needs to be fixed, add a test to verify it. This will give you a good feel for where the pain points in the architecture are (with regard to testing) and will probably help pinpoint a longer-term testing strategy the team is in alignment with.

Here’s my mindset when it comes to dealing with this kind of no-documentation scenario: the business rules are the current behaviour of the app, regardless of whatever the documentation says. The users will have gotten used to the way it works, and any change to the behaviour needs to be intentional.

So, given that, whenever I need to change something, I make sure to write tests to capture the current behaviour. This is usually quite a pain, as it entails reading the code to reason out what it does and then writing tests to verify that my understanding is correct. A pain, but it also helps a lot with understanding the code.

What do I write tests for? Whatever unit is too complex for me to immediately grasp the function of. Sometimes that unit is most of the methods in a class/module, sometimes it’s just the main entry method, depending on what the code looks like.

TDD, as you say, focuses mainly on developing new code, but I think the mindset works just fine for working on legacy code as well. You just have a step before the red-green-refactor cycle which is “write tests to capture current behaviour”.
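That extra step is often called a characterization test (Feathers’ term for it). A minimal sketch, with an invented LegacyPriceCalculator standing in for whatever unit you’re pinning down:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class LegacyPriceCalculatorTest {

    // Characterization test: the expected value was not taken from a spec.
    // It was obtained by running the existing code once and pasting the
    // actual output into the assertion, pinning down what the code does
    // today so that any future change to this behaviour must be intentional.
    @Test
    void pinsCurrentDiscountBehaviour() {
        LegacyPriceCalculator calculator = new LegacyPriceCalculator();

        assertEquals(92.65, calculator.discountedPrice(100.00, "GOLD"), 0.001);
    }
}
```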

The main problem will be that the code is probably harder to test than completely test-driven code, so the initial tests you write tend to know a lot of implementation detail. Let that guide your refactoring work, once you have those tests in place. If you refactor with the goal of making the tests simpler, it will drive the code towards the same kind of easy-to-test design that TDD would have driven you towards in the first place.

Is there a driving need to have full unit tests and a timeframe, or is it just a direction you want to move towards?

If it’s just a desired goal, then in the short term maybe prioritize the code paths which are least likely to be hit by other code coverage or integration/functional testing.

If you can evolve towards full unit coverage, then just make a policy that whenever you change a function or API, you also create a unit test for it.

Are you planning a big refactoring or re-architecting? Then maybe you don’t want to waste time on unit testing the old codebase at all; just focus on documentation and specifying existing behavior, and set policies for unit tests of the new code being developed.

They (Management) are kind of pushing for unit test coverage. They are running something called SonarQube, and it’s painting everything red for missing unit tests, among many other issues.

Still, though, this raises the question of what to test. I would assume that with TDD every method would be tested. That is kind of hard to wrap your head around for methods that do not have any real logic, say ones that just run a database query and populate some data object.

It also bothers me that there is tight coupling between test code and implementation code. For example, if a test checks variable values or verifies that other methods were called, then renaming the method or variable breaks the unit test.
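For illustration (every name here is invented), this is the kind of coupling being described: a Mockito-style interaction test that verifies how the work gets done rather than what the outcome is.

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.jupiter.api.Test;

class OrderServiceTest {

    // This test asserts HOW the work gets done (a particular method on a
    // particular collaborator), not WHAT the user-visible outcome is.
    // Rename saveOrder(), split it in two, or batch the writes, and the
    // test must change even though the behaviour users see is unchanged.
    @Test
    void savesOrderThroughDao() {
        OrderDao dao = mock(OrderDao.class);
        OrderService service = new OrderService(dao);

        service.placeOrder(new Order("widget", 3));

        verify(dao).saveOrder(new Order("widget", 3));
    }
}
```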

In all the TDD tutorial stuff I see, they always test completely simple methods where you can pass in some value and get a transformed value back. As far as I have seen, this scenario almost never happens in real-life code. There are plenty of void methods that change a bunch of stuff and have no simple return value to tell you whether they worked correctly.

Maybe this leads to the question: How do you write code so that it is easily unit testable?
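One common answer, sketched below with invented names: give the code seams, i.e. move hard-wired dependencies like database access behind constructor parameters so the logic can be exercised without the real infrastructure.

```java
import java.util.List;

// The data access is an injected seam instead of a hard-wired static
// call, so a test can supply a fake without a live database.
interface InvoiceRepository {
    List<Invoice> invoicesFor(long customerId);
}

record Invoice(double amount) {}

class InvoiceService {
    private final InvoiceRepository repository;

    InvoiceService(InvoiceRepository repository) {
        this.repository = repository;
    }

    // Pure logic over whatever the repository returns.
    public double totalFor(long customerId) {
        return repository.invoicesFor(customerId).stream()
                .mapToDouble(Invoice::amount)
                .sum();
    }
}
```

Since InvoiceRepository has a single method, a test can just pass a lambda, `new InvoiceService(id -> List.of(new Invoice(10.0), new Invoice(5.0)))`, and assert that totalFor returns 15.0, with no database anywhere.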

My teams work with a lot of legacy apps with zero unit testing. We just have a rule:

“Improve unit test coverage every time you touch the code,” meaning that we want to see improvement in the 2-5% range per update. We never go into the code only to add unit tests; that would be wasteful with a perfectly working application.

Writing unit tests should have an objective: to reduce broken builds and to improve your confidence that you haven’t broken anything when introducing code changes.

To that end, I would recommend only writing unit tests where you change code. If you’re not changing it, don’t write a unit test for it.

An exception might be if you want to have unit tests around the “happy path” of the primary functionality that is used the most. But I would only do surgical updates to add unit tests because every time you touch the code, you are taking a chance of breaking that code.

For me, TDD is all about letting tests guide you towards a good design and correct implementation. Writing a test first makes me ask the question “what should this do?” instead of “how should this be implemented?”. As you work, you are building a scaffolding of tests, that allows you to refactor without fear of breaking something that was already working. And when you can refactor without worrying, you are not afraid to try new implementations when you discover new facts about your program.

But in the same way that you don’t leave scaffolding after a building is complete, you shouldn’t (perhaps) leave all the tests you used when developing the program. Some of them may be too specific, and maybe too coupled to the specific implementation. In my mind, part of a good TDD practice is actually pruning your tests as part of the red-green-refactor cycle. And that means that you might remove the test that just verified the mapping between database query and returned object, because all it does is tightly couple the object with a specific query. Maybe that’s better tested in a high-level test, that drives through the whole application.

Yes, mutable state like that is certainly a common scenario. The question to ask there is: how is this mutable state used? Somewhere, the things you change have an effect on an outcome, and that outcome is what you’ll end up using to verify the behaviour.
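A tiny self-contained sketch of that idea (ShoppingCart is invented for the example): the void method mutates internal state, and the test verifies it through the outcome that state feeds into.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.Test;

// Minimal cart with a void mutator, invented for illustration.
class ShoppingCart {
    private final List<Double> prices = new ArrayList<>();

    // Void method: no return value to assert on directly.
    void addItem(String name, double price) {
        prices.add(price);
    }

    double total() {
        return prices.stream().mapToDouble(Double::doubleValue).sum();
    }
}

class ShoppingCartTest {

    // Rather than peeking at the cart's internal list, the test asserts
    // on the observable outcome the mutations feed into: the total.
    @Test
    void addedItemsShowUpInTheTotal() {
        ShoppingCart cart = new ShoppingCart();

        cart.addItem("book", 12.50);
        cart.addItem("pen", 2.00);

        assertEquals(14.50, cart.total(), 0.001);
    }
}
```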

Exactly! That is the question that TDD wants you to ask yourself, and experiment with. One mindset I find very useful, when writing the test first, is “write the code you wish you had”. In other words, when I write my test, in the setup and the verification, I just access the code under test in some way that I wish was possible, knowing full well it won’t work immediately. But it tells me what I need to be able to test it more easily. And then the next step becomes implementing that code I wished I had.

So maybe one answer to the question is “by writing tests first and being lazy”.
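As a sketch of “write the code you wish you had” (all names invented): the test below is written against a ReportScheduler that doesn’t exist yet, and the resulting compile errors become the to-do list for the implementation.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.time.LocalDate;

import org.junit.jupiter.api.Test;

class ReportSchedulerTest {

    // None of this API exists yet. The test calls the code the way we
    // WISH we could call it; making it compile and pass drives out the
    // design instead of the other way around.
    @Test
    void weeklyReportRunsSevenDaysLater() {
        ReportScheduler scheduler = new ReportScheduler();

        scheduler.scheduleWeekly("sales-summary", LocalDate.of(2024, 1, 1));

        assertEquals(LocalDate.of(2024, 1, 8),
                scheduler.nextRunAfter("sales-summary", LocalDate.of(2024, 1, 1)));
    }
}
```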

Buy yourself a copy of Working Effectively with Legacy Code by Michael Feathers.

Late to the party.

It would seem that the best use of limited resources would be to add unit tests only when introducing new features and/or refactoring code.

Adding unit tests for the sake of unit tests, or for ticking a checklist, seems a huge waste of precious development effort.

Edit: As you work through the code adding new features, the quality improves because of the unit tests you add.

Refactoring forces you to reproduce the exact behaviour, so it’s a good time to add unit tests too.

This way, you don’t waste precious effort chasing down code that may never need to be touched, either because it is ultra-reliable in the first place or because it will be made obsolete by the new features you are about to introduce.

Extra late to the party, but I also recommend this book. In January I started on a project that inherited a big code base. Although it seemed to be designed pretty well, once we started having to actually modify it, we realized it was a mess. For example (it’s Java): there is a data ingestion portion and a back-end portion which are separate… and they use different (but identically named) DAO classes. So when I had to do some work on the ingestion, I kept having database issues, because the guy who wrote the ingestion DAOs only wrote the minimum he needed for his task, so the DAOs don’t fully populate the objects when you read them in. That causes some wacky hijinks when trying to use the objects, lemme tell you! And I won’t even get into the fact that a big percentage of our code is in Oracle DB stored procedures, which makes it 3 billion times more difficult to test and diagnose issues.

So you decide to add unit tests. And you find out that classes are one giant function with 400 lines of code. So you should factor out some functions. But you can’t because you’re afraid to change the code… because you don’t have any tests. I’m there. I have sympathy for you!

At any rate, the Feathers book @Chris_Gwinn mentioned was also recommended to me, and it’s great. It has a lot of chapter headings like “my application is all API calls” and “my code has no structure,” and plenty more.

One recommendation that I think makes a lot of sense when inheriting a big ball of code that has no tests is to create an end-to-end test that demonstrates functionality. If you break things after that point, you at least know you messed up SOMETHING and can try to repair it ASAP.
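A hypothetical shape for such a test (names and paths invented): one coarse round trip through the main path. It won’t tell you what broke, but it will tell you that something broke.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;

import java.util.List;

import org.junit.jupiter.api.Test;

class EndToEndSmokeTest {

    // Drives the whole application from ingestion to query and checks a
    // few coarse outcomes. Breadth over precision: this is the alarm
    // bell, not the diagnosis.
    @Test
    void ingestThenQueryRoundTrip() {
        Application app = Application.startWithTestConfig();
        try {
            app.ingest("src/test/resources/sample-feed.xml");

            List<Record> results = app.search("known-record-id");

            assertFalse(results.isEmpty());
            assertEquals("Expected Title", results.get(0).title());
        } finally {
            app.shutdown();
        }
    }
}
```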

So take heart - things could be worse! We use the SonarLint plugin for Eclipse. It brings the SonarQube analysis into your IDE and tells you about issues you should fix. One thing it harps on is the cyclomatic complexity of functions: SonarQube complains if a function’s complexity is above 15, so you’ll see warnings like “reduce the cyclomatic complexity of this function from 45 to 15” when you run across complex functions. There’s another subproject on my contract that just got ahold of their source code (C#, doing some sort of .NET application). One of their functions had a cyclomatic complexity of 4500 (it includes, among other things, a 700-line switch statement).

In addition to Feathers’ book, I also recommend Clean Code (Robert C. Martin) as a way to improve the code you write going forward.