Testing legacy code when you dislike tests (and legacy code), by Maeve Revels

Abstract

Are you supporting legacy code? Would you like to stop? A good testing strategy can transform legacy code into living code that is resilient and easy to evolve.

Learn why legacy code is so difficult to maintain and identify where tests can make the most impact. Not just any tests, though! We'll dive into the characteristics of high-value versus low-value tests and learn techniques for writing tests that minimize the cost of change.

Developers of any experience level can benefit from these concepts. Familiarity with Rails and an automated testing framework is helpful but not required.

Details

What is legacy code?

We know it when we see it. Legacy code tends to be crufty, brittle, and complicated, and someone is probably complaining about it in production right now. It also happens to be fertile ground for memes. The jokes in this section are curated to demonstrate that legacy code involves the risk of the unknown, a high cost of change, and an ongoing support burden.

Proposed definition: legacy code is any code deployed to production without tests.

Untested production code of any vintage also involves the risk of the unknown, a high cost of change, and an ongoing support burden. You may have deployed new legacy code recently. I deconstruct a few hypothetical defensive responses with empathy, to soothe any actual defensiveness before continuing.

What should we be testing?

Your users' highest priority is application behavior. Users are happy when we add new behaviors that they need. Users are unhappy when we modify or remove behaviors on which they depend. Focusing on behavior will shape how you think about testing. Ultimately, this mindset will help you write high-value tests that catch regressions while remaining resilient to change.

I discuss "behavioral seams" and "enabling points"[1] as a means of identifying what to test. A behavioral seam is a place where you can alter behavior without editing the code at that place. An enabling point is where you decide which specific behavior to invoke along the seam. These concepts are foundational, and I go through several real-world examples at the application, subsystem, and object levels.
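
As a minimal sketch of these two terms at the object level, assume a hypothetical InvoiceNotifier whose mail delivery is passed in through its constructor. The class names and the message format are invented for illustration and do not come from the talk. The call to @delivery.deliver is the seam, and the constructor argument is the enabling point where a test selects a different behavior.

```ruby
# Hypothetical classes, invented for illustration only.
class SmtpDelivery
  def deliver(to:, body:)
    raise NotImplementedError, "a real implementation would talk to a mail server"
  end
end

class InvoiceNotifier
  # Enabling point: the caller decides which delivery object is used.
  def initialize(delivery: SmtpDelivery.new)
    @delivery = delivery
  end

  def notify(customer_email, amount_cents)
    body = format('You owe $%.2f', amount_cents / 100.0)
    # Seam: what happens here can change without editing this method.
    @delivery.deliver(to: customer_email, body: body)
  end
end

# A stand-in that records messages instead of sending them.
class RecordingDelivery
  attr_reader :messages

  def initialize
    @messages = []
  end

  def deliver(to:, body:)
    @messages << { to: to, body: body }
  end
end

recorder = RecordingDelivery.new
InvoiceNotifier.new(delivery: recorder).notify("pat@example.com", 1250)
puts recorder.messages.first[:body]
```

A test can now assert on what the notifier asked to have delivered without touching a mail server and without modifying InvoiceNotifier itself.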

How do we avoid writing legacy tests?

Legacy tests involve the same limitations as legacy code: the risk of the unknown, a high cost of change, and an ongoing support burden. Legacy tests cause CI failures every time you update the name of an unrelated CSS class, add a fixture record, or sometimes for no discernible reason. Legacy tests are almost worse than no tests at all.

At this point, I introduce a pivotal idea: high-value versus low-value tests. Often, when developers complain that testing is hard and not worth the bother, they are actually complaining that maintaining a suite of low-value tests is hard and not worth the bother. And they are right!

High-value tests continue to pass as long as the application behavior remains unchanged. These tests are sensitive to changes in the specific behavior they are asserting but resilient to other incidental changes. High-value tests are easy to maintain over time. The downside is that high-value tests initially take the most effort to write.

Low-value tests are more tightly coupled to the implementation of the code under test, making them brittle. Low-value tests are not no-value tests, but they do not provide long-term protection against regressions. The advantage of low-value tests is that they tend to be very quick and easy to write.
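
The contrast can be sketched with a small example. The Cart class and its discount rule below are hypothetical, and Minitest is used only so the snippet runs without extra gems; the same idea carries over to RSpec. The first two tests assert only the observable total, so they survive refactoring. The third reaches into a private helper, so renaming or inlining that helper breaks it even though shoppers would see no difference.

```ruby
require "minitest/autorun"

# Hypothetical class, invented for illustration only.
class Cart
  def initialize(prices_in_cents)
    @prices_in_cents = prices_in_cents
  end

  # Public behavior: orders of $100 or more get 10% off.
  def total_in_cents
    subtotal = @prices_in_cents.sum
    subtotal >= 10_000 ? subtotal - discount_for(subtotal) : subtotal
  end

  private

  # Implementation detail that a refactor might rename or inline.
  def discount_for(subtotal)
    subtotal / 10
  end
end

class CartTest < Minitest::Test
  # Higher value: coupled only to what the shopper is charged.
  def test_orders_of_100_dollars_or_more_get_ten_percent_off
    assert_equal 10_800, Cart.new([7_000, 5_000]).total_in_cents
  end

  def test_smaller_orders_pay_full_price
    assert_equal 9_999, Cart.new([9_999]).total_in_cents
  end

  # Lower value: coupled to a private method name and signature.
  def test_discount_helper_returns_a_tenth
    assert_equal 1_200, Cart.new([]).send(:discount_for, 12_000)
  end
end
```

The third test is quick to write and may help while the helper is being developed, but it offers little protection against regressions that the first two do not already provide.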

How do I write high-value tests?

Integration tests assert application behavior from the user's perspective and can be the highest-value tests of all. The trade-off is that integration tests are typically slower to execute and require more initial effort to set up the test harness. I describe several techniques for writing high-value integration tests that are easily maintainable, with specific code examples for Capybara and RSpec (but the same concepts apply to any testing framework).

The lowest-value tests are unit tests. Unit tests are typically easy to understand and quick to execute. I write them every day as part of my development process. Unit tests are also notoriously poor at catching regressions (I have some fun with the "unit tests without integration testing" genre of memes at this point). I offer some tips for more resilient unit testing. Perhaps the most helpful advice I give is to throw away unit tests when they no longer serve you.

Straddling these two types of testing are functional tests, which exercise a subsystem with minimal mocking. Functional tests offer more coverage of system behavior than unit tests but are faster than integration tests. A functional test is an excellent place to comprehensively test behavior with many known edge cases, such as error handling or permission checks.
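
A functional test of this kind might look like the sketch below. It assumes a hypothetical subsystem made of a policy object and the service that consults it; every name is invented for illustration, and Minitest stands in for RSpec so the snippet runs without extra gems. The real policy is used rather than a mock, and the permission edge cases are listed as a table.

```ruby
require "minitest/autorun"

# Hypothetical subsystem, invented for illustration only.
User = Struct.new(:name, :role)
Document = Struct.new(:owner, :title, :archived)

class DocumentPolicy
  def can_edit?(user, document)
    return false if document.archived

    user.role == :admin || document.owner == user
  end
end

class RenameDocument
  class NotAuthorized < StandardError; end

  def initialize(policy: DocumentPolicy.new)
    @policy = policy
  end

  def call(user, document, new_title)
    raise NotAuthorized unless @policy.can_edit?(user, document)

    document.title = new_title
  end
end

class RenameDocumentTest < Minitest::Test
  OWNER = User.new("owner", :member)
  ADMIN = User.new("admin", :admin)
  STRANGER = User.new("stranger", :member)

  # Each row: acting user, whether the document is archived, expected outcome.
  CASES = [
    [OWNER, false, true],
    [ADMIN, false, true],
    [STRANGER, false, false],
    [OWNER, true, false],
    [ADMIN, true, false]
  ].freeze

  CASES.each_with_index do |(user, archived, allowed), index|
    define_method("test_case_#{index}_#{user.name}_archived_#{archived}") do
      document = Document.new(OWNER, "Draft", archived)

      if allowed
        RenameDocument.new.call(user, document, "Final")
        assert_equal "Final", document.title
      else
        assert_raises(RenameDocument::NotAuthorized) do
          RenameDocument.new.call(user, document, "Final")
        end
        assert_equal "Draft", document.title
      end
    end
  end
end
```

Because the service and the policy run together, a change to either one that alters who may rename a document fails a row in the table, while internal refactoring that preserves the outcomes does not.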

Putting it all together

  • Legacy code maximizes the cost of change
  • Good tests minimize the cost of change
  • For resiliency, think in terms of testing for behavior versus asserting inputs/outputs
  • Focus on writing high-value tests over low-value tests
  • The tests should continue to pass as long as application behavior remains unchanged

[1]: These terms are borrowed, with attribution, from Working Effectively with Legacy Code by Michael C. Feathers.

Pitch

I've been a Rails developer for over 15 years and a software engineer for even longer, which means I'm kind of an expert on writing legacy code. I'm sorry.

I've witnessed Rails projects evolve from mainly greenfield development to mature ecosystems, complete with a warren of nooks and crannies where mission-critical legacy code tends to hide. Many Rails developers have to maintain legacy code these days, but few resources are devoted to the topic.

Last year, I gave this talk to a group of about 30 Rails engineers. Months later, attendees told me that these concepts had fundamentally changed their approach to legacy code and testing.

Submissions

RailsConf 2022 - Accepted