Some reminders about feature toggles

Feature toggles feature occasionally where I work. Today they featured in a lunchtime discussion.

Dare you switch?

Let’s say a particular toggle has been ‘off’ for six weeks. How confident are you that you can turn it on again, safely?

Well, where does that confidence come from? For us, it comes from our acceptance test suite: are there tests that verify the externally visible behaviour in both the on and off states?

Toggle implementations

Compile time constants

We weren’t too worried by these – each time we rebuild, the tests are rerun – and we only need to run the new or old behaviour test based on the toggle’s value.
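As a rough illustration (the names here are made up for this post, not taken from our codebase), a compile time toggle in Java can be a plain static final boolean. The test picks its expectation from the same constant, so whichever behaviour is built is the behaviour that gets checked on that build.

```java
final class CompileTimeToggleExample {
    // Hypothetical compile time toggle: changing it means rebuilding,
    // and rebuilding means the tests are rerun.
    static final boolean USE_NEW_GREETING = false;

    static String greet(String name) {
        if (USE_NEW_GREETING) {
            return "Greetings, " + name + "!";
        }
        return "Hello " + name;
    }

    // Only the currently configured behaviour is exercised.
    static void runConfiguredBehaviourTest() {
        String expected = USE_NEW_GREETING ? "Greetings, Ann!" : "Hello Ann";
        String actual = greet("Ann");
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```

The other branch goes untested until someone flips the constant, which is acceptable here because flipping it forces a new build and a new test run.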

Runtime variables

These generally get implemented as a piece of mutable state that changes on receipt of a particular message (e.g. a JMX call). Here our test coverage was less convincing. For at least one case, only one of the toggle states had coverage. The developers who created the tests might have run them both ways, but in the automated build/test run, that wasn’t happening.
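A minimal sketch of that shape, assuming nothing about our real implementation: RuntimeToggle and Greeter are hypothetical names, and the set method stands in for whatever the management layer (a JMX operation, an admin message) would call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical runtime toggle: mutable state that can be flipped
// while the process is running.
final class RuntimeToggle {
    private final AtomicBoolean enabled;

    RuntimeToggle(boolean initiallyEnabled) {
        this.enabled = new AtomicBoolean(initiallyEnabled);
    }

    boolean isEnabled() {
        return enabled.get();
    }

    // In a real system this would be invoked by the management layer.
    void set(boolean value) {
        enabled.set(value);
    }
}

// Hypothetical feature whose visible behaviour depends on the toggle.
final class Greeter {
    private final RuntimeToggle fancyGreeting;

    Greeter(RuntimeToggle fancyGreeting) {
        this.fancyGreeting = fancyGreeting;
    }

    String greet(String name) {
        return fancyGreeting.isEnabled() ? "Greetings, " + name + "!" : "Hello " + name;
    }
}
```

Because the state can change at any moment in production, both branches of greet are live code, whatever the toggle happens to be set to during the build.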

Feature Toggle? These are more like kill switches (or zombie buttons)

The majority of the runtime toggles were actually kill switches. Either an experimental feature of uncertain outcome was being prototyped, and needed an emergency stop button, or a feature long thought dead was disabled, and a switch left to revive it when, inevitably, the user of the feature turned up post release.

In these cases, the right path appears to be having tests that assert the feature works when the toggle is in its enabling state, and that nothing happens otherwise. More explicitly:

  • For an experimental new feature, add new tests describing the behaviour, with one test that verifies nothing happens if the toggle is off
  • When removing a dead feature, keep tests describing old behaviour running by pushing the zombie button before running them. Add one extra test that verifies that nothing happens if the revive button is left untouched.
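The dead feature case might look something like the sketch below. LegacyAuditLog and its zombie button are invented for illustration, and the tests are plain static methods rather than any particular test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical retired feature, guarded by a revive switch that defaults to off.
final class LegacyAuditLog {
    private final AtomicBoolean revived = new AtomicBoolean(false);
    private final List<String> entries = new ArrayList<>();

    void pressZombieButton() {
        revived.set(true);
    }

    void record(String event) {
        if (revived.get()) {
            entries.add(event);
        }
    }

    List<String> entries() {
        return List.copyOf(entries);
    }
}

final class LegacyAuditLogTests {
    // Old behaviour tests keep running: press the zombie button first.
    static void recordsEventsOnceRevived() {
        LegacyAuditLog log = new LegacyAuditLog();
        log.pressZombieButton();
        log.record("login");
        check(log.entries().equals(List.of("login")), "revived feature should record events");
    }

    // The one extra test: leave the button alone and nothing should happen.
    static void doesNothingWhenLeftDead() {
        LegacyAuditLog log = new LegacyAuditLog();
        log.record("login");
        check(log.entries().isEmpty(), "dead feature should stay dead");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void runAll() {
        recordsEventsOnceRevived();
        doesNothingWhenLeftDead();
    }
}
```

The experimental feature case is the mirror image: the behaviour tests switch the toggle on, and the single extra test leaves it off.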

Really these are special cases of the more general rule: If you liked it then you should have put a test round it.

Reminder 1: Match the scope of your feature toggle with the scope of your tests

Some very simple corollaries:

  • Runtime feature toggles should have tests that exercise both states in a given test run
  • Compile time feature toggles are ok to just exercise the currently configured toggle value
  • Reader exercise: what about configuration level feature toggles?
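For the runtime case, one way to make sure both states are exercised in a single run is to loop over them. This is only a sketch: the discount feature, the prices, and the class name are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class BothStatesRun {
    // Hypothetical runtime toggle and feature, for illustration only.
    static final AtomicBoolean DISCOUNT_ENABLED = new AtomicBoolean(false);

    static int priceInPence(int basePence) {
        return DISCOUNT_ENABLED.get() ? basePence * 90 / 100 : basePence;
    }

    // Exercises the externally visible behaviour in both toggle states
    // within one run, and collects a message for each failure.
    static List<String> run() {
        List<String> failures = new ArrayList<>();
        boolean original = DISCOUNT_ENABLED.get();
        try {
            for (boolean state : new boolean[] {false, true}) {
                DISCOUNT_ENABLED.set(state);
                int expected = state ? 900 : 1000;
                int actual = priceInPence(1000);
                if (actual != expected) {
                    failures.add("toggle=" + state + ": expected " + expected + " but got " + actual);
                }
            }
        } finally {
            DISCOUNT_ENABLED.set(original); // leave the toggle as we found it
        }
        return failures;
    }
}
```

An empty list from run() means both states behaved as expected. The try/finally is there so that a failing check cannot leave the toggle in a surprising state for whatever runs next.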

Reminder 2: Feature toggles aren’t free, and runtime toggles are particularly not free

Runtime feature toggles are (at least) twice as expensive to maintain as compile time constants. The payoff might be that you don’t need a release cycle to change them, but you’re paying for it every time you have to run both sets of tests, or change the code underneath that toggle.

Corollary: Feature toggles that don’t change should be removed. In general, once a feature is proven, removing the toggle should be a no-brainer. The answer to the original question should really be another question: “How has it survived for so long?”

Reminder 3: Martin Fowler realised all this years ago

Corollary: There’s nothing new under the sun (but at least this blog is reasonably titled).