eBay Tech London

May 12, 2015 Shutl, Tech

Is duplication always a bad thing?

tl;dr

No.
Well, perhaps, but it can be the lesser of two evils.
It depends.

Disclaimer – these thoughts reflect my personal opinions and not a consensus of how the entire Shutl engineering team feels.

Full version

As a software engineer, it’s always been drilled into my head that duplication is a bad thing, and teams I’ve been a part of have always worked diligently to reduce every form of duplication they see. As a general rule this makes a lot of sense; I can still remember reading through Martin Fowler’s Refactoring while I was at university and the excitement of discovering design patterns.

Duplication is bad

Good developers are precise and careful, and understand the risks and pitfalls of duplication in a code base (see the DRY principle from The Pragmatic Programmer):

“The alternative is to have the same thing expressed in two or more places. If you change one, you have to remember to change the others, or your program will be brought to its knees by a contradiction. It isn’t a question of whether you’ll remember: it’s a question of when you’ll forget.”

Which makes perfect sense: it’s time well spent when we try to make our code streamlined and readable, and you’ll end up with a cleaner, easier-to-maintain and more extensible codebase as a result. I mention this briefly here just to give some context, but I’m not planning to go into huge detail. It’s been covered many times, and I’m assuming we’re all on board with the idea that duplication is a code smell and removing it is usually a good thing.
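
To make that concrete, here’s a contrived before-and-after sketch (the Quote classes and their numbers are invented for illustration, not taken from any codebase discussed in this post):

# Before: the free-delivery rule is written out twice, so a change to the
# threshold has to be remembered in both places.
class QuoteBefore
  def initialize(subtotal)
    @subtotal = subtotal
  end

  def total
    @subtotal >= 50 ? @subtotal : @subtotal + 4.99
  end

  def delivery_message
    @subtotal >= 50 ? "Free delivery" : "Delivery costs 4.99"
  end
end

# After: the rule is expressed once and both methods lean on it.
class QuoteAfter
  FREE_DELIVERY_THRESHOLD = 50
  DELIVERY_FEE = 4.99

  def initialize(subtotal)
    @subtotal = subtotal
  end

  def free_delivery?
    @subtotal >= FREE_DELIVERY_THRESHOLD
  end

  def total
    free_delivery? ? @subtotal : @subtotal + DELIVERY_FEE
  end

  def delivery_message
    free_delivery? ? "Free delivery" : "Delivery costs #{DELIVERY_FEE}"
  end
end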

However, like most things, I believe this advice starts getting you into trouble if you take it to an extreme. Advice, just like code, does not live in a vacuum and context is king. When it comes to unit tests, sometimes the cost introduced by removing duplication can be greater than the benefits gained. What I plan to show you in this post are a few examples of where I would leave duplication in place and how this has, in my mind, led to cleaner and more readable code.

Peacock Programmers

[Edit – I’ve just discovered that an alternate name may already exist for this (and here I thought I was being original) – The Magpie Developer ]

I was trying to think up an appropriate term for a thought I had in my head (typically, a large chunk of time was spent coming up with a name) and settled on Peacock Programmers.

A peacock programmer is someone who loves to use every tool in their arsenal and for various reasons attempts to decorate code with every possible feature that is available to them.

You can recognise the work of a peacock programmer through inelegant attempts at elegance. It’s wordy, flashy, looks complicated and is almost impossible to understand without deep, concentrated effort and untangling. Sure, it’s fun to write, and it’s a blast to experiment and learn all the different possible ways of solving a problem, but that’s its value as an academic exercise. When you see it in a production codebase, it can be frustrating.

Are Peacock Programmers a bad thing?

I was talking through the concept of peacock programmers over lunch in the office and a colleague, Kai, raised an interesting point. The term sounds rather negative, and in truth my description in the previous section was slanted that way. However, Kai made a good point: it is only through the efforts of these peacock programmers, and their excessive use of all these features, that the rest of us learn where the limits of reasonable use are. It’s a common pattern with many technologies, like meta-programming for example. The first stage is to learn it. The next stage is to get excited by this new knowledge and attempt to use it everywhere. It’s then, by seeing it used excessively and understanding the consequences of that, that you learn to regulate its use and ensure it’s appropriate.

As Kai sees it, peacock programmers are the early adopters that the rest of us ultimately learn moderation from, and their nature is an essential part of our learning.

A Case Study

The focus of this blog post is going to be a real unit test from a live codebase that I’ve worked on. I’ve copied the original test below** as the example that the rest of this post is going to focus on. There were some more interesting examples to pick from, but they started to get a little unwieldy for a blog post, so I’ve picked a simpler one. I’ll add some notes at the end on other issues I found in the larger tests. I also won’t be concerning myself with the actual details of what is being tested, or whether it’s appropriate or well-formed; that’s not the focus here. I’m only interested in the overall structure of the test class.

[** Disclaimer – the code and the style has been preserved exactly as was, I’ve just renamed classes/methods for anonymity]

require File.expand_path('spec/spec_helper')

describe NewFarm do
  let(:valid_params) {
    {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321}
  }

  let(:expected_attributes) do
    {
        size: "large", 
        colour: "#E0E0EB", 
        breed: "Greyface Dartmoor",
        user_id: 1321, 
        fake_sheep: nil, 
        session_id: "a uuid", 
        value: 0
    }
  end

  let(:service) do
    double 'service'
  end

  let(:sheep_collection) { double("sheep collection", id: "an id", valid?: true) }

  before do
    service.stub(:generate_sheep).and_return(sheep_collection)
    SecureRandom.stub(:uuid).and_return("a uuid")
  end

  let(:new_farm) { NewFarm.new service, valid_params }
  subject { new_farm }

  describe "#formatted_base_errors" do
    let(:sheep_collection) {
      double("sheep collection", id: "an id", valid?: false, errors: {base: errors})
    }

    before do
      new_farm.save
    end

    context "when no errors are present on the sheep_collection" do
      let(:errors) { [] }
      its(:formatted_base_errors) { should == "" }
    end

    context "when errors are present on the sheep_collection" do
      let(:errors) { ["no wool", "pretty sure it's a disguised wolf"] }
      its(:formatted_base_errors) { should == "no wool, pretty sure it's a disguised wolf"}
    end
  end

  describe "#save" do
    context "success" do
      it "generates sheep" do
        service.should_receive(:generate_sheep).
            with(expected_attributes).
            and_return(sheep_collection)

        new_farm.save
      end

      it "persists the farm" do
        new_farm.save
        Farm.last.uuid.should == "a uuid"
      end

      it "associates the sheep collection with the farm" do
        SecureRandom.stub(:uuid).and_return("a uuid")

        service.should_receive(:generate_sheep).
            with(expected_attributes).and_return(sheep_collection)

        new_farm.save

        Farm.last.sheep_collection_id.should == "an id"
      end

      its(:save) { should be_true }
    end

    context "when colour is wrong for breed" do
      let(:params) {
        {size: "newborn",
         colour: "#ff69b4",
         breed: "Badger Face Welsh Mountain"}.with_indifferent_access
      }

      let(:new_farm) { NewFarm.new service, params }
      its(:save) { should == false}

      it "puts an error on the colour" do
        new_farm.save
        new_farm.errors[:color].should == [
            "Badger Face Welsh Mountain sheep are not #ff69b4"
        ]
      end
    end

    context "when input is missing" do
      let(:new_farm) { NewFarm.new service }

      its(:save) { should == false }

      it "doesn't call the service" do
        service.should_not_receive(:generate_sheep)
        new_farm.save
      end

      it "doesn't create a farm" do
        expect {
          new_farm.save
        }.not_to change{ Farm.count }
      end

      it "adds errors for each field" do
        new_farm.save

        [:size, :colour, :breed].each do |attr|
          new_farm.should have(1).error_on attr
        end
      end
    end

    context "when service returns errors" do
      let(:sheep_collection) {
        double 'sheep_collection', {"valid?" => false,
                                    "errors" => {
                                        "size" => ["oh noes"],
                                        "base" => ["A base error", "Is something else"]
                                    }
        }
      }

      before do
        service.should_receive(:generate_sheep).and_return(sheep_collection)
      end

      it "puts the errors on itself" do
        new_farm.save

        new_farm.errors[:size].should == ["oh noes"]
        new_farm.errors[:base].should =~ ["A base error", "Is something else"]
      end

      it "returns false" do
        new_farm.save.should be_false
      end

    end
  end

  context "#farm_id" do
    it "returns the farm uuid" do
      new_farm = NewFarm.new service, valid_params

      new_farm.save
      new_farm.farm_id.should == Farm.last.uuid
    end
  end
end

This test was typical for the code base, and was seen by many as a well written one. In some ways it is; the tests are small, contained, and they test one thing each.

Yet I’ve always been frustrated when I see code like this. I’m a fan of simplicity and try to avoid, as much as I can, the use of superfluous language features. Maybe I’m just stupid (this is quite possible) but I look at a test like this and I really struggle to understand what’s going on. It’s also worth noting that I’ve picked one of the simpler samples for this post.

Nested describes – like a boss

We ran an analysis on a small set of our tests (the tests for one of our services) to figure out the worst culprits for nested describes/contexts. Our record was seven. That’s right, seven layers of nested description before you finally get to the test body. Good luck trying to figure out what’s actually being tested and finding the relevant bits of set up!

Nested describes/contexts can potentially be a way of helping structure your tests, but give it a moment’s thought before you wrap that spec in a context and ask yourself “why am I doing this?”. Especially when you’re creating a context with just one test inside it, why not just make the test name more descriptive? In isolation, the idea of using contexts to help structure and add meaning to your tests is a good one, and I’ve made good use of them. But use them sparingly: lift your head regularly from the depths of your spec, take a look at the whole test file and ask yourself “how’s it looking?”. Once you’re more than a couple of levels deep, perhaps it’s time to rethink and ask yourself why your test is so complicated.

e.g. instead of:

require "spec_helper"
context "when something interesting happens"
  it "behaves like this" do
    ...
  end
end

How about…

require "spec_helper"
it "behaves like this when something interesting happens" do
  ...
end

 Bonus:

Here’s an example with six layers that I found in a codebase I’ve worked on (again with the names obfuscated, but the only words I’ve replaced are farm and sheep, everything else is exactly as it was). Sadly I couldn’t find the record-breaking seven-layer spec 🙁

describe Farm::UpdateSheep do 
  ...
  describe '#from_farm' do
    ...
    context 'saving from farm' do
      ...      
      describe 'formatting for the api' do
        ...
        context 'with a nulled date' do
          ...
          it 'submits an empty date' do
            ...

Now imagine the message you would see when this test fails…

Farm::UpdateSheep #from_farm saving from farm formatting for the api with a nulled date submits an empty date

‘Let’ it go

The other feature I think is worryingly overused is ‘let’. Now, I understand that let is lazily evaluated and can improve performance, but we’re talking about unit tests here, and a suite of them should take on the order of seconds to run. What I’m more concerned about is the time wasted trying to decipher a test class that has let blocks scattered all over it.

Again, my biggest issue with them comes from misuse and misdirection. I have seen many simple unit test classes made far more verbose and complex through the addition of unnecessary let blocks. Their introduction leads to an explosion in the scope of a test: no longer can I look at a self-contained “it” block and understand the test; now I need to scroll all around, trying to piece together the nested contexts and follow the chain of let blocks to see which data is being set up, and then see which of these many lets my test is overriding… I’m getting anxious just thinking about it (more thoughts on this, and an alternative structure, can be found in the When, Then, Given section below).
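
As a tiny, invented illustration of that spread of focus (an imaginary Basket class, nothing to do with the farm code above), by the time you reach the assertion the values feeding it have been defined once at the top of the file and overridden twice on the way down:

describe Basket do
  let(:price)    { 10 }
  let(:discount) { 0 }
  let(:basket)   { Basket.new(price: price, discount: discount) }

  context "with a promotion" do
    let(:discount) { 2 }

    context "on a premium item" do
      let(:price) { 25 }

      it "charges the discounted price" do
        basket.total.should == 23  # good luck spotting where 23 comes from
      end
    end
  end
end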

Subject

This is probably my least favourite feature. I can’t think of a single time in my own experience where I’ve thought that using “subject” was a good idea, and I am really confused as to why people are propagating its use. It just further reduces readability and adds pointless misdirection to your tests. When I first experienced the joys of RSpec after a stint working in Java, I loved the expressiveness and the flow of the test structure. This feels like a step backwards.

It gets even more confusing when your spec is testing something that is happening in the initialization of the class under test. (Note: bonus points for combining subject with described_class.)

subject {described_class.new}
...
# way way down in the file
it "does something" do
  Sheep.should_receive(:is_disguised_wolf?) 
  ...
  subject
end

We came across a test like this by chance while working on a class, and my pairing partner saw the trailing subject and assumed it was some mistake as it looked like a pretty meaningless line accidentally added at the end of a long test. They deleted this line of code without giving it much thought, and then a few minutes later when we ran the test, it naturally started failing.

My advice: ditch the pointless subject; new is not always better. Make your tests explicit. I’d much rather see an appropriately named variable that is instantiated within the test body it’s used in, as opposed to a cryptic “subject” that was defined way up near the top of the test (or somewhere in the nest of contexts you had to traverse to reach it).
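
Taking the earlier example, the same spec without the subject might look something like this (the SheepPen class name is invented, since the original was anonymised):

it "checks for disguised wolves during initialisation" do
  Sheep.should_receive(:is_disguised_wolf?)

  SheepPen.new  # the trigger is explicit, right next to the expectation
end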

When, Then, Given

I’m sure you’re starting to see a pattern here. Most of my annoyances are born from testing styles that increase the spread of focus required to understand what is happening. The idea of a unit test is that it tests a unit. I expect them to be concise and self-contained.

The way I try to achieve this is by following the “When, Then, Given” pattern. I’m assuming that you’re familiar with the BDD given/when/then (or the synonymous arrange/act/assert) philosophy of test structure. When I write a unit test, I first start by mentally picturing these as three distinct sections of my test. At times, when trying to make this pattern explicit or share it with a new team, I express the sections through comments:

it "does something" do
  # Given
  ...
  # When
  ...
  # Then
  ...
end

So far so good. However, the mistake people make from here is to start at the top and work through the sections in order, filling them in as they go. I wouldn’t do that. Start with the When. This should be one line, and the most obvious and easiest to write. When I look at a test written in this format, I can immediately recognise the trigger for the scenario.

Next, move on to the Then. This is the next easiest section to fill in: what is the expected outcome?

Only then should you go back and complete the Given. In order to achieve this result, what data is required? And keep it all in the test! Again, this advice is flexible, and I can understand there are times when you’d want to pull things out, but I try to resist the urge as much as possible (in the rewritten version of the test at the end of this post, you’ll notice I did use one let block). Just keep it in check, or a few months later your test will have grown into a mess. Keep the discipline.

This also brings us back to another reason why I hate let blocks. You can no longer read a spec and understand why the class under test is behaving the way it does; the data required to achieve that result is now spread all around. On top of that, while you’re following all these let blocks to see what your test is doing, you’re chasing a load of red herrings, as much of the data is totally irrelevant to the specific scenario.
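
Written out in full, an invented example built in that order (When first, then Then, with the Given filled in last) might end up looking like this; the sections still read top to bottom as Given/When/Then, it’s only the order you write them in that changes:

it "rejects a sheep with no wool" do
  # Given
  sheep = Sheep.new(wool_count: 0)
  farm  = Farm.new

  # When
  result = farm.add(sheep)

  # Then
  result.should be_false
  farm.sheep.should be_empty
end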

Shared Examples = Shared pain

I couldn’t find a small enough sample to include as a case study, but shared examples are just a nightmare! Imagine all the trouble of going back and forth across a long test file trying to make heads or tails of it, and then multiply that by 10. Just avoid.
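
For anyone who hasn’t run into them, the shape of the problem looks roughly like this (a generic sketch, not taken from the codebase in question): the expectations live in one block, the objects and data they rely on live in another, and the reader gets to stitch the two together before the failure message makes any sense.

shared_examples "a saveable record" do
  it "persists the record" do
    expect { subject.save }.to change { described_class.count }.by(1)
  end
end

describe Farm do
  let(:valid_params) { {size: "large"} }  # defined here, consumed over in the shared example via subject
  subject { Farm.new(valid_params) }

  it_behaves_like "a saveable record"
end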

Less duplication = more lines of code… right?

This brings us back to our case study. I’ve attempted to rewrite a number of tests in this style, and the first time I tried, I truly did expect there to be more lines of code. I was OK with that; I was willing to accept it as a trade-off for the increased readability and comprehensibility of the tests.

The original test was 162 lines of code.

The rewrite…101.

Technically speaking there is more code: the lines are longer, but there are fewer of them, and each one carries more meaning. I’ve seen the same result in almost every test I’ve rewritten, and it surprised me.

So here’s the finished result: the test above as I would have written it. It’s not perfect, and perhaps it’s just me, but I find this style so much easier to comprehend and get my head around.

require "spec_helper"

describe NewFarm do

  let(:valid_params) { {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor", user_id: 1321} }

  it 'only returns base errors when there are no other errors on the sheep' do
    sheep_collection = double(Service::SheepCollection, valid?: false, errors: {base: ["no wool", "pretty sure it's a disguised wolf"]})
    service = double("service", generate_sheep: sheep_collection)

    new_farm = NewFarm.new(service, valid_params)
    new_farm.save.should be_false
    new_farm.formatted_base_errors.should == "no wool, pretty sure it's a disguised wolf"
  end

  it 'has no formatted base errors when there are both errors on base as well as other errors on the farm' do
    sheep_collection = double(Service::SheepCollection, valid?: false, errors: {base: ["farm errors should take priority"]})
    service = double("service", generate_sheep: sheep_collection)

    new_farm = NewFarm.new(service, {breed: nil})
    new_farm.save.should be_false

    new_farm.formatted_base_errors.should be_nil
  end

  it 'returns nil when trying to format base errors on a valid farm' do
    sheep_collection = double(Service::SheepCollection, valid?: true, errors: {})
    service = double("service", generate_sheep: sheep_collection)

    new_farm = NewFarm.new(service, valid_params)
    new_farm.save

    new_farm.formatted_base_errors.should be_nil
  end

  context "#save" do
    it 'generates sheep on successful save' do
      expected_attributes = {size: "large", colour: "#E0E0EB", breed: "Greyface Dartmoor",
                             user_id: 1321, fake_sheep: nil, session_id: "a uuid", value: 0}
      RandomGenerator.stub(:uuid).and_return("a uuid")

      service = double("service")
      service.should_receive(:generate_sheep).with(expected_attributes).and_return(double("sheep collection", id: "id", valid?: true))

      NewFarm.new(service, valid_params).save
    end

    it "persists the sheep" do
      service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true))
      RandomGenerator.stub(:uuid).and_return("a uuid")

      NewFarm.new(service, valid_params).save

      Farm.last.uuid.should == "a uuid"
    end

    it "returns the sheep uuid" do
      service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true))
      new_farm = NewFarm.new service, valid_params

      new_farm.save
      new_farm.farm_id.should == Farm.last.uuid
    end

    it "associates the sheep collection with the farm" do
      service = double("service", generate_sheep: double("sheep collection", id: "an id", valid?: true))

      NewFarm.new(service, valid_params).save

      Farm.last.sheep_collection_id.should == "an id"
    end

    it "puts an error on the colour when it doesn't make sense for the breed" do
      service = double('service', generate_sheep: double("sheep colletion", id: "an id", valid?: true))
      farm = NewFarm.new(service, {size: "newborn", colour: "#ff69b4", breed: "Badger Face Welsh Mountain"})

      farm.save

      farm.errors[:colour].should == ["Badger Face Welsh Mountain sheep are not #ff69b4"]
    end

    it "adds errors and doesn't create a farm when input is missing" do
      service = double("service", generate_sheep: double("sheep collection", id: "id", valid?: true))
      new_farm = NewFarm.new service

      expect{ new_farm.save }.not_to change{ Farm.count }
      [:size, :colour, :breed].each{ |attr| new_farm.should have(1).error_on attr }
    end

    it 'puts sheep collection errors onto self when there are errors' do
      sheep_collection = double("sheep collection", valid?: false, errors: {base: ["Error"], size: ["oh noes"]})
      service = double("service", generate_sheep: sheep_collection)

      new_farm = NewFarm.new(service, valid_params)
      new_farm.save

      new_farm.errors[:size].should == ["oh noes"]
      new_farm.errors[:base].should =~ ["Error"]
    end
  end
end

(CC image by leg0fenris)
