I’ve written a few posts about refactoring already. What is still missing is a post about the basic principles, so here we go.
Rules of engaging your code…
Small compilable steps
Have you ever worked on your code for hours, then you try to compile and test it – and nothing works anymore? Usually, in such a situation you try to fix the problems for a few more hours until you realize that you only got into even more trouble. In the end, you roll back everything and start over, frustrated. Or worse, you roll back and don’t even bother to try another time.
That’s why refactoring is best done in small steps where each single step compiles and passes the tests. Why pass the tests you may ask. You’ll want to pass your tests because to not pass the tests means you changed some behavior. You either accidentally changed something you did not want to change, or you did change it on purpose, in which case you’ll also have to change the tests.
Every small step on the way is a small victory. Set a rollback point whenever your code compiles, passes the tests and when the change you made contributes to reaching the refactoring goal.
Of course, nobody wants to see a hundred commits consisting of single-line changes. But here is where version control systems like git can help. For example, I often stage a number of small changes vía
git add before I actually commit something. If a change does not compile or does not have the desired effect, I can simply go back to only staged changes. When I have staged enough changes, I either create a new commit or amend the last commit. Finally, if needed, we’re able to squash multiple commits into one.
Have a plan
Plan the small steps ahead. Try to think of a path from the current state of the code to the state you want to reach. Compare what you do to what you planned to do – that way you find out early if you are heading in the wrong direction.
Don’t stray from your path. If you find bad variable names, violation of design principles or other things that need fixing, take notes, but don’t fix them yet. Finish what you set out to do and fix your findings later. That way your changes stay focused and comprehensible. It also prevents you from being sidetracked and running into time problems. It’s of no use if at the end of the day you started a dozen refactorings and finished none of them.
Back and forth
Doing small steps sometimes requires us to do some intermediate steps that have to be undone later. For example, to move a method out of a class we might have to make other members of the class public first. Later, when the method no longer requires access to the members, we can make them private again.
Undoing things we have done a few minutes before might look like a waste of time at first. However, doing it this way gives us some security that we would not have otherwise. Compare it to crossing a pitch black room: It might be safer to feel our way along the wall, the long way round, instead of stumbling through the dark and running into furniture.
The above example also shows something else: We are allowed, sometimes even required to break coding rules during our refactorings. We can violate style guides, duplicate code and commit all kinds of atrocities, as long as our plan includes steps to fix these things.
Don’t mix things up
Sometimes you might encounter a problem that prevents you from continuing your current plan. Remember that refactoring means modifying code without changing the behavior of the larger picture. If you need to do such a change to fix your problem, there are basically two options. The first option is to bring the refactoring to an early end (but fixing the rule violations you committed) and apply the change as a next step.
The other option is to roll back the steps you have done so far, fix the problem, and apply the steps again. Rolling back does not necessarily mean throwing away everything. For example, in git, we can check out a point before the refactoring started, fix the problem there on a new branch and then rebase the refactoring onto the fixed branch.
… are more like guidelines.
The points above may seem overly pedantic and tedious. In addition, how small should a small step be? That’s up to you. If you are confident that you can do a larger step at once without running into problems, go ahead. If you are refactoring code that is unfamiliar and hard to understand, smaller steps might be appropriate.
The same goes for the number of tests you run after each step. Running a three-minutes test suite after every twenty-seconds change will slow you down terribly. It’s a compromise between investing time and sacrificing security. Depending on the size of your steps, even just compiling might be enough, as long as you run all affected tests once in a while.
git stash can be helpful here if you discover that doing one refactoring requires doing another first.
Thanks for the addition. If you are not already several commits into a refactoring, this is definitely the right thing to do.
In the “Small compilable steps” section I would like to add something. If the code that you are refactoring does not have tests that validates its behaviour I normally like to add some “characterization tests” before starting to touch the library code. The security that those tests (or any other ones) give me to refactor is a key element for the refactoring 😉
I was thinking that “never refactor without test coverage” goes without saying – but in a post like this it does not of course.