A while ago, an ex-colleague of mine told me he had been given the task of refactoring an enormous legacy MS SQL stored procedure. It was interesting to see how enthusiastic and happy he was, not realizing the pain ahead… Well, two weeks later the “refactor” was nowhere near finished, and management gave him another task because they had realized that this one was a lost cause.
Is he a bad developer? Not at all! Only the approach to that refactoring was wrong.
Let us recall what “refactoring” means. Refactoring is changing the internal structure of code without altering its external behavior. In short, by simplifying the code and rewriting it into a well-structured, extensible form, we reduce the cost of future development (and make our lives as developers much easier). Simple, right?
Refactoring is not an event, refactoring is a daily habit
If you wait for the right moment to refactor, you are lost. When you see a problem, fix it right away. Do not let it grow beyond your ability to fix it. Is it hard to rename a badly named method or clarify some magic string? No, it is not. Refactoring is a continuous process consisting of thousands of baby steps.
Some of you may say, “Doing that, we won’t have enough time for writing new features!” I have even heard some managers say it… So, in your opinion, what is more time-consuming: extracting code to a single method/class, or changing duplicated code in 320 files at a later stage of the project?
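To show how small such a baby step can be, here is a minimal Python sketch. The names `chk`, `is_order_active`, and `ORDER_STATUS_ACTIVE`, as well as the meaning of the string "A", are made up for illustration.

```python
# Before: a vague name and a magic string (hypothetical example).
def chk(o):
    return o["status"] == "A"


# After: two baby steps, external behavior unchanged.
# Step 1: give the magic string a name (its meaning here is an assumption).
ORDER_STATUS_ACTIVE = "A"


# Step 2: rename the function so it says what it checks.
def is_order_active(order):
    return order["status"] == ORDER_STATUS_ACTIVE
```

Both versions return the same result for every input, which is exactly the point: the structure changed, the behavior did not.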
The example above is something I have seen in reality. Try to guess what the cost was for adding even simple new features.
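To make the “extract to a single method” option concrete, here is a small Python sketch. The VAT rule, the rate, and all function names are hypothetical. The only point is that the rule lives in one place instead of being copied into every caller.

```python
VAT_RATE = 0.20  # hypothetical rate, for illustration only


def price_with_vat(net_price):
    """The single place that knows how VAT is applied."""
    return round(net_price * (1 + VAT_RATE), 2)


def invoice_total(net_prices):
    # Reuses the shared rule instead of repeating the formula.
    return round(sum(price_with_vat(p) for p in net_prices), 2)


def cart_preview(net_price):
    # Another caller that would otherwise carry its own copy of the formula.
    return f"{price_with_vat(net_price):.2f}"
```

If the rule changes, only `price_with_vat` has to be touched, no matter how many callers there are.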
Refactor when you have full knowledge of the expected behavior
If you do not know what the desired result is, or some elements are unclear, you can easily make the wrong assumptions. This is especially true if you are refactoring someone else’s code (or even code you wrote 6 months ago).
If the needed refactoring is more complex, start by looking for the different interfaces. Analyze how the modules/classes communicate with each other. Do not go into much detail (for now). If the code is a mess, the contracts between the modules will be unclear and the responsibilities scattered around. Only with strong knowledge of the business domain will you be able to define and isolate the proper boundaries and responsibilities.
When the main structures and flows are set, you can continue with their sub-elements and details.
Refactor when a solid number of unit tests backs you up
Refactoring without unit tests is a sin, and we have all been sinners at some point. What is the solution? Just write unit tests. Even if you cannot reach a high percentage of coverage, at least write unit tests for the most important features.
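One way to start is to pin down what the code does today before touching it. Below is a minimal sketch using Python’s standard `unittest` module. The function `legacy_shipping_cost` and its pricing rules are invented for illustration, and the expected values are simply what the current code returns.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical legacy code that we want to refactor later.
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2
    return cost


class ShippingCostBehavior(unittest.TestCase):
    """Pins the current behavior so a refactor cannot change it silently."""

    def test_light_parcel_pays_base_price(self):
        self.assertEqual(legacy_shipping_cost(2, express=False), 5.0)

    def test_exactly_ten_kg_has_no_surcharge(self):
        self.assertEqual(legacy_shipping_cost(10, express=False), 5.0)

    def test_heavy_parcel_pays_surcharge(self):
        self.assertEqual(legacy_shipping_cost(14, express=False), 7.0)

    def test_express_doubles_the_cost(self):
        self.assertEqual(legacy_shipping_cost(14, express=True), 14.0)
```

Saved in a file, these tests can be run with `python -m unittest`. They do not cover everything, but they protect the cases that matter most while the internals change.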
Refactoring should not lead to overengineering
This is a lesson I learned about 11 years ago, when I was a junior developer. I was given a simple task: I had to reverse the logic in an “if statement” and add one more case. I ended up with a strategy pattern implementation, adding several more classes and methods. In addition, it was buggy :). At the end of the day, one of the senior developers said to me, “Mitko, you played a lot today. Please revert and just add the ‘if statement’.”
Sometimes you just have to add one more “if statement”. When? Sadly, only experience can tell you that.
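For illustration only, here is roughly what such a change can look like when it stays small. This is not the original code from that story. The function `shipping_label` and its fields are hypothetical.

```python
def shipping_label(order):
    # The original check was `if order["paid"]:`; the task reversed it...
    if not order["paid"]:
        return "HOLD"
    # ...and added one more case.
    if order["express"]:
        return "EXPRESS"
    return "STANDARD"
```

Two conditions and three outcomes fit comfortably in one function. A strategy class per outcome would add files and indirection without adding any behavior.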
Breaking point
Taking all of this into account, we can see what was wrong with the task given to my ex-colleague. Everything. He did not have full knowledge of the problem. He was not even the author of the stored procedure. In fact, that stored procedure had been written over the years by many people using “duct tape and rubber bands”. There were no tests to back him up (How many of you create unit tests for stored procedures :)?). The code had already been a total mess before he started the refactor.
Is it possible to refactor such a mess? Sure! However, the benefits of such a big investment of time are highly doubtful and rarely justified.