If we look up code refactoring on Wikipedia, we find the following:
“Code refactoring is the process of restructuring existing computer code—changing the factoring—without changing its external behavior. Refactoring is intended to improve the design, structure, and/or implementation of the software (its non-functional attributes) while preserving the functionality of the software.”
Why code refactoring?
Why do we care about improving the design, structure, and/or implementation of software? From the project manager’s point of view, the goal is to reduce the cost of support and future development. That is fine, but what makes the regular developer do refactoring?
Refactoring saves a lot of future labor. Here is a very simple example. An inexperienced developer has to add a simple string validation. The validation is needed in several places (say, 20 or more). Instead of creating a separate method or class and calling it, he just copies the block of code into each of these places. The validation check works fine and everybody is happy (for the sake of the example, assume the merge request somehow passed code review)… Six months later an improvement request arrives: the string validation logic should change slightly. What should our inexperienced developer do? He has to find every place that contains the validation and change it according to the new requirement. Does this take more time than if the validation were in one place? Yes. Is it possible for the developer to miss a place? Sure.
What is the right thing to do in this case? As soon as the developer finds a second place where the validation check is needed, he should refactor his code by extracting a method or class and using it in both places. It is simple, but it will save him a lot of searching, replacing, and debugging in the future.
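As a minimal sketch of that extraction, the validation rule below lives in one function and every caller goes through it. The rule itself (3 to 20 letters, digits, or underscores) and the names `is_valid_username`, `register_user`, and `rename_user` are assumptions made up for illustration; they do not come from any real project.

```python
import re

# Hypothetical rule, for illustration only: 3-20 letters, digits, or underscores.
_USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(value: str) -> bool:
    """Single home for the validation rule; every caller goes through here."""
    return _USERNAME_PATTERN.fullmatch(value) is not None


def register_user(username: str) -> str:
    # First call site: no copy of the rule, just a call to the shared check.
    if not is_valid_username(username):
        raise ValueError(f"invalid username: {username!r}")
    return f"registered {username}"


def rename_user(old_name: str, new_name: str) -> str:
    # Second call site: reuses the same check instead of duplicating it.
    if not is_valid_username(new_name):
        raise ValueError(f"invalid username: {new_name!r}")
    return f"renamed {old_name} to {new_name}"
```

When the requirement changes six months later, only `_USERNAME_PATTERN` needs to be edited, and there is no call site to miss.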
Even this oversimplified example shows how software design can deteriorate little by little.
The other driver for refactoring is that developers love elegant solutions. They love clean design and strive for it. Well-designed systems are often an expression of our deepest desire to create software that is a piece of art. Sometimes that desire is so strong that we overcomplicate our system designs… but that is a topic for another discussion.
In summary, we would rather create software that is a piece of art than spend our time putting out fires. We want things to be flexible and easy to change, but also robust.
What is software behavior?
There are two main types of software behavior: positive and negative. Positive behavior covers all planned and implemented functionality that brings business value to the customer. Negative behavior consists mainly of bugs and other problems that prevent the customer from extracting that business value.
So, is fixing a bug refactoring? No, because it changes behavior. Is an optimization or a rearrangement of the UI refactoring? Also no.
Refactoring can lead to such changes, but any change that intentionally alters behavior is not refactoring.
Think of refactoring as a selfish act by a developer who tries to make things easier only for himself and his colleagues, without customers noticing anything.
Refactoring should not intentionally change any of the expected behavior.
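One way to make "no change in expected behavior" concrete is to run the old and the new version of a piece of code on the same inputs and compare the results. The sketch below is only an illustration under assumed names: `total_price_original`, `total_price_refactored`, and the sample cases are hypothetical, and prices are kept as whole numbers so the comparison is exact.

```python
def total_price_original(prices, discount_percent):
    # Original implementation: works, but is harder to read than it needs to be.
    t = 0
    for i in range(len(prices)):
        t = t + prices[i]
    t = t - t * discount_percent / 100
    return t


def total_price_refactored(prices, discount_percent):
    # Refactored implementation: same inputs, same outputs, clearer structure.
    subtotal = sum(prices)
    discount = subtotal * discount_percent / 100
    return subtotal - discount


def behaves_the_same(cases):
    """Return True if both versions agree on every sample input."""
    return all(
        total_price_original(prices, pct) == total_price_refactored(prices, pct)
        for prices, pct in cases
    )


# Hypothetical sample inputs: (list of integer prices, discount in percent).
SAMPLE_CASES = [([], 0), ([10, 20, 30], 0), ([10, 20, 30], 10), ([5], 100)]
```

If `behaves_the_same(SAMPLE_CASES)` returns `False`, the change altered behavior on at least one input and is no longer a pure refactoring. Agreement on a handful of samples is evidence, not proof, so the cases should cover the behavior customers actually rely on.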
How to do code refactoring?
Explaining it all in a single article would be madness. A hundred-page book could easily be written on the topic, and that is not my goal here. Instead, I am going to summarize several of the principles I consider most important. In future articles, we will discuss each of these principles and their technical realization in detail.
First on my list is breaking dependencies. Lack of any abstraction is one of the first steps toward spaghetti code (well cooked). Classes instantiating other classes directly, objects misused as global holders, no hierarchy at all or a hierarchy that is too deep, long “if/switch” statements, no separation from external libraries… All of these problems are signs of a very tightly coupled design (if any design is present at all).
We refactor that mess by introducing tiny abstractions, changing and separating interfaces, inverting dependencies, and so on. We are going to explore these techniques and principles in another article.
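To give a rough idea of what a tiny abstraction and an inverted dependency can look like, here is a minimal sketch. All names in it (`MessageSender`, `InMemorySender`, `OrderService`) are hypothetical and chosen only for illustration.

```python
from typing import Protocol


class MessageSender(Protocol):
    """Tiny abstraction: anything that can deliver a message."""

    def send(self, recipient: str, body: str) -> None: ...


class InMemorySender:
    """Stand-in implementation; a real one might wrap a mail or HTTP client."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        # Record the message instead of delivering it anywhere.
        self.sent.append((recipient, body))


class OrderService:
    """Depends on the abstraction, not on a concrete sender it creates itself."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def confirm(self, customer: str, order_id: int) -> None:
        self._sender.send(customer, f"Order {order_id} confirmed")
```

Because `OrderService` receives its sender from outside instead of instantiating one directly, the delivery mechanism can be swapped, or replaced by a recording stand-in in tests, without touching the service.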
Single responsibility (fighting code duplication)
A module, class, or method should have only one reason to exist and should be responsible for only one thing. Imagine a method that extracts data, processes it, formats it, and then sends an email. In a well-designed system, such a method would be split across several classes and interfaces.
Most of the time, the lack of a clear responsibility does not appear overnight. It is the cumulative result of bad practices and laziness over a long period. A method, class, or module that once had a single, clear responsibility can grow into something vague and unclear through a lack of discipline in the dev team.
We solve responsibility problems with refactorings such as “extract class”, “extract method”, “introduce hierarchy”, and many others.
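As an illustration, the do-everything method described above (extract, process, format, send) could be split along the following lines. This is only a sketch: the function names, the shape of the data, and the `send` callable that stands in for real email delivery are all assumptions.

```python
def extract_orders(raw_rows):
    """Extraction only: turn raw (customer, amount) rows into dictionaries."""
    return [{"customer": customer, "amount": amount} for customer, amount in raw_rows]


def total_per_customer(orders):
    """Processing only: sum the amounts per customer."""
    totals = {}
    for order in orders:
        name = order["customer"]
        totals[name] = totals.get(name, 0) + order["amount"]
    return totals


def format_report(totals):
    """Formatting only: one line per customer, sorted by name."""
    return "\n".join(f"{name}: {amount}" for name, amount in sorted(totals.items()))


def send_report(report, send):
    """Delivery only: `send` is any callable that accepts the report text."""
    send(report)


def run_report(raw_rows, send):
    """Thin coordinator that wires the single-purpose steps together."""
    orders = extract_orders(raw_rows)
    totals = total_per_customer(orders)
    send_report(format_report(totals), send)
```

Each function now has one reason to change: a new report layout touches only `format_report`, and a new delivery channel touches only the `send` callable that is passed in.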
Lack of expressiveness
Have you ever read a piece of code 20 times and still not grasped the intention behind it? And yet the code works perfectly! That is a lack of expressiveness. In most cases, the intention of the code is unclear because of bad naming, bad formatting, magic strings and numbers, poor grouping of statements, and so on. Sometimes, however, the lack of expressiveness comes from overengineering and bad design practices. The latter is far more problematic for refactoring.
If the original developer is not present, the lack of expressiveness leads to a long debugging session…
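A small before-and-after sketch of the simpler case (bad naming plus magic values) might look like this. The business rule, the dictionary keys, and every name here are invented for illustration.

```python
def chk(u):
    # Before: cryptic name, a magic number, and a magic string.
    return u["a"] >= 18 and u["s"] == "A"


# After: the magic values get names that explain what they mean.
MINIMUM_AGE = 18
STATUS_ACTIVE = "A"


def can_place_order(user):
    # Same behavior as chk(), but the names state the intention.
    is_adult = user["a"] >= MINIMUM_AGE
    is_active = user["s"] == STATUS_ACTIVE
    return is_adult and is_active
```

Both functions return the same result for every input; only the second one tells the reader why.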
These three categories are the first things I look for when I am refactoring.
My advice to you is to always start with a holistic approach. Know how each change you introduce (even a small one) will affect other parts of the system. This is especially true when you are refactoring badly written legacy systems.