
To Build Big, Start Small

"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system." (John Gall)
The more I learn about programming, and the more programming I do, the more certain I am that this statement is true. Think about the big, complex, mature software systems out there that you use every day: Google, Amazon, Facebook, Twitter, etc. Or how about the development tools you use: Ruby on Rails, Stack Overflow, GitHub, Vim, etc. (These are some of my favorites; feel free to substitute your own.) If you had to design one of these systems from scratch, you would be completely and utterly overwhelmed, probably to the point of paralysis.

These systems didn't start in the imposing state they are in now. Google started out as a simple web search engine based on an insightful idea of how to rank web pages for search keywords, and it originally ran on a few servers that Larry Page and Sergey Brin cobbled together. Amazon started out only selling books from a basic web store in Jeff Bezos' garage. Ruby on Rails was extracted from Basecamp, the online project management software, as a simple MVC framework that had made that web app easier to develop.

None of these systems started out serving hundreds of millions of users, responding to billions of requests per day, or offering dozens of elegant, productivity-enhancing features. They didn't have to. At first they only needed to work for hundreds of users and do a few things really, really well. Then they needed to be flexible enough to scale.

These systems would have been terribly over-designed if they had started out trying to handle massive loads that were non-existent in their infancy. Most likely the designers would have gotten everything wrong if they had gone down that path. They would have been trying to solve problems that didn't yet exist and not solving the problems that would make the system loved enough by enough people to make scaling problems an issue. Worrying about scale too early is an instance of Big Design Up Front, and Uncle Bob Martin warns about it in Clean Code:
"It is not necessary to do a Big Design Up Front (BDUF). In fact, BDUF is even harmful because it inhibits adapting to change, due to the psychological resistance to discarding prior effort and because of the way architecture choices influence subsequent thinking about the design."
Scale is not the first problem. The first problem is building something good enough that scaling up becomes the primary problem. There's a lot of learning that happens along the way, and if the most pressing problems are always the ones being solved, the best possible knowledge gets integrated into the system at every step of its growth. The system ends up growing with and adapting to its environment as they change together, instead of coming into existence bloated, rigid, and completely ill-suited to its environment.

That all seems reasonable in general, but what does it mean when you get down to the specifics of programming the system? First, you need an agile development environment. In particular, you need tests in place so you can make sure you don't break anything that used to work. Code is going to change dramatically as the system evolves, so you want to make sure that you're never regressing. Whether you practice TDD (Test-Driven Development) or not, a good test suite enables change in a way that nothing else can, and change is the name of the game.
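
To make that concrete, here is a minimal sketch of a regression test using Python's built-in unittest module. The slugify helper and the test names are made up for illustration; the point is only that once the behavior is pinned down by tests, the function body can be rewritten freely as the system evolves.

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical helper: turn a post title into a URL slug."""
    words = title.lower().split()
    return "-".join(word.strip(".,!?") for word in words)


class SlugifyRegressionTest(unittest.TestCase):
    """Pins down current behavior so later rewrites can't silently break it."""

    def test_basic_title(self):
        self.assertEqual(
            slugify("To Build Big, Start Small"),
            "to-build-big-start-small",
        )

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Start   Small  "), "start-small")
```

Saved in a file, tests like these can be run with python -m unittest after every change, which is what makes dramatic changes safe.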

Second, decouple optimization from development. Don't try to do them at the same time. You'll do each one better if you focus on great features and functionality while you're developing, and on targeted optimizations, where they are measurably needed, while you're optimizing. If development is quick and dirty, meaning that you do the simplest thing that could possibly work, most of the system will never need to be optimized, which saves time. That leaves more time to optimize the parts that actually need it. If the architecture is done right with sane algorithms and data structures, optimizing only where it’s needed isn’t a big deal.
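
"Measurably needed" implies measuring first. As a rough sketch, assuming Python and its standard cProfile and pstats modules, this is one way to find out where the time actually goes before touching anything. The build_report, format_line, and slow_checksum functions are hypothetical stand-ins for real feature code.

```python
import cProfile
import io
import pstats


def slow_checksum(i: int) -> int:
    # Deliberately naive loop standing in for a real hotspot.
    return sum(j * j for j in range(i % 500))


def format_line(i: int) -> str:
    return f"{i}: {slow_checksum(i)}"


def build_report(n: int) -> str:
    """Simplest thing that could possibly work (hypothetical feature code)."""
    return "\n".join(format_line(i) for i in range(n))


def profile(func, *args) -> str:
    """Run func under cProfile and return stats sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


print(profile(build_report, 2000))
```

The profiler output lists each function with its call count and time, so the optimization effort can go to the one function that dominates and nowhere else.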

Of course, the system does have to be well-designed so that the parts that need optimization can be refactored in isolation without affecting much of the rest of the system. The ability to design a decoupled system comes partly from experience, but such a system also emerges naturally when it is thoroughly unit tested. Even without much experience, designing an easily optimized system is possible, provided that a good test suite grows along with the system to support the changes that need to be made.
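
Here is a small sketch of what optimizing in isolation can look like, again in Python. The function names are invented for the example. The simple version stays around as the reference, the optimized version sits behind the same signature, and a check confirms that the two agree before the swap is made.

```python
from bisect import bisect_left


def contains_simple(sorted_items: list[int], target: int) -> bool:
    """Original, obviously correct version: a linear scan."""
    for item in sorted_items:
        if item == target:
            return True
    return False


def contains_fast(sorted_items: list[int], target: int) -> bool:
    """Optimized replacement: binary search (assumes the list is sorted)."""
    index = bisect_left(sorted_items, target)
    return index < len(sorted_items) and sorted_items[index] == target


def check_same_behavior() -> None:
    """The optimized version must agree with the simple one on every input."""
    data = [1, 3, 5, 7, 9, 11]
    for target in range(-1, 13):
        assert contains_fast(data, target) == contains_simple(data, target)
    assert contains_fast([], 4) is False


check_same_behavior()
```

Because callers only depend on the signature, nothing else in the system has to change when the faster version replaces the simple one.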

Third, split code into classes when it gets painful not to, but not before. When files get to be many hundreds of lines long, when a class is doing too many different things, when sections of a class are not talking to each other, that is the time to reorganize a class into a more complex structure of classes. Don't spend time creating hundreds of miniature classes that form a dozen-level inheritance hierarchy with every derived class differing by one line of code because You Aren't Gonna Need It! Wait until you feel the pain of working with a bloated class or duplicating code between classes, and then address the pain.
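
As a hedged illustration, suppose a hypothetical Order class had grown a pile of receipt-formatting methods that never touched the pricing logic. That is the "sections of a class not talking to each other" signal, and the split below is the kind of modest reorganization it calls for. All of the names are invented for the example.

```python
class Order:
    """Knows what was bought and what it costs (prices in cents)."""

    def __init__(self, prices_in_cents: list[int]) -> None:
        self.prices_in_cents = prices_in_cents

    def total(self) -> int:
        return sum(self.prices_in_cents)


class ReceiptFormatter:
    """Extracted once the formatting code stopped sharing state with pricing."""

    def __init__(self, currency: str = "$") -> None:
        self.currency = currency

    def money(self, cents: int) -> str:
        return f"{self.currency}{cents // 100}.{cents % 100:02d}"

    def format(self, order: Order) -> str:
        lines = [self.money(price) for price in order.prices_in_cents]
        lines.append(f"Total: {self.money(order.total())}")
        return "\n".join(lines)


print(ReceiptFormatter().format(Order([250, 1000])))
```

Two classes, no inheritance hierarchy, and only because the single class had become painful to work with.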

Until you feel the pain, you can spend your time on other things. Optimization is different from bug fixing, and has a different cost structure. Bugs get more expensive the longer they live and should be squashed as soon as a test is in place to reproduce them, but optimization can wait until it is measurably needed. It may seem like you’re wasting time and money by re-architecting something in the heat of the moment when it's not performing as needed. But the current architecture did get you where you are today, and if you had tried to architect it for higher performance from the get-go, you probably wouldn’t have come up with the same design anyway. Any effort to design for possible future needs will likely be wasted because it doesn't allow the system to evolve to fit its environment.

Remember, hindsight is 20/20. Now that you know how to architect the code for higher performance for the workload that you have now, you can do it. You haven’t really wasted any time because you didn't have the problem or the experience before. Now that you have both, you can do it right. Maybe you feel that the system should have been designed right in the first place, and now it’s your time being used, not the time of the person who built it, to do it the right way. But you are working with much more pertinent information than whoever designed it in the first place. Remember that the environment and design constraints could have been entirely different when that code was first written, and now it needs to adapt to a new environment and work under new constraints. Your job is to make that happen.

When a system is designed for change, with a healthy test suite and constant doses of refactoring, adding to the system incrementally becomes easy. In contrast, trying to design a huge, complicated system up front is ridiculously hard. Avoid that rabbit hole. When designing a new system, don't think too much about the interaction of lots of features and how the system will behave at scale. You'll never make any progress. You'll be running in circles trying to keep everything in your head at once, or you'll create a mess of design documents that will detail some elaborate system that could never work in real life.

Start small. Think about what the most essential part of the system is, get that part working, and then expand the system. Mold it, slap some more functionality on, refine it, and build it up some more. The process mirrors that of any complex system: art, construction, and even living organisms. They don’t come into existence all at once. They grow and change over time to become complex systems as they mature. But they always start small.