Premature Optimization
Summary
TL;DR: The video script discusses the pitfalls of premature optimization in software development. It emphasizes the importance of balancing velocity (speed of adding new features), adaptability (the system's ability to change), and performance. The speaker argues that focusing too early on performance can hinder progress, while chasing pure velocity creates technical debt. Instead, they advocate for a measured approach: choosing data structures wisely, profiling for hotspots, and making educated guesses about what the code is doing underneath. The key message is to optimize only when necessary and to prioritize real-world problem-solving over micro-optimizations.
Takeaways
- Premature optimization is often unnecessary and can be harmful, as it diverts attention from more pressing issues.
- Performance discussions are not always valuable; they matter when they address real-world problems.
- Performance should be viewed as one corner of a tradeoff triangle, balanced against velocity and adaptability.
- Focusing solely on velocity can lead to technical debt, which slows down development in the long run.
- Adaptability involves designing code for future changes, but overdoing it can hinder velocity and performance.
- A crystal ball (predicting future requirements) would be ideal, but in reality, over-engineering for unlikely scenarios is wasteful.
- The importance of performance varies with the project's stage; early development may prioritize features over performance.
- Facebook's initial use of PHP, despite its inefficiencies, allowed for rapid development and growth.
- Measuring performance is crucial; assumptions about what makes code faster should be tested empirically.
- Premature micro-optimizations are often pointless when modern computers are fast enough to handle them.
- Profiling can reveal the true performance bottlenecks, guiding where to focus optimization efforts.
Q & A
What is the root of all evil according to the speaker?
-The speaker believes that premature optimization is the root of all evil.
Why are most conversations about performance considered a waste of time?
-They are a waste of time because people often prioritize performance over other important aspects like velocity and adaptability, without considering the bigger picture.
What is the tradeoff triangle in software development?
-The tradeoff triangle consists of velocity (how quickly new features are added), adaptability (how well the system can change to new requirements), and performance (the efficiency of the system).
What is the risk of focusing solely on velocity?
-Focusing solely on velocity can lead to creating technical debt, which may slow down development and maintenance in the future.
How does adaptability affect velocity and performance?
-Adaptability, which involves writing reusable and extensible code, can increase velocity by reducing the size of changes needed for new features. However, it can also hurt performance due to added abstraction and indirection.
What should be the approach to performance optimization based on the stage of a project?
-The approach depends on the project stage. Early in development, focus on velocity and adaptability. As the project nears completion, shift focus to performance to ensure a smooth final product.
Why did Mark Zuckerberg choose PHP for Facebook initially?
-Zuckerberg chose PHP because it allowed him to build Facebook quickly, despite its inefficiencies and scaling issues.
What are the two types of performance issues the speaker differentiates?
-The speaker differentiates between macro performance (design-level performance considerations) and micro performance (fine-tuned performance at the module or function level).
Why is premature optimization typically a concern for micro performance?
-Premature optimization for micro performance often involves unnecessary fine-tuning that may not yield significant benefits and can lead to wasted effort.
What is the importance of measuring in performance optimization?
-Measuring is crucial because it helps validate assumptions about what will improve performance. It ensures that efforts are directed towards actual bottlenecks rather than perceived issues.
What should be the first step in addressing performance issues?
-The first step should be to identify and address 80% moves, such as changing data structures or algorithms that can have a significant impact on performance.
Outlines
The Tradeoff Triangle: Performance, Velocity, and Adaptability
The paragraph discusses the importance of balancing performance with velocity (speed of adding new features) and adaptability (the system's ability to change). It emphasizes that focusing on performance too early can hinder future development, while chasing pure velocity creates technical debt. The speaker suggests that the right design can increase velocity by reducing the changes needed for new features, but over-adaptability can hurt both velocity and performance. The stage of the project should guide the focus on these elements, with mature projects prioritizing performance and early-stage ones focusing on feature development and system extensibility. The speaker also differentiates between macro and micro performance, with the latter often being the subject of premature optimization.
The Futility of Premature Optimization
This paragraph delves into the concept of premature optimization, where developers might focus on micro-optimizations that have negligible impact on performance. The speaker uses the example of the pre-increment (++i) and post-increment (i++) operators in C++ to illustrate how such optimizations can be unnecessary and time-consuming. They argue that unless a performance issue is proven to be caused by a specific function, readability should take precedence. The paragraph also touches on the importance of measuring performance and the role of data structures in achieving significant performance improvements.
Optimizing for Real-World Impact
The speaker advocates for a practical approach to optimization, starting with changes that could have a significant impact (80% moves) and using data structures effectively. They suggest profiling to identify performance hotspots and making educated guesses about the underlying code operations. The paragraph highlights that memory allocation and proximity of critical elements in memory can affect performance, but these considerations should only come into play after more straightforward optimizations have been exhausted. The speaker concludes by emphasizing the need for a real performance problem before focusing on optimization and encourages a measured approach to finding and addressing performance issues.
Keywords
Premature Optimization
Tradeoff Triangle
Velocity
Adaptability
Technical Debt
Performance
Macro Performance
Micro Performance
Data Structures
Profiling
Optimization
Highlights
Premature optimization is considered the root of all evil.
Performance discussions are often a waste of time, not because performance is unimportant, but because people's strong feelings about it crowd out the bigger picture.
Performance is part of a tradeoff triangle involving velocity (speed of adding new features) and adaptability (system's ability to change).
Focusing solely on velocity can lead to technical debt and hinder future development.
Adaptability involves writing code for easy changes and maintaining reusable, extensible components.
High adaptability can sometimes reduce velocity and negatively impact performance due to increased abstraction and indirection.
The balance between velocity, adaptability, and performance depends on the project's stage.
Mark Zuckerberg's choice to write Facebook in PHP, despite its inefficiencies, allowed for rapid development and growth.
Performance should not be the first problem addressed; it's important to be deliberate about optimization strategies.
Performance issues can be categorized into macro (design-level) and micro (fine-tuned) performance.
Premature optimization often occurs at the micro level and may not be necessary given the speed of modern computers.
The difference between pre-increment (++i) and post-increment (i++) operators in C++ is negligible after optimization.
Function calls are often not the root of performance issues, and readability should be prioritized unless proven otherwise.
Data structure selection is crucial for performance optimization, as it can significantly affect the outcome.
Profiling tools can identify hotspots in code, guiding developers to areas that need optimization.
Memory usage and allocation patterns play a significant role in performance, with proximity in memory leading to faster execution.
Optimization should begin with addressing real performance problems, making significant changes, and then profiling and analyzing code.
Transcripts
I truly do believe that premature optimization is the root of all
evil.
Most conversations about performance
are a total waste of time, not because performance isn't important,
but because people feel very, very strongly about performance.
I tend to think of performance as an element of this tradeoff triangle.
Velocity, here, is how quickly one adds new features and adaptability
is how well the system can change to new requirements.
You might think that velocity and adaptation go hand in hand, and they do.
But sometimes one can hurt the other.
Focusing on pure velocity means hacking together something as fast as possible.
Taking the shortest path to the feature.
Future maintainers be damned, and at the beginning
you'll gain a lot of immediate velocity as you hack things together.
But as you do this, you're creating a bunch of technical debt
which will weigh you down to a halt.
Adaptability is about writing
the code in a way to enable changes as new requirements come in.
Think creating reusable extensible components,
beautifully crafted interfaces and configurability.
With the right designs, you can increase velocity by reducing
the size of changes needed to add new features.
But if you make things too adaptable, you also hurt velocity.
If you had a crystal ball and could see the future,
you'd be able to know exactly which use cases you'd need to design for
and also which ones you don't need to design for.
But when you
build a highly adaptable system that can adapt to a whole bunch of cases
that never happen, then all of that just ends up being a big waste of time.
With highly extensible systems, you also hurt performance.
Adding lots of adaptability naturally adds more abstraction and indirection
in your code, which usually has a negative impact on performance.
But this trade off is usually worth it because of the adaptability it allows.
You might think you always want something in the middle,
but I actually believe this depends on the stage of your project.
A feature complete game
pushing final ship date would focus on performance.
You might be okay with reducing the velocity of new changes
and adaptability in order to squeeze out the last little bit of performance.
But when a game is earlier in development,
you might focus on getting more features out quickly or
building up an extensible system that lets you tweak the game freely.
When Mark Zuckerberg wrote Facebook, he did it in PHP.
PHP is an awkward language, to say the least,
and it gave Facebook a whole slew of scaling issues as Facebook grew.
They ended up making a PHP to C++ compiler.
And later, when that had a performance plateau,
they basically created a dialect of PHP called Hack.
But would Zuckerberg be better off if he wrote it in highly optimized
code to start? No, I don't think so.
I think writing in the inconsistent, inefficient
PHP was the right move because it meant that he could build Facebook quickly.
Once performance
became a real problem, engineers went to solve it.
If he focused on choosing performance over velocity, it's not clear
that Facebook would have taken off and Zuckerberg wouldn't have had the privilege
to visit Congress as much as he has.
So the key with the triangle is
that performance is a problem, but it's not usually the first problem.
And you should be deliberate about which way you're leaning.
It's useful to differentiate performance issues into two camps:
macro performance, which I think of as design
level performance - these are system wide performance considerations;
and micro performance, which is fine tuned performance.
This could be looking at the perf of a single module or function.
Premature optimization usually occurs for micro performance.
This is typically where someone comments in a code review that you should do X
instead of Y because X is faster than Y, but computers are really fast.
I have some Python code that helps me generate the code animations
for this series, and that Python code does some horrific stuff.
Each character of
my code is basically in a big array and on every frame
I just linear search through it to find the relevant characters,
and it still renders
out a video fast enough that it's not worth making it any faster.
At the end of the day, you're writing code to solve a real world problem
and typically getting to the solution of that real world problem faster
is better than solving the problem slower with faster code.
Let's look at a few examples of premature optimization.
In C++, there's two operators that both let you increment a variable by one.
Proponents of ++i
will say it's faster because technically i++ needs
to make a copy of the value since when it's part of a larger expression,
the incrementation needs to happen after the expression is evaluated.
Oh man, this one gets me.
You might say, but CodeAesthetic if it's faster.
Well, then why don't you just do it?
Why don't you just always do ++i?
Because sir/madame.
Think of the precedence.
I don't know for a fact that it is indeed faster.
And so should I blindly trust Daryl because he thinks it's faster?
No, of course not.
So now I need to go through and do a thorough and lengthy investigation
into the distinction between i++ and ++i.
Now, I've got to go and disassemble it.
I write a for loop with the pre increment operator and look at what it does.
Okay.
Here it loads the value from i into a register, EAX, then
adds one to it and then puts the result back into the memory for i.
Okay. Seems straightforward.
So let's look at post increment and it's identical.
You get the same code for both. Okay.
But this is just an int.
What if it's an iterator?
We have a vector, which is just the C++ version of a list,
and you can access the elements of the vector through an object
that overrides the two ++ operators.
The post one would need to make a copy, right?
Well, let's look.
Aha. Gotcha.
Look at the difference between these two long functions.
You'll notice that this one calls the function for the post
and this one calls the function for the pre.
And the post version of this has four extra assembly lines
to make a copy of the iterator and it calls into the other operator.
So you got me.
It does cost a little bit more to do i++.
Except when you turn on optimization, those function calls completely disappear
and you end up with identical code for both cases.
So now the answer is a thorough
and totally satisfying: maybe.
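For reference, here's a minimal sketch (mine, not from the video) of what the two overloads conventionally look like on an iterator-like type; the Counter name is purely illustrative:

    struct Counter {
        int value = 0;

        // Pre-increment (++c): bump the value in place, return self by reference.
        Counter& operator++() {
            ++value;
            return *this;
        }

        // Post-increment (c++): copy the old state, bump, then return the copy.
        Counter operator++(int) {
            Counter old = *this;   // the "extra" copy people worry about
            ++value;
            return old;
        }
    };

The post-increment form is the one that nominally does extra work, but when the returned copy is never used, an optimizing compiler will typically elide it - which is why the optimized disassembly above comes out identical.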
We must measure.
After measuring the difference on my MacBook
the speed of the slowest increment is 3.4 nanoseconds.
If you notice a little counter below, this is how many worst
case increments have occurred since I started talking about this.
Is my increment really going to execute this
many times?
Is my code going to even ship?
Will my startup fail?
I spent 3 hours on this investigation.
Could have spent 3 hours building the best
dog walking app known to humanity.
Does anything really matter?
So my point with this is that you should ask
yourself, is this conversation even worth it?
If it brings you joy to figure out which is technically faster, go for it.
But I don't think this should end up on a code review because it doesn't matter.
Otherwise, I have to go through this whole thing again.
I like i++ because I think it looks prettier.
And until someone tells me that we can't ship because our app is too slow
and we've measured that the solution is to change all of our i++s
to ++is, I'm sticking with it.
This also applies to functions.
In many of my previous videos,
I've argued about using extraction to help readability.
Some have argued that functions are expensive,
but rarely has the cost of a function been so significant
that removing the function is the solution to a performance problem.
And in the rare instance that it was a solution, it doesn't mean
that it's proof that it's worth the readability loss by default.
A big issue with making performance rules to follow
is that often the rules have a lot of exceptions.
Here we have a class that keeps track of the currently logged in users.
Our first implementation just has an array which contains a list of users.
Then we have this loggedIn() method which returns
whether or not the user's logged in.
It has a simple for loop
which looks through the list searching for the current user.
During a code review, someone says, "Hey, this thing is slow.
You shouldn't just search the list of users like that.
It's slow, and you should use a set instead" - a data structure
that lets you look up unique elements much faster.
But when you measure the difference, you actually find that set is slower
when you only have a few users logged in, which is normal for your system.
I'd speculate this is because our integers in the array are
right beside each other in memory, while the implementation of set likely
allocates objects in lots of different places in memory - reducing cache hits.
But like I said before, until you've shown that this function specifically
is the leading cause of your performance issues, go with what's more readable.
So I do think that
set creates cleaner, more readable and less bug prone code.
So I'd go with that.
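The video doesn't show the exact code in this summary, but a minimal sketch of the two variants described might look like this (integer user IDs and the class names are assumptions of mine):

    #include <algorithm>
    #include <unordered_set>
    #include <vector>

    // Variant 1: the original, a flat array searched linearly.
    class LoggedInUsersVector {
        std::vector<int> users;   // user IDs, contiguous in memory
    public:
        void logIn(int id) { users.push_back(id); }
        bool loggedIn(int id) const {
            return std::find(users.begin(), users.end(), id) != users.end();
        }
    };

    // Variant 2: the reviewer's suggestion, a hash set.
    class LoggedInUsersSet {
        std::unordered_set<int> users;   // O(1) average lookup, nodes scattered on the heap
    public:
        void logIn(int id) { users.insert(id); }
        bool loggedIn(int id) const { return users.count(id) != 0; }
    };

For a handful of logged-in users, the contiguous vector can win despite the linear search, which is exactly the kind of measurement result the video describes.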
There's so many factors to performance that there's only one way
to properly optimize: measure, try something, measure again.
Measuring is critical because, like we just showed, your assumptions
of what will make things faster can make things slower.
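As a concrete illustration of "measure, try something, measure again", here is a minimal timing sketch using std::chrono (my own; workUnderTest is a hypothetical stand-in for whatever you suspect is slow):

    #include <chrono>
    #include <cstdio>

    // Hypothetical stand-in for the code under suspicion; swap in the real thing.
    long long workUnderTest() {
        long long x = 0;
        for (int i = 0; i < 100; i++) x += i;   // placeholder work
        return x;
    }

    int main() {
        using Clock = std::chrono::steady_clock;

        constexpr int iterations = 1'000'000;
        long long sink = 0;                      // keep results observable so the loop isn't removed
        auto start = Clock::now();
        for (int i = 0; i < iterations; i++) {
            sink += workUnderTest();
        }
        auto end = Clock::now();

        // Average out per-call noise over many iterations.
        std::chrono::duration<double, std::nano> perCall = (end - start) / iterations;
        std::printf("avg %.2f ns per call (sink=%lld)\n", perCall.count(), sink);
        return 0;
    }

Run it before and after a change; if the number doesn't move, the change wasn't the bottleneck.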
You can help form a hypothesis of how to make things better by doing an analysis.
Data structure selection by far is the most important,
and that's because choosing the right data structure can give dramatically
better results over the wrong data structure.
When dealing with performance issues,
I tend to think in terms of 80% moves first.
What are changes that could have an 80% reduction?
And often the only things that can get you that far are data structure changes.
Once you've implemented them and measured the difference
and still see it's not good enough, then you have to look at the smaller things.
You might not know where to start, and that's
where you'd want to look at the profiler.
A profiler can tell you what are the hotspots of your code.
It can point out functions that are the most expensive.
There was kind of a funny story about Grand Theft Auto V online.
It was so slow to launch that one player who goes by t0st wanted to figure out why.
Even though they didn't have the source code,
t0st used a profiler to figure out why it was so slow.
It turned out
pretty much all of the time was spent parsing and processing a JSON file.
A patch to fix just that one part improved the load time by 70%.
That's the funny thing with performance is that sometimes when things are slow,
it's just one silly thing slowing everything down.
For GTA V, all someone had to do was look.
After data structures and profiling
then you just have to start making educated guesses,
thinking about how your code
could be working underneath the hood and find ways to simplify stuff.
A lot of performance comes from how your code uses memory.
Allocating memory, which is done whenever you make a new object or array,
can slow things down in critical sections because the system has to find
free chunks to put stuff in.
You'll get a lot of speed for having critical elements close by in memory
because things are much, much faster when they're in the CPU's cache.
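To make the locality point concrete, here's a small sketch (my own, not from the video) contrasting a contiguous layout with a pointer-chasing one; the Particle type is just an example:

    #include <memory>
    #include <vector>

    struct Particle { float x, y, z; };

    // Contiguous: all Particles sit side by side, so iteration walks memory
    // sequentially and stays in the CPU's cache.
    float sumContiguous(const std::vector<Particle>& particles) {
        float total = 0.0f;
        for (const Particle& p : particles) total += p.x;
        return total;
    }

    // Pointer-chasing: each Particle was allocated separately, so iteration
    // hops to wherever the allocator happened to place each object.
    float sumScattered(const std::vector<std::unique_ptr<Particle>>& particles) {
        float total = 0.0f;
        for (const auto& p : particles) total += p->x;
        return total;
    }
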
But again, I wouldn't worry about these things
until you know you have a performance problem and have tried other things first.
So when you optimize:
first, have a real performance problem, then measure it.
Try to make 80% moves by swapping data structures
or moving to a well-known faster algorithm.
Profile and find hotspots.
Then, worst case, start thinking about what your code is doing under the hood.
What are some interesting cases you've hit while
trying to make code faster?
When a video of mine is getting long, I sometimes cut sections and post them
as deleted scenes on my Patreon. For this video, we talked about micro optimization,
but there's a section on my Patreon about macro optimization strategies.
If you're curious.