My failed explanation!
Today I was asked to explain the multi-stage build pattern to someone, and I did an exceptionally horrible job of it. I think I had two things going against me:
- I was put on the spot and I wasn't in the right mindset.
- I didn't have an example in front of me.
Sure, I'm making excuses, but I'd like to look at the reasons why and hopefully make some adjustments for the future. That being said, my mind doesn't always seem to cooperate with me. I have friends, coworkers, and family members who can recite things they learned years ago at a moment's notice. I didn't get that gene!
I'm not sure about any of you, but I typically need a bit of a refresher, a little nudge, or at a minimum an example in front of me. Especially when it comes to explaining things that I'm not used to explaining. I use Docker all the time but I don't need to create Dockerfiles every day, so I rarely think about the structure of them.
It's like trying to play a tune on the guitar when you haven't played it in a while. If you can just hear it, or see the first couple of notes, or better yet, join in with someone - it all comes rushing back. I know this tends to be muscle memory, but the same can be said for other things.
I did an awful job trying to explain what it was. I think I was trying to go for a high concept but just failed miserably on the delivery. I mentioned the multiple-image references but failed to explain the why!
Anyway, as my penance, I'm writing this article to see if I can explain it better than I did earlier today. AND hopefully remember it better the next time someone asks me. Going through the process of writing it out should give me a clearer idea of how to explain this to others.
Docker Multi-Stage Build Pattern Explained
The goal of the multi-stage Docker build pattern is to keep the final image as small as possible. Before multi-stage builds, you would typically either write a single Dockerfile that produced an image larger than it needed to be, or write separate Dockerfiles that each did one job and nothing more. Each Dockerfile would produce a set of artifacts needed by the next one. While this sounds good in theory, it can become a maintenance nightmare.
For example, assume you are building an image for a .NET Core application. You want your Dockerfile to pull in the source code, compile it, and publish it to produce the final product. In order to actually compile your application (within a Docker container), you need the SDK. If you included that in the final output, you would end up with a much larger image than necessary. At the end of the day, all you want is a Docker image with the .NET Core runtime and your published files, not the entire SDK and source files along with it.
This can be done in a couple of ways:
- With multiple Dockerfiles, or
- With a single multi-stage Dockerfile.
* Side note: I know there are some other patterns and tools out there, like Docker BuildKit, but since I was asked to explain the multi-stage build pattern, we're going to focus on that.
How Does it Work?
In the multi-stage build pattern, you use multiple FROM lines in one Dockerfile. Each FROM line starts a stage of its own, and unless you specify otherwise, only the last stage becomes the final image. That is the magic.
We can now have a single file that creates intermediate images used during the build and outputs only the last FROM stage as the final product. This means we can define a Dockerfile that does all the work in a nice linear pattern and sheds what it no longer needs as it moves from one stage to the next.
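To make the mechanics a little more concrete, here is a minimal sketch of two stages in one file. The alpine base image, the stage name builder, and the artifact.txt file are placeholders I picked for illustration, not anything from a real project. The piece that ties the stages together is COPY --from, which pulls files out of an earlier stage.

```dockerfile
# Stage 1: a throwaway stage that produces an artifact.
# "builder" is just a name so later stages can refer to this one.
FROM alpine:3.20 AS builder
RUN echo "built artifact" > /artifact.txt

# Stage 2: the last FROM, so this is what the final image is built from.
# Nothing from the builder stage comes along unless it is copied explicitly.
FROM alpine:3.20
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

A plain docker build of this file produces an image from the second stage only. The "unless you specify otherwise" part is the --target flag: docker build --target builder stops at the named stage and outputs that one instead.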
What Does it Look Like?
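Here is a rough sketch of what the .NET example described earlier could look like as a single multi-stage Dockerfile. Treat it as an illustration under a few assumptions rather than a drop-in file: MyApp.dll is a hypothetical assembly name, the 8.0 image tags are simply the ones I chose, and it assumes the build context contains exactly one project so that dotnet publish can find it without being told which one.

```dockerfile
# Stage 1: compile and publish using the full SDK image.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
# Copy the source code into the build stage.
COPY . .
# Publish the compiled output to a known folder.
RUN dotnet publish -c Release -o /app/publish

# Stage 2: the final image, based on the much smaller runtime image.
FROM mcr.microsoft.com/dotnet/runtime:8.0
WORKDIR /app
# Bring over only the published files; the SDK and source stay behind.
COPY --from=build /app/publish .
# MyApp.dll is a placeholder for the published assembly name.
ENTRYPOINT ["dotnet", "MyApp.dll"]
```

The first stage needs the SDK to compile, but the final image only ever sees the runtime and the published output. For a web application you would likely swap the runtime image for the ASP.NET one.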
Do You See Anything Wrong?
If you see something wrong, feel free to let me know. I have pretty thick skin and I'm always looking to learn and improve.