Ask HN: How do you use Docker in production?

I know what Docker is and how it works. What are some problems that get solved better when using Docker?

  • Here are the problems we're solving with Docker:

    * Sanity in our environments. We know exactly what goes into each and every environment, which are specialized based on the one-app-per-container principle. No more asking "why does software X build/execute on machine A and not machines B-C?"

    * Declarative deployments. Using Docker, CoreOS, and fleet[1], this is the closest solution I've found to the dream of specifying what I want running across a cluster of machines, rather than procedurally specifying the steps to deploy something (e.g. Chef, Ansible, and the lot). There have been other attempts at declarative deployments (Pallet comes to mind), but I think Docker and fleet provide even better composability. This is my favorite gain (there's a sketch of a fleet unit at the end of this comment).

    * Managing Cabal dependency hell. Most of our application development is in Haskell, and we've found we prefer specifying a Docker image to working with Cabal sandboxes. The same gain applies on other platforms: you can replace virtualenv for Python and rvm for Ruby with Docker containers.

    * Bridging a gap with less-technical coworkers. We work with some statisticians. Smart folks, but getting them to install and configure ODBC & FreeTDS properly was a nightmare. Training them in an hour on Docker and boot2docker has saved so much frustration. Not only are they able to run software that the devs provide, but they can contribute and be (mostly) guaranteed that it'll work on our side, too.

    I was skeptical about Docker for a long time, but after working with it for the greater part of the year, I've been greatly satisfied. It's not a solution to everything—I'm careful to avoid hammer syndrome—but I think it's a huge step forward for development and operations.

    [1]: https://coreos.com/using-coreos/clustering/

    Addendum: Yes, some of these gains can be equally solved with VMs, but I can run through /dozens/ of iterations of building Docker images by the time you've spun up one VM.
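
    To make the "declarative" point concrete, a fleet unit for a service is just a small file (names invented for the example):

        [Unit]
        Description=Example app
        Requires=docker.service
        After=docker.service

        [Service]
        ExecStart=/usr/bin/docker run --rm --name app example/app
        ExecStop=/usr/bin/docker stop app

    Running `fleetctl start app.service` declares "this should be running somewhere in the cluster" and lets fleet pick the machine, versus scripting the imperative steps yourself.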

  • I used Docker to solve a somewhat unconventional problem for a client last week. They have a Rails application that needs to be deployed in two vastly different situations:

    * a Windows server, disconnected from the internet

    * about 10 laptops, intermittently connected to the internet

    Docker let us build the application once and deploy it in both scenarios with much less pain than the current situation, which basically consists of a script to git-pull plus over-the-phone instructions whenever dependencies like Ruby or ImageMagick need to be upgraded.

    We run VirtualBox with a stock Ubuntu 14.04 image and Docker installed from the Docker-hosted deb repo. We use the Phusion Passenger Ruby image[1], which bundles almost every dependency we needed, along with a useful init system so we can run things like cron inside a single container alongside the application. This makes container management trivial to do with simple scripts launched by non-technical end users.

    [1]: https://github.com/phusion/passenger-docker
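
    The launch itself then boils down to a single `docker run`; something along these lines (paths and tag are illustrative, and the image's README covers enabling nginx/Passenger):

        docker run -d --name app -p 80:80 \
          -v /srv/app:/home/app/webapp \
          phusion/passenger-ruby21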

  • At Shopify, we have moved to Docker for deploying our main product. The primary advantages for us are faster deploys, because we can do part of the old deploy process during the container build, and easier scalability, because we can add containers to get more app servers or job workers. More info at http://www.shopify.com/technology/15563928-building-an-inter...

  • We've been using Docker for YippieMove (www.yippiemove.com) for a few months now, and it works great.

    Getting your head around the Docker philosophy is the biggest hurdle IMHO, but once you're there it's a delight to work with. The tl;dr is to think of Docker not as VMs but as fancy `chroot`s.

    In any case, to answer your question: for us it significantly decreased deployment time and complexity. We used to run VMs and provision them with Puppet (it's a Django/Python app), but it took a fair amount of time to provision a new box. What's more, there were frequent issues with dependencies (such as `pip install` failing).

    With Docker, we can more or less just issue a `docker pull my/image` and be up and running (plus some basic provisioning, of course, which we handle with Ansible).
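
    To give a feel for how little is involved, a deploy amounts to something like this (image and container names hypothetical):

        docker pull my/image
        # replace the running container with the freshly pulled image
        docker stop app; docker rm app
        docker run -d --name app -p 8000:8000 my/image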

  • Good question.

    I have learnt enough about Docker to know it's not something that solves any problem I have, but finding concrete facts about what others are actually doing with it was one of the hardest parts of the learning process.

    The official Use Cases page is so heavily laden with meaningless buzzwords and so thin on actual detail that I still feel dirty just from reading it. https://docker.com/resources/usecases/

  • We're moving all of production in EC2 from an old CentOS 5 image managed by Capistrano to CoreOS, with fleet deploying images built by the docker.io build service and a private repo. I love it.

    Every week, we rebuild our base image starting with the latest debian:stable image, apply updates, and then our apps are built off of the latest base image. So distro security updates are automatically included with our next deploy.
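
    The rebuild itself is intentionally boring; conceptually it's just this (image names illustrative):

        # base image Dockerfile: start from upstream and apply updates
        FROM debian:stable
        RUN apt-get update && apt-get -y upgrade

        # weekly job: refresh upstream, rebuild without cache so updates land
        docker pull debian:stable
        docker build --no-cache -t mycompany/base .

    App images start FROM mycompany/base, so the next app build picks up the patched packages automatically.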

    We had been deploying multiple apps to the same EC2 instances. Having each app's dependencies separate from the others' has already made upgrades easier.

    This also means all containers are ephemeral and are guaranteed to be exactly the same, which is a pretty big change from our use of capistrano in practice. I'm hoping this saves us a lot of debugging hassle.

    Instead of using ELBs internally, I'm using registrator to register the dynamic ports of all of my running services across the cluster in etcd, with confd rendering a new nginx config and reloading it within 5 seconds whenever a service comes up or drops out. Apps only need to talk to their local nginx (running everywhere) to find a load-balanced pool of whichever service they're looking for. nginx is better than ELB at logging and retrying failed requests, which gives a better user experience during things like deploys.
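
    The registrator piece, for anyone curious, is a one-liner per host; roughly this (etcd address and port are whatever fits your cluster):

        docker run -d --name registrator \
          -v /var/run/docker.sock:/tmp/docker.sock \
          -h $HOSTNAME progrium/registrator etcd://127.0.0.1:4001

    confd then watches that keyspace and rewrites/reloads the local nginx config whenever the set of backends changes.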

    Some of these things could be solved by spinning up more EC2 instances. However, that usually takes minutes, whereas Docker containers take seconds, which changes the experience dramatically.

    And I'm actually reducing my spend by being able to consolidate more. I can say things like "I want one instance of this unit running somewhere in the cluster" rather than having a standalone EC2 instance for it.

  • Docker, CoreOS, fleet, and etcd have completely changed how I build projects. It's made me much more productive.

    I'm working on Strata, which is a building management & commissioning system for property owners of high-rise smart buildings. It's currently deployed in a single building in downtown Toronto, and it's pulling in data from thousands of devices, and presenting it in real-time via an API and a dashboard.

    So in this building, I have a massive server. 2 CPUs, 10 cores each, 128GB of RAM, the works. It came with VMware ESXi.

    I have 10 instances of CoreOS running, each identical, but with multiple NICs provisioned for each so that they can communicate with the building subsystems.

    I built every "app" in its own Docker container. That means PostgreSQL, Redis, RabbitMQ, my Django app, my websocket server, even Nginx, all run in their own containers. They advertise themselves into etcd, and any dependencies are pulled from etcd. That means that the Django app gets the addresses for the PostgreSQL and Redis servers from etcd, and connects that way. If these values change, each container restarts itself as needed.

    I also have a number of workers to crawl the network and pull in data. Deployment is just a matter of running 'fleetctl start overlord@{1..9}.service', and it's deployed across every machine in my cluster.
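
    For reference, `overlord@.service` is a fleet template unit; mine looks roughly like this (reconstructed for illustration, image name is a placeholder):

        [Unit]
        Description=Overlord worker %i
        Requires=docker.service
        After=docker.service

        [Service]
        ExecStartPre=-/usr/bin/docker rm -f overlord-%i
        ExecStart=/usr/bin/docker run --rm --name overlord-%i myrepo/overlord
        ExecStop=/usr/bin/docker stop overlord-%i

        [X-Fleet]
        Conflicts=overlord@*.service

    The Conflicts line is what spreads the nine instances across different machines.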

    With this setup, adding machines or containers is straightforward and flexible.

    Furthermore, for development, I run the same Docker containers locally via Vagrant, building and pushing as needed. And when I applied for YC, I spun up 3 CoreOS instances on DigitalOcean and ran the fleet files there.

    As I said, I've been able to streamline development and make it super agile with Docker & CoreOS. Oh, and I'm the only one working on this. I figure if I can do it on my own, imagine what a team of engineers can do.

    Very powerful stuff.

  • We're using Docker to solve these kinds of problems:

    - Running Jenkins slaves for acceptance/integration tests that run in the browser. Previously we had to configure multiple chromedrivers to spin up on different ports or be stuck running one test per machine; now we have 6 machines (down from 9) running 6 slaves each, so 36 tests run concurrently (there's a sketch of the per-machine setup at the end of this comment). That has significantly improved our deployment time (these tests gate every deployment) while reducing costs.

    - Migrating our infrastructure (around 70 instances) from EC2-Classic to AWS VPC. While I had previously done some work automating most applications with Chef, we only really managed to fully automate our machines with Docker; it was far easier than solving cookbook dependency and customization issues. We have a couple dozen Dockerfiles that fully explain how our systems run and what each application's dependencies are.

    And that's only in the month and a half since I began using Docker. I was pretty skeptical before, as it was touted almost as a silver bullet, but it comes close in many scenarios.
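
    To sketch the per-machine slave setup from the first point: each slave is just another container mapped to its own host port, so the chromedriver port juggling goes away. Something along these lines (the image name is illustrative; we build our own):

        # six isolated browser-test slaves on one machine, each on its own port
        for i in 1 2 3 4 5 6; do
          docker run -d --name slave-$i -p $((4440 + i)):4444 selenium/standalone-chrome
        done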

  • I honestly find it really depressing to see all these folks taking code and applications that would otherwise be entirely portable, and rebuilding their entire deployment and development environment around a hard dependency on Linux.

    If Docker becomes sufficiently popular, it's going to put HUGE nails in the coffin of portability and the vibrancy of the UNIX ecosystem.

  • We use docker for:

    - running graphite (I can't say it was less painful to launch, since the Dockerfile was a bit outdated and I also had to figure out some persistence issues, but overall I'm happy it's all virtualized and not living on the server itself)

    - building our Haskell projects per feature (you run a container per feature, which spares you the pain of switching between features when you need to build one)

    - running tests (for each feature we start a container with the whole infrastructure inside: all the databases, projects, etc.)

    - running staging, also with one container per feature

    Very useful stuff compared to the alternatives, I should say. And quite easy to work with once you've played a bit with Docker's API.

  • We use it for everything.

    We use the Google Compute Engine container-optimised VMs, which make deployment a breeze. Most of our Docker containers are static, apart from our application containers (Node.js), which are automatically built from GitHub commits. Declaring the processes that should run on a node via a manifest makes things really easy: servers hold no state, so they can be replaced fresh with every deployment, and it's impossible to end up with manual configuration, which means there's never a risk of losing some critical server and being unable to replicate the environment.
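
    For anyone curious, the node manifest is a small YAML file declaring the containers to run; roughly along these lines (names and image paths are placeholders; check the current GCE docs for the exact schema):

        version: v1beta2
        containers:
          - name: app
            image: gcr.io/my-project/app
            ports:
              - name: http
                hostPort: 80
                containerPort: 8000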

  • Docker revolutionized our server deployment. My company has 50 nodejs services deployed on VPS providers around the world. It allows us to completely automate the deployment of these servers regardless of the provider's APIs. When we roll out updates, we never patch a running box, we just bring the new container up and remove the old one. Super easy, super reliable, and best of all, totally scriptable.

    We also have a pretty sophisticated testing environment using Docker which creates a simulation of our servers on any developer's laptop. It's really remarkable, actually.

    We have a stateless worker process that previously required a separate EC2 instance for each worker. Even using small instances, this meant a pretty cumbersome fleet in AWS, with lag on spin-up and excess costs from workers that were brought online and finished their jobs well before the billed hour was complete.

    Using Docker, we can run several of these workers on a single box with near-instantaneous spin-up. This means we can use fewer, larger instances instead of many small ones. In turn, this makes the fleet easier to manage, quicker to scale, and less costly, because we aren't overpaying for large portions of AWS hours that go underutilized.

    I am not entirely sure that Docker was a necessity in building this, as I sort of inherited the technology. I originally pushed for a switch to pure LXC, which would have fit the existing build system better. However, given the fervour over Docker there is a lot of information out on the web, so changing the build and deployment systems has been relatively easy and quick. I bring this up because I think some tasks are better suited to pure LXC, but people seem to default to Docker due to its popularity.

  • We started using Docker a few months ago and it has really sped up our deployment process. It's not only far faster than using virtual machines for testing; it also lets you host multiple apps on one server and keep every version of your app ready to download and run. More info at http://www.syncano.com/reasons-use-docker/

  • We use docker extensively to ship a large and complex legacy platform that was designed to run as a hosted service, but was transformed into an on-premise product.

    The system is composed of several components originally designed to run on separate VMs for security reasons. Luckily, we were able to translate VM <-> Docker container, so now each component has its own Dockerfile plus a shell script for booting up and providing runtime configuration.

    Docker helps us solve several problems:

    * A canonical build. It provides a way to configure the build system, fetch all dependencies, and execute a reproducible build on different machines/environments. It's also used as documentation when engineers have no clue where settings/parameters come from.

    * A super-fast build pipeline and release repository. We use maven -> nexus, docker -> docker-registry, and vagrant -> local export for a completely automated way to bootstrap an OVF file that can be deployed at a customer site. Releases of the old platform were not automated and took the previous teams weeks (!) for a single platform.

    * A way to restrict resources. Given some security constraints from the product, lxc + docker helps us restrict memory and networking.

    * Shipping updates. We deliver automated updates through a hosted Docker registry for customers who open up the appliance to the internet. Previous teams were not able to deliver updates in time for a single hosted platform; we can now ship new releases and have them deployed in several customers' data centers in a matter of hours.

    We have been using docker in production for almost a year now and despite headaches in the beginning it's been absolutely worth it.

  • One thing I'd like to point out is OS upgrades, security patches, and package updates in general. With Docker I just rebuild a new image from the latest ubuntu image (they're updated very frequently), deploy the app, test, and then push the new image to production. Upgrading the host OS is also much less of a problem because far fewer packages are installed (i.e. it's just Docker and the base install).
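
    Concretely, the patch cycle is just a couple of commands (tagging scheme hypothetical):

        docker pull ubuntu:14.04                        # freshly patched base
        docker build --no-cache -t myapp:$(date +%F) .  # rebuild the app on top of it
        # test the new image, then push it and restart the production container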

  • At Lime Technology, we have integrated Docker into our NAS offering alongside virtual machines (KVM and Xen). Docker provides a way to eliminate the "installation" part of software and skip straight to running proven and tested images in any Docker environment. With containers, our users can choose from a library of over 14,000 Linux-based apps with ease. Docker just makes life easier.

  • I'm solving the most obvious issues Docker was meant to solve. I'm currently working alone at a company that's starting up, and yesterday I needed to spin up a server and create a REST API service to integrate with a telco's system. They asked how long it would take and I said an hour. I spun up a DigitalOcean instance, cloned my API git project, built from the Dockerfile (it's incredibly fast on SSD), and in about 30 minutes I was running nginx -> uwsgi -> flask with Python 3.4, bcrypt, and a few other packages.
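
    The Dockerfile for that kind of stack is short; a rough sketch of the idea (not my exact file, and the package list is abbreviated):

        FROM ubuntu:14.04
        RUN apt-get update && apt-get install -y \
            python3.4 python3-dev python3-pip build-essential
        RUN pip3 install flask uwsgi bcrypt
        ADD . /app
        WORKDIR /app
        EXPOSE 8080
        # nginx (outside the container) proxies to uwsgi inside it
        CMD ["uwsgi", "--http", ":8080", "--wsgi-file", "app.py", "--callable", "app"]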

    Now, all this could be done with a simple bash script too, but that would affect my main server environment. This way, when I want to stop the service or change something, I simply edit my Docker image.

    My dev environment is a Windows laptop, and I use Vagrant to spin up a server with nginx configured and Docker installed. I then use Docker to get my environment running for working on my apps. It's pretty awesome.

    Vagrant and Docker are two of the best things that have happened for me as a developer.

  • We're using Docker for a few internal projects at Stack Exchange. I've found it to be simple and easy, and it just works.

    We have a diverse development team but a relatively limited production stack - many of our devs are on Macs (I'm on Ubuntu), but our servers are all Windows. Docker makes it painless to develop and test locally in exactly the same environment as production in spite of this platform discrepancy. It makes it a breeze to deploy a Node.js app to a Windows server without ever actually dealing with the pain of Node.js on Windows.

    Also, it makes the build process more transparent. Our build server is TeamCity, which keeps various parts of the configuration in many different hidden corners of a web interface. By checking our Dockerfile into version control, much of this configuration can be managed by devs well ahead of deployment, and it's all right there in the same place as the application code.

  • Since we are still waiting for CoreOS + (flynn.io || deis.io) to mature, I modified our existing VMware VM-based approach to set up Ubuntu boxes with Docker installed. I then use fig to manage an application cluster, and supervisor to watch fig.

    When it's time to update a box, Jenkins SSHes in, calls docker pull to get the latest image, then restarts via supervisor. Any one-off docker run commands require us to SSH in, but fig provides all the env settings so I don't have to worry about remembering them. The downtime between upgrades is normally a second or less.

    The biggest thing I ran into is that each Jenkins build server can only build and test one container at a time. After each one, we delete all images, because if you already have an image Docker won't check for a new one, and this applies to all underlying images too. We cut the bandwidth by running our own Docker registry that acts as our main image source and storage.
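
    For anyone who hasn't seen fig: the whole application cluster lives in one `fig.yml`, which is where those env settings come from. A cut-down example of the shape (services invented for illustration):

        web:
          image: registry.example.com/app:latest
          ports:
            - "80:8000"
          links:
            - redis
          environment:
            - SECRET_KEY=not-the-real-one
        redis:
          image: redis

    `fig up -d` brings the whole set up, and the Jenkins upgrade step is essentially a `docker pull` of the new images followed by a supervisor-driven restart.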

  • We use Docker to deploy on Aptible, and this makes our projects entirely self-contained. With a Dockerfile in the project directory, the entire build and runtime environment is now explicitly declared.

    With "git push aptible", we push the code to the production server, rebuild the project, and run it in one command.

  • Docker explicitly violates the principles of the Twelve-Factor App. Docker apps don’t rely on any external environment. In fact, Docker demands that you store all config values, dependencies, everything inside of the container itself. Apps communicate with the rest of the world via ports and via Docker itself. The trade-off is that apps become a little bit bulkier (though not significantly), but the benefit is apps become maximally portable.

    In essence, Docker makes almost no assumptions about the app’s next home. Docker apps care about where they are even less than twelve-factor apps. They can be passed to and fro across servers—and, more importantly, across virtualization platforms—and everything needed to run them (besides the OS) comes along for the ride.

  • Satoshilabs use Docker to build the firmware for the Trezor, allowing anyone to do the same and verify the binaries.

    https://groups.google.com/forum/#!topic/trezor-dev/5MCyweTY4...

  • We're using docker in part to get round the old 'works on my machine' problem: http://www.rainbird.ai/2014/08/works-machine/

  • One of the main things I'm using it for is reproducible development environments for a rather complex project comprising nearly ten web services.

    We have a script that builds a few different Docker images that the devs can pull down and start using straight away. This is all driven from a dev repo that they clone, which provides scripts to perform dev tasks across all services (set up databases, run test servers, pull code, run pip, etc.).

    It used to take a day to set up a new dev environment; now it takes around 30 minutes, needs almost no input from the user, and boils down to: install Docker, fetch the database restores, clone the dev repo, and run the dev wrapper script.

  • We've been using Docker and CoreOS+fleet for our production environment at GaiaGPS for a few months now and have been very impressed. We use quay.io to build our repositories, triggered by GitHub commits.

    I agree with what others have said; for us, the biggest benefit is keeping our production environment up to date and stable. We're a small shop and want to waste as little time as possible maintaining our production environment. We were able to go fairly easily from one host (which occasionally went down, with downtime for every deploy) to a 3-node CoreOS cluster. We can also scale up, or even recreate the cluster, very easily.

  • At Shippable (shippable.com) we've been using Docker for over a year now, for the following use cases:

    1. Deploying all our internal components (db, message queue, middleware, frontend) in containers, with a custom service-discovery manager. Containerization has helped us deploy components separately, quickly set up dev environments, test production bugs more realistically, and, obviously, scale up very quickly.

    2. Running all builds in custom user containers, which helps us ensure security and data isolation.

    We did run into a bunch of issues before Docker was "production-ready", but the use case was strong enough for us to go ahead with it.

  • I've used docker for process isolation at two companies now. In both cases, we were executing things on the server based on customer input values, and desired the isolation to help ensure safety.

    In the first company, these were one-off import jobs that would import customer information from a URL they provided.

    In the other, these are long-running daemons for a multi-tenant service, and I need to reduce the risk that one customer could exploit the system and disrupt the other customers or gain access to their data.

    I have some other experiments in play right now in which I am packaging up various services as docker containers, but this is currently non-production.

  • Deployment. We have a legacy application that used to take about a day of configuration to deploy properly. With Docker (and some microservices goodness) we've reduced the deploy to an hour, and we're continually improving it.

  • We are running our app[1] on Google Compute Engine instances with Docker installed.

    Our app is a bunch of microservices: several Rails apps each running with Puma as the web server, HAProxy, and another Rack app (for WebSockets). We also use RabbitMQ and Redis.

    All the components are running in their own containers (we have dozens of containers running to support this app).

    We chose this path because, in case of failure, only one service goes down while the rest of the system remains nearly fully functional. Re-launching a container is very straightforward and happens quickly.

    [1]: https://dockerize.it

  • We use Docker at stylight for deploying our frontend app (WildFly). We use the Docker Hub for storing images. When we do a release we basically push to the hub, pull on all upstream hosts, and restart the containers with the new image. We have a base application image (containing Java, WildFly, etc.) which basically never changes, so builds and distribution are super fast. We really like the fact that the containers are isolated! We did run into an issue the other day where we wanted to dump the JVM heap to debug a memory leak; that should be easier with 1.3!
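
    Concretely, a release is little more than this (repository and tag invented for the example):

        # on the build machine
        docker build -t example/frontend:42 .
        docker push example/frontend:42

        # on each upstream host
        docker pull example/frontend:42
        docker stop frontend; docker rm frontend
        docker run -d --name frontend -p 8080:8080 example/frontend:42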

  • We are a small startup and host on Softlayer (we are part of their startup program).

    I would postulate this: if you are using AWS, you will not need a lot of what Docker provides. But if you are hosting your own servers, then Docker provides close-to-metal performance with stateless behavior.

    For example, when Heartbleed or Shellshock or POODLE hit the ecosystem, it took us 1 hour to recreate all our servers and be compliant.

    My biggest complaint and wish-list item is for Docker to roll Fig into itself. The flexibility to compose services/stacks is very useful, but Fig seems too closely tied to Orchard.

  • Not technically in production yet, but I use Docker for the following scenarios:

    - Build agents for TeamCity; this was one of the first scenarios, and it's been amazingly helpful so far.

    - Building third-party binaries in a reproducible environment

    - Running bioinformatics pipelines in consistent environments (using the above tools)

    - Circumventing the painfully inept IT department to give people in my group easy access to various tools

    I've also been contemplating building a Docker-based HPC cluster for a while now, though unfortunately I'm currently lacking support to make that happen.

  • At Pusher we use Docker for CI.

    I've developed a little command-line tool (https://github.com/zimbatm/cide) that can run the same environment on the developer machine and on Jenkins. It also makes the Jenkins configuration much easier, since build dependencies are all sandboxed in separate Docker boxes. The tool is aimed mainly at legacy apps and can export artefacts back to Jenkins instead of publishing Docker images.

  • I've used Docker in semi-production environments.

    Configuration and dependency management is much improved, and more efficient than with VMs.

    YAML configuration is easy to hand edit.

    Docker doesn't have to rebuild an entire image for a minor application-code or conf change; the incremental cache speeds up the build process.

    Scaleout with "worker" instances is quite easy to manage.

    For full production, Elastic Beanstalk is worth a look. I prefer to host on DigitalOcean VMs for dev/staging.

    Docker has a great local community in San Francisco.

  • I created my own tool for this: https://mcloud.io.

    The cool thing about it is that you don't need to change anything in your config when deploying. Plus, it follows Docker best practices like one process per container.

    I already can't imagine deploying anything without it. It's open source (MIT, contributions are welcome) and the current functionality will stay free forever.

    I hope it will be useful for somebody, not only for our team.

  • I recently 'Docker-ized' http://greptweet.com with https://github.com/kaihendry/greptweet/blob/master/Dockerfil...

    The main thing I really like is the Dockerfile. If Greptweet needs to move, it's a `docker build .` to set up everything on a CoreOS VPS.

  • At UltimateFanLive we use Docker on Elastic Beanstalk to speed up the scaling process. Our load goes from 0 to 60 in minutes, as we are tied to live sports data. Packages like numpy and lxml take way too long to install with yum and pip alone, so we pre-build images with the dependencies while still using the rest of the goodies on Elastic Beanstalk. Deploy times have plummeted and we keep our t2 CPU credits.

  • We use Docker to set up our testing environments with Jenkins and install the application in them. Every build is automatically installed in a Docker container, and that container is used for acceptance tests. The containers are set up automatically from a Dockerfile. It's an awesome tool for automating deployment and implementing the concepts of "Continuous Delivery".

  • I'm considering Docker for a small side project where I want to deploy a runnable Java JAR as a daemon. Getting Java paths right across different Linux distributions can be a hassle, and I'm hoping Docker will solve that. For that matter, getting a daemon (service) running correctly on different Linuxes is one more thorn I'd rather not deal with.
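
    The appeal is that a Dockerfile would settle the Java-path question once; a minimal sketch, assuming a stock Java base image and a jar name of my own invention:

        FROM java:7
        ADD myapp.jar /opt/myapp.jar
        CMD ["java", "-jar", "/opt/myapp.jar"]

    After that, `docker run -d myapp` should behave the same on any distro that runs Docker, instead of fighting each distro's init scripts.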

  • I don't know if docker would be the right service for my use case, but considering the user experience on this thread I thought I'd ask...

    I'm looking to deploy a Python-based analytics service which runs for about 12 hours per job, uploads the results to a separate server, then shuts down. At any given time there could be up to 100 jobs running concurrently.

    Is this 'dockable'?

  • It seems like a bunch of people are using Docker + CoreOS; is anyone using Docker with Marathon in production?

  • We use docker at https://cloudfleet.io to separate the application logic from the data and create a simple interface for additional apps.

    As we deploy on low-power devices, the minimal overhead of Docker is crucial for us.

  • Dokku is a great way to move off Heroku and onto something way more cost effective and useful: http://quaran.to/blog/2014/09/09/dynos-are-done/

  • Most people seem to be using Docker with distro-based images, i.e. starting with ubuntu and then adding their own app on top. Is anyone using more application-oriented images, i.e. starting with an empty image and adding just your application and its dependencies?
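
    For concreteness, the minimal version of that approach is a `scratch` base plus a statically linked binary; a hypothetical example:

        FROM scratch
        ADD myapp /myapp
        ENTRYPOINT ["/myapp"]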

  • I've used it to make an Ubuntu-packaged app available on CentOS machines. Compiling from source was a bit of a headache (lots of dependencies which also had to be compiled from source), so being able to deploy like this saved a lot of hassle.

  • Right now, SFTP as a service (https://github.com/42technologies/docker-sftp-server)

    I also used it when our landing page was wordpress-based.

  • We use docker to run tests. Honestly, we could quite happily deploy the resulting images to our production infrastructure now; we just haven't got round to it yet.

  • I'm really interested in using Docker but I'm having trouble understanding how to manage stateful applications, like databases.

  • I'm running firefox in docker right now using subuser[1].

    [1] http://subuser.org

  • I use it to grade homework: no more leftover open TCP sockets, message queues, or open files, and no more naughty students trying to delete my home directory.

  • How do you make a database run on multiple containers and get redundancy with something like postgresql?

  • Using Docker for hands-on workshops:

    next.ml

  • Not yet.

  • I'd be interested in hearing about others' production use cases.