Show HN: Auto-Cluster RabbitMQ with AWS Autoscaling Groups

  • RabbitMQ is a great tool, but you should always test your deployment with a tool like Jepsen. Rabbit has historically had issues with multi-node clusters: split-brain and dropped ack'd messages following network partitions. In contrast, I've had very few issues using Rabbit shovels, a store-and-forward plugin:

    https://www.rabbitmq.com/shovel.html
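
    For anyone who hasn't used shovels, here is a minimal sketch of what declaring a dynamic shovel could look like. It only builds the path and JSON body for a PUT to the management API and does not send anything. The hostnames, queue names, and the helper name `shovel_request` are made up for illustration; check the key names against the shovel docs for your RabbitMQ version.

```python
import json
from urllib.parse import quote


def shovel_request(vhost, name, src_uri, src_queue, dest_uri, dest_queue):
    """Build (path, JSON body) for declaring a dynamic shovel.

    Hypothetical helper: it only prepares the request, it does not send it.
    """
    # Vhost and shovel name are path segments, so "/" must become "%2F".
    path = "/api/parameters/shovel/{}/{}".format(
        quote(vhost, safe=""), quote(name, safe="")
    )
    body = {
        "value": {
            "src-uri": src_uri,
            "src-queue": src_queue,
            "dest-uri": dest_uri,
            "dest-queue": dest_queue,
            # Ack the source only after the destination confirms the publish.
            "ack-mode": "on-confirm",
        }
    }
    return path, json.dumps(body, sort_keys=True)


# Hostnames and queue names below are placeholders, not real endpoints.
shovel_path, shovel_body = shovel_request(
    "/",
    "orders-to-dc2",
    "amqp://rabbit-a.example.internal",
    "orders",
    "amqp://rabbit-b.example.internal",
    "orders",
)
print(shovel_path)
print(shovel_body)
```

    The point of the design is that each broker stays a standalone node, so a partition between them delays messages instead of forcing a cluster to pick a side.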

  • Looks like a helpful project.

    Having said that, clustering is often not the hard part with rabbit, or at least configuring it isn't.

    The more difficult issues are keeping it clustered through a typical day of network hiccups in cloud environments (rabbit can be sensitive to them), preventing mnesia from going out to lunch, and, as far as scaling goes, proper planning, including queue HA strategies and ownership.

    Since rabbit doesn't scale each queue across all boxes in a way that distributes load, a smart team will use several queues, each carefully placed on a specific host that owns it and replicated to a small number of other rabbits, never the whole cluster.
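
    To make the "small number of replicas" idea concrete, here is a rough sketch of an HA policy for classic mirrored queues. It only builds the path and JSON body for a PUT to the management API's policies endpoint and does not send anything. The policy name, queue pattern, node names, and the helper `ha_policy_request` are hypothetical; the `ha-mode` / `ha-params` keys are the classic mirrored-queue ones, and newer RabbitMQ releases use quorum queues instead, so treat this as an illustration of the idea rather than a recipe.

```python
import json
from urllib.parse import quote


def ha_policy_request(vhost, name, queue_pattern, replicas=2, nodes=None):
    """Build (path, JSON body) for a mirrored-queue policy.

    Hypothetical helper: it only prepares the request, it does not send it.
    """
    # Vhost and policy name are path segments, so "/" must become "%2F".
    path = "/api/policies/{}/{}".format(
        quote(vhost, safe=""), quote(name, safe="")
    )
    if nodes:
        # Pin the queue to an explicit list of hosts.
        definition = {"ha-mode": "nodes", "ha-params": list(nodes)}
    else:
        # Keep a fixed number of copies (master included), not the whole cluster.
        definition = {"ha-mode": "exactly", "ha-params": replicas}
    definition["ha-sync-mode"] = "automatic"
    body = {
        "pattern": queue_pattern,
        "apply-to": "queues",
        "priority": 0,
        "definition": definition,
    }
    return path, json.dumps(body, sort_keys=True)


# Two copies of every queue whose name starts with "orders.".
policy_path, policy_body = ha_policy_request("/", "ha-orders", r"^orders\.")
print(policy_path)
print(policy_body)

# Alternatively, pin to named hosts (node names are placeholders).
pinned_path, pinned_body = ha_policy_request(
    "/", "ha-billing", r"^billing\.", nodes=["rabbit@host-1", "rabbit@host-2"]
)
print(pinned_body)
```

    Either way, the replica count stays fixed as the fleet grows, which is the opposite of mirroring to every node.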

    For these reasons I think this would be a nice addition to aid in setting up clusters and maintaining a fleet size. For actually scaling and operating a high-volume rabbit cluster, however, it would only be a small tool in the toolbox.

    Without knowing the whole picture of what kind of workload birthed this approach, it makes me nervous to think of rabbit being managed by autoscaling: it could decide to replace pieces of the cluster at any time, different high-volume queues could land on different hosts, mnesia could decide it doesn't want to play ball anymore, etc.

    Possibly it's for very volatile workloads where none of the queues are HA, and this plugin does a hard reset of rabbit to recluster it when something goes awry?