August 9, 2016
July 8, 2021

Phoenix Channels vs Rails Action Cable

Back-end Development
Front-end Development

Looking for more? Head to the product discovery page for more info.

Go there

Looking for more? Head to the product delivery page for more info.

Go there

Looking for more? Head to the design consulting page for more info.

Go there

Looking for more? Head to the engineering consulting page for more info.

Go there

%[p%]At DockYard, we transitioned our backend development from Ruby and Rails to Elixir and Phoenix once it became clear that Phoenix better served our clients needs to take on the modern web. As we’ve seen, %[a href=""%]Phoenix is Not Rails%[/a%], but we borrow some of their great ideas. We were also delighted to give back in the other direction when Rails announced that Rails 5.0 would be shipping with Action Cable – a feature that takes inspiration from Phoenix Channels.%[/p%]%_%[p%]Now that Rails 5.0 has been released, we have clients with existing Rails stacks asking if they should use Action Cable for their real-time features, or jump to Phoenix for more reliable and scalable applications. When planning architecture decisions, in any language or framework, you should always take measurements to prove out your assumptions and needs. To measure how Phoenix channels and Action Cable handle typical application work-flows, we created a chat application in both frameworks and stressed each through varying levels of workloads.%[/p%]%_%[h2%]How the tests were designed%[/h2%]%_%[p%]For our measurements, we used the %[a href=""%]tsung%[/a%] benchmarking client to open WebSocket connections to each application. We added XML configuration to send specific Phoenix and Rails protocol messages to open channels on the established connection and publish messages. For hardware, we used two Digital Ocean 16GB, 8-core instances for the client and server.%[/p%]%_%[p%]The work-flows for each tsung client connection were as follows:%[/p%]%_%[ul%]%_%[li%]open a WebSocket connection to the server%_%[/li%]%_%[li%]create a single channel subscription on the connection, to a chat room chosen at random%_%[/li%]%_%[li%]Periodically send a message to the chat room, randomly once every 10s-60s to simulate messaging across members in the room%_%[/li%]%_%[/ul%]%_%[p%]On the server, the channel code for Rails and Phoenix is quite simple:%[/p%]%_%[pre%]%[code class="elixir language-elixir"%]defmodule Chat.RoomChannel do%_ use Chat.Web, :channel%_%_ def join("room:" %[%] _id, _params, socket) do%_ {:ok, socket}%_ end%_%_ def handle_in("publish_msg", %{"body" =%] body, "user" =%] user}, socket) do%_ broadcast!(socket, "new_message", %{body: body, user: user})%_ {:reply, :ok, socket}%_ end%_end%[/code%]%[/pre%]%_%[pre%]%[code class="ruby language-ruby"%]class RoomsChannel %[ ApplicationCable::Channel%_ def subscribed%_ @topic = "rooms:#{rand(1..8)}"%_ stream_from @topic%_ end%_%_ def publish_msg(data)%_ ActionCable.server.broadcast(@topic,%_ body: data["body"],%_ username: data["username"],%_ started: data["started"]%_ )%_ end%_end%[/code%]%[/pre%]%_%[p%]After establishing N numbers of rooms, with varying numbers of users per room, we measured each application’s responsiveness. We tested performance by joining a random room from the browser and timing the broadcasts from our local browser to all members in the room. As we increased the numbers of users per room, we measured the change in broadcast latency. We recorded short clips of each application’s behavior under different loads.%[/p%]%_%[p%]This simulates a “chat app”, but the work-flow applies equally to a variety of applications; real-time updates to visitors reading articles, streaming feeds, collaborative editors, and so on. As we evaluate the results, we’ll explore how the numbers relate to different kinds of applications.%[/p%]%_%[h2%]Rails Memory Leaks%[/h2%]%_%[p%]After creating the Rails chat application, setting up redis, and deploying the application to our instance, we immediately observed a memory leak in the application that was visible just by refreshing a browser tab and watching the memory grow; to never be freed. The following recording shows this in action (sped up 10x):%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]We searched recently reported bugs around this area, and found an issue related to Action Cable %[a href=""%]failing to call %[code class="inline"%]socket.close%[/code%] when cleaning up connections%[/a%]. This patch has been applied to the %[code class="inline"%]5-0-stable%[/code%] branch, so we updated the app to the unreleased branch and re-ran the tests. The memory leak persisted.%[/p%]%_%[p%]We haven’t yet isolated the source of the leak, but given the simplicity of the channel code, it must be within the Action Cable stack. This leak is particularly concerning since Rails 5.0 has been released for some time now and the %[code class="inline"%]5-0-stable%[/code%] branch itself has unreleased memory leak patches going back greater than 30 days.%[/p%]%_%[h2%]Scalability%[/h2%]%_%[p%]We set the memory leak issue aside and proceeded with our tests for the following scenarios:%[/p%]%_%[ul%]%_%[li%]Max numbers of rooms supported by a single server, with 50 users per room%_%[/li%]%_%[li%]Max numbers of rooms supported by a single server, with 200 users per room%_%[/li%]%_%[/ul%]%_%[p%]%[em%]Note%[/em%]: For Phoenix, for every scenario we maxed the %[em%]client server’s%[/em%] ability to open more WebSocket connections, giving us 55,000 users to simulate for our tests. Browser -%] Server latency should also be considered when evaluating broadcast latency in these tests.%[/p%]%_%[h3%]50 users per room%[/h3%]%_%[p%]%[strong%]Rails%[/strong%]: 50 rooms, 2500 users:%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]Responsiveness was speedy at 50 rooms, so we upped the room count to 75, giving us 3750 users.%[/p%]%_%[p%]%[strong%]Rails:%[/strong%] 75 rooms, 3750 users:%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]Here, we can see Action Cable falling behind on availability when broadcasting, with messages taking an average of 8s to be broadcast to all 50 users for the given room. For most applications, this level of latency is not acceptable, so the level of performance for maxinum rooms on this server would be somewhere between 50 and 75 rooms, given 50 users per room.%[/p%]%_%[p%]%[strong%]Phoenix:%[/strong%] 1100 rooms, 55,000 users (maxed 55,000 client connections)%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]We can see that Phoenix responds on average in 0.25s, and only is maxed at 1100 rooms because of the 55,000 client limit on the tsung box.%[/p%]%_%[h3%]Making sense of the results, 50 users per room%[/h3%]%_%[p%]These results show that if you are building an application for small to medium sized businesses, if 75 companies visit the app at the same time, their notifications will be delayed. Given the severe degradation in performance from 50 to 75 rooms, it may also be hard to pinpoint your server density when planning for load.%[/p%]%_%[p%]For load planning, horizontal scalability will be required with 50-75 companies per server, but the reliance on redis should be considered as a central bottleneck. Horizontal scalability was the obvious choice for Ruby when it came to stateless HTTP requests, but we can see why vertical scalability becomes increasingly important for stateful connections. If you require dozens or hundreds of servers, Action Cable will still rely on a central redis bottleneck for PubSub.%[/p%]%_%[h3%]200 users per room%[/h3%]%_%[p%]%[strong%]Rails:%[/strong%] 8 rooms, 1600 users:%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]We can see Action Cable starts off with acceptable latency, but begins to quickly fall behind as broadcast latency grows with each message. Broadcast latency grows longer than 10s. Next, we upped the room count by 1, to see where the limit was.%[/p%]%_%[p%]%[strong%]Rails:%[/strong%] 9 rooms, 1800 users before availability was compromised when broadcasting.%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]At 200 users per room, Action Cable is unable to maintain the broadcast load and supported just 9 rooms before we experienced messages stop arriving or subscriptions failing to establish. At these levels, the only consistent performance we could get for 200 users per room, was limiting the server to 7 rooms, or 1400 users.%[/p%]%_%[p%]%[strong%]Phoenix:%[/strong%] 275 rooms, 55,000 users (maxed 55,000 client connections)%[/p%]%_%[p%]%[img src="" alt=""/%]%[/p%]%_%[p%]We can see that Phoenix responds on average in 0.24s, and only is maxed at 275 rooms because of the 55,000 client limit on the tsung box. Additionally, it’s important to note that Phoenix maintains the same responsiveness when broadcasting for both the 50 and 200 users per room tests.%[/p%]%_%[h3%]Making sense of the results, 200 users per room%[/h3%]%_%[p%]You may be thinking, “but my application isn’t a chat app, and only needs simple page updates”. These tests apply equally to many scenarios. Imagine you have an information system where you want to publish periodic updates to pages. This could be for a news site where visitors see new comments, or a booking site where visitors see “20 other people are currently viewing this hotel”.%[/p%]%_%[p%]This tests shows that if your app receives a sudden spike in visitors and more than 7 articles have 200 or more readers, the server won’t be able to keep up with the notification demand for both the popular articles, %[em%]as well as the low traffic ones%[/em%].%[/p%]%_%[p%]For a booking site, imagine 7 hotels across the country release a discount rate and customers jump online for the deal. Suddenly, you need to spin up extra servers to maintain booking notifications, and the severity of the delays becomes worse if critical functionality of your application relies on these notifications.%[/p%]%_%[h2%]Conclusions%[/h2%]%_%[p%]If memory leaks can be ruled out or addressed, the sweet-spot for Action Cable today is small workloads with few subscribers on any given topic. Going beyond this means engineering efforts and resources must be spent on managing multiple nodes and optimizing code out of channels. Higher workloads with broadcasts to more than a few dozen subscribers risks availability. With Phoenix, we’ve shown that channels performance remains consistent as notification demand increases, which is essential for handling traffic spikes and avoiding overload. %[/p%]%_%[p%]When I started my own real-time Rails features with %[a href=""%]Sync%[/a%] several years ago, memory leaks and consistent performance were my main fears that drove me to look elsewhere, find Elixir, and create Phoenix. The rails-core team has done a great job putting a real-time story in place, but wrangling Ruby’s lack of concurrency features will continue to be a challenge.%[/p%]%_%[p%]The source code and instructions for running these tests on your own hardware can be found %[a href=""%]here%[/a%].%[/p%]

More like this

Mike Binns
August 10, 2021
September 23, 2021

What If...CobolToElixir

Marvel Television has been on fire this past year with WandaVision, Falcon and the Winter Soldier, and most recently, Loki.
Web Development
Elle Imhoff
August 5, 2021
September 23, 2021

What's New in IntelliJ Elixir 11.12.0

IntelliJ Elixir, the Elixir plugin for JetBrains IDEs (IntelliJ IDEA, RubyMine, WebStorm, etc), version 11.12.0 has been released. Let’s see what’s new.
Web Development
Elle Imhoff
August 3, 2021
September 23, 2021

What's New in IntelliJ Elixir 11.11.0

IntelliJ Elixir, the Elixir plugin for JetBrains IDEs (IntelliJ IDEA, RubyMine, WebStorm, etc), version 11.11.0 has been released. Let’s see what’s new.
Web Development

Stay in the know

Our latest articles and insights on all things digital product development, delivered straight to your inbox.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Let’s get to work!

Tell us about what you’re looking for and how we can help.

Get in touch
Web Development
User Experience
Testing and QA
Press Release
Remote Work
Diversity, Equity, Inclusion
Software Quality Assurance (SWA)
Project Management
Product Management
User Research
UI/UX Design
Back-end Development
Front-end Development
UX Development
Accessibility (a11y)
Manual QA Testing
Automated QA Testing
Product Delivery
Design Consulting
Engineering Consulting
Product Discovery
Product Consulting & Strategy
MVP Delivery
Existing Product Rebuild
Product Consulting Workshop
UI/UX Design & Research
Architecture Consulting
Software Development
Test Strategy Consulting
Design Thinking
Design Sprints
Mob Programming
Code Review
Pair Programming
Web App
Progressive Web App (PWA)
Native Mobile App
User Stories
Job Stories
Stakeholder Interviews
Competitor Analysis
User Personas
Production-ready Designs
Design-only Prototype
Arch. Review deliverable
Phoenix LiveView
Ruby on Rails