During my talk for ElixirConf 2022, my goal was to convey that Axon is a production-ready deep learning framework. In this post, I’ll re-hash some of those same points, and do my best to make the case for using Nx and Axon in a production machine learning project.
What is Axon?
Axon is a deep learning framework for the Elixir ecosystem that offers a simple and straightforward means to create and train neural networks. Axon is similar to frameworks like PyTorch and TensorFlow from the Python ecosystem. You can check out some of my posts from the DockYard blog to see examples of Axon in action.
What is deep learning?
Before considering how Axon can help you in a production project, you need to understand how deep learning can help you.
Deep learning is a subset of machine learning based on artificial neural networks. Neural networks are loosely inspired by biology, but they don’t really work like the human brain, at least not in any way that we currently understand.
In reality, neural networks make use of composed linear and non-linear transformations to learn hierarchical representations of input data. As it turns out, this relatively simple approach to modeling input data, combined with the unreasonable effectiveness of gradient descent, can yield incredible results on a variety of tasks.
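To make "composed linear and non-linear transformations" concrete, here is a minimal, framework-free sketch in plain Python. It is not Axon code, and it is far smaller than anything you would use in practice. The toy data, the single `tanh` hidden unit, the starting parameters, and the learning rate are all assumptions chosen for illustration. The gradients are derived by hand with the chain rule, which is the part a framework like Axon automates for you.

```python
import math

# Toy data (assumed for illustration): a step-like target that a single
# linear map cannot fit well on its own.
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]
YS = [-1.0, -1.0, 0.0, 1.0, 1.0]


def forward(params, x):
    """Linear -> tanh -> linear: about the smallest possible 'network'."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)  # linear map followed by a non-linearity
    return w2 * h + b2, h       # second linear map composed on top


def loss_and_grads(params):
    """Mean squared error over the toy data, plus its hand-derived gradient."""
    w1, b1, w2, b2 = params
    n = len(XS)
    loss = 0.0
    grads = [0.0, 0.0, 0.0, 0.0]
    for x, y in zip(XS, YS):
        y_hat, h = forward(params, x)
        err = y_hat - y
        loss += err * err / n
        d_out = 2.0 * err / n              # dL/dy_hat
        d_pre = d_out * w2 * (1 - h * h)   # chain rule back through tanh
        grads[0] += d_pre * x              # dL/dw1
        grads[1] += d_pre                  # dL/db1
        grads[2] += d_out * h              # dL/dw2
        grads[3] += d_out                  # dL/db2
    return loss, grads


def train(steps=500, lr=0.1):
    """Plain full-batch gradient descent from a hand-picked starting point."""
    params = [0.5, 0.0, 0.5, 0.0]
    losses = []
    for _ in range(steps):
        loss, grads = loss_and_grads(params)
        losses.append(loss)
        # Step each parameter against its gradient.
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, losses


params, losses = train()
print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

Real networks stack many such layers with many units each, which is where the hierarchical representations come from, but the training loop has the same shape: compute a loss, compute gradients, take a small step downhill.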
Chances are that in the last few years you’ve seen or directly interacted with some of the innovations that are byproducts of deep learning. GPT-3, DALL-E 2, and Stable Diffusion are powerful and widely popular deep learning models.
Deep learning has essentially broken every expectation of what is technically possible in the realm of artificial intelligence and machine learning. A byproduct of this is the emergence of startups and companies focused on building an AI-centric future.
Many experts are focused on the debate over whether or not the current trajectory of deep learning will lead to artificial general intelligence (AGI). I believe this debate is a red herring. I am cautiously optimistic that deep learning has reached a point where it’s capable of solving challenging open-ended problems across a broad range of industries. The value proposition of using deep learning to some extent in your applications is approaching a point that is too great to ignore.
What are the benefits of Axon?
As a deep learning framework, Axon stacks up closely to both PyTorch and TensorFlow. From a feature-set perspective, Axon reaches near feature parity with both. There are some notable limitations (namely mixed precision training and distributed training), but both of these are on the roadmap and will more than likely be implemented and tested within the next few months.
If you’re transitioning from PyTorch or TensorFlow, Axon will feel very familiar. This is intentional. Axon’s API is designed to mirror the syntax and semantics of both frameworks, with some exceptions to account for Elixir’s functional style. For a programmer familiar with Python’s deep learning frameworks, the barrier to entry in learning Axon is relatively low.
Given the similarities of Axon to PyTorch and TensorFlow, what differentiates Axon? One of the most compelling cases for Axon is that Axon is capable of seamlessly scaling up or down to meet your production needs.
As a byproduct of building on top of Nx’s flexible runtime options, you can use the same deployment interface in Axon when targeting mobile, edge, and server deployments–all you need to change is the Nx backend or compiler. Pair this with a language like Elixir, which also scales up or down to meet your production needs, and you can develop for mobile (Elixir Desktop), edge (Nerves), and server (Phoenix) deployments with the same stack. The Python ecosystem is much more fragmented in this regard. There are a number of deployment runtimes and tools that are designed to overcome the shortcomings of the language.
Using Axon also opens you up to the benefits of using Elixir across the entire machine learning operations lifecycle. Some of Elixir and Erlang/OTP’s best features for designing robust, scalable, and fault-tolerant applications are also unintentional strengths that can benefit you at every step in the lifecycle of a machine learning deployment.
One of the strongest arguments against using Nx and Axon is the lack of maturity in the ecosystem. This is a valid concern. After all, by choosing to use Nx and Axon, you are, in some ways, forgoing the ability to use familiar tools like Pandas, Spark, Airflow, Prefect, and more. However, depending on your use case, you will likely find an analogous tool available in the Elixir ecosystem. Or, you might find that using Elixir completely eliminates the need for an existing solution.
Why should I use Axon if…?
Generally speaking, there are four permutations to the question “Why should I use Axon?”.
- Why should I use Axon if my application uses Elixir, and I have a machine learning need?
- Why should I use Axon if my application uses Elixir, but I don’t have a machine learning need?
- Why should I use Axon if my application doesn’t use Elixir, but I have a machine learning need?
- Why should I use Axon if my application doesn’t use Elixir, and I don’t have a machine learning need?
I will do my best to answer each one.
1. My app uses Elixir and machine learning
If you’re reading this article, it’s likely you are firmly in this camp. You’re already using Elixir in some or all of your development stack, and your application makes use of machine learning.
The most compelling benefit of making the switch to using Axon and Nx is that you can avoid fragmented machine learning workflows.
One of the biggest challenges of MLOps is gluing everything together in production. When your workflows are fragmented across languages and enterprise solutions, gluing feels more like duct taping. Oftentimes this leads teams to reach for enterprise-grade solutions for dataflow engineering such as Spark and Prefect. However, I believe if you make the switch to a full-Elixir stack you might be able to completely eliminate the need for these solutions in production.
If you have an existing data science and machine learning team comfortable working in Python, it’s both costly and difficult to compel them to learn a new language. Fortunately, one of Axon’s development priorities is portability. Using a solution like AxonOnnx (and some future projects still to come), you can easily export most models from Python and import them directly into Elixir. We are constantly adding support for more models, so if you find your model is unsupported, please reach out or open an issue.
2. My app uses Elixir, but not machine learning
If you have an application that is already using Elixir but doesn’t have an immediate machine learning need, then it doesn’t necessarily make sense to search for one so you can use Axon. I am a firm believer that you should not use machine learning unless it’s absolutely necessary.
On the flip side, I am also a firm believer that there are applications of machine learning everywhere. As I mentioned at the beginning of this article, deep learning is eating the world. While I don’t love the idea of using deep learning for deep learning’s sake, I do believe that the value proposition of integrating deep learning into existing applications is getting too high to ignore.
3. My app doesn’t use Elixir but does use machine learning
For those in this group, it’s difficult to make the case that you should upend your entire workflow and start using Elixir for everything. The cost of completely replacing your stack with Elixir is likely too high for you to feel comfortable making that decision.
Rather than upending your entire stack, I would encourage you to investigate parts of your machine learning operations cycle that can benefit from using Elixir. Perhaps the lowest-friction place to start is replacing some of your workflow automation and dataflow engineering with an Elixir solution.
Of course, you might find that completely making the switch isn’t as difficult or as costly as you think. There are success stories in the wild (see Amplified) of companies switching from a fragmented stack to a 100% Elixir stack for their machine learning products, and experiencing the cost and time benefits of working in a full Elixir stack.
4. My app doesn’t use Elixir or machine learning
Similar to group #3, it’s likely you’re not ready to commit to upending your stack in favor of Elixir.
However, similar to group #2 there may be a compelling use case for machine learning in your future. If you find that it makes sense to start integrating some intelligent components into your application, the cost of choosing to start with Nx and Axon is low.
You might find it easier to integrate Elixir into your existing workflows than you would some of the tools in the Python ecosystem. Again, you shouldn’t seek out a machine learning need if one doesn’t exist, but you also shouldn’t ignore the possibilities of what you can accomplish with the power of machine learning today.
I hope this article leads you to think about integrating Nx and Axon into your application. One of my goals for the coming year is to prove the efficacy of the Nx ecosystem in production environments.
If you have a machine learning use case and a desire to start with Nx or Axon, don’t hesitate to reach out. If you’re interested in learning more about Nx and Axon, check out my other blog posts.
Lastly, if you want to help shape the future of the Nx ecosystem, come join us in the EEF ML Working Group.
Until next time!