Over the last year, thanks to the efforts of the amazing Elixir community, the Elixir machine learning ecosystem has grown at an impressive rate, with new libraries steadily filling gaps in its data science and machine learning tooling.
A common argument against using Nx for a new machine learning project is its perceived lack of a library, or of support, for some common task that is readily available in Python. In this post, I’ll do my best to highlight areas where this is not the case, and compare and contrast Elixir projects with their Python equivalents. Additionally, I’ll discuss areas where the Elixir ecosystem still comes up short and where using Nx for a new project might not be the best idea.
The obvious place to start this post is to compare Nx to its Python equivalents. At its core, Nx is intended to serve as a NumPy equivalent with support for automatic differentiation and acceleration via GPUs. In this respect, its main inspiration is JAX—a Python library that supports automatic differentiation and JIT compilation to accelerators via composable function transformations.
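To make the phrase “composable function transformation” concrete, here is a minimal Python sketch of a grad transformation: a function that takes a function and returns its derivative. It uses forward-mode differentiation with dual numbers purely for brevity. That is an illustrative assumption, not how JAX or Nx implement differentiation, and the names Dual, grad, and cubic are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (tangent) with respect to the input."""
    value: float
    tangent: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.tangent + self.tangent * other.value)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad(f):
    """Transform f into a function that returns df/dx (illustrative only)."""
    def df(x):
        return _lift(f(Dual(float(x), 1.0))).tangent
    return df


def cubic(x):
    return x * x * x + 2 * x


# d/dx (x^3 + 2x) at x = 3 is 3 * 9 + 2 = 29
slope = grad(cubic)(3.0)
```

The point of the sketch is the shape of the API: grad takes an ordinary numerical function and hands back another ordinary function, which is what lets transformations of this kind be stacked with others such as JIT compilation.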
The Nx API is (intentionally) considerably smaller than the NumPy API. Because Nx relies on JIT compilation, the API is built around a smaller number of powerful primitive operations, which can be combined to build more complex functions.
NumPy does not have the same luxury, instead needing to rely on specialized implementations for most of the functions in its API. JAX also builds on a set of core primitive operations; however, it intentionally provides wrappers that mirror the NumPy API because of NumPy’s ubiquity in the numerical computing community.
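As a rough illustration of the primitives-first idea, the Python sketch below builds softmax out of a handful of tiny operations instead of relying on one specialized routine. The list-based “primitives” here are hypothetical stand-ins written for this example, not the real Nx, JAX, or NumPy operations.

```python
import math

# Hypothetical primitive operations on flat lists of floats.
def exp(xs):
    return [math.exp(x) for x in xs]

def reduce_max(xs):
    return max(xs)

def reduce_sum(xs):
    return sum(xs)

def subtract(xs, scalar):
    return [x - scalar for x in xs]

def divide(xs, scalar):
    return [x / scalar for x in xs]


def softmax(xs):
    """Softmax composed only from the primitives above."""
    # Subtracting the max keeps exp() from overflowing on large inputs.
    shifted = subtract(xs, reduce_max(xs))
    exps = exp(shifted)
    return divide(exps, reduce_sum(exps))


probs = softmax([1.0, 2.0, 3.0])
```

In a JIT-compiled setting, a compiler can fuse a chain of primitives like this into one kernel, which is part of why a small primitive set does not have to cost performance.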
There are pros and cons to having a smaller API. From a learning perspective, a beginner can reasonably pick up and understand 90% of the functions in the Nx API rather quickly.
Unfortunately, this trade-off means Nx implementations can at times be more verbose than their NumPy counterparts. Because the NumPy API is so large, there are often times when a few lines of NumPy translate to considerably more lines in Nx. Additionally, the Nx API falls short of the NumPy API in some areas. For example, Nx PRNG support is not as feature-complete as that of JAX or NumPy, the Nx linear algebra module is not as in-depth as NumPy’s, and Nx does not support string data types.
The API shortcomings of Nx are mostly active areas of work. Even with these shortcomings, I’ve found I can be just as productive writing Nx as I can writing NumPy.
From a performance perspective, if you’re using the EXLA compiler, Nx will have (mostly) equivalent performance to JAX, because Nx/EXLA relies on XLA, the same JIT compiler that JAX uses. That means that in essentially all of the areas where JAX beats NumPy, Nx will also beat NumPy; and in all of the areas where NumPy beats JAX, NumPy will also beat Nx.
One advantage that Nx has over JAX is its first-class support for pluggable compilers and backends. While the JAX project seems to be moving in the direction of supporting multiple pluggable compilers/runtimes, Nx was built with this flexibility in mind, and thus is positioned for rapid integration with any existing/future tensor compilers and backends.
JAX is ahead in terms of parallelization; however, integrating parallel primitives into the Nx API is on the roadmap. Given that Nx can build on the same XLA parallelism/sharding implementations that JAX uses, Nx can catch up to JAX relatively quickly in this respect.
One of the initial ambitions for the Nx project was to support creating and training deep-learning-type models in Elixir. This is now possible with the Axon library.
Axon is built directly on top of Nx and thus can take advantage of all of the things Nx offers, including JIT compilation and automatic differentiation. Axon most directly compares to tools like PyTorch and TensorFlow/Keras in the Python ecosystem.
From an API perspective, Axon is roughly on par with both PyTorch and TensorFlow/Keras. Aside from Attention/Transformer layers (which are on the roadmap), Axon has an essentially identical offering of model building blocks. Additionally, with its custom layer API, creating and using new layers is as easy as defining an Nx implementation of the layer. More or less any model you can create and run inference with in PyTorch/TensorFlow, you can also create and run inference with in Axon.
Axon also has a robust training API inspired by libraries in the Python ecosystem such as PyTorch Ignite and PyTorch Lightning. The training API supports out-of-the-box callbacks such as model checkpoints, early stopping, and model validation, as well as an API for integrating custom callbacks. Similar to Keras, the Axon training API offers increasing levels of flexibility at increasing levels of complexity. In other words, you can sacrifice simplicity to have more control over training your models.
One area of concern when migrating to Elixir is the ability to make use of pre-trained models. Thanks to AxonOnnx this is possible for (almost) any model you might have.
If you’re able to export an ONNX version of your model (e.g., using torch.onnx or tf2onnx), you can probably import your model with Axon. AxonOnnx has even been tested to work with pre-trained transformers from the popular transformers library.
While Axon supports importing pre-trained models, there are still some aspects of working with pre-trained models that need ironing out. For example, fine-tuning, while possible, does not yet have a first-class API in Axon. Additionally, features such as mixed-precision and multi-device training, which make training large models possible, are not yet fully supported in Axon.
Traditional Machine Learning
Along with deep learning, gradient boosting and decision tree algorithms are perhaps the most popular machine learning algorithms in use. These classes of algorithms typically outperform deep learning with tabular and time-series data, and are often significantly less expensive to train and deploy.
Unfortunately, this is an area still under active development in the Elixir ecosystem. Python has popular libraries such as XGBoost, but there is still no Elixir equivalent. I expect this to change over the next six months; however, for the time being, Elixir is behind in this area.
Elixir also falls short of Python in other traditional machine learning applications. While Python has the excellent scikit-learn, Elixir has the relatively new Scholar library. Because Scholar is new, it lacks many of the features it would need to be a competitive alternative to scikit-learn. This is another area of active development on the Nx roadmap, and thus I expect things to look significantly different here in the next six months.
Essentially any data scientist who has worked with Python for any amount of time is familiar with the pandas library for data analysis. It is a library for working with structured, columnar data, and it’s popular for any sort of analysis or munging task you might need to perform. The Elixir equivalent to pandas is Explorer. Explorer is built on top of the Polars library, which implements DataFrames in Rust.
From an API perspective, Explorer is different from what you might be used to in Python. Given that Elixir is a functional language, Explorer builds on immutable abstractions, which can feel quite different for somebody migrating from Python and pandas’ mutability. The Explorer API, like the Nx API, is notably more compact than its Python counterpart. Despite this, there isn’t much you can do in pandas that you can’t do in Explorer.
From a performance perspective, Explorer benefits from the speed of Polars. There are a number of articles that laud Polars as the fastest DataFrame library. Given that Explorer builds on that performance, you might see significant performance improvements migrating from Pandas to Explorer.
In data science, presentations and visualizations are where the money is made. Having a good tool for presenting and visualizing data is a must for any language looking to position itself in the data science sphere. Python has a number of excellent visualization libraries such as Plotly Express and matplotlib. The Elixir equivalent is the VegaLite library, which provides bindings to the Vega-Lite graphics library.
Functionally, you can get essentially equivalent visualizations from both Elixir and Python. The VegaLite API might feel unfamiliar to users coming from Plotly Express and matplotlib; however, the abstractions are incredibly powerful and allow for composing and creating ever more complex graphics with code.
For the most part, I’ve found it possible to perform equivalent visualizations in both Elixir and Python; however, Python seems to have an edge in network visualizations and geographic visualizations. Elixir has no equivalents to Python’s NetworkX and Folium libraries. With companies like Felt using Elixir in the map-making space, I suspect we might see geographic visualizations in Elixir improve (fingers crossed).
The Python ecosystem also has a number of libraries concerned with dashboard creation. Tools such as Dash allow for the creation of interactive demos with a few lines of code. There are no direct equivalents in Elixir just yet; however, the direction of Livebook is promising for the prospects of interactive and shareable demos in Elixir.
Pipelines / Orchestration
Whether it be training large models or creating production-ready data ingest/management solutions, data orchestration and pipelining are important tasks for data science and machine learning. There are a large number of Python libraries built specifically to create and orchestrate data pipelines. In Elixir, there are only a few; however, this is an area where I would personally argue Elixir has a strong edge over Python. Given that Elixir is built on the BEAM, which is designed for concurrency, the task of Concurrent Data Processing in Elixir is a natural extension of the language. Python is just not designed with concurrency in mind. From simple language-level abstractions such as Task to library-level abstractions such as Broadway, creating scalable input processing pipelines is incredibly easy with Elixir by default.
That’s not to say that Python doesn’t have some nice libraries for achieving the same results. Both PyTorch and TensorFlow offer nice data loading abstractions, such as PyTorch’s DataLoader and TensorFlow’s tf.data. Additionally, there are a number of libraries designed for building and orchestrating data pipelines at scale (e.g., Prefect and Airflow). The biggest advantage Elixir has in this space is that it is concurrent and fault-tolerant by design. I don’t think Python can ever beat Elixir in this regard.
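For a sense of what the Python side looks like without a dedicated framework, here is a minimal sketch of a concurrent ingest stage using only the standard library’s concurrent.futures. The record format and the parse_record and run_pipeline names are hypothetical, chosen just for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_record(line):
    """Hypothetical per-record work: turn 'name, value' into a tuple."""
    name, value = line.split(",")
    return name.strip(), float(value)


def run_pipeline(lines, max_workers=4):
    """Parse records on a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(parse_record, lines))


records = run_pipeline(["a, 1.5", "b, 2.0", "c, 3.25"])
```

This works well for I/O-bound stages, but in standard CPython builds the threads share the global interpreter lock, so CPU-bound work does not run in parallel this way. A failing worker also simply raises its exception when its result is collected; nothing supervises or restarts it, so any retry or recovery logic has to be written by hand.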
There are a number of “domain-specific” libraries that don’t neatly fall into any of the categories I’ve covered so far, but which are worth a brief mention in this article: namely, computer vision, natural language processing, and signal processing.
There are a number of computer vision-related libraries in the Python ecosystem that generally streamline the task of working with images. This includes Pillow and OpenCV among others. In the Elixir ecosystem, there is Evision, which provides bindings to OpenCV implementations. From both an API and performance perspective, this means working with images in Evision will be somewhat similar to working with images in Python’s OpenCV bindings.
For NLP tasks, Elixir does not have a library that is equivalent to Python’s spaCy or NLTK. However, it does offer bindings to Hugging Face’s tokenizers, and the ability to import a significant number of Hugging Face models for performing NLP tasks with neural networks.
The Elixir ecosystem still falls behind Python in the area of signal processing; however, with recent work to add FFT support to Nx, and other dedicated efforts, I expect this area to improve in the near future.
This post was meant to serve as a high-level comparison of the Elixir machine learning and data science ecosystem with the Python ecosystem. While there are still many gaps in the Elixir ecosystem, the progress over the last year has been rapid. Almost every library I’ve mentioned in this post is less than two years old, and I suspect there will be many more projects to fill some of the gaps I’ve mentioned in the coming months.
If you’re interested in helping in any of our areas of active development, join us at the EEF ML Working Group to drive machine learning on the BEAM forward. Until next time!