Semantic Search with Phoenix, Axon, Bumblebee, and ExFaiss

Introduction

In my previous post, Semantic Search with Phoenix, Axon, and Elastic, I detailed how you can use Elixir’s machine learning libraries to create a semantic search tool capable of pairing users with wines based on natural language descriptions.

Since that post was published, Elixir’s machine learning ecosystem has grown significantly with the introduction of the Bumblebee library, which gives Elixir developers access to a variety of powerful pre-trained models available on Hugging Face.

Additionally, Nx recently introduced a serving capability designed for online deployment scenarios. Finally, I recently released a library called ExFaiss, which provides bindings to the powerful vector search library FAISS.

With these recent additions to the Elixir ecosystem, I thought it would be a good idea to update my previous post with the newest libraries available. For additional context, I suggest you read my original post on this topic. In this post, we’ll create a semantic search tool for wines using Phoenix, Axon, Bumblebee, and ExFaiss.

Setting Up the Application

Start by creating a new Phoenix application. I am using Phoenix 1.7:

mix phx.new sommelier

Next, you’ll need to add the following dependencies to your application:

[
  ...
  {:bumblebee, "~> 0.1"},
  {:nx, "~> 0.4"},
  {:exla, "~> 0.4"},
  {:ex_faiss, github: "elixir-nx/ex_faiss"}
]

Then, run mix deps.get:

mix deps.get

Finally, you’ll need to create your database:

mix ecto.create

And you’re ready to get started!

Setting up the Wine Resource

In the original semantic search application, you didn’t need to use Ecto to manage wine documents because you used Elasticsearch for persistence. This time, without Elasticsearch, you’ll need an Ecto resource to persist information about wines. Run the following command to generate a new context and schema for wines:

mix phx.gen.context Wines Wine wines name:string url:string embedding:binary

For each wine, we’ll store the name, its URL on wine.com, and an embedding, which is a vector that mathematically captures semantic information about the wine. The embedding will be generated from a semantic similarity model.

Make sure you run mix ecto.migrate to create the wines table:

mix ecto.migrate

Creating the Embedding Pipeline

Your semantic search application will take a natural language query, compute an embedded representation of the query using an Axon model, and then compare the embedded representation to existing representations of wines in an index.

Create a new file lib/sommelier/model.ex. This module will be responsible for the embedding pipeline you’ll use to embed natural language queries:

defmodule Sommelier.Model do
end

In model.ex, create a new function called serving that looks like:

# Must match the :batch_size given to Nx.Serving in application.ex.
@batch_size 8
# Assumed value: texts are padded or truncated to this many tokens.
@sequence_length 64

def serving() do
  repo = {:hf, "sentence-transformers/paraphrase-MiniLM-L6-v2"}
  {:ok, %{model: model, params: params}} = Bumblebee.load_model(repo)
  {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

  {_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)

  Nx.Serving.new(fn ->
    fn %{size: size} = inputs ->
      # Pad partial batches so the compiled function always sees the same shape.
      inputs = Nx.Batch.pad(inputs, @batch_size - size)
      predict_fn.(params, inputs)[:pooled_state]
    end
  end)
  |> Nx.Serving.client_preprocessing(fn text ->
    # Tokenize the incoming text to a fixed sequence length.
    inputs = Bumblebee.apply_tokenizer(tokenizer, text,
      length: @sequence_length,
      return_token_type_ids: false
    )

    {Nx.Batch.concatenate([inputs]), :ok}
  end)
end

Next, add the following predict function to the module:

def predict(text) do
  Nx.Serving.batched_run(SommelierModel, text)
end

Next, add the following to your application.ex:

...
{Nx.Serving,
  serving: Sommelier.Model.serving(),
  name: SommelierModel,
  batch_size: 8,
  batch_timeout: 100},
# Start the Endpoint (http/https)
SommelierWeb.Endpoint

This will create and start a new Nx.Serving, which will handle the pre-processing and model inference in batches behind the scenes to better use resources on the server. You can test that your serving works by starting your application:

iex -S mix phx.server

And attempting to embed some text:

iex> Sommelier.Model.predict("a nice red wine")
[info] TfrtCpuClient created.
#Nx.Tensor<
  f32[1][384]
  EXLA.Backend<host:0, 0.1077614924.2375680020.62643>
  [
    [-0.02617456577718258, -8.819118374958634e-4, 0.05722760409116745, 0.12959082424640656, -0.1351461410522461, 0.020610297098755836, 0.005453622899949551, 0.1129845529794693, 0.005040481220930815, 0.041092704981565475, 0.0013414014829322696, 0.045418690890073776, 0.12092263251543045, -0.050827134400606155, -0.01729273609817028, 0.14232997596263885, 0.19483818113803864, 0.032853033393621445, -0.09650719165802002, 0.11645855009555817, 0.01761060580611229, -0.026606624945998192, 0.009240287356078625, -0.05202469229698181, 0.010420262813568115, 0.1607143133878708, -0.03218967467546463, 0.024632470682263374, 0.03334266319870949, 0.03204822167754173, 0.012620541267096996, 0.022357983514666557, -0.05593165010213852, 0.02747185155749321, 0.030256617814302444, -0.08117566257715225, 0.08132530748844147, 0.11905942112207413, 0.014421811327338219, 0.06395658850669861, 0.06002272665500641, 0.06929747760295868, -0.10164055973291397, 0.14846278727054596, -0.019189205020666122, 0.04716624692082405, -0.17113839089870453, -0.01575590670108795, 0.02289806306362152, -0.09108022600412369, ...]
  ]
>
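
The serving above works with two fixed sizes: every text is tokenized to @sequence_length tokens, and every batch is padded up to @batch_size rows. Keeping shapes fixed means the compiled model function can be reused for every request. The following is a minimal sketch of that idea using plain lists. The module name PaddingSketch and the padding id 0 are assumptions made for illustration, and this is not how Bumblebee or Nx.Batch implement padding internally:

```elixir
defmodule PaddingSketch do
  # Hypothetical padding token id, chosen only for this illustration.
  @pad_id 0

  # Pad or truncate one list of token ids to exactly `seq_len` entries.
  def pad_sequence(ids, seq_len) do
    taken = Enum.take(ids, seq_len)
    taken ++ List.duplicate(@pad_id, seq_len - length(taken))
  end

  # Append all-padding rows until the batch has `batch_size` rows.
  def pad_batch(rows, batch_size, seq_len) do
    missing = max(batch_size - length(rows), 0)
    rows ++ List.duplicate(List.duplicate(@pad_id, seq_len), missing)
  end
end

rows = Enum.map([[101, 7, 8, 102], [101, 9, 102]], &PaddingSketch.pad_sequence(&1, 6))
batch = PaddingSketch.pad_batch(rows, 4, 6)
IO.inspect(batch)
```

Two texts of different lengths end up as a 4 x 6 grid, which is the kind of fixed shape the model function is compiled for.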

Creating the Vector Index

In order to perform vector search, you need to create a vector search index using ExFaiss. You can read in depth about ExFaiss in my post here. Create a new module lib/sommelier/index.ex:

defmodule Sommelier.Index do
end

Next, scaffold out a basic GenServer:

use GenServer

def start_link(_opts) do
  GenServer.start_link(__MODULE__, [], name: __MODULE__)
end

@impl true
def init(_opts) do
  index = ExFaiss.Index.new(384, "IDMap,Flat")
  {:ok, index}
end

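When your GenServer starts, it will create a new Flat ExFaiss index with a dimensionality of 384, which matches the size of the embeddings produced by the model. The IDMap prefix lets you attach your own IDs to each vector. Next, add the following client/server API for adding vectors to your GenServer: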

def add(id, embedding) do
  GenServer.cast(__MODULE__, {:add, id, embedding})
end

@impl true
def handle_cast({:add, id, embedding}, index) do
  index = ExFaiss.Index.add_with_ids(index, embedding, id)
  {:noreply, index}
end

Then, add the following search client/server API to your GenServer:

def search(embedding, k) do
  GenServer.call(__MODULE__, {:search, embedding, k})
end

@impl true
def handle_call({:search, embedding, k}, _from, index) do
  results = ExFaiss.Index.search(index, embedding, k)
  {:reply, results, index}
end
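
If you're curious what a Flat index does when you call search, it conceptually compares the query against every stored vector and keeps the k closest. Here is a minimal sketch of that exhaustive search in plain Elixir. It assumes squared Euclidean (L2) distance, which I believe is the default metric for a Flat index, and the module name FlatSearchSketch is made up for this example. The real work happens inside Faiss, not in code like this:

```elixir
defmodule FlatSearchSketch do
  # Squared Euclidean distance between two equal-length lists of floats.
  def squared_l2(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + (x - y) * (x - y) end)
  end

  # Compare the query against every stored {id, vector} pair and
  # return the k closest as {id, distance}, nearest first.
  def search(entries, query, k) do
    entries
    |> Enum.map(fn {id, vector} -> {id, squared_l2(vector, query)} end)
    |> Enum.sort_by(fn {_id, distance} -> distance end)
    |> Enum.take(k)
  end
end

entries = [{1, [0.0, 0.0]}, {2, [1.0, 0.0]}, {3, [3.0, 4.0]}]
nearest = FlatSearchSketch.search(entries, [1.0, 0.0], 2)
IO.inspect(nearest)
```

The vector with id 2 is identical to the query, so it comes back first with a distance of 0.0.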

Finally, add the Sommelier.Index to your supervision tree:

[
  ...
  Sommelier.Index,
]

Now, you can test that your index is working properly by restarting your application, adding an embedding to the index, and then searching:

iex> embeds = Sommelier.Model.predict("a nice red wine")
iex> Sommelier.Index.add(Nx.tensor([0]), embeds)
iex> Sommelier.Index.search(embeds, 5)
%{
  distances: #Nx.Tensor<
    f32[1][5]
    [
      [0.0, 3.4028234663852886e38, 3.4028234663852886e38, 3.4028234663852886e38, 3.4028234663852886e38]
    ]
  >,
  labels: #Nx.Tensor<
    s64[1][5]
    [
      [0, -1, -1, -1, -1]
    ]
  >
}
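
In the output above, only one vector is in the index but five neighbors were requested. The empty slots come back with a label of -1 and the largest finite 32-bit float as the distance, which is how Faiss marks a missing neighbor. A small helper can pair labels with distances and drop those placeholders. This is a sketch that works on flat lists such as the ones Nx.to_flat_list/1 returns, and the module name SearchResultSketch is an assumption:

```elixir
defmodule SearchResultSketch do
  # Pair each label with its distance and drop the -1 placeholder slots.
  def hits(labels, distances) do
    labels
    |> Enum.zip(distances)
    |> Enum.reject(fn {label, _distance} -> label == -1 end)
  end
end

max_f32 = 3.4028234663852886e38
labels = [0, -1, -1, -1, -1]
distances = [0.0, max_f32, max_f32, max_f32, max_f32]
hits = SearchResultSketch.hits(labels, distances)
IO.inspect(hits)
```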

Creating the Search Functionality and LiveView

With your basic search and embedding infrastructure in place, you can go about creating the search LiveView. Create a file lib/sommelier_web/search_live/index.ex:

defmodule SommelierWeb.SearchLive.Index do
  use SommelierWeb, :live_view

  @impl true
  def mount(_params, _session, socket) do
    {:ok, assign(socket, :results, [])}
  end
end

Next, implement the following render function to render search results:

@impl true
def render(assigns) do
  ~H"""
  <div>
    <form name="wines-search" id="wines-search" phx-submit="search_for_wines">
      <label for="query" class="block text-sm font-medium text-gray-700">Quick search</label>
      <div class="relative mt-1 flex items-center">
        <input type="text" name="query" id="query" class="block w-full rounded-md border-gray-300 pr-12 shadow-sm focus:border-indigo-500 focus:ring-indigo-500 sm:text-sm">
        <div class="absolute inset-y-0 right-0 flex py-1.5 pr-1.5">
          <kbd class="inline-flex items-center rounded border border-gray-200 px-2 font-sans text-sm font-medium text-gray-400">⌘K</kbd>
        </div>
      </div>
    </form>
    <ul role="list" class="divide-y divide-gray-200">
      <li :for={result <- @results}>
        <p class="text-sm font-medium text-gray-900">
          <a href={result.url}><%= result.name %></a>
        </p>
      </li>
    </ul>
  </div>
  """
end

Next, implement handle_params/3 like this:

@impl true
def handle_params(%{"q" => query}, _uri, socket) do
  results = Sommelier.Wines.search_wine(query)
  {:noreply, assign(socket, :results, results)}
end

def handle_params(_params, _uri, socket) do
  {:noreply, socket}
end

This will look for query parameters in the URL and use the query to search for wines using the search_wine/1 function, which you haven’t implemented yet. Finally, implement the following event handler to handle search submissions:

@impl true
def handle_event("search_for_wines", %{"query" => query}, socket) do
  {:noreply, push_patch(socket, to: ~p"/search?q=#{query}")}
end

Next, you need to implement the actual search functionality in your wine context, like this:

def search_wine(query) do
  embedding = Sommelier.Model.predict(query)
  %{labels: labels} = Sommelier.Index.search(embedding, 5)

  labels
  |> Nx.to_flat_list()
  |> get_wines()
end

def get_wines(ids) do
  from(w in Wine, where: w.id in ^ids) |> Repo.all()
end
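
One thing to be aware of: a where ... in query returns rows in whatever order the database chooses, so the nearest-first order from the index may be lost. If ranking matters to you, you can reorder the fetched records to follow the labels. Here is a minimal sketch using plain maps in place of Wine structs. The module name RankOrderSketch and the sample records are made up for illustration:

```elixir
defmodule RankOrderSketch do
  # Reorder fetched records to follow the id order returned by the index.
  # Ids with no matching record (such as the -1 placeholder) are skipped.
  def in_rank_order(records, ids) do
    by_id = Map.new(records, fn record -> {record.id, record} end)

    ids
    |> Enum.map(&Map.get(by_id, &1))
    |> Enum.reject(&is_nil/1)
  end
end

# Hypothetical rows, in the arbitrary order a database might return them.
fetched = [%{id: 3, name: "Wine C"}, %{id: 7, name: "Wine A"}, %{id: 9, name: "Wine B"}]
ranked = RankOrderSketch.in_rank_order(fetched, [7, 9, 3, -1, -1])
names = Enum.map(ranked, & &1.name)
IO.inspect(names)
```

In search_wine/1, the same idea would apply to the list returned by get_wines/1 and the flat list of labels.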

Finally, add the following route to your router:

live "/search", SearchLive.Index, :index

Now if you navigate to localhost:4000/search and type in a search, you’ll see the URL change, but no results! That’s because you haven’t actually added any wines to the database!

Seeding the Database

The wine dataset is based on the dataset from my original semantic search post. You can access the wine dataset from here. Download the document and move it to the priv directory of your sommelier project. Next, add the following to priv/repo/seeds.exs:

defmodule EmbedWineDocuments do
  def format_document(document) do
    "Name: #{document["name"]}\n" <>
      "Varietal: #{document["varietal"]}\n" <>
      "Location: #{document["location"]}\n" <>
      "Alcohol Volume: #{document["alcohol_volume"]}\n" <>
      "Alcohol Percent: #{document["alcohol_percent"]}\n" <>
      "Price: #{document["price"]}\n" <>
      "Winemaker Notes: #{document["notes"]}\n" <>
      "Reviews:\n#{format_reviews(document["reviews"])}"
  end

  defp format_reviews(reviews) do
    reviews
    |> Enum.map(fn review ->
      "Reviewer: #{review["author"]}\n" <>
        "Review: #{review["review"]}\n" <>
        "Rating: #{review["rating"]}"
    end)
    |> Enum.join("\n")
  end
end

"priv/wine_documents.jsonl"
|> File.stream!()
|> Stream.map(&Jason.decode!/1)
|> Stream.map(fn document ->
  desc = EmbedWineDocuments.format_document(document)
  embedding = Sommelier.Model.predict(desc)
  {document["name"], document["url"], embedding}
end)
|> Enum.each(fn {name, url, embedding} ->
  Sommelier.Wines.create_wine(%{"name" => name, "url" => url, "embedding" => Nx.to_binary(embedding)})
end)

Now, run mix run priv/repo/seeds.exs to add each wine to your database:

mix run priv/repo/seeds.exs

Note that this may run for a while depending on the machine you’re using.
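
One likely reason for the wait is that the seed script embeds one document at a time, so each request may sit until the batch timeout expires instead of sharing a batch with other requests. Running several embeddings concurrently may let the serving fill its batches. Here is a minimal sketch with Task.async_stream/3. The module name ConcurrentSeedSketch is an assumption, embed_fun stands in for Sommelier.Model.predict/1 so the example runs on its own, and the concurrency of 8 simply mirrors the batch size in application.ex. I haven’t measured the speedup:

```elixir
defmodule ConcurrentSeedSketch do
  # Apply `embed_fun` to every document with up to `max_concurrency`
  # calls in flight at once. Results come back in input order.
  def embed_all(documents, embed_fun, max_concurrency \\ 8) do
    documents
    |> Task.async_stream(embed_fun, max_concurrency: max_concurrency, timeout: :infinity)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

# Stand-in for Sommelier.Model.predict/1; it only returns the text length.
fake_embed = fn text -> String.length(text) end
results = ConcurrentSeedSketch.embed_all(["red", "white", "rose"], fake_embed)
IO.inspect(results)
```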

Next, you need to keep your wine index in sync with your database. You can do this by loading embeddings into the index on application startup. Because init/1 will now query the database, make sure Sommelier.Index is listed after Sommelier.Repo in your supervision tree. Adjust your init/1 function in Sommelier.Index to look like this:

def init(_opts) do
  index = ExFaiss.Index.new(384, "IDMap,Flat")
  index =
    Sommelier.Wines.list_wines()
    |> Enum.reduce(index, fn wine, index ->
      embedding = wine.embedding
      id = wine.id
      ExFaiss.Index.add_with_ids(index, Nx.from_binary(embedding, :f32), Nx.tensor([id]))
    end)

  {:ok, index}
end

This will load embeddings from the database when your application starts. Note that there are better ways to do this (e.g. by persisting snapshots of your index with native Faiss IO); however, this works well for simplicity.
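
If you’re wondering what actually lands in the embedding column, the seed script stores the output of Nx.to_binary/1 and init/1 reads it back with Nx.from_binary/2 as :f32. The stored value is the raw floats packed back to back, four bytes each, so a 384-dimensional embedding takes 1536 bytes. The shape isn’t stored, so what you read back is a flat vector. Here is a minimal sketch of the same round trip in plain Elixir, assuming native byte order. The module name EmbeddingBinarySketch is made up for this example:

```elixir
defmodule EmbeddingBinarySketch do
  # Encode a list of floats as consecutive 32-bit floats in native byte order.
  def encode(floats) do
    for x <- floats, into: <<>>, do: <<x::float-32-native>>
  end

  # Decode the binary back into a list of floats.
  def decode(binary) do
    for <<x::float-32-native <- binary>>, do: x
  end
end

# These values are exactly representable as 32-bit floats,
# so the round trip returns them unchanged.
binary = EmbeddingBinarySketch.encode([0.5, -1.25, 2.0])
size = byte_size(binary)
decoded = EmbeddingBinarySketch.decode(binary)
IO.inspect({size, decoded})
```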

Running the Search

With your database and index seeded with wines, restart your application and navigate to localhost:4000/search. Now, try running a few queries for wines. You’ll find that you can find excellent wine pairings just by describing what you’re looking for!

Conclusion

The Elixir ecosystem makes it easy to build machine-learning-enabled applications. This is a relatively simple example, but it’s still powerful! In about 15 minutes you have a working semantic search application, and you don’t need to use any external tools or services.

Until next time!