
Elasticsearch Runtime Error For Query Using Sparse Vectors

Very recently Elasticsearch has implemented vector-based queries. This means that each document includes a vector as a field, and we can use a new vector to find a match in our cor

Solution 1:

To solve this problem, we need to make sure that Elasticsearch understands that the vector field ("embedding" in my case) is actually a sparse vector. To do this, declare the field with the sparse_vector type in the index mapping:

"properties": {
    "name": {
        "type": "keyword"
    },
    "reference": {
        "type": "keyword"
    },
    "jurisdiction": {
        "type": "keyword"
    },
    "text": {
        "type": "text"
    },
    "embedding": {
        "type": "sparse_vector"
    }
}
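As a rough sketch of how this mapping might be sent when creating the index, the snippet below builds the full request body as a plain Python dictionary. The wrapping "mappings" key assumes the typeless mapping format of Elasticsearch 7.x, where the sparse_vector field type is available. The helper names build_index_body and to_sparse_vector are hypothetical and only for illustration. The second helper shows the object form that a sparse_vector field expects in a document: dimension indices as string keys, with zero entries omitted.

```python
import json


def build_index_body(vector_field="embedding"):
    """Return a create-index body that maps `vector_field` as a sparse_vector.

    Assumption: Elasticsearch 7.x typeless mappings ("mappings" -> "properties").
    """
    return {
        "mappings": {
            "properties": {
                "name": {"type": "keyword"},
                "reference": {"type": "keyword"},
                "jurisdiction": {"type": "keyword"},
                "text": {"type": "text"},
                vector_field: {"type": "sparse_vector"},
            }
        }
    }


def to_sparse_vector(dense):
    """Convert a dense list into {"dimension": value}, skipping zero entries."""
    return {str(i): v for i, v in enumerate(dense) if v != 0}


# Print the body so it can be pasted into a create-index request.
print(json.dumps(build_index_body(), indent=2))
```

The resulting dictionary can be passed as the request body to whichever client is in use. Documents indexed afterwards should carry the embedding in the object form produced by to_sparse_vector.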

More details in this related question.

There are two important things to note:

  1. The quotes around the field name in the query are necessary.
  2. It is recommended to add 1.0 to the metric: cosine similarity ranges from -1 to 1, and Elasticsearch does not accept negative scores.

    "source": "cosineSimilaritySparse(params.queryVector, doc['my_embedding_field_name']) + 1.0"

Credit on these last points goes to jimczi from the Elastic Team (thanks!). See the question on the forums here.
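To put the two points above in context, here is a minimal sketch of a complete script_score request body built around that scoring expression. It assumes an Elasticsearch 7.x version in which cosineSimilaritySparse is available, and a match_all inner query. The function names build_script_score_query and cosine_similarity_sparse are hypothetical. The second function is a plain-Python illustration of the metric, not Elasticsearch's own implementation. It is included only to show why the + 1.0 shift keeps scores non-negative.

```python
import math


def build_script_score_query(query_vector, vector_field="embedding"):
    """Build a script_score body; the field name is quoted inside doc['...']."""
    source = (
        "cosineSimilaritySparse(params.queryVector, "
        f"doc['{vector_field}']) + 1.0"
    )
    return {
        "query": {
            "script_score": {
                # Assumption: score every document; swap in a filter as needed.
                "query": {"match_all": {}},
                "script": {
                    "source": source,
                    "params": {"queryVector": query_vector},
                },
            }
        }
    }


def cosine_similarity_sparse(a, b):
    """Cosine similarity of two sparse vectors stored as {dimension: value}."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for empty or all-zero vectors.
    return dot / (norm_a * norm_b)


# Vectors pointing in opposite directions give the lowest similarity, -1.0,
# so the shifted score is 0.0 rather than a negative number.
lowest = cosine_similarity_sparse({"1": 1.0}, {"1": -2.0})
shifted = lowest + 1.0
```

With the shift applied, scores fall between 0 and 2, and the ranking order is unchanged because the same constant is added to every document.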
