Vector Similarity Search is built into Turso and libSQL Server as a native feature.
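- Create a table with a vector column (for example, `FLOAT32`)
- Convert values into vectors with a conversion function (for example, `vector32(...)`)
- Calculate similarity between vectors with a distance function (for example, `vector_distance_cos`)
- Speed up nearest-neighbor queries with a vector index (use the `libsql_vector_idx(column)` expression in the `CREATE INDEX` statement to create a vector index)
- Query the index with the `vector_top_k(idx_name, q_vector, k)` table-valued function

Each vector type has two alternative names: a short one and one with a `_BLOB` suffix that is consistent with affinity rules.

Type names with the `_BLOB` suffix make results more generic and universal. For regular applications, developers can choose either alternative, as the type name only serves as a hint for SQLite and external extensions.

Vector metadata is encoded in the `BLOB` itself. This comes at the cost of a few bytes per row but greatly simplifies the design of the feature.

Type name | Storage (bytes) | Description |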
---|---|---|
`FLOAT64` / `F64_BLOB` | | Implementation of the IEEE 754 double precision format for 64-bit floating point numbers |
`FLOAT32` / `F32_BLOB` | | Implementation of the IEEE 754 single precision format for 32-bit floating point numbers |
`FLOAT16` / `F16_BLOB` | | Implementation of the IEEE 754-2008 half precision format for 16-bit floating point numbers |
`FLOATB16` / `FB16_BLOB` | | Implementation of the bfloat16 format for 16-bit floating point numbers |
`FLOAT8` / `F8_BLOB` | | LibSQL-specific implementation which compresses each vector component to a single `u8` byte `b` and reconstructs the value from it using a simple transformation |
`FLOAT1BIT` / `F1BIT_BLOB` | | LibSQL-specific implementation which compresses each vector component down to 1 bit and packs multiple components into a single machine word, achieving a very compact representation |
The `FLOAT32` type should be a good starting point, but you may want to explore more compact options if your table has a large number of rows with vectors.

Although `FLOAT16` and `FLOATB16` use the same amount of storage, they provide different trade-offs between speed and accuracy. Generally, operations over `bfloat16` are faster but come at the expense of lower precision.

Function name | Description |
---|---|
`vector64`, `vector32`, `vector16`, `vectorb16`, `vector8`, `vector1bit` | Conversion functions which accept a valid vector and convert it to the corresponding target type |
`vector` | Alias for the `vector32` conversion function |
`vector_extract` | Extraction function which accepts a valid vector and returns its text representation |
`vector_distance_cos` | Cosine distance (1 - cosine similarity) function which operates over vectors of the same type with the same dimensionality |
`vector_distance_l2` | Euclidean distance function which operates over vectors of the same type with the same dimensionality |
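
As a rough illustration of the functions in the table above, the following sketch converts text into vectors, reads a vector back as text, and computes both distances. Passing the vector as a bracketed text literal is an assumption based on the notation used elsewhere on this page, and the values are made up.

```sql
-- Convert a text vector to FLOAT32 and read it back as text.
SELECT vector_extract(vector32('[1.0, 2.0, 3.0]'));

-- vector() is listed as an alias for vector32().
SELECT vector_extract(vector('[1.0, 2.0, 3.0]'));

-- Both arguments must have the same type and the same dimensionality.
SELECT
  vector_distance_cos(vector32('[1.0, 0.0]'), vector32('[0.0, 1.0]')) AS cos_distance,
  vector_distance_l2(vector32('[1.0, 0.0]'), vector32('[0.0, 1.0]')) AS l2_distance;
```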
Create a table

Declare a column for storing vectors with the `F32_BLOB` datatype. The number in parentheses, `(4)`, specifies the dimensionality of the vector. This means each vector in this column will have exactly 4 components.

Generate and insert embeddings
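
The table declaration and the insert step might look like the sketch below. The `movies` table, its columns, the titles, and the embedding values are all hypothetical placeholders; real embeddings would come from an embedding model.

```sql
-- Hypothetical table: the embedding column holds 4-dimensional FLOAT32 vectors.
CREATE TABLE movies (
  title     TEXT,
  year      INT,
  embedding F32_BLOB(4)
);

-- Made-up sample rows; vector32() converts the text form into a FLOAT32 vector.
INSERT INTO movies (title, year, embedding) VALUES
  ('Movie A', 2023, vector32('[0.800, 0.579, 0.481, 0.229]')),
  ('Movie B', 2019, vector32('[0.406, 0.027, 0.378, 0.056]')),
  ('Movie C', 2021, vector32('[0.698, 0.140, 0.073, 0.125]'));
```

Each inserted vector has exactly 4 components, matching the `(4)` in the column declaration.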
Perform a vector similarity search
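
A similarity search without an index can be sketched as an ordinary query that orders rows by distance to a query vector. This assumes a hypothetical `movies` table with a `title` column and an `embedding F32_BLOB(4)` column; the query vector is the one used later on this page.

```sql
-- Rank every row by cosine distance to the query vector (closest first).
SELECT
  title,
  vector_extract(embedding) AS embedding_text,
  vector_distance_cos(embedding, vector32('[0.064, 0.777, 0.661, 0.687]')) AS distance
FROM movies
ORDER BY distance ASC;
```

Because this computes the distance for every row, it is a full scan; a vector index avoids that for large tables.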
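The `vector_distance_cos` function calculates the cosine distance, which is defined as:

cosine distance = 1 - cosine similarity

Very small negative values close to zero (for example, `-10^-14`) may occasionally appear due to floating-point arithmetic precision. These should be interpreted as effectively zero, indicating an exact or near-exact match between vectors.

[…] `FLOAT1BIT` vectors

A vector index is created with the `libsql_vector_idx` marker function inside a `CREATE INDEX` statement.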
An index can be rebuilt with the `REINDEX movies_idx` command and dropped with the `DROP INDEX movies_idx` command.

A vector index is queried with the `vector_top_k(idx_name, q_vector, k)` table-valued function. The function accepts the index name, the query vector, and the number of neighbors to return. It searches for `k` approximate nearest neighbors and returns the `ROWID` of these rows, or the `PRIMARY KEY` if the base table does not have a `ROWID`. For the table-valued function to work, the query vector must have the same vector type and dimensionality as the indexed column.
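
The index lifecycle described above could be sketched as follows, assuming a hypothetical `movies` table with an `embedding F32_BLOB(4)` column. The `movies_idx` name comes from the commands above; the `id` column read from `vector_top_k` is an assumption about the function's output column name and should be checked against your libSQL version.

```sql
-- Create the vector index; libsql_vector_idx marks it as a vector index.
CREATE INDEX movies_idx ON movies (libsql_vector_idx(embedding));

-- Ask the index for the keys of the 3 approximate nearest neighbors.
-- The query vector matches the column's type (FLOAT32) and dimensionality (4).
SELECT id
FROM vector_top_k('movies_idx', vector32('[0.064, 0.777, 0.661, 0.687]'), 3);

-- Rebuild the index.
REINDEX movies_idx;

-- Remove the index.
DROP INDEX movies_idx;
```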
Index settings are passed to the `libsql_vector_idx` function as strings in the format `key=value`:
Setting key | Value type | Description |
---|---|---|
`metric` | `cosine` or `l2` | Which distance function to use for building the index. Default: `cosine` |
`max_neighbors` | positive integer | How many neighbors to store for every node in the DiskANN graph. The lower the setting, the less storage the index uses, at the cost of search precision. Default: depends on the dimensionality of the vector column |
`compress_neighbors` | `float1bit`, `float8`, `float16`, `floatb16`, or `float32` | Which vector type to use for storing the neighbors of every node in the DiskANN graph. The more compact the vector type, the less storage the index uses, at the cost of search precision. Default: no compression (neighbors have the same type as the base table column) |
`alpha` | positive float | "Density" parameter of the general sparse neighborhood graph built by the DiskANN algorithm. The lower the parameter, the sparser the DiskANN graph, which can speed up queries at the cost of search precision. Default: `1.2` |
`search_l` | positive integer | Limits the number of neighbors visited during a vector search. The lower the setting, the faster the search query, at the cost of search precision. Default: `200` |
`insert_l` | positive integer | Limits the number of neighbors visited during a vector insert. The lower the setting, the faster the insert query, at the cost of the DiskANN graph's navigability properties. Default: `70` |
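
As a sketch of how these settings might be supplied, the statement below passes two `key=value` strings after the column name. Passing each setting as a separate string argument is an assumption about the call shape, and the `movies` table, the `embedding` column, and the `movies_compact_idx` name are hypothetical.

```sql
-- Hypothetical index using Euclidean distance and 8-bit compressed neighbors.
CREATE INDEX movies_compact_idx
ON movies (libsql_vector_idx(embedding, 'metric=l2', 'compress_neighbors=float8'));
```

Compressing neighbors trades some search precision for a smaller index, as described in the table above.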
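The approximate number of storage bytes a vector index uses for `N` rows depends on the column type `T1` and on the `max_neighbors=M` and `compress_neighbors=T2` settings.

Create a table

Declare a column for storing vectors with the `F32_BLOB` datatype. The number in parentheses, `(4)`, specifies the dimensionality of the vector. This means each vector in this column will have exactly 4 components.

Generate and insert embeddings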
Create an Index
Create an index on the `embedding` column using the `libsql_vector_idx` function. The `libsql_vector_idx` marker function is required and is used by libSQL to distinguish ANN indices from ordinary B-Tree indices.

Query the indexed table
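
The index and query steps of this walk-through could be sketched as follows, assuming a hypothetical `movies` table with `title`, `year`, and `embedding F32_BLOB(4)` columns. The `id` output column of `vector_top_k` and the join on `rowid` are assumptions to verify against your libSQL version.

```sql
-- Create the vector index on the embedding column.
CREATE INDEX movies_idx ON movies (libsql_vector_idx(embedding));

-- Find the 3 approximate nearest neighbors, then join back to the
-- base table to read the matching rows.
SELECT title, year
FROM vector_top_k('movies_idx', vector32('[0.064, 0.777, 0.661, 0.687]'), 3)
JOIN movies ON movies.rowid = id;
```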
The `vector_top_k` table-valued function is used to efficiently find the top 3 most similar vectors to `[0.064, 0.777, 0.661, 0.687]` using the index.

A vector index works only for tables with a `ROWID` or with a singular `PRIMARY KEY`. A composite `PRIMARY KEY` without `ROWID` is not supported.