Benchmark bigANNOY against direct RcppAnnoy
Source: R/benchmark_interface.R, benchmark_annoy_vs_rcppannoy.Rd
Run the same Annoy build and search task through bigANNOY and through a
direct dense RcppAnnoy baseline. The comparison reports both speed metrics
and data-volume metrics such as reference bytes, query bytes, and generated
index size.
Usage
benchmark_annoy_vs_rcppannoy(
  x = NULL,
  query = NULL,
  n_ref = 2000L,
  n_query = 200L,
  n_dim = 20L,
  k = 10L,
  n_trees = 50L,
  metric = "euclidean",
  search_k = -1L,
  seed = 42L,
  build_seed = seed,
  build_threads = -1L,
  block_size = annoy_default_block_size(),
  backend = getOption("bigANNOY.backend", "cpp"),
  exact = TRUE,
  filebacked = FALSE,
  path_dir = tempdir(),
  keep_files = FALSE,
  output_path = NULL,
  load_mode = "eager"
)
Arguments
- x
Optional benchmark reference input. Supply NULL to generate a synthetic reference matrix, or provide a numeric matrix, big.matrix, descriptor, descriptor path, or external pointer.
- query
Optional benchmark query input. Supply NULL for self-search, or provide a numeric matrix, big.matrix, descriptor, descriptor path, or external pointer.
- n_ref
Number of synthetic reference rows to generate when x = NULL.
- n_query
Number of synthetic query rows to generate when x = NULL and query is not NULL.
- n_dim
Number of synthetic columns to generate when x = NULL.
- k
Number of neighbours to return.
- n_trees
Number of Annoy trees to build.
- metric
Annoy metric. One of "euclidean", "angular", "manhattan", or "dot".
- search_k
Annoy search budget.
- seed
Random seed used for synthetic data generation and, by default, for the Annoy build seed.
- build_seed
Optional Annoy build seed. Defaults to seed.
- build_threads
Native Annoy build-thread setting.
- block_size
Build/search block size.
- backend
Requested bigANNOY backend.
- exact
Logical flag controlling whether to benchmark the exact Euclidean baseline with bigKNN when available.
- filebacked
Logical flag; if TRUE, synthetic or dense reference inputs are converted into file-backed big.matrix objects before build.
- path_dir
Directory where temporary Annoy and optional file-backed benchmark files should be written.
- keep_files
Logical flag; if TRUE, leave the generated Annoy index on disk after the benchmark finishes.
- output_path
Optional CSV path where the benchmark summary should be written.
- load_mode
Whether the benchmarked index should be returned metadata-only until first search ("lazy") or eagerly loaded once built ("eager").