Multi-Provider LLM Support
Supports multiple LLM providers such as HuggingFace, Ollama, and OpenAI-compatible APIs for text generation and embeddings.
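One common way to support interchangeable backends is a small provider registry behind a shared interface. The sketch below is illustrative only (the class and function names are assumptions, not this project's API); a real backend would call the Ollama, HuggingFace, or OpenAI-compatible endpoint instead of the stand-in shown here.

```python
from abc import ABC, abstractmethod

class EmbeddingProvider(ABC):
    """Shared interface so embedding backends are interchangeable."""
    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]: ...

PROVIDERS: dict[str, type] = {}

def register(name: str):
    """Class decorator that records a provider under a string key."""
    def wrap(cls):
        PROVIDERS[name] = cls
        return cls
    return wrap

@register("dummy")
class DummyProvider(EmbeddingProvider):
    """Stand-in backend; a real one would POST to Ollama or an
    OpenAI-compatible /embeddings endpoint."""
    def embed(self, texts):
        # Toy "embedding": one dimension holding the text length.
        return [[float(len(t))] for t in texts]

def get_provider(name: str, **kwargs) -> EmbeddingProvider:
    return PROVIDERS[name](**kwargs)
```

Callers then select a backend by name (e.g. `get_provider("dummy")`) without changing any downstream code.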
Adaptive Search Pipeline
Combines fast, coarse-grained candidate filtering with exact re-ranking to balance search speed and accuracy.
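A two-stage pipeline like this can be sketched as: score all vectors with a cheap approximate distance, keep a shortlist, then re-rank only the shortlist with the exact distance. This is a minimal illustration of the pattern, not the project's actual implementation; the cheap proxy used here (first two dimensions only) is an assumption for demonstration.

```python
import heapq
import math

def approx_dist(q, v):
    # Cheap coarse filter: squared distance on the first 2 dims only.
    return sum((a - b) ** 2 for a, b in zip(q[:2], v[:2]))

def exact_dist(q, v):
    # Accurate but more expensive: full Euclidean distance.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, v)))

def search(query, vectors, k=2, shortlist=4):
    # Stage 1: coarse filtering keeps only `shortlist` candidates.
    cand = heapq.nsmallest(
        shortlist, range(len(vectors)),
        key=lambda i: approx_dist(query, vectors[i]))
    # Stage 2: exact re-ranking runs on the shortlist alone.
    return heapq.nsmallest(
        k, cand, key=lambda i: exact_dist(query, vectors[i]))
```

The exact distance is computed for `shortlist` vectors instead of all of them, which is where the efficiency win comes from.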
Performance Optimizations
Includes GPU batching, ZMQ-based distance communication, overlapped CPU/GPU execution, and selective caching of high-degree graph nodes.
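Selective caching of high-degree nodes can be illustrated as follows: embeddings are kept in memory only for "hub" nodes whose degree exceeds a threshold, while everything else is recomputed on demand. The class and parameter names below are hypothetical, and the recompute path stands in for what would be an expensive model call in practice.

```python
class NodeEmbeddingCache:
    """Cache embeddings only for hub nodes (degree >= min_degree);
    low-degree nodes are recomputed on demand. Illustrative sketch."""

    def __init__(self, graph, embed_fn, min_degree=3):
        self.embed_fn = embed_fn  # expensive recompute path in practice
        # Pre-populate the cache with high-degree nodes only.
        self.cache = {node: embed_fn(node)
                      for node, nbrs in graph.items()
                      if len(nbrs) >= min_degree}
        self.recomputes = 0  # counts cache misses

    def get(self, node):
        if node in self.cache:
            return self.cache[node]
        self.recomputes += 1
        return self.embed_fn(node)
```

Because graph traversals revisit hub nodes far more often than leaves, caching only the hubs captures most of the hit rate at a fraction of the memory.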
CLI Tool for Quick Setup
Provides a command-line interface for easy installation, setup, and querying of private data sources like Slack.
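The shape of such a CLI might look like the argparse sketch below. The program name, subcommands, and flags here are invented for illustration and may differ from the project's real interface.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical subcommands: "build" indexes a source, "ask" queries it.
    p = argparse.ArgumentParser(prog="rag")
    sub = p.add_subparsers(dest="command", required=True)

    build = sub.add_parser("build", help="index a local data source")
    build.add_argument("--source", choices=["slack", "files"],
                       default="files")

    ask = sub.add_parser("ask", help="query the built index")
    ask.add_argument("question")
    return p
```

With this layout, `rag build --source slack` would index a Slack export and `rag ask "..."` would query it.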
Local, Privacy-Focused Deployment
Runs entirely on the local machine with no cloud dependency, supporting privacy-sensitive use cases.