Multi-Provider LLM Support
Supports multiple LLM providers such as HuggingFace, Ollama, and OpenAI-compatible APIs for text generation and embeddings.
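One common way to support interchangeable backends is a small provider registry behind a shared interface. The sketch below is illustrative only (the class and function names are assumptions, not this project's API); a real backend would call the Ollama, HuggingFace, or OpenAI-compatible endpoint instead of the stand-in shown here.

```python
from abc import ABC, abstractmethod

class EmbeddingProvider(ABC):
    """Shared interface so embedding backends are interchangeable."""
    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]: ...

PROVIDERS: dict[str, type] = {}

def register(name: str):
    """Class decorator that records a provider under a string key."""
    def wrap(cls):
        PROVIDERS[name] = cls
        return cls
    return wrap

@register("dummy")
class DummyProvider(EmbeddingProvider):
    """Stand-in backend; a real one would POST to Ollama or an
    OpenAI-compatible /embeddings endpoint."""
    def embed(self, texts):
        # Toy "embedding": one dimension holding the text length.
        return [[float(len(t))] for t in texts]

def get_provider(name: str, **kwargs) -> EmbeddingProvider:
    return PROVIDERS[name](**kwargs)
```

Callers then select a backend by name (e.g. `get_provider("dummy")`) without changing any downstream code.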
Adaptive Search Pipeline
Combines fast, coarse-grained candidate filtering with exact re-ranking to balance search speed and accuracy.
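A two-stage pipeline like this can be sketched as: score all vectors with a cheap approximate distance, keep a shortlist, then re-rank only the shortlist with the exact distance. This is a minimal illustration of the pattern, not the project's actual implementation; the cheap proxy used here (first two dimensions only) is an assumption for demonstration.

```python
import heapq
import math

def approx_dist(q, v):
    # Cheap coarse filter: squared distance on the first 2 dims only.
    return sum((a - b) ** 2 for a, b in zip(q[:2], v[:2]))

def exact_dist(q, v):
    # Accurate but more expensive: full Euclidean distance.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, v)))

def search(query, vectors, k=2, shortlist=4):
    # Stage 1: coarse filtering keeps only `shortlist` candidates.
    cand = heapq.nsmallest(
        shortlist, range(len(vectors)),
        key=lambda i: approx_dist(query, vectors[i]))
    # Stage 2: exact re-ranking runs on the shortlist alone.
    return heapq.nsmallest(
        k, cand, key=lambda i: exact_dist(query, vectors[i]))
```

The exact distance is computed for `shortlist` vectors instead of all of them, which is where the efficiency win comes from.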
Performance Optimizations
Includes GPU batching, ZMQ-based distance communication, overlapped CPU/GPU execution, and selective caching of high-degree graph nodes.
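Selective caching of high-degree nodes can be illustrated as follows: embeddings are kept in memory only for "hub" nodes whose degree exceeds a threshold, while everything else is recomputed on demand. The class and parameter names below are hypothetical, and the recompute path stands in for what would be an expensive model call in practice.

```python
class NodeEmbeddingCache:
    """Cache embeddings only for hub nodes (degree >= min_degree);
    low-degree nodes are recomputed on demand. Illustrative sketch."""

    def __init__(self, graph, embed_fn, min_degree=3):
        self.embed_fn = embed_fn  # expensive recompute path in practice
        # Pre-populate the cache with high-degree nodes only.
        self.cache = {node: embed_fn(node)
                      for node, nbrs in graph.items()
                      if len(nbrs) >= min_degree}
        self.recomputes = 0  # counts cache misses

    def get(self, node):
        if node in self.cache:
            return self.cache[node]
        self.recomputes += 1
        return self.embed_fn(node)
```

Because graph traversals revisit hub nodes far more often than leaves, caching only the hubs captures most of the hit rate at a fraction of the memory.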
CLI Tool for Quick Setup
Provides a command-line interface for easy installation, setup, and querying of private data sources like Slack.
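The shape of such a CLI might look like the argparse sketch below. The program name, subcommands, and flags here are invented for illustration and may differ from the project's real interface.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical subcommands: "build" indexes a source, "ask" queries it.
    p = argparse.ArgumentParser(prog="rag")
    sub = p.add_subparsers(dest="command", required=True)

    build = sub.add_parser("build", help="index a local data source")
    build.add_argument("--source", choices=["slack", "files"],
                       default="files")

    ask = sub.add_parser("ask", help="query the built index")
    ask.add_argument("question")
    return p
```

With this layout, `rag build --source slack` would index a Slack export and `rag ask "..."` would query it.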
Local, Privacy-Focused Deployment
Runs entirely on the local machine with no cloud dependency, supporting privacy-sensitive use cases.