- Hybrid Transformer-Mamba architecture: linear-time Mamba state-space layers handle most of the stack, with interleaved attention layers preserving quality, reducing memory and compute versus attention-only models.
- 256K-token context window for processing long documents in a single pass.
- Mixture-of-Experts (MoE) architecture: only a subset of expert parameters is activated per token, so inference cost grows far more slowly than total parameter count.
- Open-source model, available for self-hosting and private deployments.
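
The MoE routing mentioned above can be illustrated with a toy top-k gating sketch. This is a generic illustration of the technique, not this model's actual implementation; all names, dimensions, and the choice of k here are hypothetical.

```python
import numpy as np

def topk_moe(x, gate_w, experts, k=2):
    """Route token vector x to the k highest-scoring experts and mix
    their outputs with softmax-normalized gate weights. Only k experts
    run per token, which is the source of MoE's compute savings."""
    scores = x @ gate_w                        # one gate score per expert
    chosen = np.argsort(scores)[-k:]           # indices of the k best experts
    w = np.exp(scores[chosen] - scores[chosen].max())
    w /= w.sum()                               # softmax over the selected experts only
    return sum(wi * experts[i](x) for wi, i in zip(w, chosen))

rng = np.random.default_rng(0)
d, num_experts = 8, 4                          # hypothetical sizes for illustration
gate_w = rng.normal(size=(d, num_experts))
# Each "expert" is a small linear map; a real model uses full FFN blocks.
experts = [lambda v, W=rng.normal(size=(d, d)): v @ W for _ in range(num_experts)]

x = rng.normal(size=d)
y = topk_moe(x, gate_w, experts, k=2)          # shape (d,), computed by 2 of 4 experts
```

With k=2 of 4 experts active, each token touches roughly half the expert parameters while the model as a whole retains all four experts' capacity.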