Scanpy

Scanpy is a Python-based toolkit designed for scalable analysis of single-cell gene expression data. It supports datasets exceeding one million cells and integrates tightly with the anndata data structure for efficient data handling.

The toolkit covers preprocessing, visualization, clustering, trajectory inference, and differential expression testing. Visualization options include embeddings such as PCA, t-SNE, UMAP, force-directed graph drawing, and diffusion maps. Clustering methods include Leiden and hierarchical clustering, while trajectory inference is performed via geodesic distances along graphs. Scanpy also supports marker gene analysis, gene scoring, cell cycle scoring, and simulation of dynamic gene expression data.

The project is actively maintained, with 94 releases to date; the latest is version 1.11.5, released in October 2025. It is open-source under the BSD-3-Clause license and supported by a community of 157 contributors. Scanpy can be installed via pip or conda, and some features require additional dependencies such as leidenalg and python-igraph. The toolkit is part of a broader ecosystem that includes related tools such as Squidpy for spatial data and Muon for multimodal single-cell data.

Updated Jan 28, 2026 · open-source

Overview

Scanpy is an open-source Python toolkit for scalable single-cell gene expression data analysis supporting datasets over one million cells.

Pricing

Free and open-source (BSD-3-Clause license).

Single-Cell Transcriptomics Analysis

Researchers analyzing large-scale single-cell RNA sequencing data to identify cell populations and gene expression patterns.

Trajectory and Developmental Pathway Inference

Studying cellular differentiation and lineage trajectories using graph-based trajectory inference methods.

Quick Start

Install Scanpy

Install Scanpy with pip:

    pip install 'scanpy[leiden]'

or with conda:

    conda install -c conda-forge scanpy python-igraph leidenalg

Import Scanpy

Import the package in your Python environment using import scanpy as sc.

Load Data

Read your single-cell data into an AnnData object for processing.

Preprocess Data

Apply preprocessing functions such as normalization (sc.pp.normalize_total) and log transformation (sc.pp.log1p).
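
To make the arithmetic behind these two steps concrete, here is a NumPy-only sketch on a tiny made-up count matrix. It is not Scanpy's implementation: the helper name normalize_and_log is hypothetical, and the target of 10 counts per cell is chosen only for readability (in Scanpy, the corresponding setting is the target_sum parameter of sc.pp.normalize_total).

```python
import numpy as np


def normalize_and_log(counts, target_sum):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Guard against cells with zero counts to avoid dividing by zero.
    safe_totals = np.where(totals == 0, 1.0, totals)
    scaled = counts / safe_totals * target_sum
    return np.log1p(scaled)


# Two cells with the same expression proportions but a 10x depth difference.
counts = np.array([[1, 3, 0],
                   [10, 30, 0]])
result = normalize_and_log(counts, target_sum=10)
print(result)
```

After normalization both rows are identical, which is the point of the step: differences in sequencing depth between cells are removed before the log transformation compresses the dynamic range.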

Analyze and Visualize

Perform clustering (e.g., sc.tl.leiden) and visualize results (e.g., sc.pl.umap).

Assessment

Strengths

Scales to datasets exceeding one million cells.
Covers preprocessing, visualization, clustering, trajectory inference, and differential expression testing.
Actively maintained, with 94 releases and 157 contributors.
Open-source under the BSD-3-Clause license, at no cost.
Integrates with anndata and related ecosystem tools such as Squidpy and Muon.

Limitations

Some features require additional dependencies such as leidenalg and python-igraph.
Using the development version requires cloning the repository and installing it in editable mode.
Building the documentation and contributing involve extra tooling, such as Hatch and submodules.