COR Brief

Scanpy

Scanpy is a Python toolkit for scalable analysis of single-cell gene expression data. It handles datasets of more than one million cells and is built on the anndata data structure for efficient data handling.

The toolkit covers preprocessing, visualization, clustering, trajectory inference, and differential expression testing. Visualization options include PCA, t-SNE, UMAP, force-directed graph drawing, and diffusion maps. Clustering methods include Leiden and hierarchical clustering, and trajectory inference is based on geodesic distances along graphs. Scanpy also supports marker gene analysis, gene scoring, cell cycle scoring, and simulation of dynamic gene expression data.

The project is actively maintained, with 94 releases to date; the latest, version 1.11.5, was released in October 2025. It is open-source under the BSD-3-Clause license and has 157 contributors. Scanpy can be installed via pip or conda; some features require additional dependencies such as leidenalg and python-igraph. It belongs to a broader ecosystem that includes Squidpy for spatial data and Muon for multimodal single-cell data.

Updated Jan 28, 2026 · open-source

Scanpy is an open-source Python toolkit for scalable single-cell gene expression data analysis supporting datasets over one million cells.

Pricing: open-source
Category: Data & Analytics
01. Includes functions for data normalization, log transformation, and multiple embedding techniques such as PCA, t-SNE, UMAP, force-directed graph drawing, and diffusion maps.
02. Provides clustering algorithms including Leiden and hierarchical clustering, and trajectory inference based on geodesic distances along graphs.
03. Supports ranking of marker genes that characterize groups, filtering genes by criteria, gene scoring, and cell cycle scoring.
04. Enables mapping of labels and embeddings from reference datasets onto new data for integrated analysis.
05. Handles datasets with more than one million cells efficiently and integrates with related tools such as Squidpy and Muon in the scverse ecosystem that Scanpy belongs to.
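
The normalization and log transformation named in the first feature can be illustrated with plain arithmetic. The sketch below is not Scanpy code: the helper names scale_to_target and log1p_rows and the toy count matrix are hypothetical, and the example only shows the idea of scaling each cell to a common total and then taking log(1 + x).

```python
import math


def scale_to_target(counts, target_sum=10_000.0):
    """Scale each cell (row) so that its counts sum to target_sum.

    Hypothetical helper for illustration; not part of Scanpy.
    """
    scaled = []
    for cell in counts:
        total = sum(cell)
        # Cells with no counts are left as zeros to avoid dividing by zero.
        factor = target_sum / total if total > 0 else 0.0
        scaled.append([value * factor for value in cell])
    return scaled


def log1p_rows(matrix):
    """Apply the natural log of (1 + x) to every entry."""
    return [[math.log1p(value) for value in row] for row in matrix]


# Toy count matrix (rows are cells, columns are genes); values are made up.
counts = [
    [10, 0, 30],  # cell A, 40 counts in total
    [1, 1, 2],    # cell B, 4 counts in total
]

normalized = scale_to_target(counts, target_sum=100.0)
logged = log1p_rows(normalized)

for row in normalized:
    print(row, "sum =", sum(row))
```

After scaling, both cells have the same total, so differences in sequencing depth no longer dominate the comparison; the log step then compresses the range of large values.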

Single-Cell Transcriptomics Analysis

Researchers analyzing large-scale single-cell RNA sequencing data to identify cell populations and gene expression patterns.

Trajectory and Developmental Pathway Inference

Studying cellular differentiation and lineage trajectories using graph-based trajectory inference methods.
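
To make "geodesic distance along a graph" concrete, here is a minimal sketch in plain Python. It is an illustration of the general idea only, not Scanpy's implementation: the function name geodesic_distances, the toy graph, and the cell labels are all hypothetical. Cells are treated as nodes of a neighborhood graph, and each cell is ordered by its shortest-path distance from a chosen root cell.

```python
import heapq


def geodesic_distances(graph, root):
    """Shortest-path (Dijkstra) distance from root to every reachable node.

    graph maps each node to a dict of {neighbor: edge_weight}.
    Hypothetical helper for illustration; not part of Scanpy.
    """
    distances = {root: 0.0}
    queue = [(0.0, root)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Toy neighborhood graph; node names and edge weights are made up.
toy_graph = {
    "stem": {"early": 1.0},
    "early": {"stem": 1.0, "mid": 1.0, "late": 3.0},
    "mid": {"early": 1.0, "late": 1.0},
    "late": {"mid": 1.0, "early": 3.0},
}

pseudotime = geodesic_distances(toy_graph, "stem")
ordering = sorted(pseudotime, key=pseudotime.get)
print(ordering)
```

The distance is measured along edges of the graph rather than straight through expression space, which is why the "late" cell is reached through "mid" instead of over the longer direct edge from "early".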

1. Install Scanpy
   Install Scanpy using pip with the command pip install 'scanpy[leiden]' or via conda with conda install -c conda-forge scanpy python-igraph leidenalg.
2. Import Scanpy
   Import the package in your Python environment using import scanpy as sc.
3. Load Data
   Read your single-cell data into an AnnData object for processing.
4. Preprocess Data
   Apply preprocessing functions such as normalization (sc.pp.normalize_total) and log transformation (sc.pp.log1p).
5. Analyze and Visualize
   Perform clustering (e.g., sc.tl.leiden) and visualize results (e.g., sc.pl.umap).
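
The steps above can be strung together roughly as follows. This is a minimal sketch under stated assumptions, not an official recipe: the function name run_basic_workflow and the file name my_data.h5ad are hypothetical, the data is assumed to be stored in .h5ad format, and the calls to sc.pp.pca, sc.pp.neighbors, and sc.tl.umap are added here because Leiden clustering needs a neighborhood graph and sc.pl.umap needs a computed UMAP embedding. Real analyses usually add quality control and gene filtering as well.

```python
def run_basic_workflow(path, target_sum=1e4):
    """Load an .h5ad file, preprocess it, cluster it, and plot a UMAP.

    Hypothetical wrapper for illustration. Scanpy is imported inside the
    function so that this file can be loaded where Scanpy is not installed.
    """
    import scanpy as sc

    # Step 3: read the data into an AnnData object.
    adata = sc.read_h5ad(path)

    # Step 4: normalize counts per cell, then log-transform.
    sc.pp.normalize_total(adata, target_sum=target_sum)
    sc.pp.log1p(adata)

    # Assumed intermediate steps (not listed above): reduce dimensionality
    # and build the neighborhood graph that clustering and UMAP rely on.
    sc.pp.pca(adata)
    sc.pp.neighbors(adata)
    sc.tl.umap(adata)

    # Step 5: cluster with Leiden and color the UMAP by cluster label.
    sc.tl.leiden(adata, key_added="leiden")
    sc.pl.umap(adata, color="leiden")

    return adata


# Example call; "my_data.h5ad" is a placeholder path.
# adata = run_basic_workflow("my_data.h5ad")
```

The cluster labels end up in adata.obs["leiden"], so they can be reused by later steps such as marker gene ranking.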

Pricing
Model: open-source

Scanpy is distributed under the BSD-3-Clause license and is free to use with no paid plans.

Assessment
Strengths
  • Scales efficiently to datasets exceeding one million cells.
  • Comprehensive functionality covering preprocessing, visualization, clustering, trajectory inference, and differential expression testing.
  • Active maintenance, with 94 releases and a community of 157 contributors.
  • Open-source under BSD-3-Clause license with no cost.
  • Integration with anndata and with related tools in the scverse ecosystem, such as Squidpy and Muon.
Limitations
  • Requires installation of additional dependencies like leidenalg and python-igraph for full functionality.
  • Development version requires cloning the repository and installing in editable mode.
  • Documentation building and contribution workflows involve specific tools such as Hatch and submodules.