Scientific platforms

Science at scale

Scientists at the Wellcome Sanger Institute and EMBL-EBI are supported by state-of-the-art research tools and scientific platforms that operate at scale.

Sequencing

The Wellcome Sanger Institute has one of the largest DNA sequencing facilities in the world. In 2021, the Sequencing Centre produced almost 40,000 bn DNA bases a day (the human genome is approximately 3 bn bases long), and our researchers read the equivalent of one gold-standard (30x) human genome every 3.2 minutes. Also in 2021, the Sequencing Centre read the genomes of 1,551 species.
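As a rough check, the per-genome figure can be derived from the daily output quoted above. The short Python sketch below assumes that a 30x genome means 30 times the approximate 3 bn base length and that the Sequencing Centre runs around the clock; the function and constant names are illustrative only. Because the inputs are rounded, the result is approximate: one 30x genome roughly every three minutes.

```python
# Figures quoted in the text (rounded, so the result is approximate).
BASES_PER_DAY = 40_000e9   # "almost 40,000 bn DNA bases a day"
GENOME_LENGTH = 3e9        # human genome, approximately 3 bn bases
COVERAGE = 30              # "gold-standard (30x)"
MINUTES_PER_DAY = 24 * 60  # assumes continuous, round-the-clock output


def minutes_per_genome(bases_per_day, genome_length, coverage):
    """Return the minutes needed to read one genome at the given coverage."""
    bases_per_genome = genome_length * coverage
    genomes_per_day = bases_per_day / bases_per_genome
    return MINUTES_PER_DAY / genomes_per_day


result = minutes_per_genome(BASES_PER_DAY, GENOME_LENGTH, COVERAGE)
print(f"{result:.2f} minutes per 30x genome")
```

With these inputs the daily output corresponds to about 444 genomes at 30x coverage per day.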

Thanks to the latest Illumina hardware and bespoke software that was developed in-house, this is one of the most accurate and efficient sequencing facilities in the world.

Big data processing and analysis: EMBL-EBI

EMBL-EBI makes open-access biological research datasets available. These are used extensively across the world by more than two million researchers in academia and industry. Some 107 million requests for data are made every day to EMBL-EBI’s websites. Analysing big data has become a bottleneck for life-science research, and EMBL-EBI provides facilities to enable this work.

EMBL-EBI’s open data resources include:

The Embassy Cloud provides private, secure, virtual machine-based workspaces within the EMBL-EBI infrastructure, in which clients can make optimal use of their own customised workflows, applications and datasets.

Embassy Cloud partners have access to EMBL-EBI data, services and compute resources, providing a practical and cost-effective alternative to replicating services and downloading vast public datasets locally. The Cloud’s partner companies can access their workspace from anywhere in the world, reducing the need for capital investments in hardware and related operational costs.

Big data processing and analysis: Wellcome Sanger Institute

The data output from the Wellcome Sanger Institute is increasing all the time, and the Institute has developed new technologies for storing and accessing the data. iRODS (Integrated Rule-Oriented Data System) is a tool, accessible to all, for the management and distribution of sequence data.

The Institute has also developed more efficient data-storage formats that, like all the Institute’s software tools, are made available to the research community on an open-access basis.

40,000 bn DNA bases are read by the Sequencing Centre every day

450+ petabytes of combined storage across the Wellcome Sanger Institute and EMBL-EBI

1,551 species were sequenced in 2021

38,000 compute cores in total in the Data Centre