Sequence file formats for a variety of data analysis options

We provide myriad choices for downstream analysis of sequencing data

Sequence File Formats

Numerous options are available for converting data to compatible sequence file formats such as FASTQ files, and for downstream analysis of sequencing data. Illumina sequencers are designed so data can be easily streamed into BaseSpace Sequence Hub for cloud-based data management, analysis, and collaboration.

On-premises options are also available. For users interested in additional data analysis options, raw data files are provided in sequence file formats that are compatible with, or easily converted for use with, other software platforms.

FASTQ is a text-based sequencing data file format that stores both raw sequence data and quality scores. FASTQ files have become the standard format for storing NGS data from Illumina sequencing systems, and can be used as input for a wide variety of secondary data analysis solutions.
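
As an illustration of how the format pairs sequence data with quality scores, here is a minimal Python sketch of a FASTQ reader. It assumes the common four-line record layout (an `@` header, the sequence, a `+` separator, and a quality string) and the Phred+33 quality encoding. The names `FastqRecord` and `read_fastq` are made up for this example and are not part of any Illumina software.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FASTQ record: read identifier, bases, and encoded qualities."""
    read_id: str
    sequence: str
    quality: str

    def phred_scores(self, offset: int = 33) -> List[int]:
        # Assumption: qualities are ASCII characters offset by 33 (Phred+33).
        return [ord(ch) - offset for ch in self.quality]


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a text handle containing four-line FASTQ entries."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


if __name__ == "__main__":
    example = io.StringIO("@read1\nACGT\n+\nII5!\n")
    for record in read_fastq(example):
        print(record.read_id, record.sequence, record.phred_scores())
        # prints: read1 ACGT [40, 40, 20, 0]
```

For compressed `fastq.gz` input, the same function can be given a handle from `gzip.open(path, "rt")`.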

The MiniSeq and MiSeq Sequencing Systems provide the option to automatically convert data from BCL to FASTQ format, so separate conversion software is not required.

FASTQ ORA Sequence File Format

FASTQ ORA is a binary compressed version of the text-based FASTQ sequencing data file format. fastq.ora files are up to 5x smaller than their corresponding fastq.gz files without compromising data integrity. All fastq.ora files can be read using free decompression software. Once the software is installed, a single command can pipe the decompressed output on the fly into a wide range of popular mapping tools such as BWA [1], STAR [2], and Bowtie [3].

The binary base call (BCL) sequence file format requires conversion to FASTQ format for use with user-developed or third-party data analysis tools. The NextSeq, HiSeq, and NovaSeq Sequencing Systems generate raw data files in BCL format.

The DRAGEN Bio-IT Platform offers rapid BCL conversion to FASTQ files as part of its suite of pipelines.

Illumina also offers bcl2fastq Conversion Software to convert BCL files to FASTQ files. bcl2fastq is a standalone conversion software solution that demultiplexes data and converts BCL files to standard FASTQ file formats for downstream analysis.
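
To make the idea of demultiplexing concrete, the following Python sketch groups reads by index sequence. It is a conceptual illustration only and is not how bcl2fastq is implemented: bcl2fastq works from BCL files and a sample sheet, whereas this sketch assumes reads that are already in FASTQ form with the index sequence as the last colon-separated field of the header. The barcodes, sample names, and function names are hypothetical, and only exact barcode matches are assigned.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# A read is represented here as (header, sequence, quality).
Read = Tuple[str, str, str]


def index_from_header(header: str) -> str:
    """Return the text after the last ':' in the header.

    Assumption: the header ends with the index sequence,
    for example '... 1:N:0:ATCACG'.
    """
    return header.rsplit(":", 1)[-1]


def demultiplex(
    reads: Iterable[Read],
    barcode_to_sample: Dict[str, str],
    undetermined: str = "Undetermined",
) -> Dict[str, List[Read]]:
    """Group reads by sample using exact index matches.

    Reads whose index is not in barcode_to_sample are collected
    under the `undetermined` key.
    """
    bins: Dict[str, List[Read]] = defaultdict(list)
    for read in reads:
        index = index_from_header(read[0])
        sample = barcode_to_sample.get(index, undetermined)
        bins[sample].append(read)
    return dict(bins)


if __name__ == "__main__":
    # Hypothetical barcodes and sample names, for illustration only.
    samples = {"ATCACG": "sample_A", "TTAGGC": "sample_B"}
    reads = [
        ("@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG", "ACGT", "IIII"),
        ("@SIM:1:FCX:1:15:6330:1046 1:N:0:TTAGGC", "TTGA", "IIII"),
        ("@SIM:1:FCX:1:15:6331:1047 1:N:0:GGGGGG", "CCAT", "IIII"),
    ]
    for sample, group in demultiplex(reads, samples).items():
        print(sample, len(group))
```

Production demultiplexing typically also tolerates a small number of mismatches in the index; this sketch deliberately omits that.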

Bringing Bioinformatics Pipeline In-House Cuts Costs and Time

Phosphorus uses the DRAGEN Bio-IT Platform to perform genomics data analysis onsite and at an accessible price point.


FASTQ files are the typical starting format for sequencing data analysis. However, BaseSpace Sequence Hub can create other file formats that are common to secondary and tertiary analysis programs.

During secondary or tertiary analysis of NGS data, software platforms and apps in the BaseSpace Informatics Suite often convert FASTQ files to other sequence file formats (for example, .vcf or .bam) as part of the analysis workflow.

  1. Li H. and Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009 Jul 15; 25(14): 1754–1760.
  2. Dobin A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan; 29(1): 15–21.
  3. Langmead B. et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009; 10: R25.