Edge release 19.03: The Sequence Read Archive & more!

Evan Floden
19 March 2019

It’s time for the monthly Nextflow release for March, edge version 19.03. This is another great release with some cool new features, bug fixes and improvements.

SRA channel factory

This sees the introduction of the long-awaited sequence read archive (SRA) channel factory. The SRA is a key public repository for sequencing data and run in coordination between The National Center for Biotechnology Information (NCBI), The European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ).

This feature originates all the way back in 2015 and was worked on during a 2018 Nextflow hackathon. It was brought to fore again thanks to the release of Phil Ewels’ excellent SRA Explorer. The SRA channel factory allows users to pull read data in FASTQ format directly from SRA by referencing a study, accession ID or even a keyword. It works in a similar way to fromFilePairs, returning a sample ID and files (single or pairs of files) for each sample.

The code snippet below creates a channel containing 24 samples from a chromatin dynamics study and runs FASTQC on the resulting files.

Channel
    .fromSRA('SRP043510')
    .set{reads}

process fastqc {
    input:
    set sample_id, file(reads_file) from reads

    output:
    file("fastqc_${sample_id}_logs") into fastqc_ch

    script:
    """
    mkdir fastqc_${sample_id}_logs
    fastqc -o fastqc_${sample_id}_logs -f fastq -q ${reads_file}
    """
}

See the documentation for more details. When combined with downstream processes, you can quickly open a firehose of data on your workflow!

Edge release

Note that this is a monthly edge release. To use it simply execute the following command prior to running Nextflow:

export NXF_VER=19.03.0-edge

If you need help

Please don’t hesitate to use our very active Gitter channel or create a thread in the Google discussion group.

Reporting Issues

Experiencing issues introduced by this release? Please report them in our issue tracker. Make sure to fill in the fields of the issue template.

Contributions

Special thanks to the contributors of this release:

Akira Sekiguchi - pachiras
Jon Haitz Legarreta Gorroño - jhlegarreta
Jonathan Leitschuh - JLLeitschuh
Kevin Sayers - KevinSayers
Lukas Jelonek - lukasjelonek
Paolo Di Tommaso - pditommaso
Toni Hermoso Pulido - toniher
Philippe Hupé phupe
phue

Complete changes

Fix Nextflow hangs submitting jobs to AWS batch #1024
Fix process builder incomplete output [2fe1052c]
Fix Grid executor reports invalid queue status #1045
Fix Script execute permission is lost in container #1060
Fix K8s serviceAccount is not honoured #1049
Fix K8s kuberun login path #1072
Fix K8s imagePullSecret and imagePullPolicy #1062
Fix Google Storage docs #1023
Fix Env variable NXF_CONDA_CACHEDIR is ignored #1051
Fix failing task due to legacy sleep command [3e150b56]
Fix SplitText operator should accept a closure parameter #1021
Add Channel.fromSRA factory method #1070
Add voluntary/involuntary context switches to metrics #1047
Add noHttps option to singularity config #1041
Add docker-daemon Singularity support #1043 [dfef1391]
Use peak_vmem and peak_rss as default output in the trace file instead of rss and vmem #1020
Improve ansi log rendering #996 [33038a18]

Breaking changes:

None known.

nextflow release