With the year nearly over, we thought it was about time to pull together the best-of-the-best guide for learning Nextflow in 2020. These resources will support anyone on the journey from total noob to Nextflow expert. So this holiday season, give yourself or someone you know the gift of learning Nextflow!
We recommend that learners are comfortable with using the command line and the basic concepts of a scripting language such as Python or Perl before they start writing pipelines. Nextflow is widely used for bioinformatics applications, and the examples in these guides often focus on that field. However, Nextflow has now been adopted in a number of data-intensive domains such as radio astronomy, satellite imaging and machine learning. No domain expertise is expected.
We estimate that the speediest of learners can complete the material in around 12 hours. It all depends on your background and how deep you want to dive into the rabbit hole! Most of the content is introductory, with some more advanced dataflow and configuration material in the workshops and patterns sections.
Nextflow is an open-source workflow framework for writing and scaling data-intensive computational pipelines. It is designed around the Linux philosophy of simple yet powerful command-line and scripting tools that, when chained together, facilitate complex data manipulations. Combined with support for containerization, support for major cloud providers and on-premise architectures, Nextflow simplifies the writing and deployment of complex data pipelines on any infrastructure.
The following are some high-level reasons why people choose to adopt Nextflow:
This short YouTube video provides a general overview of Nextflow, the motivations behind its development and a demonstration of some of the latest features.
This hands-on tutorial from Seqera Labs will guide you through implementing a proof-of-concept RNA-seq pipeline. The goal is to become familiar with basic concepts, including how to define parameters, use channels for data and write processes to perform tasks. It includes all scripts, data and resources and is perfect for getting a flavor for Nextflow.
Here you’ll dive deeper into Nextflow’s most prominent features and learn how to apply them. The full workshop includes an excellent section on containers, how to build them and how to use them with Nextflow. The written materials come with examples and hands-on exercises. Optionally, you can also follow along with a series of videos from a live training workshop.
The workshop includes topics on:
This advanced section discusses recurring patterns and solutions to many common implementation requirements. The code examples come with notes to follow along with, and a companion GitHub repository is also available.
Nextflow Patterns & GitHub repository.
The following resources will help you dig deeper into Nextflow and other related projects like the nf-core community, which maintains curated pipelines and a very active Slack channel. There are plenty of Nextflow tutorials and videos online, and the following list is in no way exhaustive. Please let us know if we are missing something.
The reference for the Nextflow language and runtime. The docs should be your first point of reference when something is not clear. The newest features are documented in the edge documentation pages, which are released every month, while the latest stable releases come out every three months.
Latest stable & edge documentation.
nf-core is a growing community of Nextflow users and developers. You can find curated sets of biomedical analysis pipelines, built with Nextflow by domain experts, that have passed tests and have been implemented according to best-practice guidelines. Be sure to sign up to the Slack channel.
Nextflow Tower is a platform to easily monitor, launch and scale Nextflow pipelines on cloud providers and on-premise infrastructure. The documentation provides details on setting up compute environments, monitoring pipelines and launching them using either the graphical web interface or the API.
A quickstart for deploying a genomics analysis environment on Amazon Web Services (AWS) cloud, using Nextflow to create and orchestrate analysis workflows and AWS Batch to run the workflow processes.
A step-by-step guide to launching Nextflow pipelines on Google Cloud.
A collection of Nextflow-based pipelines and other resources.
Nextflow is a community-driven project. The list of links below has been collated from a diverse collection of resources and experts to guide you in learning Nextflow. If you have any suggestions, please make a pull request to this page on GitHub.
Also stay tuned for our upcoming post, where we will discuss the ultimate Nextflow development environment.