Welcome to gwf

gwf is a flexible, pragmatic workflow tool for building and running large, scientific workflows. It runs on Python 3.5+ and is developed at the Bioinformatics Research Centre (BiRC), Aarhus University.


To get a feeling for what a gwf workflow looks like, have a look at a few examples.

Getting started

To quickly get started writing workflows in gwf you can read the Tutorial.


We don’t have the backend you need to run your workflow on your cluster? See the Writing Backends section to roll your own.


We aim to make gwf a community-developed project. Learn how to contribute.


To install gwf via pip:

pip install gwf

To install gwf via conda:

conda config --add channels gwforg
conda install gwf

We recommend that you install gwf in a project-specific environment:

conda config --add channels gwforg
conda create -n myproject python=3.5 gwf dep1 dep2 ...
source activate myproject

You can find the code for gwf here. You are encouraged to report any issues through the issue tracker, which is also a good place to ask questions.

Change Log

Version 1.6.0


  • Named inputs and outputs. The inputs and outputs arguments to Workflow.target can now be either a string, list or dictionary. See the documentation for more details.

  • Tutorial now explains what happens if a target fails.

  • Documentation now has an official list of gwf plugins.
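The named inputs and outputs described above accept three shapes: a single string, a list, or a dictionary. The sketch below shows one way such values could be flattened into a plain list of paths. `normalize_files` is a hypothetical helper written for illustration, not part of the gwf API, and it assumes that dictionary values may themselves be strings or lists.

```python
def normalize_files(files):
    """Flatten a string, list, or dict of paths into a list of paths.

    Hypothetical helper for illustration only; not part of gwf.
    """
    if isinstance(files, str):
        # A single path given as a plain string.
        return [files]
    if isinstance(files, dict):
        # Named files: each value may be a string or a list (assumption).
        flat = []
        for value in files.values():
            flat.extend(normalize_files(value))
        return flat
    # Anything else is treated as an iterable of paths.
    return list(files)
```

With this sketch, a dictionary such as `{"reads": ["r1.fq", "r2.fq"], "ref": "ref.fa"}` flattens to the three paths in insertion order, while the names stay available to the target's spec.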


  • Fixed a crash when running gwf init without an existing configuration file.

Version 1.5.1


  • Fixed a crash when Slurm returns an unknown job state (#244).

Version 1.5.0


  • Users can now run gwf init to bootstrap a new gwf project (c78193).

  • Add option to protect output files in a target from being removed when gwf clean is being run (2f51ed).


  • Ensure job scripts end with a newline (#239).

  • Ignore missing log files when cleaning on run (#237).

Version 1.4.0


  • Backend for Sun Grid Engine (SGE). The backend does not support all target options supported by the Slurm backend, so workflows cannot necessarily run with the SGE backend without changes. See the documentation for a list of supported options.

Version 1.3.2


  • Made the touch command faster.

Version 1.3.1


  • The gwf status command now accepts multiple -s/--status flags and will show targets matching any of the given states. E.g. gwf status -s completed -s running will show all completed and running targets.

  • A new command gwf touch has been introduced. The command touches all files in the workflow in order, creating missing files and updating timestamps, such that gwf thinks that the workflow has been run.

  • When specifying the workflow attribute in the workflow path, e.g. gwf -f workflow.py:foo, the filename part can now be left out and will default to workflow.py. For example, gwf -f :foo will access the foo workflow object in workflow.py.

  • Documentation describing advanced patterns for gwf workflows.
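The workflow path rule above (the filename part may be omitted and defaults to workflow.py) can be sketched as a small parser. `parse_workflow_spec` is a hypothetical function, not gwf's own implementation, and the default attribute name `gwf` used when no attribute is given is an assumption.

```python
def parse_workflow_spec(spec, default_file="workflow.py", default_obj="gwf"):
    """Split 'path:attribute' into (path, attribute), filling in defaults.

    Hypothetical sketch; default_obj="gwf" is an assumption.
    """
    # partition splits on the first colon; missing parts come back as "".
    path, _, obj = spec.partition(":")
    return (path or default_file, obj or default_obj)
```

Under these assumptions, `parse_workflow_spec(":foo")` and `parse_workflow_spec("workflow.py:foo")` give the same result. The sketch does not handle paths that themselves contain a colon.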

Version 1.3.0

This release contains a bunch of new features and plenty of bug fixes. Most noteworthy is the removal of the progress bars in the status command. The status bars were often confusing and didn’t communicate much more than a simple “percentage completion”. The status command now outputs a table with target name, target status, and percentage completion (see the tutorial for examples). Additionally, the status command now shows all targets by default (not only endpoints). For users who wish to only see endpoints, there’s now a --endpoints flag.

We aim to make gwf a good cluster citizen. Thus, logs from targets that no longer exist in the workflow will now be removed when running gwf run. This ensures that gwf doesn’t unnecessarily accumulate logs over time.
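The log cleanup described above comes down to comparing log file names against the targets currently defined. A minimal sketch, assuming each log file is named after its target with a `.stdout` or `.stderr` suffix (an assumption about the layout); `find_orphan_logs` is a hypothetical helper, not gwf's implementation:

```python
from pathlib import PurePath


def find_orphan_logs(log_files, target_names):
    """Return log file names whose target is no longer in the workflow.

    Hypothetical sketch; assumes logs are named <target>.stdout/.stderr.
    """
    orphans = []
    for name in log_files:
        path = PurePath(name)
        # Skip files that do not look like target logs.
        if path.suffix not in (".stdout", ".stderr"):
            continue
        # The stem is the file name without its suffix, i.e. the target name.
        if path.stem not in target_names:
            orphans.append(name)
    return orphans
```

The function only reports candidates; deleting them is left to the caller so that the selection logic can be checked without touching the file system.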


  • Add missing import to documentation for function templates (4eddcac).

  • Remove reference to --not-endpoints flag (d7ed251).

  • Remove broken badges in README (e352f09).

  • Remove pre-1.0 upgrade documentation (bfa03da6).

  • Fixed bug in scheduler that caused an exception when a target’s input file did not exist, but the output file did (reported by Jonas Berglund) (92301ef3).


  • Dots have been removed from logging output to make copy-pasting target names easier (f33f7195).

  • Now uses pipenv to fix development environment.

  • Improved coloring of logging output when running with -v debug (ab4ac7e3).

  • Remove status bars in gwf status command (47cb7b50).


  • Added undocumented API which allows core and plugins to register validation functions for configuration keys. This fixes issues like #226 (c8c57d7c7).

  • The gwf clean command now shows how much data will be removed (d81f143f1).

  • Remove log files for targets that are no longer defined in the workflow (beb912bd).

  • Note in tutorial on how to terminate the local workers (along with other updates to the tutorial) (34421498).

Version 1.2.1


  • Fixed a bug when returning an AnonymousTarget from a template function without specifying the working_dir in the constructor (#212). Thanks to Steffen Møller-Larsen for reporting this.

Version 1.2


  • Fixed a bug when using --format table and no targets were found (#203).

  • Fixed a bug when cancelling a target running on the Slurm backend (#199).

  • Link to documentation in error message when unable to connect to local workers.

  • Fixed bug in the FileLogManager where the wrong exception was raised when no log was found.


  • Moved checking of file timestamps to the scheduler. This means that creating a Graph object will never touch the file system, and thus won’t raise an exception if a target depends on a file that doesn’t exist and that’s not provided by a target. Instead, unresolved paths are added to Graph.unresolved. They will then be checked by the scheduler (if necessary). For end users, this means that many commands have become substantially faster.
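The idea of collecting unresolved paths without touching the file system can be sketched as a set computation: an input is unresolved when no target lists it as an output. `find_unresolved` and the `(inputs, outputs)` tuple layout are hypothetical and only illustrate the concept; they are not the Graph implementation.

```python
def find_unresolved(targets):
    """Return input paths that no target provides as an output.

    targets maps a target name to an (inputs, outputs) pair.
    Hypothetical sketch; no file system access is performed.
    """
    # Every path produced by some target in the workflow.
    provided = {path for _, outputs in targets.values() for path in outputs}
    # Inputs that nothing produces must already exist on disk; whether
    # they do is left for a later step (the scheduler, in gwf's case).
    return {
        path
        for inputs, _ in targets.values()
        for path in inputs
        if path not in provided
    }
```

For a two-step workflow where `align` produces `aln.bam` and `call` consumes it, only the raw inputs of `align` end up in the unresolved set.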


  • Added AnonymousTarget which represents an unnamed target. Target now inherits from this class and templates may now return an AnonymousTarget instead of a tuple.

  • Added backend.slurm.log_mode option, see the documentation for the Slurm backend for usage (#202).

Version 1.1


  • Fixed very slow scheduling when using dry run with unsubmitted targets (#184, 93e71a).

  • Fixed cancellation with the Slurm backend (#183, 29445f).

  • Fixed wildcard filtering of targets (#185, 036e3d).


  • Move file cache construction out of Graph (#186, 93e71a). This change is invisible to end-users, but speeds up the logs, cancel, info and workers commands.

  • Replaced --not-endpoints flag in clean command with --all flag.

  • Made filtering more intuitive in all commands.

  • The info command now outputs JSON instead of invalid YAML.

  • The info command outputs information for all targets in the workflow by default.

  • Backends must now specify a log_manager class attribute specifying which log manager to use for accessing target log files.

  • Backends should now be used as context managers to make sure that Backend.close() is called when the backend is no longer needed, as it is no longer called automatically on exit.
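The context manager requirement above follows the standard Python protocol: `__exit__` calls `close()`, so the backend is closed even if the body raises. The sketch below uses a stand-in class, `DemoBackend`, which is hypothetical and does not reproduce any real gwf backend.

```python
class DemoBackend:
    """Stand-in backend used only to illustrate the context manager pattern."""

    def __init__(self):
        self.closed = False

    def close(self):
        # A real backend would flush state or release connections here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close; returning False lets any exception propagate.
        self.close()
        return False


def run_with_backend():
    """Use the backend in a with-block and report its state inside and after."""
    with DemoBackend() as backend:
        open_inside = not backend.closed
    return open_inside, backend.closed
```

Code that previously relied on the backend being closed automatically on exit would wrap its backend usage in a `with` block in the same way.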


  • Added filtering of targets by name in the info command.

  • Added API documentation for the gwf.filtering module.

  • Added gwf.core.graph_from_path() and gwf.core.graph_from_config().

  • Added gwf.backends.list_backends(), gwf.backends.backend_from_name() and gwf.backends.backend_from_config().

  • Added SlurmBackend.get_job_id() and SlurmBackend.forget_job() to make it easier for plugins to integrate with Slurm.

  • Documentation for log managers.

  • Documentation on how to handle large workflows.

Version 1.0

First stable release of gwf! We strongly encourage users of pre-1.0 versions to read the tutorial, since quite a lot of things have changed. We also recommend reading the guide for converting pre-1.0 workflows to version 1.0. However, users attempting to do this should be aware that the template mechanism in 1.0 is slightly different and thus requires rewriting template functions.


  • Fixed a bug which caused gwf to fail when cancelling jobs when using the Slurm backend (8c1717).


  • Documentation in various places, especially the core API.

  • Documentation for maintainers.


  • Topic guide covering templates (b175fe).

  • Added info command (6dbdbb).

Version 1.0b10


  • Fixed a subtle bug in scheduling which caused problems when resubmitting a workflow where some targets were already running (a5d884).

  • Fixed a bug in the SlurmBackend which caused gwf to crash if the Slurm queue contained a job with many dependencies (eb4446).

  • Added back the -e flag in the logs command.

Version 1.0b9


  • Fixed a bug in the SlurmBackend which caused running targets to be reported as unknown (33a6bd).


  • The Slurm backend’s database of tracked jobs is now cleaned on initialization to keep it from growing indefinitely (bd3f95).

Version 1.0b8


  • Fixed a bug which caused the gwf logs command to always show stderr (01b267).

  • Fixed a bug which caused dependencies to be set incorrectly when two targets depended on the same target (4d9e07).


  • Improved error message when trying to create a target from an invalid template (d27d1f).

  • Improved error message when assigning a non-string spec to a target (2aca0a).

  • gwf logs command now outputs logs via a pager when the system supports it, unless --no-pager is used (01b267).


  • Added more tests to cover scenarios with included workflows when building the workflow graph (86a68d0).

  • Added a bunch of documentation (69e136, 51a0e7, 942b05).

Version 1.0b7


  • Fixed bug in scheduling which was actually the cause of the incorrect scheduling that was “fixed” in 1.0b6. Also added documentation for gwf.core.schedule (7c47cb).


  • Updated documentation in a bunch of places, mostly styling.

Version 1.0b6


  • Fixed a bug in SlurmBackend which caused dependencies between targets to not be set correctly (6b71d2).


  • More improvements to and clean up of build process.

  • Updated some examples in the tutorial with current output from gwf (42c5da).

  • Logging output is now more consistent (b95af04).


  • Documentation for maintainers on how to merge in contributions and rolling a new release (fe1ee3).

Version 1.0b5


  • Fixed an error caused by passing an unset option to the backend (#166, dcff44).

  • Set import path to allow import of module in workflow file (64841c).


  • Vastly improved build and deploy process. We’re now actually building and testing with conda.


  • Thomas Mailund

  • Dan Søndergaard

  • Anders Halager

  • Michael Knudsen

  • Tobias Madsen