Datasets

Overview

The Datasets feature in Nextflow Tower allows users to store CSV- and TSV-formatted dataset files in a workspace, for use as input to one or more pipelines.

For your pipeline to use a dataset as input at runtime, the relevant parameters in your pipeline-schema must describe the dataset and its file format. We recommend using the nf-core tools schema build feature to simplify schema creation. The nf-core schema commands also include an option to validate and lint your schema file against best-practice guidelines from the nf-core community.
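As an illustration of what such a parameter definition might contain, the following Python sketch builds a schema fragment for a parameter named `input` and prints it as JSON. The keys (`type`, `format`, `mimetype`, `pattern`) follow conventions commonly seen in nf-core pipeline schemas. The parameter name, the description text, and the helper `build_input_param` are assumptions made for this example, so check the exact keys your Tower version inspects before relying on it.

```python
import json


def build_input_param(mimetype="text/csv", extension="csv"):
    """Return a schema fragment for a hypothetical `input` parameter.

    The structure mirrors nf-core schema conventions; it is an
    illustrative sketch, not an authoritative Tower requirement.
    """
    return {
        "input": {
            "type": "string",
            "format": "file-path",
            # Declares the expected file format of the dataset.
            "mimetype": mimetype,
            # Accept only non-empty paths ending in the given extension.
            "pattern": rf"^\S+\.{extension}$",
            "description": "Path to the sample sheet (illustrative).",
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_input_param(), indent=2))
```

For a TSV dataset, the same helper could be called as `build_input_param("text/tab-separated-values", "tsv")`.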

Note

This feature is only available in organization workspaces.

Creating a new Dataset

To create a new dataset, follow these steps:

  1. Open the Datasets tab in your organization workspace.

  2. Select New dataset to open the dataset creation dialog.

  3. Complete the Name and Description fields using information relevant to your dataset.

  4. Add the dataset file to your workspace using drag-and-drop or the system file explorer dialog.

  5. For dataset files that use the first row for column names, select the First row as header option to customize how the dataset is displayed.

Warning

The dataset file cannot exceed 10 MB in size.
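Before uploading, it may be convenient to check a file locally against the constraints described above. The following is a minimal sketch, assuming that the limit means 10 × 1024 × 1024 bytes (the source does not say whether MB is decimal or binary) and that the format can be judged from the `.csv` or `.tsv` file extension. The function name `check_dataset_file` is hypothetical and not part of Tower.

```python
from pathlib import Path

# Assumption: "10MB" is interpreted as 10 MiB; adjust if Tower uses 10 * 10**6.
MAX_BYTES = 10 * 1024 * 1024
ALLOWED_SUFFIXES = {".csv", ".tsv"}


def check_dataset_file(path):
    """Return a list of problems; an empty list means no problem was found."""
    p = Path(path)
    problems = []
    # Judge the format by extension only (a simplification).
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append("file does not exist")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(
            f"file is {p.stat().st_size} bytes, over the {MAX_BYTES}-byte limit"
        )
    return problems
```

Calling `check_dataset_file("samples.csv")` returns an empty list when the file exists, has an accepted extension, and is within the assumed size limit.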

Dataset versions

The Datasets feature supports multiple versions of a dataset. To add a new version of a dataset, follow these steps:

  1. Select Edit next to the dataset you wish to update.

  2. In the Edit dialog, select Add a new version.

  3. Upload the newer version of the dataset and select Update.

Warning

All subsequent versions of a dataset must be in the same data format as the initial version.
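The format rule above can be checked locally before uploading a new version. This is a rough sketch that assumes the data format can be inferred from the file extension alone. The helper names `detect_format` and `assert_same_format` are hypothetical and do not correspond to any Tower API.

```python
from pathlib import Path

# Assumption: format is inferred from the extension, not the file contents.
FORMATS = {".csv": "csv", ".tsv": "tsv"}


def detect_format(filename):
    """Map a file name to 'csv' or 'tsv'; raise ValueError otherwise."""
    suffix = Path(filename).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"not a CSV or TSV file: {filename}")
    return FORMATS[suffix]


def assert_same_format(initial_file, new_file):
    """Raise ValueError if the new version's format differs from the initial one."""
    first, new = detect_format(initial_file), detect_format(new_file)
    if first != new:
        raise ValueError(f"initial version is {first}, new version is {new}")
```

For example, `assert_same_format("samples_v1.csv", "samples_v2.tsv")` raises a `ValueError`, while two `.csv` files pass silently.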

Using a Dataset

To use a dataset with the saved pipelines in your workspace, follow these steps:

  1. From the Launchpad, open any pipeline that contains a pipeline-schema.

  2. Select the input field for the pipeline, removing any default value.

  3. Pick the desired dataset for your pipeline.

Warning

The datasets shown in the dropdown menu depend on the validation rules in your pipeline-schema. If the schema specifies only CSV format, TSV datasets will not appear in the dropdown.
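The filtering described in this warning can be modeled as a simple match between each dataset's format and the format declared in the schema. The sketch below is only a conceptual model, assuming the format is expressed as a media type string such as `text/csv`. The `Dataset` class and `selectable_datasets` function are hypothetical and are not Tower's implementation.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical stand-in for a workspace dataset."""
    name: str
    media_type: str  # e.g. "text/csv" or "text/tab-separated-values"


def selectable_datasets(datasets, schema_mimetype):
    """Return only the datasets whose format matches the schema's declared format."""
    return [d for d in datasets if d.media_type == schema_mimetype]


if __name__ == "__main__":
    workspace = [
        Dataset("samples_csv", "text/csv"),
        Dataset("samples_tsv", "text/tab-separated-values"),
    ]
    # With a CSV-only schema, only the CSV dataset remains selectable.
    for d in selectable_datasets(workspace, "text/csv"):
        print(d.name)
```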