The Datasets feature in Nextflow Tower allows users to store CSV- and TSV-formatted dataset files in a workspace and use them as input to one or more pipelines.
For your pipeline to use a dataset as input at runtime, information about the dataset and its file format must be included in the relevant parameters of your pipeline-schema. We recommend using the nf-core tools schema build feature to simplify the schema creation process. The schema commands also include an option to validate and lint your schema file against best-practice guidelines from the nf-core community.
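As a rough illustration of what this parameter information might look like, the sketch below builds a single parameter entry and prints it as JSON. The parameter name `input`, the helper `schema_fragment`, and the specific values of `format`, `mimetype`, and `pattern` are assumptions modeled on common nf-core schemas; this page does not specify them, so check them against the schema generated for your own pipeline.

```python
import json

# Hypothetical schema entry for a pipeline parameter named "input".
# The keys follow JSON Schema conventions seen in nf-core schemas;
# the values are illustrative assumptions, not taken from this page.
input_param = {
    "type": "string",
    "format": "file-path",
    "mimetype": "text/csv",
    "pattern": r"^\S+\.csv$",
    "description": "Path to a CSV samplesheet.",
}


def schema_fragment(name, definition):
    """Wrap one parameter definition in a minimal 'properties' object."""
    return {"properties": {name: definition}}


fragment = schema_fragment("input", input_param)
print(json.dumps(fragment, indent=2))
```

A real schema file contains more than this fragment (title, grouped definitions, required fields), which is why generating it with the nf-core tooling is usually less error-prone than writing it by hand.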
This feature is only available in organization workspaces.
Creating a new Dataset
To create a new dataset, follow these steps:
Open the Datasets tab in your organization workspace.
Select New dataset to open the dataset creation dialog.
Complete the Name and Description fields using information relevant to your dataset.
Add the dataset file to your workspace by drag-and-drop or through the system file explorer dialog.
For dataset files that use the first row for column names, you can customize how the dataset is displayed with the First row as header option.
The size of the dataset file cannot exceed 10MB.
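The constraints above (CSV or TSV format, at most 10MB) can be checked locally before uploading. The following is a minimal sketch, not part of Tower: the function name `check_dataset_file` is hypothetical, the format is inferred from the file extension only, and "10MB" is read as 10 × 1024 × 1024 bytes, which is an assumption since the exact byte limit is not stated here.

```python
from pathlib import Path

# Assumption: "10MB" is interpreted as 10 MiB; the exact limit may differ.
MAX_BYTES = 10 * 1024 * 1024
ALLOWED_SUFFIXES = {".csv", ".tsv"}


def check_dataset_file(path):
    """Return a list of problems; an empty list means the local checks passed."""
    p = Path(path)
    problems = []
    # Format check based on the extension only (contents are not inspected).
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(
            f"file is {p.stat().st_size} bytes, over the {MAX_BYTES}-byte limit"
        )
    return problems
```

Calling `check_dataset_file("samples.csv")` on a small existing file returns an empty list; any returned messages describe what to fix before uploading.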
The Datasets feature can accommodate multiple versions of a dataset. To add a new version for a dataset, follow these steps:
Select Edit next to the dataset you wish to update.
In the Edit dialog, select Add a new version.
Upload the newer version of the dataset and select Update.
All subsequent versions of a dataset must be in the same data format as the initial version.
Using a Dataset
To use a dataset with the saved pipelines in your workspace, follow these steps:
Open any pipeline that contains a pipeline-schema from the Launchpad.
Select the input field for the pipeline, removing any default value.
Pick the desired dataset for your pipeline.
The datasets shown in the dropdown menu depend on the validation rules in your pipeline-schema. If the schema specifies only CSV format, no TSV datasets appear in the dropdown.
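The filtering described above can be pictured with a small sketch. This is not Tower's implementation: the function `selectable_datasets`, the dataset records, and the idea that the schema expresses the accepted format through a `mimetype` value such as `text/csv` or `text/tsv` are all assumptions made for illustration. Returning every dataset when no `mimetype` is declared is likewise a choice made for this sketch only.

```python
# Hypothetical mapping from a dataset's format to the mimetype a schema might declare.
FORMAT_MIMETYPES = {"csv": "text/csv", "tsv": "text/tsv"}


def selectable_datasets(datasets, param_schema):
    """Return the names of datasets whose format matches the declared mimetype."""
    wanted = param_schema.get("mimetype")
    if wanted is None:
        # Sketch-only assumption: no declared mimetype means no filtering.
        return [d["name"] for d in datasets]
    return [
        d["name"]
        for d in datasets
        if FORMAT_MIMETYPES.get(d["format"]) == wanted
    ]


# Illustrative records; real datasets live in the workspace, not in code.
datasets = [
    {"name": "samples_v1", "format": "csv"},
    {"name": "metadata", "format": "tsv"},
]

print(selectable_datasets(datasets, {"mimetype": "text/csv"}))  # ['samples_v1']
```

With a schema entry that declares only CSV, the TSV dataset `metadata` is excluded, mirroring what the dropdown would show.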