Version: 22.4.0

Frequently Asked Questions

General Questions

Administration Console

Q: How do I access the Administration Console?

The Administration Console allows Tower instance administrators to interact with all users and organizations registered with the platform. Administrators must be identified in your Tower instance configuration files prior to instantiation of the application.

  1. Create a TOWER_ROOT_USERS environment variable (e.g. via tower.env or Kubernetes ConfigMap).

  2. Populate the variable with a sequence of comma-delimited email addresses (no spaces).
    Example: TOWER_ROOT_USERS=foo@foo.com,bar@bar.com

  3. If using a Tower version earlier than 21.12:

    1. Add the following configuration to tower.yml:
    tower:
      admin:
        root-users: "${TOWER_ROOT_USERS:[]}"
  4. Restart the cron and backend containers/Deployments.

  5. The console will now be available via your Profile drop-down menu.

API

Q: I am trying to query more results than the maximum return size allows. Can I use pagination?

Yes. We recommend using pagination to fetch the results in smaller chunks through multiple API calls, using the max and offset query parameters. You will receive an error like the following if you exceed the maximum result limit:

{object} length parameter cannot be greater than 100 (current value={value_sent})

We have laid out an example below using the workflow endpoint.

curl -X GET "https://$TOWER_SERVER_URL/workflow/$WORKFLOW_ID/tasks?workspaceId=$WORKSPACE_ID&max=100" \
-H "Accept: application/json" \
-H "Authorization: Bearer $TOWER_ACCESS_TOKEN"

curl -X GET "https://$TOWER_SERVER_URL/workflow/$WORKFLOW_ID/tasks?workspaceId=$WORKSPACE_ID&max=100&offset=100" \
-H "Accept: application/json" \
-H "Authorization: Bearer $TOWER_ACCESS_TOKEN"

Q: Why am I receiving a 403 HTTP Response when trying to launch a pipeline via the /workflow/launch API endpoint?

Launch users have more restricted permissions within a Workspace than Power users. While both can launch pipelines via API calls, Launch users must specify additional values that are optional for a Power user.

One such value is launch.id; attempting to launch a pipeline without specifying a launch.id in the API payload is equivalent to using the "Start Quick Launch" button within a workspace (a feature only available to Power users).

If you have encountered the 403 error as a result of being a Launch user who did not provide a launch.id, please try resolving the problem as follows:

  1. Provide the launch ID in the payload sent to Tower via the same endpoint (a curl sketch follows this list). To do this:

    1. Query the list of pipelines via the /pipelines endpoint. Find the pipelineId of the pipeline you intend to launch.

    2. Once you have the pipelineId, call the /pipelines/{pipelineId}/launch API to retrieve the pipeline's launch.id.

    3. Include the launch.id in your call to the /workflow/launch API endpoint (see example below).

      {
        "launch": {
          "id": "Q2kVavFZNVCBkC78foTvf",
          "computeEnvId": "4nqF77d6N1JoJrVrrgB8pH",
          "runName": "sample-run",
          "pipeline": "https://github.com/sample-repo/project",
          "workDir": "s3://myBucketName",
          "revision": "main"
        }
      }
  2. If a launch id remains unavailable to you, upgrade your user role to 'Maintain' or higher. This will allow you to execute quick launch-type pipeline invocations.
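
For reference, steps 1.1 and 1.2 might look like the following curl sketch. The jq filters assume typical Tower API response shapes (a pipelines array, and a launch object); confirm the field names against your Tower version.

# Step 1.1: list pipelines and find the pipelineId you intend to launch
curl -s "https://$TOWER_SERVER_URL/pipelines?workspaceId=$WORKSPACE_ID" \
  -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" | jq '.pipelines[] | {pipelineId, name}'

# Step 1.2: retrieve that pipeline's launch.id
curl -s "https://$TOWER_SERVER_URL/pipelines/$PIPELINE_ID/launch?workspaceId=$WORKSPACE_ID" \
  -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" | jq -r '.launch.id'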

Common Errors

Q: After following the log-in link, why is my screen frozen at /auth?success=true?

Starting with v22.1, Tower Enterprise implements stricter cookie security by default and will only send an auth cookie if the client is connected via HTTPS. The lack of an auth token will cause HTTP-only log-in attempts to fail (thereby causing the frozen screen).

To remediate this problem, set the following environment variable: TOWER_ENABLE_UNSAFE_MODE=true.

Q: "Unknown pipeline repository or missing credentials" error when pulling from a public Github repository?

Github imposes rate limits on repository pulls (including public repositories), where unauthenticated requests are capped at 60 requests/hour and authenticated requests are capped at 5000/hour. Tower users tend to encounter this error due to the 60 request/hour cap.

To resolve the problem, please try the following:

  1. Ensure there is at least one GitHub credential in your Workspace's Credentials tab.

  2. Ensure that the Access token field of all GitHub Credential objects is populated with a Personal Access Token value, NOT a user password. (GitHub PATs are typically several dozen characters long and begin with a ghp_ prefix; example: ghp_IqIMNOZH6zOwIEB4T9A2g4EHMy8Ji42q4HA)

  3. Confirm that your PAT is providing the elevated threshold and transactions are being charged against it:

    curl -H "Authorization: token ghp_LONG_ALPHANUMERIC_PAT" -H "Accept: application/vnd.github.v3+json" https://api.github.com/rate_limit

Q: "Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect)" error.

This error can occur if incorrect configuration values are assigned to the backend and cron containers' MICRONAUT_ENVIRONMENTS environment variable. You may see other unexpected system behaviour, like two exact copies of the same Nextflow job being submitted to the Executor for scheduling.

Please verify the following:

  1. The MICRONAUT_ENVIRONMENTS environment variable associated with the backend container:
    • Contains prod,redis,ha
    • Does not contain cron
  2. The MICRONAUT_ENVIRONMENTS environment variable associated with the cron container:
    • Contains prod,redis,cron
    • Does not contain ha
  3. You do not have another copy of the MICRONAUT_ENVIRONMENTS environment variable defined elsewhere in your application (e.g. a tower.env file or Kubernetes ConfigMap).
  4. If you are using a separate container/pod to execute migrate-db.sh, there is no MICRONAUT_ENVIRONMENTS environment variable assigned to it.
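
For example, a docker-compose-style split of the variable might look like the following (a sketch; adapt to however your deployment injects environment variables into each container):

# backend container
MICRONAUT_ENVIRONMENTS=prod,redis,ha

# cron container
MICRONAUT_ENVIRONMENTS=prod,redis,cron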

Q: Why do I get a chmod: cannot access PATH/TO/bin/*: No such file or directory exception?

This error is thrown if you attempt to run chmod against an S3/Fusion-backed work directory that contains only hidden files.

The behaviour is patched in Nextflow v22.09.7-edge. If you are unable to upgrade, please see the original bug report for alternative workarounds.

Q: "No such variable" error.

This error can occur if you execute a DSL 1-based Nextflow workflow using Nextflow 22.03.0-edge or later, where DSL 2 is the default syntax.

Q: Does the sleep command work the same way across my entire script?

The sleep commands within your Nextflow workflows may differ in behaviour depending on where they are used:

  • If used within an errorStrategy block, the Groovy sleep function will be used (which takes its value in milliseconds).
  • If used within a process script block, that language's sleep binary/method will be used. For example, a Bash script block uses the Bash sleep binary, which takes its value in seconds.

Q: Why does re-launching/resuming a run fail with field revision is not writable?

A known issue with Tower versions prior to 22.3 caused resuming runs to fail for users with the launch role. This issue was fixed in Tower 22.3. Upgrade to the latest version of Tower to allow launch users to resume runs.

Compute Environments

Q: Can the name of a Compute Environment created in Tower contain special characters?

No. Tower versions 21.12 and later do not support special characters in the names of Compute Environment objects.

Q: How do I set NXF_OPTS values for a Compute Environment?

This depends on your Tower version:

  • For v22.1.1+, specify the values via the Environment variables section of the "Add Compute Environment" screen.

  • For versions earlier than v22.1.1, specify the values via the Staging options > Pre-run script textbox on the "Add Compute Environment" screen. Example:

    export NXF_OPTS="-Xms64m -Xmx512m"

Containers

Q: Can I use rootless containers in my Nextflow pipelines?

Most containers use the root user by default. However, some users prefer to define a non-root user in the container in order to minimize the risk of privilege escalation. Because Nextflow and its tasks use a shared work directory to manage input and output data, using rootless containers can lead to file permissions errors in some environments:

touch: cannot touch '/fsx/work/ab/27d78d2b9b17ee895b88fcee794226/.command.begin': Permission denied

As of Tower 22.1.0, this issue should not occur when using AWS Batch. In other situations, you can avoid this issue by forcing all task containers to run as root. To do so, add one of the following snippets to your Nextflow configuration:

// cloud executors
process.containerOptions = "--user 0:0"

// Kubernetes
k8s.securityContext = [
    "runAsUser": 0,
    "runAsGroup": 0
]

Databases

Q: Help! I upgraded to Tower Enterprise 22.2.0 and now my database connection is failing.

Tower Enterprise 22.2.0 introduced a breaking change whereby the TOWER_DB_DRIVER is now required to be org.mariadb.jdbc.Driver.

Clients who use Amazon Aurora as their database solution may encounter a java.sql.SQLNonTransientConnectionException: ... could not load system variables error, likely due to a known error tracked within the MariaDB project.

Please modify Tower Enterprise configuration as follows to try resolving the problem:

  1. Ensure your TOWER_DB_DRIVER is set to the MariaDB driver class specified above (org.mariadb.jdbc.Driver).
  2. Modify your TOWER_DB_URL to: TOWER_DB_URL=jdbc:mysql://YOUR_DOMAIN:YOUR_PORT/YOUR_TOWER_DB?usePipelineAuth=false&useBatchMultiSend=false

Datasets

Q: Why are uploads of Datasets via direct calls to the Tower API failing?

When uploading Datasets via the Tower UI or CLI, some steps are handled automatically on your behalf. To upload Datasets via the Tower API, additional steps are required:

  1. Explicitly define the MIME type of the file being uploaded.
  2. Make two calls to the API:
    1. Create a Dataset object
    2. Upload the samplesheet to the Dataset object.

Example:

# Step 1: Create the Dataset object
$ curl -X POST "https://api.cloud.seqera.io/workspaces/$WORKSPACE_ID/datasets/" -H "Content-Type: application/json" -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" --data '{"name":"placeholder", "description":"A placeholder for the data we will submit in the next call"}'

# Step 2: Upload the datasheet into the Dataset object
$ curl -X POST "https://api.cloud.seqera.io/workspaces/$WORKSPACE_ID/datasets/$DATASET_ID/upload" -H "Accept: application/json" -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" -H "Content-Type: multipart/form-data" -F "file=@samplesheet_full.csv; type=text/csv"
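
Note that the $DATASET_ID used in Step 2 comes from the response to Step 1. Assuming the creation response nests the new object under a dataset key (verify against your Tower version), you might capture it like this:

# Capture the new Dataset's ID from the Step 1 response (sketch)
DATASET_ID=$(curl -s -X POST "https://api.cloud.seqera.io/workspaces/$WORKSPACE_ID/datasets/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" \
  --data '{"name":"placeholder", "description":"A placeholder dataset"}' | jq -r '.dataset.id')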

You can also use the tower-cli to upload the dataset to a particular workspace.

tw datasets add --name "cli_uploaded_samplesheet" ./samplesheet_full.csv

Q: Why is my uploaded Dataset not showing in the Tower Launch screen input field drop-down?

When launching a Nextflow workflow from the Tower GUI, the input field drop-down will only show Datasets whose mimetypes match the rules specified in the associated nextflow_schema.json file. If your Dataset has a different mimetype than specified in the pipeline schema, Tower will not present the file.

Note that a known issue in Tower 22.2 which caused TSV datasets to be unavailable in the drop-down has been fixed in version 22.4.1.

Example: The default nf-core RNASeq pipeline specifies that only files with a csv mimetype should be provided as an input file. If you created a Dataset with the tsv mimetype, it would not appear as an input field drop-down option.

Q: Can an input file mimetype restriction be added to the nextflow_schema.json file generated by the nf-core pipeline schema builder tool?

As of August 2022, it is possible to add a mimetype restriction to the nextflow_schema.json file generated by the nf-core schema builder tool, but this must be done manually after generation, not during. Please refer to this RNASeq example to see how the mimetype key-value pair should be specified.

Q: Why are my datasets converted to 'application/vnd.ms-excel' data type when uploading on a browser using Windows OS?

This is a known issue when using the Firefox browser with Tower versions prior to 22.2.0. You can either (a) upgrade Tower to version 22.2.0 or higher, or (b) use Chrome.

For context, Tower will display the message below if you encounter this issue:

"Given file is not a dataset file. Detected media type: 'application/vnd.ms-excel'. Allowed types: 'text/csv, text/tab-separated-values'"

Q: Why are TSV-formatted datasets not shown in the Tower launch screen input field drop-down menu?

An issue was identified in Tower version 22.2 which caused TSV datasets to be unavailable in the input data drop-down menu on the launch screen. This has been fixed in Tower version 22.4.1.

Email and TLS

Q: How do I solve TLS errors when attempting to send email?

Nextflow and Nextflow Tower both have the ability to interact with email providers on your behalf. These providers often require TLS connections, with many now requiring at least TLSv1.2.

TLS connection errors can occur due to variability in the default TLS version specified by your underlying JDK distribution. If you encounter any of the following errors, there is likely a mismatch between your default TLS version and what is expected by the email provider:

  • Unexpected error sending mail ... TLS 1.0 and 1.1 are not supported. Please upgrade/update your client to support TLS 1.2
  • ERROR nextflow.script.WorkflowMetadata - Failed to invoke 'workflow.onComplete' event handler ... javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)

To fix the problem, you can either:

  1. Set a JDK environment variable to force Nextflow and/or the Tower containers to use TLSv1.2 by default:

    export JAVA_OPTIONS="-Dmail.smtp.ssl.protocols=TLSv1.2"

  2. Add the following parameter to your nextflow.config file:

    mail {
        smtp.ssl.protocols = 'TLSv1.2'
    }

In both cases, please ensure these values are also set for Nextflow and/or Tower:

  • mail.smtp.starttls.enable=true
  • mail.smtp.starttls.required=true
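
For the Tower containers, one way to cover all three properties is to extend the same JDK variable (a sketch, assuming your deployment passes JAVA_OPTIONS through to the JVM):

export JAVA_OPTIONS="-Dmail.smtp.ssl.protocols=TLSv1.2 -Dmail.smtp.starttls.enable=true -Dmail.smtp.starttls.required=true"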

Git integration

Q: Tower authentication to BitBucket fails, with the Tower backend log containing a warning: "Can't retrieve revisions for pipeline - https://my.bitbucketserver.com/path/to/pipeline/repo - Cause: Get branches operation not support by BitbucketServerRepositoryProvider provider"

If you have supplied correct BitBucket credentials and URL details in your tower.yml, but experience this error, update your Tower version to at least v22.3.0. This version addresses SCM provider authentication issues and is likely to resolve the retrieval failure described here.

Healthcheck

Q: Does Tower offer a healthcheck API endpoint?

Yes. Customers wishing to implement automated healthcheck functionality should use Tower's service-info endpoint.

Example:

# Run a healthcheck and extract the HTTP response code:
$ curl -o /dev/null -s -w "%{http_code}\n" --connect-timeout 2 "https://api.cloud.seqera.io/service-info" -H "Accept: application/json"
200

Logging

Q: Can Tower enable detailed logging related to sign-in activity?

Yes. For more detailed logging related to login events, set the following environment variable: TOWER_SECURITY_LOGLEVEL=DEBUG.

Q: Can Tower enable detailed logging related to application activities?

Yes. For more detailed logging related to application activities, set the following environment variable: TOWER_LOG_LEVEL=TRACE.

Q: Version 22.3.1: My downloaded Nextflow log file is broken.

A Tower Launcher issue has been identified which affects the Nextflow log file download in Tower version 22.3.1. A patch was released in version 22.3.2 that addresses this behavior. Update Tower to version 22.3.2 or later.

Login

Q: Can I completely disable Tower's email login feature?

The email login feature cannot be completely removed from the Tower login screen.

Q: How can I restrict Tower access to only a subset of email addresses?

You can restrict which emails are allowed to have automatic access to your Tower implementation via a configuration in tower.yml.

Users without automatic access will receive an acknowledgment of their login request but will be unable to access the platform until approved by a Tower administrator via the Administration Console.

# Any email address that matches a pattern in this list will have automatic access.
tower:
  trustedEmails:
    - '*@seqera.io'
    - 'named_user@example.com'

Q: Why am I receiving login errors stating that admin approval is required when using Azure AD OIDC?

The Azure AD app integrated with Tower must have user consent settings configured to "Allow user consent for apps" to ensure that admin approval is not required for each application login. See User consent settings.

Q: Why is my OIDC redirect_url set to http instead of https?

This can occur for several reasons. Please verify the following:

  1. Your TOWER_SERVER_URL environment variable uses the https:// prefix.
  2. Your tower.yml has micronaut.ssl.enabled set to true.
  3. Any Load Balancer instance that sends traffic to the Tower application is configured to use HTTPS as its backend protocol rather than TCP.

Q: Why isn't my OIDC callback working?

Callbacks could fail for many reasons. To more effectively investigate the problem:

  1. Set the Tower environment variable to TOWER_SECURITY_LOGLEVEL=DEBUG.
  2. Ensure your TOWER_OIDC_CLIENT, TOWER_OIDC_SECRET, and TOWER_OIDC_ISSUER environment variables all match the values specified in your OIDC provider's corresponding application.
  3. Ensure your network infrastructure allows the necessary egress and ingress traffic.

Q: Why did Google SMTP start returning Username and Password not accepted errors?

Previously functioning Tower Enterprise email integrations with Google SMTP are likely to encounter errors as of May 30, 2022, due to a security posture change implemented by Google.

To reestablish email connectivity, please follow the instructions at https://support.google.com/accounts/answer/3466521 to provision an app password. Update your TOWER_SMTP_PASSWORD environment variable with the app password, and restart the application.

Miscellaneous

Q: Is my data safe?

Yes. Your data stays strictly within your own infrastructure. When you launch a workflow through Tower, you connect your infrastructure (HPC/VMs/K8s) by creating the appropriate credentials and compute environment in a workspace.

Tower then uses this configuration to trigger a Nextflow workflow within your infrastructure, similar to what is done via the Nextflow CLI. Tower does not manipulate any data itself, and no data is transferred to the infrastructure where Tower is running.

Monitoring

Q: Can Tower integrate with 3rd party Java-based Application Performance Monitoring (APM) solutions?

Yes. You can mount the APM solution's JAR file in the backend container and set the agent JVM option via the JAVA_OPTS env variable.
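
For example, with a hypothetical agent JAR mounted at /opt/apm/agent.jar (the path is illustrative, not a specific vendor's), the backend container's environment might include:

# Attach the mounted APM agent to the backend JVM (sketch)
export JAVA_OPTS="-javaagent:/opt/apm/agent.jar"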

Q: Is it possible to retrieve the trace file for a Tower-based workflow run?

Yes. Although it is not possible to directly download the file via Tower, you can configure your workflow to export the file to persistent storage:

  1. Set the following block in your nextflow.config:

    trace {
        enabled = true
    }

  2. Add a copy command to your pipeline's Advanced options > Post-run script field:

    # Example: Export the generated trace file to an S3 bucket
    # Ensure that your Nextflow head job has the necessary permissions to interact with the target storage medium!
    aws s3 cp ./trace.txt s3://MY_BUCKET/trace/trace.txt

Q: When monitoring pipeline execution via the Runs tab, why do I occasionally see Tower reporting "Live events sync offline"?

Nextflow Tower uses server-sent events to push real-time updates to your browser. The client must establish a connection to the Nextflow Tower server's /api/live endpoint to initiate the stream of data, and this connection can occasionally fail due to factors like network latency.

To resolve the issue, please try reloading the UI to reinitiate the client's connection to the server. If reloading fails to resolve the problem, please contact Seqera Support for assistance with webserver timeout settings adjustments.

Nextflow Configuration

Q: How can I specify Nextflow CLI run arguments when launching from Tower?

As of Nextflow v22.09.1-edge, when invoking a pipeline from Tower, you can specify Nextflow CLI run arguments by setting the NXF_CLI_OPTS environment variable via pre-run script:

# Example:
export NXF_CLI_OPTS='-dump-hashes'

Q: Can a repository's nextflow_schema.json support multiple input file mimetypes?

No. As of April 2022, it is not possible to configure an input field (example) to support different mime types (e.g. a text/csv-type file during one execution, and a text/tab-separated-values file in a subsequent run).

Q: Why are my --outdir artefacts not available when executing runs in a cloud environment?

As of April 2022, Nextflow resolves relative paths against the current working directory. On a classic grid HPC, this normally corresponds to a subdirectory of the user's $HOME directory. In a cloud execution environment, however, the path will be resolved relative to the container file system, meaning files will be lost when the container is terminated. See here for more details.

Tower users can avoid this problem by specifying the following configuration in the Advanced options > Nextflow config file configuration textbox: params.outdir = workDir + '/results'. This will ensure the output files are written to your stateful storage rather than ephemeral container storage.

Q: Can Nextflow be configured to ignore a Singularity cache?

Yes. To ignore the Singularity cache, add the following configuration item to your workflow: process.container = 'file:///some/singularity/image.sif'.

Q: Why does Nextflow fail with a WARN: Cannot read project manifest ... path=nextflow.config error message?

This error can occur when executing a pipeline where the source git repository's default branch is not populated with main.nf and nextflow.config files, regardless of whether the invoked pipeline is using a non-default revision/branch (e.g. dev).

As of May 16, 2022, there is no solution for this problem other than to create blank main.nf and nextflow.config files in the default branch. This will allow the pipeline to run, using the content of the main.nf and nextflow.config in your target revision.

Q: Is it possible to maintain different Nextflow configuration files for different environments?

Yes. The main nextflow.config file will always be imported by default. Instead of managing multiple nextflow.config files (each customized for an environment), you can create unique environment config files and import them as their own profile in the main nextflow.config.

Example:

// nextflow.config

<truncated>

profiles {
    test { includeConfig 'conf/test.config' }
    prod { includeConfig 'conf/prod.config' }
    uat  { includeConfig 'conf/uat.config' }
}

<truncated>

Q: Is there a limit to the size of BAM files that can be uploaded to an S3 bucket?

If you encounter an error related to this limit, your log file will contain a warning like: WARN: Failed to publish file: s3://[bucket-name]

AWS imposes a limit on the size of objects that can be uploaded to S3 when using the multipart upload feature (refer to this documentation for more information). In this specific instance, the upload is hitting the maximum number of parts per upload.

The following configuration is suggested to work within the above AWS limitation:

  • Head Job CPUs = 16

  • Head Job Memory = 60000

  • Pre-run script = export NXF_OPTS="-Xms20G -Xmx40G"

  • Update the nextflow.config to increase the chunk size and reduce the number of parallel transfers.

    aws {
        batch {
            maxParallelTransfers = 5
            maxTransferAttempts = 3
            delayBetweenAttempts = 30
        }
        client {
            uploadChunkSize = '200MB'
            maxConnections = 10
            maxErrorRetry = 10
            uploadMaxThreads = 10
            uploadMaxAttempts = 10
            uploadRetrySleep = '10 sec'
        }
    }

Q: Why is Nextflow forbidden to retrieve a params file from Nextflow Tower?

Ephemeral endpoints can only be consumed once. Nextflow versions older than 22.04 may try to call the same endpoint more than once, resulting in an error similar to the following: Cannot parse params file: /ephemeral/example.json - Cause: Server returned HTTP response code: 403 for URL: https://api.tower.nf/ephemeral/example.json.

To resolve this problem, please upgrade your Nextflow version to version 22.04.x or later.

Q: How can I prevent Nextflow from uploading intermediate files from local scratch to my S3 work directory?

Nextflow will only unstage files/folders that have been explicitly defined as process outputs. If your workflow has processes that generate folder-type outputs, please ensure that the process also purges any intermediate files that reside within. Failure to do so will result in the intermediate files being copied as part of the task unstaging process, resulting in additional storage costs and lengthened pipeline execution times.
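
For example, if a process emits a directory named results but writes scratch data inside it, you might purge the intermediates at the end of the script block (a sketch; the paths are illustrative):

# Final lines of the process script block: remove intermediate files
# from inside the emitted output folder so they are not unstaged to S3.
rm -rf results/tmp_intermediates/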

Q: Why do some values specified in my git repository's nextflow.config change when the pipeline is launched via Tower?

You may notice that some values specified in your pipeline repository's nextflow.config have changed when the pipeline is invoked via Tower. This occurs because Tower is configured with a set of default values that are superimposed on the pipeline configuration (with the Tower defaults winning).

Example: The following code block is specified in your nextflow.config:

aws {
    region = 'us-east-1'
    client {
        uploadChunkSize = 209715200 // 200 MB
    }
    ...
}

When the job instantiates on the AWS Batch Compute Environment, you will see that the uploadChunkSize changed:

aws {
    region = 'us-east-1'
    client {
        uploadChunkSize = 10485760 // 10 MB
    }
    ...
}

This change occurred because Tower superimposes its 10 MB default value rather than using the value specified in the nextflow.config file.

To force the Tower-invoked job to use your desired value, add the configuration setting in the Tower Workspace Launch screen's Advanced options > Nextflow config file textbox. In the case of our example above, you would simply need to add aws.client.uploadChunkSize = 209715200 (200 MB).

Nextflow configuration values that are affected by this behaviour include:

  • aws.client.uploadChunkSize
  • aws.client.storageEncryption

Q: Missing output file(s) [X] expected by process [Y] error during task execution in an environment using Fusion v1

Fusion v1 has a limitation which causes tasks that run for less than 60 seconds to fail as the output file generated by the task is not yet detected by Nextflow. This is a limitation inherited from a Goofys driver used by the Fusion v1 implementation. Fusion v2 (to be made available to Tower Enterprise users during Q1 of 2023) resolves this issue.

If Fusion v2 is not yet available, or updating to v2 is not feasible, this issue can be addressed by instructing Nextflow to wait for 60 seconds after the task completes.

From Advanced options > Nextflow config file in Pipeline settings, add the following line to your Nextflow configuration:

process.afterScript = 'sleep 60'

Q: Why are jobs in RUNNING status not terminated when my pipeline run is canceled?

The behavior of Tower when canceling a run depends on the errorStrategy defined in your process script. If the process errorStrategy is set to finish, an orderly pipeline shutdown is initiated when you cancel (or otherwise interrupt) a run. This instructs Nextflow to wait for the completion of any submitted jobs. To ensure that all jobs are terminated when your run is canceled, set errorStrategy to terminate in your Nextflow config. For example:

process terminateOnError {
    errorStrategy 'terminate'

    script:
    <your command string here>
}

Q: Why do some cached tasks run from scratch when I re-launch a pipeline?

When re-launching a pipeline, Tower relies on Nextflow's resume functionality for the continuation of a workflow execution. This skips previously completed tasks and uses a cached result in downstream tasks, rather than running the completed tasks again. The unique ID (hash) of the task is calculated using a composition of the task's:

  • Input values
  • Input files
  • Command line string
  • Container ID
  • Conda environment
  • Environment modules
  • Any executed scripts in the bin directory

A change in any of these values results in a changed task hash. Changing the task hash value means that the task will be run again when the pipeline is re-launched. To aid debugging efforts when a re-launch behaves unexpectedly, run the pipeline twice with dumpHashes=true set in your Nextflow config file (from Advanced options -> Nextflow config file in the Pipeline settings). This will instruct Nextflow to dump the task hashes for both executions in the nextflow.log file. You can compare the log files to determine the point at which the hashes diverge in your pipeline when it is resumed.

See here for more information on the Nextflow resume mechanism.
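
For example, after running the pipeline twice with dumpHashes=true, you might compare the dumped hash lines from the two log files (a sketch; the exact log line format can vary by Nextflow version):

# Extract and diff the task hash dumps from two runs (sketch)
grep 'cache hash' run1.nextflow.log > run1.hashes
grep 'cache hash' run2.nextflow.log > run2.hashes
diff run1.hashes run2.hashes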

Q: Why does my run fail with an "o.h.e.jdbc.spi.SqlExceptionHelper - Incorrect string value" error?


[scheduled-executor-thread-2] - WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 1366, SQLState: HY000
[scheduled-executor-thread-2] - ERROR o.h.e.jdbc.spi.SqlExceptionHelper - (conn=34) Incorrect string value: '\xF0\x9F\x94\x8D |...' for column 'error_report' at row 1
[scheduled-executor-thread-2] - ERROR i.s.t.service.job.JobSchedulerImpl - Oops .. unable to save status of job id=18165; name=nf-workflow-26uD5XXXXXXXX; opId=nf-workflow-26uD5XXXXXXXX; status=UNKNOWN

Runs will fail if your Nextflow script or Nextflow config contains illegal characters (such as emojis or other characters outside the database's supported character set). Validate your script and config files for any illegal characters before attempting to run again.

Nextflow Launcher

Q: There are several nf-launcher images available in the Seqera image registry. How can I tell which one is most appropriate for my implementation?

Your Tower implementation knows the nf-launcher image version it needs and will specify this value automatically when launching a pipeline.

If you are restricted from using public container registries, please see Tower Enterprise Release Note instructions (example) for the specific image you should use and how to set this as the default when invoking pipelines.

Q: The nf-launcher is pinned to a specific Nextflow version. How can I make it use a different release?

Each Nextflow Tower release uses a specific nf-launcher image by default. This image is loaded with a specific Nextflow version, meaning that any workflow run in the container uses this Nextflow version by default. You can force your jobs to use a newer/older version of Nextflow with any of the following strategies:

  1. Use the Pre-run script advanced launch option to set the desired Nextflow version. Example: export NXF_VER=22.08.0-edge
  2. For jobs executing in an AWS Batch compute environment, create a custom job definition which references a different nf-launcher image.

OIDC

Q: Can I have users seamlessly log in to Tower if they already have an active session with their OpenId Connect (OIDC) Identity Provider (IDP)?

Yes. If you are using OIDC as your authentication method, it is possible to implement a seamless login flow for your users.

Rather than directing your users to http(s)://YOUR_TOWER_HOSTNAME or http(s)://YOUR_TOWER_HOSTNAME/login, point them to this user-initiated login URL instead: http(s)://YOUR_TOWER_HOSTNAME/oauth/login/oidc.

If your user already has an active session established with the IDP, they will be automatically logged into Tower rather than having to manually choose their authentication method.

Optimization

Q: When using optimization, why are tasks failing with an OutOfMemoryError: Container killed due to memory usage error?

Improvements are being made to the way Nextflow calculates the optimal memory needed for containerized tasks; an upcoming release will resolve issues with underestimated memory allocation.

A temporary workaround for this issue is to implement a retry error strategy in the failing process that will increase the allocated memory each time the failed task is retried. Add the following errorStrategy block to the failing process:

process {
    errorStrategy = 'retry'
    maxRetries = 3
    memory = { 1.GB * task.attempt }
}

Plugins

Q: Is it possible to use the Nextflow SQL DB plugin to query AWS Athena?

Yes. As of Nextflow 22.05.0-edge, your Nextflow pipelines can query data from AWS Athena. You must add the following configuration items to your nextflow.config (Note: the use of secrets is optional):

plugins {
    id 'nf-sqldb@0.4.0'
}

sql {
    db {
        'athena' {
            url = 'jdbc:awsathena://AwsRegion=YOUR_REGION;S3OutputLocation=s3://YOUR_S3_BUCKET'
            user = secrets.ATHENA_USER
            password = secrets.ATHENA_PASSWORD
        }
    }
}

You can then call the functionality from within your workflow.

// Example
channel.sql.fromQuery("select * from test", db: "athena", emitColumns: true).view()

For more information on the implementation, please see https://github.com/nextflow-io/nf-sqldb/discussions/5.

Repositories

Q: Can Tower integrate with private docker registries like JFrog Artifactory?

Yes. Tower-invoked jobs can pull container images from private docker registries. The method to do so differs depending on the platform:

  • If using AWS Batch, modify your EC2 Launch Template as per these directions from AWS (a login command sketch follows this list).
    Note:
    • This solution requires that your Docker Engine be at least 17.07 to use --password-stdin.
    • You may need to add the following additional commands to your Launch Template depending on your security posture:
      cp /root/.docker/config.json /home/ec2-user/.docker/config.json && chmod 777 /home/ec2-user/.docker/config.json
  • If using Azure Batch, please create a Container Registry-type credential in your Tower Workspace and associate it with the Azure Batch object also defined in the Workspace.
  • If using Kubernetes, please use an imagePullSecret as per https://github.com/nextflow-io/nextflow/issues/2827.
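
For the AWS Batch case, the login step added to your Launch Template's user data might look like the following sketch. The registry URL, username, and token retrieval are illustrative; source the token from your own secret store.

# Log the Docker engine into a private registry at instance boot (sketch).
# Requires Docker Engine 17.07+ for --password-stdin.
echo "$REGISTRY_TOKEN" | docker login my.registry.example.com --username myuser --password-stdin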

Q: Why does my Nextflow log have a Remote resource not found error when trying to contact the workflow repository?

This error can occur if the Nextflow head job fails to retrieve the necessary repository credentials from Nextflow Tower.

To determine if this is the case, please do the following:

  1. Check your Nextflow log for an entry like DEBUG nextflow.scm.RepositoryProvider - Request [credentials -:-].
  2. If the above is true, please check the protocol of the string that was assigned to your Tower instance's TOWER_SERVER_URL configuration value. It is possible this has been erroneously set to http rather than https.

Secrets

Q: When using Secrets in a Tower workflow run, the process fails with a Missing AWS execution role arn error

The ECS Agent must be empowered to retrieve Secrets from the AWS Secrets Manager. Pipelines that use Secrets, are launched from Nextflow Tower, and execute in an AWS Batch Compute Environment will encounter this error if an IAM Execution Role is not provided. Please see the Pipeline Secrets documentation for remediation steps.

Q: Why do work tasks which use Secrets fail when running in AWS Batch?

Users may encounter a few different errors when executing pipelines that use Secrets via AWS Batch:

  • If you use nf-sqldb version 0.4.1 or earlier and have Secrets in your nextflow.config, you may see the following error in your Nextflow log: nextflow.secret.MissingSecretException: Unknown config secret {SECRET_NAME}.
    You can resolve this error by explicitly defining the xpack-amzn plugin in your configuration.
    Example:

    plugins {
        id 'xpack-amzn'
        id 'nf-sqldb'
    }
  • If you have two or more processes that use the same container image, but only a subset of these processes use Secrets, your Secret-using processes may fail during the initial run but succeed when resumed. This is due to a bug in how Nextflow (22.07.1-edge and earlier) registers jobs with AWS Batch.

    To resolve the issue, please upgrade your Nextflow version to 22.08.0-edge or later. If you cannot upgrade, you can use the following workarounds:

    1. Use a different container image for each process.
    2. Define the same set of Secrets in each process that uses the same container image.

Tower Agent

Q: Tower Agent closes a session with "Unexpected Exception in WebSocket [io.seqera.tower.agent.AgentClientSocket$Intercepted@698514a]: Operation timed out java.io.IOException: Operation timed out"

The reconnection logic of Tower Agent has been improved with the release of version 0.5.0. Update your Tower Agent version before relaunching your pipeline.

Tower Configuration

Q: Can I customize menu items on the Tower navigation menu?

Yes. Using the navbar snippet in the tower.yml configuration file, you can specify custom navigation menu items for your Tower installation. See here for more details.

Q: Can a custom path be specified for the tower.yml configuration file?

Yes. Provide a POSIX-compliant path to the TOWER_CONFIG_FILE environment variable.

Q: Why do parts of tower.yml not seem to work when I run my Tower implementation?

There are two common reasons why configurations specified in tower.yml might not be expressed by your Tower instance:

  1. There is a typo in one of the key value pairs.

  2. There is a duplicate key present in your file.

    # EXAMPLE
    # This block will not end up being enforced because there is another `tower` key below.
    tower:
      trustedEmails:
        - user@example.com

    # This block will end up being enforced because it is defined last.
    tower:
      auth:
        oidc:
          - "*@foo.com"

Q: Do you have guidance on how to create custom Nextflow containers?

Yes. Please see https://github.com/seqeralabs/gatk4-germline-snps-indels/tree/master/containers.

Q: What DSL version does Nextflow Tower set as default for Nextflow head jobs?

As of Nextflow 22.03.0-edge, DSL2 is the default syntax.

To minimize disruption on existing pipelines, Nextflow Tower version 22.1.x and later are configured to default Nextflow head jobs to DSL 1 for a transition period (ending TBD).

You can force your Nextflow head job to use DSL2 syntax via any of the following techniques:

  • Adding export NXF_DEFAULT_DSL=2 in the Advanced Features > Pre-run script field of Tower Launch UI.
  • Specifying nextflow.enable.dsl = 2 at the top of your Nextflow workflow file.
  • Providing the -dsl2 flag when invoking the Nextflow CLI (e.g. nextflow run ... -dsl2)

Q: Can Tower use a Nextflow workflow stored in a local git repository?

Yes. As of v22.1, Nextflow Tower Enterprise can link to workflows stored in "local" git repositories. To do so:

  1. Volume mount your repository folder into the Tower Enterprise backend container.
  2. Update your tower.yml with the following configuration:
tower:
  pipeline:
    allow-local-repos:
      - /path/to/repo

Note: This feature is not available to Tower Cloud users.

Q: Am I forced to define sensitive values in tower.env?

No. You can inject values directly into tower.yml or - in the case of a Kubernetes deployment - reference data from a secrets manager like Hashicorp Vault.

Please contact Seqera Labs for more details if this is of interest.

Batch Forge

Q: What does the Enable GPU option do when building an AWS Batch cluster via Batch Forge?

Activating the Enable GPU field while creating an AWS Batch environment with Batch Forge will result in an AWS-recommended GPU-optimized ECS AMI being used as your Batch cluster's default image.

Note:

  1. Activation does not cause GPU-enabled instances to automatically spawn in your Batch cluster. You must still specify these in the Forge screen's Advanced options > Instance types field.
  2. Population of the Forge screen's Advanced options > AMI Id field will supersede the AWS-recommended AMI.
  3. Your Nextflow script must include accelerator directives to use the provisioned GPUs.

tw CLI

Q: Can a custom run name be specified when launching a pipeline via the tw CLI?

Yes. As of tw v0.6.0, this is possible. Example: tw launch --name CUSTOM_NAME ...

Q: Why are tw CLI commands resulting in segfault errors?

tw CLI versions 0.6.1 through 0.6.4 were compiled using glibc instead of musl. This change was discovered to cause segfaults on certain operating systems and has been rolled back in tw CLI 0.6.5.

To resolve this error, please try using the musl-based binary first. If this fails to work on your machine, an alternative Java JAR-based solution is available for download and use.

Q: Can the tw CLI communicate with hosts using http?

If your Tower host accepts connections using http (insecure) rather than https, tw CLI commands will fail with the following error:

 ERROR: You are trying to connect to an insecure server: http://hostname:port/api
if you want to force the connection use '--insecure'. NOT RECOMMENDED!

If your host cannot be configured to accept https connections, add the --insecure flag before your CLI command (see below). Although this approach is available for deployments that do not accept https connections, it is not recommended. Best practice is to use https wherever possible.

$ tw --insecure info

Note: The ${TOWER_API_ENDPOINT} is equivalent to the ${TOWER_SERVER_URL}/api.

Q: Can a user resume/relaunch a pipeline using the tw cli?

Yes, it is possible with tw runs relaunch.

$ tw runs relaunch -i 3adMwRdD75ah6P -w 161372824019700

Workflow 5fUvqUMB89zr2W submitted at [org / private] workspace.


$ tw runs list -w 161372824019700

Pipeline runs at [org / private] workspace:

ID | Status | Project Name | Run Name | Username | Submit Date
----------------+-----------+----------------+-----------------+-------------+-------------------------------
5fUvqUMB89zr2W | SUBMITTED | nf/hello | magical_darwin | seqera-user | Tue, 10 Sep 2022 14:40:52 GMT
3adMwRdD75ah6P | SUCCEEDED | nf/hello | high_hodgkin | seqera-user | Tue, 10 Sep 2022 13:10:50 GMT


Workspaces

Q: Why is my Tower-invoked pipeline trying to contact a different Workspace than the one it was launched from?

This problem will express itself with the following entry in your Nextflow log: Unexpected response for request http://YOUR_TOWER_URL/api/trace/TRACE_ID/begin?workspaceId=WORKSPACE_ID.

This can occur due to the following reasons:

  1. An access token value has been hardcoded in the tower.accessToken block of your nextflow.config (either via the git repository itself or an override value in the launch form).
  2. In cases where your compute environment is an HPC cluster, the home directory of the user tied to the credential contains a stateful nextflow.config with a hardcoded token (e.g. ~/.nextflow/config).
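
To check the second case, you might search the cluster user's home directory for a hardcoded token (a sketch):

# Look for a stateful token in the user's Nextflow config
grep -n "accessToken" ~/.nextflow/config 2>/dev/null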

Q: What privilege level is granted to a user assigned to a Workspace both as a Participant and Team member?

It is possible for a user to be concurrently assigned to a Workspace both as a named Participant and member of a Team. In such cases, Tower will grant the higher of the two privilege sets.

Example:

  • If the Participant role is Launch and the Team role is Admin, the user will have Admin rights.
  • If the Participant role is Admin and the Team role is Launch, the user will have Admin rights.
  • If the Participant role is Launch and the Team role is Launch, the user will have Launch rights.

As a best practice, Seqera suggests using Teams as the primary vehicle for assigning rights within a Workspace and only adding named Participants when one-off privilege escalations are deemed necessary.

Amazon

EBS

Q: EBS Autoscaling: Why do some EBS volumes remain active after their associated jobs have completed?

The EBS autoscaling solution relies on an AWS-provided script running on each container host. This script performs AWS EC2 API requests to delete EBS volumes when the jobs using those volumes have been completed. When running large Batch clusters (hundreds of compute nodes or more), EC2 API rate limits may cause the deletion of unattached EBS volumes to fail. Volumes that remain active after Nextflow jobs have been completed will incur additional costs and should therefore be manually deleted. You can monitor your AWS account for any orphaned EBS volumes via the EC2 console or with a Lambda function. See here for more information.
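
You might list candidate orphaned volumes with the AWS CLI before deleting them (a sketch; review each volume before removal, since the filter only checks for unattached status):

# List unattached EBS volumes in the current region (sketch)
aws ec2 describe-volumes --filters Name=status,Values=available \
  --query 'Volumes[].{ID:VolumeId,Created:CreateTime,Size:Size}' --output table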

EC2 Instances

Q: Can I run a Nextflow head job on AWS Graviton instances?

Yes. Nextflow supports the Graviton architecture; use AWS Batch queues with Graviton-based instance types.

ECS

Q: How often are Docker images pulled by the ECS Agent?

As part of the AWS Batch creation process, Batch Forge will set ECS Agent parameters in the EC2 Launch Template that is created for your cluster's EC2 instances:

  • For clients using Tower Enterprise v22.01 or later:
    • Any AWS Batch environment created by Batch Forge will set the ECS Agent's ECS_IMAGE_PULL_BEHAVIOR to once.
  • For clients using Tower Enterprise v21.12 or earlier:
    • Any AWS Batch environment created by Batch Forge will set the ECS Agent's ECS_IMAGE_PULL_BEHAVIOR to default.

Please see the AWS ECS documentation for an in-depth explanation of this difference.

Note:

This behaviour cannot be changed within the Tower Application.

Q: We encountered an error saying unable to parse HTTP 429 response body.

CannotPullContainerError: Error response from daemon: error parsing HTTP 429 response body: invalid character 'T' looking for beginning of value: "Too Many Requests (HAP429)"

This occurs because of the Docker Hub rate limit of 100 anonymous pulls per 6 hours. We suggest adding the following to your launch template to avoid this issue:

echo ECS_IMAGE_PULL_BEHAVIOR=once >> /etc/ecs/ecs.config

Q: Help! My job failed due to a CannotInspectContainerError error.

There are multiple reasons why your pipeline could fail with an Essential container in task exited - CannotInspectContainerError: Could not transition to inspecting; timed out after waiting 30s error.

Please try the following:

  1. Upgrade your ECS Agent to 1.54.1 or newer (see these instructions for checking your ECS Agent version).
  2. Provision more storage space for your EC2 instance (preferably via ebs-autoscaling to ensure scalability).
  3. If the error is accompanied by command exit status: 123 and a permissions denied error tied to a system command, please ensure that the binary is set to be executable (i.e. chmod u+x).

Queues

Q: Does Nextflow Tower support the use of multiple AWS Batch queues during a single job execution?

Yes. Even though you can only create/identify a single work queue during the definition of your AWS Batch Compute Environment within Nextflow Tower, you can spread tasks across multiple queues when your job is sent to Batch for execution via your pipeline configuration.

Adding the following snippet to either your nextflow.config or the Advanced Features > Nextflow config file field of the Tower Launch UI will cause processes to be distributed across two AWS Batch queues, depending on the process name.

// nextflow.config

process {
    withName: foo {
        queue = 'TowerForge-1jJRSZmHyrrCvCVEOhmL3c-work'
    }
    withName: bar {
        queue = 'custom-second-queue'
    }
}

Security

Q: Can Tower connect to an RDS instance using IAM credentials instead of username/password?

No. Nextflow Tower must be supplied with a username & password to connect to its associated database.

Storage

Q: Can I use EFS as my work directory?

As of Nextflow Tower v21.12, you can specify an Amazon Elastic File System instance as your Nextflow work directory when creating your AWS Batch Compute Environment via Batch Forge.

Q: Can I use FSx for Lustre as my work directory?

As of Nextflow Tower v21.12, you can specify an Amazon FSx for Lustre instance as your Nextflow work directory when creating your AWS Batch Compute Environment via Batch Forge.

Q: How do I configure my Tower-invoked pipeline to be able to write to an S3 bucket that enforces AES256 server-side encryption?

If you need to save files to an S3 bucket protected by a bucket policy which enforces AES256 server-side encryption, additional configuration settings must be provided to the nf-launcher script which invokes the Nextflow head job:

  1. Add the following configuration to the Advanced options > Nextflow config file textbox of the Launch Pipeline screen:

    aws {
        client {
            storageEncryption = 'AES256'
        }
    }
  2. Add the following configuration to the Advanced options > Pre-run script textbox of the Launch Pipeline screen:

    export TOWER_AWS_SSE=AES256

Note: This solution requires at least Tower v21.10.4 and Nextflow 22.04.0.

Azure

AKS

Q: Why is Nextflow returning a "... /.git/HEAD.lock: Operation not supported" error?

This problem can occur if your Nextflow pod uses an Azure Files-type (SMB) Persistent Volume as its storage medium. By default, the jgit library used by Nextflow attempts a filesystem link operation which is not supported by Azure Files (SMB).

To avoid this problem, please add the following code snippet in your pipeline's pre-run script field:

cat <<EOT > ~/.gitconfig
[core]
supportsatomicfilecreation = true
EOT

Batch

Q: Why is my Azure Batch VM quota set to 0?

In order to manage capacity during the global health pandemic, Microsoft has reduced core quotas for new Batch accounts. Depending on your region and subscription type, a newly-created account may not be entitled to any VMs without first making a service request to Azure.

Please see Azure's Batch service quotas and limits page for further details.

SSL

Q: "Problem with the SSL CA cert (path? access rights?)" error

This can occur if a tool/library in your task container requires SSL certificates to validate the identity of an external data source.

You may be able to solve the issue by:

  1. Mounting host certificates into the container (example).

Q: Why is my deployment using an Azure SQL database returning a Connections using insecure transport are prohibited while --require_secure_transport=ON error?

This is due to Azure's default MySQL behaviour of enforcing SSL connections between your server and client application, as detailed here. To fix this, append the following to your TOWER_DB_URL connection string: useSSL=true&enabledSslProtocolSuites=TLSv1.2&trustServerCertificate=true

e.g. TOWER_DB_URL=jdbc:mysql://azuredatabase.com/tower?serverTimezone=UTC&useSSL=true&enabledSslProtocolSuites=TLSv1.2&trustServerCertificate=true

Google

Retry

Q: How do I make my Nextflow pipelines more resilient to VM preemption?

Running your pipelines on preemptible VMs provides significant cost savings but increases the likelihood that a task will be interrupted before completion. It is a recommended best practice to implement a retry strategy when you encounter exit codes that are commonly related to preemption. Example:

process {
    errorStrategy = { task.exitStatus in [8, 10, 14] ? 'retry' : 'finish' }
    maxRetries = 3
    maxErrors = '-1'
}

Q: What are the minimum Tower Service account permissions needed for GLS and GKE?

The following roles must be granted to the nextflow-service-account:

  1. Cloud Life Sciences Workflows Runner
  2. Service Account User
  3. Service Usage Consumer
  4. Storage Object Admin

For detailed information, please refer to this guide.
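
Granting these roles with the gcloud CLI might look like the following sketch, where MY_PROJECT and the service account address are placeholders:

# Grant the four roles to the nextflow-service-account (sketch)
for ROLE in roles/lifesciences.workflowsRunner roles/iam.serviceAccountUser \
            roles/serviceusage.serviceUsageConsumer roles/storage.objectAdmin; do
  gcloud projects add-iam-policy-binding MY_PROJECT \
    --member "serviceAccount:nextflow-service-account@MY_PROJECT.iam.gserviceaccount.com" \
    --role "$ROLE"
done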

Kubernetes

Q: Pod failing with 'Invalid value: "xxx": must be less or equal to memory limit' error

This error may be encountered when you specify a value in the Head Job memory field during the creation of a Kubernetes-type Compute Environment.

If you receive an error that includes field: spec.containers[x].resources.requests and message: Invalid value: "xxx": must be less than or equal to memory limit, your Kubernetes cluster may be configured with system resource limits which deny the Nextflow head job's resource request. To isolate which component is causing the problem, try to launch a Pod directly on your cluster via your Kubernetes administration solution. Example:

---
apiVersion: v1
kind: Pod
metadata:
  name: debug
  labels:
    app: debug
spec:
  containers:
    - name: debug
      image: busybox
      command: ["sh", "-c", "sleep 10"]
      resources:
        requests:
          memory: "xxxMi" # or "xxxGi"
  restartPolicy: Never

On-Prem HPC

Q: "java: command not found"

When submitting jobs to your on-prem HPC (regardless of whether using SSH or Tower-Agent authentication), the following error may appear in your Nextflow logs even though you have Java on your $PATH environment variable:

java: command not found
Nextflow is trying to use the Java VM defined for the following environment variables:
JAVA_CMD: java
NXF_OPTS:

Possible reasons for this error:

  1. The queue where the Nextflow head job runs uses a different environment/node than your login node userspace.
  2. If your HPC cluster uses modules, the Java module may not be loaded by default.

To troubleshoot:

  1. Open an interactive session with the head job queue.
  2. Launch the Nextflow job from the interactive session.
  3. If your cluster uses modules:
    1. Add module load <your_java_module> in the Advanced Features > Pre-run script field when creating your HPC Compute Environment within Nextflow Tower.
  4. If your cluster does not use modules:
    1. Source an environment with java and Nextflow using the Advanced Features > Pre-run script field when creating your HPC Compute Environment within Nextflow Tower.