Error output in nextflow pipeline using fasterq-dump

Question

I have a file called Dataframe_with_accession.txt that lists multiple SRA accessions. For this example it contains a single accession: SRR8933535

The idea is to create a Nextflow pipeline that downloads the SRA runs listed in Dataframe_with_accession.txt, compresses the FASTQ files with pigz, and removes the uncompressed FASTQ files.

To do that I used this code :

params.inputFile = "/crex/proj/Dataframe_with_accession.txt"
params.outputDir = "/crex/proj/Output"
process DOWNLOAD_FASTQ {
publishDir params.outputDir, mode: 'symlink'

input:
each (accession)

output:
tuple path("params.outputDir/${accession}_1.fastq.gz")
path("params.outputDir/${accession}_2.fastq.gz")

script:
"""
fasterq-dump --threads 12 --outdir ${params.outputDir}  ${accession}
pigz -p12 ${params.outputDir}/${accession}_1.fastq
pigz -p12 ${params.outputDir}/${accession}_2.fastq
"""

}
workflow {
    // Read accessions from the input file
    accList = file(params.inputFile).readLines()
DOWNLOAD_FASTQ(accList)

}

Then I ran this Nextflow file.

But I got the following error message:

[01/298f4b] process > DOWNLOAD_FASTQ (1) [100%] 1 of 1, failed: 1 ✘
Error executing process > 'DOWNLOAD_FASTQ (1)'
Caused by:
  Missing output file(s) SRR8933535_1.fastq expected by process DOWNLOAD_FASTQ (1)
Command executed:
fasterq-dump --threads 12 --outdir /crex/proj/Output  SRR8933535
Command exit status:
  0
Command output:
  (empty)
Command error:
  spots read      : 8,389,148
  reads read      : 16,778,296
  reads written   : 16,778,296
Work dir:
  /crex/proj/Output/work/01/298f4be6dde2ee1856df31db169bec
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

I do not understand what is going wrong here. It says

Command output:
  (empty)

But the file is created

Are you aware of the fromSRA Nextflow channel factory? https://www.nextflow.io/docs/edge/channel.html#fromsra — mribeirodantas, Sep 12 '23 at 13:36

score 2 · Accepted Answer · answered Sep 12 '23 at 11:25

Each Nextflow process looks in its own working directory for the files declared under output:. In your script: block you already write the files straight to their final destination (--outdir ${params.outputDir}), so when the script finishes, Nextflow looks in the work directory, cannot find the declared files, and stops with "Missing output file(s)". Solution: keep the output files in the work directory, that is, drop --outdir and the params.outputDir prefix from the declared output paths. Nextflow will then detect the files and use the publishDir directive to publish them to their final destination.
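As a minimal sketch of that fix, assuming the run is paired-end (so fasterq-dump writes ${accession}_1.fastq and ${accession}_2.fastq into the current directory) and that both fasterq-dump and pigz are on the PATH, the process could look roughly like this. The val input, the single output tuple, the mode: 'copy' setting and Channel.fromList are choices made for this illustration, not something taken from the question.

```nextflow
params.inputFile = "/crex/proj/Dataframe_with_accession.txt"
params.outputDir = "/crex/proj/Output"

process DOWNLOAD_FASTQ {
    // Nextflow publishes the declared outputs from the work dir to this directory.
    // 'copy' is an assumption here; the question used 'symlink'.
    publishDir params.outputDir, mode: 'copy'

    input:
    val accession

    output:
    // Paths are relative to the task work dir, with no params.outputDir prefix
    tuple val(accession), path("${accession}_1.fastq.gz"), path("${accession}_2.fastq.gz")

    script:
    """
    # No --outdir: fasterq-dump writes into the task work dir
    fasterq-dump --threads 12 ${accession}
    # pigz replaces each .fastq with a .fastq.gz
    pigz -p12 ${accession}_1.fastq
    pigz -p12 ${accession}_2.fastq
    """
}

workflow {
    // One accession per line; blank lines are skipped
    accessions = Channel.fromList(file(params.inputFile).readLines().findAll { it.trim() })
    DOWNLOAD_FASTQ(accessions)
}
```

With this layout the compressed files are created inside the task work directory, where Nextflow checks for them, and only afterwards appear in params.outputDir.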

Pallie

ok perfect thanks – Grendel Sep 12 '23 at 11:37
