An instance of a Galaxy portal for processing genomic data has been set up at:
https://covid19.eosc-synergy.eu/galaxy
The portal is freely accessible and includes data on coronaviruses, in particular SARS-CoV-2 (COVID-19) samples, updated daily from public international databanks, as well as key tools for mutation identification, phylogenetic analysis, sample processing and visualization.
The portal is an open lab for researchers who want to run their experiments without the burden of downloading data and installing tools, and it offers simple ways to share data, workflows and results. The backend that processes the data is jointly provided by the EOSC Synergy collaboration.
Datasets available
Although every user can load their own data, the platform already hosts a number of collections:
- All coronavirus sequences from the China National Center for Bioinformation (CNCB): nearly 40K sequences.
- The collection of 961 2019-nCoV samples from the China National Center for Bioinformation (CNCB).
- The SARS-CoV-2 Wuhan-Hu-1 reference genome, accession number NC_045512.2.
- The collection of 170 SARS-CoV-2 samples from GenBank.
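For users who want to work with the same reference outside the portal, the following is a minimal sketch of how the record for accession NC_045512.2 could be retrieved through the public NCBI E-utilities `efetch` endpoint and parsed. The helper names (`efetch_fasta_url`, `fetch_fasta`, `parse_fasta`) are hypothetical and chosen for illustration; this is not necessarily how the portal itself populates its collections.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build the efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # e.g. NC_045512.2
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession, timeout=30):
    """Download the FASTA text for one accession (requires network access)."""
    with urlopen(efetch_fasta_url(accession), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_fasta(text):
    """Parse FASTA text into {record_id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    """
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {rec_id: "".join(parts) for rec_id, parts in chunks.items()}


# Example (performs a network request, so it is left commented out):
# records = parse_fasta(fetch_fasta("NC_045512.2"))
# print(len(records["NC_045512.2"]))
```

Within Galaxy, the installed ncbi_acc_download tool listed in the Tools table serves the same purpose without any scripting.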
Reference genome indexes for bwa, gatk and bowtie2 can be built for the NC_045512.2 SARS-CoV-2 Wuhan-Hu-1 reference genome.
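On the portal these indexes are built through Galaxy. For readers who want to reproduce them locally, the sketch below assembles roughly equivalent command lines. It assumes that `bwa`, `bowtie2-build` and a recent `samtools` (one providing the `faidx` and `dict` subcommands) are on the PATH, and that the reference is stored in a local FASTA file; the file name and the helper functions are hypothetical. The `.fai` and `.dict` files are the companion files GATK expects next to a reference.

```python
import subprocess
from pathlib import Path


def index_commands(fasta):
    """Return the command lines that index a reference FASTA.

    Assumption: the index prefix and the .dict file sit next to the FASTA
    and share its base name (NC_045512.2.fasta -> NC_045512.2).
    """
    ref = Path(fasta)
    prefix = ref.with_suffix("")            # base name used by bowtie2
    seq_dict = ref.with_suffix(".dict")     # sequence dictionary for GATK
    return [
        ["bwa", "index", str(ref)],                         # BWA index files
        ["bowtie2-build", str(ref), str(prefix)],           # Bowtie2 .bt2 files
        ["samtools", "faidx", str(ref)],                    # FASTA .fai index
        ["samtools", "dict", "-o", str(seq_dict), str(ref)],  # .dict file
    ]


def build_indexes(fasta):
    """Run every indexing command, stopping at the first failure."""
    for cmd in index_commands(fasta):
        subprocess.run(cmd, check=True)


# Example (requires the tools to be installed, so it is left commented out):
# build_indexes("NC_045512.2.fasta")
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems if the reference path contains spaces.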
Tools
In addition to the standard tools that come with a basic Galaxy installation, the following tools have been installed:
Name | Description | Owner | Revision
beast | Bayesian MCMC analysis of molecular sequences | malex | 2ca3df65222b
bowtie2 | Bowtie2: Fast and sensitive read alignment | devteam | 749c918495f7
bwa | Wrapper for bwa mem, aln, sampe, and samse | devteam | 01ac0a5fedc3
Clustalw | ClustalW multiple sequence alignment program for DNA or proteins | devteam | d6694932c5e0
Collapse_collections | Collection tool that collapses a list of files into a single dataset, in order of appearance in the collection | nml | 830961c48e42
data_manager_bowtie_index_builder | Data Manager for building bowtie indexes | iuc | 86e9af693a33
data_manager_fetch_genome_dbkeys_all_fasta | Optionally defines a new DBKEY, retrieves a FASTA file and populates the all_fasta.loc data table. | devteam | 4d3eff1bc421
data_manager_gatk_picard_index_builder | Data Manager for building gatk picard indexes | devteam | b31f1fcb203c
emboss_datatypes | Galaxy applicable data formats used by Emboss tools. | devteam | a89163f31369
fastp | Fast all-in-one preprocessing for FASTQ files | iuc | 1d8fe9bc4cb0
fastqc | Read QC reports using FastQC | devteam | e7b2202befea
fasttree | FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences – GVL | iuc | e005e659ae21
freebayes | Galaxy Freebayes Bayesian genetic variant detector tool | devteam | ef2c525bd8cd
gatk2 | The Genome Analysis Toolkit in Version 2 | iuc | 35c00763cb5c
lofreq_call | Call variants with LoFreq in Galaxy. | iuc | dfadc322b065
lofreq_viterbi | Realign reads with LoFreq in Galaxy. | iuc | ecd80c7c3886
mafft | Multiple alignment program for amino acid or nucleotide sequences | rnateam | c5908940967d
minimap2 | A fast pairwise aligner for genomic and spliced nucleotide sequences | iuc | b3eab4b67562
multiqc | MultiQC aggregates results from bioinformatics analyses across many samples into a single report | iuc | 3d93dd18d9f8
nanoplot | Plotting tool for long read sequencing data and alignments | iuc | edbb6c5028f5
ncbi_acc_download | Download sequences from GenBank/RefSeq by accession through the NCBI ENTREZ API | iuc | 1c58de56d587
package_gatk_1_4 | Contains a tool dependency definition that downloads and installs version 1.4 of GATK. | devteam | ec95ec570854
package_picard_1_56_0 | Contains a tool dependency definition that downloads and compiles version 1.56.0 of the Picard package. | devteam | 99a28567c3a3
package_r_2_15_0 | R 2.15 | devteam | 6c34eaa82fed
package_r_ggplot2_0_9_3 | Contains a tool dependency definition that downloads and compiles version 0.9.3.x of the ggplot2 R package. | iuc | 07de191649b4
package_samtools_0_1_18 | Contains a tool dependency definition that downloads and compiles version 0.1.18 of the SAMTools package | devteam | 171cd8bc208d
package_samtools_0_1_19 | Contains a tool dependency definition that downloads and compiles version 0.1.19 of the SAMTools package | iuc | c9bd782f5342
picard | Picard SAM/BAM manipulation tools. | devteam | a1f0b3f4b781
samtool_filter2 | Filter BAM/SAM on FLAG, MAPQ, RG, LB or by region and produce a BAM/SAM on demand | devteam | 649a225999a5
samtools_fastx | Extract reads | iuc | a8d69aee190e
samtools_mpileup | MPileup SNP and indel caller | devteam | fa7ad9b89f4a
samtools_rmdup | Remove PCR duplicates | devteam | 586f9e1cdb2b
samtools_stat | Generate statistics for a BAM or SAM file | devteam | 145f6d74ff5e
snpeff | SnpEff is a genetic variant annotation and effect prediction toolbox | iuc | 74aebe30fb52
snpsift | snpEff SnpSift tools from Pablo Cingolani | iuc | 2b3e65a4252f
variant_recalibrator | Variant Recalibrator | devteam | cb7cf57397a7
varscan_mpileup | Wrapper for VarScan mpileup | iuc | e3f170cc4f95