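I recently installed MobaXterm on Windows so that I can log in to a remote server and process RNA-seq data, which meant downloading data from the NCBI database. This post covers how to install Aspera on Linux (an Ubuntu virtual machine works too), how to fix the problems that come up, and how to batch-download a whole project so that you end up with every fastq file it contains. The project is looked up by its NCBI accession; the generated commands fetch the fastq files from the EBI server (fasp.sra.ebi.ac.uk).
Aspera downloads are very fast: in my case more than 100 fastq files finished in a few minutes, much faster than SRA Toolkit.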
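1. First, open the SRA-explorer website
Enter the accession of the project you want to download (for example PRJNA000123 or SRP123456); I used SRP062637
Click Aspera commands for downloading FastQ files
Click Copy
On the Linux system, run
vim download.sh
## This opens the vim editor: press i, paste the copied script, press Esc, then type :wq to save and exit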
#!/usr/bin/env bash
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/008/SRR2176358/SRR2176358.fastq.gz . && mv SRR2176358.fastq.gz SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_flesh_at_stage_I_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/001/SRR2176361/SRR2176361.fastq.gz . && mv SRR2176361.fastq.gz SRR2176361_RNA-seq_of_Kidds-D_8_fruit_skin_with_flesh_at_stage_I_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/009/SRR2176359/SRR2176359.fastq.gz . && mv SRR2176359.fastq.gz SRR2176359_RNA-seq_of_Blondee_fruit_skin_with_flesh_at_stage_I_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/000/SRR2176360/SRR2176360.fastq.gz . && mv SRR2176360.fastq.gz SRR2176360_RNA-seq_of_Blondee_fruit_skin_with_flesh_at_stage_I_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/002/SRR2176362/SRR2176362.fastq.gz . && mv SRR2176362.fastq.gz SRR2176362_RNA-seq_of_Kidds-D_8_fruit_skin_with_flesh_at_stage_I_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/003/SRR2176363/SRR2176363.fastq.gz . && mv SRR2176363.fastq.gz SRR2176363_RNA-seq_of_Kidds-D_8_fruit_skin_with_flesh_at_stage_I_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/005/SRR2176365/SRR2176365.fastq.gz . && mv SRR2176365.fastq.gz SRR2176365_RNA-seq_of_Blondee_fruit_skin_at_stage_II_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/006/SRR2176366/SRR2176366.fastq.gz . && mv SRR2176366.fastq.gz SRR2176366_RNA-seq_of_Blondee_fruit_skin_at_stage_II_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/008/SRR2176368/SRR2176368.fastq.gz . && mv SRR2176368.fastq.gz SRR2176368_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_II_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/007/SRR2176367/SRR2176367.fastq.gz . && mv SRR2176367.fastq.gz SRR2176367_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_II_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/009/SRR2176369/SRR2176369.fastq.gz . && mv SRR2176369.fastq.gz SRR2176369_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_II_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/004/SRR2176364/SRR2176364.fastq.gz . && mv SRR2176364.fastq.gz SRR2176364_RNA-seq_of_Blondee_fruit_skin_at_stage_II_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/000/SRR2176370/SRR2176370.fastq.gz . && mv SRR2176370.fastq.gz SRR2176370_RNA-seq_of_Blondee_fruit_skin_at_stage_III_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/002/SRR2176372/SRR2176372.fastq.gz . && mv SRR2176372.fastq.gz SRR2176372_RNA-seq_of_Blondee_fruit_skin_at_stage_III_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/001/SRR2176371/SRR2176371.fastq.gz . && mv SRR2176371.fastq.gz SRR2176371_RNA-seq_of_Blondee_fruit_skin_at_stage_III_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/003/SRR2176373/SRR2176373.fastq.gz . && mv SRR2176373.fastq.gz SRR2176373_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_III_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/005/SRR2176375/SRR2176375.fastq.gz . && mv SRR2176375.fastq.gz SRR2176375_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_III_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/004/SRR2176374/SRR2176374.fastq.gz . && mv SRR2176374.fastq.gz SRR2176374_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_III_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/006/SRR2176376/SRR2176376.fastq.gz . && mv SRR2176376.fastq.gz SRR2176376_RNA-seq_of_Blondee_fruit_skin_at_stage_IV_at_harvest_Rep._I.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/000/SRR2176380/SRR2176380.fastq.gz . && mv SRR2176380.fastq.gz SRR2176380_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_IV_at_harvest_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/008/SRR2176378/SRR2176378.fastq.gz . && mv SRR2176378.fastq.gz SRR2176378_RNA-seq_of_Blondee_fruit_skin_at_stage_IV_at_harvest_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/001/SRR2176381/SRR2176381.fastq.gz . && mv SRR2176381.fastq.gz SRR2176381_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_IV_at_harvest_Rep._III.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/007/SRR2176377/SRR2176377.fastq.gz . && mv SRR2176377.fastq.gz SRR2176377_RNA-seq_of_Blondee_fruit_skin_at_stage_IV_at_harvest_Rep._II.fastq.gz
ascp -QT -l 300m -P33001 -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR217/009/SRR2176379/SRR2176379.fastq.gz . && mv SRR2176379.fastq.gz SRR2176379_RNA-seq_of_Kidds-D_8_fruit_skin_at_stage_IV_at_harvest_Rep._I.fastq.gz
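Before running anything, it can help to confirm that the whole script was pasted. The following is a minimal sketch, assuming every download is a single line that starts with `ascp` and ends with the renamed file, as in the script above. `summarise_downloads` is a hypothetical helper name, not part of SRA-explorer or Aspera.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the final file names a script such as
# download.sh will produce, followed by a count of the downloads.
summarise_downloads() {
    # $1: path to the download script.
    # Each download is one line starting with "ascp"; the last field on
    # that line is the file name after the trailing "mv".
    awk '/^ascp /{n++; print $NF} END{printf "%d file(s) expected\n", n}' "$1"
}
```

Calling `summarise_downloads download.sh` on the script above, which has 24 `ascp` lines, should end with `24 file(s) expected`.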
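2. Downloading and installing Aspera
Note: the version used here was the latest Aspera download as of 2024-05-09. To check for a newer release, go to Downloads - IBM Aspera, choose Linux under IBM Aspera Connect and copy the link, then change the version number in the commands below to match.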
## Download with wget (use the full link copied from the IBM Aspera download page)
wget ibm-aspera-connect-3.9.8.176272-linux-g2.12-64.tar.gz
## Extract the archive
tar -zxvf ibm-aspera-connect-3.9.8.176272-linux-g2.12-64.tar.gz
## Install
bash ibm-aspera-connect-3.9.8.176272-linux-g2.12-64.sh
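Because the same file name appears in all three commands, switching to another release means editing it in three places. One way around that is to keep the archive name in a variable and derive the installer name from it. This is only a sketch: `installer_name` is a hypothetical helper, and it assumes the archive unpacks to a `.sh` file with the same base name, as it does for 3.9.8 above. Newer releases may be named differently, so check what `tar` actually extracts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the installer script name from the archive
# name by replacing the .tar.gz suffix with .sh (an assumption that holds
# for the 3.9.8 archive used in this post).
installer_name() {
    printf '%s\n' "${1%.tar.gz}.sh"
}

# Change this one line when a newer release is used.
tarball="ibm-aspera-connect-3.9.8.176272-linux-g2.12-64.tar.gz"

echo "archive:   $tarball"
echo "installer: $(installer_name "$tarball")"
```

With the variable in place, the extract and install steps become `tar -zxvf "$tarball"` followed by `bash "$(installer_name "$tarball")"`.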
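## Then go to the installation directory (replace the path with your own home directory)
cd /data/home/hgzhong/.aspera/connect
pwd
## pwd prints: /data/home/hgzhong/.aspera/connect
## Add the directory that contains ascp to the PATH environment variable
vim ~/.bashrc
## Add the following line; the ascp executable is in the bin subdirectory of the installation
export PATH=/data/home/hgzhong/.aspera/connect/bin:$PATH
## Save and exit
## Reload the configuration
source ~/.bashrc
which ascp
## If this prints the path to ascp, the environment variable is set correctly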
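The download script needs two things: `ascp` on the PATH, and the key file passed with `-i`. A small check of both before starting can save a failed run. This is a minimal sketch: `check_aspera` is a hypothetical helper name, and its default key location is taken from the `-i` argument in download.sh above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that ascp is on PATH and that the key file
# used by download.sh exists. Returns 0 only if both checks pass.
check_aspera() {
    # $1 (optional): path to the key file; defaults to the path used in download.sh.
    local key=${1:-$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh}
    local status=0

    if command -v ascp >/dev/null 2>&1; then
        echo "ascp found: $(command -v ascp)"
    else
        echo "ascp not found on PATH" >&2
        status=1
    fi

    if [ -f "$key" ]; then
        echo "key file found: $key"
    else
        echo "key file missing: $key" >&2
        status=1
    fi

    return "$status"
}
```

Running `check_aspera && echo ready` prints `ready` only when both the executable and the key file are in place.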
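## Run the script with nohup so that the download keeps running in the background after you log out; output is written to nohup.out
## nohup itself does not resume an interrupted transfer; ascp has a -k option for resuming, which the generated commands do not set
nohup bash download.sh &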
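If the run is interrupted partway, some files will already be complete. Each line in download.sh renames the file only after `ascp` succeeds, so the presence of the renamed file is a reasonable sign that the download finished. The sketch below uses that to print only the commands whose renamed file is still missing. `pending_commands` is a hypothetical helper name, and the check assumes the one-line `ascp ... && mv ...` layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ascp lines from a download script whose
# final (renamed) file does not yet exist as a non-empty file.
pending_commands() {
    # $1: path to the download script
    # $2 (optional): directory holding the downloaded files; defaults to "."
    local script=$1 dir=${2:-.} line target
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ascp\ *) ;;          # a download command: examine it
            *) continue ;;       # shebang, blank lines, comments: skip
        esac
        target=${line##* }       # last word on the line = name after mv
        [ -s "$dir/$target" ] || printf '%s\n' "$line"
    done < "$script"
}
```

For example, `pending_commands download.sh > retry.sh` writes the remaining commands to a new script, which can then be started with `nohup bash retry.sh &`.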
Reference:
Aspera史上最全安装方法 + 批量下载fastq数据 (CSDN blog)