Study notes: from downloading TCGA data to downstream analysis with TCGAbiolinks

Preface

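I had been downloading data with the RTCGA package, but its data never gets updated, which always bothered me. So I decided to learn a better package from scratch: TCGAbiolinks. It calls the GDC API, so the data it retrieves should be the latest available.
Main reference: TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data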

Downloading the data

Straight to the code:

# if (!requireNamespace("BiocManager", quietly = TRUE))
#   install.packages("BiocManager")
# BiocManager::install("TCGAbiolinks")
library(TCGAbiolinks)
library(dplyr)
library(DT)
library(SummarizedExperiment)

# GDCprepare() returns a SummarizedExperiment for some data types and a plain
# data frame for others (typically miRNA and copy number segments), so only
# call assay() when there actually is an assay to extract.
to_table <- function(x) {
  if (is(x, "SummarizedExperiment")) assay(x) else x
}

# Cancer types to download
request_cancer <- c("PRAD", "BLCA", "KICH", "KIRC", "KIRP")

for (i in request_cancer) {
  cancer_type <- paste("TCGA", i, sep = "-")
  print(cancer_type)

  # Clinical data
  clinical <- GDCquery_clinic(project = cancer_type, type = "clinical")
  write.csv(clinical, file = paste(cancer_type, "clinical.csv", sep = "-"))

  # RNA-seq counts
  # Note: newer GDC data releases may require workflow.type = "STAR - Counts".
  query <- GDCquery(project = cancer_type,
                    data.category = "Transcriptome Profiling",
                    data.type = "Gene Expression Quantification",
                    workflow.type = "HTSeq - Counts")
  GDCdownload(query, method = "api", files.per.chunk = 100)
  expdat <- GDCprepare(query = query)
  count_matrix <- to_table(expdat)
  write.csv(count_matrix, file = paste(cancer_type, "Counts.csv", sep = "-"))

  # miRNA expression
  query <- GDCquery(project = cancer_type,
                    data.category = "Transcriptome Profiling",
                    data.type = "miRNA Expression Quantification",
                    workflow.type = "BCGSC miRNA Profiling")
  GDCdownload(query, method = "api", files.per.chunk = 50)
  expdat <- GDCprepare(query = query)
  count_matrix <- to_table(expdat)
  write.csv(count_matrix, file = paste(cancer_type, "miRNA.csv", sep = "-"))

  # Copy number variation (segment level)
  query <- GDCquery(project = cancer_type,
                    data.category = "Copy Number Variation",
                    data.type = "Copy Number Segment")
  GDCdownload(query, method = "api", files.per.chunk = 50)
  expdat <- GDCprepare(query = query)
  count_matrix <- to_table(expdat)
  write.csv(count_matrix, file = paste(cancer_type, "Copy-Number-Variation.csv", sep = "-"))

  # DNA methylation
  # Note: legacy = TRUE targets the GDC legacy archive; recent TCGAbiolinks
  # versions may no longer accept this argument.
  query.met <- GDCquery(project = cancer_type,
                        legacy = TRUE,
                        data.category = "DNA methylation")
  GDCdownload(query.met, method = "api", files.per.chunk = 300)
  # Fixed: prepare the methylation query, not the copy number query from above.
  expdat <- GDCprepare(query = query.met)
  count_matrix <- to_table(expdat)
  write.csv(count_matrix, file = paste(cancer_type, "methylation.csv", sep = "-"))
}

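With that, most of the commonly used data types have been downloaded and saved as CSV files in the current working directory.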
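
Since the script writes one CSV per project and data type, a quick check of what actually landed in the working directory can be useful before any downstream analysis. The following is a minimal sketch in base R, not part of the original notes: the file names are rebuilt from the same `paste(..., sep = "-")` pattern the script uses, while `read_saved_matrix` and the toy matrix are hypothetical names and made-up values used only for illustration. It also shows why `check.names = FALSE` is probably worth setting when reading the matrices back: without it, `read.csv` rewrites the hyphens in TCGA sample barcodes as dots.

```r
# Cancer types and file suffixes as used in the download script.
request_cancer <- c("PRAD", "BLCA", "KICH", "KIRC", "KIRP")
suffixes <- c("clinical.csv", "Counts.csv", "miRNA.csv",
              "Copy-Number-Variation.csv", "methylation.csv")

# Build every expected file name, e.g. "TCGA-PRAD-Counts.csv".
projects <- paste("TCGA", request_cancer, sep = "-")
expected_files <- as.vector(outer(projects, suffixes, paste, sep = "-"))

# Report which expected files are not in the working directory.
missing_files <- expected_files[!file.exists(expected_files)]
if (length(missing_files) > 0) {
  message("Missing files: ", paste(missing_files, collapse = ", "))
}

# Hypothetical helper: read a saved matrix back with the first column as
# row names and the column names (sample barcodes) left untouched.
read_saved_matrix <- function(path) {
  as.matrix(read.csv(path, row.names = 1, check.names = FALSE))
}

# Round trip on a toy matrix (made-up values and barcodes) in a temp directory.
toy <- matrix(1:4, nrow = 2,
              dimnames = list(c("geneA", "geneB"),
                              c("TCGA-AA-0001-01A", "TCGA-AA-0002-01A")))
toy_path <- file.path(tempdir(), "TCGA-TOY-Counts.csv")
write.csv(toy, file = toy_path)
restored <- read_saved_matrix(toy_path)
```

For a real file, the same helper would be called as `read_saved_matrix("TCGA-PRAD-Counts.csv")`, assuming that download completed.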
