Installing and Using CummeRbund (for Linux)

Exploration, analysis and visualization of Cufflinks high-throughput RNA-Seq data

CummeRbund is an R package designed to aid and greatly simplify the analysis and visualization of Cufflinks RNA-Seq output.

High-throughput sequencing of RNA fragments is a powerful technique with many applications, including but not limited to transcript assembly, quantitation, and differential expression analysis. The results of these analyses are often very large data sets with a high degree of relations between various data types, and they can be somewhat overwhelming. CummeRbund was designed to help simplify the analysis and exploration portion of RNA-Seq data derived from the output of a differential expression analysis using cuffdiff, with the goal of providing fast and intuitive access to your results.

CummeRbund takes the various output files from a cuffdiff run and creates a SQLite database of the results, describing the appropriate relationships between genes, transcripts, transcription start sites, and CDS regions. Once stored and indexed, data for these features, even across multiple samples or conditions, can be retrieved very efficiently, which lets the user explore the subfeatures of individual genes or of gene sets as the analysis requires. The package also provides numerous plotting functions for commonly used visualizations.
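As a rough illustration of the four feature levels stored in the database, the sketch below reads the example cuffdiff output that ships with the package and prints a few rows per level. It assumes cummeRbund is already installed (see the installation section), and it copies the example files to a temporary directory so that the database can be created in a writable location. The variable names `exampleDir`, `workDir`, and `have_pkg` are only for this illustration.

```r
# Run the example only if cummeRbund is available; otherwise skip it.
have_pkg <- requireNamespace("cummeRbund", quietly = TRUE)

if (have_pkg) {
  library(cummeRbund)

  # Example cuffdiff output bundled with the package
  exampleDir <- system.file("extdata", package = "cummeRbund")

  # Copy it to a writable temporary directory (illustrative choice)
  workDir <- file.path(tempdir(), "cuffdiff_example")
  dir.create(workDir, showWarnings = FALSE)
  file.copy(list.files(exampleDir, full.names = TRUE), workDir)

  # Builds cuffData.db in workDir if it is not there yet, then connects to it
  cuff <- readCufflinks(dir = workDir)
  print(cuff)

  # The same accessors work at each feature level
  print(head(fpkm(genes(cuff))))       # gene-level FPKM per sample
  print(head(fpkm(isoforms(cuff))))    # transcript-level FPKM
  print(head(diffData(TSS(cuff))))     # differential tests per TSS group
  print(head(diffData(CDS(cuff))))     # differential tests per CDS
} else {
  message("cummeRbund is not installed; skipping the example.")
}
```

With your own data, `dir` would point at the cuffdiff output directory instead of the bundled example.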

Installation

Installing R

 wget http://cran.r-project.org/src/base/R-2/R-2.15.2.tar.gz
 tar -zxvf R-2.15.2.tar.gz
 cd R-2.15.2
 ./configure --prefix=/where/you/want/R/to/go
 make
 make check
 make install

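Installing CummeRbund

 # In the shell: download the source package, then start the R built above
 cd /where/you/want/to/work
 wget http://bioconductor.org/packages/2.11/bioc/src/contrib/cummeRbund_2.0.0.tar.gz
 cd /where/you/want/R/to/go/bin
 ./R
 # Inside the R session: install the dependencies first
 setwd("/where/you/want/to/work")
 install.packages("RSQLite")
 install.packages("ggplot2")
 install.packages("plyr")
 install.packages("fastcluster")
 source("http://www.bioconductor.org/biocLite.R")
 biocLite("rtracklayer")
 biocLite("Gviz")
 biocLite("BiocGenerics")
 # Install the downloaded tarball; repos=NULL marks it as a local file
 install.packages("cummeRbund_2.0.0.tar.gz", repos=NULL, type="source")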

Usage

library(cummeRbund)
# run from the cuffdiff output directory; builds cuffData.db on first use
cuff <- readCufflinks()
# force the database to be rebuilt from the cuffdiff files
cuff <- readCufflinks(rebuild=TRUE)
# make a boxplot
csBoxplot(genes(cuff))
# get the top 100 differentially expressed genes
gene.diff <- diffData(genes(cuff))
gene.diff.top <- gene.diff[order(gene.diff$q_value),][1:100,]
# gene ids of the top 100 differentially expressed genes
myGeneIds <- gene.diff.top$gene_id
# get genes
myGenes <- getGenes(cuff, myGeneIds)
# make a heatmap
csHeatmap(myGenes, cluster="both")
# make a volcano plot (depending on the version, the two sample names may also be needed as arguments)
csVolcano(genes(cuff))
# significant genes, isoforms, TSS groups and CDS at alpha = 0.05
> mySigGeneIds <- getSig(cuff,alpha=0.05,level='genes')
> head(mySigGeneIds)
[1] "AATK" "ABCA1" "ABCC9" "ABCG1" "ABLIM3" "ABR"
> length(mySigGeneIds)
[1] 471
> mySigDeiIds <- getSig(cuff,alpha=0.05,level='isoforms')
> length(mySigDeiIds)
[1] 368
> mySigDetIds <- getSig(cuff,alpha=0.05,level='TSS')
> length(mySigDetIds)
[1] 384
> mySigDecIds <- getSig(cuff,alpha=0.05,level='CDS')
> length(mySigDecIds)
[1] 390
Getting the up/down gene list
# Assumption: myDiffGenes is not defined in the original post; here it is
# taken to be the significant genes found above
myDiffGenes <- getGenes(cuff, mySigGeneIds)
# use a new name so the diffData() function is not shadowed
geneDiff <- diffData(myDiffGenes)
myDiffdata <- geneDiff[order(geneDiff$q_value),]
# "up" when expression is higher in sample 2 than in sample 1
UpDown <- ifelse(myDiffdata$value_1 < myDiffdata$value_2, "up", "down")
# a data frame keeps gene_id as text, which cbind() on a factor would not
diffGenes <- data.frame(gene_id=myDiffdata$gene_id, UpDown=UpDown)
write.table(diffGenes, "./sig.txt", sep="\t", col.names=TRUE, row.names=FALSE, quote=FALSE)
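The labeling step can be tried without any cuffdiff output. The sketch below applies the same comparison to a small made-up data frame whose columns (`gene_id`, `value_1`, `value_2`, `q_value`) mimic what `diffData()` returns. The gene names, the numbers, and the helper `label_direction` are invented for illustration and are not part of cummeRbund.

```r
# Made-up stand-in for the data frame returned by diffData()
mockDiff <- data.frame(
  gene_id = c("geneA", "geneB", "geneC", "geneD"),
  value_1 = c(10.0, 2.5, 0.0, 7.0),   # expression in sample 1
  value_2 = c(3.0, 8.0, 4.2, 7.0),    # expression in sample 2
  q_value = c(0.300, 0.001, 0.900, 0.020),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort by q-value and label each gene up or down
label_direction <- function(diff) {
  sorted <- diff[order(diff$q_value), ]
  data.frame(
    gene_id = sorted$gene_id,
    UpDown = ifelse(sorted$value_1 < sorted$value_2, "up", "down"),
    stringsAsFactors = FALSE
  )
}

result <- label_direction(mockDiff)
print(result)

# Write the table in the same tab-separated layout, here to a temporary file
outFile <- tempfile(fileext = ".txt")
write.table(result, outFile, sep = "\t",
            col.names = TRUE, row.names = FALSE, quote = FALSE)
```

Note that a gene whose two values are equal (geneD here) is labeled "down" by this comparison; if that matters for your data, a third label for ties is easy to add.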

 


 » If you repost this article, please credit the source: 博耘生物 » "Installing and Using CummeRbund (for Linux)"
 » Original link: http://boyun.sh.cn/bio/?p=1951

20 thoughts on "Installing and Using CummeRbund (for Linux)"

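  1. Thank you for the excellent write-up!
    When installing CummeRbund, I can never get biocLite("rtracklayer") to finish. The likely cause is that XML cannot be installed, or that its library (libxml) fails to install. I have tried without success, so I am asking here!
    Error:
    installation of package ‘XML’ had non-zero exit status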

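    • Hello, I have a question. After I plot with cummeRbund, the results are automatically saved to Rplots.pdf, aren't they? Yet very often, when I download that PDF, it is reported as a damaged PDF file.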

    • This also covers the "Cannot find curl-config" error.

      On CentOS:
      sudo yum -y install libxml2 libxml2-devel
      sudo yum -y install curl curl-devel

      On Ubuntu:
      sudo apt-get install libxml2-dev
      sudo apt-get install libcurl4-gnutls-dev


  3. Hello, when I follow your steps to make the heatmap, I always get the messages below. Do you know why?
    > csHeatmap(myGenes, cluster="both")
    Using tracking_id, sample_name as id variables
    No id variables; using all as measure variables

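  4. Help needed.
    I am using the cummeRbund package on RNA-seq data for the first time, and the data do not seem to import. After running the cuff step I get this error: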
    Error in sqliteSendQuery(con, statement, bind.data) :
    error in statement: no such table: genes

  5. Hello, I installed with source("http://bioconductor.org/biocLite.R"); biocLite("cummeRbund"), and I keep getting "non-zero exit status" errors. I am using R 3.2.3. The full output is:
    The downloaded source packages are in
    ‘/tmp/RtmpOUV8Ed/downloaded_packages’
    Updating HTML index of packages in ‘.Library’
    Making ‘packages.html’ ... done
    Warning messages:
    1: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘Rsamtools’ had non-zero exit status
    2: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘GenomicAlignments’ had non-zero exit status
    3: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘rtracklayer’ had non-zero exit status
    4: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘GenomicFeatures’ had non-zero exit status
    5: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘BSgenome’ had non-zero exit status
    6: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘VariantAnnotation’ had non-zero exit status
    7: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘biovizBase’ had non-zero exit status
    8: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘Gviz’ had non-zero exit status
    9: In install.packages(pkgs = doing, lib = lib, ...) :
    installation of package ‘cummeRbund’ had non-zero exit status
    Some people online say this happens because the DBI package was not installed correctly, but they give no concrete fix. What do you think the cause might be? The method you describe above downloads the source package and installs the R module by hand. Is it really necessary to install biocLite("rtracklayer"); biocLite("Gviz"); biocLite("BiocGenerics") beforehand?

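    • The eight packages named in the warnings have to be installed first, because cummeRbund calls them.
      Each of those eight packages may have prerequisites of its own, so there is a whole pile to install. Follow the error messages as you go; it took me a full day.
      Any of the packages can fail to install, with complaints about versions, linking, download errors and so on. Switching mirrors usually fixes it.
      Use chooseBioCmirror() to select the mirror that biocLite downloads from.
      If nothing works, repeat the command two or three times. Sometimes it fails once and succeeds the next time, so I suspect a network problem.
      There are not many biocLite mirrors. If some packages still fail after several attempts, you can also install them directly with install.packages("*"), using chooseCRANmirror() to select the mirror.
      To speed up the installation, you can open several accounts and keep different mirrors open as backups.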
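The mirror and retry advice above can be sketched in R as follows. The mirror URLs are only examples, and `install_with_retry` is a hypothetical helper written for this illustration, not a function from R or Bioconductor. Nothing is downloaded unless the helper is actually called on a missing package.

```r
# Set the mirrors non-interactively instead of picking them from a menu
# (the URLs are examples; any CRAN or Bioconductor mirror can be used)
options(repos = c(CRAN = "https://cloud.r-project.org"))
options(BioC_mirror = "https://bioconductor.org")

# Hypothetical helper: try an installer a few times and stop as soon as
# the package can be loaded
install_with_retry <- function(pkg, tries = 3,
                               installer = utils::install.packages) {
  for (i in seq_len(tries)) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      return(TRUE)
    }
    message("Attempt ", i, " of ", tries, " for package ", pkg)
    try(installer(pkg), silent = TRUE)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Already-installed packages return TRUE without any download
print(install_with_retry("stats"))

# Example calls for missing packages (commented out; they need network access):
# install_with_retry("plyr")
# install_with_retry("rtracklayer", installer = biocLite)
```

The `installer` argument is there so the same loop can wrap either `install.packages` or a Bioconductor installer such as `biocLite`.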

  6. Hello, I get the error below when running this. Does anyone know what is going on and how to fix it?
    > library(cummeRbund)
    > setwd("/usr/cuffdiff")
    > cuff_data <- readCufflinks()
    Creating database /usr/cuffdiff/cuffData.db
    Reading Run Info File /usr/cuffdiff/run.info
    Writing runInfo Table
    Reading Read Group Info /usr/cuffdiff/read_groups.info
    Writing replicates Table
    Reading Var Model Info /usr/cuffdiff/var_model.info
    Writing varModel Table
    Reading /usr/cuffdiff/genes.fpkm_tracking
    Checking samples table...
    Populating samples table...
    Error: Column name mismatch.
    In addition: There were 50 or more warnings (use warnings() to see the first 50)

  7. cuff<-readCufflinks()
    Creating database /public/home/zffang/TM/fanfeifei/cuffdiff_results/cuffData.db
    Reading /public/home/zffang/TM/fanfeifei/cuffdiff_results/genes.fpkm_tracking
    Checking samples table...
    Populating samples table...
    Error: Column name mismatch.
    In addition: There were 50 or more warnings (use warnings() to see the first 50)
