Practical guide | What does this gene do?


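You get a sequence, and all you see is one base (or amino acid) after another. Feels pretty abstract and confusing, right?

So what is it actually for?

Don't worry, let me walk you through where this sequence really comes from.

How?

With gene function annotation, of course!

Then let's get started.

OK!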


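Gene function annotation mainly works by aligning a gene's sequence against various databases and then retrieving the corresponding functional annotation. Put simply: sequence A in a database has a known function; we align our sequence B against the database and it happens to hit A; we then infer that B probably shares A's function (the rationale being that sequence similarity is closely related to gene function). Commonly used databases include Nt, Nr, Swiss-Prot, TrEMBL, eggNOG, KEGG, InterPro, and GO.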
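The "B hits A, so B probably does what A does" idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming the alignment results are in BLAST's standard 12-column tabular format (`-outfmt 6`). The gene IDs, protein IDs, function descriptions, and the E-value cutoff are all made up for the example.

```python
import csv
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(blast_tsv, max_evalue=1e-5):
    """Return {query_id: (subject_id, bitscore)}, keeping the top-scoring hit per query."""
    best = {}
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) > max_evalue:
            continue  # too weak to support transferring a function
        bitscore = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bitscore)
    return best


def transfer_annotation(hits, known_functions):
    """Give each query the function recorded for its best database hit."""
    return {query: known_functions.get(subject, "unknown")
            for query, (subject, _) in hits.items()}


# Hypothetical alignment results: geneB hits protA strongly and protC weakly;
# geneD only has a hit that fails the E-value cutoff.
EXAMPLE_TSV = (
    "geneB\tprotA\t92.5\t200\t15\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "geneB\tprotC\t41.0\t180\t90\t4\t5\t184\t10\t190\t1e-20\t95\n"
    "geneD\tprotE\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t30\n"
)
# Hypothetical database annotations.
KNOWN_FUNCTIONS = {"protA": "example kinase", "protC": "example transporter"}

hits = best_hits(io.StringIO(EXAMPLE_TSV))
annotations = transfer_annotation(hits, KNOWN_FUNCTIONS)
print(annotations)
```

Taking only the single best hit is the simplest possible rule. Real pipelines usually also look at alignment coverage and identity before trusting a transferred function.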


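1. BLAST databases



Nt and Nr are NCBI's nucleotide sequence database and non-redundant protein sequence database, respectively.

Once downloaded locally, these databases can be used for large-scale gene annotation; for a small number of sequences you can instead use NCBI's online BLAST and select the NT or NR database. I have already covered some of the usage involved in these earlier posts:

A detailed guide to NCBI online BLAST


Splitting NT and NR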
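For the local route, a search against a downloaded Nr database might be launched roughly as below. This is a sketch, not a recommended configuration: it assumes NCBI BLAST+ is installed and a formatted `nr` database is available locally. The file names, E-value cutoff, and thread count are placeholders chosen for illustration.

```python
import shlex


def build_blast_command(query_fasta, database, output_tsv,
                        program="blastp", evalue=1e-5, threads=4):
    """Assemble a BLAST+ command line as an argument list (does not run it)."""
    return [
        program,
        "-query", query_fasta,        # FASTA file with the sequences to annotate
        "-db", database,              # name or path of the formatted BLAST database
        "-outfmt", "6",               # tabular output, one hit per line
        "-evalue", str(evalue),       # report only hits at or below this E-value
        "-num_threads", str(threads),
        "-out", output_tsv,
    ]


# Placeholder file names; replace with your own paths.
command = build_blast_command("my_proteins.fasta", "nr", "my_proteins_vs_nr.tsv")
print(shlex.join(command))
```

To actually run the search, the list can be passed to `subprocess.run(command, check=True)`. For nucleotide queries against Nt, `program="blastn"` and `database="nt"` would be the corresponding choice.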



2. UniProtKB



1) Swiss-Prot

A curated protein sequence database that strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases.

The protein annotations here have all been checked manually, so their accuracy is very high.

2) TrEMBL

A computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot.

The protein function annotations here are predicted (inferred from sequence homology) and complement UniProtKB/Swiss-Prot.

3) Online alignment:

http://www.uniprot.org/blast/



3. eggNOG


eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) currently covers 2031 eukaryotic and prokaryotic organisms, as well as precomputed mappings for 1655 additional prokaryotes and 352 viruses.

A dedicated alignment tool for eggNOG, eggNOG-mapper, is now available online at:

http://eggnogdb.embl.de/#/app/emapper




4. KEGG


KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.

KEGG website: http://www.kegg.jp/


For usage details, see:

An up-to-date practical introduction to KEGG


For further explanation, see:

How to read a KEGG map


A simple way to color genes in a KEGG Pathway map


The KEGG automatic annotation service



5. InterPro annotation



InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium.


Website: http://www.ebi.ac.uk/interpro/



Of the member databases, Pfam, PRINTS, PROSITE, ProDom, and SMART are the five best.



6. GO



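Gene Ontology (GO) was originally developed for eukaryotic model organisms and is now applied to gene products across all species. We can use the web-based browser AmiGO http://amigo.geneontology.org/cgi-bin/amigo/go.cgi , or the OBO-Edit software, to search for any GO term.

Biologists today waste far too much time and effort searching for biological information. The root cause is inconsistent terminology in biology: definitions shift over time and from person to person, so not only do exact-match computer searches struggle to find them, even fully manual curation cannot keep up. For example, if you are looking for a drug target for a new antibiotic, you may want to find all gene products involved in bacterial protein synthesis, especially those that differ markedly from the corresponding components of protein synthesis in humans. But if one database describes these gene products as "translation" and another as "protein synthesis", a computer can hardly tell that two terms that look so different on the surface mean the same thing functionally. The Gene Ontology (GO) project is the result of an effort to make descriptions of gene product function consistent across databases.

GO comprises three categories, which describe a gene product from three angles; a single gene product can be annotated in more than one of them: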

Biological Process: Those processes specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end.

Cellular Component: The part of a cell or its extracellular environment in which a gene product is located. A gene product may be located in one or more parts of a cell and its location may be as specific as a particular macromolecular complex, that is, a stable, persistent association of macromolecules that function together.

Molecular Function: Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.

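GO annotation can be done with InterProScan (for a single sequence, running InterProScan directly yields the GO IDs). The mapping file below can be downloaded for local, large-scale GO annotation of genes:

http://www.geneontology.org/external2go/interpro2go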
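A local GO annotation step based on that mapping file could look something like the following. This is a rough sketch, assuming the file uses the usual external2go layout: comment lines start with `!`, and each mapping line has the form `InterPro:IPRxxxxxx name > GO:term name ; GO:xxxxxxx`. The sample lines and the gene-to-InterPro assignments are illustrative only and are not copied from a real run.

```python
from collections import defaultdict


def parse_interpro2go(lines):
    """Parse external2go-style lines into {InterPro ID: [GO IDs]}."""
    mapping = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and header comments
        left, separator, right = line.partition(" > ")
        if not separator:
            continue  # not a mapping line
        interpro_id = left.split()[0].replace("InterPro:", "")
        go_id = right.rsplit(";", 1)[-1].strip()
        if go_id.startswith("GO:"):
            mapping[interpro_id].append(go_id)
    return dict(mapping)


def go_terms_for_genes(gene_to_interpro, interpro_to_go):
    """Collect the GO IDs implied by each gene's InterPro entries."""
    result = {}
    for gene, interpro_ids in gene_to_interpro.items():
        terms = set()
        for interpro_id in interpro_ids:
            terms.update(interpro_to_go.get(interpro_id, []))
        result[gene] = sorted(terms)
    return result


# Illustrative lines in the assumed format.
SAMPLE_LINES = [
    "! example comment line",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:nucleus ; GO:0005634",
]
# Hypothetical InterPro assignments, e.g. taken from an InterProScan run.
GENE_TO_INTERPRO = {"gene1": ["IPR000003"], "gene2": ["IPR999999"]}

interpro_to_go = parse_interpro2go(SAMPLE_LINES)
gene_go = go_terms_for_genes(GENE_TO_INTERPRO, interpro_to_go)
print(gene_go)
```

With the real file, `parse_interpro2go(open("interpro2go"))` would replace the sample lines. Genes whose InterPro entries have no mapping simply end up with an empty list.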

Blast2GO also offers a trial version for download:

https://www.blast2go.com/blast2go-pro/download-b2g



For more information, see:

Introduction to the GO database and how to use it


GO functional enrichment

