ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval
Wang, JY(王靖琰); Gao, X; Wang, QQ; Li, YP(李勇平)
2012
Source Publication: BMC BIOINFORMATICS
ISSN: 1471-2105
Volume: 13
Abstract
Background: The need to retrieve or classify protein molecules using structure- or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. Such pairwise measures neglect the distribution of the other proteins in the database and therefore cannot meet the accuracy that retrieval systems require. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning contextual dissimilarity/similarity measures can improve retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms are unsupervised and do not use the known class labels of the proteins in the database.
Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure d(ij) by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is that, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing d(ij) by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and obtain a contextual dissimilarity vector for each protein pair. Using the class labels of the proteins, we select relevant protein pairs (pairs with the same class label) and irrelevant protein pairs (pairs with different labels), and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is then used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update the Protein Hierarchical Context Coherently in an iterative algorithm, ProDis-ContSHC. We test the performance of ProDis-ContSHC on two benchmark sets: the ASTRAL 1.73 database and the FSSP/DALI database. Experimental results demonstrate that retrieval systems using our supervised contextual dissimilarity measures significantly outperform those using context-free dissimilarity/similarity measures or other unsupervised contextual dissimilarity measures that do not use the class label information.
Conclusions: Using the contextual proteins in the database together with their class labels, we can dramatically improve the accuracy of pairwise dissimilarity/similarity measures for protein retrieval tasks. In this work, for the first time, we propose the idea of supervised contextual dissimilarity learning, resulting in the ProDis-ContSHC algorithm.
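The abstract gives no formulas, so the following is only a minimal sketch of the general idea (regularizing d(ij) by a factor derived from the hierarchical contexts N(i) and N(j), then iterating), not the authors' implementation. Every concrete choice here is an assumption made for illustration: the function names `neighbors`, `context_vector`, and `regularize`, the use of k nearest neighbors as the context, equal-sized sub-context levels, the mean cross-dissimilarity as the per-level value, and the multiplicative update. In particular, the SVM-learned supervised factor described in the abstract is replaced by a plain average of the contextual dissimilarity vector, so no class labels are used.

```python
def neighbors(d, i, k):
    """Return the k nearest neighbors of item i under matrix d (i itself excluded)."""
    others = (j for j in range(len(d)) if j != i)
    return sorted(others, key=lambda j: (d[i][j], j))[:k]


def context_vector(d, i, j, k, levels):
    """Contextual dissimilarity vector for the pair (i, j).

    Assumption: the k-neighbor context is cut into `levels` equal slices
    (nearest first), and each entry is the mean dissimilarity between the
    matching slices of N(i) and N(j).
    """
    if levels < 1 or k % levels != 0:
        raise ValueError("k must be a positive multiple of levels")
    step = k // levels
    ni, nj = neighbors(d, i, k), neighbors(d, j, k)
    vec = []
    for level in range(levels):
        a = ni[level * step:(level + 1) * step]
        b = nj[level * step:(level + 1) * step]
        vec.append(sum(d[p][q] for p in a for q in b) / (len(a) * len(b)))
    return vec


def regularize(d, k=2, levels=2, iterations=2):
    """Iteratively rescale d(ij) by a factor computed from the contexts.

    Assumption: the factor is the plain mean of the contextual vector, a
    stand-in for the SVM-learned factor that the abstract describes.
    """
    n = len(d)
    for _ in range(iterations):
        updated = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                vec = context_vector(d, i, j, k, levels)
                factor = sum(vec) / len(vec)
                updated[i][j] = updated[j][i] = d[i][j] * factor
        d = updated  # contexts are recomputed from the updated measure
    return d


if __name__ == "__main__":
    # Toy data: two well-separated groups of points on a line.
    points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    base = [[abs(a - b) for b in points] for a in points]
    for row in regularize(base):
        print(["%.4g" % value for value in row])
```

On the toy data, pairs whose neighborhoods overlap are pulled closer and pairs with distant neighborhoods are pushed apart, which is the qualitative behavior the abstract attributes to contextual regularization. A supervised variant along the lines of the abstract would instead train a classifier on the vectors of same-label and different-label pairs and derive the factor from its output.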
Language: English
Funding Project: Shanghai Institute of Applied Physics project group
WOS ID: WOS:000303940000003
Citation statistics
Cited Times: 18 [WOS]
Document Type: Journal article
Identifier: http://ir.sinap.ac.cn/handle/331007/13259
Collection: Shanghai Institute of Applied Physics, Chinese Academy of Sciences, 2011-2020
Recommended Citation
GB/T 7714: Wang, JY, Gao, X, Wang, QQ, et al. ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval[J]. BMC BIOINFORMATICS, 2012, 13.
APA: Wang, JY, Gao, X, Wang, QQ, & Li, YP. (2012). ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval. BMC BIOINFORMATICS, 13.
MLA: Wang, JY, et al. "ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval". BMC BIOINFORMATICS 13 (2012).
Files in This Item:
File Name/Size: ProDis-ContSHC-learn (2264KB)
Access: Open access
License: CC BY-NC-SA
File name: ProDis-ContSHC-learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval.pdf
Format: Adobe PDF