An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine
Date
2014-03-17
Authors
Cui, Song
Youn, Eunseog
Lee, Joohyun
Maas, Stephan J.
Kestler, Hans A.
Abstract
Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs)
contributes greatly to understanding gene regulatory networks. However, these approaches rely on
laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to
circumvent laborious biological methods. However, the majority of these algorithms offer limited performance and fail
to consider the structural properties of the datasets. We propose a refined systematic computational approach for
predicting TFTGs. Building on previous work on identifying auxin response factor target genes from Arabidopsis thaliana
co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm
converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a
classification problem solved with a support vector machine-based classifier. Our approach showed significant improvement
over other computational methods, as measured by the area under the curve (AUC) of the receiver operating characteristic
(ROC) curve using 10-fold cross-validation. In addition, in light of the highly skewed structure of the dataset, we also evaluated
other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory
results.
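To make the idea of turning an upstream sub-sequence into a vector more concrete, the following is a minimal sketch of a reverse-complement-aware n-gram profile. It is not the authors' algorithm: the abstract does not specify the distance-sensitive component, so it is omitted here, and the function names (`reverse_complement`, `canonical`, `rc_ngram_profile`), the default `n = 3`, and the frequency normalization are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Map an n-gram and its reverse complement to one shared key
    (hypothetical convention: the lexicographically smaller of the two)."""
    return min(kmer, reverse_complement(kmer))


def rc_ngram_profile(seq, n=3):
    """Return a fixed-length vector of n-gram frequencies in which each
    n-gram is pooled with its reverse complement.

    Assumption: plain relative frequencies; the paper's distance-sensitive
    weighting is not reproduced here.
    """
    seq = seq.upper()
    # Fixed, sorted feature order so every sequence maps to the same axes.
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=n)})
    counts = Counter()
    for i in range(len(seq) - n + 1):
        kmer = seq[i:i + n]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in keys]


if __name__ == "__main__":
    vec = rc_ngram_profile("TGTCTCAAGGT", n=3)
    print(len(vec), sum(vec))
```

Because each n-gram is pooled with its reverse complement, a sequence and its reverse complement yield the same vector, so the strand on which a motif occurs does not change the features. Vectors of this kind could then be passed to any standard SVM implementation (for example scikit-learn's `SVC`, which is an assumption here, not a tool named in the abstract).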
Citation
PLoS ONE. 2014 Mar 17;9(4):e94519