Sequence-Based Prediction of Putative Transcription Factor Binding Sites in DNA Sequences of Any Length
- Lee, Wook;
- Park, Byungkyu;
- Han, Kyungsook
Web of Science citations: 5
Scopus citations: 5
Abstract
A transcription factor (TF) is a protein that regulates gene expression by binding to specific DNA sequences. Despite recent advances in experimental techniques for identifying transcription factor binding sites (TFBS) in DNA sequences, a large number of TFBS remain to be discovered in many species. Several computational methods for predicting TFBS in DNA are tissue- or species-specific and therefore cannot be used without prior knowledge of the tissue or species. Others can identify TFBS only in short DNA sequences. In this paper, we propose a new learning method for predicting TFBS in DNA of any length using the composition, transition, and distribution of nucleotides and amino acids in DNA and TF sequences. In independent testing on datasets that were not used in training, the accuracy and Matthews correlation coefficient (MCC) were as high as 81.84 percent and 0.634, respectively. The proposed method can be a useful aid for selecting potential TFBS in large amounts of DNA sequence before conducting biochemical experiments to determine TFBS empirically.
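As an illustration of the kind of features the abstract names, the sketch below computes composition, transition, and distribution descriptors for a DNA sequence of any length. It is a minimal sketch under stated assumptions, not the authors' encoding: the function names, the feature keys, the use of unordered nucleotide pairs for transitions, and the 0/25/50/75/100 percent occurrence points for distribution are all assumptions borrowed from the commonly used CTD descriptor scheme, and the abstract does not specify any of them.

```python
from itertools import combinations

ALPHABET = "ACGT"  # assumption: only unambiguous bases are counted


def composition(seq):
    """Fraction of the sequence made up of each nucleotide."""
    n = len(seq)
    return {f"comp_{b}": seq.count(b) / n for b in ALPHABET}


def transition(seq):
    """Frequency of adjacent positions that switch between two different
    nucleotides, counted per unordered pair (AC, AG, AT, CG, CT, GT)."""
    counts = {a + b: 0 for a, b in combinations(ALPHABET, 2)}
    for x, y in zip(seq, seq[1:]):
        if x != y:
            key = "".join(sorted((x, y)))
            if key in counts:  # skips pairs involving symbols such as N
                counts[key] += 1
    steps = max(len(seq) - 1, 1)
    return {f"trans_{k}": v / steps for k, v in counts.items()}


def distribution(seq):
    """Relative position of the first, 25%, 50%, 75%, and last occurrence
    of each nucleotide (0.0 if the nucleotide never occurs)."""
    n = len(seq)
    out = {}
    for b in ALPHABET:
        pos = [i + 1 for i, ch in enumerate(seq) if ch == b]
        for q in (0, 25, 50, 75, 100):
            if not pos:
                out[f"dist_{b}_{q}"] = 0.0
                continue
            # ceil(len(pos) * q / 100) in integer arithmetic; q = 0 maps
            # to the first occurrence
            k = max(1, -(-len(pos) * q // 100))
            out[f"dist_{b}_{q}"] = pos[k - 1] / n
    return out


def ctd_features(seq):
    """Fixed-length (30-value) feature dictionary for a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    features = {}
    features.update(composition(seq))
    features.update(transition(seq))
    features.update(distribution(seq))
    return features
```

Because every descriptor is a ratio, the feature vector has the same length regardless of sequence length, which is one plausible reason such descriptors suit input "of any length". The same idea could be applied to a TF's amino acid sequence by swapping the alphabet (or amino acid groups) in place of `ALPHABET`.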
- Title: Sequence-Based Prediction of Putative Transcription Factor Binding Sites in DNA Sequences of Any Length
- Authors: Lee, Wook; Park, Byungkyu; Han, Kyungsook
- Publication date: 2018-09
- Type: Article; Proceedings Paper
- Volume: 15
- Issue: 5
- Pages: 1461-1469