Clustering Techniques for Biological Sequence Analysis: A Review
Published: 2015
Author(s) Name: Jyoti Lakhani, Anupama Chowdhary, Dharmesh Harwani
Abstract
A variety of technical tools are now available for supporting and validating wet-lab experiments in science and biotechnology. Analyzing biological sequences requires grouping similar genes, which can be done using techniques such as pattern matching, classification, and clustering. The present study uses clustering as a tool for analyzing biological data. Clustering of biological sequences is an active area of research, but simple clustering algorithms are not well suited to sequence analysis problems. Most biological sequence analysis problems are NP-hard and require strong optimization algorithms. This manuscript is a survey of clustering techniques useful for the analysis of biological sequences. A 3+ stage review process was adopted for the literature review. To prepare this report, 98 papers published from 1997 to 2014 were reviewed in order of publication year. The reviewed papers discuss various issues related to the analysis of biological sequences; the major issues identified were prediction, sequence alignment, motif discovery, and cluster boundary prediction, among others. The solution approaches used by researchers for biological sequence analysis include evolutionary clustering, neural networks, hierarchical clustering, k-means, Go technologies, feature selection, incremental approaches, bio-inspired methods, particle swarm optimization, fuzzy techniques, rough set theory, and bi-clustering. Researchers have applied these approaches to various types of datasets. In this communication we also discuss these datasets, the parameters used, and the results reported in the papers.
Keywords: Biological Sequences, Sequence Analysis, Clustering, Sequence Clustering