Extracting meaningful information from gene expression data is a major challenge for computational researchers and biologists alike. Analysis of gene expression data can reveal behavioral patterns of genes, such as the nature of their interactions and the similarity of their behavior. If two different genes show similar expression patterns across samples, this suggests common regulation or a relationship between their functions. These patterns have significant applications in bioinformatics and clinical research, including drug discovery, treatment planning, accurate diagnosis, prognosis, and protein network analysis.
Identifying such patterns in gene expression data requires data mining techniques. The major techniques applicable to gene expression analysis include clustering, classification, and association rule mining. Clustering is an important technique for this kind of analysis, but it has disadvantages, and biclustering was introduced to overcome them. Clustering is a global model: it groups genes according to their expression across all samples. Biclustering is a local model: it groups genes according to their expression across a subset of samples. Discovering such local expression patterns is essential for identifying many genetic pathways that are not apparent otherwise. It is therefore necessary to move beyond the clustering paradigm and develop approaches capable of discovering local patterns in gene expression data.
Biclustering is a two-dimensional clustering problem in which genes and samples are grouped simultaneously. It has great potential for detecting marker genes associated with particular tissues or diseases. Because the problem is NP-hard, it has attracted a great deal of research, including statistical and graph-theoretic approaches. The proposed Cuckoo Search (CS) method finds significant biclusters in large expression datasets. Experimental results are demonstrated on benchmark datasets. This work also assesses the biological relevance of the biclusters in terms of function, using Gene Ontology.
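The idea of a bicluster as a local pattern can be illustrated with a small sketch. It is not the method of this work: the binary encoding (one bit per gene followed by one bit per sample), the helper names `decode` and `mean_squared_residue`, the toy matrix, and the choice of the mean squared residue as coherence score are all assumptions made for illustration. The mean squared residue is a commonly used bicluster quality measure, but the abstract does not state which fitness function the Cuckoo Search method optimizes.

```python
import numpy as np

def decode(bits, n_genes):
    """Split a flat bit vector into gene and sample masks.

    Assumed encoding: the first n_genes bits select genes (rows),
    the remaining bits select samples (columns).
    """
    bits = np.asarray(bits, dtype=bool)
    return bits[:n_genes], bits[n_genes:]

def mean_squared_residue(data, gene_mask, sample_mask):
    """Mean squared residue of the submatrix picked by the two masks.

    Lower values mean the selected genes vary more coherently
    across the selected samples.
    """
    sub = data[np.ix_(gene_mask, sample_mask)]
    row_mean = sub.mean(axis=1, keepdims=True)
    col_mean = sub.mean(axis=0, keepdims=True)
    residue = sub - row_mean - col_mean + sub.mean()
    return float((residue ** 2).mean())

# Toy expression matrix (4 genes x 5 samples), invented for illustration.
# Genes 0 and 1 follow the same trend in samples 0-2 only.
data = np.array([
    [1.0, 2.0, 3.0, 9.0, 0.0],
    [2.0, 3.0, 4.0, 0.0, 7.0],
    [5.0, 1.0, 8.0, 2.0, 6.0],
    [0.0, 7.0, 2.0, 4.0, 1.0],
])

# Candidate bicluster: genes 0-1 restricted to samples 0-2.
gene_mask, sample_mask = decode([1, 1, 0, 0, 1, 1, 1, 0, 0], n_genes=4)
local_msr = mean_squared_residue(data, gene_mask, sample_mask)

# Same two genes scored across all samples, as a global view would do.
all_samples = np.ones(data.shape[1], dtype=bool)
global_msr = mean_squared_residue(data, gene_mask, all_samples)

print(local_msr, global_msr)
```

In this toy example the two genes are perfectly coherent on the three selected samples, so the local score is far lower than the score over all samples. This is the kind of pattern a global clustering of whole rows would tend to miss. A search method such as Cuckoo Search could, under these assumptions, explore the space of bit vectors and score each candidate with a function of this kind.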
Table of Contents
- ABSTRACT
- LIST OF TABLES
- LIST OF FIGURES
- INTRODUCTION
- MICROARRAY TECHNOLOGY
- MICROARRAY DATA CLUSTERING ANALYSIS
- BICLUSTERING
- Bicluster Types
- MOTIVATION
- PROBLEM STATEMENT
- RESEARCH OBJECTIVE
- ENCODING OF BICLUSTER
- DATASETS USED
- BIOLOGICAL VALIDATION OF BICLUSTERS
- LITERATURE REVIEW
- SYSTEMATIC BICLUSTERING ALGORITHMS
- Divide and Conquer Approach
- Greedy Iterative Search Approach
- Biclusters Enumeration Approach
- STOCHASTIC BICLUSTERING ALGORITHMS
- Neighbourhood Search Approach
- Evolutionary Computation Approach
- SYSTEMATIC BICLUSTERING ALGORITHMS
- BICLUSTERING GENE EXPRESSION DATA USING CUCKOO SEARCH
- CUCKOO SEARCH
- EXPERIMENT RESULTS ANALYSIS
- Experimental Setup
- Bicluster extraction for Yeast and Human Lymphoma Dataset
- Biological Relevance
- Biological Annotation for Yeast cell cycle using GOTermFinder Toolbox
- SUMMARY
- REFERENCES
Objectives and Key Themes
This work focuses on extracting meaningful information from gene expression data using a biclustering approach. The primary objective is to identify local expression patterns in gene expression data, which can be used for a variety of applications in bioinformatics and clinical research, such as drug discovery, treatment planning, and protein network analysis.
- Biclustering as a method for identifying local patterns in gene expression data
- The use of the Cuckoo Search (CS) algorithm for biclustering
- The biological relevance of biclusters and their application in gene ontology analysis
- The identification of marker genes associated with specific tissues or diseases
- The potential of biclustering for various bioinformatics and clinical applications
Chapter Summaries
The introduction chapter provides an overview of microarray technology, data clustering analysis, and biclustering, highlighting the importance of biclustering in gene expression data analysis. It also sets out the motivation, problem statement, research objective, and datasets used in the study.
The literature review chapter presents a comprehensive overview of existing biclustering algorithms, categorizing them into systematic and stochastic approaches. It discusses various techniques and their limitations, setting the stage for the proposed method.
The chapter on biclustering gene expression data using Cuckoo Search describes the CS algorithm and its application to identifying significant biclusters. It covers the experimental setup, the analysis of results, and the biological relevance of the discovered biclusters.
Keywords
Gene expression data, biclustering, Cuckoo Search, data mining, bioinformatics, clinical research, gene ontology, marker genes, biological relevance, local patterns, microarray technology, data clustering analysis.
- Quote paper
- B. Rengeswaran (Author), A.M. Natarajan (Author), K. Premalatha (Author), 2015, A Nature Inspired Algorithm for Biclustering Microarray Data Analysis, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/300977