Life Science Data Mining

by Stephen Wong

Overview

This timely book identifies and highlights the latest data mining paradigms to analyze, combine, integrate, model and simulate vast amounts of heterogeneous multi-modal, multi-scale data for emerging real-world applications in life science. The cutting-edge topics presented include bio-surveillance, disease outbreak detection, high throughput bioimaging, drug screening, predictive toxicology, biosensors, and the integration of macro-scale bio-surveillance and environmental data with micro-scale biological data for personalized medicine. This collection of works from leading researchers in the field offers readers an exceptional start in these areas.

Product Details

ISBN-13: 9789812700650
Publisher: World Scientific Publishing Company, Incorporated
Publication date: 12/28/2006
Series: Science, Engineering, and Biology Informatics Series
Pages: 388
Product dimensions: 6.00(w) x 9.00(h) x 0.90(d)

Table of Contents


Preface     v
Survey of Early Warning Systems for Environmental and Public Health Applications     1
Introduction     1
Disease Surveillance     3
Reference Architecture for Model Extraction     5
Problem Domain     9
Data Sources     10
Detection Methods     12
Summary and Conclusion     13
References     14
Time-Lapse Cell Cycle Quantitative Data Analysis Using Gaussian Mixture Models     17
Introduction     18
Material and Feature Extraction     20
Material and cell feature extraction     20
Model the time-lapse data using AR model     23
Problem Statement and Formulation     24
Classification Methods     26
Gaussian mixture models and the EM algorithm     26
K-Nearest Neighbor (KNN) classifier     28
Neural networks     28
Decision tree     29
Fisher clustering     30
Experimental Results     30
Trace identification     31
Cell morphologic similarity analysis     33
Phase identification     35
Cluster analysis of time-lapse data     37
Conclusion     40
Appendix A     41
Appendix B     42
References     43
Diversity and Accuracy of Data Mining Ensemble     47
Introduction     47
Ensemble and Diversity     49
Why needs diversity?     49
Diversity measures     51
Probability Analysis     52
Coincident Failure Diversity     52
Ensemble Accuracy     55
Relationship between random guess and accuracy of lower bound single models     55
Relationship between accuracy A and the number of models N     56
When model's accuracy < 50%     57
Construction of Effective Ensembles     58
Strategies for increasing diversity     59
Ensembles of neural networks     60
Ensembles of decision trees     61
Hybrid ensembles     62
An Application: Osteoporosis Classification Problem     62
Osteoporosis problem     63
Results from the ensembles of neural nets     63
Results from ensembles of the decision trees     66
Results of hybrid ensembles     67
Discussion and Conclusions     68
References     70
Integrated Clustering for Microarray Data     73
Introduction     73
Related Work     77
Data Preprocessing     81
Integrated Clustering     83
Clustering algorithms     83
Integration methodology     88
Experimental Evaluation     89
Evaluation methodology     89
Results     91
Discussion     93
Conclusions     94
References     94
Complexity and Synchronization of EEG with Parametric Modeling     99
Introduction     100
Brief review of EEG recording analysis     100
AR modeling based EEG analysis     101
TVAR Modeling     104
Complexity Measure     105
Synchronization Measure     109
Conclusions     113
References     114
Bayesian Fusion of Syndromic Surveillance with Sensor Data for Disease Outbreak Classification     119
Introduction     120
Approach     122
Bayesian belief networks     122
Syndromic data     126
Environmental data     128
Test scenarios     130
Evaluation metrics     130
Results     131
Scenario 1     131
Scenario 2     134
Promptness     135
Summary and Conclusions     136
References     137
An Evaluation of Over-the-Counter Medication Sales for Syndromic Surveillance     143
Introduction     143
Background and Related Work     144
Data     144
Approaches     145
Lead-lag correlation analysis     145
Regression test of predictive ability     146
Detection-based approaches     148
Supervised algorithm for outbreak detection in OTC data     148
Modified Holt-Winters forecaster     150
Forecasting based on multi-channel regression     151
Experiments     153
Lead-lag correlation analysis of OTC data     153
Regression test of the predictive value of OTC     154
Results from detection-based approaches     156
Conclusions and Future Work     158
References     159
Collaborative Health Sentinel     163
Introduction     163
Infectious Disease and Existing Health Surveillance Programs     166
Elements of the Collaborative Health Sentinel (CHS) System     170
Sampling     170
Creating a national health map     177
Detection     177
Reaction     183
Cost considerations     184
Interaction with the Health Information Technology (HCIT) World     185
Conclusion     188
References     189
HL7     192
A Multi-Modal System Approach for Drug Abuse Research and Treatment Evaluation: Information Systems Needs and Challenges     195
Introduction     195
Context     198
Data sources     198
Examples of relevant questions     199
Possible System Structure     201
Challenges in System Development and Implementation     204
Ontology development     204
Data source control, proprietary issues     205
Privacy, security issues     205
Costs to implement/maintain system     206
Historical hypothesis-testing paradigm     206
Utility, usability, credibility of such a system     206
Funding of system development     207
Summary     207
References     208
Knowledge Representation for Versatile Hybrid Intelligent Processing Applied in Predictive Toxicology     213
Introduction     214
Hybrid Intelligent Techniques for Predictive Toxicology Knowledge Representation     217
XML Schemas for Knowledge Representation and Processing in AI and Predictive Toxicology     218
Towards a Standard for Chemical Data Representation in Predictive Toxicology     220
Hybrid Intelligent Systems for Knowledge Representation in Predictive Toxicology     225
A formal description of implicit and explicit knowledge-based intelligent systems     226
An XML schema for hybrid intelligent systems     228
A Case Study     231
Materials and methods     232
Results     233
Conclusions     235
References     236
Ensemble Classification System Implementation for Biomedical Microarray Data     239
Introduction     240
Background     241
Reasons for ensemble     241
Diversity and ensemble     241
Relationship between measures of diversity and combination method     243
Measures of diversity     243
Microarray data     244
Ensemble Classification System (ECS) Design     245
ECS overview     245
Feature subset selection     247
Base classifiers     248
Combination strategy     249
Experiments     250
Experimental datasets     250
Experimental results     252
Conclusion and Further Work     254
References     255
An Automated Method for Cell Phase Identification in High Throughput Time-Lapse Screens     257
Introduction     258
Nuclei Segmentation and Tracking     259
Cell Phase Identification     260
Feature calculation     260
Identifying cell phase     262
Correcting cell phase identification errors     265
Experimental Results     266
Conclusion     272
References     272
Inference of Transcriptional Regulatory Networks Based on Cancer Microarray Data     275
Introduction     275
Subnetworks and Transcriptional Regulatory Networks Inference     277
Inferring subnetworks using z-score     277
Inferring subnetworks based on graph theory     278
Inferring subnetworks based on Bayesian networks     279
Inferring transcriptional regulatory networks based on integrated expression and sequence data     283
Multinomial Probit Regression with Bayesian Gene Selection     284
Problem formulation     284
Bayesian variable selection     286
Bayesian estimation using the strongest genes     288
Experimental results     289
Network Construction Based on Clustering and Predictor Design     293
Predictor construction using reversible jump MCMC annealing     293
CoD for predictors     295
Experimental results on a Myeloid line     296
Concluding Remarks     298
References     299
Data Mining in Biomedicine     305
Introduction     305
Predictive Model Construction     306
Derivation of unsupervised models     307
Derivation of supervised models     311
Validation     316
Impact Analysis     318
Summary     319
References     319
Mining Multilevel Association Rules from Gene Ontology and Microarray Data     321
Introduction     321
Proposed Methods     323
Preprocessing     323
Hierarchy-information encoding     324
The MAGO Algorithm     326
MAGO algorithm     327
CMAGO (Constrained Multilevel Association rules with Gene Ontology)     329
Experimental Results     330
The characteristic of the dataset     331
Experimental results     331
Interpretation     334
Concluding Remarks     335
References     336
A Proposed Sensor-Configuration and Sensitivity Analysis of Parameters with Applications to Biosensors     339
Introduction     340
Sensor-System Configuration     342
Optical Biosensors     346
Relationship between parameters     347
Modelling of parameters     351
Discussion     356
Conclusion     358
References     359
Epilogue     361
References     364
Index     365
