## Handbook of Statistics: Data Mining and Data Visualization

This book focuses on dealing with large-scale data, a field commonly referred to as data mining. The book is divided into three sections. The first introduces the statistical aspects of data mining and machine learning and includes applications to text analysis, computer intrusion detection, and hiding of information in digital files. The second section focuses on a variety of statistical methodologies that have proven to be effective in data mining applications. These include clustering, classification, multivariate density estimation, tree-based methods, pattern recognition, outlier detection, genetic algorithms, and dimensionality reduction. The third section focuses on data visualization and covers visualization of high-dimensional data, novel graphical techniques with a focus on human factors, interactive graphics, and data visualization using virtual reality. The contributors form a thorough cross section of internationally renowned thinkers who are inventing methods for dealing with a new data paradigm.

Key Features:

- Distinguished contributors who are international experts in aspects of data mining
- Includes data mining approaches to non-numerical data, including text data, Internet traffic data, and geographic data
- Highly topical discussions reflecting current thinking on contemporary technical issues, e.g. streaming data
- Discusses taxonomy of dataset sizes, computational complexity, and scalability, topics usually ignored in most discussions
- Thorough discussion of data visualization issues blending statistical, human factors, and computational insights

### Contents

- 3 Mining Computer Security Data
- 4 Data Mining of Text Files
- 5 Text Data Mining with Minimal Spanning Trees
- Steganography and Steganalysis
- 7 Canonical Variate Analysis and Related Methods for Reduction of Dimensionality and Graphical Representation
- 8 Pattern Recognition
- 9 Multidimensional Density Estimation
- 10 Multivariate Outlier Detection and Robustness
- 11 Classification and Regression Trees, Bagging, and Boosting
- 12 Fast Algorithms for Classification Using Class Cover Catch Digraphs
- 13 On Genetic Algorithms and their Applications
- 14 Computational Methods for High-Dimensional Rotations in Data Visualization
- 15 Some Recent Graphics Templates and Software for Showing Statistical Summaries
- the Paradigm of Linked Views
- 17 Data Visualization and Virtual Reality
- back matter
- Contents of Previous Volumes

### Common terms and phrases

applications, approach, attributes, bar chart, bits, boxplots, C.R. Rao, canonical coordinates, classifier, clustering, color figures section, color reproduction, corresponding, covariance, crossover, data analysis, data mining, data set, data visualization, database, defined, density estimation, detection, dimensions, distance, distribution, dominating set, example, exploratory data analysis, frame, function, genetic algorithms, graph, graphical elements, groups, Hellinger distance, highlighting, histogram, IEEE, implemented, interactive statistical, kernel, knowledge mining, learning sample, linear, linking, LM plots, Machine Learning, matrix, measure, methods, Michalski, misclassification rate, mosaic plot, multivariate, nodes, nonparametric, objects, observations, operator, optimal, Order Statistics, outliers, packets, parallel coordinates, parameters, pattern, pixels, prediction, principal components, problem, profiles, projection, random, regression, represent, robust, rotations, rule, sample population, scatterplot, selection, sequence, shows, space, split, statistical graphics, steganalysis, steganography, strings, structure, subset, techniques, tree, values, variables, vector, Wegman

### Popular passages

Page 19 - Equal to; != Not equal to; > Greater than; < Less than; >= Greater than or equal to...