## Data Mining: Multimedia, Soft Computing, and Bioinformatics

A primer on traditional hard and emerging soft computing approaches for mining multimedia data

While the digital revolution has made huge volumes of high-dimensional multimedia data available, it has also challenged users to extract the information they seek from heretofore unthinkably huge datasets. Traditional hard computing data mining techniques have concentrated on flat-file applications. Soft computing tools (such as fuzzy sets, artificial neural networks, genetic algorithms, and rough sets), however, offer the opportunity to apply a wide range of data types to a variety of vital functions by handling real-life uncertainty with low-cost solutions.

Data Mining: Multimedia, Soft Computing, and Bioinformatics provides an accessible introduction to fundamental and advanced data mining technologies. This readable survey describes data mining strategies for a slew of data types, including numeric and alphanumeric formats, text, images, video, graphics, and the mixed representations therein. Along with traditional concepts and functions of data mining (like classification, clustering, and rule mining), the authors highlight topical issues in multimedia applications and bioinformatics.

Principal topics discussed throughout the text include:

- The role of soft computing and its principles in data mining
- Principles and classical algorithms on string matching and their role in data (mainly text) mining
- Data compression principles for both lossless and lossy techniques, including their scope in data mining
- Access of data using matching pursuits both in raw and compressed data domains
- Application in mining biological databases

### From inside the book


9.2.3 Mathematical modeling of documents. The text data can be loosely considered as a composition of two basic units, namely, document and term [2, 5]. In the general sense, a document is a structured or semistructured segment of a text.
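The document and term units described in this excerpt can be illustrated with a small document-term frequency matrix. The following is a minimal sketch, not the book's own construction: the function name `term_frequency_matrix`, the whitespace tokenization, and the lowercasing are all assumptions made for illustration.

```python
from collections import Counter


def term_frequency_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] is the number of
    times vocabulary[j] occurs in documents[i].

    Assumption: a "term" is a lowercased, whitespace-separated token.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives a stable column order.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


docs = ["data mining of text data", "text compression"]
vocab, F = term_frequency_matrix(docs)
print(vocab)  # ['compression', 'data', 'mining', 'of', 'text']
print(F)      # [[0, 2, 1, 1, 1], [1, 0, 0, 0, 1]]
```

Each row of `F` describes one document and each column one term, so the matrix has as many rows as there are documents and as many columns as there are distinct terms.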

Multimedia, Soft Computing, and Bioinformatics Sushmita Mitra, Tinku Acharya.

9.2.4 Similarity-based matching for documents and queries. When a document is modeled using the document-term frequency matrix representation or its variants, ...
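The excerpt breaks off before it names a similarity measure. As an illustration only, the sketch below uses cosine similarity, a common choice for comparing a query with rows of a document-term frequency matrix; whether the book uses this measure is an assumption, and the names `cosine_similarity` and `rank_documents` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document or query matches nothing
    return dot / (norm_u * norm_v)


def rank_documents(matrix, query):
    """Return (order, scores): document indices sorted by descending
    similarity to the query, plus the raw score of each document."""
    scores = [cosine_similarity(row, query) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Rows are documents, columns are terms (illustrative values).
F = [[0, 2, 1, 1, 1],
     [1, 0, 0, 0, 1]]
query = [1, 0, 0, 0, 1]  # the query uses the same term columns as F
order, scores = rank_documents(F, query)
print(order)  # [1, 0]
```

The query is expressed in the same term space as the documents, so matching reduces to comparing vectors row by row.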

The dimensionality of the original document-term frequency matrix F is often prohibitively large. The Latent Semantic Analysis (LSA) approximates the original M x N document-term frequency matrix F to a much smaller matrix of size N x K, ...
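The excerpt is truncated, so the book's exact construction is not visible here. A common way to carry out LSA is a truncated singular value decomposition, sketched below with NumPy under the assumption that rows of F are documents and columns are terms. The function name `lsa` and the returned names are hypothetical.

```python
import numpy as np


def lsa(F, k):
    """Rank-k truncated SVD of an M x N document-term matrix F.

    Returns:
      doc_coords  (M x k): documents in the k-dimensional latent space
      term_coords (N x k): terms in the same latent space
      F_k         (M x N): rank-k approximation of F
    """
    F = np.asarray(F, dtype=float)
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]  # keep the k largest singular values
    doc_coords = U_k * s_k       # scale each latent axis by its singular value
    term_coords = Vt_k.T
    F_k = doc_coords @ Vt_k
    return doc_coords, term_coords, F_k


# Illustrative 3 x 5 matrix: 3 documents, 5 terms.
F = [[0, 2, 1, 1, 1],
     [1, 0, 0, 0, 1],
     [0, 1, 1, 0, 0]]
doc_coords, term_coords, F_k = lsa(F, k=2)
print(doc_coords.shape, term_coords.shape, F_k.shape)  # (3, 2) (5, 2) (3, 5)
```

With K much smaller than M and N, similarity computations can be run on the reduced coordinates instead of the full matrix.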


### Contents

Soft Computing

Multimedia Data Compression

standard

Copyright

9 other sections not shown

### Other editions

Data Mining: Multimedia, Soft Computing, and Bioinformatics. Sushmita Mitra, Tinku Acharya. Limited preview - 2005

Data Mining: Multimedia, Soft Computing, and Bioinformatics. Sushmita Mitra, Tinku Acharya. No preview available - 2005