Foundations of Data Science

Cambridge University Press, Jan 23, 2020 - Computers - 424 pages
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
 

Contents

Introduction
Best-Fit Subspaces and Singular Value Decomposition (SVD)
Random Walks and Markov Chains
Machine Learning
Streaming
Clustering
Random Graphs
Topic Models, Nonnegative Matrix Factorization, Hidden Markov
Other Topics
Wavelets
Background Material
References

About the author (2020)

Avrim Blum is Chief Academic Officer at the Toyota Technological Institute at Chicago and was formerly a Professor at Carnegie Mellon University, Pennsylvania. He has over 25,000 citations for his work in algorithms and machine learning. He has received the AI Journal Classic Paper Award, ICML/COLT 10-Year Best Paper Award, Sloan Fellowship, NSF NYI award, and Herb Simon Teaching Award, and is a Fellow of the Association for Computing Machinery (ACM).

John Hopcroft is a member of the National Academy of Sciences and National Academy of Engineering, and a foreign member of the Chinese Academy of Sciences. He received the Turing Award in 1986, was appointed to the National Science Board in 1992 by President George H. W. Bush, and was presented with the Friendship Award by Premier Li Keqiang for his work in China.

Ravi Kannan is Principal Researcher at Microsoft Research, India. He received the Fulkerson Prize in Discrete Mathematics (1991) and the ACM Knuth Prize (2011). He is a distinguished alumnus of the Indian Institute of Technology, Bombay, and his past faculty appointments include Massachusetts Institute of Technology, Carnegie Mellon University, Pennsylvania, Yale University, Connecticut, and the Indian Institute of Science.
