An Introduction to Statistical Learning: with Applications in R

Springer Science & Business Media, Jun 24, 2013 - Mathematics - 426 pages

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform.

Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

 

What people are saying

LibraryThing Review

User Review  - Mohammedkb - LibraryThing

I was lucky to attend a MOOC course delivered by the authors of this book, Trevor Hastie and Robert Tibshirani, which was offered by Stanford University. The book presents a balanced amount of theory ...

User Review

I first started to study econometrics in 1977, when it was all about Statistical Inference. Henri Theil's textbook defined the gold standard in econometric studies at the time. Throughout my career, I have seen the sample sizes that I worked with grow from 30 to 300,000. In the world of 300K samples, the old rules simply do not apply. I am grateful to Gareth James and company for giving me a way forward in this new world.

Contents

1 Introduction, p. 1
2 Statistical Learning, p. 15
3 Linear Regression, p. 59
4 Classification, p. 127
5 Resampling Methods, p. 174
6 Linear Model Selection and Regularization, p. 203
7 Moving Beyond Linearity, p. 265
8 Tree-Based Methods, p. 302
9 Support Vector Machines, p. 337
10 Unsupervised Learning, p. 373
Index, p. 419
Copyright

About the author (2013)

Gareth James is a professor of data sciences and operations at the University of Southern California. He has published an extensive body of methodological work in the domain of statistical learning with particular emphasis on high-dimensional and functional data. The conceptual framework for this book grew out of his MBA elective courses in this area.

Daniela Witten is an associate professor of statistics and biostatistics at the University of Washington. Her research focuses largely on statistical machine learning in the high-dimensional setting, with an emphasis on unsupervised learning.


Trevor Hastie and Robert Tibshirani are professors of statistics at Stanford University, and are co-authors of the successful textbook The Elements of Statistical Learning. Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap.
