Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. This book teaches you techniques for both data manipulation and visualization and shows you the best way to develop new software packages for R.
Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You’ll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this.
This book is based on lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually at least a minimum of programming.
This book is for those with some data science or analytics background, but not necessarily experience with the R programming language.
Chapter 1: Introduction to R Programming
Chapter 2: Reproducible Analysis
Chapter 3: Data Manipulation
Chapter 4: Visualizing Data
Chapter 5: Working with Large Datasets
Chapter 6: Supervised Learning
Chapter 7: Unsupervised Learning
Chapter 8: More R Programming
Chapter 9: Advanced R Programming
Chapter 10: Object Oriented Programming
Chapter 11: Building an R Package
Chapter 12: Testing and Package Checking
Chapter 13: Version Control
Chapter 14: Profiling and Optimizing