About this course:
This course teaches students to use Microsoft R Server to create and run analyses on large datasets, and shows how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.
The primary audience for this course is people who wish to analyze large datasets within a big data environment.
The secondary audience is developers who need to integrate R analyses into their solutions.
In addition to their professional experience, students who attend this course should have:
- Programming experience using R, and familiarity with common R packages.
- Knowledge of common statistical methods and data analysis best practices.
- Basic knowledge of the Microsoft Windows operating system and its core functionality.
After completing this course, students will be able to:
- Explain how Microsoft R Server and Microsoft R Client work.
- Use R Client with R Server to explore big data held in different data stores.
- Visualize data by using graphs and plots.
- Transform and clean large datasets.
- Implement options for splitting analysis jobs into parallel tasks.
- Build and evaluate regression models generated from big data.
- Create, score, and deploy partitioning models generated from big data.
- Use R in the SQL Server and Hadoop environments.