Course Summary
This course introduces learners to the big data and machine learning concepts and tools used in data analysis, prediction, and decision-making. Students gain hands-on experience through programming assignments and learn to apply these concepts to real-world problems.
Key Learning Points
- Learn the basics of big data and machine learning
- Gain hands-on experience through programming assignments
- Apply concepts to real-world problems
Learning Outcomes
- Understand the basics of big data and machine learning
- Gain hands-on experience through programming assignments
- Apply concepts to real-world problems
Prerequisites and Recommended Background
- Basic programming skills in Python
- Familiarity with statistics
Course Difficulty Level
Intermediate
Course Format
- Online
- Self-paced
Similar Courses
- Data Science Essentials
- Data Science Methodology
Description
Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems.
Outline
- Welcome
- Welcome to Machine Learning With Big Data
- Summary of Big Data Integration and Processing
- Introduction to Machine Learning with Big Data
- Machine Learning Overview
- Categories Of Machine Learning Techniques
- Machine Learning Process
- Goals and Activities in the Machine Learning Process
- CRISP-DM
- Scaling Up Machine Learning Algorithms
- Tools Used in this Course
- Slides: Machine Learning Overview and Applications
- Downloading, Installing and Using KNIME
- Downloading and Installing the Cloudera VM Instructions (Windows)
- Downloading and Installing the Cloudera VM Instructions (Mac)
- Instructions for Downloading Hands On Datasets
- Instructions for Starting Jupyter
- PDFs of Readings for Week 1 Hands-On
- Machine Learning Overview
- Data Exploration
- Data Terminology
- Data Exploration
- Data Exploration through Summary Statistics
- Data Exploration through Plots
- Exploring Data with KNIME Plots
- Data Exploration in Spark
- Slides: Data Exploration Overview and Terminology
- Description of Daily Weather Dataset
- Exploring Data with KNIME Plots
- Data Exploration in Spark
- PDFs of Activities for Data Exploration Hands-On Readings
- Data Exploration
- Data Exploration in KNIME and Spark Quiz
- Data Preparation
- Data Preparation
- Data Quality
- Addressing Data Quality Issues
- Feature Selection
- Feature Transformation
- Dimensionality Reduction
- Handling Missing Values in KNIME
- Handling Missing Values in Spark
- Slides: Data Preparation for Machine Learning
- Handling Missing Values in KNIME
- Handling Missing Values in Spark
- PDFs for Data Preparation Hands-On Readings
- Data Preparation
- Handling Missing Values in KNIME and Spark Quiz
- Classification
- Classification
- Building and Applying a Classification Model
- Classification Algorithms
- k-Nearest Neighbors
- Decision Trees
- Naïve Bayes
- Classification using Decision Tree in KNIME
- Classification in Spark
- Slides: What is Classification?
- Slides: Classification Algorithms
- Classification using Decision Tree in KNIME
- Interpreting a Decision Tree in KNIME
- Instructions for Changing the Number of Cloudera VM CPUs
- Classification in Spark
- PDFs for Classification Hands-On Readings
- Classification
- Classification in KNIME and Spark Quiz
- Evaluation of Machine Learning Models
- Generalization and Overfitting
- Overfitting in Decision Trees
- Using a Validation Set
- Metrics to Evaluate Model Performance
- Confusion Matrix
- Evaluation of Decision Tree in KNIME
- Evaluation of Decision Tree in Spark
- Slides: Overfitting: What is it and how would you prevent it?
- Slides: Model evaluation metrics and methods
- Evaluation of Decision Tree in KNIME
- Completed KNIME Workflows
- Evaluation of Decision Tree in Spark
- Comparing Classification Results for KNIME and Spark
- PDFs for Evaluation of Machine Learning Models Hands-On Readings
- Model Evaluation
- Model Evaluation in KNIME and Spark Quiz
- Regression, Cluster Analysis, and Association Analysis
- Regression Overview
- Linear Regression
- Cluster Analysis
- k-Means Clustering
- Association Analysis
- Association Analysis in Detail
- Machine Learning With Big Data - Final Remarks
- Cluster Analysis in Spark
- Slides: Regression
- Slides: Cluster Analysis
- Slides: Association Analysis
- Description of Minute Weather Dataset
- Cluster Analysis in Spark
- PDFs of Cluster Analysis in Spark Hands-On Readings
- Regression, Cluster Analysis, & Association Analysis
- Cluster Analysis in Spark Quiz
Summary of User Reviews
Discover the fascinating world of big data and machine learning with this comprehensive course on Coursera. Students rave about the engaging lectures, hands-on assignments, and practical applications of the concepts learned. One of the key highlights of the course is the emphasis on real-world examples and case studies, which help students understand the concepts in a more meaningful way.
Pros from User Reviews
- Engaging lectures
- Hands-on assignments
- Practical applications
- Real-world examples
- Case studies
Cons from User Reviews
- Some concepts may be too advanced for beginners
- Course material can be overwhelming at times
- Requires a significant time commitment
- Limited interaction with the instructor
- Not all topics are covered in depth