Applied Data Science for Data Analysts

  • 4.2 Rating
Approx. 16 hours to complete

Course Summary

This course is designed for data analysts who want to learn how to apply data science techniques to their work. It covers topics such as data cleaning, data visualization, and machine learning.

Key Learning Points

  • Learn how to clean and analyze data using Python
  • Master data visualization techniques to communicate insights effectively
  • Apply machine learning algorithms to solve real-world problems

Job Positions & Salaries that people who have taken this course might have

    • USA: $62,453
    • India: ₹5,51,120
    • Spain: €29,430
    • USA: $62,453
    • India: ₹5,51,120
    • Spain: €29,430

    • USA: $113,309
    • India: ₹9,98,278
    • Spain: €53,332
    • USA: $62,453
    • India: ₹5,51,120
    • Spain: €29,430

    • USA: $113,309
    • India: ₹9,98,278
    • Spain: €53,332

    • USA: $70,478
    • India: ₹6,19,930
    • Spain: €33,142

Learning Outcomes

  • Ability to clean and analyze data using Python
  • Mastery of data visualization techniques to communicate insights effectively
  • Ability to apply machine learning algorithms to solve real-world problems

Prerequisites and recommended knowledge before taking this course

  • Basic knowledge of Python programming
  • Familiarity with data analysis concepts

Course Difficulty Level

Intermediate

Course Format

  • Online
  • Self-paced
  • Video Lectures
  • Hands-on Projects

Similar Courses

  • Applied Data Science with Python
  • Data Science Essentials

Description

In this course, you will develop your data science skills while solving real-world problems. You'll work through the data science process and use unsupervised learning to explore data, engineer and select meaningful features, and solve complex supervised learning problems using tree-based models. You will also learn to apply hyperparameter tuning and cross-validation strategies to improve model performance.

Knowledge

  • Explore data using unsupervised machine learning.
  • Solve complex supervised learning problems using tree-based models.
  • Apply hyperparameter tuning and cross-validation strategies to improve model performance.
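
To make the last two points concrete, here is a minimal sketch of grid search with k-fold cross-validation over one hyperparameter (the maximum depth) of a tiny one-feature regression tree. It is written in plain Python for illustration and is not taken from the course, which works on Databricks; every function name, the synthetic staircase data, and the candidate depths are assumptions made for this example.

```python
import random
from statistics import mean


def fit_tree(xs, ys, max_depth):
    """Fit a 1-D regression tree: a leaf is a float, a node is (threshold, left, right)."""
    values = sorted(set(xs))
    if max_depth == 0 or len(values) < 2:
        return mean(ys)
    best_sse, best_t = None, None
    # Candidate thresholds are midpoints between adjacent distinct feature values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        sse = (sum((y - mean(left)) ** 2 for y in left)
               + sum((y - mean(right)) ** 2 for y in right))
        if best_sse is None or sse < best_sse:
            best_sse, best_t = sse, t
    left = [(x, y) for x, y in zip(xs, ys) if x <= best_t]
    right = [(x, y) for x, y in zip(xs, ys) if x > best_t]
    return (best_t,
            fit_tree([x for x, _ in left], [y for _, y in left], max_depth - 1),
            fit_tree([x for x, _ in right], [y for _, y in right], max_depth - 1))


def predict(tree, x):
    """Walk from the root down to a leaf and return its value."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_mse(xs, ys, max_depth, k=5):
    """Mean squared error averaged over k held-out folds."""
    fold_scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        tree = fit_tree([xs[i] for i in train], [ys[i] for i in train], max_depth)
        fold_scores.append(mean((predict(tree, xs[i]) - ys[i]) ** 2 for i in held_out))
    return mean(fold_scores)


def grid_search(xs, ys, depths, k=5):
    """Score every candidate depth by cross-validation and return the best one."""
    results = {d: cv_mse(xs, ys, d, k) for d in depths}
    return min(results, key=results.get), results


# Synthetic staircase data (an assumption for this sketch): y = floor(x) plus noise.
rng = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [float(int(x)) + rng.gauss(0, 0.1) for x in xs]

best_depth, scores = grid_search(xs, ys, depths=[0, 1, 2, 4])
print(best_depth, scores)
```

Because each candidate depth is scored only on folds that were held out of training, the comparison reflects how well the tree generalizes rather than how closely it fits the training data. The same idea carries over to random forests and to library implementations of grid search.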

Outline

  • Welcome to the Course
  • Course introduction
  • Review of Data Science
  • Review of Machine Learning
  • Data Science Process vs. Machine Learning Workflow
  • Introduction to Databricks (Optional)
  • Introduction to the Platform (Optional)
  • Introduction to Apache Spark (Optional)
  • Introduction to Delta Lake (Optional)
  • Before you begin
  • Hands-on with Databricks Lab (Optional)
  • Course Introduction and Prerequisites
  • Applied Unsupervised Learning
  • Lesson Introduction
  • Exploring Data
  • Visualizing Data
  • Introduction to K-means Clustering
  • Applied K-means Clustering
  • Identifying the Number of Clusters
  • Identifying the Number of Clusters Demo
  • Utilizing Clusters
  • Lesson Introduction
  • Feature Relationships
  • Correlation Matrix
  • Introduction to Principal Components Analysis
  • Applied Principal Components Analysis
  • PCA for Feature Relationships
  • PCA for Dimensionality Reduction
  • K-means Clustering Lab
  • Principal Components Analysis Lab
  • Exploring and Visualizing Data
  • K-means Clustering
  • K-means Clustering Lab Results
  • Feature Correlation
  • Principal Components Analysis
  • PCA Lab Results
  • Feature Engineering and Selection
  • Lesson Introduction
  • Introduction to Feature Engineering
  • Common Feature Improvements
  • Handling Missing Values
  • Imputing Missing Values
  • Feature Scaling
  • Converting Feature Types
  • Representing Categorical Features
  • One-hot Encoding
  • Lesson Introduction
  • Problems with High Dimensions and Dimensionality Reduction
  • A Review of Feature Importance
  • Linear Regression Coefficients and P-values
  • Introduction to Feature Selection
  • Regularization
  • Regularized Regression
  • Applied Regularized Regression
  • Feature Engineering Lab
  • Feature Selection Lab
  • Feature Engineering Concepts
  • Missing Values
  • Feature Engineering Lab Results
  • Dimensionality and Feature Importance
  • Feature Selection in Linear Regression
  • Feature Selection Lab Results
  • Applied Tree-based Models
  • Lesson Introduction
  • A Review of Decision Trees
  • Algorithm Selection
  • String Indexing Categorical Features
  • Decision Tree Pruning
  • Lesson Introduction
  • Introduction to Ensemble Modeling
  • Bootstrap Sampling Training Data
  • Applied Random Forest
  • Lesson Introduction
  • A Review of Classification Evaluation Metrics
  • A Review of Assigning Classes
  • Oversampling and Undersampling Classes
  • Weighting Classes in Random Forest
  • Feature Engineering in Decision Trees
  • Preventing Overfitting
  • Applied Decision Trees Lab
  • Aggregating Bootstrapped Results
  • Random Forest Algorithm
  • Applied Random Forest Lab
  • Problems with Class Imbalance
  • Label-based Bootstrap Sampling
  • Label-based Evaluation Weighting
  • Label Imbalance Lab
  • Algorithm Selection and Decision Trees
  • Categorical Features
  • Applied Decision Trees Lab Results
  • Tree-based Ensemble Modeling
  • Bootstrap Aggregation
  • Applied Random Forest Lab Results
  • Classification Evaluation
  • Label Imbalance and Sampling
  • Label Imbalance Lab Results
  • Model Optimization
  • Lesson Introduction
  • Introduction to Hyperparameters
  • Hyperparameters in Tree-based Models
  • Optimizing Hyperparameters
  • Grid Search for Hyperparameter Optimization
  • Validation Set
  • Grid-search for Random Forests
  • Lesson Introduction
  • A Review of Model Generalization
  • Validation Set Limitations
  • Introduction to Cross-Validation
  • K-fold Cross-Validation with Random Forest
  • Other Cross-Validation Strategies
  • Hyperparameter Search Lab
  • Cross-Validation Lab
  • Hyperparameters in Tree-based Models
  • Grid Search
  • Hyperparameter Search Lab Results
  • Model Generalization and Validation Set
  • Cross-Validation
  • Cross-Validation Lab Results

Summary of User Reviews

Read reviews on Coursera's Applied Data Science for Data Analysts course. Overall, users found the course to be informative and engaging, with a strong emphasis on practical skills. Many praised the hands-on approach to learning.

Key Aspect Users Liked About This Course

Hands-on approach to learning

Pros from User Reviews

  • Clear and concise explanations
  • Practical skills that can be applied immediately
  • Engaging instructors with industry experience

Cons from User Reviews

  • Some technical difficulties with the online platform
  • Not enough focus on theoretical concepts
  • Limited interaction with other students
  • Some assignments were too difficult or time-consuming
English
Available now
Kevin Coyle, Mark Roepke, Emma Freeman
Databricks
Coursera

Instructor

Kevin Coyle

  • 4.2 Rating