How to Win a Data Science Competition: Learn from Top Kagglers

  • 4.7
Approx. 54 hours to complete

Course Summary

Learn how to analyze and manipulate data, build predictive models, and compete in global data science competitions with this course.

Key Learning Points

  • Gain hands-on experience in data analysis and machine learning
  • Learn from top data scientists and compete in real-world competitions
  • Build a portfolio of projects to showcase your skills to potential employers

Learning Outcomes

  • Develop a strong foundation in data science concepts and techniques
  • Gain practical experience in analyzing and manipulating data
  • Build a portfolio of projects to showcase your skills to potential employers

Prerequisites or good to have knowledge before taking this course

  • Basic programming knowledge in Python
  • Familiarity with data structures and algorithms

Course Difficulty Level

Intermediate

Course Format

  • Online
  • Self-paced
  • Project-based

Similar Courses

  • Applied Data Science with Python
  • Machine Learning

Description

If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience and improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. At the same time, you get to do it in a competitive context against thousands of participants, each of whom tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Achieving high ranks consistently can help you accelerate your career in data science.

Outline

  • Introduction & Recap
  • About the University
  • Introduction
  • Meet your lecturers
  • Course overview
  • Competition Mechanics
  • Kaggle Overview [screencast]
  • Real World Application vs Competitions
  • Recap of main ML algorithms
  • Software/Hardware Requirements
  • About University
  • Rules on the academic integrity in the course
  • Welcome!
  • Week 1 overview
  • Disclaimer
  • Explanation for quiz questions
  • Additional Materials and Links
  • Explanation for quiz questions
  • Additional Material and Links
  • Practice Quiz
  • Recap
  • Recap
  • Software/Hardware
  • Graded Soft/Hard Quiz
  • Feature Preprocessing and Generation with Respect to Models
  • Overview
  • Numeric features
  • Categorical and ordinal features
  • Datetime and coordinates
  • Handling missing values
  • Bag of words
  • Word2vec, CNN
  • Explanation for quiz questions
  • Additional Material and Links
  • Explanation for quiz questions
  • Additional Material and Links
  • Feature preprocessing and generation with respect to models
  • Feature preprocessing and generation with respect to models
  • Feature extraction from text and images
  • Feature extraction from text and images
  • Final Project Description
  • Final project overview
  • Final project
  • Final project advice #1
  • Exploratory Data Analysis
  • Exploratory data analysis
  • Building intuition about the data
  • Exploring anonymized data
  • Visualizations
  • Dataset cleaning and other things to check
  • Springleaf competition EDA I
  • Springleaf competition EDA II
  • Numerai competition EDA
  • Week 2 overview
  • Additional material and links
  • Exploratory data analysis
  • Validation
  • Validation and overfitting
  • Validation strategies
  • Data splitting strategies
  • Problems occurring during validation
  • Validation strategies
  • Comments on quiz
  • Additional material and links
  • Validation
  • Validation
  • Data Leakages
  • Basic data leaks
  • Leaderboard probing and examples of rare data leaks
  • Expedia challenge
  • Comments on quiz
  • Additional material and links
  • Final project advice #2
  • Data leakages
  • Metrics Optimization
  • Motivation
  • Regression metrics review I
  • Regression metrics review II
  • Classification metrics review
  • General approaches for metrics optimization
  • Regression metrics optimization
  • Classification metrics optimization I
  • Classification metrics optimization II
  • Week 3 overview
  • Comments on quiz
  • Additional material and links
  • Metrics
  • Metrics
  • Advanced Feature Engineering I
  • Concept of mean encoding
  • Regularization
  • Extensions and generalizations
  • Comments on quiz
  • Final project advice #3
  • Mean encodings
  • Hyperparameter Optimization
  • Hyperparameter tuning I
  • Hyperparameter tuning II
  • Hyperparameter tuning III
  • Practical guide
  • KazAnova's competition pipeline, part 1
  • KazAnova's competition pipeline, part 2
  • Week 4 overview
  • Comments on quiz
  • Additional material and links
  • Additional materials and links
  • Practice quiz
  • Graded quiz
  • Advanced feature engineering II
  • Statistics and distance based features
  • Matrix factorizations
  • Feature Interactions
  • t-SNE
  • Comments on quiz
  • Additional Materials and Links
  • Graded Advanced Features II Quiz
  • Ensembling
  • Introduction into ensemble methods
  • Bagging
  • Boosting
  • Stacking
  • StackNet
  • Ensembling Tips and Tricks
  • CatBoost 1
  • CatBoost 2
  • Validation schemes for 2-nd level models
  • Comments on quiz
  • Additional materials and links
  • Final project advice #4
  • Ensembling
  • Ensembling
  • Competitions go through
  • Crowdflower Competition
  • Springleaf Marketing Response
  • Microsoft Malware Classification Challenge
  • Walmart: Trip Type Classification
  • Acquire Valued Shoppers Challenge, part 1
  • Acquire Valued Shoppers Challenge, part 2
  • Week 5 overview
  • Additional material and links
  • Final Project

Summary of User Reviews

The Competitive Data Science course on Coursera has received great reviews from users. Many have praised the course for its comprehensive curriculum and practical approach to learning. Overall, users have found the course to be highly beneficial for their data science careers.

Key Aspect Users Liked About This Course

The course provides hands-on experience with real-world datasets.

Pros from User Reviews

  • Comprehensive curriculum covering a wide range of topics in data science
  • Practical approach to learning with real-world datasets and projects
  • Engaging and knowledgeable instructors who are responsive to student needs

Cons from User Reviews

  • Some users have found the course to be too challenging for beginners
  • The course can require a significant time commitment
  • The course may require additional resources or outside help for certain topics
English
Available now
Dmitry Ulyanov, Alexander Guschin, Mikhail Trofimov, Dmitry Altukhov, Marios Michailidis
HSE University
Coursera

Instructor

Dmitry Ulyanov

  • 4.7 Rating