Data Science: Wrangling

8 weeks long

Brief Introduction

Learn to process and convert raw data into formats needed for analysis. 

Course Summary

Learn the art of data wrangling in this Harvard course. Gain insights into how to collect, clean, and transform data to make it more usable for analysis.

Key Learning Points

  • Learn how to collect and clean data to make it more usable for analysis
  • Gain insights into transforming data using a variety of techniques
  • Understand how to use data wrangling to make better business decisions

Learning Outcomes

  • Ability to collect and clean data for analysis
  • Understanding of how to transform data using various techniques
  • Ability to use data wrangling to make better business decisions

Prerequisites and Recommended Background

  • Basic knowledge of statistics and programming concepts
  • Access to a computer with an internet connection

Course Difficulty Level

Intermediate

Course Format

  • Online
  • Self-Paced

Similar Courses

  • Data Science Essentials
  • Data Science: Machine Learning

Notable People and Organizations in This Field

  • Hadley Wickham
  • Kaggle

Course description

In this course, part of our Professional Certificate Program in Data Science, we cover several standard steps of the data wrangling process, such as importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all of these wrangling steps necessary in a single analysis, but a data scientist will likely face each of them at some point.

Data is very rarely easy to access in a data science project. It is more likely to sit in a file or a database, or to need extracting from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse. The process of converting data from its raw form to tidy form is called data wrangling.

This process is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to uncover insights that would otherwise remain hidden.

Knowledge

What you'll learn

  • Importing data into R from different file formats
  • Web scraping
  • How to tidy data using the tidyverse to better facilitate analysis
  • String processing with regular expressions (regex) 
  • Wrangling data using dplyr
  • How to work with dates and times
  • Text mining

Summary of User Reviews

Find out what users are saying about Harvard's online course Data Science: Wrangling. The course has received high praise for its comprehensive curriculum, interactive learning experience, and expert instructors.

Key Aspect Users Liked About This Course

Many users have praised the course's comprehensive curriculum, which covers the data wrangling process in depth.

Pros from User Reviews

  • Comprehensive curriculum
  • Expert instructors
  • Interactive learning experience
  • Real-world applications
  • Flexible scheduling

Cons from User Reviews

  • Expensive tuition
  • Heavy workload
  • Technical difficulties with online platform
  • Limited interaction with instructors
  • Requires prior knowledge of programming and data analysis
Free*
English
27th Jan, 2020
30th Jun, 2021
Rafael Irizarry
Harvard University, Harvard T.H. Chan School of Public Health
Harvard University
