Getting and Cleaning Data

  • 4.6 Rating
Approx. 20 hours to complete

Course Summary

Learn how to effectively clean and organize data with this comprehensive course. Gain hands-on experience with real-world data sets and tools to prepare you for a career in data cleaning.

Key Learning Points

  • Understand the importance of data cleaning and its impact on data analysis
  • Learn how to identify and handle missing or incorrect data
  • Gain experience with popular data cleaning tools such as OpenRefine and Trifacta

Job Positions and Salaries People Who Have Taken This Course Might Have

  • Data Analyst
    • USA: $65,000
    • India: ₹6,00,000
    • Spain: €30,000
  • Data Quality Analyst
    • USA: $75,000
    • India: ₹8,00,000
    • Spain: €35,000
  • Data Engineer
    • USA: $95,000
    • India: ₹12,00,000
    • Spain: €45,000

Learning Outcomes

  • Understand the importance of data cleaning and its impact on data analysis
  • Gain hands-on experience with popular data cleaning tools
  • Learn techniques for identifying and handling missing or incorrect data

Prerequisites or good to have knowledge before taking this course

  • Basic understanding of data analysis
  • Familiarity with spreadsheets

Course Difficulty Level

Intermediate

Course Format

  • Self-paced
  • Online
  • Video lectures
  • Hands-on exercises

Similar Courses

  • Data Wrangling with MongoDB
  • Data Analysis with Python

Description

Before you can work with data, you have to get some. This course covers the basic ways to obtain data: from the web, from APIs, from databases, and from colleagues in various formats. It also covers the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speeds up downstream data analysis tasks. The course also covers the components of a complete data set, including raw data, processing instructions, codebooks, and processed data, along with the basics needed for collecting, cleaning, and sharing data.
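
As a rough illustration of what "tidy" means here, the sketch below reshapes a small wide table into a long one, following the common description of tidy data as one variable per column and one observation per row. The course itself works in R; this Python version, the sample table, and the helper name `to_tidy` are all hypothetical and only meant to show the idea.

```python
# Hypothetical wide-format table: one row per student, one column per quiz.
wide_rows = [
    {"student": "A", "quiz1": 8, "quiz2": 9},
    {"student": "B", "quiz1": 6, "quiz2": 7},
]


def to_tidy(rows, id_key, var_name, value_name):
    """Reshape wide rows into long rows with one observation per row."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier is repeated on each output row
            tidy.append({id_key: row[id_key], var_name: column, value_name: value})
    return tidy


tidy_rows = to_tidy(wide_rows, "student", "quiz", "score")
for r in tidy_rows:
    print(r)
```

In the long form, each row holds a single (student, quiz, score) observation, which tends to make later filtering, grouping, and merging steps simpler.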

Knowledge

  • Understand common data storage systems
  • Apply data cleaning basics to make data "tidy"
  • Use R for text and date manipulation
  • Obtain usable data from the web, APIs, and databases

Outline

  • Week 1
    • Obtaining Data Motivation
    • Raw and Processed Data
    • Components of Tidy Data
    • Downloading Files
    • Reading Local Files
    • Reading Excel Files
    • Reading XML
    • Reading JSON
    • The data.table Package
    • Welcome to Week 1
    • Syllabus
    • Pre-Course Survey
    • Practical R Exercises in swirl Part 1
    • Week 1 Quiz
  • Week 2
    • Reading from MySQL
    • Reading from HDF5
    • Reading from The Web
    • Reading From APIs
    • Reading From Other Sources
    • Week 2 Quiz
  • Week 3
    • Subsetting and Sorting
    • Summarizing Data
    • Creating New Variables
    • Reshaping Data
    • Managing Data Frames with dplyr - Introduction
    • Managing Data Frames with dplyr - Basic Tools
    • Merging Data
    • Practical R Exercises in swirl Part 2
    • Week 3 Quiz
  • Week 4
    • Editing Text Variables
    • Regular Expressions I
    • Regular Expressions II
    • Working with Dates
    • Data Resources
    • Practical R Exercises in swirl Part 4
    • Post-Course Survey
    • Week 4 Quiz

Summary of User Reviews

Discover the art of data cleaning with this comprehensive course from Coursera. Learners rave about the course's clear structure and engaging content, making it easy to pick up new skills. Explore the ins and outs of data cleaning and learn how to prepare your data for analysis, all while earning a certificate of completion.

Key Aspect Users Liked About This Course

Clear structure and engaging content

Pros from User Reviews

  • Great for beginners to data cleaning
  • Instructors are knowledgeable and responsive
  • Hands-on exercises reinforce learning

Cons from User Reviews

  • Some learners found the course content too basic
  • Not much focus on advanced topics
  • Course may move too quickly for some
English
Available now
Jeff Leek, PhD, Roger D. Peng, PhD, Brian Caffo, PhD
Johns Hopkins University
Coursera

Instructor

Jeff Leek, PhD

  • 4.6 Rating