Big Data Integration and Processing

  • Rating: 4.4
Approx. 18 hours to complete

Course Summary

Learn how to integrate and process big data using Hadoop and Spark frameworks with this course. Gain hands-on experience with real-world big data projects and learn from industry experts.

Key Learning Points

  • Understand the basics of big data integration and processing
  • Gain hands-on experience with Hadoop and Spark frameworks
  • Learn from industry experts and real-world big data projects

Learning Outcomes

  • Understand the basics of big data integration and processing
  • Gain hands-on experience with Hadoop and Spark
  • Be able to apply knowledge to real-world big data projects

Prerequisites and recommended background

  • Basic understanding of programming
  • Familiarity with SQL

Course Difficulty Level

Intermediate

Course Format

  • Online self-paced course
  • Video lectures
  • Hands-on projects

Similar Courses

  • Big Data Analytics
  • Data Science Essentials

Notable People in This Field

  • Creator of Hadoop
  • Creator of Spark

Description

At the end of the course, you will be able to:

Outline

  • Welcome to Big Data Integration and Processing
  • What is in this Course?
  • Summary of Big Data Modeling and Management
  • Why is Big Data Processing Different?
  • Slides: Summary & Why Is Big Data Processing Different
  • Downloading and Installing the Cloudera VM Instructions (Windows)
  • Downloading and Installing the Cloudera VM Instructions (Mac)
  • Software Installation Frequently Asked Questions (FAQ)
  • Instructions for Downloading Hands On Datasets
  • Instructions for Starting Jupyter
  • Retrieving Big Data (Part 1)
  • What is Data Retrieval? Part 1
  • What is Data Retrieval? Part 2
  • Querying Two Relations
  • Subqueries
  • Querying Relational Data with Postgres
  • Slides: What is Data Retrieval?
  • Querying Relational Data with Postgres
  • Retrieving Big Data (Part 2)
  • Querying JSON Data with MongoDB
  • Aggregation Functions
  • Querying Aerospike
  • Querying Documents in MongoDB
  • Exploring Pandas DataFrames
  • Slides: Querying Data Part 2
  • Querying Documents in MongoDB
  • Exploring Pandas DataFrames
  • Retrieving Big Data Quiz
  • Postgres, MongoDB, and Pandas
  • Big Data Integration
  • Overview of Information Integration
  • A Data Integration Scenario
  • Integration for Multichannel Customer Analytics
  • Big Data Management and Processing Using Splunk and Datameer
  • Why Splunk?
  • Connected Cars with Ford's OpenXC and Splunk
  • Big Data Management and Processing using Datameer
  • Installing Splunk Enterprise on Windows
  • Installing Splunk Enterprise on Linux
  • Exploring Splunk Queries
  • Optional: Creating Pivot Reports in Splunk
  • Slides: Information Integration
  • Downloading Splunk Enterprise
  • Exploring Splunk Queries
  • Optional: Instructions for Splunk Pivot Tutorial
  • Information Integration - Quiz
  • Hands-On With Splunk
  • Processing Big Data
  • Big Data Processing Pipelines
  • Some High-Level Processing Operations in Big Data Pipelines
  • Aggregation Operations in Big Data Pipelines
  • Typical Analytical Operations in Big Data Pipelines
  • Overview of Big Data Processing Systems
  • The Integration and Processing Layer
  • Introduction to Apache Spark
  • Getting Started with Spark
  • WordCount in Spark
  • Big Data Processing Pipelines Slides
  • Big Data Workflow Management
  • Slides for Big Data Processing Tools and Systems
  • WordCount in Spark
  • Pipeline and Tools
  • WordCount in Spark
  • Big Data Analytics using Spark
  • Spark Core: Programming In Spark using RDDs in Pipelines
  • Spark Core: Transformations
  • Spark Core: Actions
  • Spark SQL
  • Spark Streaming
  • Spark MLLib
  • Spark GraphX
  • Exploring SparkSQL and Spark DataFrames
  • Analyzing Sensor Data with Spark Streaming
  • Slides for Module 5 Lesson 1
  • Slides for Module 5 Lesson 2
  • Exploring SparkSQL and Spark DataFrames
  • Instructions for Configuring VirtualBox for Spark Streaming
  • Analyzing Sensor Data with Spark Streaming
  • More on Spark
  • SparkSQL and Spark Streaming
  • Learn By Doing: Putting MongoDB and Spark to Work
  • Let's Analyze Soccer Tweets!
  • Expressing Analytical Questions as MongoDB Queries
  • Exporting Data from MongoDB to a CSV File
  • Analyzing Tweets About Countries
  • Check Your Query Results
  • Check Your Analysis Results

Summary of User Reviews

Discover the world of big data integration and processing with this highly rated course on Coursera. The course covers various topics related to big data processing and has received positive reviews from many users.

Key Aspect Users Liked About This Course

The course provides practical examples and hands-on assignments that help users understand the concepts better.

Pros from User Reviews

  • The course provides a comprehensive overview of big data processing and integration.
  • The instructor is knowledgeable and provides clear explanations.
  • The hands-on assignments help users apply the concepts learned in the course.
  • The course is well-structured and easy to follow.
  • The course provides useful resources and references that users can use to further their understanding of the topic.

Cons from User Reviews

  • Some users found the course to be too technical and challenging.
  • The course may not be suitable for beginners who have no prior experience with big data processing.
  • Some users found the course to be too theoretical and lacking in practical examples.
  • The course may require a significant time commitment from users.
  • The course may not provide enough support for users who need additional assistance.

Language: English
Availability: Available now
Instructors: Ilkay Altintas, Amarnath Gupta
Institution: University of California San Diego
Platform: Coursera

Instructor

Ilkay Altintas

  • 4.4 Rating