Learner Reviews & Feedback for ETL and Data Pipelines with Shell, Airflow and Kafka by IBM

4.5 stars (364 ratings)

About the Course

Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The contrasting approach is the Extract, Load, Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application.

In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both.

You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure.

By the end of this course, you will also know how to use Apache Airflow to build data pipelines and be knowledgeable about the advantages of this approach. You will also learn how to use Apache Kafka to build streaming pipelines, as well as the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module....
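The ETL/ELT contrast in the course description can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the course materials: the `transform` function, the sample records, and the `warehouse` and `lake` lists simply stand in for a real transformation step, a data warehouse, and a data lake.

```python
def transform(record):
    """Hypothetical cleaning step: tidy the name and parse the amount."""
    return {
        "name": record["name"].strip().title(),
        "amount": float(record["amount"]),
    }


# Made-up raw records standing in for data extracted from a source system.
raw = [
    {"name": " alice ", "amount": "10.5"},
    {"name": "BOB", "amount": "3"},
]

# ETL: transform first, then load the cleaned rows into the destination.
warehouse = [transform(r) for r in raw]

# ELT: load the raw rows as they are, and transform only when a consumer asks.
lake = list(raw)


def query_lake():
    """Apply the transformation on demand, at read time."""
    return [transform(r) for r in lake]


print(warehouse)
print(query_lake())
```

Both paths yield the same cleaned rows; the difference is when the transformation runs and whether the destination keeps the raw or the cleaned form.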

Top reviews

BN

Mar 30, 2023

Overall it's a good course. I wish I could use dos2unix, tr, or sed for removing ^M from the toll_data.tsv. The Final Assignment Instructions could have been clearer.
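The `^M` characters this reviewer mentions are carriage returns (`\r`) left over from Windows-style line endings. As a rough sketch of the cleanup that `tr -d '\r'` performs, the Python below strips them from a string. The sample rows are made up for illustration and are not the actual contents of toll_data.tsv.

```python
def strip_carriage_returns(text: str) -> str:
    """Remove every carriage return (displayed as ^M by many editors)."""
    return text.replace("\r", "")


# Hypothetical tab-separated rows with Windows-style (\r\n) line endings.
sample = "1\tcar\tA1\r\n2\ttruck\tB2\r\n"

cleaned = strip_carriage_returns(sample)
print(repr(cleaned))
```

For a real file, the same function could be applied to text read in binary mode and decoded, or with `newline=""` passed to `open`, so that Python does not translate the line endings before the function sees them.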

MB

Oct 11, 2022

The course is good, but adding more practicals would surely help all learners understand the material better and grasp things more quickly.

1 - 25 of 87 Reviews for ETL and Data Pipelines with Shell, Airflow and Kafka

By Dmitry K

•

Sep 17, 2021

Buggy practice labs. Not possible to complete without fixing the Airflow start script yourself. Nobody monitors or fixes issues here.

By Chris B

•

Apr 20, 2022

Course content is good, but the labs are riddled with bugs and in dire need of quality control. I encountered many time-consuming, frustrating technical issues that made completing this course a slog. The final assignment introduces some difficult Linux manipulations that were not covered in the course and are not really that relevant to the subject matter. Some questions on the final are unclear and could be better written. I would recommend that the instructors, or whoever created this course, eat their own cooking: go through the course and fix the various issues.

By Nataliya S

•

Oct 12, 2021

Thanks to IBM and Coursera for the great "ETL and Data Pipelines with Shell, Airflow and Kafka" course, which I passed with Grade Achieved: 100%. It's the third course I've passed as part of the "IBM Data Engineering Specialization". I was so carried away by the course that I literally sat up until 2 am almost every day. In this course I could apply my knowledge of Python, Pandas, SQL, and Bash commands to build ETL batch and stream pipelines.

By Tal M

•

Jul 17, 2022

The course is really basic; it only introduces the keywords and very high-level concepts of ETL. It barely discusses any technical challenges or constraints. Some of the questions in the quizzes are absurd.

By Benjamin A A

•

Aug 20, 2022

I cannot proceed with the "SUBMIT a DAG" lab as I am constantly being shown the error - "cp: cannot create regular file '/home/project/airflow/dags/my_first_dag.py': Permission denied" when I run the command - "cp my_first_dag.py $AIRFLOW_HOME/dags".

How are you expecting me to complete this lab when I am getting a permission denied error? Please fix this ASAP.

By RLee

•

Jan 13, 2022

The final project, which connects Airflow as a pipeline management tool to a Kafka server, is a very useful hands-on project. More details or explanations on the syntax of the Python code that calls the Kafka producer and consumer, found in the files toll_traffic_generator.py and streaming_data_reader.py, would be more valuable than just providing these two files to run on their own.

By Evgeny D

•

Sep 29, 2021

It's one of the most challenging courses I've been enrolled in!

By Santiago Z A

•

Sep 15, 2022

REALLY A GOOD COURSE BUT:

- Labs are not debugged (inaccuracies)

- I understand that Kafka is a broad technology and that it might take more than a week to cover in an appropriate way, but the labs were only about copying and pasting commands.

By Ilya K

•

Jan 13, 2022

Perfect environment for experimenting! Very easy and powerful to use.

By Omar H

•

Jan 26, 2022

It's a great introduction to Airflow and Kafka, but it is still only an introduction: it is shallow and doesn't offer much depth. By the end, though, you will understand what you need to continue further in both technologies.

By YANGYANG C

•

Jan 17, 2022

Love the labs, but do not like the robotic lectures.

By bengisu p

•

Aug 17, 2023

I can't understand some of the questions in quizzes. Moreover, the peer-to-peer grading system should be converted to automatic grading.

By Natale F

•

Dec 15, 2021

Interesting course with enough labs.

By Hugo A O O

•

Dec 6, 2021

I really liked the labs.

By Chris W

•

Apr 3, 2022

A decent overview of Airflow and Kafka. Worth it for the time invested. The labs were good, however the execution of the final assignment was poor -- you have to submit two dozen screen captures for a peer reviewed assignment. Taking screen caps of code is silly, why not just submit the code? Plus you are taking the caps before you even know if your code works. And you are relying on strangers to read and understand your code before you can get credit for the course. Fortunately, some kind soul found mine quickly and gave me 100%. My code did work -- I tested it thoroughly -- but you can't really tell from screen caps.

By Sina S S

•

May 7, 2022

A good introductory course on Airflow and Kafka. It could have been broken up into at least two courses, each focusing on one of these platforms and going more in depth. Also, the final assignment is a pain to complete, especially due to some errors in the instructions. But overall, it is a decent course.

By Warwick S

•

Oct 13, 2023

A good overview and introduction to using Airflow and Kafka. The quizzes are lazily written and ask specific rather than generalisable knowledge questions. The final assignment for Airflow was great - lots of coding and debugging. Kafka not so much - just paste commands and watch it run.

By Katarzyna G

•

Mar 26, 2022

It would be much better with real instructors and without peer review, which is not objective and gives no proper clue about the answers.

By Roberta B

•

Apr 3, 2022

OK, a very good course, but the exam focused on a very difficult part made up of Linux shell commands, especially for dealing with files that are not CSV. That was not actually the main focus of the course.

By Aleksandra

•

Dec 10, 2023

The labs in the module lacked proper planning. Connecting to servers consumed excessive time, and errors meant reconnecting, often without success. The instructions provided by instructors were vague, suggesting solutions like 'try using other networks,' which prolonged the process. Sadly, this meant spending a month solely on server connections instead of delving into the ETL process. There was a significant amount of time wasted. Moreover, the lectures could benefit from a more contemporary approach beyond a mere slideshow. Additionally, the lecturer's voice was somewhat grating; it felt almost artificial, prompting the question of whether it was actually a human reading the slides. It might be worth considering this aspect for future presentations.

By BO W

•

Jul 8, 2022

The final quiz sucks!

Why would anyone make up a quiz like this?

This quiz is much more like a GMAT reading test than an IT assessment!

By Harald M

•

Sep 29, 2024

This is a well-crafted course about ETL and building batch and streaming data pipelines. The hands-on labs, including the shell scripting practice, prepared me well for the final assignment. Writing the real-world-scenario DAG tasks to create the ETL data pipelines using Apache Airflow was challenging. Successfully submitting the DAG and monitoring it in the UI's DAGs list was satisfying at the same time.

By Matthew M

•

Apr 21, 2023

Great course! I found the challenge intensity for the final peer-graded assignment to be at a perfect level for this course. It brought together many skills from this course and several previous courses in the IBM Data Engineering Professional Certificate curriculum.

By Sureerat P

•

May 2, 2024

The course is excellent and well-prepared. The instructor is very helpful and responsive on the discussion board. I really appreciate having the opportunity to learn from this course. Thank you to all the instructors and peers for reviewing assignments.

By Brusk A

•

Feb 25, 2023

Amazing for beginners to this subject! The labs are super useful and everything is explained in a really nice way. It can definitely get you started on a simple project using all that you've learned. Something nice for your portfolio and GitHub :)