Learner Reviews & Feedback for ETL and Data Pipelines with Shell, Airflow and Kafka by IBM

4.5 stars (357 ratings)

About the Course

Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The other, contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application. In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both. You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure. By the end of this course, you will also know how to use Apache Airflow to build data pipelines and understand the advantages of this approach. You will also learn how to use Apache Kafka to build streaming pipelines, as well as the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module....
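
The ETL/ELT distinction described above can be illustrated with a minimal sketch. This is not taken from the course material: the record fields, the in-memory "warehouse" and "lake" lists, and the function names are all hypothetical, chosen only to show where the transform step sits in each approach.

```python
# Hypothetical raw records; the field names are assumptions for illustration.
RAW_ROWS = [
    {"vehicle_id": " 101 ", "toll": "2.50"},
    {"vehicle_id": "102", "toll": "4.00"},
]


def transform(row):
    """Clean one raw record: strip whitespace and parse the toll amount."""
    return {"vehicle_id": row["vehicle_id"].strip(), "toll": float(row["toll"])}


def run_etl(rows):
    """ETL: transform first, then load, so the destination holds clean rows."""
    warehouse = [transform(r) for r in rows]
    return warehouse


def run_elt(rows):
    """ELT: load raw rows unchanged; transform only when a caller asks."""
    lake = list(rows)

    def query():
        # The transformation runs on demand, at read time.
        return [transform(r) for r in lake]

    return lake, query


warehouse = run_etl(RAW_ROWS)
lake, query = run_elt(RAW_ROWS)

print(warehouse)  # cleaned rows stored at load time
print(lake)       # raw rows stored as-is
print(query())    # cleaned rows produced at query time
```

In this sketch both paths end with the same cleaned rows; the only difference is whether the cleaning happens before storage (ETL) or when the data is requested (ELT).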

Top reviews

BN

Mar 30, 2023

Overall it's a good course. I wish I could use dos2unix, tr, or sed for removing ^M from the toll_data.tsv. The Final Assignment Instructions could have been clearer.

MB

Oct 11, 2022

The course is good, but adding more practical exercises would surely help all learners understand better and grasp things more quickly.

76 - 87 of 87 Reviews for ETL and Data Pipelines with Shell, Airflow and Kafka

By Krishna k K

Apr 12, 2022

good

By Mimi Z

Oct 28, 2022

The course material was basic, so make sure to do a lot of your own additional learning outside of the coursework. The discussion staff are not helpful and don't understand or even read your questions before replying. The labs don't always work, and the instructions don't always line up with current software upgrades. Just be prepared to do a lot of troubleshooting with not much help. I wish the course would tell you what to do when certain errors occur and were more thorough with its instructions.

By Yao G A

Feb 25, 2022

This rating is due to the problem with how the exams are graded. Leaving it to students to judge the correct answer based only on hints... for example, for Tasks 1.2 to 1.8, I believe I earned 2 almost everywhere, but I was only given 1. I don't find that really fair.

By Kasra A

Jan 14, 2024

The final exam experience was very poor. I got disconnected many times, and my correct answers were marked incorrect because of a time-exceeded error. The lab projects, however, were good and a little challenging, which I liked.

By Sokhibjamol B

Mar 22, 2024

The lab exercises did not load, so I had to move on to the next section, which I could not follow. There is a technical issue! Also, I did not like the material, and the explanations were not clear.

By Arbnor Z

May 16, 2023

A bit too easy; it should go into more detail. More advanced exercises would help learners be better prepared to work with these tools.

By William S

Aug 8, 2024

Could be better if the Airflow and Kafka labs didn't have intermittent loading issues.

By Saïfallah B

May 30, 2024

Good course. It helped me with manipulating data.

By Boris V

Apr 16, 2024

Week 1 feels useless because the main idea is to learn about Airflow and Kafka, and all this information about ETL is not relevant if the course is positioned as an advanced one. In Week 4, the Apache Kafka lab is not working. I have logs with errors, which make it impossible to install Kafka Server on the VM. It's impossible to do any examples. Why do I need to install all this on my Linux VM to perform this lab? I paid money for a broken lab. I strongly suggest not taking this course because 50% of the course, the lab, is unusable/unavailable.

By Steven W

Jul 19, 2023

I feel as though the final project suffered from issues with permissions, and there was a lack of a standard setup. Where should DAG scripts go? Why should they be in a folder with admin-only permissions? Submitting screenshots is tedious and (frankly) shows a lack of willingness on the part of the course designers to use tools like nbgrader/Jupyter notebooks or other automated grading solutions.

Warning, if you can write a "Hello World" program in any language, you probably want to skip this course/certification.

By Kamil S

Jan 30, 2024

There are a few issues with this course. Firstly, this course teaches Kafka version 2.12, which uses Apache ZooKeeper; as far as I understand it, ZooKeeper was removed from version 3.4, so this should be updated. Also, I lost marks in my assessment because I was not able to move files in the network lab: according to the system, I didn't have sufficient permissions, and the code that was provided did not help.

By Trevor F K

Feb 16, 2024

As with all these IBM courses, this one is super boring. Robot voice talking over PowerPoint slides, as usual. This one stuck out as especially bad because the online lab environment is very unreliable. So much time was wasted waiting for Airflow to fail to start. Extremely frustrating!