Learner Reviews & Feedback for ETL and Data Pipelines with Shell, Airflow and Kafka by IBM

4.5 stars (335 ratings)

About the Course

Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application. In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both. You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure. By the end of this course, you will also know how to use Apache Airflow to build data pipelines and will be knowledgeable about the advantages of using this approach. You will also learn how to use Apache Kafka to build streaming pipelines, as well as the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module....
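
The difference between the two approaches described above can be illustrated with a minimal Python sketch. This is not taken from the course material: the row layout, the `transform` cleanup step, and all function names are hypothetical and chosen only to show where the transformation happens in each approach.

```python
from typing import Dict, List

Row = Dict[str, str]


def transform(row: Row) -> Row:
    """Hypothetical cleanup step: trim whitespace and upper-case the country code."""
    return {
        "name": row["name"].strip(),
        "country": row["country"].strip().upper(),
    }


def run_etl(source: List[Row]) -> List[Row]:
    """ETL: transform rows inside the pipeline, then load only the cleaned rows."""
    warehouse: List[Row] = []
    for row in source:                    # extract
        warehouse.append(transform(row))  # transform, then load
    return warehouse


def run_elt(source: List[Row]) -> List[Row]:
    """ELT: load the raw rows unchanged; the transformation is deferred."""
    return [dict(row) for row in source]  # extract and load


def query_lake(lake: List[Row]) -> List[Row]:
    """In ELT, the requesting application transforms on demand at read time."""
    return [transform(row) for row in lake]
```

In this sketch both paths end with the same cleaned rows. The difference is that the ETL destination stores the cleaned data, while the ELT destination keeps the raw data and leaves the cleanup to whoever reads it.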

Top reviews

ED

It's one of the most challenging courses I've been enrolled in!

BN

Overall it's a good course. I wish I could use dos2unix, tr, or sed for removing ^M from the toll_data.tsv. The Final Assignment Instructions could have been clearer.
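
For readers unfamiliar with the `^M` issue mentioned in the review above: `^M` is how many terminals display the carriage return (`\r`) that Windows-style line endings add before each newline. As a rough sketch, assuming the file is small enough to read into memory, the same cleanup that `dos2unix` or `tr -d '\r'` performs could be done in Python. The function names and the output file name are hypothetical; only `toll_data.tsv` comes from the review.

```python
from pathlib import Path


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every carriage return byte, like `tr -d '\\r'` would."""
    return data.replace(b"\r", b"")


def clean_file(src: str, dst: str) -> None:
    """Read src as raw bytes, drop carriage returns, and write the result to dst."""
    Path(dst).write_bytes(strip_carriage_returns(Path(src).read_bytes()))


# Example usage (hypothetical output name):
# clean_file("toll_data.tsv", "toll_data_clean.tsv")
```

Working on bytes rather than text avoids any guess about the file's encoding. Note that this removes all carriage returns, not only those directly before a newline.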

76 - 84 of 84 Reviews for ETL and Data Pipelines with Shell, Airflow and Kafka

By Roberta B

•

Apr 3, 2022

OK, very good course, but the exam focused on a very difficult part made up of Linux shell commands, especially for dealing with files that are not CSV. That was not actually the main focus of the course.

By Sokhibjamol B

•

Mar 22, 2024

The lab exercises did not load, so I had to move on to the next section, which I could not follow. There is a technical issue! Also, I did not like the material, and the explanation was not clear.

By Arbnor Z

•

May 16, 2023

A bit too easy; it should go into more detail. More advanced exercises would help learners be better prepared to work with these tools.

By William S

•

Aug 8, 2024

Could be better if the Airflow and Kafka labs didn't have intermittent loading issues.

By Saïfallah B

•

May 30, 2024

Good course. It helped me with manipulating data.

By Boris V

•

Apr 16, 2024

Week 1 feels useless because the main idea is to learn about Airflow and Kafka, and all this information about ETL is not relevant if the course is positioned as an advanced one. In Week 4, the Apache Kafka lab is not working. I have logs with errors, making it impossible to install Kafka Server on the VM. It's impossible to do any examples. Why do I need to install all this on my Linux VM to perform this lab? I paid money for a broken lab. I strongly suggest not taking this course because 50% of the course, the lab, is unusable/unavailable.

By Steven W

•

Jul 19, 2023

I feel as though the final project suffered from issues with permissions, and there was a lack of a standard setup. Where should DAG scripts go? Why should they be in a folder with admin-only permissions? Submitting screenshots is tedious and (frankly) shows a lack of willingness on the part of the course designers to use tools like nbgrader/Jupyter notebooks or other automated grading solutions.

Warning, if you can write a "Hello World" program in any language, you probably want to skip this course/certification.

By Kamil S

•

Jan 30, 2024

There are a few issues with this course. Firstly, this course teaches Kafka version 2.12, which uses Apache ZooKeeper; meanwhile, as far as I understand it, ZooKeeper was removed from version 3.4, so this should be updated. Also, I had to lose marks in my assessment as I was not able to move files in the network lab: I didn't have enough permissions according to the system, and the code that was provided did not help.

By Trevor F K

•

Feb 16, 2024

As with all these IBM courses, this one is super boring. Robot voice talking over PowerPoints, as usual. This one stuck out as especially bad because the online lab environment is very unreliable. So much time was wasted waiting for Airflow to fail to start. Extremely frustrating!