IBM Data Engineering Professional Certificate
Prepare for a career as a Data Engineer. Build job-ready skills – and must-have AI skills – for an in-demand career. Earn a credential from IBM. No prior experience required.
Instructors: IBM Skills Network Team
109,612 already enrolled
(5,412 reviews)
Recommended experience
Beginner level
Basic computer skills and a grounding in IT systems. Comfort working in either Linux, Windows, or MacOS. No prior programming or data skills needed.
Master the most up-to-date practical skills and knowledge data engineers use in their daily roles
Learn to create, design, & manage relational databases & apply database administration (DBA) concepts to RDBMSs such as MySQL, PostgreSQL, & IBM Db2
Develop working knowledge of NoSQL & Big Data using MongoDB, Cassandra, Cloudant, Hadoop, Apache Spark, Spark SQL, Spark ML, and Spark Streaming
Implement ETL & Data Pipelines with Bash, Airflow & Kafka; architect, populate, deploy Data Warehouses; create BI reports & interactive dashboards
Add to your LinkedIn profile
Prepare for a career in the high-growth field of data engineering. In this program, you’ll learn in-demand skills like Python, SQL, and Databases to get job-ready in less than 5 months.
Data engineering is the practice of building systems that gather raw data and process and organize it into usable information. Data engineers provide the foundational information that data scientists and business intelligence analysts use to make decisions.
This program will teach you the foundational data engineering skills employers are seeking for entry level data engineering roles, including Python, one of the most widely used programming languages. You’ll also master SQL, RDBMS, ETL, Data Warehousing, NoSQL, Big Data, and Spark with hands-on labs and projects.
You’ll learn to use Python programming language and Linux/UNIX shell scripts to extract, transform and load (ETL) data. You’ll work with Relational Databases (RDBMS) and query data using SQL statements and use NoSQL databases as well as unstructured data. You'll also learn how generative AI tools and techniques are used in data engineering.
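The extract, transform, and load pattern described above can be sketched in a few lines of plain Python. This is a minimal illustration, not program material; the CSV sample, table schema, and 5% salary adjustment are all made up for the example.

```python
# Minimal ETL sketch: extract CSV text, transform the rows, load into a
# relational table. Data and schema are hypothetical.
import csv
import io
import sqlite3

RAW_CSV = """id,name,salary
1,Ada,90000
2,Grace,105000
3,Linus,87000
"""

def extract(text):
    """Extract: read raw CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and derive an adjusted-salary field."""
    out = []
    for r in rows:
        salary = int(r["salary"])
        out.append((int(r["id"]), r["name"], salary, salary * 1.05))
    return out

def load(rows, conn):
    """Load: write the cleaned rows into a relational table."""
    conn.execute("CREATE TABLE staff (id INTEGER, name TEXT, salary INTEGER, adjusted REAL)")
    conn.executemany("INSERT INTO staff VALUES (?, ?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(total)  # 3 rows loaded
```

In the program itself, the same three stages are built with production tools (shell scripts, Airflow, Kafka) rather than a single script, but the separation of concerns is the same.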
Upon completion, you’ll have a portfolio of projects and a Professional Certificate from IBM to showcase your expertise. You’ll also earn an IBM Digital badge and will gain access to career resources to help you in your job search, including mock interviews and resume support.
This program is ACE® recommended. When you complete it, you can earn up to 12 college credits.
Applied Learning Project
Throughout this Professional Certificate, you will complete hands-on labs and projects to help you gain practical experience with Python, SQL, relational databases, NoSQL databases, Apache Spark, building data pipelines, managing databases, and working with data warehouses.
Projects:
Design a relational database to help a coffee franchise improve operations.
Use SQL to query census, crime, and school demographic data sets.
Write a Bash shell script on Linux that backs up changed files.
Set up, test, and optimize a data platform that contains MySQL, PostgreSQL, and IBM Db2 databases.
Analyze road traffic data to perform ETL and create a pipeline using Airflow and Kafka.
Design and implement a data warehouse for a solid-waste management company.
Move, query, and analyze data in MongoDB, Cassandra, and Cloudant NoSQL databases.
Train a machine learning model by creating an Apache Spark application.
Design, deploy, and manage an end-to-end data engineering platform.
List basic skills required for an entry-level data engineering role.
Discuss various stages and concepts in the data engineering lifecycle.
Describe data engineering technologies such as Relational Databases, NoSQL Data Stores, and Big Data Engines.
Summarize concepts in data security, governance, and compliance.
Learn Python, one of the most popular programming languages for Data Science and Software Development.
Apply Python programming logic using Variables, Data Structures, Branching, Loops, Functions, Objects & Classes.
Demonstrate proficiency in using Python libraries such as Pandas & Numpy, and developing code using Jupyter Notebooks.
Access and web scrape data using APIs and Python libraries like Beautiful Soup.
Demonstrate your skills in Python for working with and manipulating data
Implement web scraping and use APIs to extract data with Python
Play the role of a Data Engineer working on a real project to extract, transform, and load data
Use Jupyter notebooks and IDEs to complete your project
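The program uses Beautiful Soup for web scraping; the core idea can be sketched with nothing but the standard library's html.parser. The HTML below is a made-up sample page, not one scraped in the course.

```python
# Collect every link target from an HTML document using only the stdlib.
# Beautiful Soup provides the same capability with a richer API.
from html.parser import HTMLParser

SAMPLE = '<html><body><a href="/a">First</a><a href="/b">Second</a></body></html>'

class LinkCollector(HTMLParser):
    """Record the href attribute of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

parser = LinkCollector()
parser.feed(SAMPLE)
print(parser.links)  # ['/a', '/b']
```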
Describe data, databases, relational databases, and cloud databases.
Describe information and data models, relational databases, and relational model concepts (including schemas and tables).
Explain an Entity Relationship Diagram and design a relational database for a specific use case.
Develop a working knowledge of popular DBMSes including MySQL, PostgreSQL, and IBM Db2.
Analyze data within a database using SQL and Python.
Create a relational database and work with multiple tables using DDL commands.
Construct basic to intermediate level SQL queries using DML commands.
Compose more powerful queries with advanced SQL techniques like views, transactions, stored procedures, and joins.
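The DDL, DML, join, and view skills listed above can be tried out immediately with Python's built-in sqlite3 module. The courses use MySQL, PostgreSQL, and IBM Db2, but the SQL in this sketch is standard; the department/employee schema is invented for the example.

```python
# DDL, DML, a join, and a view in one small script using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create two related tables
cur.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER REFERENCES dept(id))")

# DML: populate them
cur.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "Data"), (2, "Infra")])
cur.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(1, "Ada", 1), (2, "Grace", 1), (3, "Linus", 2)])

# A view built on a join: head count per department
cur.execute("""
    CREATE VIEW dept_size AS
    SELECT d.name AS dept, COUNT(e.id) AS n
    FROM dept d JOIN emp e ON e.dept_id = d.id
    GROUP BY d.name
""")
rows = cur.execute("SELECT dept, n FROM dept_size ORDER BY dept").fetchall()
print(rows)  # [('Data', 2), ('Infra', 1)]
```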
Describe the Linux architecture and common Linux distributions and update and install software on a Linux system.
Perform common informational, file, content, navigational, compression, and networking commands in Bash shell.
Develop shell scripts using Linux commands, environment variables, pipes, and filters.
Schedule cron jobs in Linux with crontab and explain the cron syntax.
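The cron syntax mentioned above has five scheduling fields: minute, hour, day of month, month, and day of week. A small parser makes the field layout concrete; this is an explanatory sketch, not a substitute for crontab itself.

```python
# Split a cron schedule expression into its five named fields.
# '*' in any field means "every value".
FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]

def parse_cron(expr):
    """Map a five-field cron expression to a dict of named fields."""
    parts = expr.split()
    if len(parts) != 5:
        raise ValueError("a cron schedule has exactly five fields")
    return dict(zip(FIELDS, parts))

# "0 2 * * 1" reads as 02:00 every Monday: a typical nightly-backup schedule
schedule = parse_cron("0 2 * * 1")
print(schedule["hour"], schedule["day_of_week"])  # 2 1
```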
Create, query, and configure databases and access and build system objects such as tables.
Perform basic database management including backing up and restoring databases as well as managing user roles and permissions.
Monitor and optimize important aspects of database performance.
Troubleshoot database issues such as connectivity, login, and configuration and automate functions such as reports, notifications, and alerts.
Describe and contrast Extract, Transform, Load (ETL) processes and Extract, Load, Transform (ELT) processes.
Explain batch vs concurrent modes of execution.
Implement ETL workflow through bash and Python functions.
Describe data pipeline components, processes, tools, and technologies.
Job-ready data warehousing skills in just 6 weeks, supported by practical experience and an IBM credential.
Design and populate a data warehouse, and model and query data using CUBE, ROLLUP, and materialized views.
Identify popular data analytics and business intelligence tools and vendors and create data visualizations using IBM Cognos Analytics.
Design and load data into a data warehouse, write aggregation queries, create materialized query tables, and create an analytics dashboard.
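What a ROLLUP aggregation computes — subtotals at each level of a grouping hierarchy plus a grand total — can be shown in plain Python before writing the SQL. The region/product sales figures below are made up for illustration.

```python
# Mimic the result set of GROUP BY ROLLUP(region, product):
# per-(region, product) detail, per-region subtotals, and a grand total.
from collections import defaultdict

sales = [  # (region, product, amount)
    ("East", "A", 100), ("East", "B", 50),
    ("West", "A", 70),  ("West", "B", 30),
]

def rollup(rows):
    """Accumulate every level of the region -> product hierarchy."""
    detail = defaultdict(int)     # finest grain: (region, product)
    by_region = defaultdict(int)  # subtotal per region
    grand = 0                     # grand total across all rows
    for region, product, amount in rows:
        detail[(region, product)] += amount
        by_region[region] += amount
        grand += amount
    return detail, by_region, grand

detail, by_region, grand = rollup(sales)
print(by_region["East"], grand)  # 150 250
```

In a warehouse engine that supports it, `SELECT region, product, SUM(amount) FROM sales GROUP BY ROLLUP(region, product)` produces all three levels in a single query.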
Explore the purpose of analytics and Business Intelligence (BI) tools
Discover the capabilities of IBM Cognos Analytics and Google Looker Studio
Showcase your proficiency in analyzing DB2 data with IBM Cognos Analytics
Create and share interactive dashboards using IBM Cognos Analytics and Google Looker Studio
Differentiate among the four main categories of NoSQL repositories.
Describe the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools.
Perform common MongoDB tasks, including create, read, update, and delete (CRUD) operations.
Execute keyspace, table, and CRUD operations in Cassandra.
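The CRUD semantics practiced in MongoDB and Cassandra can be previewed with a toy in-memory document store. The method names echo pymongo's insert_one/find_one/update_one/delete_one, but this class is purely illustrative and not part of any course material.

```python
# A toy document store illustrating create/read/update/delete semantics.
class DocStore:
    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert_one(self, doc):               # Create
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = dict(doc, _id=doc_id)
        return doc_id

    def find_one(self, **query):             # Read
        for doc in self._docs.values():
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None

    def update_one(self, doc_id, **changes):  # Update
        self._docs[doc_id].update(changes)

    def delete_one(self, doc_id):            # Delete
        del self._docs[doc_id]

store = DocStore()
i = store.insert_one({"name": "sensor-1", "status": "ok"})
store.update_one(i, status="down")
print(store.find_one(name="sensor-1")["status"])  # down
```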
Explain the impact of big data, including use cases, tools, and processing methods.
Describe Apache Hadoop architecture, ecosystem, practices, and user-related applications, including Hive, HDFS, HBase, Spark, and MapReduce.
Apply Spark programming basics, including parallel programming basics for DataFrames, data sets, and Spark SQL.
Use Spark’s RDDs and data sets, optimize Spark SQL using Catalyst and Tungsten, and use Spark’s development and runtime environment options.
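Spark's core RDD transformations (map, filter, reduce) have direct pure-Python analogues, which makes their semantics easy to see before moving to a real cluster. The numbers are arbitrary sample data; the PySpark line in the comment is an approximation of the equivalent cluster code.

```python
# Pure-Python analogue of an RDD pipeline. In PySpark this would be roughly:
#   sc.parallelize(data).map(lambda x: x * x) \
#     .filter(lambda x: x % 2 == 0).reduce(lambda a, b: a + b)
from functools import reduce

data = [1, 2, 3, 4, 5, 6]

squares = map(lambda x: x * x, data)          # transformation: map
evens = filter(lambda x: x % 2 == 0, squares) # transformation: filter
total = reduce(lambda a, b: a + b, evens)     # action: reduce
print(total)  # 4 + 16 + 36 = 56
```

The key difference on a cluster is that Spark's transformations are lazy and distributed across partitions; only the reduce action triggers computation.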
Describe ML, explain its role in data engineering, summarize generative AI, discuss Spark's uses, and analyze ML pipelines and model persistence.
Evaluate ML models, distinguish between regression, classification, and clustering models, and compare data engineering pipelines with ML pipelines.
Construct the data analysis processes using Spark SQL, and perform regression, classification, and clustering using SparkML.
Demonstrate connecting to Spark clusters, building ML pipelines, performing feature extraction and transformation, and persisting models.
Demonstrate proficiency in skills required for an entry-level data engineering role.
Design and implement various concepts and components in the data engineering lifecycle such as data repositories.
Showcase working knowledge with relational databases, NoSQL data stores, big data engines, data warehouses, and data pipelines.
Apply skills in Linux shell scripting, SQL, and Python programming languages to Data Engineering problems.
Leverage various generative AI tools and techniques in data engineering processes across industries
Implement various data engineering processes such as data generation, augmentation, and anonymization using generative AI tools
Practice generative AI skills in hands-on labs and projects for data warehouse schema design and infrastructure setup
Evaluate real-world case studies showcasing the successful application of Generative AI for ETL and data repositories
Describe the role of a data engineer and some career path options as well as the prospective opportunities in the field.
Explain how to build a foundation for a job search, including researching job listings, writing a resume, and making a portfolio of work.
Summarize what a candidate can expect during a typical job interview cycle, different types of interviews, and how to prepare for interviews.
Explain how to give an effective interview, including techniques for answering questions and how to make a professional personal presentation.
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
When you complete this Professional Certificate, you may be able to have your learning recognized for credit if you are admitted and enroll in one of the following online degree programs.¹
University of Maryland Global Campus
Degree · 48 months
Heriot-Watt University
Degree · 18 months - 8 years
Illinois Tech
Degree · 12-15 months
¹Successful application and enrollment are required. Eligibility requirements apply. Each institution determines the number of credits recognized by completing this content that may count towards degree requirements, considering any existing credits you may have. Click on a specific course for more information.
At IBM, we know how rapidly tech evolves and recognize the crucial need for businesses and professionals to build job-ready, hands-on skills quickly. As a market-leading tech innovator, we’re committed to helping you thrive in this dynamic landscape. Through IBM Skills Network, our expertly designed training programs in AI, software development, cybersecurity, data science, business management, and more, provide the essential skills you need to secure your first job, advance your career, or drive business success. Whether you’re upskilling yourself or your team, our courses, Specializations, and Professional Certificates build the technical expertise that ensures you, and your organization, excel in a competitive world.
“
The IBM Data Engineering Professional Certificate opened my eyes to the wonderful world of data. It also gave me the basics to start doing some data projects on my own in order to remain competitive.
Learning from the U.S.
Unlimited access to 10,000+ world-class courses, hands-on projects, and job-ready certificate programs - all included in your subscription
Earn a degree from world-class universities - 100% online
Upskill your employees to excel in the digital economy
This is a self-paced Professional Certificate that you can complete on your own schedule in less than 5 months.
This Professional Certificate is open to anyone, regardless of job or academic background. The prerequisites are basic IT literacy, knowledge of IT infrastructure, and familiarity with Windows, Linux, or MacOS. No prior programming experience is necessary, though it is an asset, as is high school math.
Yes, it is highly recommended to take the courses in the order they are listed, as they progressively build on concepts taught in previous courses.
Data engineering is the practice of building systems that gather raw data, process and organize it into usable information, and manage that data over time. The work data engineers do provides the foundational information that data scientists and business analysts use to make recommendations and decisions.
You will develop practical skills using hands-on labs and projects throughout the program. By the end, you will have acquired the skills and knowledge to be job-ready for an entry-level career in Data Engineering.
Most people without formal training or degree in the data engineering field start as a business analyst or software engineer, or if they have job role specific skills, they can directly start in a junior level data engineering role. From there you can move into more specialized roles such as Database Administrator, Data Warehouse Engineer, Data Architect, or Big Data Engineer. Some choose to combine their data engineering expertise with data science or artificial intelligence (AI) to become a Data Science Engineer, or Machine Learning (ML) Engineer. Others progress their careers by taking on software engineering management roles or even achieve the executive role of Chief Data Officer.
Yes, learners can earn a recommendation of 9 college credits for completing this program.
To share proof of completion with schools, certificate graduates will receive an email prompting them to claim their Credly badge, which contains the ACE® credit recommendation. Once claimed, you will receive a competency-based transcript that signifies the credit recommendation, which can be shared directly with a school from the Credly platform. Please note that the decision to accept specific credit recommendations is up to each institution and is not guaranteed.
Please see Coursera’s ACE Recommendations FAQ.
This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings and assignments anytime and anywhere via the web or your mobile device.
If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.
Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Certificate, you’re automatically subscribed to the full Certificate. Visit your learner dashboard to track your progress.
¹ Median salary and job opening data are sourced from Lightcast™ Job Postings Report. Data for job roles relevant to featured programs (2/1/2024 - 2/1/2025)