Applied Python Data Engineering Specialization

Name: Applied Python Data Engineering
Rating: 3.8 (115 reviews)

Elevate your coding skills with data engineering. Use big data for decision-making, analysis, AI, and machine learning.

Instructors: Kennedy Behrman

6,814 already enrolled

3 course series

Get in-depth knowledge of a subject

from 115 reviews of courses in this program

Intermediate level

Recommended experience

5 months to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

What you'll learn

Create scalable big data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Build machine learning workflows (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps/DevOps to streamline data engineering processes.
Formulate and communicate data-driven insights and narratives through impactful visualizations with Python and data storytelling.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English

Advance your subject-matter expertise

Learn in-demand skills from university and industry experts
Master a subject or tool with hands-on projects
Develop a deep understanding of key concepts
Earn a career certificate from Duke University

Specialization - 3 course series

Learn how to use data engineering to leverage big data for business strategy, data analysis, or machine learning and AI. By completing this course series, you'll empower yourself with the knowledge and proficiency required to build efficient data pipelines, manage cutting-edge platforms like Hadoop, Spark, Snowflake, Databricks, and Kubernetes, and tell stories with data through visualization. You will delve into foundational big data concepts, distributed computing with Spark, Snowflake’s architecture, Databricks’ machine learning capabilities, Python techniques for data visualization, and critical methodologies like DataOps.

This course series is designed for software engineers, developers, researchers, and data scientists who want to strengthen their specialization in data science or machine learning, as well as for professionals interested in pursuing a career as a data-focused software engineer, data scientist, or data engineer working in cloud, machine learning, business intelligence, or another field.

Applied Learning Project

The Specialization features a capstone project focused on using Databricks’ API to replicate an existing project. This provides hands-on experience working with Databricks to build a portfolio-ready data solution. You will apply Python to a variety of data engineering tasks.

Spark, Hadoop, and Snowflake for Data Engineering

Course 1, 30 hours

What you'll learn

Create scalable data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Optimize data engineering with clustering and scaling to boost performance and resource use.
Build ML solutions (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps and DevOps practices for continuous integration and deployment (CI/CD) of data-driven applications, including automating processes.

Skills you'll gain

Category: PySpark

Category: Databricks

Category: Apache Spark

Category: DevOps

Category: Apache Hadoop

Category: MLOps (Machine Learning Operations)

Category: Data Warehousing

Category: Data Pipelines

Category: Data Quality

Category: Distributed Computing

Category: Data Processing

Category: SQL

Category: Data Transformation

Category: Python Programming

Category: Data Integration

Category: Database Architecture and Administration

Category: Big Data

Virtualization, Docker, and Kubernetes for Data Engineering

Course 2, 27 hours

What you'll learn

Master virtualization, containerization, and Docker, including Dockerfile creation and multi-container orchestration with Compose and Airflow.
Develop expertise in Kubernetes core concepts, cluster architecture, and deployment using cloud environments, GitHub Codespaces, and AI-driven tools.
Work through data engineering scenarios by containerizing and deploying applications and by addressing production issues with cloud orchestration and SRE practices.

Skills you'll gain

Category: Containerization

Category: Kubernetes

Category: Docker (Software)

Category: Virtual Machines

Category: Site Reliability Engineering

Category: Virtualization

Category: Scalability

Category: Microservices

Category: Cloud Deployment

Category: DevOps Tools

Category: Cloud-Based Integration

Category: Application Deployment

Category: Database Management

Category: Cloud Development

Data Visualization with Python

Course 3, 9 hours

What you'll learn

Apply Python, spreadsheets, and BI tooling proficiently to create visually compelling and interactive data visualizations.
Formulate and communicate data-driven insights and narratives through impactful visualizations and data storytelling.
Assess and select the most suitable visualization tools and techniques to address organizational data needs and objectives.