What will I be able to do upon completing the Specialization?

This Specialization teaches learners how to create and scale data pipelines for big data using Hadoop, Spark, Snowflake, and Databricks; build machine learning workflows with PySpark and MLflow; implement DataOps/DevOps to streamline data engineering processes; and develop data visualizations with Python.

Is this course really 100% online? Do I need to attend any classes in person?

This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings and assignments anytime and anywhere via the web or your mobile device.

Can I just enroll in a single course?

Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

Can I take the course for free?

No, you cannot take this course for free. When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. If you cannot afford the fee, you can apply for financial aid.

Applied Python Data Engineering Specialization

Elevate your coding skills with data engineering. Use big data for decision-making, analysis, AI, and machine learning.

Instructor: Kennedy Behrman

6,767 already enrolled

3-course series

Get in-depth knowledge of a subject

from 61 reviews of courses in this program

Intermediate level

Recommended experience

5 months to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

What you'll learn

Create scalable big data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Build machine learning workflows (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps/DevOps to streamline data engineering processes.
Formulate and communicate data-driven insights and narratives through impactful visualizations with Python and data storytelling.

Skills you'll gain

Plot (Graphics)
Data Pipelines
Matplotlib
Containerization
Big Data
Data Visualization
Statistical Visualization
Data Visualization Software
Site Reliability Engineering
Data Science
Data Storytelling
Interactive Data Visualization

Tools you'll learn

Docker (Software)
Kubernetes
Apache Hadoop
PySpark
Databricks
Plotly
Python Programming
Apache Spark

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English

Advance your subject-matter expertise

Learn in-demand skills from university and industry experts
Master a subject or tool with hands-on projects
Develop a deep understanding of key concepts
Earn a career certificate from Duke University

Specialization - 3 course series

Learn how to use data engineering to leverage big data for business strategy, data analysis, or machine learning and AI. By completing this course series, you'll empower yourself with the knowledge and proficiency required to build efficient data pipelines, manage cutting-edge platforms like Hadoop, Spark, Snowflake, Databricks, and Kubernetes, and tell stories with data through visualization. You will delve into foundational big data concepts, distributed computing with Spark, Snowflake’s architecture, Databricks’ machine learning capabilities, Python techniques for data visualization, and critical methodologies like DataOps.

This course series is designed for software engineers, developers, researchers, and data scientists who want to strengthen their specialization in data science or machine learning. It is also for professionals interested in a career as a data-focused software engineer, data scientist, or data engineer working in cloud, machine learning, business intelligence, or another field.

Applied Learning Project

The Specialization features a capstone project focused on using Databricks’ API to replicate an existing project. This provides hands-on experience working with Databricks to build a portfolio-ready data solution. You will apply Python to a variety of data engineering tasks.

Spark, Hadoop, and Snowflake for Data Engineering

Course 1, 30 hours

What you'll learn

Create scalable data pipelines (Hadoop, Spark, Snowflake, Databricks) for efficient data handling.
Optimize data engineering with clustering and scaling to boost performance and resource use.
Build ML solutions (PySpark, MLflow) on Databricks for seamless model development and deployment.
Implement DataOps and DevOps practices for continuous integration and deployment (CI/CD) of data-driven applications, including automating processes.

Skills you'll gain

PySpark
Apache Spark
Databricks
MLOps (Machine Learning Operations)
DevOps
Apache Hadoop
Database Architecture and Administration
Data Pipelines
Big Data
Data Quality
SQL
Data Warehousing
Data Transformation
Distributed Computing
Data Integration
Data Processing
Python Programming

Virtualization, Docker, and Kubernetes for Data Engineering

Course 2, 27 hours

What you'll learn

Master virtualization, containerization, and Docker, including Dockerfile creation and multi-container orchestration with Compose and Airflow.
Develop expertise in Kubernetes core concepts, cluster architecture, and deployment using cloud environments, GitHub Codespaces, and AI-driven tools.
Navigate data scenarios by mastering containerization, deploying apps, and addressing production issues with cloud orchestration and SRE practices.

Skills you'll gain

Containerization
Docker (Software)
Kubernetes
Scalability
Site Reliability Engineering
Microservices
Virtual Machines
Virtualization
Cloud Development
Database Management
Cloud-Based Integration
DevOps Tools
Cloud Deployment
Application Deployment

Data Visualization with Python

Course 3, 9 hours

What you'll learn

Apply Python, spreadsheets, and BI tooling proficiently to create visually compelling and interactive data visualizations.
Formulate and communicate data-driven insights and narratives through impactful visualizations and data storytelling.
Assess and select the most suitable visualization tools and techniques to address organizational data needs and objectives.

Skills you'll gain

Interactive Data Visualization
Plotly
Tableau Software
Seaborn
Scatter Plots
Matplotlib
Data Storytelling
Google Sheets
Data Presentation
Heat Maps
Histogram
Dashboard
Plot (Graphics)
Data Visualization Software
Python Programming
Business Communication
Data Visualization
Cloud Applications
Statistical Visualization
Data Analysis

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Kennedy Behrman

Duke University

7 courses, 63,712 learners

Offered by

Duke University

Frequently asked questions

The course series takes approximately 5 months to complete.

Recommended background: experience working with Python, Git for version control, Docker for containerization, and Kubernetes for deployment and scaling, plus a strong foundation in linear algebra and statistics.

The course series is designed to be completed in the order outlined here on this page.

Note that the Specialization Certificate does not represent official academic credit from the partner institution offering the course. Duke cannot provide a transcript for your completion of the Specialization; however, we encourage you to share your Coursera completion certificate with your employer and community to demonstrate your completion of the course series.