In the last installment of the Dataflow course series, we introduce the components of the Dataflow operational model. We examine tools and techniques for troubleshooting and optimizing pipeline performance. We then review testing, deployment, and reliability best practices for Dataflow pipelines. We conclude with a review of Templates, which make it easy to scale Dataflow pipelines to organizations with hundreds of users. These lessons will help ensure that your data platform is stable and resilient to unanticipated circumstances.
Serverless Data Processing with Dataflow: Operations
This course is part of multiple programs.
Instructor: Google Cloud Training
1,957 already enrolled
(17 reviews)
What you'll learn
Perform monitoring, troubleshooting, testing, and CI/CD on Dataflow pipelines.
Deploy Dataflow pipelines with reliability in mind to maximize stability for your data processing platform.
Skills you'll gain
- Application Performance Management
- Extract, Transform, Load
- Cloud Development
- Cloud Applications
- Cloud-Based Integration
- Real Time Data
- Site Reliability Engineering
- System Monitoring
- Software Testing
- Continuous Integration
- Data Engineering
- Continuous Delivery
- CI/CD
- Data Integration
- Cloud API
- DevOps
- Continuous Deployment
- Data Pipelines
- Software Development
- Dataflow
Details to know
7 assignments
There are 9 modules in this course
This module covers the course outline.
What's included
2 videos, 2 readings
In this module, we learn how to use the Jobs List page to filter for jobs that we want to monitor or investigate. We look at how the Job Graph, Job Info, and Job Metrics tabs collectively provide a comprehensive summary of your Dataflow job. Lastly, we learn how we can use Dataflow’s integration with Metrics Explorer to create alerting policies for Dataflow metrics.
What's included
5 videos, 1 reading, 1 assignment
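To make the idea of an alerting policy on a metric more concrete, here is a minimal sketch in plain Python. It is not the Cloud Monitoring API. It only illustrates the general shape of a threshold condition: alert when a metric stays above a threshold for a whole trailing window. The names `Sample` and `should_alert`, and the example numbers, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One metric observation (hypothetical type, for illustration only)."""
    timestamp_s: float  # seconds since an arbitrary epoch
    value: float        # metric value, e.g. a lag measured in seconds


def should_alert(samples: Sequence[Sample], threshold: float, duration_s: float) -> bool:
    """Return True if every sample in the trailing window exceeds the threshold.

    The window is the last `duration_s` seconds, measured back from the newest
    sample. If the data does not yet span the full duration, no alert fires.
    """
    if not samples:
        return False
    latest = max(s.timestamp_s for s in samples)
    earliest = min(s.timestamp_s for s in samples)
    if latest - earliest < duration_s:
        return False  # not enough history to cover the window
    window = [s for s in samples if s.timestamp_s >= latest - duration_s]
    return all(s.value > threshold for s in window)


# Example: one sample per minute for five minutes, all above a 30-second threshold.
steady_high = [Sample(t, 45.0) for t in range(0, 301, 60)]
print(should_alert(steady_high, threshold=30.0, duration_s=300.0))
```

Requiring the condition to hold for a full window, rather than for a single sample, is a common way to avoid alerting on brief spikes.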
In this module, we learn how to use the Log panel at the bottom of both the Job Graph and Job Metrics pages, and learn about the centralized Error Reporting page.
What's included
2 videos, 1 reading, 1 assignment
In this module, we learn how to troubleshoot and debug Dataflow pipelines. We will also review the four common modes of failure for Dataflow: failure to build the pipeline, failure to start the pipeline on Dataflow, failure during pipeline execution, and performance issues.
What's included
2 videos, 1 reading, 1 assignment, 1 app item
In this module, we will discuss performance considerations we should be aware of while developing batch and streaming pipelines in Dataflow.
What's included
4 videos, 1 reading, 1 assignment
This module will discuss unit testing your Dataflow pipelines. We also introduce frameworks and features available to streamline your CI/CD workflow for Dataflow pipelines.
What's included
5 videos, 1 reading, 1 assignment, 3 app items
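As a rough illustration of unit testing pipeline logic, the sketch below keeps the per-element logic in plain Python functions so that it can be tested without starting a pipeline runner. This is one possible approach, not necessarily the one taught in the module. The functions `parse_event` and `total_by_user` and the `user_id,amount` input format are hypothetical.

```python
import unittest


def parse_event(line: str) -> dict:
    """Hypothetical per-element transform: parse 'user_id,amount' into a dict."""
    user_id, amount = line.split(",")  # raises ValueError unless exactly two fields
    return {"user_id": user_id.strip(), "amount": float(amount)}


def total_by_user(events) -> dict:
    """Hypothetical aggregation: sum amounts per user."""
    totals = {}
    for event in events:
        totals[event["user_id"]] = totals.get(event["user_id"], 0.0) + event["amount"]
    return totals


class PipelineLogicTest(unittest.TestCase):
    def test_parses_valid_line(self):
        self.assertEqual(parse_event("alice, 2.5"), {"user_id": "alice", "amount": 2.5})

    def test_rejects_malformed_line(self):
        with self.assertRaises(ValueError):
            parse_event("not-a-valid-line")

    def test_totals_are_grouped_by_user(self):
        events = [parse_event(line) for line in ["alice,1.0", "bob,2.0", "alice,3.0"]]
        self.assertEqual(total_by_user(events), {"alice": 4.0, "bob": 2.0})
```

Saved as a module, these tests can be run with `python -m unittest`, which also makes them easy to call from a CI job. The Apache Beam SDK provides its own testing utilities for asserting on pipeline outputs; they are not shown here.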
In this module, we will discuss methods for building systems that are resilient to corrupted data and data center outages.
What's included
5 videos, 1 reading, 1 assignment
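One widely used way to stay resilient to corrupted data is a dead-letter pattern: records that fail parsing or validation are set aside with their error instead of failing the whole job. The sketch below shows the idea in plain Python, assuming JSON input and a required `id` field. Both are assumptions made for this example, and `split_valid_and_dead_letter` is a hypothetical helper, not part of any Dataflow API.

```python
import json
from typing import Iterable, List, Tuple


def split_valid_and_dead_letter(raw_records: Iterable[str]) -> Tuple[List[dict], List[dict]]:
    """Separate parseable records from corrupted ones.

    Returns (valid, dead_letter). Each dead-letter entry keeps the raw input
    and the error message so it can be inspected or replayed later.
    """
    valid: List[dict] = []
    dead_letter: List[dict] = []
    for raw in raw_records:
        try:
            record = json.loads(raw)
            # Assumed validation rule for this sketch: a JSON object with an 'id'.
            if not isinstance(record, dict) or "id" not in record:
                raise ValueError("record must be a JSON object with an 'id' field")
            valid.append(record)
        except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
            dead_letter.append({"raw": raw, "error": str(err)})
    return valid, dead_letter


good, bad = split_valid_and_dead_letter(['{"id": 1}', '{broken', '{"name": "x"}'])
print(len(good), len(bad))
```

In a real pipeline the two lists would typically be two separate outputs of the same step, with the dead-letter output written to a durable location for later analysis.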
This module covers Flex Templates, a feature that helps data engineering teams standardize and reuse Dataflow pipeline code. Many operational challenges can be solved with Flex Templates.
What's included
4 videos, 1 reading, 1 assignment, 2 app items
This module reviews the topics covered in the course.
What's included
1 video
Learner reviews
17 reviews
- 5 stars: 47.05%
- 4 stars: 11.76%
- 3 stars: 11.76%
- 2 stars: 11.76%
- 1 star: 17.64%
Reviewed on Oct 21, 2023
Good intermediate course covering the big picture about how to develop data platforms using GCP and Dataflow.
Reviewed on Aug 4, 2022
Labs are kept up to date, but they lack an overall theoretical summary that teaches systematically how each piece of code works. Still a very typical problem of courses offered by Google Cloud.