Cost-Effective Data Pipelines: Balancing Trade-Offs When Developing Pipelines in the Cloud (Final Release) - Sev Leonard, 2023, EPUB | MOBI, O’Reilly Media, Inc.
US $5.65

Cost-Effective Data Pipelines: Balancing Trade-Offs When Developing Pipelines in the Cloud (Final Release)
Author: Sev Leonard
Year: 2023
Number of pages: 286
Format: EPUB | MOBI
File size: 10.2 MB
Language: ENG

The low cost of getting started with cloud services can easily evolve into a significant expense down the road. That's challenging for teams developing data pipelines, particularly when rapid changes in technology and workload require a constant cycle of redesign. How do you deliver scalable, highly available products while keeping costs in check? With this practical guide, author Sev Leonard provides a holistic approach to designing scalable data pipelines in the cloud. Intermediate data engineers, software developers, and architects will learn how to navigate cost-performance trade-offs and how to choose and configure compute and storage. You'll also pick up best practices for code development, testing, and monitoring.

When working with Spark, the Spark UI provides additional diagnostic information about executor load, how well balanced (or not) your computation is across executors, shuffles, spill, and query plans, showing you how Spark is running your query. This information can help you tune Spark settings, data partitioning, and data transformation code.
