US $6.56
915865
Getting Started with DuckDB: A practical guide for accelerating your data science, data analytics, and data engineering workflows
Author: Simon Aubury
Published: June 24, 2024
Format: PDF
File size: 18 MB
Language: English
Run complex queries on large datasets with amazing speed using a flexible and extensible SQL-embedded database.

Key Features
- Enhance your data analysis capabilities by integrating DuckDB with external data sources and powerful data manipulation libraries
- Gain practical experience by using SQL to efficiently manipulate vast amounts of data
- Learn how to clean, reshape and manipulate data from different sources and formats

Book Description
DuckDB is one of the hottest and fastest growing databases, driven by its powerful analytical capabilities, ease of use, versatility, and engaged community. It provides readers with an efficient, in-memory, column-oriented, and standards-compliant database to quickly process analytical query workloads through a standard SQL interface.

This book teaches you how to install and deploy DuckDB on different platforms and environments. You'll learn to create tables, load and query data with SQL, and progress to cleaning, reshaping, and manipulating data. You'll discover advanced features and techniques to improve operations and performance in DuckDB. You'll explore how DuckDB can be used for complex and efficient data analysis. As you explore later chapters, you'll learn to perform descriptive statistics and exploratory data analysis, and integrate DuckDB with Python, R, and other data analysis libraries. You'll also explore creating, reading, and modifying JSON data in DuckDB, extending the database with SQL editors and third-party data viewers, and optimizing query performance. Lastly, you'll learn the best practices for using DuckDB effectively, including a roadmap of future enhancements.

By the end of this book, you will have the skills to leverage DuckDB and unlock meaningful insights from data, making it more impactful.

What you will learn
- Explore the principles of using an in-memory, column-oriented, and fast database
- Use the SQL language to manipulate large amounts of data quickly
- Perform exploratory data analysis using powerful data manipulation libraries
- Connect DuckDB to a range of external data sources
- Discover techniques to load, transform, model, and aggregate data
- Enhance data visualization with Python and R libraries
- Process complex data sets in Parquet and nested JSON

Who This Book Is For
This book is for data analysts who want to explore complex data, data engineers who want a lean and efficient transformation tool, and data scientists who need the flexibility of a data manipulation library that integrates seamlessly with Python and R. The readers are required to understand foundational data concepts, such as querying database tables, and have exposure to a programming language such as Python or JavaScript. They'll also need familiarity interacting with command line interfaces and will benefit from having exposure to traditional databases such as PostgreSQL or SQL Server.

Table of Contents
- Introduction to DuckDB
- Loading data into DuckDB
- Data Manipulation with DuckDB
- DuckDB operations and performance
- Advanced Data Manipulation Features in DuckDB
- Perform descriptive statistics and exploratory data analysis with DuckDB
- Integrating DuckDB with Python
- Structured data manipulation
- Extending the data analysis features of DuckDB
- Effective DuckDB usage
- DuckDB - The river ahead