A new book is coming out:
Data Engineering with AWS. The author is a Senior Solution Architect at AWS.
Book Description
Knowing how to architect and implement complex data pipelines is a highly sought-after skill. Data engineers are responsible for building these pipelines that ingest, transform, and join raw datasets - creating new value from the data in the process.
Amazon Web Services (AWS) offers a range of tools to simplify a data engineer's job, making it the preferred platform for performing data engineering tasks.
This book will take you through the services and the skills you need to architect and implement data pipelines on AWS. You'll begin by reviewing important data engineering concepts and some of the core AWS services that form a part of the data engineer's toolkit. You'll then architect a data pipeline, review raw data sources, transform the data, and learn how the transformed data is used by various data consumers. The book also teaches you about populating data marts and data warehouses along with how a data lakehouse fits into the picture. Later, you'll be introduced to AWS tools for analyzing data, including those for ad-hoc SQL queries and creating visualizations. In the final chapters, you'll understand how the power of machine learning and artificial intelligence can be used to draw new insights from data.
By the end of this AWS book, you'll be able to carry out data engineering tasks and implement a data pipeline on AWS independently.
What you will learn
-Understand data engineering concepts and emerging technologies
-Ingest streaming data with Amazon Kinesis Data Firehose
-Optimize, denormalize, and join datasets with AWS Glue Studio
-Use Amazon S3 events to trigger a Lambda process to transform a file
-Run complex SQL queries on data lake data using Amazon Athena
-Load data into a Redshift data warehouse and run queries
-Create a visualization of your data using Amazon QuickSight
-Extract sentiment data from a dataset using Amazon
Table of Contents
- An Introduction to Data Engineering
- Data Management Architectures for Analytics
- The AWS Data Engineer's Toolkit
- Data Cataloging, Security and Governance
- Architecting Data Engineering Pipelines
- Ingesting Batch and Streaming Data
- Transforming Data to Optimize for Analytics
- Identifying and Enabling Data Consumers
- Loading Data into a Data Mart
- Orchestrating the Data Pipeline
- Ad Hoc Queries with Amazon Athena
- Visualizing Data with Amazon QuickSight
- Enabling Artificial Intelligence and Machine Learning
For anyone who will be working with AWS, this book will come in very handy.