Become Amazon Certified with updated AWS-DEA-C01 exam questions and correct answers
A sales company uses AWS Glue ETL to collect, process, and ingest data into an Amazon S3 bucket. The AWS Glue pipeline creates a new file in the S3 bucket every hour. File sizes vary from 200 KB to 300 KB. The company wants to build a sales prediction model by using data from the previous 5 years. The historic data includes 44,000 files. The company builds a second AWS Glue ETL pipeline by using the smallest worker type. The second pipeline retrieves the historic files from the S3 bucket and processes the files for downstream analysis. The company notices significant performance issues with the second ETL pipeline. The company needs to improve the performance of the second pipeline. Which solution will meet this requirement MOST cost-effectively?
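A scenario like this usually hinges on small-file overhead, which AWS Glue can reduce with S3 file grouping. A minimal PySpark sketch of that technique, assuming a hypothetical bucket path and JSON input format:

```python
# Minimal AWS Glue PySpark sketch: read many small S3 files as grouped input.
# Bucket path, prefix, and JSON format are illustrative assumptions.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# groupFiles/groupSize coalesce thousands of small files into larger groups per
# Spark task, cutting task-scheduling and S3 listing overhead.
historic_sales = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://example-sales-bucket/hourly/"],
        "recurse": True,
        "groupFiles": "inPartition",
        "groupSize": "134217728",  # target roughly 128 MB per group (bytes, as a string)
    },
    format="json",
)
```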
A Cloud Data Engineering Team is configuring access control for their company's new multi-account AWS environment. They have a centralized security account and require the ability to allow audit teams to access AWS resources across all accounts for monitoring and compliance purposes. The data engineering team needs to establish a mechanism to grant audit team members from the security account least-privilege access to the necessary resources without creating individual IAM users in each account.
Which of the following steps should the data engineering team implement to set up IAM roles effectively for this requirement? (Select TWO)
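For context, cross-account audit access without per-account IAM users is typically built on an IAM role with a trust policy toward the security account. An illustrative boto3 sketch; the account IDs, role name, and attached policy are placeholder assumptions:

```python
# Illustrative boto3 sketch of cross-account role setup and assumption.
# Account IDs and the role name are placeholder assumptions.
import json
import boto3

# Trust policy attached to a read-only audit role in each member account,
# allowing principals in the central security account to assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111111111111:root"},  # security account
        "Action": "sts:AssumeRole",
    }],
}

iam = boto3.client("iam")
iam.create_role(
    RoleName="AuditReadOnlyRole",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.attach_role_policy(
    RoleName="AuditReadOnlyRole",
    PolicyArn="arn:aws:iam::aws:policy/ReadOnlyAccess",
)

# Auditors in the security account then assume the role in each member account
# instead of signing in with per-account IAM users.
sts = boto3.client("sts")
credentials = sts.assume_role(
    RoleArn="arn:aws:iam::222222222222:role/AuditReadOnlyRole",  # member account
    RoleSessionName="audit-session",
)["Credentials"]
```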
A company wants to migrate an application and an on-premises Apache Kafka server to AWS. The application processes incremental updates that an on-premises Oracle database sends to the Kafka server. The company wants to use the replatform migration strategy instead of the refactor strategy.
Which solution will meet these requirements with the LEAST management overhead?
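For background, one replatform-style landing zone for a self-managed Kafka server is Amazon MSK. A minimal boto3 provisioning sketch; the subnet and security-group IDs, broker count, Kafka version, and instance type are placeholder assumptions:

```python
# Minimal boto3 sketch of provisioning an Amazon MSK cluster as a replatform
# target for an on-premises Kafka server; all identifiers are assumptions.
import boto3

msk = boto3.client("kafka")
response = msk.create_cluster(
    ClusterName="sales-cdc-cluster",
    KafkaVersion="3.5.1",
    NumberOfBrokerNodes=3,
    BrokerNodeGroupInfo={
        "InstanceType": "kafka.m5.large",
        "ClientSubnets": ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
        "SecurityGroups": ["sg-0123456789abcdef0"],
    },
)
print(response["ClusterArn"])
```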
A data engineer wants to orchestrate a set of extract, transform, and load (ETL) jobs that run on AWS. The ETL jobs contain tasks that must run Apache Spark jobs on Amazon EMR, make API calls to Salesforce, and load data into Amazon Redshift. The ETL jobs need to handle failures and retries automatically. The data engineer needs to use Python to orchestrate the jobs.
Which service will meet these requirements?
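For reference, Python-defined orchestration with built-in retries is the Apache Airflow model (which Amazon MWAA runs as a managed service). A small illustrative DAG sketch; the task bodies, schedule, and retry settings are assumptions, and a real pipeline would add an EMR step operator from the Amazon provider package:

```python
# Illustrative Apache Airflow DAG sketch; task bodies and settings are assumptions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def call_salesforce_api():
    # Placeholder for the Salesforce API call step.
    pass


def load_into_redshift():
    # Placeholder for the Amazon Redshift load step.
    pass


with DAG(
    dag_id="sales_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    # Automatic failure handling: each task retries up to 3 times.
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
) as dag:
    salesforce_task = PythonOperator(
        task_id="call_salesforce", python_callable=call_salesforce_api
    )
    redshift_task = PythonOperator(
        task_id="load_redshift", python_callable=load_into_redshift
    )

    salesforce_task >> redshift_task
```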
A data engineering team at an online retail company is optimizing the performance of their Amazon Redshift data warehouse. The warehouse contains a large sales table with millions of rows and a smaller products table. Queries often join these two tables, and the team wants to optimize the query performance, especially for these join operations.
Which Redshift distribution style should the team use for the sales and products tables to enhance query performance?
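A common pattern for a large fact table joined to a small dimension table is KEY distribution on the join column plus ALL distribution for the small table, so joins avoid data redistribution. An illustrative sketch issuing such DDL through the Amazon Redshift Data API; the cluster identifier, database, user, and column definitions are assumptions:

```python
# Illustrative sketch of distribution-style DDL issued through the Amazon
# Redshift Data API; cluster, database, user, and columns are assumptions.
import boto3

sales_ddl = """
CREATE TABLE sales (
    sale_id    BIGINT,
    product_id INTEGER,
    sale_date  DATE,
    amount     DECIMAL(12, 2)
)
DISTSTYLE KEY
DISTKEY (product_id)   -- distribute on the column used to join to products
SORTKEY (sale_date)
"""

products_ddl = """
CREATE TABLE products (
    product_id   INTEGER,
    product_name VARCHAR(256)
)
DISTSTYLE ALL          -- replicate the small table to every compute node
"""

redshift_data = boto3.client("redshift-data")
redshift_data.batch_execute_statement(
    ClusterIdentifier="sales-warehouse",
    Database="dev",
    DbUser="awsuser",
    Sqls=[sales_ddl, products_ddl],
)
```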