Become Amazon Certified with updated AWS-DEA-C01 exam questions and correct answers
A data engineer needs to build an enterprise data catalog based on the company's Amazon S3 buckets and Amazon RDS databases. The data catalog must include storage format metadata for the data in the catalog. Which solution will meet these requirements with the LEAST effort?
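The least-effort pattern for this scenario is typically AWS Glue crawlers: one targeting the S3 buckets and one targeting the RDS databases through a JDBC connection. Crawlers populate the Glue Data Catalog automatically, including storage format metadata (classification and SerDe information). A minimal sketch of the `CreateCrawler` request shape, with placeholder names and ARNs (all identifiers below are illustrative, not from the question):

```python
def crawler_request(name, role_arn, database, s3_paths=(), jdbc_targets=()):
    """Build a Glue CreateCrawler request body (shape follows the Glue API).

    Crawlers infer schema *and* storage format (classification, SerDe),
    which is what puts format metadata into the Data Catalog.
    """
    targets = {}
    if s3_paths:
        targets["S3Targets"] = [{"Path": p} for p in s3_paths]
    if jdbc_targets:
        targets["JdbcTargets"] = [
            {"ConnectionName": conn, "Path": path} for conn, path in jdbc_targets
        ]
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": targets,
    }

# Hypothetical names; a real call would be
# boto3.client("glue").create_crawler(**req)
req = crawler_request(
    "enterprise-catalog-crawler",
    "arn:aws:iam::123456789012:role/GlueCrawlerRole",
    "enterprise_catalog",
    s3_paths=["s3://sales-data-bucket/"],
    jdbc_targets=[("rds-mysql-connection", "salesdb/%")],
)
```

One crawler per source type, run on a schedule, keeps the catalog current with no custom code to maintain.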
A sales company uses AWS Glue ETL to collect, process, and ingest data into an Amazon S3 bucket. The AWS Glue pipeline creates a new file in the S3 bucket every hour. File sizes vary from 200 KB to 300 KB. The company wants to build a sales prediction model by using data from the previous 5 years. The historic data includes 44,000 files. The company builds a second AWS Glue ETL pipeline by using the smallest worker type. The second pipeline retrieves the historic files from the S3 bucket and processes the files for downstream analysis. The company notices significant performance issues with the second ETL pipeline. The company needs to improve the performance of the second pipeline. Which solution will meet this requirement MOST cost-effectively?
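The performance problem here is the classic small-files issue: ~44,000 files of 200–300 KB each create excessive per-file overhead. The cost-effective fix is usually Glue's built-in S3 file grouping (`groupFiles` / `groupSize` connection options) rather than paying for larger or more workers. A sketch, assuming an illustrative bucket path and file sizes:

```python
# Glue S3 connection options enabling file grouping. groupFiles/groupSize
# are real Glue options; the bucket path below is illustrative.
GROUP_SIZE = 128 * 1024 * 1024  # coalesce input into ~128 MB read groups

connection_options = {
    "paths": ["s3://sales-data-bucket/historic/"],
    "recurse": True,
    "groupFiles": "inPartition",
    "groupSize": str(GROUP_SIZE),
}
# In the Glue job these options would be passed as, e.g.:
#   glueContext.create_dynamic_frame.from_options(
#       connection_type="s3", format="json",
#       connection_options=connection_options)

def approx_read_groups(file_count, avg_file_bytes, group_size=GROUP_SIZE):
    """Rough number of read tasks after grouping (ceiling division)."""
    total = file_count * avg_file_bytes
    return max(1, -(-total // group_size))

# 44,000 files at ~250 KB each collapse into well under 100 read groups,
# instead of 44,000 separate S3 reads.
groups = approx_read_groups(44_000, 250 * 1024)
```

Grouping reduces task-scheduling and S3 request overhead without changing the worker type, so the pipeline speeds up at no added cost.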
A mobile app tracks user activity data, which is continuously streamed to Amazon Kinesis Data Streams. The app requires a solution to process this data in real-time and update user profiles stored in Amazon DynamoDB based on the activity data.
What combination of AWS services should be used for real-time processing of the stream and updating the user profiles in DynamoDB?
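A common fit for this requirement is Amazon Kinesis Data Streams with an AWS Lambda event source mapping: Lambda receives batches of stream records in near real time and writes profile updates to DynamoDB. A minimal handler sketch; the record payload shape, table name, and key attribute (`userId`) are assumptions for illustration:

```python
import base64
import json

def parse_activity_records(event):
    """Decode Kinesis records (base64-encoded JSON) into activity dicts.

    This is the event shape Lambda receives from a Kinesis event
    source mapping: event["Records"][i]["kinesis"]["data"].
    """
    activities = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        activities.append(json.loads(payload))
    return activities

def handler(event, context):
    """Lambda entry point, wired to the stream via an event source mapping."""
    activities = parse_activity_records(event)
    # A real handler would update the profile table here, e.g. (names assumed):
    #   table = boto3.resource("dynamodb").Table("UserProfiles")
    #   for a in activities:
    #       table.update_item(Key={"userId": a["userId"]}, ...)
    return {"processed": len(activities)}

# Simulated event with one record, as Lambda would deliver it.
sample_event = {"Records": [{"kinesis": {"data": base64.b64encode(
    json.dumps({"userId": "u1", "action": "click"}).encode()).decode()}}]}
```

For heavier stream processing, Amazon Managed Service for Apache Flink is the alternative; for simple per-record profile updates, Lambda keeps the architecture serverless and low-maintenance.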
A company has a JSON file that contains personally identifiable information (PII) data and non-PII data. The company needs to make the data available for querying and analysis. The non-PII data must be available to everyone in the company. The PII data must be available only to a limited group of employees. Which solution will meet these requirements with the LEAST operational overhead?
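A low-overhead pattern for column-level access control over data in S3 is AWS Lake Formation: catalog the JSON data with Glue, then grant the broad group SELECT on all columns except the PII ones using a `ColumnWildcard` with excluded column names, and grant the limited group full access. A sketch of the `GrantPermissions` request shape (principal ARN, database, table, and column names below are illustrative):

```python
PII_COLUMNS = ["ssn", "email", "phone"]  # assumed PII column names

def column_grant(principal_arn, database, table, excluded_columns):
    """Lake Formation GrantPermissions body: SELECT on every column
    except the excluded (PII) ones, via ColumnWildcard."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnWildcard": {"ExcludedColumnNames": list(excluded_columns)},
            }
        },
        "Permissions": ["SELECT"],
    }

# Everyone in the company sees the non-PII columns only.
company_grant = column_grant(
    "arn:aws:iam::123456789012:role/AllEmployees",  # assumed role
    "analytics_db",
    "customer_events",
    PII_COLUMNS,
)
```

Queries then go through Athena (or Redshift Spectrum), and Lake Formation enforces the column filter centrally, with no duplicated or re-processed copies of the file to maintain.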
A company uses Amazon S3 as a data lake. The company sets up a data warehouse by using a multi-node Amazon Redshift cluster. The company organizes the data files in the data lake based on the data source of each data file. The company loads all the data files into one table in the Redshift cluster by using a separate COPY command for each data file location. This approach takes a long time to load all the data files into the table. The company must increase the speed of the data ingestion. The company does not want to increase the cost of the process. Which solution will meet these requirements?
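Serial COPY commands prevent Redshift from parallelizing the load across slices. The usual no-extra-cost fix is a single COPY driven by a manifest file that lists every source location, so one command loads all files in parallel. A sketch of building the manifest and the COPY statement (bucket paths, table name, and IAM role ARN are illustrative):

```python
def build_manifest(s3_paths):
    """Redshift COPY manifest body: one COPY command loads every listed
    file, and the cluster parallelizes the load across its slices."""
    return {"entries": [{"url": path, "mandatory": True} for path in s3_paths]}

# Paths from the per-source prefixes in the data lake (assumed names).
manifest = build_manifest([
    "s3://sales-data-lake/source_a/part-0000.json",
    "s3://sales-data-lake/source_b/part-0000.json",
])

# The manifest JSON is uploaded to S3, then referenced by a single COPY:
COPY_SQL = """
COPY sales_table
FROM 's3://sales-data-lake/manifests/all_sources.manifest'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
FORMAT AS JSON 'auto'
MANIFEST;
"""
```

Because the same cluster does the same total work, just concurrently instead of sequentially, ingestion speeds up without any added cost.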
© Copyright DumpsCertify 2026. All Rights Reserved