[Full-Version] 2024 New Preparation Guide of Amazon AWS-Certified-Data-Analytics-Specialty Exam [Q72-Q97]



AWS-Certified-Data-Analytics-Specialty Practice Exam - 209 Unique Questions


Earning the DAS-C01 certification demonstrates a candidate's expertise in building and maintaining data analytics solutions using AWS services. The AWS Certified Data Analytics - Specialty (DAS-C01) certification can help professionals advance their careers and open up new job opportunities in the field of data analytics.

 

NEW QUESTION # 72
A regional energy company collects voltage data from sensors attached to buildings. To address any known dangerous conditions, the company wants to be alerted when a sequence of two voltage drops is detected within 10 minutes of a voltage spike at the same building. It is important to ensure that all messages are delivered as quickly as possible. The system must be fully managed and highly available. The company also needs a solution that will automatically scale up as it covers additional cities with this monitoring feature. The alerting system is subscribed to an Amazon SNS topic for remediation.
Which solution meets these requirements?

  • A. Create an Amazon Kinesis data stream to capture the incoming sensor data and create another stream for alert messages. Set up AWS Application Auto Scaling on both. Create a Kinesis Data Analytics for Java application to detect the known event sequence, and add a message to the message stream. Configure an AWS Lambda function to poll the message stream and publish to the SNS topic.
  • B. Create a REST-based web service using Amazon API Gateway in front of an AWS Lambda function. Create an Amazon RDS for PostgreSQL database with sufficient Provisioned IOPS (PIOPS). In the Lambda function, store incoming events in the RDS database and query the latest data to detect the known event sequence and send the SNS message.
  • C. Create an Amazon Kinesis Data Firehose delivery stream to capture the incoming sensor data. Use an AWS Lambda transformation function to detect the known event sequence and send the SNS message.
  • D. Create an Amazon Managed Streaming for Kafka cluster to ingest the data, and use an Apache Spark Streaming with Apache Kafka consumer API in an automatically scaled Amazon EMR cluster to process the incoming data. Use the Spark Streaming application to detect the known event sequence and send the SNS message.

Answer: A
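
The event pattern in this question (a voltage spike followed by two voltage drops within 10 minutes at the same building) can be sketched in plain Python. This is only an illustration of the matching logic, not the Kinesis Data Analytics for Java application that option A describes. The function name, the event tuple layout, and the "spike"/"drop" labels are assumptions made for the example, and events are assumed to arrive in time order.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # window taken from the question text


def find_alerts(events):
    """Return (building, spike_time, second_drop_time) for each matched sequence.

    events: iterable of (timestamp, building_id, kind) tuples, where kind is
    "spike" or "drop" (hypothetical labels), sorted by timestamp.
    """
    open_spikes = defaultdict(list)  # building -> list of [spike_time, drops_seen]
    alerts = []
    for ts, building, kind in events:
        # Forget spikes whose 10-minute window has already expired.
        pending = [s for s in open_spikes[building] if ts - s[0] <= WINDOW]
        if kind == "spike":
            pending.append([ts, 0])
        elif kind == "drop":
            still_open = []
            for spike in pending:
                spike[1] += 1
                if spike[1] == 2:
                    # Second drop inside the window: raise an alert.
                    alerts.append((building, spike[0], ts))
                else:
                    still_open.append(spike)
            pending = still_open
        open_spikes[building] = pending
    return alerts
```

In the managed design, the alert tuples would be written to the second stream and forwarded to the SNS topic by the Lambda function; that wiring is omitted here.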


NEW QUESTION # 73
A company has a process that writes two datasets in CSV format to an Amazon S3 bucket every 6 hours. The company needs to join the datasets, convert the data to Apache Parquet, and store the data within another bucket for users to query using Amazon Athena. The data also needs to be loaded to Amazon Redshift for advanced analytics. The company needs a solution that is resilient to the failure of any individual job component and can be restarted in case of an error.
Which solution meets these requirements with the LEAST amount of operational overhead?

  • A. Create an AWS Glue job using PySpark that creates dynamic frames of the datasets in Amazon S3, transforms the data, joins the data, writes the data back to Amazon S3, and loads the data to Amazon Redshift. Use an AWS Glue workflow to orchestrate the AWS Glue job.
  • B. Create an AWS Glue job using Python Shell that generates dynamic frames of the datasets in Amazon S3, transforms the data, joins the data, writes the data back to Amazon S3, and loads the data to Amazon Redshift. Use an AWS Glue workflow to orchestrate the AWS Glue job at the desired frequency.
  • C. Use AWS Step Functions to orchestrate an Amazon EMR cluster running Apache Spark. Use PySpark to generate data frames of the datasets in Amazon S3, transform the data, join the data, write the data back to Amazon S3, and load the data to Amazon Redshift.
  • D. Use AWS Step Functions to orchestrate the AWS Glue job. Create an AWS Glue job using Python Shell that creates dynamic frames of the datasets in Amazon S3, transforms the data, joins the data, writes the data back to Amazon S3, and loads the data to Amazon Redshift.

Answer: A

Explanation:
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics. It can process datasets from various sources and formats, such as CSV and Parquet, and write them to different destinations, such as Amazon S3 and Amazon Redshift.
AWS Glue provides two types of jobs: Spark and Python Shell. Spark jobs run on Apache Spark, a distributed processing framework that supports a wide range of data processing tasks. Python Shell jobs run Python scripts on a managed serverless infrastructure. Spark jobs are more suitable for complex data transformations and joins than Python Shell jobs.
AWS Glue provides dynamic frames, which are an extension of Apache Spark data frames. Dynamic frames handle schema variations and errors in the data more easily than data frames. They also provide a set of transformations that can be applied to the data, such as join, filter, map, etc.
AWS Glue provides workflows, which are directed acyclic graphs (DAGs) that orchestrate multiple ETL jobs and crawlers. Workflows can handle dependencies, retries, error handling, and concurrency for ETL jobs and crawlers. They can also be triggered by schedules or events.
By creating an AWS Glue job using PySpark that creates dynamic frames of the datasets in Amazon S3, transforms the data, joins the data, writes the data back to Amazon S3, and loads the data to Amazon Redshift, the company can perform the required ETL tasks with a single job. By using an AWS Glue workflow to orchestrate the AWS Glue job, the company can schedule and monitor the job execution with minimal operational overhead.


NEW QUESTION # 74
A large university has adopted a strategic goal of increasing diversity among enrolled students. The data analytics team is creating a dashboard with data visualizations to enable stakeholders to view historical trends.
All access must be authenticated using Microsoft Active Directory. All data in transit and at rest must be encrypted.
Which solution meets these requirements?

  • A. Amazon QuickSight Standard edition configured to perform identity federation using SAML 2.0 and the default encryption settings.
  • B. Amazon QuickSight Enterprise edition using AD Connector to authenticate using Active Directory. Configure Amazon QuickSight to use customer-provided keys imported into AWS KMS.
  • C. Amazon QuickSight Enterprise edition configured to perform identity federation using SAML 2.0 and the default encryption settings.
  • D. Amazon QuickSight Standard edition using AD Connector to authenticate using Active Directory. Configure Amazon QuickSight to use customer-provided keys imported into AWS KMS.

Answer: B


NEW QUESTION # 75
A company is building a data lake and needs to ingest data from a relational database that has time-series data.
The company wants to use managed services to accomplish this. The process needs to be scheduled daily and bring incremental data only from the source into Amazon S3.
What is the MOST cost-effective approach to meet these requirements?

  • A. Use AWS Glue to connect to the data source using JDBC Drivers. Store the last updated key in an Amazon DynamoDB table and ingest the data using the updated key as a filter.
  • B. Use AWS Glue to connect to the data source using JDBC Drivers and ingest the full data. Use AWS DataSync to ensure the delta only is written into Amazon S3.
  • C. Use AWS Glue to connect to the data source using JDBC Drivers and ingest the entire dataset. Use appropriate Apache Spark libraries to compare the dataset, and find the delta.
  • D. Use AWS Glue to connect to the data source using JDBC Drivers. Ingest incremental records only using job bookmarks.

Answer: D

Explanation:
https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html
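
The idea behind incremental ingestion with a bookmark can be sketched in plain Python. AWS Glue job bookmarks keep this state for you; the sketch below only simulates the concept and does not use the Glue API. The function name, the `id` key column, and the in-memory row list are assumptions made for the example.

```python
def incremental_pull(rows, bookmark, key="id"):
    """Return (new_rows, new_bookmark) for one scheduled run.

    rows: list of dicts standing in for the source table (hypothetical).
    bookmark: highest key value seen by the previous run, or None on the first run.
    """
    # Keep only rows that were not seen by an earlier run.
    new_rows = [r for r in rows if bookmark is None or r[key] > bookmark]
    if new_rows:
        # Advance the bookmark so the next run skips these rows.
        bookmark = max(r[key] for r in new_rows)
    return new_rows, bookmark
```

Each daily run passes in the bookmark returned by the previous run, so only rows added since then are written to Amazon S3.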


NEW QUESTION # 76
A large company receives files from external parties in Amazon EC2 throughout the day. At the end of the day, the files are combined into a single file, compressed into a gzip file, and uploaded to Amazon S3. The total size of all the files is close to 100 GB daily. Once the files are uploaded to Amazon S3, an AWS Batch program executes a COPY command to load the files into an Amazon Redshift cluster.
Which program modification will accelerate the COPY process?

  • A. Split the number of files so they are equal to a multiple of the number of slices in the Amazon Redshift cluster. Gzip and upload the files to Amazon S3. Run the COPY command on the files.
  • B. Split the number of files so they are equal to a multiple of the number of compute nodes in the Amazon Redshift cluster. Gzip and upload the files to Amazon S3. Run the COPY command on the files.
  • C. Apply sharding by breaking up the files so the distkey columns with the same values go to the same file. Gzip and upload the sharded files to Amazon S3. Run the COPY command on the files.
  • D. Upload the individual files to Amazon S3 and run the COPY command as soon as the files become available.

Answer: A
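
The arithmetic behind option A can be sketched as follows: pick a file count that is a multiple of the slice count while keeping each file under a size cap. The function name, the 16-slice cluster, and the 1 GiB cap per file are assumptions made for the example; the question does not state the cluster size or the compressed data size.

```python
import math


def files_for_copy(total_bytes, slice_count, max_file_bytes=1024**3):
    """Smallest multiple of slice_count that keeps every file <= max_file_bytes."""
    # Minimum number of files needed to respect the per-file size cap.
    min_files = math.ceil(total_bytes / max_file_bytes)
    # Round up to the next whole multiple of the slice count.
    multiples = max(1, math.ceil(min_files / slice_count))
    return multiples * slice_count


# Example: 100 GiB of data and a hypothetical cluster with 16 slices.
file_count = files_for_copy(100 * 1024**3, 16)
```

With an even multiple of the slice count, each slice receives the same number of files, so no slice sits idle while others are still loading.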


NEW QUESTION # 77
An education provider's learning management system (LMS) is hosted in a 100 TB data lake that is built on Amazon S3. The provider's LMS supports hundreds of schools. The provider wants to build an advanced analytics reporting platform using Amazon Redshift to handle complex queries with optimal performance.
System users will query the most recent 4 months of data 95% of the time while 5% of the queries will leverage data from the previous 12 months.
Which solution meets these requirements in the MOST cost-effective way?

  • A. Store the most recent 4 months of data in the Amazon Redshift cluster. Use Amazon Redshift Spectrum to query data in the data lake. Use S3 lifecycle management rules to store data from the previous 12 months in Amazon S3 Glacier storage.
  • B. Store the most recent 4 months of data in the Amazon Redshift cluster. Use Amazon Redshift Spectrum to query data in the data lake. Ensure the S3 Standard storage class is in use with objects in the data lake.
  • C. Store the most recent 4 months of data in the Amazon Redshift cluster. Use Amazon Redshift federated queries to join cluster data with the data lake to reduce costs. Ensure the S3 Standard storage class is in use with objects in the data lake.
  • D. Leverage DS2 nodes for the Amazon Redshift cluster. Migrate all data from Amazon S3 to Amazon Redshift. Decommission the data lake.

Answer: B


NEW QUESTION # 78
A marketing company has data in Salesforce, MySQL, and Amazon S3. The company wants to use data from these three locations and create mobile dashboards for its users. The company is unsure how it should create the dashboards and needs a solution with the least possible customization and coding.
Which solution meets these requirements?

  • A. Use Amazon Redshift federated queries to join the data sources. Use Amazon QuickSight to generate the mobile dashboards.
  • B. Use AWS Lake Formation to migrate the data sources into Amazon S3. Use Amazon QuickSight to generate the mobile dashboards.
  • C. Use Amazon QuickSight to connect to the data sources and generate the mobile dashboards.
  • D. Use Amazon Athena federated queries to join the data sources. Use Amazon QuickSight to generate the mobile dashboards.

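Answer: C

Explanation:
Amazon QuickSight can connect directly to Salesforce, MySQL, and Amazon S3 as data sources, and its dashboards can be viewed in the QuickSight mobile app, so no additional integration layer is needed. Amazon Redshift federated queries do not support Salesforce as a source.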


NEW QUESTION # 79
A financial services company needs to aggregate daily stock trade data from the exchanges into a data store.
The company requires that data be streamed directly into the data store, but also occasionally allows data to be modified using SQL. The solution should integrate complex, analytic queries running with minimal latency.
The solution must provide a business intelligence dashboard that enables viewing of the top contributors to anomalies in stock prices.
Which solution meets the company's requirements?

  • A. Use Amazon Kinesis Data Firehose to stream data to Amazon S3. Use Amazon Athena as a data source for Amazon QuickSight to create a business intelligence dashboard.
  • B. Use Amazon Kinesis Data Streams to stream data to Amazon S3. Use Amazon Athena as a data source for Amazon QuickSight to create a business intelligence dashboard.
  • C. Use Amazon Kinesis Data Firehose to stream data to Amazon Redshift. Use Amazon Redshift as a data source for Amazon QuickSight to create a business intelligence dashboard.
  • D. Use Amazon Kinesis Data Streams to stream data to Amazon Redshift. Use Amazon Redshift as a data source for Amazon QuickSight to create a business intelligence dashboard.

Answer: C


NEW QUESTION # 80
A company uses Amazon Redshift as its data warehouse. The Redshift cluster is not encrypted. A data analytics specialist needs to use hardware security module (HSM) managed encryption keys to encrypt the data that is stored in the Redshift cluster.
Which combination of steps will meet these requirements? (Select THREE.)

  • A. Modify the source cluster by activating encryption from an external HSM. Configure Amazon Redshift to automatically migrate data to a new encrypted cluster.
  • B. Stop all write operations on the source cluster. Unload data from the source cluster.
  • C. Copy the data to a new target cluster that is encrypted with an HSM from AWS CloudHSM.
  • D. Copy the data to a new target cluster that is encrypted with AWS Key Management Service (AWS KMS).
  • E. Rename the source cluster and the target cluster after the migration so that the target cluster is using the original endpoint.
  • F. Modify the source cluster by activating AWS CloudHSM encryption. Configure Amazon Redshift to automatically migrate data to a new encrypted cluster.

Answer: B,C,E


NEW QUESTION # 81
A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning.
Which actions should the data analyst take?

  • A. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
  • B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
  • C. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.
  • D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.

Answer: A


NEW QUESTION # 82
A media company has been performing analytics on log data generated by its applications. There has been a recent increase in the number of concurrent analytics jobs running, and the overall performance of existing jobs is decreasing as the number of new jobs is increasing. The partitioned data is stored in Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) and the analytic processing is performed on Amazon EMR clusters using the EMR File System (EMRFS) with consistent view enabled. A data analyst has determined that it is taking longer for the EMR task nodes to list objects in Amazon S3.
Which action would MOST likely increase the performance of accessing log data in Amazon S3?

  • A. Increase the read capacity units (RCUs) for the shared Amazon DynamoDB table.
  • B. Use a hash function to create a random string and add that to the beginning of the object prefixes when storing the log data in Amazon S3.
  • C. Redeploy the EMR clusters that are running slowly to a different Availability Zone.
  • D. Use a lifecycle policy to change the S3 storage class to S3 Standard for the log data.

Answer: A

Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emrfs-metadata.html


NEW QUESTION # 83
A company developed a new elections reporting website that uses Amazon Kinesis Data Firehose to deliver full logs from AWS WAF to an Amazon S3 bucket. The company is now seeking a low-cost option to perform this infrequent data analysis with visualizations of logs in a way that requires minimal development effort.
Which solution meets these requirements?

  • A. Use an AWS Glue crawler to create and update a table in the Glue data catalog from the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.
  • B. Create a second Kinesis Data Firehose delivery stream to deliver the log files to Amazon Elasticsearch Service (Amazon ES). Use Amazon ES to perform text-based searches of the logs for ad-hoc analyses and use Kibana for data visualizations.
  • C. Create an Amazon EMR cluster and use Amazon S3 as the data source. Create an Apache Spark job to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.
  • D. Create an AWS Lambda function to convert the logs into .csv format. Then add the function to the Kinesis Data Firehose transformation configuration. Use Amazon Redshift to perform ad-hoc analyses of the logs using SQL queries and use Amazon QuickSight to develop data visualizations.

Answer: A

Explanation:
https://aws.amazon.com/blogs/big-data/analyzing-aws-waf-logs-with-amazon-es-amazon-athena-and-amazon-qu


NEW QUESTION # 84
A company is planning to create a data lake in Amazon S3. The company wants to create tiered storage based on access patterns and cost objectives. The solution must include support for JDBC connections from legacy clients, metadata management that allows federation for access control, and batch-based ETL using PySpark and Scala. Operational management should be limited.
Which combination of components can meet these requirements? (Choose three.)

  • A. AWS Glue Data Catalog for metadata management
  • B. Amazon EMR with Apache Hive, using an Amazon RDS with MySQL-compatible backed metastore
  • C. Amazon EMR with Apache Hive for JDBC clients
  • D. Amazon Athena for querying data in Amazon S3 using JDBC drivers
  • E. Amazon EMR with Apache Spark for ETL
  • F. AWS Glue for Scala-based ETL

Answer: B,D,E


NEW QUESTION # 85
A company stores its sales and marketing data that includes personally identifiable information (PII) in Amazon S3. The company allows its analysts to launch their own Amazon EMR cluster and run analytics reports with the data. To meet compliance requirements, the company must ensure the data is not publicly accessible throughout this process. A data engineer has secured Amazon S3 but must ensure the individual EMR clusters created by the analysts are not exposed to the public internet.
Which solution should the data engineer use to meet this compliance requirement with the LEAST amount of effort?

  • A. Check the security group of the EMR clusters regularly to ensure it does not allow inbound traffic from IPv4 0.0.0.0/0 or IPv6 ::/0.
  • B. Create an EMR security configuration and ensure the security configuration is associated with the EMR clusters when they are created.
  • C. Enable the block public access setting for Amazon EMR at the account level before any EMR cluster is created.
  • D. Use AWS WAF to block public internet access to the EMR clusters across the board.

Answer: C

Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-block-public-access.html


NEW QUESTION # 86
A company wants to provide its data analysts with uninterrupted access to the data in its Amazon Redshift cluster. All data is streamed to an Amazon S3 bucket with Amazon Kinesis Data Firehose. An AWS Glue job that is scheduled to run every 5 minutes issues a COPY command to move the data into Amazon Redshift.
The amount of data delivered is uneven throughout the day, and cluster utilization is high during certain periods. The COPY command usually completes within a couple of seconds. However, when a load spike occurs, locks can exist and data can be missed. Currently, the AWS Glue job is configured to run without retries, with a timeout of 5 minutes and a concurrency of 1.
How should a data analytics specialist configure the AWS Glue job to optimize fault tolerance and improve data availability in the Amazon Redshift cluster?

  • A. Keep the number of retries at 0. Decrease the timeout value. Increase the job concurrency.
  • B. Keep the number of retries at 0. Decrease the timeout value. Keep the job concurrency at 1.
  • C. Keep the number of retries at 0. Increase the timeout value. Keep the job concurrency at 1.
  • D. Increase the number of retries. Decrease the timeout value. Increase the job concurrency.

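Answer: D

Explanation:
Retries allow a run that fails during a load spike to be attempted again, a shorter timeout ends a run that is stuck waiting on a lock sooner, and a higher concurrency lets the next scheduled run start even if an earlier run is still in progress. Keeping retries at 0 does not improve fault tolerance.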


NEW QUESTION # 87
A media company wants to perform machine learning and analytics on the data residing in its Amazon S3 data lake. There are two data transformation requirements that will enable the consumers within the company to create reports:
Daily transformations of 300 GB of data with different file formats landing in Amazon S3 at a scheduled time.
One-time transformations of terabytes of archived data residing in the S3 data lake.
Which combination of solutions cost-effectively meets the company's requirements for transforming the data? (Choose three.)

  • A. For archived data, use Amazon SageMaker to perform data transformations.
  • B. For daily incoming data, use Amazon Athena to scan and identify the schema.
  • C. For archived data, use Amazon EMR to perform data transformations.
  • D. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
  • E. For daily incoming data, use AWS Glue workflows with AWS Glue jobs to perform transformations.
  • F. For daily incoming data, use Amazon Redshift to perform transformations.

Answer: C,D,E


NEW QUESTION # 88
A company wants to run analytics on its Elastic Load Balancing logs stored in Amazon S3. A data analyst needs to be able to query all data from a desired year, month, or day. The data analyst should also be able to query a subset of the columns. The company requires minimal operational overhead and the most cost-effective solution.
Which approach meets these requirements for optimizing and querying the log data?

  • A. Launch a long-running Amazon EMR cluster that continuously transforms new log files from Amazon S3 into its Hadoop Distributed File System (HDFS) storage and partitions by year, month, and day. Use Apache Presto to query the optimized format.
  • B. Use an AWS Glue job nightly to transform new log files into Apache Parquet format and partition by year, month, and day. Use AWS Glue crawlers to detect new partitions. Use Amazon Athena to query data.
  • C. Launch a transient Amazon EMR cluster nightly to transform new log files into Apache ORC format and partition by year, month, and day. Use Amazon Redshift Spectrum to query the data.
  • D. Use an AWS Glue job nightly to transform new log files into .csv format and partition by year, month, and day. Use AWS Glue crawlers to detect new partitions. Use Amazon Athena to query data.

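Answer: B

Explanation:
Apache Parquet is a columnar format, so queries that read a subset of the columns scan less data, and partitioning by year, month, and day limits each query to the requested dates. AWS Glue and Amazon Athena are serverless, whereas Amazon Redshift Spectrum requires a running Amazon Redshift cluster and the EMR options require cluster management.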
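
The year, month, and day partitioning that the options describe is commonly laid out in Amazon S3 as Hive-style `key=value` prefixes, which crawlers and query engines can map to partition columns. A minimal sketch, assuming a hypothetical bucket name and zero-padded month and day values:

```python
from datetime import date


def partition_prefix(base, day):
    """Build a Hive-style partition prefix for one day of log data."""
    # rstrip avoids a double slash when the base already ends with "/".
    return (
        f"{base.rstrip('/')}/year={day.year}"
        f"/month={day.month:02d}/day={day.day:02d}/"
    )


# Example with a placeholder bucket and path.
prefix = partition_prefix("s3://example-bucket/elb-logs/", date(2024, 3, 5))
```

A query that filters on `year`, `month`, or `day` then only needs to read objects under the matching prefixes.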


NEW QUESTION # 89
A marketing company has an application that stores event data in an Amazon RDS database. The company is replicating this data to Amazon Redshift for reporting and business intelligence (BI) purposes. New event data is continuously generated and ingested into the RDS database throughout the day and captured by a change data capture (CDC) replication task in AWS Database Migration Service (AWS DMS). The company requires that the new data be replicated to Amazon Redshift in near-real time.
Which solution meets these requirements?

  • A. Use Amazon Kinesis Data Firehose as the destination of the CDC replication task in AWS DMS. Use an AWS Glue streaming job to read changed records from Kinesis Data Firehose and perform an upsert into the Redshift cluster.
  • B. Use Amazon DynamoDB as the destination of the CDC replication task in AWS DMS. Use the COPY command to load data into the Redshift cluster.
  • C. Use Amazon Kinesis Data Streams as the destination of the CDC replication task in AWS DMS. Use an AWS Glue streaming job to read changed records from Kinesis Data Streams and perform an upsert into the Redshift cluster.
  • D. Use Amazon S3 as the destination of the CDC replication task in AWS DMS. Use the COPY command to load data into the Redshift cluster.

Answer: C


NEW QUESTION # 90
A retail company's data analytics team recently created multiple product sales analysis dashboards for the average selling price per product using Amazon QuickSight. The dashboards were created from .csv files uploaded to Amazon S3. The team is now planning to share the dashboards with the respective external product owners by creating individual users in Amazon QuickSight. For compliance and governance reasons, restricting access is a key requirement. The product owners should view only their respective product analysis in the dashboard reports.
Which approach should the data analytics team take to allow product owners to view only their products in the dashboard?

  • A. Create dataset rules with row-level security.
  • B. Create a manifest file with row-level security.
  • C. Separate the data by product and use S3 bucket policies for authorization.
  • D. Separate the data by product and use IAM policies for authorization.

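Answer: A

Explanation:
Amazon QuickSight restricts the rows each user can see through row-level security, which is configured by attaching dataset rules to the dataset. IAM policies and S3 bucket policies do not filter rows inside a QuickSight dashboard.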
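
The dataset rules mentioned in option A are supplied to QuickSight as a small permissions dataset that maps each user to the field values that user may see. The sketch below builds such a rules file as CSV text. The user names, the `product` field name, and the helper function are assumptions made for the example; the field name would need to match a column in the actual sales dataset.

```python
import csv
import io


def build_rls_rules(owner_products, field="product"):
    """Return CSV text with one rule row per user.

    owner_products: dict mapping a QuickSight user name to the list of
    products that user is allowed to see (hypothetical data).
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["UserName", field])
    for user, products in sorted(owner_products.items()):
        # Multiple allowed values go into one comma-separated cell,
        # which csv.writer quotes automatically.
        writer.writerow([user, ",".join(products)])
    return buf.getvalue()


rules_csv = build_rls_rules({"alice": ["widgets", "gadgets"], "bob": ["gizmos"]})
```

The resulting file would be uploaded as its own dataset and attached to the sales dataset as its row-level security rules.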


NEW QUESTION # 91
An ecommerce company uses Amazon Aurora PostgreSQL to process and store live transactional data and uses Amazon Redshift for its data warehouse solution. A nightly ETL job has been implemented to update the Redshift cluster with new data from the PostgreSQL database. The business has grown rapidly and so has the size and cost of the Redshift cluster. The company's data analytics team needs to create a solution to archive historical data and only keep the most recent 12 months of data in Amazon Redshift to reduce costs. Data analysts should also be able to run analytics queries that effectively combine data from live transactional data in PostgreSQL, current data in Redshift, and archived historical data.
Which combination of tasks will meet these requirements? (Select THREE.)

  • A. Configure Amazon Redshift Spectrum to query live transactional data in the PostgreSQL database.
  • B. Schedule a monthly job to copy data older than 12 months to Amazon S3 Glacier Flexible Retrieval by using the UNLOAD command, and then delete that data from the Redshift cluster. Configure Redshift Spectrum to access historical data with S3 Glacier Flexible Retrieval.
  • C. Configure the Amazon Redshift Federated Query feature to query live transactional data in the PostgreSQL database.
  • D. Create a materialized view in Amazon Redshift that combines live, current, and historical data from different sources.
  • E. Create a late-binding view in Amazon Redshift that combines live, current, and historical data from different sources.
  • F. Schedule a monthly job to copy data older than 12 months to Amazon S3 by using the UNLOAD command, and then delete that data from the Redshift cluster. Configure Amazon Redshift Spectrum to access historical data in Amazon S3.

Answer: C,E,F


NEW QUESTION # 92
A banking company is currently using an Amazon Redshift cluster with dense storage (DS) nodes to store sensitive data. An audit found that the cluster is unencrypted. Compliance requirements state that a database with sensitive data must be encrypted through a hardware security module (HSM) with automated key rotation.
Which combination of steps is required to achieve compliance? (Choose two.)

  • A. Enable Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) encryption in the HSM.
  • B. Set up a trusted connection with HSM using a client and server certificate with automatic key rotation.
  • C. Modify the cluster with an HSM encryption option and automatic key rotation.
  • D. Create a new HSM-encrypted Amazon Redshift cluster and migrate the data to the new cluster.
  • E. Enable HSM with key rotation through the AWS CLI.

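Answer: B,D

Explanation:
Amazon Redshift cannot switch an existing unencrypted cluster to HSM encryption in place. The path is to set up a trusted connection between Amazon Redshift and the HSM using client and server certificates, create a new HSM-encrypted cluster, and migrate the data to the new cluster.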


NEW QUESTION # 93
A business intelligence (BI) engineer must create a dashboard to visualize how often certain keywords are used in relation to others in social media posts about a public figure. The BI engineer extracts the keywords from the posts and loads them into an Amazon Redshift table. The table displays the keywords and the count corresponding to each keyword.
The BI engineer needs to display the top keywords with more emphasis on the most frequently used keywords.
Which visual type in Amazon QuickSight meets these requirements?

  • A. Circle packing
  • B. Word clouds
  • C. Bar charts
  • D. Heat maps

Answer: B


NEW QUESTION # 94
A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into Amazon Redshift. The engineering team will combine all the user data and apply complex calculations that require I/O intensive resources. The company needs to adjust the cluster's capacity to support the change in analytical and storage requirements.
Which solution meets these requirements?

  • A. Resize the cluster using classic resize with dense storage nodes.
  • B. Resize the cluster using elastic resize with dense compute nodes.
  • C. Resize the cluster using elastic resize with dense storage nodes.
  • D. Resize the cluster using classic resize with dense compute nodes.

Answer: C


NEW QUESTION # 95
A retail company is building its data warehouse solution using Amazon Redshift. As a part of that effort, the company is loading hundreds of files into the fact table created in its Amazon Redshift cluster. The company wants the solution to achieve the highest throughput and optimally use cluster resources when loading data into the company's fact table.
How should the company meet these requirements?

  • A. Use S3DistCp to load multiple files into the Hadoop Distributed File System (HDFS) and use an HDFS connector to ingest the data into the Amazon Redshift cluster.
  • B. Use LOAD commands equal to the number of Amazon Redshift cluster nodes and load the data in parallel into each node.
  • C. Use a single COPY command to load the data into the Amazon Redshift cluster.
  • D. Use multiple COPY commands to load the data into the Amazon Redshift cluster.

Answer: C

Explanation:
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-single-copy-command.html
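
A single COPY command can load many files at once by pointing at a manifest that lists them. The sketch below builds such a manifest and the matching COPY statement as text. The bucket, table name, and IAM role ARN are placeholders, and the helper function is hypothetical; nothing here connects to Amazon S3 or Amazon Redshift.

```python
import json


def build_manifest(urls):
    """Return manifest JSON listing every file the single COPY should load."""
    entries = [{"url": u, "mandatory": True} for u in urls]
    return json.dumps({"entries": entries}, indent=2)


# Placeholder object keys for the files to load.
files = [f"s3://example-bucket/fact/part-{i:04d}.gz" for i in range(4)]
manifest_json = build_manifest(files)

# One COPY statement that references the uploaded manifest (placeholder names).
copy_sql = (
    "COPY sales_fact "
    "FROM 's3://example-bucket/fact/load.manifest' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole' "
    "GZIP MANIFEST;"
)
```

Because one command sees all the files, Amazon Redshift can spread them across the slices itself, which is what running several concurrent COPY commands against the same table prevents.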


NEW QUESTION # 96
A company uses Amazon Elasticsearch Service (Amazon ES) to store and analyze its website clickstream data. The company ingests 1 TB of data daily using Amazon Kinesis Data Firehose and stores one day's worth of data in an Amazon ES cluster.
The company has very slow query performance on the Amazon ES index and occasionally sees errors from Kinesis Data Firehose when attempting to write to the index. The Amazon ES cluster has 10 nodes running a single index and 3 dedicated master nodes. Each data node has 1.5 TB of Amazon EBS storage attached and the cluster is configured with 1,000 shards. Occasionally, JVMMemoryPressure errors are found in the cluster logs.
Which solution will improve the performance of Amazon ES?

  • A. Decrease the number of Amazon ES data nodes.
  • B. Decrease the number of Amazon ES shards for the index.
  • C. Increase the memory of the Amazon ES master nodes.
  • D. Increase the number of Amazon ES shards for the index.

Answer: B

Explanation:
https://aws.amazon.com/premiumsupport/knowledge-center/high-jvm-memory-pressure-elasticsearch/
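
A quick calculation shows why 1,000 shards is too many for this index. The sketch assumes 1 TB is 1,024 GB, ignores replicas and indexing overhead, and uses a 30 GB target shard size as one point inside the commonly recommended 10 to 50 GB range; the function names are hypothetical.

```python
import math


def shard_size_gb(index_gb, shard_count):
    """Average amount of data held by each shard."""
    return index_gb / shard_count


def suggested_shards(index_gb, target_shard_gb=30):
    """Shard count that brings the average shard close to the target size."""
    return math.ceil(index_gb / target_shard_gb)


current_size = shard_size_gb(1024, 1000)   # roughly 1 GB per shard
shards_per_node = 1000 / 10                # 100 shards on each data node
better_count = suggested_shards(1024)      # a few dozen shards instead
```

Each shard carries its own memory overhead on the data node that hosts it, so many tiny shards contribute to the JVM memory pressure that the question describes.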


NEW QUESTION # 97
......


The AWS Certified Data Analytics - Specialty (DAS-C01) certification exam is designed for professionals who work with data analytics in the Amazon Web Services (AWS) environment. The exam tests the candidate's knowledge and skills in designing, building, and maintaining data analytics solutions using AWS services. It covers a wide range of topics, including data collection, data storage, data processing, and data visualization.


The certification is suitable for data analysts, data engineers, and other professionals who work with data on a regular basis and want to validate their expertise in data analytics on the AWS platform.

 

Latest Questions AWS-Certified-Data-Analytics-Specialty Guide to Prepare Free Practice Tests: https://actual4test.torrentvce.com/AWS-Certified-Data-Analytics-Specialty-valid-vce-collection.html