Skip to main content

Athena Start Query Execution

Runs SQL queries against data stored in Amazon S3 using Amazon Athena and provides a unique ID of the corresponding query execution.

component image

In-ports

query String — SQL query.

trigger <any> — triggers a query execution.

abort Boolean — aborts all pending query executions.

config JSON (dynamic) — accepts a JSON object with configuration properties that can be set at runtime.

Out-ports

query-execution JSON — emits query execution object including queryExecutionId property.

errors JSON — emits any errors that occur during query execution.

Overview

The Athena Start Query Execution component allows you to run SQL queries against data stored in Amazon S3 using Amazon Athena. This component is part of the group of components that are designed to perform advanced operations with Amazon Athena. Simpler version of Athena interface is available in the Athena Query component.

This component implements the StartQueryExecution Athena action and returns a query execution object that includes a unique queryExecutionId and other query parameters. For more information about the query execution object, see the Query Execution API documentation.

To use this component, you will need to connect it to your Amazon Athena account and pass an SQL query to the query port. The component will start query execution and return a query execution object on the query-execution port. You can then use the Athena Get Query Execution component to monitor the status of the query, and use the Athena Get Results component to retrieve the query results.

If the trigger input port is connected, the component will wait to execute the query until an event is sent on the trigger port. The component will execute the query every time an event is received on the trigger port. If any errors occur during query execution, the corresponding error message will be emitted on the errors output port.

Reusing query results

Kelp generates a unique token for each request using a combination of query parameters and the application ID. The underlying Athena query will only be executed once per unique token, and subsequent requests with the same token will use the previous results. By default, the component generates a unique token for every request, which forces Athena to execute the query every time.

You can optionally choose to reuse the last stored query result by turning on the Reuse query results setting. When the setting is on, the token will not be regenerated until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded. This feature can improve performance and reduce costs by avoiding unnecessary query execution and data scanning.

Configuration

This component supports dynamic configuration. You can specify the required settings either in the Settings dialog or through a configuration object. To enable the config port for runtime configuration, turn on the Enable realtime config port setting.

Settings

Authentication

Configure authentication to the target service. Select one of the existing connections from the drop-down list, or configure a new connection.

Enable realtime config port

If this setting is enabled, the component can be configured through the config port. This port accepts a configuration object as input and allows you to set dynamic properties at runtime. Note that using this port does not cause the component to reinitialize, but it may cause some previous state of the component to be lost.

Database (database)

The name of the database against which the query is executed.

Type: String
Required: Yes

Workgroup (workGroup)

The workgroup name. Athena workgroups allow you to isolate the queries for you or your group of users from others in the same account, and to configure the location of query results and the encryption configuration.

Type: String
Required: Yes

Region (region)

The name of AWS Region in which you are using Athena. Athena allows you to query Amazon S3 data in a different AWS Region than the one in which you are using Athena. Learn more about querying across AWS regions.

Type: String
Required: Yes

Output location (outputLocation)

The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/.

Type: String
Required: Yes

Max query wait time (waitTimeout)

The maximum amount of time, in milliseconds, to wait for a query to complete successfully.

Type: Number
Default: 5000
Required: No

Encryption option (encryptionOption)

Athena supports several encryption options for storing datasets and query results in Amazon S3. Select the encryption option to use:

  • SSE_S3 (SSE_S3) — Server-side encryption (SSE) with an Amazon S3-managed key.
  • SSE_KMS (SSE_KMS) — Server-side encryption (SSE) with a AWS Key Management Service customer managed key.
  • CSE_KMS (CSE_KMS) — Client-side encryption (CSE) with a AWS KMS customer managed key

Type: String
Required: No

KMS Key (kmsKey)

Provide ARN or ID of your AWS KMS key if you use SSE_KMS or CSE_KMS encryption option.

Type: String
Required: No

Reuse query results (reuseQueryResults)

If this setting is enabled, the component will return the last stored query result when re-running a query until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded.

Type: Boolean
Default: true
Required: No

Keep always active

Determines whether the component will remain active even if it is not connected to a visible widget or another active component.

Configuration object

Here is an example of a configuration object that you can use as a template:

{
"database": "my_database",
"workGroup": "my_workgroup",
"region": "us-east-1",
"outputLocation": "s3://aws-athena-results/test",
"encryptionOption": "SSE_KMS",
"waitTimeout": 5000,
"kmsKey": "my-key"
}

See also

For more information about Amazon Athena, see the following: