Athena Start Query Execution
Runs SQL queries against data stored in Amazon S3 using Amazon Athena and provides a unique ID of the corresponding query execution.
In-ports
query String
— SQL query.
trigger <any>
— triggers a query execution.
abort Boolean
— aborts all pending query executions.
config JSON
(dynamic) — accepts a JSON object with configuration properties that can be set at runtime.
Out-ports
query-execution JSON
— emits the query execution object, including the queryExecutionId property.
errors JSON
— emits any errors that occur during query execution.
Overview
The Athena Start Query Execution component runs SQL queries against data stored in Amazon S3 using Amazon Athena. It belongs to a group of components designed for advanced operations with Amazon Athena. A simpler Athena interface is available in the Athena Query component.
This component implements the StartQueryExecution Athena action and returns a query execution object that includes a unique queryExecutionId and other query parameters. For more information about the query execution object, see the Query Execution API documentation.
To use this component, you will need to connect it to your Amazon Athena account and pass an SQL query to the query port. The component will start query execution and return a query execution object on the query-execution port. You can then use the Athena Get Query Execution component to monitor the status of the query, and use the Athena Get Results component to retrieve the query results.
If the trigger input port is connected, the component will wait to execute the query until an event is sent on the trigger port. The component will execute the query every time an event is received on the trigger port. If any errors occur during query execution, the corresponding error message will be emitted on the errors output port.
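As a rough illustration of what the StartQueryExecution call involves, the sketch below maps a query string and the component settings onto the request parameters of the Athena API. The helper names `build_start_query_request` and `extract_execution_id` are hypothetical, and how the component assembles its request internally is an assumption; only the Athena parameter names (`QueryString`, `QueryExecutionContext`, `WorkGroup`, `ResultConfiguration`, `QueryExecutionId`) come from the AWS API.

```python
from typing import Any, Dict


def build_start_query_request(query: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Map a SQL query and component settings onto StartQueryExecution parameters.

    Hypothetical helper: the setting names follow this page, the parameter
    names follow the Athena API.
    """
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": config["database"]},
        "WorkGroup": config["workGroup"],
        "ResultConfiguration": {"OutputLocation": config["outputLocation"]},
    }


def extract_execution_id(response: Dict[str, Any]) -> str:
    """Return the ID that later status and result lookups are keyed on."""
    return response["QueryExecutionId"]


# With boto3 installed and AWS credentials configured, the request could be
# sent like this (not executed here):
#
#   import boto3
#   client = boto3.client("athena", region_name=config["region"])
#   response = client.start_query_execution(**request)
#   query_execution_id = extract_execution_id(response)
```

The returned ID is the value that the Athena Get Query Execution and Athena Get Results components would need in order to look up the same query.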
Reusing query results
Kelp generates a unique token for each request using a combination of query parameters and the application ID. The underlying Athena query will only be executed once per unique token, and subsequent requests with the same token will use the previous results. By default, the component generates a unique token for every request, which forces Athena to execute the query every time.
You can optionally choose to reuse the last stored query result by turning on the Reuse query results setting. When the setting is on, the token will not be regenerated until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded. This feature can improve performance and reduce costs by avoiding unnecessary query execution and data scanning.
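The token behavior described above can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the `TokenGenerator` class, its hashing scheme, and the in-memory cache are all assumptions made for the example. Only the inputs (query, parameters, application ID) and the 60-minute maximum age come from the description.

```python
import hashlib
import json
import time
import uuid
from typing import Any, Callable, Dict, Tuple

MAX_REUSE_AGE_SECONDS = 60 * 60  # 60 minutes, as stated in the text


class TokenGenerator:
    """Hypothetical request-token generator with optional result reuse."""

    def __init__(self, app_id: str, clock: Callable[[], float] = time.time) -> None:
        self.app_id = app_id
        self.clock = clock
        # fingerprint -> (token, time the token was issued)
        self._issued: Dict[str, Tuple[str, float]] = {}

    def _fingerprint(self, query: str, params: Dict[str, Any]) -> str:
        # Stable hash of everything that identifies "the same request".
        payload = json.dumps(
            {"app": self.app_id, "query": query, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def token_for(self, query: str, params: Dict[str, Any], reuse: bool) -> str:
        if not reuse:
            # A fresh random token forces a new execution every time.
            return uuid.uuid4().hex
        key = self._fingerprint(query, params)
        now = self.clock()
        cached = self._issued.get(key)
        if cached is not None and now - cached[1] <= MAX_REUSE_AGE_SECONDS:
            return cached[0]  # same query and parameters, still within max age
        # First request, or max age exceeded: issue a new token.
        token = hashlib.sha256(f"{key}:{now}".encode("utf-8")).hexdigest()
        self._issued[key] = (token, now)
        return token
```

With reuse enabled, two calls with the same query and parameters inside the 60-minute window return the same token. Changing the query, changing a parameter, or letting the window lapse yields a new one.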
Configuration
This component supports dynamic configuration. You can specify the required settings either in the Settings dialog or through a configuration object. To enable the config port for runtime configuration, turn on the Enable realtime config port setting.
Settings
Authentication
Configure authentication to the target service. Select one of the existing connections from the drop-down list, or configure a new connection.
Enable realtime config port
If this setting is enabled, the component can be configured through the config port. This port accepts a configuration object as input and allows you to set dynamic properties at runtime. Note that using this port does not cause the component to reinitialize, but it may cause some previous state of the component to be lost.
Database (database)
The name of the database against which the query is executed.
Type: String
Required: Yes
Workgroup (workGroup)
The workgroup name. Athena workgroups allow you to isolate the queries for you or your group of users from others in the same account, and to configure the location of query results and the encryption configuration.
Type: String
Required: Yes
Region (region)
The name of the AWS Region in which you are using Athena. Athena allows you to query Amazon S3 data in a different AWS Region than the one in which you are using Athena. Learn more about querying across AWS Regions.
Type: String
Required: Yes
Output location (outputLocation)
The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/.
Type: String
Required: Yes
Max query wait time (waitTimeout)
The maximum amount of time, in milliseconds, to wait for a query to complete successfully.
Type: Number
Default: 5000
Required: No
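One way to picture a wait limit of this kind is a polling loop with a deadline. The sketch below is an assumption about how such a timeout could be applied, not a description of the component's internals. `wait_for_query`, `get_state`, and `poll_interval_ms` are hypothetical names. The state names are those reported by Athena for a query execution.

```python
import time
from typing import Callable

# Athena reports these states once a query has finished one way or another.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(
    get_state: Callable[[], str],
    wait_timeout_ms: int = 5000,
    poll_interval_ms: int = 250,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a terminal state or the timeout elapses.

    get_state is a caller-supplied function (hypothetical) that returns the
    current state of the query, for example "QUEUED" or "RUNNING".
    """
    deadline = clock() + wait_timeout_ms / 1000.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"query still {state} after {wait_timeout_ms} ms"
            )
        sleep(poll_interval_ms / 1000.0)
```

The clock and sleep functions are injectable so the loop can be exercised without real delays.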
Encryption option (encryptionOption)
Athena supports several encryption options for storing datasets and query results in Amazon S3. Select the encryption option to use:
- SSE_S3 (SSE_S3) — Server-side encryption (SSE) with an Amazon S3-managed key.
- SSE_KMS (SSE_KMS) — Server-side encryption (SSE) with an AWS Key Management Service customer managed key.
- CSE_KMS (CSE_KMS) — Client-side encryption (CSE) with an AWS KMS customer managed key.
Type: String
Required: No
KMS Key (kmsKey)
Provide the ARN or ID of your AWS KMS key if you use the SSE_KMS or CSE_KMS encryption option.
Type: String
Required: No
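The relationship between the encryption option and the KMS key can be sketched as a small mapping onto Athena's `EncryptionConfiguration` structure (`EncryptionOption`, `KmsKey`). The function name `build_encryption_configuration` is hypothetical, and treating a missing key as an error for the KMS-based options is an assumption based on the description above.

```python
from typing import Dict, Optional

KMS_OPTIONS = {"SSE_KMS", "CSE_KMS"}
ALL_OPTIONS = {"SSE_S3"} | KMS_OPTIONS


def build_encryption_configuration(
    encryption_option: Optional[str], kms_key: Optional[str] = None
) -> Optional[Dict[str, str]]:
    """Return an Athena EncryptionConfiguration dict, or None if no option is set."""
    if not encryption_option:
        return None  # encryption is optional, so nothing is added to the request
    if encryption_option not in ALL_OPTIONS:
        raise ValueError(f"unsupported encryption option: {encryption_option}")
    configuration = {"EncryptionOption": encryption_option}
    if encryption_option in KMS_OPTIONS:
        if not kms_key:
            raise ValueError(f"{encryption_option} requires a KMS key ARN or ID")
        configuration["KmsKey"] = kms_key
    return configuration
```

In a StartQueryExecution request, the resulting dictionary would sit under `ResultConfiguration` as `EncryptionConfiguration`.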
Reuse query results (reuseQueryResults)
If this setting is enabled, the component will return the last stored query result when re-running a query until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded.
Type: Boolean
Default: true
Required: No
Keep always active
Determines whether the component will remain active even if it is not connected to a visible widget or another active component.
Configuration object
Here is an example of a configuration object that you can use as a template:
{
  "database": "my_database",
  "workGroup": "my_workgroup",
  "region": "us-east-1",
  "outputLocation": "s3://aws-athena-results/test",
  "encryptionOption": "SSE_KMS",
  "waitTimeout": 5000,
  "kmsKey": "my-key"
}
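Before sending a configuration object to the config port, it can help to check it against the settings listed on this page. The sketch below is one possible pre-flight check. `validate_config` is a hypothetical helper, and the specific rules (for example, requiring an `s3://` prefix) are assumptions derived from the setting descriptions rather than the component's own validation.

```python
from typing import Any, Dict, List

REQUIRED_STRINGS = ("database", "workGroup", "region", "outputLocation")
ENCRYPTION_OPTIONS = ("SSE_S3", "SSE_KMS", "CSE_KMS")


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []

    # Settings marked "Required: Yes" on this page.
    for key in REQUIRED_STRINGS:
        value = config.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key}: required non-empty string")

    location = config.get("outputLocation")
    if isinstance(location, str) and location and not location.startswith("s3://"):
        problems.append("outputLocation: expected an s3:// URI")

    # Optional settings, checked only for type and allowed values.
    timeout = config.get("waitTimeout", 5000)
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)) or timeout < 0:
        problems.append("waitTimeout: expected a non-negative number of milliseconds")

    option = config.get("encryptionOption")
    if option is not None and option not in ENCRYPTION_OPTIONS:
        problems.append("encryptionOption: expected SSE_S3, SSE_KMS, or CSE_KMS")

    reuse = config.get("reuseQueryResults", True)
    if not isinstance(reuse, bool):
        problems.append("reuseQueryResults: expected a boolean")

    return problems
```

Collecting every problem into a list, rather than failing on the first one, lets a caller report all issues in a single pass.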
Related
- Athena Simple Query
- Athena List Query Executions
- Athena Get Query Execution
- Athena Get Results
- Connecting to Data
See also
For more information about Amazon Athena, see the following: