Skip to main content

Athena Simple Query

Runs SQL queries against data stored in Amazon S3 using Amazon Athena.

component image

In-ports

query String — SQL query.

trigger <any> — triggers the query execution.

abort Boolean — aborts all pending queries.

config JSON (dynamic) — accepts a JSON object with configuration properties that can be set at runtime.

Out-ports

result-set JSON — aggregates and emits a complete result set (all pages).

by-page JSON — emits query results one page at a time until the query is complete. Note that result-set and by-page ports can work simultaneously.

errors JSON — emits any errors that occur during query execution.

Overview

Athena Query component allows you to run SQL queries on data stored in Amazon S3 using Amazon Athena. It is a compact, single-component version of the Amazon Athena integration and intended for queries that produce quick and short result sets. For more advanced management of query executions and query results use Athena Start Query Execution and Athena Get Results components.

To use this component, you will need to connect it to your Amazon Athena account and pass an SQL query to the query port. The component will create a QueryExecution and iterate through the results pages until the end of the query is reached. Each page of results will be emitted on the by-page port, and the complete result set will be returned on the result-set port.

If the trigger input port is connected, the component will wait to execute the query until an event is received on the trigger port. The component will execute the query every time an event is received on the trigger port. If any errors occur during query execution, the corresponding error message will be emitted on the errors output port. To abort query execution, simply pass any event on the abort input port.

Reusing query results

Kelp generates a unique token for each request using a combination of query parameters and the application ID. The underlying Athena query will only be executed once per unique token, and subsequent requests with the same token will use the previous results. By default, the component generates a unique token for every request, which forces Athena to execute the query every time.

You can optionally choose to reuse the last stored query result by turning on the Reuse query results setting. When the setting is on, the token will not be regenerated until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded. This feature can improve performance and reduce costs by avoiding unnecessary query execution and data scanning.

Configuration

Athena Query component supports dynamic configuration. You can specify the required Athena settings either in the Settings dialog or through a configuration object. To enable the config port for runtime configuration, turn on the Enable realtime config port setting.

Settings

Authentication

Configure authentication to the target service. Select one of the existing connections from the drop-down list, or configure a new connection.

Enable realtime config port

If this setting is enabled, the component can be configured through the config port. This port accepts a configuration object as input and allows you to set dynamic properties at runtime. Note that using this port does not cause the component to reinitialize, but it may cause some previous state of the component to be lost.

Database (database)

The name of the database against which the query is executed.

Type: String
Required: Yes

Workgroup (workGroup)

The workgroup name. Athena workgroups allow you to isolate the queries for you or your group of users from others in the same account, and to configure the location of query results and the encryption configuration.

Type: String
Required: Yes

Region (region)

The name of AWS Region in which you are using Athena. Athena allows you to query Amazon S3 data in a different AWS Region than the one in which you are using Athena. Learn more about querying across AWS regions.

Type: String
Required: Yes

Output location (outputLocation)

The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/.

Type: String
Required: Yes

Max query wait time (waitTimeout)

The maximum amount of time, in milliseconds, to wait for a query to complete successfully.

Type: Number
Default: 5000
Required: No

Results page size (maxResults)

The maximum number of results (rows) to return in one page request.

Type: Number
Default: 1000
Required: Yes

Max number of pages (maxPages)

The maximum number of pages to return. Set the value to 0 to return all pages.

Type: Number
Default: 5
Required: Yes

Encryption option (encryptionOption)

Athena supports several encryption options for storing datasets and query results in Amazon S3. Select the encryption option to use:

  • SSE_S3 (SSE_S3) — Server-side encryption (SSE) with an Amazon S3-managed key.
  • SSE_KMS (SSE_KMS) — Server-side encryption (SSE) with a AWS Key Management Service customer managed key.
  • CSE_KMS (CSE_KMS) — Client-side encryption (CSE) with a AWS KMS customer managed key

Type: String
Required: No

KMS Key (kmsKey)

Provide ARN or ID of your AWS KMS key if you use SSE_KMS or CSE_KMS encryption option.

Type: String
Required: No

Reuse query results (reuseQueryResults)

If this setting is enabled, the component will return the last stored query result when re-running a query until the query or parameters change or the maximum age for reuse (60 minutes) is exceeded.

Type: Boolean
Default: true
Required: No

Keep always active

Determines whether the component will remain active even if it is not connected to a visible widget or another active component.

Examples

Configuration object

Here is an example of a configuration object that you can use as a template:

{
"database": "my_database",
"workGroup": "my_workgroup",
"region": "us-east-1",
"outputLocation": "s3://aws-athena-results/test",
"encryptionOption": "SSE_KMS",
"waitTimeout": 5000,
"maxResults": 1000,
"maxPages": 5,
"kmsKey": "my-key"
}

See also

For more information about Amazon Athena, see the following: