AWS S3 Node Configuration Guide

The AWS S3 Node retrieves files from an Amazon S3 bucket and ingests them into a Rayven workflow. It polls a folder at regular intervals, parses supported file types (e.g., JSON), and extracts metadata.

What It Does

This node connects securely to an Amazon S3 bucket, downloads files from a specified folder, and emits each file's content as a JSON payload. It can also tag the payload with UID, timestamp, and filename information for traceability and routing.

Step-by-Step: How to Configure the AWS S3 Node

Add the node
- Drag the AWS S3 Node from the Inputs section into the workflow canvas.
Open the configuration panel
- Double-click the node to set up credentials, folder targeting, and extraction rules.
Connect downstream nodes
- Link to nodes that will process or visualize the imported data.

Configuration Fields

AWS Connection & Bucket Access

Field	Requirement	Description
Node Name*	Required	Internal name for the node in your workflow.
S3 Bucket Name*	Required	Name of the S3 bucket to read files from.
AWS Access Key ID*	Required	IAM access key for S3 authentication.
AWS Secret Access Key*	Required	Secret access key paired with the above.
S3 Folder Name*	Required	The folder (prefix) inside the bucket to monitor. Use `/` or leave blank to target the root.
AWS Region*	Required	AWS region where the bucket is located (e.g., `ap-northeast-1`).

Polling Settings

Field	Requirement	Description
File Download Interval (minutes)*	Required	How often to poll the S3 folder for new files (e.g., every 5 minutes).
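
As a rough illustration of what interval polling implies, the sketch below simulates two polling cycles over a folder listing and returns only the files that were not seen in an earlier cycle. The names `select_new_keys` and `seen` are hypothetical, and tracking files by object key is an assumption; this guide does not describe how the node tracks files internally.

```python
from typing import Iterable, List, Set


def select_new_keys(listed_keys: Iterable[str], seen: Set[str]) -> List[str]:
    """Return keys not seen in earlier polls and record them as seen."""
    new_keys = []
    for key in listed_keys:
        if key not in seen:
            seen.add(key)
            new_keys.append(key)
    return new_keys


# Hypothetical tracking state, kept between polling cycles.
seen: Set[str] = set()

# First poll: both files are new.
first = select_new_keys(["exports/a.json", "exports/b.json"], seen)

# Second poll, one interval later: only the newly uploaded file is returned.
second = select_new_keys(
    ["exports/a.json", "exports/b.json", "exports/c.json"], seen
)

print(first)   # ['exports/a.json', 'exports/b.json']
print(second)  # ['exports/c.json']
```

A shorter interval picks up new files sooner at the cost of more frequent listing requests against the bucket.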

Payload Parsing

Field	Requirement	Description
Payload Format*	Required	Format of the file: currently supports `JSON`.
UID JSON Key / UID Column / XML Tag	Optional	Name of the key (or column or tag, depending on the file's structure) whose value is extracted as the UID.

Advanced Features

Field	Description
Filter by File Name	Optional pattern to limit files by name (e.g., `.json`, `sensor_.txt`).
Outgoing JSON Key for Filename	If specified, adds the source filename to the payload under this key.
Remove Extension from Filename	If enabled, strips the file extension (e.g., `.json`) from the filename.
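
The following sketch shows one way to reason about these three options together. It assumes the filter is a plain substring match, which this guide does not confirm, and the function names `matches_filter` and `tag_filename` are hypothetical. The key `source_file` is borrowed from the output example in this guide.

```python
import os
from typing import Optional


def matches_filter(filename: str, pattern: Optional[str]) -> bool:
    """Assumption: an empty filter accepts every file; otherwise substring match."""
    return not pattern or pattern in filename


def tag_filename(payload: dict, filename: str, outgoing_key: str,
                 remove_extension: bool) -> dict:
    """Return a copy of payload with the source filename added under outgoing_key."""
    name = os.path.splitext(filename)[0] if remove_extension else filename
    tagged = dict(payload)
    tagged[outgoing_key] = name
    return tagged


files = ["data_123.json", "notes.txt"]
kept = [f for f in files if matches_filter(f, ".json")]
print(kept)  # ['data_123.json']

tagged = tag_filename({"temperature": 24.5}, kept[0], "source_file", True)
print(tagged)  # {'temperature': 24.5, 'source_file': 'data_123'}
```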

Payload Date Settings (Optional)

Field	Description
Timestamp JSON Key	Field inside the file that contains the timestamp.
Timestamp Format	Format used to parse the timestamp (e.g., `yyyy-MM-dd HH:mm:ss`).
Time Zone	Time zone used to interpret the timestamp (`UTC`, `UID time zone`, or a specific zone).
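
The time zone matters because a timestamp such as `2025-07-15 12:30:00` carries no zone of its own. The sketch below, which is illustrative only and not the node's implementation, parses the same string under two zones and prints the resulting UTC instants. The mapping of `yyyy-MM-dd HH:mm:ss` to Python's `%Y-%m-%d %H:%M:%S` and the helper name `to_utc_iso` are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Assumed Python strptime equivalent of the pattern yyyy-MM-dd HH:mm:ss.
PY_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_utc_iso(raw: str, zone: tzinfo) -> str:
    """Parse a zone-less timestamp, interpret it in `zone`, return UTC ISO 8601."""
    local = datetime.strptime(raw, PY_FORMAT).replace(tzinfo=zone)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


raw = "2025-07-15 12:30:00"
as_utc = to_utc_iso(raw, timezone.utc)
as_plus9 = to_utc_iso(raw, timezone(timedelta(hours=9)))

print(as_utc)    # 2025-07-15T12:30:00Z
print(as_plus9)  # 2025-07-15T03:30:00Z
```

The same file content lands nine hours apart on a time axis depending on the chosen zone, so the setting should match the clock that produced the file.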

Output Example

Given a JSON file named `data_123.json` in S3:

{
  "device_id": "sensor-01",
  "temperature": 24.5,
  "timestamp": "2025-07-15 12:30:00"
}

And configuration:

UID JSON Key = device_id
Timestamp JSON Key = timestamp
Timestamp Format = yyyy-MM-dd HH:mm:ss
Time Zone = UTC
Outgoing JSON Key for Filename = source_file
Remove Extension from Filename = true

The output payload:

{
  "uid": "sensor-01",
  "temperature": 24.5,
  "timestamp": "2025-07-15T12:30:00Z",
  "source_file": "data_123"
}
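
To make the mapping from input to output concrete, here is a minimal Python sketch that reproduces the example payload. It is not the node's implementation. It assumes the source file is named `data_123.json` and that the timestamp is interpreted as UTC; `build_payload` is a hypothetical helper name.

```python
import json
import os
from datetime import datetime, timezone

FILE_TEXT = """
{
  "device_id": "sensor-01",
  "temperature": 24.5,
  "timestamp": "2025-07-15 12:30:00"
}
"""


def build_payload(file_text: str, filename: str) -> dict:
    """Apply the example configuration to one file (illustrative only)."""
    remaining = json.loads(file_text)

    # UID JSON Key = device_id: the value is emitted under "uid".
    uid = remaining.pop("device_id")

    # Timestamp JSON Key = timestamp, interpreted as UTC (assumption).
    parsed = datetime.strptime(remaining["timestamp"], "%Y-%m-%d %H:%M:%S")
    parsed = parsed.replace(tzinfo=timezone.utc)
    remaining["timestamp"] = parsed.strftime("%Y-%m-%dT%H:%M:%SZ")

    # Outgoing JSON Key for Filename = source_file, with the extension removed.
    remaining["source_file"] = os.path.splitext(filename)[0]

    return {"uid": uid, **remaining}


payload = build_payload(FILE_TEXT, "data_123.json")
print(json.dumps(payload, indent=2))
```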

Best Practices

- Use a dedicated IAM user with least-privilege, read-only permissions on the S3 bucket.
- Make sure the S3 Folder Name matches the exact prefix used in AWS (e.g., avoid trailing slashes unless intentional).
- If you use filenames for routing or debugging, enable filename extraction.
- Set a filter pattern if only a subset of files should be ingested.
- Align timestamp parsing with downstream time-series logic to ensure temporal accuracy.
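
As one possible starting point for the least-privilege recommendation, the sketch below builds an IAM policy document that allows listing and reading a single folder. The exact permissions the node requires are not stated in this guide; the assumption here is that `s3:ListBucket` and `s3:GetObject` are sufficient. The bucket name `my-example-bucket`, the folder `exports/`, and the helper `read_only_policy` are hypothetical.

```python
import json


def read_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document allowing list and read on one bucket prefix."""
    prefix = prefix.strip("/")
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"

    list_statement = {
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": bucket_arn,
    }
    if prefix:
        # Restrict listing to the monitored folder only.
        list_statement["Condition"] = {
            "StringLike": {"s3:prefix": [f"{prefix}/*"]}
        }

    return {
        "Version": "2012-10-17",
        "Statement": [
            list_statement,
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": object_arn},
        ],
    }


policy = read_only_policy("my-example-bucket", "exports/")
print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed and attached to the dedicated IAM user. No write or delete actions are included, which is consistent with the node leaving files in S3.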

Use Cases

- Ingest sensor exports or data snapshots from S3
- Process batch file uploads from external systems
- Enrich and label file-based telemetry for visualization
- Integrate periodic data drops into live workflows

FAQ

Q: What happens to already downloaded files?

A: Files are cached and tracked. The node will not download the same file twice.

Q: Does this node delete files from S3?

A: No. Files remain in S3 unless managed by external lifecycle policies.

Q: Can I connect to S3-compatible services (e.g., MinIO)?

A: No. This node is designed for Amazon S3 only.