The AWS S3 Node is used to retrieve files from an Amazon S3 bucket and ingest them into a Rayven workflow. It supports polling a folder at regular intervals, parsing supported file types (e.g., JSON), and extracting metadata.
What It Does
This node connects securely to an Amazon S3 bucket, downloads files from a specified folder, and emits each file's content as a JSON payload. It can also tag the payload with UID, timestamp, and filename information for traceability and routing.
Step-by-Step: How to Configure the AWS S3 Node
- Add the node
  - Drag the AWS S3 Node from the Inputs section into the workflow canvas.
- Open the configuration panel
  - Double-click the node to set up credentials, folder targeting, and extraction rules.
- Connect downstream nodes
  - Link to nodes that will process or visualize the imported data.
Configuration Fields
AWS Connection & Bucket Access
Field | Requirement | Description |
---|---|---|
Node Name* | Required | Internal name for the node in your workflow. |
S3 Bucket Name* | Required | Name of the S3 bucket to read files from. |
AWS Access Key ID* | Required | IAM access key for S3 authentication. |
AWS Secret Access Key* | Required | Secret access key paired with the above. |
S3 Folder Name* | Required | The folder (prefix) inside the bucket to monitor. Use / or leave blank to target the root. |
AWS Region* | Required | AWS region where the bucket is located (e.g., ap-northeast-1). |
Polling Settings
Field | Requirement | Description |
---|---|---|
File Download Interval (minutes)* | Required | How often to poll the S3 folder for new files (e.g., every 5 minutes). |
Payload Parsing
Field | Requirement | Description |
---|---|---|
Payload Format* | Required | Format of the file; currently supports JSON. |
UID JSON Key / UID Column / XML Tag | Optional | Used to extract the UID from the file content based on its structure. |
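
As a rough checklist for the required fields above, the sketch below models them as a small Python data class and reports anything missing. The class name `S3NodeConfig`, its attribute names, and the "at least 1 minute" rule are assumptions made for illustration; they are not part of any Rayven API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class S3NodeConfig:
    """Hypothetical mirror of the node's required fields (not a Rayven API)."""
    node_name: str
    bucket: str
    access_key_id: str
    secret_access_key: str
    region: str
    folder: str = ""              # "" or "/" targets the bucket root
    interval_minutes: int = 5     # File Download Interval (minutes)
    payload_format: str = "JSON"

    def problems(self) -> List[str]:
        """Return readable problems; an empty list means nothing is missing."""
        found = []
        required = {
            "Node Name": self.node_name,
            "S3 Bucket Name": self.bucket,
            "AWS Access Key ID": self.access_key_id,
            "AWS Secret Access Key": self.secret_access_key,
            "AWS Region": self.region,
        }
        for label, value in required.items():
            if not value.strip():
                found.append(f"{label} is required")
        # Assumption: an interval below one minute is treated as invalid.
        if self.interval_minutes < 1:
            found.append("File Download Interval must be at least 1 minute")
        if self.payload_format != "JSON":
            found.append("Payload Format currently supports JSON")
        return found


cfg = S3NodeConfig(
    node_name="s3-in",
    bucket="example-bucket",          # hypothetical bucket name
    access_key_id="AKIAEXAMPLE",      # placeholder, not a real key
    secret_access_key="placeholder",
    region="ap-northeast-1",
)
print(cfg.problems())
```

Running a check like this before filling in the panel can catch an empty region or bucket name early.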
Advanced Features
Field | Description |
---|---|
Filter by File Name | Optional pattern to limit files by name (e.g., *.json, sensor_*.txt). |
Outgoing JSON Key for Filename | If specified, adds the source filename to the payload under this key. |
Remove Extension from Filename | If enabled, strips the file extension (e.g., .json) from the filename. |
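
To preview which files a filter pattern would select, the following Python sketch applies shell-style wildcard matching to the file-name part of each object key and shows the effect of stripping the extension. It assumes the node's patterns behave like case-sensitive shell globs, which this page does not confirm; the helper names and sample keys are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_filter(key: str, pattern: str) -> bool:
    """True if the file-name part of an S3 key matches a glob pattern."""
    return fnmatchcase(PurePosixPath(key).name, pattern)


def display_name(key: str, remove_extension: bool) -> str:
    """File name of an S3 key, optionally without its last extension."""
    name = PurePosixPath(key).name
    return PurePosixPath(name).stem if remove_extension else name


# Hypothetical object keys inside a folder called "exports"
keys = ["exports/sensor_01.txt", "exports/data_123.json", "exports/readme.md"]

print([k for k in keys if matches_filter(k, "*.json")])
print([k for k in keys if matches_filter(k, "sensor_*.txt")])
print(display_name("exports/data_123.json", remove_extension=True))
```

Only the last extension is stripped in this sketch, so a name such as `archive.tar.gz` would become `archive.tar`.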
Payload Date Settings (Optional)
Field | Description |
---|---|
Timestamp JSON Key | Field inside the file that contains the timestamp. |
Timestamp Format | Format used to parse the timestamp (e.g., yyyy-MM-dd HH:mm:ss). |
Time Zone | Time zone used to interpret the timestamp (UTC, UID time zone, or a specific zone). |
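
The interaction between Timestamp Format and Time Zone can be sketched in Python as follows. The example pattern yyyy-MM-dd HH:mm:ss looks like Java-style pattern letters, so the `strptime` format below is an assumed translation, and `to_utc_iso` is a hypothetical helper rather than anything the node exposes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumed strptime equivalent of the pattern yyyy-MM-dd HH:mm:ss
STRPTIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_utc_iso(raw: str, tz_name: str = "UTC") -> str:
    """Parse a naive timestamp, interpret it in tz_name, and return UTC ISO 8601."""
    tz = timezone.utc if tz_name == "UTC" else ZoneInfo(tz_name)
    local = datetime.strptime(raw, STRPTIME_FORMAT).replace(tzinfo=tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_utc_iso("2025-07-15 12:30:00"))  # 2025-07-15T12:30:00Z
```

The same string interpreted in a zone other than UTC would yield a different UTC instant, which is why the Time Zone setting should match how the source system writes its timestamps.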
Output Example
Given a JSON file in S3:
{
  "device_id": "sensor-01",
  "temperature": 24.5,
  "timestamp": "2025-07-15 12:30:00"
}
And configuration:
- UID JSON Key = device_id
- Timestamp JSON Key = timestamp
- Outgoing JSON Key for Filename = source_file
- Remove Extension from Filename = true
The output payload:
{
  "uid": "sensor-01",
  "temperature": 24.5,
  "timestamp": "2025-07-15T12:30:00Z",
  "source_file": "data_123"
}
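
The transformation in this example can be approximated with a few lines of Python. This is a sketch of the observable input-to-output mapping, not the node's implementation. It assumes the object key is `exports/data_123.json` (hypothetical, chosen to be consistent with the `source_file` value), that the timestamp format is yyyy-MM-dd HH:mm:ss, and that the time zone is UTC; `build_payload` is an illustrative name.

```python
import json
from datetime import datetime, timezone
from pathlib import PurePosixPath


def build_payload(file_text, s3_key, uid_key, ts_key, filename_key, remove_extension):
    """Reshape one JSON file into a payload like the example's output."""
    record = json.loads(file_text)
    payload = {"uid": record.pop(uid_key)}      # UID field is emitted as "uid"
    payload.update(record)                      # remaining fields pass through
    parsed = datetime.strptime(record[ts_key], "%Y-%m-%d %H:%M:%S")
    # Assumption: the Time Zone setting is UTC
    payload[ts_key] = parsed.replace(tzinfo=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    name = PurePosixPath(s3_key).name
    payload[filename_key] = PurePosixPath(name).stem if remove_extension else name
    return payload


source = '{"device_id": "sensor-01", "temperature": 24.5, "timestamp": "2025-07-15 12:30:00"}'
result = build_payload(
    source,
    s3_key="exports/data_123.json",   # hypothetical key
    uid_key="device_id",
    ts_key="timestamp",
    filename_key="source_file",
    remove_extension=True,
)
print(json.dumps(result, indent=2))
```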
Best Practices
- Use a dedicated IAM user with least-privilege permissions for read-only access to the S3 bucket.
- Ensure your S3 folder naming matches the exact structure used in AWS (e.g., avoid trailing slashes unless intentional).
- If you are using filenames for routing or debugging, enable filename extraction.
- Set a filter pattern if only a subset of files should be ingested.
- Align timestamp parsing with downstream time-series logic to ensure temporal accuracy.
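
For the least-privilege recommendation, a read-only IAM policy might look like the document generated below. It assumes the node needs only to list the folder and download objects (`s3:ListBucket` and `s3:GetObject`); the exact permissions the node requires are not stated on this page, and the bucket and prefix names are hypothetical.

```python
import json


def read_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy allowing list and read access under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Restrict listing to keys under the monitored folder
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = read_only_policy("example-bucket", "exports/")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the dedicated IAM user; no write or delete actions are granted.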
Use Cases
- Ingest sensor exports or data snapshots from S3
- Process batch file uploads from external systems
- Enrich and label file-based telemetry for visualization
- Integrate periodic data drops into live workflows
FAQ
Q: What happens to already downloaded files?
A: Files are cached and tracked. The node will not download the same file twice.
Q: Does this node delete files from S3?
A: No. Files remain in S3 unless managed by external lifecycle policies.
Q: Can I connect to S3-compatible services (e.g., MinIO)?
A: No. This node is designed for native Amazon S3 services only.
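
The first answer above says that downloaded files are tracked so they are not fetched twice. One simple way to picture that behavior across polling cycles is a set of already-seen object keys, as in the Python sketch below. How the node actually tracks files, and whether a file overwritten under the same name counts as new, is not described here; `SeenKeys` and the sample keys are hypothetical.

```python
from typing import Iterable, List, Set


class SeenKeys:
    """Remembers which object keys were already handed out."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def new_keys(self, listing: Iterable[str]) -> List[str]:
        """Return keys not seen in any earlier poll, and remember them."""
        fresh = []
        for key in listing:
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(key)
        return fresh


tracker = SeenKeys()
# Two consecutive polls of the same folder; c.json arrives between them
first_poll = tracker.new_keys(["exports/a.json", "exports/b.json"])
second_poll = tracker.new_keys(["exports/a.json", "exports/b.json", "exports/c.json"])
print(first_poll)
print(second_poll)
```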