File-Based Workflows in Rayven (CSV, FTP, SFTP, Document Processing)

Rayven supports file-based data ingestion and export using CSV files and the FTP and SFTP transfer protocols, enabling automation around structured and semi-structured files.

File-based workflows make it possible to integrate with legacy systems, external partners, and batch data operations.



Supported File Types & Protocols

  • CSV Files: Flat files with comma-separated values, ideal for tabular data exchange.

  • FTP/SFTP: File transfer protocols used to ingest files from, or push files to, remote servers. SFTP encrypts the connection; plain FTP does not.

  • Local Uploads: For development, simulation, or manual testing.

  • Document Files (PDF, Word, Google Docs): When combined with generative AI, Rayven can extract and process information from semi-structured documents.


Importing Files (File Ingestion)

To ingest structured files into your workflow, use:



FTP/SFTP Input Node

  • Configure:

    • Host, port, username, password (or key-based auth)

    • File path or folder

    • Polling interval (e.g., every 10 minutes)

    • Filename filters (e.g., *.csv, daily_log_*.txt)

  • Supports secure polling and event-driven file detection.
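
As a rough illustration of the filename filters listed above, the sketch below applies shell-style patterns to a directory listing in plain Python. It is not Rayven's implementation: the function name select_files and the sample listing are hypothetical, and it assumes the node's filters behave like shell-style globs, which this article does not confirm.

```python
from fnmatch import fnmatchcase


def select_files(filenames, patterns):
    """Return the names that match at least one glob-style pattern.

    Assumption: filters such as "*.csv" follow shell-style glob rules.
    """
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical remote directory listing, for illustration only.
listing = ["daily_log_2024-01-01.txt", "readings.csv", "notes.md"]
matched = select_files(listing, ["*.csv", "daily_log_*.txt"])
print(matched)
```

Testing patterns against a sample listing like this before enabling polling can help confirm that a filter picks up only the intended files.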

CSV Input & Parsing

Once a file is detected:

  • Pass it to the CSV Parser Node.

  • Configure:

    • Header row (present or not)

    • Field delimiter (comma or semicolon)

    • File encoding (UTF-8, etc.)

  • Output is structured as a list of row objects.
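
To make the parser settings above concrete, here is a minimal Python sketch of turning raw file bytes into a list of row objects. It only mirrors the three options described (header row, delimiter, encoding). The function name parse_csv and the generated column names col1, col2, ... are assumptions for illustration, not the CSV Parser Node's actual behavior.

```python
import csv
import io


def parse_csv(raw_bytes, has_header=True, delimiter=",", encoding="utf-8"):
    """Decode a CSV payload and return a list of dicts, one per row."""
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip completely blank lines
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        # Assumption: without a header row, columns get positional names.
        header = [f"col{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


sample = b"id;temp\n1;20.5\n2;21.0\n"
print(parse_csv(sample, delimiter=";"))
```

All values come back as strings in this sketch; converting them to numbers or dates would be a separate step in the workflow.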



Writing and Exporting Files

Rayven supports file generation and delivery using:



FTP/SFTP Output Node

  • Configure:

    • Destination FTP/SFTP server, credentials, and file path

    • Filename templates with variables (e.g., report_.csv)

    • Overwrite or append mode

  • Output format:

    • CSV (default)

    • JSON or XML (optional via formatting nodes or templates)
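
The output settings above (a templated filename plus overwrite or append mode) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Output Node's implementation: the {date} placeholder, the export_ filename, and the helper names render_filename and write_rows are all hypothetical, and the file is written to a local temporary folder rather than a remote server.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path


def render_filename(template, day):
    """Fill a filename template; "{date}" is a hypothetical placeholder."""
    return template.format(date=day.isoformat())


def write_rows(path, rows, fieldnames, mode="overwrite"):
    """Write rows as CSV, replacing the file or appending to it."""
    path = Path(path)
    # Only append when there is already content; otherwise start fresh.
    appending = mode == "append" and path.exists() and path.stat().st_size > 0
    with path.open("a" if appending else "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        if not appending:
            writer.writeheader()  # header goes in once, at file creation
        writer.writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / render_filename("export_{date}.csv", date(2024, 1, 31))
    write_rows(target, [{"site": "A", "total": 3}], ["site", "total"])
    write_rows(target, [{"site": "B", "total": 5}], ["site", "total"], mode="append")
    exported_name = target.name
    exported_lines = target.read_text(encoding="utf-8").splitlines()

print(exported_name, exported_lines)
```

Writing the header only when the file is created keeps appended batches from repeating it, which matters if a downstream consumer parses the file as a single table.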


Processing Uploaded Documents with Generative AI

In addition to structured files, Rayven can process semi-structured and unstructured documents by combining file ingestion with Generative AI:

Supported File Types:

  • PDF

  • Microsoft Word (.docx)

  • Google Docs (downloaded as .docx or .pdf)

  • Text files (.txt)

How It Works:

  1. Upload documents manually, or detect them via the FTP/SFTP Input Node.

  2. Route file content to a Generative AI Node.

  3. Extract:

    • Summaries

    • Key fields or metrics

    • Actionable insights

  4. Use parsed or summarized output in:

    • Workflow logic (e.g., if document mentions "urgent", trigger alert)

    • Table updates

    • Dashboards (render results in an HTML Node or widget)

Example: A PDF report is uploaded nightly → AI summarizes key findings → the workflow sends a Slack message with the summary and stores metadata in Rayven tables.
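
The "if document mentions urgent, trigger alert" step described above could look roughly like the Python sketch below. It covers only the decision made after the Generative AI Node has produced a summary. The function name decide_actions and the action labels store_metadata and send_alert are hypothetical and do not correspond to Rayven node names.

```python
def decide_actions(summary, alert_terms=("urgent",)):
    """Choose follow-up actions from an AI-generated summary.

    Metadata is always stored; an alert is added when any alert term
    appears in the summary (case-insensitive substring match).
    """
    lowered = summary.lower()
    hits = [term for term in alert_terms if term.lower() in lowered]
    actions = ["store_metadata"]
    if hits:
        actions.append("send_alert")
    return {"actions": actions, "matched_terms": hits}


print(decide_actions("URGENT: pump 3 pressure is out of range"))
print(decide_actions("Routine weekly report, no issues found"))
```

A plain substring match is the simplest possible rule. It will also fire on words such as "non-urgent", so a stricter check may be needed in practice.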


Common Use Cases

  • Ingesting telemetry logs or batch data from remote systems

  • Importing lookup tables (e.g., pricing, customer lists) for enrichment

  • Pushing processed data summaries (e.g., daily reports) to partners

  • Reading external reports (PDF, Word) and extracting actionable content with LLMs

  • Archiving workflow results as downloadable files


Best Practices

  • Validate file structure with test data before enabling automation

  • Use file naming conventions with timestamps or unique IDs

  • Log results of each import/export action into Rayven tables for traceability

  • Clean up remote directories to prevent reprocessing old files

  • Secure credentials using environment variables or the Rayven secrets manager

  • Document expected file formats and fallback handling logic
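
One way to picture the "log filenames" and "prevent reprocessing" practices above is a small ledger of names that have already been handled. The Python sketch below keeps that ledger in an in-memory set purely for illustration. In a real workflow it would more likely live in a Rayven table, and the function name claim_new_files is hypothetical.

```python
def claim_new_files(listing, processed):
    """Return unseen filenames in sorted order and record them as processed.

    `processed` is a set that persists between polling cycles.
    """
    fresh = sorted(set(listing) - processed)
    processed.update(fresh)
    return fresh


seen = set()
first = claim_new_files(["b.csv", "a.csv"], seen)            # first poll
second = claim_new_files(["a.csv", "b.csv", "c.csv"], seen)  # later poll
print(first, second)
```

This works best when filenames are unique, for example by including a timestamp or ID. A file re-uploaded under the same name would be skipped by this check.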


Integration with Dashboards and MySQL Tables

  • Ingested CSV or document data can be mapped to Rayven’s internal MySQL tables (Primary or Secondary).

  • Parsed data can feed:

    • Dashboards and HTML nodes

    • Alert logic (e.g., file contains critical term)

    • Workflow decisions, transformations, or control actions

  • Exported files (via the FTP/SFTP Output Node) can reflect processed metrics, filtered logs, or calculated summaries.


Q&A

Q: Can I process files other than CSV?
A: Yes. Rayven supports .txt, .json, and .xml files, as well as documents such as PDF or DOCX via GenAI-based processing.

Q: How often can I poll for new files?
A: The FTP/SFTP Input Node allows polling intervals as low as every few minutes. Choose based on file arrival patterns and processing needs.

Q: How do I avoid reprocessing the same file?
A: Use file renaming, archival, or deletion options in the Output Node, or log filenames to prevent duplicates.

Q: Can I push files to multiple destinations?
A: Yes. Use multiple FTP/SFTP Output Nodes with different configurations in the same workflow.

Q: Is file content secure during transfer?
A: Yes, provided an encrypted protocol is used. Rayven supports encrypted transfers (SFTP/FTPS), and credentials can be securely stored in the environment. Plain FTP sends data and credentials unencrypted.

Q: Can I extract structured values from a PDF?
A: Yes. Rayven’s Generative AI Node can parse uploaded PDFs and extract structured data, which can then be mapped or stored.