Source-available · Agentic data pipelines

Data pipelines,
in plain English.

Describe what you want to move. rsync.ai discovers the schema, flags PII, asks for your approval, then runs the pipeline — with real-time CDC for Postgres and MySQL and AI-generated connectors for any REST or GraphQL API.

~5 min · Chat to first synced rows
Plain English · No SQL, no YAML, no DAGs
20+ · Built-in connectors
Self-hosted · Your data, your servers
Sync Shopify orders to BigQuery every hour, mask customer emails
✓ Connected to Shopify · ✓ Connected to BigQuery
Discovered 5 tables: orders, order_items, customers, products, refunds
⚠️ PII detected: email in customers — will be masked. Ready to deploy?
Confirm & start pipeline

Sound familiar? You're not alone.

6–12 weeks waiting for engineering to build a new data connector
100% dependent on engineers for every pipeline change, no matter how small
$0.15 per row: punishing pricing that makes scaling your data 10x more expensive
See it in action

One sentence.
A running pipeline.

Watch the agent turn a plain-English request into a running Shopify→Postgres pipeline, then query the synced data with natural-language-to-SQL. All in under 20 seconds.

Demo video (~19 sec): rsync.ai agent confirming a Shopify to PostgreSQL pipeline from a plain-English request
Describe it in plain English · AI discovers schema + flags PII · You approve, then rows flow · Self-hosted · No per-row fees
How it works

Three steps.
No SQL, no YAML, no DAGs.

If you can write a Slack message, you can run a production data pipeline on rsync.ai — backed by Temporal workflows, Debezium CDC, and OpenTelemetry tracing under the hood.

01

Describe the pipeline in plain English

Type what you want. The LLM agent parses the source, destination, cadence, and any constraints (mask PII, exclude tables, handle nested JSONB objects). No forms, no DAG editors.

"Sync orders from MySQL to BigQuery hourly, mask emails"
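As a rough illustration, the example request above might be parsed into a structured plan along these lines. The `PipelinePlan` type and its field names are hypothetical, invented for this sketch; they are not rsync.ai's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class PipelinePlan:
    """Hypothetical shape for a parsed request (not rsync.ai's real schema)."""
    source: str
    destination: str
    tables: tuple[str, ...]
    cadence: str
    pii_rules: dict[str, str] = field(default_factory=dict)


# "Sync orders from MySQL to BigQuery hourly, mask emails"
plan = PipelinePlan(
    source="mysql",
    destination="bigquery",
    tables=("orders",),
    cadence="hourly",
    pii_rules={"email": "mask"},
)
print(plan)
```

A structured plan like this is what a reviewer could inspect and approve before anything runs.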
02

Review the plan, approve the gates

rsync.ai discovers your schema, proposes destination tables, and runs a PII scan. Each step pauses on a human-in-the-loop gate — connection, tables, PII rules — until you approve.

Schema discovered · PII rules surfaced · Tables you pick
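The gate sequence described above can be sketched in a few lines of plain Python. The gate names come from the text; the `run_gates` function and its callback signature are assumptions made for illustration, not rsync.ai's API.

```python
from typing import Callable, List

# Gate names follow the description above; everything else is hypothetical.
GATES = ["connection", "tables", "pii_rules"]


def run_gates(approve: Callable[[str], bool]) -> List[str]:
    """Ask for approval gate by gate and stop at the first unapproved gate."""
    passed = []
    for gate in GATES:
        if not approve(gate):
            break  # later gates are not reached until this one is approved
        passed.append(gate)
    return passed


# Example: the reviewer approves the connection and tables, but not the PII rules yet.
decisions = {"connection": True, "tables": True, "pii_rules": False}
passed = run_gates(lambda gate: decisions[gate])
print(passed)
```

In this sketch the pipeline would start only when `passed` contains every gate.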
03

Pipeline runs on Temporal, you watch

The pipeline runs as a Temporal workflow with checkpointed cursors and automatic retries. OpenTelemetry traces show what each run did, and the replayable event log lets you replay any past run, so debugging does not require an engineer.

Live row counts · Auto-retry · Replayable event log
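To make "checkpointed cursors and automatic retries" concrete, here is a minimal sketch in plain Python. It does not use the Temporal SDK; the function names, the dict standing in for durable storage, and the flaky fetcher are all assumptions for illustration.

```python
import time


def sync_with_checkpoint(fetch_batch, checkpoint, max_retries=3, backoff_s=0.0):
    """Pull batches until the source is exhausted, saving the cursor after each batch.

    fetch_batch(cursor) returns (rows, next_cursor); next_cursor is None when done.
    checkpoint is a dict standing in for durable storage, so a restart can resume.
    """
    cursor = checkpoint.get("cursor", 0)
    synced = checkpoint.get("rows", 0)
    while cursor is not None:
        for attempt in range(1, max_retries + 1):
            try:
                rows, next_cursor = fetch_batch(cursor)
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up after the last allowed attempt
                time.sleep(backoff_s * attempt)
        synced += len(rows)
        cursor = next_cursor
        checkpoint.update(cursor=cursor, rows=synced)
    return synced


# Hypothetical source: ten rows, with one transient failure on the second call.
DATA = list(range(10))
calls = {"n": 0}


def flaky_fetch(cursor, size=4):
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("transient failure")
    rows = DATA[cursor:cursor + size]
    next_cursor = cursor + size if cursor + size < len(DATA) else None
    return rows, next_cursor


state = {}
total = sync_with_checkpoint(flaky_fetch, state)
print(total, state)
```

Because the cursor is saved after every batch, a retry or restart re-fetches at most one batch rather than the whole table.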
Product tour

What's inside
rsync.ai

Six surfaces that work together: chat-driven pipeline creation, data exploration, an AI connector builder, schema discovery, live monitoring, and an MCP server generator.

Describe it. Approve it. Done.

Type a request in plain English. The LLM agent parses source, destination, cadence, and PII rules into a Temporal workflow with human-in-the-loop approval gates.

  • LLM planner — heuristic + DAG strategies
  • Temporal workflow under the hood (durable + resumable)
  • Approval gates at every critical step
What rsync.ai actually does

Everything a data team needs.
Source-available. Self-hosted.

rsync.ai gives analysts, ops leads, and data heads a full pipeline platform — natural-language setup, real CDC for relational sources, PII detection, and an AI connector builder for any API.

Plain English

Anyone on your team can build pipelines

Your analyst, ops lead, or PM can create a data pipeline by chatting with rsync.ai. The LLM agent breaks the request into steps — no SQL, no YAML, no DAG editors.

Data Explorer

SQL workbench with AI inside

CodeMirror 6 SQL editor with schema-aware autocomplete and NL→SQL. Click any table or column in the sidebar to insert it at cursor. Export to CSV, TSV, or JSON and send results to Metabase or Superset.

AI Tool Generator

AI generates connectors for any REST or GraphQL API

Paste an API docs URL or OpenAPI spec. rsync.ai reads it, generates a versioned MCP connector with auth, schema discovery, and pagination, and ships a Docker image in minutes.

PII protection

PII caught before a single row moves

rsync.ai scans every column for personal data — emails, phones, IDs, addresses — and asks for per-field rules. Choose mask, hash (SHA-256/HMAC), drop, or pass-through.
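The four rule types named above (mask, hash, drop, pass-through) can be sketched with the Python standard library. The function name, the rule keywords, and the full-asterisk masking style are assumptions for illustration; rsync.ai's own rule format may differ.

```python
import hashlib
import hmac


def apply_pii_rules(row, rules, hmac_key=b""):
    """Return a copy of row with per-field rules applied.

    Rules: "mask", "sha256", "hmac", "drop", or "pass" (the default).
    The non-pass rules assume string values.
    """
    out = {}
    for column, value in row.items():
        rule = rules.get(column, "pass")
        if rule == "drop":
            continue  # the column never reaches the destination
        if rule == "pass":
            out[column] = value
        elif rule == "mask":
            out[column] = "*" * len(value)
        elif rule == "sha256":
            out[column] = hashlib.sha256(value.encode("utf-8")).hexdigest()
        elif rule == "hmac":
            out[column] = hmac.new(
                hmac_key, value.encode("utf-8"), hashlib.sha256
            ).hexdigest()
        else:
            raise ValueError(f"unknown rule: {rule}")
    return out


# Hypothetical row and rules.
row = {"id": 42, "email": "ada@example.com", "phone": "555-0100", "ssn": "000-00-0000"}
rules = {"email": "hmac", "phone": "mask", "ssn": "drop"}
clean = apply_pii_rules(row, rules, hmac_key=b"demo-key")
print(clean)
```

A keyed HMAC keeps hashed values joinable across tables while making them harder to reverse by brute force than a bare SHA-256 of a low-entropy field.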

Human oversight

You approve, AI executes

Every critical step — connection, table selection, PII rules, schema changes — pauses for explicit approval via a human-in-the-loop gate. Nothing moves until you say yes.

CDC + scheduled

Real-time CDC for Postgres and MySQL

Log-based change data capture via Debezium streams changes from Postgres and MySQL sources and keeps destinations in sync within seconds. SaaS sources sync on a schedule you describe in plain English.
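For readers new to log-based CDC, the sketch below applies simplified Debezium-style change events to an in-memory table. Debezium events carry an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) plus `before` and `after` row images; real events also include schema and source metadata, omitted here. The `apply_change` function and the `id` key are assumptions for illustration.

```python
def apply_change(table, event, key="id"):
    """Apply one simplified Debezium-style event to a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = row  # insert or overwrite with the new row image
    elif op == "d":
        table.pop(event["before"][key], None)  # delete by the old row's key
    else:
        raise ValueError(f"unknown op: {op}")


# Hypothetical event stream for an orders table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "shipped"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

orders = {}
for event in events:
    apply_change(orders, event)
print(orders)
```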

Observability

See exactly what's happening, always

Live row counts, run history, error alerts, and OpenTelemetry traces (SigNoz integration included). Replay any past run from the event log to debug.

Self-hosted

Self-hosted — your data stays yours

Run the full stack on your own infrastructure via Docker Compose. Credentials are AES-256 encrypted at rest, and Ollama is supported for fully local LLM inference.

Multi-tenant ready

Per-pipeline namespace isolation

Each pipeline writes into its own destination schema (e.g., shopify.orders, shopify_brand_b.orders) with collision detection and ownership gating — Fivetran/Airbyte-style multi-tenant safety.
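One way to picture collision detection and ownership gating is a small registry that records which pipeline owns each destination schema. This is a sketch under assumptions: the `NamespaceRegistry` class and the pipeline IDs are hypothetical, and rsync.ai's actual mechanism is not described in this page.

```python
class NamespaceRegistry:
    """Hypothetical registry mapping destination schemas to owning pipelines."""

    def __init__(self):
        self._owners = {}

    def claim(self, schema, pipeline_id):
        """Claim a schema for a pipeline; refuse if another pipeline owns it."""
        owner = self._owners.setdefault(schema, pipeline_id)
        if owner != pipeline_id:
            raise PermissionError(
                f"schema {schema!r} is owned by {owner!r}, not {pipeline_id!r}"
            )
        return schema


registry = NamespaceRegistry()
registry.claim("shopify", "pipeline-a")          # first claim succeeds
registry.claim("shopify_brand_b", "pipeline-b")  # separate schema, no collision

try:
    registry.claim("shopify", "pipeline-b")      # collision with pipeline-a
    collided = False
except PermissionError as err:
    collided = True
    print(err)
```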

20+ built-in · unlimited via AI

Connect what you actually use

Built-in connectors for the databases, warehouses, and SaaS tools most teams need. Don't see yours? The AI Tool Generator builds a working connector from any REST or GraphQL docs URL.

Databases (with real-time CDC)
PostgreSQL · MySQL · MariaDB · SQLite · ClickHouse
Data warehouses
Snowflake · BigQuery · Redshift
Storage & streaming
AWS S3 · MinIO · Kafka · Google Sheets
SaaS & APIs
Shopify · Stripe · HubSpot · GitHub · Slack · Notion · Linear · Pipedrive · Segment · mParticle · QuickBooks · Metabase
Don't see your connector?

Point the AI Tool Generator at any REST or GraphQL docs URL and it produces a versioned MCP connector — auth, schema discovery, cursor pagination, Dockerfile included.
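Cursor pagination, one of the pieces a generated connector has to handle, follows a simple loop: request a page, emit its items, and continue from the returned cursor until none is returned. The sketch below uses a fake in-memory API; the `items` and `next_cursor` field names are assumptions, since real APIs name these differently.

```python
def paginate(fetch_page):
    """Yield every item from a cursor-paginated API.

    fetch_page(cursor) returns a dict with "items" and an optional "next_cursor".
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


# Hypothetical API responses, keyed by the cursor that requests them.
PAGES = {
    None: {"items": [1, 2], "next_cursor": "a"},
    "a": {"items": [3], "next_cursor": "b"},
    "b": {"items": [4, 5], "next_cursor": None},
}

items = list(paginate(PAGES.__getitem__))
print(items)
```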

Generate a connector
Why rsync.ai

Built like a modern
data platform.

Plain-English UX on top of a serious foundation — Debezium-based CDC, Temporal workflows, OpenTelemetry tracing, and source-available connectors you can read, fork, or rebuild.

Feature · rsync.ai · Fivetran · Airbyte · n8n
Natural-language pipeline creation
AI-generated connectors from any docs URL
Log-based CDC for Postgres & MySQL
Human-in-the-loop approval gates
PII detection with per-field mask/hash/drop
Source-available (auditable connector code)
Self-hosted / on-premise
Per-pipeline destination namespace isolation
Auto schema discovery
OpenTelemetry tracing (SigNoz / Jaeger)
No per-row / per-MAR pricing
Legend: Supported · Partial · Not supported

Built for teams that care about security & control

Production-grade infrastructure, audit-friendly logging, source-available code — without per-row pricing or vendor lock-in.

Self-hosted by design
Run the full stack on Docker Compose inside your VPC. Source data and credentials never leave your network.
AES-256 encryption at rest
Connector credentials are encrypted with AES-256-GCM using a key you control (the ENCRYPTION_KEY environment variable). Key rotation is supported.
Source-available under the Elastic License 2.0
Every connector lives in the public GitHub repo. Inspect, fork, or rebuild — no vendor lock-in.
OpenTelemetry tracing
Every pipeline run emits structured traces (SigNoz dashboard included). A replayable Kafka + Temporal event log gives a full execution history.
PII rules before data moves
Scanner detects emails, phones, IDs, addresses. You set per-field rules — mask, SHA-256/HMAC hash, drop, or pass-through.
Flat infrastructure cost
No per-row, per-MAR, or per-connector fees. Pay only for the compute you run rsync.ai on.
~5 min
Chat to synced rows
End-to-end: install → connect → sync 6 Shopify tables to Postgres
Plain English
Pipeline setup
LLM-driven Temporal workflow — no SQL, YAML, or DAGs to write
20+
Built-in connectors
Plus AI Tool Generator for any REST or GraphQL API in minutes
Self-hosted
Source-available under the Elastic License 2.0
Run on your infra. Credentials AES-256 encrypted at rest.

Run rsync.ai
on your own stack.

Self-host in one Docker Compose command, or book a 30-minute live walkthrough. Plain-English pipelines, real CDC, source-available under the Elastic License 2.0.

✓ Source-available under the Elastic License 2.0
✓ Your data stays on your servers
✓ Self-host in one Docker command
✓ No per-row pricing, ever