Azure Blob Storage integration

Azure Blob Storage pipelines, described in plain English.

Land PostgreSQL, MySQL, and SaaS data in Azure Blob as Parquet, JSON, or CSV — CDC streaming or scheduled snapshots — and move raw files byte-for-byte between Azure, S3, and GCS. Live on rsync.ai Cloud, no per-row fees.

TL;DR

rsync.ai writes to Azure Blob Storage in two ways. Structured output lands Postgres/MySQL CDC or snapshot data as Parquet, JSON, or CSV with a schema manifest, partitioned on ADLS Gen2 for Synapse and Fabric. Blob passthrough copies any object byte-for-byte between Azure, S3, and GCS. Authenticate with an access key, SAS token, or managed identity; PII columns are masked before the write.

Parquet, JSON, or CSV — ADLS Gen2 & Hive partitioning
CDC streaming or scheduled snapshots — or both
SAS token or managed-identity auth
Blob passthrough between Azure Blob, S3, and GCS


Move data into Azure Blob

Start from a database, or describe any other source in plain English.

PostgreSQL connector

Real-time CDC source — see the PostgreSQL connector overview, then route it into Azure Blob as Parquet, JSON, or CSV.

MySQL connector

Real-time CDC source — see the MySQL connector overview, then route it into Azure Blob as Parquet, JSON, or CSV.

What the Azure Blob connector does

Structured exports for analytics, and raw blob passthrough for everything else.

Structured exports

Postgres & MySQL tables to Parquet, JSON, or CSV with a schema manifest.

Blob passthrough

Copy any object byte-for-byte between Azure, S3, and GCS — SHA-256 verified.

PII-safe

Mask or hash sensitive columns before a single byte lands in your container.

Parquet (Snappy), JSON, or CSV output
Date / hour / Hive-style partitioning
SAS token or managed-identity auth
PII masking before the Azure write
Resumable block-blob uploads
SHA-256 integrity on every object
Blob passthrough: Azure ↔ S3 ↔ GCS
No per-row or per-GB pricing
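To make "PII masking before the write" concrete, here is a minimal Python sketch of hashing or redacting sensitive columns in a row before it is serialized. It is an illustration only, not rsync.ai's implementation: the function `mask_row`, its parameters, the keyed-hash choice (HMAC-SHA-256), and the demo key are all assumptions.

```python
import hashlib
import hmac


def mask_row(row, hash_columns=(), redact_columns=(), key=b"demo-key"):
    """Return a copy of `row` with sensitive columns hashed or redacted.

    Hypothetical helper: column lists and the HMAC key are illustrative.
    """
    masked = dict(row)  # never mutate the caller's row
    for col in hash_columns:
        if masked.get(col) is not None:
            value = str(masked[col]).encode("utf-8")
            # Keyed hash: the same input always maps to the same token,
            # so joins on the masked column still work downstream.
            masked[col] = hmac.new(key, value, hashlib.sha256).hexdigest()
    for col in redact_columns:
        if col in masked:
            masked[col] = "***"
    return masked


row = {"id": 7, "email": "ada@example.com", "ssn": "123-45-6789"}
safe = mask_row(row, hash_columns=["email"], redact_columns=["ssn"])
```

Only `safe` would be handed to the writer, so the original email and SSN values never reach the container. A keyed hash is used here rather than a plain SHA-256 so that low-entropy values are harder to reverse by brute force.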

rsync.ai vs. Fivetran, Airbyte, and custom scripts for Azure Blob

What you give up — and gain — choosing rsync.ai for pipelines into Azure Blob Storage.

Feature (compared across rsync.ai, Fivetran, Airbyte, and custom scripts)

Plain-English pipeline setup

CDC streaming to Azure Blob (Postgres & MySQL)

Parquet output with schema manifest

Blob passthrough (Azure ↔ S3 ↔ GCS)

PII masking before write

No per-row / per-MAR pricing

Resumable snapshots (no restart on failure)

Azure Blob Storage pipelines — frequently asked

What can rsync.ai write to Azure Blob Storage?

Structured data and raw files. PostgreSQL and MySQL tables (via CDC or snapshot) and any other source land as Parquet, JSON, or CSV with a schema manifest — ready for Azure Synapse, Microsoft Fabric, or Databricks. Separately, blob passthrough copies any object byte-for-byte from S3 or GCS into Azure Blob without re-encoding.

How does rsync.ai authenticate to Azure Blob?

Use a storage-account access key, a scoped SAS token, or — if rsync.ai runs inside Azure — a managed identity so no long-lived secret is stored. You point rsync.ai at the storage account and container; the path layout and access tier are yours to configure.

Does it work with ADLS Gen2 and hierarchical namespaces?

Yes. Azure Data Lake Storage Gen2 (hierarchical namespace on Blob) is supported, including Hive-style partitioned Parquet that Synapse serverless SQL pools and Microsoft Fabric read directly. A JSON schema manifest is written alongside each batch for schema-evolution discovery.
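For readers unfamiliar with Hive-style partitioning, the sketch below shows how a date/hour partitioned object path and an accompanying JSON schema manifest might be built. The exact path layout, file naming, and manifest fields rsync.ai uses are not stated here, so `partition_path`, `schema_manifest`, and the `dt=`/`hour=` keys are assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone


def partition_path(table, ts, batch, fmt="parquet"):
    """Build a Hive-style key: <table>/dt=YYYY-MM-DD/hour=HH/part-NNNNN.<fmt>.

    The key=value directory names are what lets query engines prune
    partitions by date and hour without opening the files.
    """
    return f"{table}/dt={ts:%Y-%m-%d}/hour={ts:%H}/part-{batch:05d}.{fmt}"


def schema_manifest(table, columns):
    """Serialize a JSON manifest (hypothetical layout) of column names and types."""
    return json.dumps(
        {"table": table,
         "columns": [{"name": name, "type": dtype} for name, dtype in columns]},
        indent=2,
    )


ts = datetime(2025, 3, 9, 14, 30, tzinfo=timezone.utc)
path = partition_path("orders", ts, 3)
manifest = schema_manifest("orders", [("id", "int64"), ("total", "double")])
print(path)  # orders/dt=2025-03-09/hour=14/part-00003.parquet
```

Comparing manifests from successive batches is one simple way a consumer could detect added or retyped columns.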

Is Azure Blob a source or a destination?

Both. Azure Blob is typically a destination for data-lake and archival workloads, but rsync.ai can also read objects from Azure Blob and move them to another store (S3, GCS, or another container) with byte-identical blob passthrough. Blob → relational database is intentionally rejected — a raw binary can't be written to a table row without parsing.
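Byte-identical passthrough can be checked by comparing SHA-256 digests of the source and the copy. The sketch below uses in-memory streams as stand-ins for the source and destination objects; `copy_and_verify` and `sha256_stream` are hypothetical helpers, and a real transfer would read from and write to the storage SDKs instead.

```python
import hashlib
import io


def sha256_stream(stream, chunk_size=1024 * 1024):
    """Hash a readable binary stream in fixed-size chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst chunk by chunk, then re-read dst and compare digests.

    Assumes dst is seekable so the written bytes can be read back.
    """
    source_digest = hashlib.sha256()
    for chunk in iter(lambda: src.read(chunk_size), b""):
        source_digest.update(chunk)  # hash the bytes as they pass through
        dst.write(chunk)
    dst.seek(0)
    if sha256_stream(dst, chunk_size) != source_digest.hexdigest():
        raise ValueError("SHA-256 mismatch: copy is not byte-identical")
    return source_digest.hexdigest()


payload = b"\x00\x01raw blob bytes" * 1000
src, dst = io.BytesIO(payload), io.BytesIO()
digest = copy_and_verify(src, dst, chunk_size=4096)
```

Because the bytes are never parsed or re-encoded, any object type can be copied this way, which is also why the same data cannot be dropped into a relational table without a parsing step.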

Do I have to deploy anything to use the Azure Blob connector?

No. rsync.ai Cloud is live at app.rsync.ai — sign up free and build an Azure Blob pipeline in minutes, nothing to provision. If you'd rather run the whole stack inside your own VPC, self-hosting (source-available, Elastic License 2.0) arrives July 2026.

Other storage destinations

AWS S3
Google Cloud Storage
All integrations