Skip to main content

PostgreSQL

note

The Synth PostgreSQL integration is currently in beta.

Usage#

synth can use PostgreSQL as a data source or sink. Connecting synth to a PostgreSQL is as simple as specifying a URI and schema during the import or generate phase.

URI format#

postgres://<username>:<password>@<host>:<port>/<catalog>

Import#

synth can import directly from a PostgreSQL database and create a data model from the database schema. During import, a new namespace will be created from your database schema, and a collection is created for each table in a separate JSON file. synth will map database columns to fields in the collections it creates. It then provides default generators for every collection. Synth will default to the public schema but this can be overriden with the --schema flag.

synth will automatically detect primary key and foreign key constraints at import time and update the namespace and collection to reflect them. Primary keys get mapped to synth's id generator, and foreign keys get mapped to the same_as generator.

Finally synth will sample data randomly from every table in order to create a more realistic data model by automatically inferring bounds on types.

synth has its own internal data model, and so does Postgres, therefore a conversion occurs between synth types and Postgres types. The inferred type can be seen below. The synth types link to default generator variant generated during the import process for that PostgreSQL type.

Note, not all PostgreSQL types have been covered yet. If there is a type you need, open an issue on GitHub.

PostgreSQL TypeSynth Type
Null | Tone_of<null, T>
booleanbool
charstring
varchar(x)string
textstring
bpchar(x)string
namestring
int2i64
int4i32
int8i64
float4f32
float8f64
numericf64
timestamptzdate_time
timestampnaive_date_time
datenaive_date
uuidstring

Example Import#

Below is an example import for a single table.

Postgres table definition:

create table doctors(    id          int primary key,    hospital_id int not null,     name        varchar(255) not null,    date_joined date,    constraint hospital_fk        foreign key(hospital_id)            references hospitals(id));

And the corresponding synth collection:

{  "type": "array",  "length": {    "type": "number",    "range": {      "low": 0,      "high": 2,      "step": 1    },    "subtype": "u64"  },  "content": {    "type": "object",    "date_joined": {      "type": "one_of",      "variants": [        {          "weight": 1.0,          "type": "string",          "date_time": {            "format": "%Y-%m-%d",            "subtype": "naive_date",            "begin": null,            "end": null          }        },        {          "weight": 1.0,          "type": "null"        }      ]    },    "hospital_id": {      "type": "same_as",      "ref": "hospitals.content.id"    },    "id": {      "type": "number",      "id": {},      "subtype": "u64"    },    "name": {      "type": "string",      "pattern": "[a-zA-Z0-9]{0, 255}"    }  }

Example Import Command#

synth import --from postgres://user:pass@localhost:5432/postgres --schema main my_namespace 

Example#

Generate#

synth can generate data directly into your PostgreSQL database. First synth will generate as much data as required, then open a connection to your database, and then perform batch insert to quickly insert as much data as you need.

synth will also respect primary key and foreign key constraints, by performing a topological sort on the data and inserting it in the right order such that no constraints are violated.

Example Generation Command#

synth generate --to postgres://user:pass@localhost:5432/ --schema main my_namespace