A hands-on guide to migrating from MySQL to PostgreSQL, covering schema conversion, data type mapping, tooling choices like pgloader, and the SQL dialect changes your application code will need.

MySQL and PostgreSQL are both mature, production-proven relational databases - but they are not interchangeable. Teams that outgrow MySQL's type system, hit its concurrency limits, or need features like JSONB, partial indexes, or native array types eventually face a migration. This guide walks through the practical engineering work involved: schema conversion, data migration tooling, and the application-level SQL changes that catch most teams off guard.

Why Teams Move from MySQL to PostgreSQL

The decision is rarely about MySQL being "bad." It is about PostgreSQL offering capabilities that MySQL lacks or implements differently. Here are the differences that most often drive the switch:

| Feature | MySQL (8.x / 9.x) | PostgreSQL (16 / 17) |
|---|---|---|
| Concurrency model | Row-level locking, limited MVCC | Full MVCC - readers never block writers |
| JSON support | JSON type with limited indexing | JSONB with GIN indexes and rich operators |
| Array columns | Not supported | Native array types with indexing |
| Partial indexes | Not supported | Supported (CREATE INDEX ... WHERE) |
| Generated columns | Stored and virtual | Stored only through PG 17 (virtual generated columns arrive in PG 18) |
| Full-text search | Basic built-in | Rich, extensible with dictionaries and ranking |
| Extensions | Plugin-based (limited) | Hundreds of extensions (PostGIS, pg_trgm, pgvector) |
| SQL standards compliance | Partial | Much closer to the standard |
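Three of the rows above (JSONB with GIN indexes, partial indexes, and native arrays) are easiest to appreciate in DDL. A minimal sketch, using hypothetical events and articles tables:

```sql
-- JSONB column with a GIN index for containment and key-existence queries
CREATE TABLE events (
    id      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    payload JSONB NOT NULL
);
CREATE INDEX idx_events_payload ON events USING GIN (payload);

-- Partial index: only index the rows a hot query actually touches
CREATE INDEX idx_events_unprocessed ON events ((payload->>'type'))
    WHERE payload->>'processed' = 'false';

-- Native array column with a containment query
CREATE TABLE articles (
    id   BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    tags TEXT[] NOT NULL DEFAULT '{}'
);
SELECT * FROM articles WHERE tags @> ARRAY['postgres'];
```

In MySQL, each of these would require a workaround: a generated column extracted from JSON, a full index over all rows, or a join table in place of the array.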

PostgreSQL's MVCC implementation means concurrent reads and writes do not block each other. MySQL's InnoDB also implements MVCC, but gap and next-key locks at its default REPEATABLE READ isolation level, along with some DDL operations, can still stall readers, particularly under heavy write loads. For applications doing mixed read/write workloads at scale, this difference alone can justify a migration.

The extension ecosystem is another strong pull factor. PostGIS for geospatial, pgvector for embeddings and vector search, pg_trgm for fuzzy text matching - these ship as first-class extensions, not bolted-on plugins.

Schema Conversion: Data Types and Gotchas

Schema translation is the first and most error-prone step. Some mappings are straightforward. Others silently corrupt data if you get them wrong.

Data Type Mapping

| MySQL Type | PostgreSQL Equivalent | Notes |
|---|---|---|
| INT AUTO_INCREMENT | INT GENERATED ALWAYS AS IDENTITY | SERIAL also works, but IDENTITY is the modern standard (PostgreSQL docs) |
| TINYINT(1) | BOOLEAN | MySQL has no native boolean; it uses TINYINT(1) as a stand-in |
| DOUBLE | DOUBLE PRECISION | Identical IEEE 754 behavior |
| DATETIME | TIMESTAMP | Neither stores a timezone. Use TIMESTAMPTZ if you need timezone awareness |
| TIMESTAMP | TIMESTAMPTZ | MySQL's TIMESTAMP auto-converts to UTC on storage; TIMESTAMPTZ behaves similarly |
| ENUM('a','b','c') | Custom CREATE TYPE or CHECK constraint | PostgreSQL ENUMs must be created as standalone types before table creation |
| TEXT / MEDIUMTEXT / LONGTEXT | TEXT | PostgreSQL's TEXT has no length limit; no need for the MEDIUM/LONG variants |
| BLOB / MEDIUMBLOB | BYTEA | Different encoding; BYTEA uses hex or escape format |
| JSON | JSONB | Prefer JSONB: it stores parsed binary JSON, enabling indexing and efficient queries |
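The ENUM row deserves a worked example, since it is the one mapping that changes the shape of your DDL rather than just a type name. A sketch with a hypothetical accounts table:

```sql
-- MySQL original:
--   status ENUM('pending','active','archived')

-- PostgreSQL option 1: a standalone ENUM type, created before the table
CREATE TYPE account_status AS ENUM ('pending', 'active', 'archived');
CREATE TABLE accounts (
    id     INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status account_status NOT NULL DEFAULT 'pending'
);

-- PostgreSQL option 2: TEXT with a CHECK constraint
CREATE TABLE accounts_alt (
    id     INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'active', 'archived'))
);
```

The CHECK-constraint variant is often the pragmatic choice: adding a value later is a constraint swap rather than an ALTER TYPE on a type shared across tables.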

Three Things That Break Silently

1. Zero dates. MySQL allows 0000-00-00 and 0000-00-00 00:00:00 as valid date/timestamp values. PostgreSQL rejects them outright. You need to map these to NULL or a sentinel date during migration. pgloader handles this automatically.

2. Case sensitivity. MySQL's default collations (utf8mb4_0900_ai_ci in 8.x, utf8mb4_general_ci in older versions) are case-insensitive for string comparisons. PostgreSQL is case-sensitive by default. Queries like WHERE name = 'john' that matched John in MySQL will stop matching in PostgreSQL. Fix this with the citext extension or by adding LOWER() calls backed by a matching expression index.

3. Identifier quoting. MySQL uses backticks for identifiers: `column_name`. PostgreSQL uses double quotes: "column_name". If your application or migration scripts contain backtick-quoted identifiers, they will fail. Most ORMs abstract this, but raw SQL and stored procedures need manual conversion.
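The first two pitfalls can be handled with a few statements each. A minimal sketch, assuming hypothetical orders and users tables:

```sql
-- MySQL side, before migration: map zero dates to NULL.
-- Relax sql_mode for the session so the zero-date literal is comparable.
SET SESSION sql_mode = '';
UPDATE orders SET shipped_at = NULL
WHERE shipped_at = '0000-00-00 00:00:00';

-- PostgreSQL side: restore case-insensitive matching with citext ...
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN name TYPE citext;
SELECT * FROM users WHERE name = 'john';  -- matches 'John' again

-- ... or keep TEXT and pair LOWER() queries with an expression index
CREATE INDEX idx_users_name_lower ON users (LOWER(name));
SELECT * FROM users WHERE LOWER(name) = 'john';
```

Note that citext changes comparison semantics for every query touching the column, while the LOWER() approach requires discipline: every query must apply the function, or it silently misses rows and skips the index.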

Migration Tools and Strategies

pgloader: The Standard Choice

pgloader is the go-to open-source tool for MySQL-to-PostgreSQL migration. A single command can migrate an entire database:

```shell
pgloader mysql://user:pass@mysql-host/mydb \
         postgresql://user:pass@pg-host/mydb
```

pgloader handles schema discovery, type conversion, and data loading in one pass. Its standout feature: when it encounters bad rows (like zero dates or encoding issues), it logs them to a reject file and keeps going instead of aborting the entire table load. This behavior is a deliberate departure from PostgreSQL's default transactional COPY, which rolls back everything on a single error.

For more control, use a pgloader command file:

```
LOAD DATABASE
    FROM mysql://user:pass@mysql-host/mydb
    INTO postgresql://user:pass@pg-host/mydb

WITH include drop, create tables, create indexes, reset sequences

CAST type tinyint to boolean using tinyint-to-boolean

ALTER SCHEMA 'mydb' RENAME TO 'public';
```

AWS Database Migration Service (DMS)

For teams already on AWS, DMS supports MySQL-to-PostgreSQL migration with continuous replication. It handles the initial full load and then streams ongoing changes (CDC) from MySQL's binlog. This enables near-zero-downtime cutover: run both databases in parallel, validate, then switch traffic.

Tool Comparison

| Tool | Best For | Handles Schema? | Continuous Replication? | Cost |
|---|---|---|---|---|
| pgloader | One-shot migrations, self-managed infra | Yes (auto-converts) | No | Free / open-source |
| AWS DMS | Cloud migrations with zero-downtime needs | Partial (needs manual review) | Yes (CDC from binlog) | Pay-per-use |
| pg_chameleon | Self-managed with replication needs | Yes | Yes (MySQL binlog) | Free / open-source |

Regardless of tool choice, always migrate to a staging environment first. Run your full test suite against the PostgreSQL copy before touching production.
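Beyond the test suite, a cheap first validation is comparing row counts between the two databases. A sketch with hypothetical table names; run the same statement on both sides and diff the output:

```sql
-- Works unchanged in both MySQL and PostgreSQL
SELECT 'users'  AS tbl, COUNT(*) AS n FROM users
UNION ALL
SELECT 'orders', COUNT(*) FROM orders;
```

Matching counts do not prove the data is intact (checksumming a sample of rows is the next step), but a mismatch immediately points at rejected rows or a partial load.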

Rewriting Application Code

Schema migration gets the data across. But your application's SQL dialect needs updating too. Here are the changes that trip up most teams.

SQL Syntax Differences

GROUP BY strictness. MySQL's default sql_mode (despite including ONLY_FULL_GROUP_BY since 5.7, which many deployments disable) has historically allowed SELECT columns not listed in GROUP BY. PostgreSQL enforces the SQL standard: every non-aggregated column in SELECT must appear in GROUP BY. Queries that worked in MySQL will fail in PostgreSQL with column "..." must appear in the GROUP BY clause or be used in an aggregate function.
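A typical failure and its standard-compliant rewrite, reusing the employees table from the examples below:

```sql
-- Accepted under MySQL's permissive mode; fails in PostgreSQL
-- because "name" is neither grouped nor aggregated:
SELECT department, name, COUNT(*)
FROM employees
GROUP BY department;

-- Rewrite: make the per-group value explicit with an aggregate
SELECT department, MIN(name) AS sample_name, COUNT(*)
FROM employees
GROUP BY department;
```

The rewrite forces a decision MySQL let you dodge: which name do you actually want per group? MIN() here is an arbitrary-but-deterministic choice; if the original query relied on MySQL picking "any" row, that was already a latent bug.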

String aggregation. MySQL's GROUP_CONCAT() becomes PostgreSQL's string_agg():

```sql
-- MySQL
SELECT department, GROUP_CONCAT(name SEPARATOR ', ') FROM employees GROUP BY department;

-- PostgreSQL
SELECT department, string_agg(name, ', ') FROM employees GROUP BY department;
```

UPSERT syntax. MySQL uses INSERT ... ON DUPLICATE KEY UPDATE. PostgreSQL uses INSERT ... ON CONFLICT DO UPDATE:

```sql
-- MySQL
INSERT INTO users (id, email, name) VALUES (1, 'a@b.com', 'Alice')
ON DUPLICATE KEY UPDATE name = VALUES(name);

-- PostgreSQL
INSERT INTO users (id, email, name) VALUES (1, 'a@b.com', 'Alice')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
```

LIMIT with offset. Both databases support LIMIT ... OFFSET, so this is one area that usually migrates cleanly.

Stored Procedures and Functions

If your application uses MySQL stored procedures, expect a full rewrite. MySQL procedures use a custom procedural dialect. PostgreSQL uses PL/pgSQL, which is more capable but syntactically different. Delimiter declarations (DELIMITER //), BEGIN...END blocks, and variable declarations all change.
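To make the scale of the rewrite concrete, here is the same trivial procedure in both dialects; the users table and procedure name are hypothetical:

```sql
-- MySQL: DELIMITER is a mysql-client command, needed so the
-- semicolons inside the body do not terminate the statement
DELIMITER //
CREATE PROCEDURE deactivate_user(IN p_id INT)
BEGIN
  UPDATE users SET active = 0 WHERE id = p_id;
END //
DELIMITER ;

-- PostgreSQL (PL/pgSQL): no delimiter dance; the body is a
-- dollar-quoted string, and boolean is a real type
CREATE OR REPLACE PROCEDURE deactivate_user(p_id INT)
LANGUAGE plpgsql AS $$
BEGIN
  UPDATE users SET active = FALSE WHERE id = p_id;
END;
$$;

CALL deactivate_user(42);
```

Even in this toy case, the delimiter handling, the LANGUAGE clause, and the boolean literal all differ; real procedures with cursors, handlers, and session variables diverge far more.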

For applications using an ORM (Django, Rails, SQLAlchemy, Sequelize), the migration is smoother. Update the connection string and database driver, then run your test suite. Most ORMs abstract dialect differences, but watch for raw SQL fragments, custom queries, and database-specific features like MySQL's LAST_INSERT_ID() (PostgreSQL uses RETURNING id or currval()).
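The LAST_INSERT_ID() pattern in particular is worth showing, since RETURNING also saves a round trip. A sketch against a hypothetical users table:

```sql
-- MySQL: two statements
INSERT INTO users (email) VALUES ('a@b.com');
SELECT LAST_INSERT_ID();

-- PostgreSQL: one statement, key returned inline
INSERT INTO users (email) VALUES ('a@b.com') RETURNING id;

-- currval() also works, but only after an insert in the same session
SELECT currval(pg_get_serial_sequence('users', 'id'));
```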

Key Takeaways

  • PostgreSQL's MVCC, extension ecosystem (PostGIS, pgvector), and stricter SQL compliance are the most common reasons teams migrate from MySQL.
  • Schema conversion requires careful attention to data types. AUTO_INCREMENT becomes IDENTITY, TINYINT(1) becomes BOOLEAN, and MySQL ENUMs need standalone type definitions.
  • Zero dates, case-sensitive string comparisons, and backtick quoting are the three issues most likely to break silently.
  • pgloader is the fastest path for one-shot migrations. AWS DMS adds continuous replication for zero-downtime cutover.
  • Application code needs updates for GROUP BY strictness, GROUP_CONCAT to string_agg, ON DUPLICATE KEY to ON CONFLICT, and stored procedure rewrites.
  • Always migrate to a staging environment first. Run your full test suite against PostgreSQL before cutting over production traffic.