A hands-on guide to migrating from MySQL to PostgreSQL, covering schema conversion, data type mapping, tooling choices like pgloader, and the SQL dialect changes your application code will need.

MySQL and PostgreSQL are both mature, production-proven relational databases - but they are not interchangeable. Teams that outgrow MySQL's type system, hit its concurrency limits, or need features like JSONB, partial indexes, or native array types eventually face a migration. This guide walks through the practical engineering work involved: schema conversion, data migration tooling, and the application-level SQL changes that catch most teams off guard.

Why Teams Move from MySQL to PostgreSQL

The decision is rarely about MySQL being "bad." It is about PostgreSQL offering capabilities that MySQL lacks or implements differently. Here are the differences that most often drive the switch:

| Feature | MySQL (8.x / 9.x) | PostgreSQL (16 / 17) |
|---|---|---|
| Concurrency model | Row-level locking, limited MVCC | Full MVCC - readers never block writers |
| JSON support | JSON type with limited indexing | JSONB with GIN indexes and rich operators |
| Array columns | Not supported | Native array types with indexing |
| Partial indexes | Not supported | Supported (CREATE INDEX ... WHERE) |
| Generated columns | Stored and virtual | Stored only through PG 17 (virtual generated columns arrive in PG 18) |
| Full-text search | Basic built-in | Rich, extensible with dictionaries and ranking |
| Extensions | Plugin-based (limited) | Hundreds of extensions (PostGIS, pg_trgm, pgvector) |
| SQL standards compliance | Partial | Much closer to the standard |
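Three of the rows above (JSONB with GIN indexes, partial indexes, and native arrays) are easiest to appreciate in DDL. A minimal sketch, using hypothetical events and articles tables:

```sql
-- JSONB column with a GIN index for containment and key-existence queries
CREATE TABLE events (
    id      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    payload JSONB NOT NULL
);
CREATE INDEX idx_events_payload ON events USING GIN (payload);

-- Partial index: only index the rows a hot query actually touches
CREATE INDEX idx_events_unprocessed ON events ((payload->>'type'))
    WHERE payload->>'processed' = 'false';

-- Native array column with a containment query
CREATE TABLE articles (
    id   BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    tags TEXT[] NOT NULL DEFAULT '{}'
);
SELECT * FROM articles WHERE tags @> ARRAY['postgres'];
```

In MySQL, each of these would require a workaround: a generated column extracted from JSON, a full index over all rows, or a join table in place of the array.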

PostgreSQL's MVCC implementation means concurrent reads and writes do not block each other. MySQL's InnoDB also implements MVCC, but gap and next-key locks at its default REPEATABLE READ isolation level, along with some DDL operations, can still stall readers, particularly under heavy write loads. For applications doing mixed read/write workloads at scale, this difference alone can justify a migration.

The extension ecosystem is another strong pull factor. PostGIS for geospatial, pgvector for embeddings and vector search, pg_trgm for fuzzy text matching - these ship as first-class extensions, not bolted-on plugins.

Schema Conversion: Data Types and Gotchas

Schema translation is the first and most error-prone step. Some mappings are straightforward. Others silently corrupt data if you get them wrong.

Data Type Mapping

| MySQL Type | PostgreSQL Equivalent | Notes |
|---|---|---|
| INT AUTO_INCREMENT | INT GENERATED ALWAYS AS IDENTITY | SERIAL also works, but IDENTITY is the modern standard (PostgreSQL docs) |
| TINYINT(1) | BOOLEAN | MySQL has no native boolean; it uses TINYINT(1) as a stand-in |
| DOUBLE | DOUBLE PRECISION | Identical IEEE 754 behavior |
| DATETIME | TIMESTAMP | Neither stores a timezone. Use TIMESTAMPTZ if you need timezone awareness |
| TIMESTAMP | TIMESTAMPTZ | MySQL's TIMESTAMP auto-converts to UTC on storage; TIMESTAMPTZ behaves similarly |
| ENUM('a','b','c') | Custom CREATE TYPE or CHECK constraint | PostgreSQL ENUMs must be created as standalone types before table creation |
| TEXT / MEDIUMTEXT / LONGTEXT | TEXT | PostgreSQL's TEXT has no length limit; no need for the MEDIUM/LONG variants |
| BLOB / MEDIUMBLOB | BYTEA | Different encoding; BYTEA uses hex or escape format |
| JSON | JSONB | Prefer JSONB: it stores parsed binary JSON, enabling indexing and efficient queries |
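The ENUM row deserves a worked example, since it is the one mapping that changes the shape of your DDL rather than just a type name. A sketch with a hypothetical accounts table:

```sql
-- MySQL original:
--   status ENUM('pending','active','archived')

-- PostgreSQL option 1: a standalone ENUM type, created before the table
CREATE TYPE account_status AS ENUM ('pending', 'active', 'archived');
CREATE TABLE accounts (
    id     INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status account_status NOT NULL DEFAULT 'pending'
);

-- PostgreSQL option 2: TEXT with a CHECK constraint
CREATE TABLE accounts_alt (
    id     INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'active', 'archived'))
);
```

The CHECK-constraint variant is often the pragmatic choice: adding a value later is a constraint swap rather than an ALTER TYPE on a type shared across tables.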

Three Things That Break Silently

1. Zero dates. MySQL allows 0000-00-00 and 0000-00-00 00:00:00 as valid date/timestamp values. PostgreSQL rejects them outright. You need to map these to NULL or a sentinel date during migration. pgloader handles this automatically.

2. Case sensitivity. MySQL's default collations (utf8mb4_0900_ai_ci in 8.x, utf8mb4_general_ci in older versions) are case-insensitive for string comparisons. PostgreSQL is case-sensitive by default. Queries like WHERE name = 'john' that matched John in MySQL will stop matching in PostgreSQL. Fix this with the citext extension or by adding LOWER() calls backed by a matching expression index.

3. Identifier quoting. MySQL uses backticks for identifiers: `column_name`. PostgreSQL uses double quotes: "column_name". If your application or migration scripts contain backtick-quoted identifiers, they will fail. Most ORMs abstract this, but raw SQL and stored procedures need manual conversion.
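The first two pitfalls can be handled with a few statements each. A minimal sketch, assuming hypothetical orders and users tables:

```sql
-- MySQL side, before migration: map zero dates to NULL.
-- Relax sql_mode for the session so the zero-date literal is comparable.
SET SESSION sql_mode = '';
UPDATE orders SET shipped_at = NULL
WHERE shipped_at = '0000-00-00 00:00:00';

-- PostgreSQL side: restore case-insensitive matching with citext ...
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN name TYPE citext;
SELECT * FROM users WHERE name = 'john';  -- matches 'John' again

-- ... or keep TEXT and pair LOWER() queries with an expression index
CREATE INDEX idx_users_name_lower ON users (LOWER(name));
SELECT * FROM users WHERE LOWER(name) = 'john';
```

Note that citext changes comparison semantics for every query touching the column, while the LOWER() approach requires discipline: every query must apply the function, or it silently misses rows and skips the index.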

Migration Tools and Strategies

pgloader: The Standard Choice

pgloader is the go-to open-source tool for MySQL-to-PostgreSQL migration. A single command can migrate an entire database:

```shell
pgloader mysql://user:pass@mysql-host/mydb \
         postgresql://user:pass@pg-host/mydb
```

pgloader handles schema discovery, type conversion, and data loading in one pass. Its standout feature: when it encounters bad rows (like zero dates or encoding issues), it logs them to a reject file and keeps going instead of aborting the entire table load. This behavior is a deliberate departure from PostgreSQL's default transactional COPY, which rolls back everything on a single error.

For more control, use a pgloader command file:

```
LOAD DATABASE
    FROM mysql://user:pass@mysql-host/mydb
    INTO postgresql://user:pass@pg-host/mydb

WITH include drop, create tables, create indexes, reset sequences

CAST type tinyint to boolean using tinyint-to-boolean

ALTER SCHEMA 'mydb' RENAME TO 'public';
```

AWS Database Migration Service (DMS)

For teams already on AWS, DMS supports MySQL-to-PostgreSQL migration with continuous replication. It handles the initial full load and then streams ongoing changes (CDC) from MySQL's binlog. This enables near-zero-downtime cutover: run both databases in parallel, validate, then switch traffic.

Tool Comparison

| Tool | Best For | Handles Schema? | Continuous Replication? | Cost |
|---|---|---|---|---|
| pgloader | One-shot migrations, self-managed infra | Yes (auto-converts) | No | Free / open-source |
| AWS DMS | Cloud migrations with zero-downtime needs | Partial (needs manual review) | Yes (CDC from binlog) | Pay-per-use |
| pg_chameleon | Self-managed with replication needs | Yes | Yes (MySQL binlog) | Free / open-source |

Regardless of tool choice, always migrate to a staging environment first. Run your full test suite against the PostgreSQL copy before touching production.
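Beyond the test suite, a cheap first validation is comparing row counts between the two databases. A sketch with hypothetical table names; run the same statement on both sides and diff the output:

```sql
-- Works unchanged in both MySQL and PostgreSQL
SELECT 'users'  AS tbl, COUNT(*) AS n FROM users
UNION ALL
SELECT 'orders', COUNT(*) FROM orders;
```

Matching counts do not prove the data is intact (checksumming a sample of rows is the next step), but a mismatch immediately points at rejected rows or a partial load.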

Rewriting Application Code

Schema migration gets the data across. But your application's SQL dialect needs updating too. Here are the changes that trip up most teams.

SQL Syntax Differences

GROUP BY strictness. MySQL's default sql_mode (despite including ONLY_FULL_GROUP_BY since 5.7, which many deployments disable) has historically allowed SELECT columns not listed in GROUP BY. PostgreSQL enforces the SQL standard: every non-aggregated column in SELECT must appear in GROUP BY. Queries that worked in MySQL will fail in PostgreSQL with column "..." must appear in the GROUP BY clause or be used in an aggregate function.
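A typical failure and its standard-compliant rewrite, reusing the employees table from the examples below:

```sql
-- Accepted under MySQL's permissive mode; fails in PostgreSQL
-- because "name" is neither grouped nor aggregated:
SELECT department, name, COUNT(*)
FROM employees
GROUP BY department;

-- Rewrite: make the per-group value explicit with an aggregate
SELECT department, MIN(name) AS sample_name, COUNT(*)
FROM employees
GROUP BY department;
```

The rewrite forces a decision MySQL let you dodge: which name do you actually want per group? MIN() here is an arbitrary-but-deterministic choice; if the original query relied on MySQL picking "any" row, that was already a latent bug.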

String aggregation. MySQL's GROUP_CONCAT() becomes PostgreSQL's string_agg():

```sql
-- MySQL
SELECT department, GROUP_CONCAT(name SEPARATOR ', ') FROM employees GROUP BY department;

-- PostgreSQL
SELECT department, string_agg(name, ', ') FROM employees GROUP BY department;
```

UPSERT syntax. MySQL uses INSERT ... ON DUPLICATE KEY UPDATE. PostgreSQL uses INSERT ... ON CONFLICT DO UPDATE:

```sql
-- MySQL
INSERT INTO users (id, email, name) VALUES (1, 'a@b.com', 'Alice')
ON DUPLICATE KEY UPDATE name = VALUES(name);

-- PostgreSQL
INSERT INTO users (id, email, name) VALUES (1, 'a@b.com', 'Alice')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
```

LIMIT with offset. Both databases support LIMIT ... OFFSET, so this is one area that usually migrates cleanly.

Stored Procedures and Functions

If your application uses MySQL stored procedures, expect a full rewrite. MySQL procedures use a custom procedural dialect. PostgreSQL uses PL/pgSQL, which is more capable but syntactically different. Delimiter declarations (DELIMITER //), BEGIN...END blocks, and variable declarations all change.
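To make the scale of the rewrite concrete, here is the same trivial procedure in both dialects; the users table and procedure name are hypothetical:

```sql
-- MySQL: DELIMITER is a mysql-client command, needed so the
-- semicolons inside the body do not terminate the statement
DELIMITER //
CREATE PROCEDURE deactivate_user(IN p_id INT)
BEGIN
  UPDATE users SET active = 0 WHERE id = p_id;
END //
DELIMITER ;

-- PostgreSQL (PL/pgSQL): no delimiter dance; the body is a
-- dollar-quoted string, and boolean is a real type
CREATE OR REPLACE PROCEDURE deactivate_user(p_id INT)
LANGUAGE plpgsql AS $$
BEGIN
  UPDATE users SET active = FALSE WHERE id = p_id;
END;
$$;

CALL deactivate_user(42);
```

Even in this toy case, the delimiter handling, the LANGUAGE clause, and the boolean literal all differ; real procedures with cursors, handlers, and session variables diverge far more.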

For applications using an ORM (Django, Rails, SQLAlchemy, Sequelize), the migration is smoother. Update the connection string and database driver, then run your test suite. Most ORMs abstract dialect differences, but watch for raw SQL fragments, custom queries, and database-specific features like MySQL's LAST_INSERT_ID() (PostgreSQL uses RETURNING id or currval()).
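The LAST_INSERT_ID() pattern in particular is worth showing, since RETURNING also saves a round trip. A sketch against a hypothetical users table:

```sql
-- MySQL: two statements
INSERT INTO users (email) VALUES ('a@b.com');
SELECT LAST_INSERT_ID();

-- PostgreSQL: one statement, key returned inline
INSERT INTO users (email) VALUES ('a@b.com') RETURNING id;

-- currval() also works, but only after an insert in the same session
SELECT currval(pg_get_serial_sequence('users', 'id'));
```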

Key Takeaways

  • PostgreSQL's MVCC, extension ecosystem (PostGIS, pgvector), and stricter SQL compliance are the most common reasons teams migrate from MySQL.
  • Schema conversion requires careful attention to data types. AUTO_INCREMENT becomes IDENTITY, TINYINT(1) becomes BOOLEAN, and MySQL ENUMs need standalone type definitions.
  • Zero dates, case-sensitive string comparisons, and backtick quoting are the three issues most likely to break silently.
  • pgloader is the fastest path for one-shot migrations. AWS DMS adds continuous replication for zero-downtime cutover.
  • Application code needs updates for GROUP BY strictness, GROUP_CONCAT to string_agg, ON DUPLICATE KEY to ON CONFLICT, and stored procedure rewrites.
  • Always migrate to a staging environment first. Run your full test suite against PostgreSQL before cutting over production traffic.