Copyright © 2004, 2005, 2008, 2019 Jeroen Ruigrok van der Werven
2019-11-07
Abstract
This reference guide describes how to migrate from MySQL to PostgreSQL on all levels.
Table of Contents
Work in Progress
Please note that this document is undergoing revision. Be advised that some information might be outdated or not fully fleshed out. And the structure absolutely sucks right now.
2019-11-07 update: After a long time I am updating this document by first overhauling the HTML. No content updates just yet.
1. Thanks
Thanks goes out to these individuals for their comments and suggestions (alphabetical order) :
-
David Fetter
-
Robert Treat
Preface
This guide hopes to detail the changes a person must make to his/her database design in order to switch from MySQL to PostgreSQL. The database versions used in the comparison/conversion testing are MySQL 5.x and PostgreSQL 8.3.x. No doubt a lot of the information presented here also applied and applies to past and future versions.
2. Audience
This guide is aimed at people who are migrating from MySQL to PostgreSQL.
3. Prerequisite Knowledge
You should be familiar with the operating system(s) on which you are running MySQL and PostgreSQL. You should also be familiar with relational database concepts.
4. Conventions Used
Abstract
This section describes the conventions used within this guide for text and code examples. It describes:
-
Text Conventions Used
-
Code Examples Conventions Used
4.1. Text Conventions Used
Convention | Meaning | Example |
---|---|---|
Bold | The bold typeface indicates terms that are defined in the text, that appear in the glossary, or both. | When you use this argument, you create a unique index. |
Italics | The italics typeface indicates book titles, emphasis, placeholders, or syntax clauses. | PostgreSQL 8.3 Users' Guide |
UPPERCASE monospace | The uppercase monospace typeface indicates system-supplied elements, such as SQL keywords, commands, datatypes, and privileges. | The NUMERIC datatype allows for arbitrary precision. |
lowercase monospace | The lowercase monospace typeface indicates user-supplied elements and executables. Such elements include database names, column names, user names, and parameter values. | Use psql to start the command line client to PostgreSQL |
4.2. Code Examples Conventions Used
The code examples show typical use of SQL, PL/pgSQL, or other command-line statements. They are displayed using a monospace typeface and clearly separated from the normal text.
SELECT firstname FROM customers WHERE lastname = 'Smith';
The following is a list of conventions used in the code examples, their meaning, and an example.
Convention | Meaning | Example |
---|---|---|
[] | Indicates that the enclosed arguments are optional. | CREATE SCHEMA schemaname [ AUTHORIZATION username ] |
{} | Indicates that one of the enclosed arguments is required. | CREATE { TEMPORARY | TEMP } TABLE table_name |
| | Separates alternative items that may be optional or required. | [ ENCRYPTED | UNENCRYPTED ] { BEFORE | AFTER } |
... | Indicates that the preceding item can be repeated. In code fragments an ellipsis means that code not relevant to the discussion has been omitted. | CREATE USER username [ [ WITH ] option [ ... ] ] - SELECT column1, column2, ..., column/n/ FROM table; |
Italics | Indicates variables that you must supply values for. | column/n/ |
UPPERCASE | ||
lowercase |
5. Overview
5.1. Features
5.1.1. MySQL Features
TBD
5.1.2. PostgreSQL Features
-
inheritance
-
data types
-
functions
Other features provide additional power and flexibility:
-
constraints
-
triggers
-
rules
-
transactional integrity
5.2. So, exactly why are you changing?
There are some reasons a person might want to switch from MySQL to PostgreSQL. Some are based on technical merit, others on personal feelings.
A big difference is that PostgreSQL is BSD-licensed. This makes a lot of difference for some people. This is but a small reason to make a change. The true reasoning for switching databases comes from the technical side. PostgreSQL, unlike MySQL, is a full-fledged relational database management system (RDBMS). This means PostgreSQL is pushing its way up there with the big names such as: Oracle, IBM, and Sybase.
The reason MySQL took off over PostgreSQL is that it was blazingly fast on the SELECT query side. And given this and the average deployment of MySQL as a backend for websites allowed for a big boost in use. There comes a time, however, when a (beginning) DBA starts to feel the shortcomings of MySQL and desires/needs a real RDBMS.
Both MySQL and PostgreSQL are nowadays available as a native Windows package as well.
6. Differences
6.1. Differences that stand out
PostgreSQL makes a difference between single quotes (') and double quotes (") as the ANSI standard dictates. MySQL does not follow this behaviour and allows you to use both single and double quotes to quote values and tends to favour the backtick (`), whereas it should only be single quotes. Double quotes are meant to be used on quoting field names, system identifiers, table names, and so on.
MySQL tends to ship with a database locale/collation set to be case insensitive. This means that doing a SELECT query on a lastname like: WHERE lastname='Johnson' would match 'Johnson', 'JOHNSON', or 'johnson'. This is solved easily, but depends on what you want to do:
-
use the correct case in the query, like in the example above,
-
convert the string to lowercase, e.g.: WHERE lower(lastname)='johnson',
-
use a case-insensitive query operator, e.g. ILIKE or ~*.
ANSI only allows '--' as a comment delimiter, whereas MySQL supports the shell/perl's '#' use of a comment delimiter. Just use '--' since both understand this.
6.2. Data Types
PostgreSQL has 3 integer types, namely SMALLINT (2 bytes big), INTEGER (INT, 4 bytes big) and BIGINT (8 bytes big). MySQL supports 5, namely: TINYINT (1 byte big), SMALLINT (2 bytes big), MEDIUMINT (3 bytes big), INT (4 bytes big), BIGINT (8 bytes big). So just take the equivalent or the one closest to it to contain the other in. Per the standard, PostgreSQL does not have unsigned integer types, so you'll need to go up to the next-larger type if you have unsigned data in MySQL big enough to overflow the signed types.
You can use PostgreSQL's TEXT type to replace all of MySQL's *TEXT types (TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT).
PostgreSQL's DECIMAL and NUMERIC are equivalent to each other, just like MySQL's.
MySQL supports, as an extension to the SQL92 standard, a synonym for DOUBLE PRECISION using just DOUBLE.
6.3. What's The Time?
MySQL's DATETIME is an ISO-8601-like date and timestamp format in the form of: YYYY-MM-DD HH:MM:SS. This excludes any timezone information whatsoever. Within PostgreSQL use TIMESTAMP. This automatically makes it TIMESTAMP WITHOUT TIME ZONE.
To confuse matters MySQL also has a date/time type called TIMESTAMP. This type is updated under a number of conditions. PostgreSQL does not support something like this. Most often you can get away with using the PostgreSQL TIMESTAMP type with a default of now() to have the field receive the current date and time. To get the creation time of the table you would need to use DEFAULT TIMESTAMP.
6.4. Constraints -- Hold What?
MySQL's schema (and Sybase too) often includes declarations like:
CREATE TABLE blah ( id SMALLINT NOT NULL PRIMARY KEY );
PostgreSQL (and Oracle) also support this syntax. Another way to have this working is to define a constraint, e.g.:
CREATE TABLE blah ( id SMALLINT NOT NULL ); ALTER TABLE ONLY blah ADD CONSTRAINT blah_pk PRIMARY KEY (id);
MySQL also knows a concept of enumerators (enum). Basically you specify an enumerated list of allowed values, e.g.:
CREATE TABLE blah ( sentence enum('This', 'is', 'a', 'test') );
With PostgreSQL 8.3 or later, you can use enums, but you need to create the type first. The bright side is, you can then use that type elsewhere without repeating yourself.
CREATE TYPE sentence as ENUM('This', 'is', 'a', 'test'); CREATE TABLE blah ( foo sentence );
6.5. More specialized differences
MySQL has an argument to fields, auto_increment, which automatically increases the value of the field whenever a new entry is made. PostgreSQL does not work like this. Within PostgreSQL (and Oracle) you normally use a SERIAL. See below for an example:
MySQL definition:
CREATE TABLE test ( id AUTO_INCREMENT, name TEXT ); INSERT INTO test (name) VALUES ('a name');
PostgreSQL definition:
CREATE SEQUENCE test_sequence; CREATE TABLE test ( id SERIAL PRIMARY KEY name TEXT ); INSERT INTO test (name) VALUES ('a name');
MySQL also support the SERIAL datatype nowadays.
6.6. Portability reminders
ANSI SQL-92 reserves a lot of words that cannot/should not be used as table or column names. Please avoid using these names as this increases portability between different database backends. Different RDBMSes treat the reserved keyword list differently though, so check with the available documentation.
6.7. Random notes
MySQL's IFNULL(x, y) is definitely not equivalent with
PostgreSQL's NULLIF(x, y). The first returns x if x is not NULL,
otherwise it returns y. PostgreSQL's NULLIF returns a NULL only if
both x and y are equal. MySQL also knows a NULLIF which is the same
as PostgreSQL's. The equivalent to MySQL's IFNULL
would be
COALESCE(x, y)
.
Do note that while MySQL supports N/0 to be equivalent to NULL, apparently, PostgreSQL reports this as a floating point operation division by zero.
MySQL does not do any date checking. February is allowed to have 30 or 31 days and even leap years are not checked so you could also have a day 29 every year. PostgreSQL enforces correct dates.
MySQL has a UNIX_TIMESTAMP('date')
which converts date back to
seconds since the Unix epoch. PostgreSQL can also do this through an
EXTRACT(EPOCH from TIMESTAMP 'date')
.
MySQL's LENGTH(string)
operator is the same as PostgreSQL's
CHAR_LENGTH(string);
PostgreSQL does not really have anything like MySQL's
LAST_INSERT_ID()
. Instead you need to use the sequences. If you have
inserted in the same session you can use SELECT currval(pg_get_serial_sequence('your_table','your_column'));
or
otherwise you can use SELECT last_value from sequence;.