Ms2pg is a utility to assist with migration of a database on Microsoft SQL Server to a PostgreSQL database server. This script will connect to the MS-SQL server and to the PostgreSQL server to copy tables, primary keys, indexes and data. At this time the foreign keys are not copied. Plans for a future is to add the foreign key copy, update sequence counter (should this be done after data copy?) and other fun stuffs. Edoceo also provides consulting and development services for PostgreSQL.
Script Preparation
The PostgreSQL ODBC Driver must be installed first, this can be downloaded and installed from the link below:
Download
If you have a patch use the contact form to let us know.
- ms2pg 0.4 - Does some UNICODE/UTF8
Script Usage
This script has a few parameters that determine its operation, they are listed below. The parentheses indicate which version the option first appeared in.
Flags: -c create structure only, no data copy, default false (>=0.2) -f flush destination of duplicate tables before creating, default false (>=0.4) -l turns table and colum names to lowercase (>=0.2) -p pretend, prints the SQL output, default false (>=0.3) -t time myself and report stats, default false (>=0.2) -v makes verbose, default false (>=0.1) Parameters: /shostname:[source hostname, default localhost] (>=0.1) /sdatabase:[source database name, required] (>=0.1) /susername:[source connection username, default sa] (>=0.1) /spassword:[source connection password, default blank] (>=0.1) /dhostname:[destination hostname, default localhost] (>=0.1) /ddatabase:[destination database, default sdatabase] (>=0.1) /dusername:[destination connection username, default postgres] (>=0.1) /dpassword:[destination connection password, default blank] (>=0.1) /table:[tablename] (>=0.1)
Script Usage Example
Wrapped lines have a \ at the end and the following line is indented.
# Migrate the database `clients` verbosely C:\>cscript ms2pg.vbs -v /sdatabase:clients /dhostname:db.mydomain.com # Convert the database `clients` verbosely, with specific users C:\>cscript ms2pg.vbs -v /sdatabase:clients /susername:dba /spassword:fr43ed \ /dhostname:db.mydomain.com /dusername:webdb /dpassword:simple # Copy the database, new name or into existing database C:\>cscript ms2pg.vbs -v /sdatabase:clients /dhostname:db.mydomain.com \ /ddatabase:big_master # Copy from one server with one name to another server with another name C:\>cscript ms2pg.vbs /shostname:192.168.1.56 /sdatabase:clients \ /dhostname:192.168.1.65 /ddatabase:big_master
UNICODE/UTF8 Warning
If the source database uses any of the nchar, ntext or nvarchar field types the destination database must support UTF8 encoding. To do this in PostgreSQL set the encoding at database creation time as follows:
create database [something] encoding 'UTF8';
If the source database uses these types, and the destination does not have the correct encoding ms2pg will attempt the data copy but the results are undefined.
Invalid Data Types
Sadly this script cannot handle all of the data types, the exception list is below. While running if the script encounters one of these data types it will halt. If any tables were created in the destination system they will not be removed, this script does not run in a transaction.
- binary - don't know how to convert, patches?
- decimal - exact conversion unknown, could try numeric
- image - don't know how to convert, patches?
- sql_variant - useless!
- timestamp - this isn't a timestamp it's a unique rowversion, bigint. Not supported, no plan to support.
- uniqueidentifier - Could try a char(40), patches?
- varbinary - don't know how to convert, patches?
Converted Data Types
Some data types must be converted, the list is below
- bit - becomes boolean
- datetime,smalldatetime - becomes timestamp with out timezone
- float,real - becomes double precision
- money,smallmoney - becomes numeric(19,4)
- nchar,ntext,nvarchar - become char,text and varchar; PostgreSQL must use UTF8 encoding, see above.
- tinyint - becomes smallint
Performance
Since v0.2 and the timing feature and statistics have been available.
This test environment had two machines connected via 100Mb isolated network. As a little note this script uses about 8-12Mb (according to Task Manager) and lots of time and I/O, go figure. This test took 503 seconds to run, not to bad
SQL Server | Script Machine | PostgreSQL Server | |
---|---|---|---|
Machine: | localhost | localhost | imperium |
Hardware: | AthlonXP 2200+ 378 MB 40 GB IDE | AthlonXP 2200+ 378 MB 40 GB IDE66 | AthlonXP 1700+ 512 MB 250 GB IDE66 |
Database: | 14 tables, 281147 rows, 71MB | PostgreSQL equivalent |
Here is the output of this script run from one dual Xeon machine to another on 100Mb network:
ms2pg time: 5 minutes 53 seconds to complete ms2pg stat: 68 tables, 76227 rows, 59 primary keys, 0 indexes
History / Change Log
The first release of this package, on 4 Dec 2004, simply copied the structure of the tables. A beta data copy procedure was in place, very buggy. Plans were made for the other features that would eventually show up.
Digging in MSDN for more of the stored procedures that can query the database structure allowed for the ability to copy primary keys and indexes and the release of version 0.2 on 5 Dec 2004. The code for the data copy was cleaned and datatype formatting was added. Features to disable the data copy (-c), munge all names to lowercase (-l) and for timing and statistics (-t) were added.
After the first real usage patches were made, code was cleaned and the pretend (-p) functionality was added. One function was renamed, stylish changes were made. This page was updated and version 0.3 was born on 9 Dec 2004.
August 2005 and a commissioned project led to the update in 0.4 to support UNICODE/UTF8 data field types. In Microsoft SQL they are called nchar, ntext and nvarchar. To use them in PostgreSQL the the client_encoding must be set to UTF8 on the database before running ms2pg.
Thanks
Thanks to stelf for pointing out bugs.
See Also
These links cover topics on RDBMS comparisons and the developer guides that were used in creation of this script.
- Transaction Processing Performance Council - Database Benchmarksm, doesn't mention PostgreSQL or mySQL though
- The Open Source Database Benchmark
- PostgreSQL vs. MySQL vs. Commercial Databases: It's All About What You Need
- ADO API Reference
- Transact-SQL Reference - System Stored Procedures
- Transact-SQL Reference - Data Types
- PostgreSQL Data Types