Bracing against the wind  

Tuesday, December 03, 2013

UTF8 Encoding and Postgres Dump

If you back up a database from one version of postgres using pg_dumpall, then restore to another version, it's very possible that the strictness or some other aspect of the UTF8 library has changed. The end result is that you will see an error "invalid character for encoding" while restoring. The utility "iconv" can help, and it was really hard to find the answer online, which is why I'm reposting for my own sanity:

gunzip -c dumpall.gz | iconv | sudo -u postgres psql > restore.errs 2>&1 
Miraculously, iconv worked without options, cleaning up version incompatibilities, while maintaining all the data... even the funny French city names. If you're changing encodings while also upgrading, that's fine too. iconv takes a "from" and "to" argument, so log in to the source db and destination db, and use:

SHOW client_encoding;
Then use the results as the "-f FROM" and "-t TO" arguments to iconv. No conversion is perfect, but iconv seems to do as good a job as possible.

[View/Post Comments] [Digg] [] [Stumble]

Home | Email me when this weblog updates: | View Archive

(C) 2002 Erik Aronesty/DocumentRoot.Com. Right to copy, without attribution, is given freely to anyone for any reason.

Listed on BlogShares | Bloghop: the best pretty good | Blogarama | Technorati | Blogwise