E.1. Release 8.3

Release date: 2007-12-??

Release date: CURRENT AS OF 2007-10-24

E.1.1. Overview

This release represents a major leap forward for PostgreSQL by adding significant new functionality and performance enhancements. This was made possible by a growing community that has dramatically accelerated the pace of development. This release adds the following major capabilities:

Major performance improvements are listed below. Fortunately, most of these enhancements are automatic and do not require user changes or tuning:

E.1.2. Migration to version 8.3

A dump/restore using pg_dump is required for those wishing to migrate data from any previous release.

Observe the following incompatibilities:

E.1.3. Changes

Below you will find a detailed account of the changes between PostgreSQL 8.3 and the previous major release.

E.1.3.1. Performance Improvements

  • Asynchronous commit delays writes to WAL for committed transactions (Simon)

    This feature dramatically increases performance for data-modifying queries. The disadvantage is that, because on-disk changes are delayed, committed data will be lost if the operating system crashes before the data is written to disk. This is useful only for applications that can accept some data loss. Unlike turning off fsync, asynchronous commit does not risk database corruption; the worst case is that after an operating system crash the last few reportedly-committed transactions will be missing. This feature is enabled by turning synchronous_commit off and setting wal_writer_delay.

  • Distributed checkpoints prevent I/O spikes during checkpoints (Itagaki Takahiro and Heikki Linnakangas)

    Previously all modified buffers were forced to disk at checkpoint time, causing an I/O spike and decreasing server performance. This new capability spreads checkpoint activity out between checkpoints, reducing peak I/O usage. (User-requested and shutdown checkpoints are still immediately written to disk.)

  • Heap-Only Tuples (HOT) accelerate space reuse for UPDATEs (Pavan Deolasee, with ideas from many others)

    To allow high concurrency, UPDATE creates a new tuple rather than replacing the old tuple. Previously, only VACUUM could reuse the space taken by old tuples. With HOT, dead tuple space can be reused at the time of UPDATE or INSERT, which allows for more consistent performance. HOT also allows reuse of space from deleted rows. HOT space reuse is not possible if UPDATE changes indexed columns.

  • Just-in-time background writer strategy improves disk write efficiency (Greg Smith, Itagaki Takahiro)

    This basically makes the background writer self-tuning.

  • Reduce per-field and per-row storage requirements (Greg Stark)

    Variable-length data types with data values less than 128 bytes will see a decrease of 3-6 bytes. For example, two CHAR(1) fields now take 4 bytes instead of 16. Rows are also 4 bytes shorter.

  • Reduce need for vacuum by using pseudo-transaction ids in read-only transactions (Florian Pflug)

    Pseudo-transaction ids do not increment the global transaction counter. Therefore, they do not add to the need for vacuum to read all database rows to prevent problems with transaction id wrap-around. Other transaction performance improvements were also made that should improve concurrency.

  • Create a dedicated WAL writer process to off-load work from backends (Simon)

  • Skip unnecessary WAL writes for CLUSTER and COPY (Simon)

    Unless WAL archiving is enabled, it is possible to just fsync() the table at the end of the command, increasing performance. Additional WAL efficiencies were also made.

  • Prevent large sequential scans from forcing out more frequently used cached pages (Simon, Heikki, Tom)

  • Allow large sequential scans to use cached pages from other concurrent sequential scans (Jeff Davis)

    This is accomplished by starting the new sequential scan in the middle of the table (where the other sequential scan is already in-progress) and wrapping around to the beginning to finish. This may affect the order of returned rows in a non-ORDER BY query.

  • Allow ORDER BY ... LIMIT to be done without sorting (Greg Stark)

    This is done by sequentially scanning the table and using a filter to save the few requested rows, rather than sorting the entire table. This is used if there is no matching index.

  • Reduce overhead of populating the statistics tables (Tom)

  • Improve hash join performance for cases with many NULLs (Tom)
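
As an illustration of the asynchronous commit item above, the following sketch relaxes commit durability first for a whole session and then for a single transaction. The table audit_log is hypothetical and exists only for this example; wal_writer_delay is assumed to be set in postgresql.conf rather than per session.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE audit_log (msg text);

-- Relax durability for the rest of this session: COMMIT returns
-- before the WAL records have been flushed to disk.
SET synchronous_commit TO off;
INSERT INTO audit_log VALUES ('session-level asynchronous commit');

-- Restore the default, then relax it for one transaction only.
-- SET LOCAL reverts automatically when the transaction ends.
SET synchronous_commit TO on;
BEGIN;
SET LOCAL synchronous_commit TO off;
INSERT INTO audit_log VALUES ('transaction-level asynchronous commit');
COMMIT;
```

Limiting the setting to individual transactions lets an application keep full durability for critical writes while accepting possible loss for less important ones.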

E.1.3.2. Server Changes

  • Support multiple concurrent autovacuum processes (Alvaro, Itagaki Takahiro)

    This allows multiple vacuums to run concurrently, meaning vacuuming of a large table will not prevent smaller tables from being vacuumed at the same time. Autovacuum is now considered mature and thus enabled by default. Several autovacuum default parameter values were also updated.

  • Autovacuum is now enabled by default (Alvaro)

    Also, autovacuum now reports its activity start time in pg_stat_activity (Tom)

  • Automatically invalidate cached function code when table definitions change or statistics are updated (Tom)

    Previously PL/PgSQL functions that referenced temporary tables would fail if the temporary table was dropped and recreated between function invocations, unless EXECUTE was used.

  • Support Security Service Provider Interface (SSPI) for authentication on Windows

    This also adds support for the GSSAPI authentication API.

  • Support GSSAPI authentication (Henry Hotz, Magnus)

  • Support a global SSL configuration file (Victor Wagner)

  • Add ssl_ciphers parameter to control accepted SSL ciphers (Victor Wagner)

  • Add new encodings EUC_JIS_2004 and SHIFT_JIS_2004, along with new conversions between EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8 (Tatsuo)

  • Make JOHAB encoding client-only (Tatsuo)

    JOHAB cannot safely be used as a server-side encoding.

  • Allow logfile creation in CSV format (Arul Shaji, Greg Smith, Andrew Dunstan)

    The CSV file can be loaded into a database table for analysis.

  • Add log_autovacuum_min_duration parameter to support configurable logging of autovacuum actions (Simon, Alvaro)

  • Add log_lock_waits parameter to log long wait times (Simon)

  • Add log_temp_files parameter to log usage of temporary files (Bill Moran)

  • Add log_checkpoints parameter to improve logging of checkpoints (Greg Smith, Heikki)

  • log_line_prefix escapes %s and %c can now be used in all processes (Andrew)

  • Use our own timezone support for formatting timestamps displayed in the server log (Tom)

    This avoids Windows-specific problems with localized time zone names that are in the wrong encoding. There is a new log_timezone parameter that controls the timezone used in log messages, independent of the client-visible timezone parameter.

  • Change the timestamps recorded in transaction WAL records from time_t to TimestampTz representation (Tom)

    This provides sub-second resolution in WAL, which can be useful for point-in-time recovery.

  • New boolean configuration parameter, archive_mode, controls archiving (Simon)

    Previously setting archive_command to an empty string turned off archiving. Now archive_mode turns archiving on and off. This is useful for stopping archiving temporarily.

  • Improve ability to create warm standby servers using archiving (Simon)

    Allow the warm standby server to pass the earliest needed WAL file to the recovery script to allow automatic removal of unneeded WAL files. This is done using restore_command %r in recovery.conf.

  • Add log_restartpoints archive recovery option to emit a log message at each recovery restart point (Simon)

  • Last transaction end time is now logged at end of recovery and at each logged restart point (Simon)

  • Add a temp_tablespaces parameter to control the tablespaces for temporary tables and files (Jaime Casanova, Albert Cervera, Bernd Helmle)

    This parameter allows a list of tablespaces to be specified, which enables spreading the I/O load across multiple tablespaces. A random tablespace is chosen each time a temporary object is created. Temporary files are no longer stored in per-database pgsql_tmp/ directories but in per-tablespace directories.

  • New system view pg_stat_bgwriter displays statistics about the background writer activity (Magnus)

  • Add new columns for database-wide tuple statistics to pg_stat_database (Magnus)

  • Add an xact_start column to pg_stat_activity (Neil)

    This makes it easier to identify long-running transactions.

  • Add n_live_tuples and n_dead_tuples columns to pg_stat_all_tables and related views (Glen Parker)

  • Remove stats_start_collector parameter (Tom)

    We now always start the collector process, unless prevented by a problem with setting up the stats UDP socket.

  • Remove stats_reset_on_server_start parameter (Tom)

    This was removed because pg_stat_reset() can be used for this purpose.

  • Merge stats_block_level and stats_row_level parameters into a single parameter track_counts, which controls all messages sent to the collector process (Tom)

  • Rename stats_command_string parameter to track_activities (Tom)

  • Limit the amount of information reported when a user is dropped (Alvaro)

    Previously, dropping (or attempting to drop) a user who owned many objects could result in extremely large NOTICE or ERROR messages listing all these objects; this caused problems for some client applications. The length of the list is now limited, although a full list is still sent to the server log.

  • Place temporary table TOAST tables in special schemas named pg_toast_temp_nnn (Tom)

    This allows low-level code to recognize that these tables are temporary, which enables various optimizations such as not WAL-logging changes and using local rather than shared buffers for access. This also fixes a bug where backends unexpectedly held open file references to temporary tables.

  • Fix problem that a constant flow of new connection requests could indefinitely delay the postmaster from completing a shutdown or crash restart (Tom)

  • Allow CREATE INDEX CONCURRENTLY to ignore transactions in other databases (Simon)
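
The new statistics columns and views listed above can be queried like any other system view. The queries below are a minimal sketch; they assume track_activities is enabled so that current_query is populated, and the LIMIT value is arbitrary.

```sql
-- Oldest open transactions first. xact_start is NULL for sessions
-- that are not currently inside a transaction, so those are skipped.
SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 5;

-- Cluster-wide background writer and checkpoint counters.
SELECT * FROM pg_stat_bgwriter;
```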

E.1.3.3. Query Changes

  • Full text search now fully integrated into the core database system (Teodor, Oleg)

    This feature was previously in contrib/tsearch2. It has been improved, moved into the server, and is now installed by default.

  • Add control over whether NULLs sort first or last (Teodor, Tom)

    The syntax is ORDER BY ... NULLS FIRST/LAST.

  • Allow ascending/descending (ASC/DESC) control during index creation (Teodor, Tom)

    Previously, a query using ORDER BY with mixed ASC/DESC specifiers could not fully use an index. Now an index can be fully used in such cases if the index was created with matching ASC/DESC specifications.

  • Support updatable cursors (Arul Shaji, Tom)

    This eliminates the need to reference a primary key to UPDATE or DELETE rows returned by a cursor. The syntax is UPDATE/DELETE WHERE CURRENT OF.

  • Allow FOR UPDATE in cursors (Arul Shaji, Tom)

  • Create a general mechanism that supports casts to and from the standard string types (TEXT, VARCHAR, CHAR) for every datatype, by invoking the datatype's I/O functions (Tom)

    These new casts are assignment-only in the to-string direction and explicit-only in the other direction, and therefore should create no surprising behavior. Various datatype-specific casting functions made obsolete by this mechanism have been removed.

  • Allow col IS NULL to use an index (Teodor)

  • Allow limited hashing when using two different data types (Tom)

    This allows hash joins, hash indexes, hashed subplans, and hash aggregation to be used in situations involving cross-data-type comparisons, if the data types have compatible hash functions. Currently, cross-data-type hashing support exists for SMALLINT/INTEGER/BIGINT, and for FLOAT4/FLOAT8.

  • Improve optimizer logic for detecting when variables are equal in a WHERE clause (Tom)

    This allows mergejoins to work with descending sort orders, and improves recognition of redundant sort columns.

  • Improve performance when planning large inheritance trees when most tables are excluded by constraints (Tom)

  • Remove the undocumented !!= (not in) operator (Tom)

    NOT IN (SELECT ...) is the proper way to perform this operation.
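
Several of the query changes above can be combined in one short session. The sketch below uses a hypothetical tasks table; the table, index, and cursor names are illustrative only.

```sql
-- Hypothetical table and data used only for this illustration.
CREATE TABLE tasks (id integer PRIMARY KEY, priority integer, created timestamp);
INSERT INTO tasks VALUES (1, 5, now());
INSERT INTO tasks VALUES (2, NULL, now());

-- An index whose per-column ordering matches the query below,
-- so a mixed ASC/DESC sort can be satisfied by the index.
CREATE INDEX tasks_prio_created_idx
    ON tasks (priority DESC NULLS LAST, created ASC);

-- Rows with a NULL priority sort after all others.
SELECT id, priority
FROM tasks
ORDER BY priority DESC NULLS LAST, created ASC;

-- Updatable cursor: modify the row the cursor is positioned on
-- without referring to its primary key.
BEGIN;
DECLARE task_cur CURSOR FOR SELECT * FROM tasks FOR UPDATE;
FETCH NEXT FROM task_cur;
UPDATE tasks SET priority = 0 WHERE CURRENT OF task_cur;
COMMIT;
```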

E.1.3.4. Object Manipulation Changes

  • Support arrays of composite types (David Fetter, Andrew, Tom)

    Arrays of the rowtypes of regular tables and views are now supported, but arrays of the rowtypes of system catalogs, sequences, and TOAST tables are not.

  • Server configuration parameters can now be set on a per-function basis (Tom)

    For example, functions can now set their own search_path to prevent unexpected behavior if a different search_path exists at run-time. Security definer functions should set search_path to avoid security loopholes.

  • Add COST and ROWS options to CREATE/ALTER FUNCTION (Tom)

    This allows simple control of the estimated cost of a function call and control over the estimated number of rows returned by a set-returning function.

  • Allow triggers and rules to be deactivated in groups using a session variable, for replication purposes (Jan)

    This allows replication systems to disable triggers and rewrite rules as a group without modifying the system catalogs directly. The behavior is controlled by ALTER TABLE and a new parameter session_replication_role.

    psql's \d command and pg_dump have been enhanced

  • User-defined types can now have type modifiers (Teodor, Tom)

    This allows a user type to take a modifier when being created, e.g. SSNUM(7). Previously only predefined system data types would allow this, e.g. CHAR(4).

  • Foreign keys now must match indexable conditions for cross-data-type references (Tom)
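
The per-function parameter settings and the COST and ROWS options described above might be used as follows. This is a sketch only: the function one_to_three() and the chosen values are hypothetical.

```sql
-- A small set-returning function with planner hints and its own
-- search_path, which is applied for the duration of each call.
CREATE FUNCTION one_to_three() RETURNS SETOF integer AS $$
    SELECT * FROM generate_series(1, 3);
$$ LANGUAGE sql
   COST 10
   ROWS 3
   SET search_path = pg_catalog, pg_temp;

-- The same options can be changed later without redefining the function.
ALTER FUNCTION one_to_three() ROWS 5;
ALTER FUNCTION one_to_three() SET work_mem = '16MB';

SELECT * FROM one_to_three();
```

Placing pg_temp last in search_path follows the advice above for security definer functions: objects in the temporary schema cannot then shadow the ones the function expects.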

E.1.3.5. Utility Command Changes

  • Non-superuser database owners now have privileges to add trusted procedural languages in their databases by default (Jeremy Drake)

    While this is reasonably safe, some administrators may wish to revoke the privilege. It is controlled by pg_pltemplate.tmpldbacreate.

  • Allow a session's current parameter setting to be used as the default for future sessions (Tom)

    This is done with SET ... FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, or ALTER ROLE.

  • Implement new commands DISCARD ALL, DISCARD PLANS, DISCARD TEMPORARY, CLOSE ALL, and DEALLOCATE ALL (Marko Kreen, Neil)

    These commands simplify resetting a database session to its initial state, and are particularly useful for connection-pooling software.

  • Add ALTER VIEW ... RENAME TO and ALTER SEQUENCE ... RENAME TO (David Fetter, Neil)

    Previously this could only be done via ALTER TABLE ... RENAME TO.

  • Implement CREATE TABLE LIKE ... INCLUDING INDEXES (Trevor Hardcastle, Nikhil S, Neil)

  • Make CLUSTER MVCC-safe (Heikki Linnakangas)

    Formerly, CLUSTER would discard all tuples that were committed dead, even if there were still transactions that should be able to see them under the visibility rules.

  • Add new syntax for CLUSTER: CLUSTER table USING index (Holger Schurig)

    The old CLUSTER syntax is still supported, but the new form is considered more logical.

  • Fix EXPLAIN so it can show more complex plans (Tom)

  • Make CREATE/DROP/RENAME DATABASE wait briefly for other backends to exit before failing (Tom)

    This increases the likelihood that these commands will succeed.

  • Prevent NOTIFY/LISTEN/UNLISTEN from accepting schema-qualified names (Bruce)

    Formerly, these commands accepted "schema.relation" but ignored the schema part, which was confusing.
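
A few of the new utility commands above, shown against hypothetical objects. The names orders, recent_orders, and orders_archive are illustrative only; orders_pkey is the name PostgreSQL gives the primary key index by default.

```sql
-- Hypothetical objects used only for this illustration.
CREATE TABLE orders (id integer PRIMARY KEY, placed_on date);
CREATE VIEW recent_orders AS
    SELECT * FROM orders WHERE placed_on > DATE '2007-01-01';

-- New CLUSTER syntax: table first, then the index to cluster on.
CLUSTER orders USING orders_pkey;

-- Views can now be renamed with ALTER VIEW.
ALTER VIEW recent_orders RENAME TO orders_2007;

-- Copy a table definition together with its indexes.
CREATE TABLE orders_archive (LIKE orders INCLUDING INDEXES);

-- Reset the session to its initial state, as a connection pooler
-- might do before handing the connection to another client.
-- DISCARD ALL cannot be run inside a transaction block.
DISCARD ALL;
```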

E.1.3.6. Data Type and Function Changes

  • Support the SQL/XML standard, including new operators and an XML data type (Nikolay Samokhvalov, Peter)

  • Support for enumerated data types (ENUM) (Tom Dunstan)

    This is accomplished by creating a new data type with an ENUM clause, e.g. CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy').

  • Add Universally Unique Identifier (UUID) data type (Gevik Babakhani, Neil)

    This closely matches RFC 4122.

  • Widen the MONEY data type to 64 bits (D'Arcy Cain)

    This greatly increases the range of supported MONEY values.

  • Add new regexp functions regexp_matches(), regexp_split_to_array(), and regexp_split_to_table() (Jeremy Drake, Neil)

    These functions provide access to the capture groups of a regular expression, \(.*\), and allow splitting a string on a POSIX regular expression.

  • Add lo_truncate() function for large object truncation (Kris Jurka)

  • Implement width_bucket() for the float8 data type (Neil)

  • Add pg_stat_clear_snapshot() to discard statistics snapshots collected during the current transaction (Tom)

    The first request for statistics in a transaction takes a statistics snapshot that doesn't change during the transaction. This function allows the snapshot to be discarded and a new snapshot loaded during the next statistics query. This is particularly useful for PL/PgSQL functions which are confined to a single transaction.

  • Add isodow option to EXTRACT() and date_part() (Bruce)

    This is the day of the week, with Sunday as seven. (dow returns Sunday as zero.)

  • Add ID (ISO day of week) and IDDD (ISO day of year) format types for to_char(), to_date() and to_timestamp() (Brendan Jurd)

  • Make to_timestamp() and to_date() assume "TM" (trim) for potentially variable-width fields (Bruce)

    This matches Oracle behavior.

  • Fix off-by-one conversion in to_date()/to_timestamp() 'D' fields (Bruce)

  • Fix float4/float8 to handle Infinity and NAN (not a number) consistently (Bruce)

    The code formerly was not consistent about distinguishing Infinity from overflow conditions.

  • Make setseed() return void, rather than a useless integer value (Neil)

  • Add a hash function for NUMERIC (Neil)

    This allows hash indexes and hash-based plans to be used with NUMERIC.

  • Improve efficiency of LIKE/ILIKE, especially for multi-byte character sets like UTF8 (Andrew, Itagaki Takahiro)

  • Allow leading and trailing whitespace for BOOLEAN values (Neil)

  • Make currtid() functions require SELECT privileges on the target table (Tom)

  • Add several txid_*() functions to query the transaction ids used by the current session (Jan)

    This is useful for various replication solutions.
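
The following queries sketch some of the data type and function additions above. The enum type mood is taken from the example in the text; the other literals are arbitrary.

```sql
-- Enumerated type; values compare in the order they were declared.
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
SELECT 'sad'::mood < 'happy'::mood AS sad_sorts_first;

-- 2007-10-28 was a Sunday: isodow reports 7, dow reports 0.
SELECT EXTRACT(isodow FROM DATE '2007-10-28') AS isodow,
       EXTRACT(dow FROM DATE '2007-10-28') AS dow;

-- Capture groups are returned as a text array.
SELECT regexp_matches('foobarbequebaz', '(bar)(beque)');

-- Split a string on a regular expression, as rows or as an array.
SELECT regexp_split_to_table('red,green,blue', ',');
SELECT regexp_split_to_array('red,green,blue', ',');

-- UUID values are accepted in the standard textual form.
SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid;
```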

E.1.3.7. PL/PgSQL Server-Side Language Changes

  • Add scrollable cursor support by adding directional control to PL/PgSQL's FETCH (Pavel Stehule)

  • Add support for IN as an alternative to FROM in PL/PgSQL's FETCH statement, for consistency with the backend's FETCH command (Pavel Stehule)

  • Add MOVE to PL/PgSQL (Magnus, Pavel Stehule, Neil)

  • Implement RETURN QUERY (Pavel Stehule, Neil)

    This adds convenient syntax for PL/PgSQL set-returning functions that want to return the result of a query, rather than using RETURN NEXT. RETURN QUERY is more efficient too.

  • Allow function parameter names to be qualified with the function's name (Tom)

    For example, myfunc.myvar. This is particularly useful for specifying variables in a query where the variable name might match a column name.

  • Tighten requirements for FOR loop STEP values (Tom)

    Prevent non-positive STEP values, and handle loop overflows.

  • Improve accuracy when reporting syntax error locations (Tom)
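
RETURN QUERY and function-qualified parameter names, both described above, can be sketched together. The example assumes the plpgsql language is already installed in the database; the shipments table and the shipments_since function are hypothetical.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE shipments (id integer, shipped_on date);

CREATE FUNCTION shipments_since(since date) RETURNS SETOF shipments AS $$
BEGIN
    -- RETURN QUERY appends the whole result set at once, instead of
    -- looping over the rows and calling RETURN NEXT for each one.
    -- The parameter is qualified with the function's name to make
    -- clear that it is not a column reference.
    RETURN QUERY
        SELECT * FROM shipments
        WHERE shipped_on >= shipments_since.since;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM shipments_since(DATE '2007-10-01');
```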

E.1.3.8. PL/Perl Server-Side Language Changes

  • Allow type-name arguments to spi_prepare() to be data type aliases in addition to names in pg_type (Andrew)

E.1.3.9. PL/Python Server-Side Language Changes

  • Enable PL/PythonU to compile on Python 2.5 (Marko Kreen)

  • Allow type-name arguments to plpy.prepare() to be data type aliases in addition to names in pg_type (Andrew)

  • Support a true boolean type in compatible Python versions (Python 2.3 and later) (Marko Kreen)

E.1.3.10. PL/Tcl Server-Side Language Changes

  • Allow type-name arguments to spi_prepare to be data type aliases in addition to names in pg_type (Andrew)

  • Fix problems with thread-enabled libtcl spawning multiple threads within the backend (Steve Marshall, Paul Bayer, Doug Knight)

    This caused all sorts of unpleasantness.

E.1.3.11. psql Changes

  • List disabled triggers separately in \d output (Brendan Jurd)

  • Show aggregate return types in \da output (Greg Sabino Mullane)

  • Add the function's volatility to the output of \df+ (Neil)

  • In \d patterns, always match $ literally (Tom)

  • Add \prompt capability (Chad Wagner)

  • Allow \pset, \t, and \x to use on/off, rather than just toggling (Chad Wagner)

  • Add \sleep capability (Jan)

  • Enable \timing output for \copy (Andrew)

  • Improve \timing resolution on Windows (Itagaki Takahiro)

  • Flush \o output after each backslash command (Tom)

E.1.3.12. pg_dump Changes

  • Add --tablespaces-only and --roles-only options to pg_dumpall (Dave Page)

  • Add an output file option to pg_dumpall (Dave Page)

    This is primarily useful on Windows, where output redirection of child pg_dump processes does not work.

  • Allow pg_dumpall to accept an initial-connection database name rather than the default template1 (Dave Page)

  • In -n and -t switches, always match $ literally (Tom)

  • Improve performance when a database has many thousands of objects (Tom)

E.1.3.13. Other Client Application Changes

  • In initdb, allow the location of the pg_xlog directory to be specified (Euler Taveira de Oliveira)

  • Enable core dump generation in pg_regress and pg_ctl, if possible (Andrew)

  • Allow Control-C to cancel clusterdb, reindexdb, and vacuumdb (Itagaki Takahiro, Magnus)

  • Suppress command tag output for createdb, createuser, dropdb, dropuser (Peter)

    The --quiet option is ignored and will be removed in 8.4. Progress messages when acting on all databases now go to stdout instead of stderr because they are not actually errors.

E.1.3.14. libpq Changes

  • Interpret the dbName parameter of PQsetdbLogin() as a conninfo string if it contains an equals sign (Andrew)

    This allows use of conninfo strings in client programs that still use PQsetdbLogin().

  • Support a global SSL configuration file (Victor Wagner)

  • Add environment variable PGSSLKEY to control SSL hardware keys (Victor Wagner)

  • Add lo_truncate() for large object truncation (Kris Jurka)

  • Add PQconnectionUsedPassword() that returns true if the server required a password (Joe Conway)

    If this returns true and the connection failed, a client application should prompt the user for a password.

E.1.3.15. ecpg Changes

  • Major rewrite to use V3 frontend/backend protocol (Michael)

    This adds server-side prepared statements.

  • Use native threads, instead of pthreads, on Windows (Magnus)

  • Improve thread-safety of ecpglib (Itagaki Takahiro)

  • Have ecpg libraries export only API symbols (Michael)

E.1.3.16. Windows Port

  • Allow the backend database server to be compiled with Microsoft Visual C++ (Magnus and others)

    Windows executables made with Visual C++ might have better stability and performance than those made with other tool sets. Development and debugging tools familiar to Windows developers will also work. The client-only C++ build scripts have been removed.

  • Allow regression tests to be started by an admin user (Magnus)

  • Add native shared memory implementation for Windows (Magnus)

E.1.3.17. Source Code Changes

  • Rename macro DLLIMPORT to PGDLLIMPORT to avoid conflicting with third party includes (like TCL) that define DLLIMPORT (Magnus)

  • Allow execution of cursor commands through SPI_execute (Tom)

    The macro SPI_ERROR_CURSOR still exists but will never be returned.

  • SPI plan pointers are now SPIPlanPtr instead of void * (Tom)

    This does not break application code, but switching is recommended to help catch simple programming mistakes.

  • Add cursor-related functionality in SPI (Pavel Stehule)

    Allow access to the cursor-related planning options, and add FETCH/MOVE routines.

  • Add configure --enable-profiling to enable code profiling (works only with gcc) (Korry Douglas and Nikhil S)

  • Add configure --with-system-tzdata to use the operating system time zone database (Peter)

  • Create "operator families" to improve planning of queries involving cross-data-type comparisons (Tom)

  • Support gmake draft when building the SGML documentation (Bruce)

  • Update GIN extractQuery() API to allow signalling that nothing can satisfy the query (Teodor)

  • Move NAMEDATALEN definition from postgres_ext.h to pg_config_manual.h (Peter)

  • Change server startup log message from "database system is ready" to "database system is ready to accept connections"

  • Provide strlcpy() and strlcat() on all platforms, and replace error-prone uses of strncpy(), strncat(), etc (Peter)

  • Fix pgstats counting of live and dead tuples to recognize that committed and aborted transactions have different effects (Tom)

  • Create hooks to let a loadable plugin monitor (or even replace) the planner and create plans for hypothetical situations (Gurjeet Singh, Tom)

  • Create a function variable join_search_hook to let plugins override the join search order portion of the planner (Julius Stroffek)

  • Add tas() support for Renesas' M32R processor (Kazuhiro Inaoka)

  • Have quote_identifier() and pg_dump not quote keywords that are unreserved according to the grammar (Tom)

  • Fix PGXS so extensions can be built against Postgres installations whose pg_config program does not appear first in the PATH (Tom)

  • Change the on-disk representation of the NUMERIC data type so that the sign_dscale word comes before the weight (Tom)

  • Use SYSV semaphores rather than POSIX on Darwin >= 6.0, i.e., OS X 10.2 and up (Chris Marcellino)

E.1.3.18. Contrib Changes

  • Add /contrib/pageinspect module for low-level page inspection (Simon, Heikki)

  • Add /contrib/pg_standby module for warm standby operation (Simon)

  • Add /contrib/uuid-ossp module for generating UUID values using the OSSP UUID library (Peter)

    Use configure --with-ossp-uuid to activate. This takes advantage of the new UUID builtin type.

  • Allow pgbench to set the fillfactor (Pavan Deolasee)

  • Add timestamps to pgbench -l (Greg Smith)

  • Add usage count statistics to contrib/pg_buffercache (Greg Smith)

  • Add GIN support for hstore (Teodor)

  • Add GIN support for pg_trgm (Guillaume Smet, Teodor)

  • Update OS X startup scripts in /contrib/start-scripts (Mark Cotner, David Fetter)

  • Restrict pgrowlocks() and dblink_get_pkey() to users who have SELECT privilege on the target table (Tom)

  • Restrict contrib/pgstattuple functions to superusers (Tom)

  • contrib/xml2 is deprecated and planned for removal in 8.4 (Peter)

    The new XML support in core Postgres supersedes this module.