PostgreSQL Keeps Secrets: How You Can Access Hidden Row Histories

Itskmyoo
4 min readJan 20, 2025

--

image showing postgres icon, generated using DALL-E

PostgreSQL is a powerful relational database that is widely used for its robustness and rich feature set. While many developers are familiar with its core functionalities, fewer are aware of the hidden columns that PostgreSQL maintains for each row, which can provide insight into the history of changes. These hidden columns include ctid, xmax, and xmin. In this article, we'll explore what these columns are, how they work, and how you can use them to read past values of rows until a VACUUM operation runs.

Understanding the Hidden Columns

PostgreSQL stores additional metadata with each row in a table, which includes the following hidden columns:

1. ctid (Tuple Identifier)

The ctid column uniquely identifies a physical location of a row in a table. It consists of two components:

  • Block Number: The data block where the row is stored.
  • Offset: The position of the row within that block.

Whenever a row is updated, PostgreSQL does not modify the row in place. Instead, it creates a new version of the row in a new location and marks the old version as obsolete. The ctid reflects this physical movement, allowing you to track different versions of the same logical row.

2. xmax (Transaction ID of the Deleting/Updating Transaction)

The xmax column records the Transaction ID (XID) of the transaction that deleted or updated a row. If the row has not been modified, xmax is set to 0. When a row is updated, the xmax of the old row is set to the XID of the updating transaction, and the new row receives the current xmin.

3. xmin (Transaction ID of the Creating Transaction)

The xmin column records the Transaction ID (XID) of the transaction that created the row. This allows you to track when and by which transaction a row was inserted into the table.

These columns are critical components of PostgreSQL’s Multi-Version Concurrency Control (MVCC) mechanism, which allows it to maintain consistency and isolation between transactions.

Using Hidden Columns to Read Past Row Versions

Because PostgreSQL creates new row versions on updates, you can access the previous versions by querying the table’s internal structure before a VACUUM operation removes obsolete rows. Let’s consider an example with a products table:

Example: Debugging Product Price Updates

Suppose you have a table products with the following structure:

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
price NUMERIC
);
  1. Insert a product:
INSERT INTO products (name, price) VALUES ('Laptop', 1000);

2. Lets start a transaction

BEGIN;

3. Update the price:

UPDATE products SET price = 1200 WHERE name = 'Laptop';

4. Query the table’s hidden columns:

SELECT ctid, xmin, xmax, name, price FROM products;

This will show both the old and new rows:

  • The old row with its ctid marked obsolete by xmax.
  • The new row with the current ctid and xmin.
result of reading hidden columns of a table postgres

5. Read the previous version: If you know the ctid of the obsolete row, you can query it directly:

SELECT * FROM products WHERE ctid = '(block, offset)';

Replace (block, offset) with the ctid value of the old row.

Preventing Autovacuum for Debugging

To ensure old row versions are not removed during debugging, you can disable autovacuum temporarily:

ALTER TABLE products SET (autovacuum_enabled = false);

Remember to re-enable it after your analysis:

ALTER TABLE products SET (autovacuum_enabled = true);

Leveraging Hidden Columns for Debugging Production Issues

Hidden columns can be incredibly useful for debugging production issues. Here’s how developers can leverage them:

  1. Trace Updates: Identify which transaction modified a row by inspecting xmax and correlating it with PostgreSQL’s transaction logs.
  2. Analyze Row Changes: Use ctid to track the lifecycle of a row and investigate why or when an update occurred.
  3. Recover Data: Temporarily recover data from obsolete rows before VACUUM removes them.
  4. Audit Rollback: Analyze aborted transactions and their impact by querying rows with specific xmin or xmax values.

Bonus: Pages and Page Headers

Every row in PostgreSQL is stored in a page, which is a fixed-size block of memory (usually 8 KB). Pages contain a header with metadata about the rows they store, including transaction information. While the discussion of pages and page headers is beyond the scope of this article, we’ll leave this topic for a future blog post!

Limitations

  • VACUUM Cleanup: Once VACUUM runs, obsolete rows are permanently removed.
  • Transaction Wraparound: PostgreSQL reuses Transaction IDs over time, which can limit the ability to track changes in long-running systems.

Conclusion

PostgreSQL’s hidden columns, such as ctid, xmax, and xmin, provide a fascinating glimpse into the database’s internals and its MVCC implementation. By understanding and leveraging these columns, you can inspect the history of row changes and troubleshoot complex scenarios. While these techniques are not suitable for long-term auditing, they offer a powerful tool for short-term analysis and debugging.

Have you explored PostgreSQL’s hidden columns in your projects? Share your experiences in the comments below!

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

Itskmyoo
Itskmyoo

Written by Itskmyoo

Entrepreneur - Visionary - Survivor

Responses (3)

Write a response