Preamble
I reasoned that since I’ve only recently become aware of the idea of “killed index tuples,” there may be some other people who haven’t yet heard of this intriguing PostgreSQL concept.
This may help you figure out why execution times for the same PostgreSQL query with the same execution plan can be so different.
Let’s review the life cycle of a table row version (also known as a “heap tuple”) before we take a closer look at the index.
Life, death and visibility in the table heap
It is widely known that the visibility of heap tuples is determined by the system columns xmin and xmax. A heap tuple is “dead” if its xmax is less than the xmin of all active transactions.
Now xmin and xmax are only valid if the respective transactions have been marked committed in the “commit log.” Consequently, any transaction that needs to know if it can see a tuple has to consult the commit log. The first person to look at the commit log will put the details in the “hint bits” of the tuple so that other people don’t have to do that extra work.
Dead tuples are eventually reclaimed by VACUUM.
All of this is generally known, but how are index entries doing?
Life, death and visibility in the index
The visibility information is not stored in the index in order to avoid duplication and maintain small index tuples.
The status of an index tuple is determined by the heap tuple it points to, and both are removed by VACUUM at the same time.
So, an index scan needs to look at the heap tuple to see if it can “see” an entry. This is the case even if all the columns needed are in the index tuple itself. Even worse, this “heap access” will lead to random I/O, which on spinning disks is not very effective.
Because of this, PostgreSQL’s index scans are more expensive than those in other database management systems with different architectures. Over time, a number of features have been added to help mitigate that:
- In order to reduce random I/O and avoid visiting the same block multiple times during an index scan, PostgreSQL 8.1 introduced the “bitmp index scan,” a scan method that first creates a list of heap blocks to visit before scanning them sequentially.
- The “index-only scan,” introduced in PostgreSQL 9.2, avoids fetching the heap tuple by requiring that all necessary columns be in the index and that the “visibility map” demonstrate that all tuples in the table block are visible to all users.
However, I want to concentrate on a different feature that was also included in 8.1 but was never mentioned in the release notes.
With this commit, the feature was unveiled:
commit 3f4d48802271126b1343289a9d2267ff1ed3788a Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri May 24 18:57:57 2002 +0000 Mark index entries "killed" when they are no longer visible to any transaction, so as to avoid returning them out of the index AM. Saves repeated heap_fetch operations on frequently-updated rows. Also detect queries on unique keys (equality to all columns of a unique index), and don't bother continuing scan once we have found first match. Killing is implemented in the btree and hash AMs, but not yet in rtree or gist, because there isn't an equally convenient place to do it in those AMs (the outer amgetnext routine can't do it without re-pinning the index page). Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and index_insert to make this a little easier.
When an index scan retrieves a heap tuple and discovers that it is dead—or, to be more precise, that the entire “HOT chain” of tuples is dead—it marks the index tuple as “killed.” Afterward, future index scans can just ignore it.
Future index scans can be significantly sped up by using this “hint bit for indexes.”
An example
To illustrate killed index tuples, let’s build a table. To prevent autovacuum from ruining the effect by removing the dead tuples, we must turn it off.
CREATE UNLOGGED TABLE HOLE WITH (autovacuum_enabled = off) AND (id integer PRIMARY KEY, val text NOT NULL); Put into the gap ANALYZE hole; SELECT i, "text number" || i FROM generate_series(1, 1000000) AS i;
By removing rows from the table, we produce several dead tuples:
Delete from the hole where the ID is between 501 and 799500 and analyze the hole.
Spot the differences!
The same query is now executed twice:
SELECT * FROM hole WHERE id IS BETWEEN 1 AND 800000; QUERY PLAN EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMIMG OFF) --------------------------------------------------------------- Index Scan using hole_pkey on hole (actual rows=1000 loops=1) Index Cond: ((id >= 1) AND (id <= 800000)) Buffers: shared hit=7284 Planning Time: 0.346 ms Execution Time: 222.129 ms (5 rows) EXPLAIN (ANALYZE, BUFFERS, OFF COSTS, OFF TIMIMG) QUERY PLAN SELECT * FROM hole WHERE id IS BETWEEN 1 AND 800000; --------------------------------------------------------------- Index Scan on hole with hole_pkey (actual rows = 1000 loops = 1). Index Cond: ((id >= 1) AND (id <= 800000)) Buffers: shared hit=2196 Planning Time: 0.217 ms Execution Time: 14.540 ms (5 rows)
Discussion of the example
Numerous factors exclude them from being responsible for the difference, including:
- However, you can see from the execution plan that in both cases, all blocks were already in the cache (“shared hit”) and none had to be read from disk. Often, the second execution is faster because the first run has to read more data from secondary storage, which are already cached on the second run.
- No hint bits had to be written during the first execution (no buffers were “dirtied”), because that was already done during the sequential scan from the
DELETE.
The 799000 index tuples that pointed to dead tuples had to be killed during the first execution, which required visiting all of the table blocks.
Because the second execution didn’t have to do that, it was ten times faster.
Because each table block contains many dead tuples, the difference in blocks touched by the two queries is not as great as one might think.
It is also worth noting that even though they are not shown in the EXPLAIN output, some index pages must have been dirtied in the process, which will cause some disk writes from this read-only query.
Conclusion
The next time you experience enigmatic query run-time variations that cannot be attributed to caching effects, think about killed index tuples as the likely culprit. This could be a sign that autovacuum needs to be set to run more often on the table in question. This will get rid of the dead tuples and make the first query run faster.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
RELATED POSTS
How to Enable Intelligent Wealth Growth with Enteros Database Analytics, RevOps Automation, and Gen AI
- 24 June 2026
- Software Engineering
Introduction Wealth management and investment organizations are entering a new era defined by data-driven decision-making, AI-powered advisory systems, and highly automated operational environments. As client expectations grow and financial markets become more dynamic, firms must continuously improve performance, efficiency, and personalization to remain competitive. Modern wealth organizations now operate complex ecosystems that include: Portfolio management … Continue reading “How to Enable Intelligent Wealth Growth with Enteros Database Analytics, RevOps Automation, and Gen AI”
How to Improve Financial Cost Visibility with Enteros Database Management Platform and Cost Attribution Analytics
Introduction The financial services industry is rapidly evolving as banks, insurance companies, fintech platforms, and investment firms modernize their digital infrastructure to support real-time transactions, data-driven decision-making, and highly personalized customer experiences. Modern financial organizations operate complex ecosystems that include: Core banking systems Digital payment platforms Investment and trading systems Risk management applications Fraud detection … Continue reading “How to Improve Financial Cost Visibility with Enteros Database Management Platform and Cost Attribution Analytics”
How AI-Driven Database Monitoring Enhances Business Continuity and Resilience
In today’s always-on digital economy, business continuity and operational resilience have become essential for enterprise success. Organizations depend heavily on digital systems to support customer interactions, financial transactions, supply chain operations, analytics, internal workflows, and real-time decision-making. Any disruption to these systems can lead to significant financial loss, operational inefficiencies, and reputational damage. At the … Continue reading “How AI-Driven Database Monitoring Enhances Business Continuity and Resilience”
Reducing Application Latency with Intelligent Database Performance Management
In today’s digital economy, application speed is directly tied to business success. Whether users are shopping online, using banking applications, streaming content, accessing SaaS platforms, or interacting with enterprise systems, they expect fast and seamless experiences. Even minor delays can impact user satisfaction, engagement, and revenue. Application latency has become one of the most important … Continue reading “Reducing Application Latency with Intelligent Database Performance Management”