Preamble
Producing sample data
Before we begin, we must first create some example data:
test=# CREATE TABLE a (aid int); CREATE TABLE test=# CREATE TABLE b (bid int); CREATE TABLE test=# INSERT INTO a VALUES (1), (2), (3), (4); INSERT 0 4 test=# INSERT INTO b VALUES (2), (3), (4), (4), (5); INSERT 0 5
It has a rather straightforward structure. For our little demonstration, it comprises of two tables with a few rows each.
Explicit vs implicit joins
A common question is what distinguishes an implicit join from an explicit join. Syntactic sugar is the simple solution. An implicit join appears here.
test=# SELECT * FROM a, b WHERE aid = bid; aid | bid -----+----- 2 | 2 3 | 3 4 | 4 4 | 4 (4 rows)
In this case, the WHERE clause is where the join actually occurs. We just list the tables we want to join in the FROM clause. Use an explicit join as an alternative:
test=# SELECT * FROM a JOIN b ON (aid = bid); aid | bid -----+----- 2 | 2 3 | 3 4 | 4 4 | 4 (4 rows)
They are both identical in syntax. All that matters in this case is a matter of taste and fashion. The execution plans for both proposals are exactly the same if you look at them.
test=# explain SELECT * FROM a, b WHERE aid = bid; QUERY PLAN ----------------------------------------------------------------- Merge Join (cost=393.42..893.85 rows=32512 width=8) Merge Cond: (a.aid = b.bid) -> Sort (cost=179.78..186.16 rows=2550 width=4) Sort Key: a.aid -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4) -> Sort (cost=179.78..186.16 rows=2550 width=4) Sort Key: b.bid -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=4) (8 rows)
Here is the second plan …
test=# explain SELECT * FROM a JOIN b ON (aid = bid); QUERY PLAN ----------------------------------------------------------------- Merge Join (cost=393.42..893.85 rows=32512 width=8) Merge Cond: (a.aid = b.bid) -> Sort (cost=179.78..186.16 rows=2550 width=4) Sort Key: a.aid -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4) -> Sort (cost=179.78..186.16 rows=2550 width=4) Sort Key: b.bid -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=4) (8 rows)
For most queries, the optimizer basically treats both requests in the same way. There is no difference if you happen to run a normal application. However, there are a few minute variations that merit addressing. Let’s look at the configuration setting listed below, which is located in postgresql.conf:
test=# SHOW join_collapse_limit; join_collapse_limit --------------------- 8 (1 row)
This indicates that eight explicit joins are planned implicitly. PostgreSQL will be able to choose the join order for those joins automatically if you use conventional implicit joins. This was carried out to hasten the inquiry. In practice, this implies that PostgreSQL may reorder the order in which the data is joined if the optimizer determines that it is essential. If you are using explicit joins, this is only done for the first 8 joins; the remaining joins are processed in the order specified by the query. Although there is a distinction between implicit and explicit joins, most applications will not often be affected by this. Joining more than ten tables is typically the exception rather than the rule. Remember that combining numerous tables will already result in genetic query optimization (GEQO).
The choice between explicit and implicit joins is typically one of preference rather than performance. However, keep in mind that there are differences when looking at highly sophisticated queries.
What is an inner join? What is an outer join?
The following common query is: What distinguishes an inner join from an outer join? The correct response is that an inner join will search those tables for shared entries. Every single piece of information on one side will be used in an outer join to locate matches on the other side.
Here’s an illustration:
test=# SELECT * FROM a LEFT JOIN b ON (aid = bid); aid | bid -----+----- 1 | 2 | 2 3 | 3 4 | 4 4 | 4 (5 rows)
Here, we display every record on table “a” (sometimes known as the “left side”) and search for any rows that match on the right side. A RIGHT JOIN is the polar opposite of a LEFT JOIN. The following inquiry is the same as the previous one:
test=# SELECT * FROM b RIGHT JOIN a ON (aid = bid); bid | aid -----+----- | 1 2 | 2 3 | 3 4 | 4 4 | 4 (5 rows)
Full joins
A full join is a third choice, though. It finds matches by combining ALL of the data from both sides. If there is no match on either side of the join, NULL values will be filled in. Here’s an illustration:
test=# SELECT * FROM a FULL JOIN b ON (aid = bid); aid | bid -----+----- 1 | 2 | 2 3 | 3 4 | 4 4 | 4 | 5 (6 rows)
Even though this is easy to do, people often make mistakes because they don’t know what outer joins are or how they work.
Outer joins: Common mistakes and errors
What happens more often than not is that people add an AND clause to the join condition. What does it mean for the result:
test=# SELECT * FROM a LEFT JOIN b ON (aid = bid AND bid = 2); aid | bid -----+----- 1 | 2 | 2 3 | 4 | (4 rows)
Contrary to what most people believe, the result set will not be less than it was. The LEFT JOIN will still produce all the rows on the left; the ON condition only changes which row matches. The additional condition does not reduce the amount of data; it merely causes certain items to become NULL. So, a LEFT JOIN that doesn’t make sense from a semantic point of view is often reported to our PostgreSQL 24-7 help desk as a performance problem.
This can conveniently mask the underlying semantic issue, especially when aggregates are concerned:
test=# SELECT count(*), count(bid) FROM a LEFT JOIN b ON (aid = bid AND bid = 2); count | count -------+------- 4 | 1 (1 row)
People often see wrong results because they think the new criterion will get rid of some data.
IN: What is a semi-join?
But there is more. Often people ask: What is a semi-join? The answer is quite simple: Think of an IN statement:
test=# SELECT * FROM a WHERE aid IN (SELECT bid FROM b); aid ----- 2 3 4 (3 rows)
The implicit DISTINCT filter of the IN-statement eliminates duplicate entries. Therefore, the method used to handle duplication distinguishes between a join and a semi-join.
This duplication is eliminated, as the implementation plan makes quite evident. In this instance, a HashAggregate is used (which you will frequently see when a GROUP BY statement is used):
test=# explain SELECT * FROM a WHERE aid IN (SELECT bid FROM b); QUERY PLAN ----------------------------------------------------------------------- Hash Join (cost=46.38..102.75 rows=1275 width=4) Hash Cond: (a.aid = b.bid) -> Seq Scan on a (cost=0.00..35.50 rows=2550 width=4) -> Hash (cost=43.88..43.88 rows=200 width=4) -> HashAggregate (cost=41.88..43.88 rows=200 width=4) Group Key: b.bid -> Seq Scan on b (cost=0.00..35.50 rows=2550 width=4) (7 rows)
The opposite of an IN statement is NOT IN:
test=# SELECT * FROM a WHERE aid NOT IN (SELECT bid FROM b); aid ----- 1 (1 row)
In this case we remove all other rows from the result.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
RELATED POSTS
How Predictive Database Analytics Helps Optimize Cloud Resource Utilization
- 23 June 2026
- Database Performance Management
As enterprises continue migrating workloads to the cloud, optimizing resource utilization has become a critical business priority. Cloud infrastructure offers scalability, flexibility, and operational agility, but it also introduces new cost and performance challenges. Without proper visibility into workload behavior, organizations often struggle to balance application performance with infrastructure efficiency. At the center of this … Continue reading “How Predictive Database Analytics Helps Optimize Cloud Resource Utilization”
Why Proactive SQL Performance Monitoring Is Essential for Enterprise Growth
In today’s digital economy, enterprise growth depends heavily on application speed, scalability, and reliability. As businesses expand their digital services, customer interactions, transactions, analytics, and operational workloads grow exponentially. Behind nearly every business-critical application lies SQL-driven databases that process and manage massive amounts of structured data in real time. From financial transactions and e-commerce purchases … Continue reading “Why Proactive SQL Performance Monitoring Is Essential for Enterprise Growth”
How to Enable Data-Driven Media Growth with Enteros Cost Attribution and Software Management
- 22 June 2026
- Software Engineering
Introduction The media industry is experiencing one of the most significant transformations in its history. Streaming services, digital publishing platforms, online advertising ecosystems, video-on-demand applications, and content distribution networks have fundamentally changed how audiences consume content. Modern media organizations now operate highly complex digital ecosystems that support: Streaming platforms Digital publishing systems Video content delivery … Continue reading “How to Enable Data-Driven Media Growth with Enteros Cost Attribution and Software Management”
How to Enable Intelligent Wealth Management Operations with Enteros Database Software, AIOps Platform, and Gen AI
Introduction The wealth management industry is undergoing a major transformation. As investors demand personalized financial services, real-time portfolio visibility, and digital-first experiences, wealth management firms are increasingly relying on technology to drive operational efficiency, improve client engagement, and accelerate business growth. Modern wealth management organizations now support: Portfolio management platforms Wealth advisory applications Digital client … Continue reading “How to Enable Intelligent Wealth Management Operations with Enteros Database Software, AIOps Platform, and Gen AI”