Plunge into the Core of PostgreSQL: An 8 min guide

Diva Dugar · Published in Level Up Coding · May 14, 2023


This series will be about exploring the depths of PostgreSQL in detail. As a software engineer, I have seen many articles that either explain the theory or the practical aspect of PostgreSQL. In this series, I will cover both the practical and theoretical aspects and explain why things are the way they are. I would also like to hear from you about the different ways in which we can explore PostgreSQL, so any ideas are welcome and would be much appreciated.

What I am using:

  • Postgres 14
  • Ubuntu 22
  • Something to snack on & tea
    (Of course, you can use other versions of the items on this list)

Let’s get to it.

Getting into Databases and creating Tables

1. To get into Postgres
-> sudo -i -u postgres
-> psql
2. List all databases
-> \l
3. Create Database
-> CREATE DATABASE exploration_medium;
4. Create table in it
First we connect to it
-> \c exploration_medium
-> CREATE TABLE A(name varchar(10)); -- couldn't come up with a more innovative name :P
5. List of relations
-> \dt+

What is \dt+ indicating?

                                     List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method |  Size   | Description
--------+------+-------+----------+-------------+---------------+---------+-------------
 public | a    | table | postgres | permanent   | heap          | 0 bytes |
(1 row)
  1. Schema: “This is a logical container that holds a set of related database objects, and it provides a way to organize and manage the database objects within a database.” What does this mean to a layman? Each schema can have its own set of tables, views, functions, indexes, etc., that are different from the objects in other schemas.

But why would I need to make a new schema? The default “public” schema seems perfectly okay. It is, until security and data integrity become a concern. Here are just a couple of scenarios.

  • Imagine you are a bank. You have employee data, customer data, financial data, etc. You can create schemas to enforce security and access control based on user roles.
  • When developers are testing and building features, you would not want them doing it in production (PRD). By making separate schemas, you can let developers work on a specific set of objects (a minimal sketch follows below).

While researching, I went too deep into Schemas and it seems like it warrants a separate article.
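To make the bank scenario a bit more concrete, here is a minimal sketch of schemas plus roles for access control (the schema and role names are made up for illustration):

-> CREATE SCHEMA hr;
-> CREATE SCHEMA finance;
-> CREATE ROLE hr_staff;
-> CREATE ROLE finance_staff;
-> GRANT USAGE ON SCHEMA hr TO hr_staff;
-> GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA hr TO hr_staff;
-> GRANT USAGE ON SCHEMA finance TO finance_staff;
-> GRANT SELECT ON ALL TABLES IN SCHEMA finance TO finance_staff;

Now hr_staff can work with objects in hr but has no access to finance, and vice versa.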
2. Persistence: You can also create temporary tables. They are useful if you want to store the result of a very complex query for subsequent use; the data lives only within a session or connection and is dropped automatically afterwards. You do need to think hard about when this is actually better than a view or a materialized view.
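A quick sketch of what that looks like (the “expensive” query here is just a stand-in):

-> CREATE TEMPORARY TABLE expensive_result AS
   SELECT name, count(*) FROM A GROUP BY name; -- stand-in for a genuinely complex query
-> SELECT * FROM expensive_result;

If you run \dt+ now, the Persistence column for expensive_result reads temporary, and the table vanishes when the session ends.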
3. Access method: This is how data is organized and stored in a table. There are two kinds of access methods: table and index. Table access methods define how the data itself is stored (heap is the default), while index access methods define how data is searched and accessed efficiently, for example B-tree, GiST, SP-GiST, GIN, etc. Can I change this access method? Yes, but that is not covered here.
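If you are curious which access methods your installation knows about, you can peek at the pg_am catalogue ('t' means a table access method, 'i' an index access method), and pick one explicitly when creating an index:

-> SELECT amname, amtype FROM pg_am;
-> CREATE INDEX a_name_hash_idx ON A USING hash (name); -- an index with a non-default access method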

Database Cluster


Have you noticed that whenever you wander into /var/lib/postgresql/14/ there is a main folder? That is a cluster.

It is a collection of databases, plus configuration files, etc., that work together to provide a shared environment for storing and managing data. A cluster is a single directory under which all the data is kept: the data directory.
Each database in a cluster shares the same underlying resources, such as CPU, disk space, and memory. Suppose you have max_connections set to 100, and one database in the cluster uses all of them; this leaves the other databases stranded for connections, since connections are shared across every database in a cluster. Of course, it is possible to set a connection limit per database in a cluster (a sketch follows below). Sharing one cluster also lets the different databases use those resources efficiently.
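For instance, a per-database limit can be set like this (20 is an arbitrary number for illustration):

-> ALTER DATABASE exploration_medium CONNECTION LIMIT 20;
-> SELECT datname, datconnlimit FROM pg_database; -- -1 means no per-database limit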

1. Listing all clusters (if you run the command below, in most cases there is just one)
-> sudo pg_lsclusters

2. To create a new cluster
-> sudo pg_createcluster <Postgres Version> <name-of-cluster>

3. To start the cluster
-> sudo pg_ctlcluster <Postgres Version> <name-of-cluster> start
(If the earlier cluster ran on port 5432, this one might run on 5434 or the next available port)

4. To delete the cluster
i) First stop the PostgreSQL service
ii) Then run sudo rm -r /var/lib/postgresql/<Postgres Version>/<name-of-cluster>
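Once the new cluster is up, you can connect to it by pointing psql at whatever port it landed on (assuming 5434 here, as noted above):

-> sudo pg_lsclusters (confirm which port the new cluster actually got)
-> sudo -u postgres psql -p 5434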

So, when you had one database cluster, it was managed by one PostgreSQL server. Now that you have two, they are handled by two separate instances of the PostgreSQL server.

Why would you need it?

  1. If you want to test a new configuration without affecting an existing cluster, you can create new config files and try them out. It won't affect the old cluster at all, but it will require additional system resources.
  2. You can create different environments for testing, development, and production. Each cluster can have different configurations, permissions, and data.
  3. It can help with upgrading the database version. Suppose you want to upgrade from Postgres 12 to 14: you can install 14, run the commands above, and test it out (a sketch follows after this list).
  4. Of course, it is also used for distributing load for high-traffic apps by creating multiple clusters.
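For scenario 3, the same Ubuntu/Debian tooling used above also ships pg_upgradecluster. As a rough sketch, upgrading a hypothetical 12/main cluster to the newest installed version would look like this:

-> sudo pg_upgradecluster 12 main
(This creates a new cluster on the newer version and migrates the data into it; the old cluster is stopped and kept around until you explicitly drop it.)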

When to use Database Cluster vs Schema?

A schema is a logical container for database objects such as tables, views, and functions.

A database cluster, on the other hand, is a collection of databases managed by a single instance of a running database server.

To summarize, a schema is suitable when you need to separate or organize data within a single database, while a database cluster is suitable when you need to manage multiple databases as a single entity or isolate them from each other for security or performance reasons.

Database Files

Have you ever run ls /var/lib/postgresql/14/main and wondered what these subdirectories could possibly be?

  • base: This directory contains the actual data for all the user databases of a cluster. The subdirectories inside it are named after the OID of the database they contain (the query below shows how to map those OIDs back to names).
  • global: This directory contains the global data that is shared among all the databases in the cluster, such as system catalogues and configuration files.
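To map those OID-named directories back to actual databases, and to find the file behind our little table, something like this works:

-> SELECT oid, datname FROM pg_database;
-> SELECT pg_relation_filepath('a'); -- returns something like base/<database OID>/<file node>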

There are other directories prefixed with pg_*, but as users we don't touch them directly. If you want to peek into one, though, there is a directory called pg_wal. Let's first get some output out of it, and then I can explain what we just did and what WAL is.

1. First, check whether you have pg_waldump on your system.
-> which pg_waldump
If this says pg_waldump not found, that is fine; just check whether you have postgresql-contrib installed.
If you do, then maybe, like me, you simply don't have it on your PATH, so use the absolute path:

-> /usr/lib/postgresql/<PG-Version>/bin/pg_waldump

2. Now go to the directory /var/lib/postgresql/14/main/pg_wal/ (you'll need root access).

3. You'll see files named like "00000001000000000000000A"; just pick any two consecutive files.

4. This runs pg_waldump on them and writes the output to a wal_dump.txt file:

sudo /usr/lib/postgresql/14/bin/pg_waldump -p /var/lib/postgresql/14/main/pg_wal/ 00000001000000000000000A 00000001000000000000000B > ~/<Your-output-path>/wal_dump.txt

This is what the output will look like

rmgr: Btree       len (rec/tot):     64/    64, tx:       4782, lsn: 0/0A000030, prev 0/09FFFFB0, desc: INSERT_LEAF off 318, blkref #0: rel 1663/34944/34990 blk 1037
  • rmgr: It stands for “resource manager” and indicates which component of PostgreSQL generated the record. It helps identify the type of operation that was performed on the database, which can be useful for performance tuning, troubleshooting, and monitoring the system's behaviour if you know what is most active.
  • len (rec/tot): 64/64: This is the length of the log record in bytes; the first number is the record itself and the second is the total including any full-page images. Since both are 64 here, the record carries no full-page image.
  • tx: 4782: This is the transaction ID associated with the log record. It indicates that this operation was performed as part of transaction 4782.
  • lsn: This is the log sequence number (LSN) of the record, which is used to order the records and identify their position in the WAL stream. The first number (0) indicates the log file number, while the second number (0A000030) indicates the position of the record within the log file.
  • prev: This is the LSN of the previous record in the stream.
  • desc: This gives us info about the nature of the operation, whether it was an insert, delete, or update, and the location within the relation where the change was made (e.g., off 318 is the offset number, i.e. the item position within the page).
  • blkref: It tells us on which page (block) the change was made. blkref #0 is the block reference number, which can be used to identify which relation the block belongs to. rel 1663/34944/34990 specifies the tablespace OID (1663), the database OID (34944), and the relation OID (34990), and blk 1037 indicates the block number, or page number, that is being modified.

A tablespace is a physical location on disk where the database stores the data files (the actual data) associated with database objects (tables, views, and indexes). Tablespaces are used to manage disk space and optimize performance, for example by separating frequently accessed data from infrequently accessed data. By default a cluster has two tablespaces, pg_default and pg_global: SELECT * FROM pg_tablespace;

A relation is any database object such as a table, index, etc.; each one that gets created is assigned an OID.

An OID (Object Identifier) is a unique identifier assigned to an object within a PostgreSQL database cluster; it is a 32-bit unsigned integer. Each database in a cluster has its own OID: SELECT * FROM pg_database WHERE oid = <database OID>;
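Putting those catalogues to work on the sample record above (your numbers will differ), you can resolve the blkref back to real names. Note that pg_class is per-database, so run the last query while connected to the database in question:

-> SELECT spcname FROM pg_tablespace WHERE oid = 1663;     -- pg_default
-> SELECT datname FROM pg_database WHERE oid = 34944;      -- which database was touched
-> SELECT relname FROM pg_class WHERE relfilenode = 34990; -- which relation was touched
(Strictly speaking, the last number in the blkref is the relation's file node, which for a freshly created table usually equals its OID.)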

I’ll explain WAL in a different article properly, but I’ll just give an introduction. WAL (Write-Ahead Logging) is used in PostgreSQL to provide durability and crash recovery. Whenever an operation modifies data in a PostgreSQL database, the changes are first written to a WAL segment on disk before being applied to the actual database files. This ensures that the changes are recorded before the data is modified on disk. We can use WAL to reconstruct the database to its state at the time of the crash.
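If you want to see where the WAL stream currently is, and which segment file in pg_wal that position falls into, PostgreSQL exposes it directly:

-> SELECT pg_current_wal_lsn();
-> SELECT pg_walfile_name(pg_current_wal_lsn()); -- matches the file names we saw in pg_wal/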

PAGE

So when you go inside any of the folders under /var/lib/postgresql/14/main/base/, the files you see there are made up of pages. Pages are fixed-size blocks of data/disk space allocated for tables or indexes, usually 8 kB in PostgreSQL. Pages contain tuples, metadata, index entries, log records, etc. Most systems do not mix page types, but some can have everything contained together.

From Wikibooks:
  • Headers contain information about the page.
  • ItemIDs (line pointers) point to Items. They grow from left to right.
  • Items are the actual data, i.e. rows of a table or index entries. They grow from right to left.
  • Free Space is the unused space, as the name suggests.
  • Special Space is used by index structures, depending on the needs of the index type.
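You can actually peek at this layout for our table using the pageinspect extension from postgresql-contrib (insert a row first so the table has at least one page); a minimal sketch:

-> INSERT INTO A VALUES ('meow');
-> CREATE EXTENSION pageinspect;
-> SELECT lower, upper, special, pagesize FROM page_header(get_raw_page('a', 0));

Here lower and upper mark the boundaries of the Free Space between the ItemIDs and the Items described above, and pagesize should report 8192.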

These pages are managed by the storage engine. When data is written into tables, it is initially stored in memory, in the buffer pool. Once buffers fill up, PostgreSQL starts flushing pages to disk. Suppose you want to read some data from a table: Postgres first checks whether the relevant pages are already in the buffer pool, and if not, it fetches them from disk and loads them into the buffer pool first.
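And if you are curious whether a table's pages are sitting in the buffer pool right now, the pg_buffercache extension (also from postgresql-contrib) gives a rough view; a sketch:

-> CREATE EXTENSION pg_buffercache;
-> SELECT count(*) AS cached_pages
   FROM pg_buffercache b
   JOIN pg_class c ON b.relfilenode = c.relfilenode
   WHERE c.relname = 'a';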

In the next article, I will explain storage organization, what happens if rows are too big to fit in pages, buffer pools etc.

Thank you for taking the time to read this article! I hope you found it informative and helpful. If you have any suggestions or feedback, please feel free to leave a comment or contact me directly.

If you enjoyed this article and would like to see more content like it, you can follow me on LinkedIn, Twitter, and Instagram to stay updated on my latest posts.

And if you found this article particularly helpful, you can also give support here: https://buymeacoffee.com/divadugar.

Once again, thank you for reading and I look forward to sharing more with you in the future!
