This article is half-done without your Comment! *** Please share your thoughts via Comment ***
What is HOT UPDATE (Heap Only Tuple) and How It helps to increase the performance of UPDATE Query?
In this post, I am going to explain the answer of this question.
This is one of the internal of PostgreSQL, but I am sure that most of you guys don’t know about PostgreSQL HOT.
PostgreSQL requires HOT because of MVCC architecture.
Please read below articles, before start to knowing about the HOT.
In the MVCC architecture, When we update any record, internally MVCC architecture inserts a new record with new values and marks old record as deleted or discarded.
When we vacuum, we can clean that deleted records and utilize that space.
What is HOT (Hot Only Tuple)?
Now imagine that, we have total 500 data pages and we are updating record of page number 3, but unfortunately new record insert in page number 500.
It looks like very big internal fragmentation because of MVCC architecture.
But actually this is not happening.
PostgreSQL has a concept of HOT, With a HOT dead tuple space can be automatically reclaimed at the time of INSERT or UPDATE if no changes are made to indexed columns. This allows for more consistent performance.
HOT means, creating a new update tuple if possible on the same page as the old tuple, and maintaining a chain of updated tuples linking a new version to the old tuple.
HOT updates can only reuse dead tuples from previous transactions directly, not from the same or concurrent ones.
If you have big updates, changing large portions of the table at once, you might want to split them up in a couple of chunks.
HOT updates only work when the changed columns are not involved in indexes in any way because the indexes pointing the the old tuples need to point to the new version of it as of transaction id.
You might be blocking HOT updates with indexes on updated columns. If those are expendable, you might get better overall performance without them.
Let’s understand PostgreSQL HOT practically.
Create a table with Sample data:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE TABLE tbl_Employees ( EmpID INT ,EmpName CHARACTER VARYING ,EmpDepartment CHARACTER VARYING ,EmpSalary INT ); INSERT INTO tbl_Employees VALUES (1,'Anvesh','Database',90000) ,(2,'Jenny','JAVA',65000) ,(3,'Martin','PHP',85000) ,(4,'Roy','PHP',94000) ,(5,'Eric','PHP',70000) ,(6,'Rajesh','Animation',50000) ,(7,'Mahi','Database',40000) ,(8,'Sofia','JAVA',60000); |
Delete one record:
1 |
DELETE FROM tbl_Employees WHERE EmpID = 7; |
Insert one New Record:
1 2 3 |
INSERT INTO tbl_Employees VALUES (7,'Junik','Database',10000); |
Please check If any HOT Update or not on table:
1 2 3 4 5 6 7 |
SELECT pg_stat_get_tuples_hot_updated('tbl_employees'::regclass::oid); /* Result: pg_stat_get_tuples_hot_updated ------------------------------ 0 */ |
Please update the statestic of that table:
1 |
ANALYZE tbl_Employees; |
Now execute one more update command:
1 |
UPDATE tbl_Employees SET EmpName = 'Robort' WHERE EmpID=7; |
Please check If any HOT Update or not on table:
1 2 3 4 5 6 7 |
SELECT pg_stat_get_tuples_hot_updated('tbl_employees'::regclass::oid); /* Result: pg_stat_get_tuples_hot_updated ------------------------------ 1 */ |
Again, do more update:
1 2 |
UPDATE tbl_Employees SET EmpName = 'Manish' WHERE EmpID=8; UPDATE tbl_Employees SET EmpName = 'Cerlin' WHERE EmpID=4; |
Please check If any change in number of HOT Update of a table:
1 2 3 4 5 6 7 |
SELECT pg_stat_get_tuples_hot_updated('tbl_employees'::regclass::oid); /* Result: pg_stat_get_tuples_hot_updated ------------------------------ 3 */ |
Now Apply, Index on EmpName Column:
1 2 |
CREATE INDEX idx_tbl_employees_empname ON tbl_employees USING BTREE (EmpName); |
After Index, execute few more update:
1 2 |
UPDATE tbl_Employees SET EmpName = 'Kulin' WHERE EmpID=3; UPDATE tbl_Employees SET EmpName = 'Priya' WHERE EmpID=5; |
After Index, Please check If any change in number of HOT Update of a table:
The result is same because If column has an Index, HOT Update does not work.
1 2 3 4 5 6 7 |
SELECT pg_stat_get_tuples_hot_updated('tbl_employees'::regclass::oid); /* Result: pg_stat_get_tuples_hot_updated ------------------------------ 3 */ |
You can also check number of HOT update of current transaction:
1 |
SELECT pg_stat_get_xact_tuples_hot_updated('tbl_employees'::regclass::oid); |