This article is half-done without your Comment! *** Please share your thoughts via Comment ***
In this post, I am sharing a basic note about Multi-version concurrency control and explain how MVCC works.
Guys, you can find lots of theory and research on MVCC, but here I shared this theory by giving an easy example
What is Multi-version Concurrency Control? :
Very simple, ” One record has multiple versions “.
E.g,
User A —-> inserted one record with ID 1. (stored this into one data page.)
User B —-> update record for ID 1.
Guys, what are you thinking now?
One record inserted and updated by another user.
First inserted, an old version of the record changed by the new update statement and become a new version of the record. But the new version of the record added separately in the data page.
Whatever different changes occur on the same record, MVCC stored different version for each change which is not an overwritten process but it stores all different versions of records.
E.g,
You wrote one sentence in your book.
The Sentence is : ” I am good database developer “.
Now change this sentence.
The Sentence is : ” I am good database administrator “.
Now for that change, you can erase that sentence and change “developer” to “administrator”.
Other database engines also doing the same thing as you do in your real life, but MVCC stored those two sentences with different versions.
Old Version : ” I am good database developer “
New Version : ” I am good database administrator “
By Default, User can only select a new version of the data.
That’s the simple definition and example of MVCC.
Why MVCC ?:
Let’s first discuss concurrency, more than one users at the same time accessing the database or records for different purposes.
MVCC provides concurrent access to the database.
Let’s imagine, one user is reading a record and the second user is trying to update the same record. Now reader wants to read that record before completion of the update process.
In this situation, you can allow readers to read uncommitted or inconsistent data which are not updated yet.Else, You can apply lock, in which all readers have to wait until the update complete.
But this lock and dirty read is not a proper solution because lock can degrade performance and dirty read is not showing correct data.
In this situation MVCC is the best solution, where readers can see old committed version of data and once the update finished, the reader can see a newly committed version of the data.
The read process and write process never block to each other.
E.g.
The reader can read this sentence ” I am database developer. “
After update reader can read the new sentence ” I am database administrator. “
Drawbacks of MVCC :
The main drawback of MVCC is, it requires more space to store a different version of the data.
Yes, later you can remove an old version of data by specifying cleaning command. But you require more maintenance on those tables for removing the fragmentation which generated by an old version of data.
In the MVCC, the same data causes an update conflict because two different transactions can update the same version of the row.
Most MVCC database systems:
ORACLE
PostgreSQL
MySQL (only InnoDB engine)
You can also achieve MVCC in Microsoft SQL Server using ” Snapshot Isolation Level “.