What is Multi Version Concurrency Control (MVCC)

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

In this post, I am sharing a basic note about Multi-version concurrency control and explain how MVCC works.

Guys, you can find lots of theory and research on MVCC, but here I shared this theory by giving an easy example

What is Multi-version Concurrency Control? :

Very simple, ” One record has multiple versions “.

E.g,

User A —-> inserted one record with ID 1. (stored this into one data page.)

User B —-> update record for ID 1.

Guys, what are you thinking now?

One record inserted and updated by another user.

First inserted, an old version of the record changed by the new update statement and become a new version of the record. But the new version of the record added separately in the data page.

Whatever different changes occur on the same record, MVCC stored different version for each change which is not an overwritten process but it stores all different versions of records.

E.g,

You wrote one sentence in your book.

The Sentence is : ” I am good database developer “.

Now change this sentence.

The Sentence is : ” I am good database administrator “.

Now for that change, you can erase that sentence and change “developer” to “administrator”.

Other database engines also doing the same thing as you do in your real life, but MVCC stored those two sentences with different versions.

Old Version : ” I am good database developer “

New Version : ” I am good database administrator “

By Default, User can only select a new version of the data.

That’s the simple definition and example of MVCC.

Why MVCC ?:

Let’s first discuss concurrency, more than one users at the same time accessing the database or records for different purposes.

MVCC provides concurrent access to the database.

Let’s imagine, one user is reading a record and the second user is trying to update the same record. Now reader wants to read that record before completion of the update process.
In this situation, you can allow readers to read uncommitted or inconsistent data which are not updated yet.

Else, You can apply lock, in which all readers have to wait until the update complete.

But this lock and dirty read is not a proper solution because lock can degrade performance and dirty read is not showing correct data.

In this situation MVCC is the best solution, where readers can see old committed version of data and once the update finished, the reader can see a newly committed version of the data.
The read process and write process never block to each other.

E.g.

The reader can read this sentence ” I am database developer. “

After update reader can read the new sentence ” I am database administrator. “

Drawbacks of MVCC :

The main drawback of MVCC is, it requires more space to store a different version of the data.

Yes, later you can remove an old version of data by specifying cleaning command. But you require more maintenance on those tables for removing the fragmentation which generated by an old version of data.

In the MVCC, the same data causes an update conflict because two different transactions can update the same version of the row.

Most MVCC database systems:

ORACLE

PostgreSQL

MySQL (only InnoDB engine)

You can also achieve MVCC in Microsoft SQL Server using ” Snapshot Isolation Level “.

May 30, 2015Anvesh Patel

Comments: 8

Mauro

October 28, 2016 at 1:39 pm

Do you know how indexes are maintained when an update occurs? Does the index remain pointing to the old version of the row and this address points to the new version? Or the index is updated as part of the update process?

Anvesh Patel
October 30, 2016 at 6:49 am
Really very nice Question !
Yes, Index will also change pointer from Old version to New version data.

valtih1978

December 1, 2016 at 7:06 pm

Wikipedia says “If transaction Ti wants to Write to object P, and there is also another transaction Tk happening to the same object, the Read Timestamp RTS(Ti) must precede the Read Timestamp RTS(Tk), i.e., RTS(Ti) < RTS(Tk), for the object Write Operation (WTS) to succeed."

That is, only single user can update anything in the database. For if we have two transactions, none has RTS lower than the other, no matter which DB item we look at. None can write anything. This seems strange. Moreover, they violate this rule in their example when they show that long-running T2 reads values written by T3. Can you explain the crazy rule?

Anvesh Patel
December 18, 2016 at 1:19 pm
Really very Interesting point,
Your point is valid where insert with the same RTS will create deadlock situation, but what I have found is this is very rare situation and even you can control your insert using different Isolation level.
Still, I need to research on this and will let you know in detail.

daniel

February 14, 2017 at 2:55 am

thank you for amazing fantastic postingg!!

Venkatesh Natukula

February 25, 2017 at 1:42 am

Thanks to write such a good use full article. I have doubt can we able to see old data by writing any command and if we write any cleaning command is this specific to one table or it will work entire DB. can please share commands regarding these things.

Anvesh Patel
February 25, 2017 at 8:17 am
You cannot see old data, but you can get a count of dead tuples/old data so that it helps you to take decision for cleaning. You can execute your cleaning command on each table level, schema level and also at a database level.

Vlad Mihalcea

March 1, 2017 at 12:40 pm

I also wrote an article about how MVCC works in PostgreSQL. Ir provides diagrams for how INSERT, UPDATE, and DELETE statements work in the context of MVCC:

https://vladmihalcea.com/2017/03/01/how-does-mvcc-multi-version-concurrency-control-work/

What is Multi Version Concurrency Control (MVCC)

Anvesh Patel

About Me!

About DBRND !

Recent Comments !

Follow Me !