Greenplum: Script to find the usage of Work spill file per Running Query

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

In this post, I am sharing a script to find the total usage of work spill files for each running query in Greenplum.

Whenever we are doing hash aggregation operation like GROUP BY or DISTINCT on billions of data, query executor holds this data in temporary files which called work spill files.

We should optimize our query to reduce the maximum use of work spill files because it degrades the overall performance and also requires more disk space.

Using below script, we can monitor the usage of work spill file for each running query and we can also optimize the long running queries.

SELECT

gwupq.datname

,psa.usename

,gwupq.procpid

,gwupq.sess_id

,sum(gwupq.size)/1024::float as total_spill_size_kb

,sum(gwupq.numfiles) as total_spill_file

FROM gp_toolkit.gp_workfile_usage_per_query as gwupq

inner join pg_stat_activity psa

on gwupq.procpid = psa.procpid

and gwupq.sess_id = psa.sess_id

GROUP BY 1,2,3,4;

Jul 1, 2017Anvesh Patel

Greenplum: Script to find the usage of Work spill file per Running Query

Leave a Reply Cancel reply

Anvesh Patel

About Me!

About DBRND !

Recent Comments !

Follow Me !