DAP SQL Causing Poor Performance
05-21-2024
07:25 AM
Note: Relying on my DBA since some of this is a bit over my head.
Our scheduled "dap syncdb" process keeps maxing out our CPU. Our DBA ran an explain plan to see where our performance issues are. Below is the SQL, which runs in a loop while sorting through a huge number of rows (18,589,175,389,900). Any assistance with this query, or with how to keep it from impacting our performance, would be much appreciated. I'm guessing this query is run by the dap command, because I didn't write it.
SQL Causing Performance Issue
SELECT e.acol, j.bcol, c.cnt, d.cnt, c.nulcnt, d.nulcnt, e.ccol
FROM
  (SELECT SUM(CASE WHEN f.id IS NOT NULL THEN 0 ELSE 1 END) AS nulcnt,
          COUNT(1) AS cnt
     FROM canvas.canvas__courses f) c,
  (SELECT SUM(CASE WHEN g.course_id IS NOT NULL THEN 0 ELSE 1 END) AS nulcnt,
          COUNT(1) AS cnt
     FROM canvas.canvas__enrollments g) d,
  (SELECT SUM(CASE WHEN b.course_id IS NOT NULL THEN 0 ELSE 1 END) AS acol,
          SUM(CASE WHEN b.course_id IS NOT NULL AND b.course_id IS NOT NULL THEN 1 ELSE 0 END) AS ccol
     FROM (SELECT * FROM canvas.canvas__courses j WHERE j.id IS NOT NULL) a
     LEFT JOIN canvas.canvas__enrollments b ON a.id = b.course_id) e,
  (SELECT SUM(CASE WHEN h.course_id IS NOT NULL THEN 1 ELSE 0 END) AS bcol
     FROM canvas.canvas__enrollments h
    WHERE h.course_id IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM canvas.canvas__courses i WHERE i.id = h.course_id)) j
See attached screenshot of the explain plan.
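Reading the query, it looks like a referential-integrity check: the last derived table counts enrollments whose course_id has no matching course. A minimal sqlite3 sketch (toy tables, not the real DAP schema; the only assumption is that enrollments.course_id references courses.id) shows the anti-join pattern and the kind of index that usually helps. The LEFT JOIN in the larger query probes enrollments by course_id, which an index on that column serves; the NOT EXISTS probe is already covered by the primary key on courses.id.

```python
# Toy reproduction of the orphan-enrollment check in the query above.
# Hypothetical simplified schema, not the actual canvas.canvas__* tables.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY)")
cur.execute("CREATE TABLE enrollments (id INTEGER PRIMARY KEY, course_id INTEGER)")

cur.executemany("INSERT INTO courses VALUES (?)", [(i,) for i in range(1, 101)])
# 300 enrollments; every 100th points at course_id 999, which does not exist.
rows = [(i, 999 if i % 100 == 0 else (i % 100) + 1) for i in range(1, 301)]
cur.executemany("INSERT INTO enrollments VALUES (?, ?)", rows)

# Without this, joins probing enrollments by course_id must scan the table;
# with it, each probe is an index seek.
cur.execute("CREATE INDEX idx_enrollments_course_id ON enrollments (course_id)")

# Same anti-join shape as the last derived table in the DAP query.
cur.execute("""
    SELECT COUNT(*) FROM enrollments h
    WHERE h.course_id IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM courses i WHERE i.id = h.course_id)
""")
orphans = cur.fetchone()[0]
print(orphans)  # 3 orphaned enrollments in this toy data
```

If the DBA confirms the real plan is doing full scans on canvas.canvas__enrollments for the join probes, an index on course_id there would be the first thing to test on a copy of the data.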
Python script (run from crontab every 4 hrs) that loops syncdb over the tables
import subprocess

cd2_tables = ['account_users', 'accounts', 'assignments']  # ...rest of table list truncated

# Write a table-name separator to the log file, then run syncdb for that table
def sync_table(table, log_file='/home/canvas/dap_syncdb/dap_syncdb.log'):
    with open(log_file, 'a') as f:
        separator = '\n========= TABLE: ' + table + ' =========\n'
        f.write(separator)
        print(separator)
    p1 = subprocess.run('dap --loglevel info --logfile ' + log_file +
                        ' syncdb --namespace canvas --table ' + table,
                        shell=True, text=True)

# Loop through tables
for x in cd2_tables:
    sync_table(x)
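Two small hardenings of the loop above, sketched below: pass the command as an argv list instead of a shell string (no quoting surprises with table names), and check the return code so a failed table sync is visible instead of silently ignored. The `dap_cmd` parameter is an addition of this sketch only, so it can be exercised with a stand-in command; in the real script it would just be 'dap'. The flags themselves are unchanged from the original.

```python
# Sketch of a hardened sync_table (assumptions noted above).
import subprocess

def sync_table(table, log_file='/home/canvas/dap_syncdb/dap_syncdb.log',
               dap_cmd='dap'):
    """Run `dap syncdb` for one table; return True on success."""
    argv = [dap_cmd, '--loglevel', 'info', '--logfile', log_file,
            'syncdb', '--namespace', 'canvas', '--table', table]
    result = subprocess.run(argv, text=True)
    if result.returncode != 0:
        # Anything printed here lands in cron's mail/log, so failures surface.
        print(f'syncdb FAILED for table {table} (exit {result.returncode})')
    return result.returncode == 0
```

Running the tables one at a time, as the loop already does, keeps only one heavy validation query on the database at once; a short time.sleep between tables is another cheap way to spread the CPU load across the 4-hour window.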
Memory/Cache Specs
free -h
               total        used        free      shared  buff/cache   available
Mem:           125Gi       9.3Gi        15Gi       503Mi       102Gi       116Gi
cat /proc/meminfo
MemTotal: 131370264 kB
MemFree: 15768452 kB
MemAvailable: 121687144 kB
...
It seems like our system has plenty of memory (most of it is sitting in buff/cache, which Linux reclaims as needed), so the bottleneck appears to be CPU rather than RAM. Thanks in advance for any assistance.