Discrepancy between internal Canvas reports and Canvas Data 2
Hi Everyone,
Our organisation recently got access to Canvas Data 2 (CD2). Comparing data pulled from CD2 against Canvas' internal reports, I have noticed some large discrepancies.
For example, the enrollments table from CD2 contains only around 10% of the rows I expect. The same is true for users and quizzes; basically all the data I have pulled from CD2 is incomplete. I know there is a 4-hour freshness interval, but that would not account for discrepancies this large. When I use the internal reporting tool, all the data is there and the numbers make sense.
I pulled data from CD2 using both the CLI tool and Postman and got the same result.
Here is the Python code I am using:
import asyncio
import os

from dap.api import DAPClient
from dap.dap_types import Credentials, SnapshotQuery, Format

base_url = "https://api-gateway.instructure.com"
client_id = "clientid"
client_secret = "secret"
credentials = Credentials.create(client_id=client_id, client_secret=client_secret)
output_directory = os.getcwd()


async def download_data():
    async with DAPClient(base_url=base_url, credentials=credentials) as session:
        query = SnapshotQuery(format=Format.JSONL, mode=None)
        await session.download_table_data(
            "canvas", "enrollments", query, output_directory, decompress=True
        )


if __name__ == "__main__":
    asyncio.run(download_data())
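For anyone who wants to check their own numbers, a row count of the downloaded files can be tallied with something like the sketch below. It is not part of the DAP library: `count_jsonl_rows` is a hypothetical helper name, and the `*.jsonl*` file pattern is an assumption about how the downloaded parts are named, so adjust it to match whatever actually lands in the output directory. It counts one row per non-empty line, searching subfolders as well, and also reads gzip-compressed parts in case some were not decompressed.

```python
import gzip
from pathlib import Path


def count_jsonl_rows(directory, pattern="*.jsonl*"):
    """Count non-empty lines across all matching files under `directory`.

    The default pattern is an assumption about the part-file names;
    change it if the downloaded files use a different extension.
    """
    total = 0
    for path in sorted(Path(directory).rglob(pattern)):
        if not path.is_file():
            continue
        # Parts ending in .gz are read through gzip; others as plain text.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="utf-8") as handle:
            total += sum(1 for line in handle if line.strip())
    return total


# Example usage (the directory is whatever was passed as output_directory):
# print(count_jsonl_rows("."))
```

Counting every part file matters because a snapshot may be split across several files, so counting only one of them would understate the total.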
I have tried reaching out to Canvas Data Help and haven't had a response, so I was hoping that someone here might have an idea or has experienced a similar problem.
Many thanks,
Matt