Celebrate Excellence in Education: Nominate Outstanding Educators by April 15!
Greetings Canvas Data Community,
Our online HS is exploring its options for setting up a data warehouse for the purpose of housing data from the Canvas Data Portal in addition to some other sources (survey results, student performance data from 3rd party tools). We currently have a small community of canvas users (~100 students and staff) but are anticipating significant future growth.
I'm curious to hear thoughts from those who are ahead of us in this process...
What do you see as the pros, cons and considerations when deciding to opt in to use Redshift via Instructure vs. choosing a different path (MS Azure SQL, Cloud SQL, MySQL, on-prem solutions, etc.)?
I am generally more experienced in report building using an existing RDBMS, as opposed to setting up the warehouse itself.
All advice from the data trailblazers before me is appreciated!
Rob Clifford
Hi rtcliffo -
We use Redshift at the University of Colorado Denver, and I am happy to share some of my thoughts on the topic:
Those are a few things that come to mind initially. I'll add more as it comes up, and hope to hear more from others!
Hi Brenden,
We used CanvasDataViewer to import the data into our SQL SERVER DW, as you suggested, and worked wonders so far. Thank you very much for sharing it!
The only problem we have is that it doesn't import wellformed data and takes some garbage characters. We understand that this is because we are not using the correct encoding.
Reviewing the Stored Procedure "CanvasData_TableBuild_02_LoadImportTables", that is created with CanvasDataViewer, we saw that the BULK INSERT command is used to insert the data that comes in the txt files. We tried changing the parameters CODEPAGE and DATAFILETYPE but we still don’t solve it. Our database has this configuration: Collation = SQL_Latin1_General_CP1_CI_AS and Compatibility Level=SQL Server 2008 (100).
Do you have any advice for us so we can remove the garbage characters when inserting the data into our DW? Or how can we get in touch with William Jones and his team at Eastern Michigan University to give us some support?
Thank you. Regards!
Jimena del Campo
Hi Jimena -
Thanks for providing all the details! I haven't actually used the CanvasDataViewer myself and am not 100% positive of all its inner workings, so I can't speak from experience. Here are a few troubleshooting ideas though:
Let me know if you've made any progress on this!
Since writing that, I've since discovered utf8 is not sufficient for MySQL, which was the original context of my comment. We had a couple of tables that were failing import because certain emojis required 3 bytes rather than 2. For MySQL, I had to switch to multi-byte utf8 -- in MySQL parlance the utf8mb4 charset. Even if you're not using MySQL, you might be aware of the issue in case it works most of the time, but there are a couple of tables that have issues. I believe I found it in conversations and submission comments but a teacher could put an emoji in a quiz, so I just went ahead and switched everything for MySQL over to utf8mb4.
Along the same vein, the documentation for collation for MS SQL 2008 says that you want to be sure to use the correct data types. If you're used to latin1 and only having ASCII data, that page might help in eliminating the garbage (by making it recognizable). The section on Unicode says:
If you store character data that reflects multiple languages, always use Unicode data types (nchar, nvarchar, and ntext) instead of the non-Unicode data types (char, varchar, and text).
To participate in the Instructure Community, you need to sign up or log in:
Sign In
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.