Community Help

burkepk · ‎03-29-2024

Hello all, our team is new to developing solutions to pull data from Canvas. I have been lurking in the forums for a bit, trying to gather as much information as I can to set up the tools we need to be successful. However, the one part I am unsure about is where to run the Python functions that handle the initialization and synchronization calls. We want to stay away from EC2 because of the administrative overhead involved, but we are also apprehensive about the 15-minute time limit on Lambda. Has anyone containerized their code to run on Fargate? Or is the 15-minute limit in Lambda enough for processing the synchronizations, with a one-off process run just for the initializations?

ColinMurtaugh · ‎03-30-2024

Hi --

We've had success running our CD2 init/sync code in Lambda and orchestrating the process using Step Functions. Currently we're syncing everything in the canvas schema every three hours, and the process has been running for a couple of months without problems. A few of the tables (fewer than 5, IIRC) are large enough that the init step took longer than 15 minutes; for those we just ran a one-off init outside of Lambda, and subsequent syncs have been fine.
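For anyone who wants a picture of the per-table Lambda step, here is a minimal sketch. It is not the code from the repository linked in this thread. It assumes Step Functions invokes the function once per table with an event such as `{"namespace": "canvas", "table": "users"}`; that event shape, `sync_table`, `run_with_budget`, and the 30-second safety margin are all hypothetical names and values chosen for illustration. The only AWS call used is the Lambda context method `get_remaining_time_in_millis()`. The idea is to stop before Lambda's hard timeout and return a status that a Step Functions Choice state could branch on, for example to flag a table for a one-off run outside Lambda.

```python
import asyncio

# Assumed margin so the function returns before Lambda's hard timeout.
SAFETY_MARGIN_SECONDS = 30


async def sync_table(namespace: str, table: str) -> None:
    """Hypothetical placeholder: call your DAP client's init/sync for one table here."""
    await asyncio.sleep(0)


async def run_with_budget(namespace, table, budget_seconds, job=sync_table):
    """Run one table job, reporting a timeout instead of being killed by Lambda."""
    try:
        await asyncio.wait_for(job(namespace, table), timeout=budget_seconds)
    except asyncio.TimeoutError:
        # The job was cancelled; the caller decides what to do with this table.
        return {"table": table, "status": "timed_out"}
    return {"table": table, "status": "synced"}


def handler(event, context):
    """Lambda entry point: one invocation handles one table."""
    # Time left in this invocation, minus the safety margin, in seconds.
    budget = context.get_remaining_time_in_millis() / 1000 - SAFETY_MARGIN_SECONDS
    namespace = event.get("namespace", "canvas")
    return asyncio.run(run_with_budget(namespace, event["table"], budget))
```

Cancelling a job partway through may or may not leave partial work behind, depending on how the underlying client handles cancellation, so treat `timed_out` as a signal to rerun that table elsewhere rather than as a clean stop.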

Here's a link to a work-in-progress version of our pipeline code. This is essentially a slightly simplified version of the process that we run ourselves; I have a little work to do to apply some of our recent updates to the public version of the code, but you can get a sense of how it works:

https://github.com/Harvard-University-iCommons/canvas-data-2-aws/tree/develop

--Colin

burkepk · ‎04-01-2024

Thank you very much for your insight.

Setting up DAP Synchronization in AWS, Fargate or Lambda?

AWS

DAP client library

python