This particular error is becoming more and more prevalent. All of our tables failed this morning with nothing more than a JSON response that says "Middleware Error". The Canvas Data 2 status page does not indicate any problems even though it seems the data was completely unreachable.
Previously, the "Middleware Error" would show up intermittently; today was one of the first times it appeared in 100% of our attempts.
The error occurs after a data request has been sent, while we are polling the status endpoint to find out when the data request has completed.
Has anyone else been encountering this issue?
Yes, our jobs failed 10/21 and 10/22. Just spot-checking, it does look like the failures are when checking the status endpoint (500 Server Error: Internal Server Error for url: https://api-gateway.instructure.com/dap/job/<job_id>). We're not using the DAP client. It would be useful to have confidence that the status page accurately reflects the status of the system, as we have multiple moving parts and the first question is always "Is this something on our side or their side?".
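For anyone comparing notes, our non-DAP-client status check is roughly the shape below. This is a minimal sketch, not our production code: the only thing taken from the thread is the /dap/job/<job_id> URL above, while the Bearer token header, the "status" field values, and the timing are assumptions you'd want to check against the DAP API documentation.
```
import time

import requests

API_BASE = "https://api-gateway.instructure.com"  # gateway host from the error URL above


def poll_job_status(job_id: str, access_token: str, timeout_s: int = 1800) -> dict:
    """Poll the CD2 job status endpoint until the job reports a terminal state.

    Assumptions: a Bearer token header and a JSON body with a "status" field;
    adjust both to match your own auth flow and the DAP API documentation.
    """
    url = f"{API_BASE}/dap/job/{job_id}"
    headers = {"Authorization": f"Bearer {access_token}"}
    deadline = time.monotonic() + timeout_s

    while time.monotonic() < deadline:
        resp = requests.get(url, headers=headers, timeout=60)
        if resp.status_code >= 500:
            # This is where the "Middleware Error" / 500 responses show up.
            # Log the raw body so it can be attached to a support ticket.
            print(f"{resp.status_code} from status endpoint: {resp.text[:500]}")
            time.sleep(30)
            continue
        resp.raise_for_status()
        body = resp.json()
        if body.get("status") in ("complete", "failed"):
            return body
        time.sleep(15)

    raise TimeoutError(f"Job {job_id} did not finish within {timeout_s} seconds")
```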
Yes, we are just getting CD2 up and running, and when the job tries to run the DAP CLI command for a table to download import files, the command throws an exception with the error "Middleware Error". We received this error Thursday, Saturday, and Sunday. Not sure what was different about Friday.
Today (10/23) is now day 3 of all tables failing.
We are not using the DAP client either, and we go through a similar series of checks to see whether the issues are on our side or theirs. Occasionally we'll spin up a Postgres Docker container and run the DAP client against it to see if we get the errors there too; almost always we do (a rough sketch of that check is below).
It is very unfortunate that the status page does not seem to comprehensively monitor the health of CD2. Perhaps it is just a ping check to see whether the API can be reached, regardless of whether the services are operating normally?
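For context, that throwaway check looks something like this. It is only a sketch: the container name, password, port, and the exact `dap initdb` flags are placeholders and assumptions, so verify them against the DAP CLI documentation before reusing any of it.
```
import subprocess
import time

# Throwaway Postgres container, used only to rule out problems in our own database layer.
# Name, password, and port are placeholders.
subprocess.run(
    [
        "docker", "run", "-d", "--rm", "--name", "cd2-check",
        "-e", "POSTGRES_PASSWORD=changeme", "-p", "5432:5432", "postgres:15",
    ],
    check=True,
)
time.sleep(10)  # give Postgres a few seconds to come up

# Single-table init through the DAP CLI against the scratch database.
# DAP_API_URL / DAP_CLIENT_ID / DAP_CLIENT_SECRET are expected in the environment;
# the --connection-string flag name is our assumption about the CLI option.
result = subprocess.run(
    [
        "dap", "initdb", "--namespace", "canvas", "--table", "accounts",
        "--connection-string",
        "postgresql://postgres:changeme@localhost:5432/postgres",
    ],
    capture_output=True,
    text=True,
)
# When CD2 is unhealthy, "Middleware Error" shows up in this output too.
print(result.returncode, result.stderr[-500:])
```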
I noticed these errors too. I saw that https://pypi.org/project/instructure-dap-client/ has been updated to 0.3.15, and updating my container image to that new version got most of the tables syncing properly, although there are still some that throw that Middleware Error.
I'd like to echo some of the others and say I wish there were some increased transparency with this service. 100% uptime isn't a realistic expectation, but some more communication relating to these issues would make me feel less in the dark about all this.
We're getting these errors too. Ticket submitted as 10246425. Fingers crossed.
We use the DAP Python library within an AWS solution. On October 19, we had 78 tables fail with the "Middleware error" and 12 succeed in syncing. October 20 we had 65 tables sync successfully but 25 timed out on Instructure's end (not sure if this is related to the mysterious new "Middleware error" or not). Today we were back to 78 tables with "Middleware error" and 12 that synced. I agree that this is clearly degraded service and should be reflected on the status page!
Hi All,
Thank you for all the feedback. We have started investigating. I will post updates as we dig deeper into the issue. Thank you for your patience.
Any insights as of yet? People are starting to get angry with us....
Case: #10247430
For anybody keeping score at home, our daily run this morning had 89 tables fail with the middleware error and one table successfully sync.
Our runs Monday and Tuesday at 1 and 2 AM (prod and dev respectively) each failed about 75 of the tables. This has definitely gotten worse recently. We are using the DAP client wrapped in a Python script that calls the DAP sync command with the table names (roughly the shape sketched below). I have not attempted to update the pip package yet, but it seems from the other comments here that it's not a cure-all either.
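For what it's worth, the wrapper is essentially the shape below. A sketch only: the table list is a placeholder, and we assume credentials and the connection string are supplied via the DAP environment variables rather than flags.
```
import subprocess

# Placeholder table list; in practice this comes from `dap list` or a config file.
TABLES = ["accounts", "courses", "enrollments", "submissions"]

# Credentials and the Postgres connection string are assumed to be in the environment
# (DAP_CLIENT_ID, DAP_CLIENT_SECRET, DAP_CONNECTION_STRING) in this sketch.
failed = {}
for table in TABLES:
    proc = subprocess.run(
        ["dap", "syncdb", "--namespace", "canvas", "--table", table],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        tail = proc.stderr.strip().splitlines()
        failed[table] = tail[-1] if tail else "unknown error"

print(f"{len(TABLES) - len(failed)} tables synced, {len(failed)} failed")
for table, err in failed.items():
    print(f"  {table}: {err}")  # the "Middleware Error" lines show up here
```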
Hi All,
We are actively working on this issue. Instructure will provide an update as soon as one is available. Please bear with us.
CC: @JozsefKercso
Ours is independent of external packages and connects directly to the DAP API .9 query interface, and it is still hitting these issues. This is an issue rooted in CD2; nothing that updating a package will fix or hurt at this point, at least.... Just going to lie here and get my butt chewed out....
https://github.com/uvadev/PullCanvasData2
https://github.com/uvadev/PullCanvasData2/pkgs/nuget/PullCanvasData2
When will the Instructure Status page be updated to reflect there is an issue with Canvas Data 2?
@All Looks like I have had success today at 21:00 UTC
NVM
```
<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
</body>
</html>
```
`PostSnapshotJob` worked for us.... this evening
`PostIncrementalJob` did not
We had 64 tables sync successfully today but 26 fail with "Malformed HTTP response." This error is what bubbles up to the Python stack trace after Instructure sends the 504 time-out noted by @jsimon3 (the Python library could probably handle this more gracefully but that's a separate issue).
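To keep the nightly load moving we bolted a retry around the per-table call, along these lines. This is a sketch under our own assumptions: the marker strings, the `dap syncdb` invocation, and the backoff schedule are ours, not anything built into the library.
```
import subprocess
import time

# Failure text we treat as transient gateway trouble rather than a real data problem.
TRANSIENT_MARKERS = ("Malformed HTTP response", "Middleware Error", "504 Gateway Time-out")


def sync_with_retry(table: str, attempts: int = 3, base_delay_s: float = 60.0) -> bool:
    """Retry a single-table sync when the failure looks like a transient gateway error."""
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(
            ["dap", "syncdb", "--namespace", "canvas", "--table", table],
            capture_output=True,
            text=True,
        )
        if proc.returncode == 0:
            return True
        output = (proc.stdout or "") + (proc.stderr or "")
        if not any(marker in output for marker in TRANSIENT_MARKERS):
            return False  # not a known-transient failure; surface it instead of retrying
        time.sleep(base_delay_s * attempt)  # linear backoff between attempts
    return False
```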
Hi All! Today we deployed a new version of the API Gateway that aims to solve the "middleware error" issue. We are monitoring the new version closely to make sure that it works as expected. If you still encounter the "middleware error" issue, please let us know immediately and, if available, please include your logs in your message. It helps us a lot during the investigation.
Thank you for your help and patience!
I was able to get a full sync in yesterday morning. However, this morning one run failed a single table, and a second run failed about 20 tables and took about 60% longer than normal to complete.
Mine took about 40% longer (full initdb) but completed successfully as well. However, today I've got a whole new issue:
Any DAP process: dap.dap_error.ServerError: inval... - Instructure Community (canvaslms.com)
@reynlds make sure to let @JozsefKercso's team know. I would file a ticket with Canvas Support and drop that error in, along with the table it happened on and whether it was an incremental or a snapshot job.
@jsimon3 I do have a case open (10256554), and it references the error that I posted in the forum. I'll go back and tag as suggested. Thanks!
so far so good for us since the fix. 🤞
We've also been having good luck since the fix. 🤞
My `PostIncrementalJob` failed last night, and the process is taking 1000%+ longer than normal just to generate the tarballs; a step that used to take seconds now takes upwards of 10 minutes... Not sure what is going on; it was fine for a little while after the fix... Maybe @JozsefKercso might have some insight... It appears to be still struggling today as well.
Unfortunately I don't have any insight, as from the API Gateway's point of view the request was forwarded to the backend service in just a couple of milliseconds. My colleagues who are responsible for the backend service are already looking into the issue.
You can help us identify and fix the problem with the following:
Thank you in advance, we will keep you updated.
We've got each table as a separate job in Informatica, and over half seem to have failed this morning with server errors. Yesterday's load was fine.
Our data jobs failed overnight with the following error:
{
"error" : {
"uuid" : "b1ba7fb4-51d9-46f8-84a0-e85e5dce393e",
"message" : "DAP Querying service is overloaded or under maintenance.",
"type" : "Overloaded"
}
}
As of 9AM EST, we're still getting the same 503 response and error when trying to refresh our data.
https://status.instructure.com is currently indicating no issues with Canvas Data 2.
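For now our loader just backs off and retries when it sees that payload. A minimal sketch, assuming a Bearer token header and that the 503 body carries the "type": "Overloaded" field shown above; the backoff schedule is arbitrary.
```
import time

import requests


def get_with_overload_backoff(url: str, access_token: str, max_wait_s: int = 3600) -> dict:
    """Back off and retry while CD2 returns the 503 "Overloaded" error payload shown above."""
    headers = {"Authorization": f"Bearer {access_token}"}
    delay, waited = 60, 0
    while True:
        resp = requests.get(url, headers=headers, timeout=60)
        if resp.status_code != 503:
            resp.raise_for_status()  # raise on any other error status
            return resp.json()
        try:
            err = resp.json().get("error", {})
        except ValueError:
            err = {}
        if err.get("type") != "Overloaded" or waited >= max_wait_s:
            resp.raise_for_status()  # give up: not the overload case, or we've waited long enough
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 600)  # exponential backoff capped at 10 minutes
```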
We are also getting the same error this morning.