I have a Python program that gets all quizzes from a course and, for each quiz, gets all submissions for that quiz. After that, it gets all attempts for each submission. It is a triply nested for loop.
I would like an API call that, given the course_id, quiz_id, and user_id, returns all submissions for that quiz for that user.
Something similar to what we have for the assignments.
GET /api/v1/courses/:course_id/assignments/:assignment_id/submissions/:user_id
Thanks in advance.
What part of the quiz submissions do you want that you cannot get through the regular submissions API that you listed?
The get multiple submissions version may be able to cut out some of your loops, but it may not be usable at all depending on what information you need.
If you include[]=submission_history in the query for the get submissions (single or multiple) endpoint, you can get a list of all of the submissions for quizzes including the student responses.
Hi James.
I basically need for each user the following information: quiz id, date/time of the attempt, score of the attempt.
Now I have much more information than I need actually.
Do you think the get multiple submissions versions could work? If so, where do I find documentation regarding the get multiple submissions version?
Thanks in advance!
The List submissions for multiple assignments endpoint of the submissions API will do everything you say you want (and a little more). It's what Canvas uses to fetch the gradebook information, but it can be used for other things.
The basic call is
GET /api/v1/courses/:course_id/students/submissions
but that won't do anything by itself.
For the quiz_id, I would first get a list of all of the quizzes or assignments.
Then I would have a single loop that iterated through the assignments, perhaps in batches of assignments, depending on how many quizzes you have. I would use student_ids[]=all to get the submissions for all students with an active enrollment. You have to explicitly list student ids for those students who are concluded or inactive as they are not included.
The assignment_ids[] will take multiple values, which is why I mentioned batching. For example, if you had 20 quizzes that you wanted to obtain the information for, you could probably put all 20 at the same time by repeating assignment_ids[], but I would either go 1 at a time (number of API calls = number of quizzes) or do array slices and make perhaps 4 or 5 at a time.
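The batching described above might look like the following sketch. The helper names (`chunked`, `submission_query`) and the batch size of 5 are illustrative choices, not part of Canvas. The parameter names are the ones mentioned in this thread, including the include[]=submission_history option from the earlier reply.

```python
from urllib.parse import urlencode


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submission_query(assignment_ids, per_page=100):
    """Build the query string for one batch of assignments."""
    params = [
        ("student_ids[]", "all"),             # all actively enrolled students
        ("include[]", "submission_history"),  # one history entry per attempt
        ("per_page", str(per_page)),
    ]
    # Repeating assignment_ids[] once per ID passes a list to the endpoint.
    params += [("assignment_ids[]", str(a)) for a in assignment_ids]
    return urlencode(params)


# Hypothetical assignment IDs 1..20, requested five at a time.
queries = [submission_query(batch) for batch in chunked(list(range(1, 21)), 5)]
```

Each entry in `queries` would be appended after a `?` to the submissions URL for the course. Note that `urlencode` percent-encodes the square brackets, which servers decode back to `assignment_ids[]`.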
The more information you request, the more likely you are to run into pagination issues. You can get up to 100 submissions at a time with per_page=100, so that could be up to 100 students for a single assignment, 50 students for 2 assignments, or 25 students for 4 assignments. Pagination for the multiple submissions endpoint uses an opaque bookmark, so you cannot predict what the next page will be; you have to wait for the first request to finish before starting the next. That blocks the beauty of concurrent calls. As long as students × assignments doesn't exceed 100 for each request, you can make multiple concurrent calls, subject to the rate limiting that Canvas has in place. Browsers allow 5-7 concurrent requests, but if you try 14 at the same instant, you'll hit the threshold and the requests will be blocked.
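Because the next page is only known once the current response arrives, the loop has to be sequential. Here is a minimal sketch of that loop, assuming the next-page URL is delivered in the Link response header with rel="next". The `get` argument is a hypothetical caller-supplied function, not a Canvas or library API, so the example can run without a network connection.

```python
import re

# Matches one entry of a Link header, e.g. <https://host/path?page=x>; rel="next"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_link(link_header):
    """Return the rel="next" URL from a Link header, or None on the last page."""
    for url, rel in _LINK_RE.findall(link_header or ""):
        if rel == "next":
            return url
    return None


def fetch_all(first_url, get):
    """Follow next links one request at a time and collect every item.

    `get` is a caller-supplied function (an assumption of this sketch) that
    takes a URL and returns (list_of_items, link_header_string).
    """
    items, url = [], first_url
    while url:
        page, link_header = get(url)
        items.extend(page)
        url = next_link(link_header)  # unknown until this response arrives
    return items
```

In real use, `get` would wrap whatever HTTP client you already have and return the decoded JSON list together with the raw Link header.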
As for the actual response, what you're going to get is a structure that looks like this (I've removed a lot of the information).
[
  {
    "id": 324686732,
    "body": "user: 9335957, quiz: 6068641, score: 5.0, time: 2020-02-13 01:43:09 +0000",
    "grade": "5.67",
    "score": 5.67,
    "submitted_at": "2020-02-13T01:43:09Z",
    "assignment_id": 24792036,
    "user_id": 9335957,
    "grader_id": -6068641,
    "attempt": 3,
    "submission_history": [
      {
        "id": 39442823,
        "score": 5.0,
        "submitted_at": "2020-02-13T01:43:09Z",
        "attempt": 3
      },
      {
        "id": 39442823,
        "score": 6.0,
        "submitted_at": "2020-02-13T01:42:22Z",
        "attempt": 2
      },
      {
        "id": 39442823,
        "score": 6.0,
        "submitted_at": "2020-02-13T01:39:36Z",
        "attempt": 1
      }
    ]
  }
]
The array here only contains one object, but you would have one for each student.
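To pull out just the per-attempt date/time and score, the structure above can be flattened into one row per attempt. This is a rough sketch that relies only on the field names visible in the sample. The function name `attempt_rows` is made up for illustration, and the rows carry assignment_id rather than a quiz ID, so mapping back to a quiz would need the assignment list fetched earlier.

```python
def attempt_rows(submissions):
    """Flatten submissions into (user, assignment, attempt, time, score) rows."""
    rows = []
    for sub in submissions:
        for attempt in sub.get("submission_history", []):
            rows.append((
                sub["user_id"],
                sub["assignment_id"],
                attempt["attempt"],
                attempt["submitted_at"],
                attempt["score"],
            ))
    # Oldest attempt first within each user/assignment pair; a missing
    # attempt number is treated as 0 so the sort never compares None.
    rows.sort(key=lambda r: (r[0], r[1], r[2] or 0))
    return rows


# Trimmed copy of the sample response shown above.
sample = [{
    "user_id": 9335957,
    "assignment_id": 24792036,
    "submission_history": [
        {"score": 5.0, "submitted_at": "2020-02-13T01:43:09Z", "attempt": 3},
        {"score": 6.0, "submitted_at": "2020-02-13T01:42:22Z", "attempt": 2},
        {"score": 6.0, "submitted_at": "2020-02-13T01:39:36Z", "attempt": 1},
    ],
}]
rows = attempt_rows(sample)
```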
James,
Thank you so much for your explanation!
It will help me a lot. I will try it and may have more questions...
So I think this almost gets me where I'm trying to go, but I'm wondering if there is a much easier way. I have about 20 quizzes (Jedi Trials) in my Physics course and about 70 students. I want to find the average score for each Jedi Trial.
I've queried an assignment group ID I need using: /api/v1/courses/'.$canvasID.'/assignment_groups
Then got a list of all assignment IDS for the 20 Jedi Trials in that group using: /api/v1/courses/'.$canvasID.'/assignment_groups/'.$JAGID.'?include[]=assignments
Now I'm wondering if I have to iterate through those 20 ID numbers with the GET /api/v1/courses/:course_id/students/submissions?student_ids[]=all&assignment_ids[]=$AID call and then average them myself?
I'm personally hoping there is just a simple call to get the average of an assignment? (fingers crossed)
First, you do not need to iterate through the 20 assignment ID numbers. You can repeat the assignment_ids[] query parameter multiple times within a single request rather than making 20 requests. If the assignment IDs are 1, 2, 3, ..., you could use assignment_ids[]=1&assignment_ids[]=2&assignment_ids[]=3
I don't believe there is a call to get just the average. Even in New Analytics, they download all of the grades for each student individually and compute the average from that.
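Computing the averages yourself is only a few lines once the submissions are downloaded. A minimal sketch, assuming each submission object carries `assignment_id` and `score` as in the sample response earlier in the thread, and assuming ungraded work shows up with a null score. The function name and the demo data are invented for illustration.

```python
from collections import defaultdict


def average_scores(submissions):
    """Return {assignment_id: mean score}, ignoring submissions with no score."""
    scores = defaultdict(list)
    for sub in submissions:
        if sub.get("score") is not None:  # assumed: ungraded work has a null score
            scores[sub["assignment_id"]].append(sub["score"])
    return {aid: sum(vals) / len(vals) for aid, vals in scores.items()}


# Made-up data for two assignments.
demo = [
    {"assignment_id": 1, "score": 8.0},
    {"assignment_id": 1, "score": 10.0},
    {"assignment_id": 1, "score": None},
    {"assignment_id": 2, "score": 7.0},
]
averages = average_scores(demo)
```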
There is a call to get the median for all assignments using the get course-level assignment data endpoint of the Analytics API. You would still need to filter through to find the right assignment.
There are faster ways to just get the scores. I believe the thread you tacked onto was about getting all of the responses to each attempt of the quiz, and if you're just after the average, you don't need as much overhead.
Here's a quick way to get a list of assignments with their names so that you can filter to find the Jedi assignments. I'm including the quiz IDs, which allows me to verify that something is a quiz, but it may not be necessary depending on your naming structure. It uses the GraphQL language, which allows you to get targeted information quickly and usually without pagination (definitely not for a class of 70 students).
You will need to replace the course ID (3119582) with your actual Canvas course ID.
query assignmentList {
  course(id: "3119582") {
    assignmentsConnection {
      nodes {
        _id
        name
        quiz {
          _id
        }
      }
    }
  }
}
Once you have the ID of the assignment you need, you can get the submissions for it. This would need to be repeated for each of the 20 assignments, changing the assignment ID each time. I included the submissionStatus because some of these will be late, missing, or submitted, and you may not want to include the missing scores (0) in the average. There's also a filter to include only those who score above a particular score, so if you want to exclude any 0 for any reason, you could use that capability.
query getScores {
  assignment(id: "30428636") {
    submissionsConnection(filter: {gradingStatus: graded}) {
      nodes {
        score
        submissionStatus
      }
    }
  }
}
You can play around with the GraphQL interface by going to your Canvas instance and adding /graphiql at the end. Once you have a working query, you can automate it using the GraphQL API.
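For automating a working query from Python, something like the sketch below could serve as a starting point. It assumes the GraphQL API accepts a JSON POST at /api/graphql with a bearer token, which you should confirm against your instance's documentation. The host, token, and function name are placeholders. The request is built but not sent, so the example runs offline.

```python
import json
import urllib.request


def graphql_request(base_url, token, query, variables=None):
    """Build (but do not send) a POST request for a GraphQL query."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


QUERY = """
query getScores {
  assignment(id: "30428636") {
    submissionsConnection(filter: {gradingStatus: graded}) {
      nodes { score submissionStatus }
    }
  }
}
"""

# Placeholder host and token; substitute real values before sending.
request = graphql_request("https://canvas.example.edu", "TOKEN", QUERY)
# To send: json.load(urllib.request.urlopen(request))
```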
Now I really want to show you the power of the GraphQL approach. Since you want all scores for an entire assignment group, you could do something like this. I need to find the assignment group that I want first.
query getAssignmentGroups {
  course(id: "3119582") {
    assignmentGroupsConnection {
      nodes {
        _id
        name
      }
    }
  }
}
From that, I see that the assignment group ID I need is 5683767. Then I can get all of my grades for that assignment group.
query getScores {
  assignmentGroup(id: "5683767") {
    assignmentsConnection {
      nodes {
        _id
        name
        submissionsConnection(filter: {gradingStatus: graded}) {
          nodes {
            score
            submissionStatus
          }
        }
      }
    }
  }
}
This returns a nested structure. There is an array that contains an entry for each assignment and then within that is an array for the scores and states. Here's a snippet of what the response looks like so that you can see what to expect.
{
  "data": {
    "assignmentGroup": {
      "assignmentsConnection": {
        "nodes": [
          {
            "_id": "30428636",
            "name": "Quiz 1.1 Classifying Data",
            "submissionsConnection": {
              "nodes": [
                {
                  "score": 9,
                  "submissionStatus": "submitted"
                },
                {
                  "score": 10.4,
                  "submissionStatus": "submitted"
                },
                {
                  "score": 11,
                  "submissionStatus": "late"
                },
                {
                  "score": 9.4,
                  "submissionStatus": "submitted"
                },
                {
                  "score": 0,
                  "submissionStatus": "missing"
                }
              ]
            }
          },
With this much of a query, you may run into pagination issues, but 20 assignments with 70 students might fit into a single request. It took just over 1 second for me to retrieve 31 assignments with about 25 students each. You're definitely not limited to the 100 that you would get from the regular API.
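Turning that nested response into one average per assignment could look like the sketch below. It assumes the complete response has the shape shown in the snippet, and the `skip_missing` switch is an illustrative way to leave out the missing (0) scores. The demo data reuses the five scores from the snippet.

```python
def group_averages(response, skip_missing=True):
    """Return {assignment name: mean score} from the assignmentGroup response."""
    assignments = response["data"]["assignmentGroup"]["assignmentsConnection"]["nodes"]
    result = {}
    for assignment in assignments:
        scores = [
            node["score"]
            for node in assignment["submissionsConnection"]["nodes"]
            if node["score"] is not None
            and not (skip_missing and node["submissionStatus"] == "missing")
        ]
        if scores:  # avoid dividing by zero when nothing qualifies
            result[assignment["name"]] = sum(scores) / len(scores)
    return result


demo = {"data": {"assignmentGroup": {"assignmentsConnection": {"nodes": [{
    "_id": "30428636",
    "name": "Quiz 1.1 Classifying Data",
    "submissionsConnection": {"nodes": [
        {"score": 9, "submissionStatus": "submitted"},
        {"score": 10.4, "submissionStatus": "submitted"},
        {"score": 11, "submissionStatus": "late"},
        {"score": 9.4, "submissionStatus": "submitted"},
        {"score": 0, "submissionStatus": "missing"},
    ]},
}]}}}}
without_missing = group_averages(demo)
with_missing = group_averages(demo, skip_missing=False)
```

For these five scores, leaving out the missing 0 gives an average of about 9.95, while counting it pulls the average down to about 7.96.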
I really do appreciate your efforts to help me, and over the last month or two I have kept going back to GraphQL and our built-in interface, but I'm afraid at this point it's just too far over my head. We just switched from Moodle to Canvas. When using Moodle, I could create little PHP websites and use MySQL to very quickly query our Moodle database. It was one of the reasons I was hesitant to switch to Canvas. I have been able to use =json_decode(@file_get_contents( in PHP to query our Canvas and rebuild many of my original pages. But getting the average score of all 20 tests makes it hang for a long time. I have read many things and watched videos, but using GraphQL with PHP is too far above me currently.
Thanks again to @James . I still haven't figured out how to program in GraphQL, but I continue to play with it. As a follow-up, I made one long query string using what you said about repeating the parameter (assignment_ids[]=1&assignment_ids[]=2&assignment_ids[]=3) rather than making 20 different calls, added some pagination, and it worked OK. I won't have all 20 Jedi Trials published until the end of the year, so it goes pretty fast currently. By the end of the year it will take no more than 10 seconds to load, which will be acceptable.
I'm glad you got it working.
There are other options available to the multiple submissions API that can speed it up more, but at the cost of having to cache the results locally. There are options to only fetch information that has been submitted or graded since a particular timestamp. With just 10 assignments, it probably isn't worth looking into. I use it to fetch all new submission data for the entire institution each night for an early alert system and using those filters speeds things up immensely. All of the information is stored in a database so I can prepare it in the format needed by the early alert system.
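As a rough illustration of the timestamp filters, the sketch below builds a query string that asks only for work newer than the last run. The parameter names `submitted_since` and `graded_since` are given here from memory of the submissions API documentation and should be checked there. The function name and the example timestamp are invented.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def incremental_query(last_run, use_graded=False):
    """Query string that asks only for work newer than `last_run` (aware datetime)."""
    stamp = last_run.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    key = "graded_since" if use_graded else "submitted_since"  # assumed names
    return urlencode([
        ("student_ids[]", "all"),
        (key, stamp),
        ("per_page", "100"),
    ])


# Pretend the previous run finished at this moment (made-up value).
last_run = datetime(2020, 2, 13, 1, 0, 0, tzinfo=timezone.utc)
query = incremental_query(last_run)
```

The caller would store the time of each successful run locally and pass it in on the next one, which is the caching cost mentioned above.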
Thanks @James, your replies here (and also here) are helpful.
I'm using the Submissions API to retrieve all my school's assignment submissions to find students with late/missing submissions in the current year and term. I'm supplying the section_id and one student per API call, looping over all the enrollments until complete.
In the Canvas UI assignments can be assigned to Everyone, or section(s) or student(s).
Is there a difference in the data returned when querying the API by course ID or section ID?
List submissions for multiple assignments
GET /api/v1/courses/:course_id/students/submissions
vs.
GET /api/v1/sections/:section_id/students/submissions
First, this sounds really inefficient. There's no need to make one API call per student per section; you can get an entire course's worth of students at one time (pagination will be involved).
I never use the section endpoint for submissions since I want data for the whole course and a student may be enrolled in multiple sections (perhaps not at your institution, but Canvas allows for it).
I looked at the source code and it looks like course or section is used mostly to determine the list of students and assignments. The section endpoint would be a subset of the course. If you have assignments that are only assigned to specific sections then the list of assignments for a different section would not include that assignment, whereas the course level endpoint would return all assignments, even if the student wasn't assigned it.
If you specify a student id, then you're just going to get the assignments for that student, regardless of whether it was assigned to the section, the student, or the whole course.
Now, back to the really inefficient comment.
If you specify student_ids[]=all, then you can get all submissions for all students. If you leave off the assignment_ids[] parameter, you get it for all assignments. That's one API call, but you will have to use pagination -- even with a per_page=100 parameter. If the student is not assigned the assignment, there is no submission record for it.
Now, if you have lots of students and/or lots of assignments, then you will likely run into bookmark pagination. That can slow things down since you cannot make multiple requests. In that case, making an individual call for each student could be parallelized and end up being faster (maybe - untested).
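If you do go the one-call-per-student route, a capped thread pool is one way to parallelize it. This is an untested-against-Canvas sketch: `fetch_one` stands for whatever function you write to download one student's submissions, the cap of 4 workers is an arbitrary choice, and a stand-in function is used so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_for_students(student_ids, fetch_one, max_workers=4):
    """Call fetch_one(student_id) for each student with a capped number of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with student_ids.
        results = list(pool.map(fetch_one, student_ids))
    return dict(zip(student_ids, results))


# Stand-in for a real API call so the sketch runs without a network.
def fake_fetch(student_id):
    return [{"user_id": student_id, "missing": False}]


by_student = fetch_for_students([101, 102, 103], fake_fetch)
```

A pool only limits how many requests are in flight at once. It does not space them out, so rate limiting can still apply.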
However, realize that with that endpoint, you're getting extra information that you don't need. I did some testing while writing this response and the discussion replies get delivered as part of the response. If you have a lot of those with some prolific students, then that could add to the time to download. But how do you get rid of that?
GraphQL to the rescue.
You can get all of the submission information for a course with just the information you need and it's a lot faster. You likely won't even have to mess with pagination unless your course is so big that it takes more than 30 seconds.
query courseSubmissions($courseId: ID) {
  course(id: $courseId) {
    submissionsConnection {
      nodes {
        userId
        assignmentId
        missing
        late
      }
    }
  }
}
You would then specify a variables property to the request that has the courseId. For example, here is the variables object for a Canvas course ID of 123456.
{ "courseId": 123456 }
That took 200 ms and delivered 1.8 kb of payload for one class with 186 submissions. For a larger class with 3409 submissions, it took 2.3 s with a 10.7 kb payload.
For comparison, in that larger class, grabbing all submissions for just one student (the call you're making) took me 3.1 s with a payload of 220.5 kb. And that's just the first 100 assignments. I had more than 100 assignments in that class. The second request, which had to wait until the first one was returned because it uses bookmarks, took 1.2 s and added another 89.8 kb.
We're looking at 4.3 s and 310.3 kb for one student. And sure, you can use a throttling library to make multiple requests, but you have to allow space between them or you exceed the allowed limit and your requests die. The heavier the request -- this seemed pretty heavy since it took 3.1 s -- the fewer requests you can make simultaneously. You're probably looking at no more than 4 students per second and that might be pushing it.
I had about 16 students in that class with 3409 submissions, so at 4 per second, that's 4 seconds to make the requests, plus 3 seconds for each of them to come back with the initial 100 submissions for the student. That takes me out to 7 seconds before the final request is made. Then add another second for the second page.
8 seconds with a lot of pagination to get the entire class when I fetch them one at a time vs 2.3 s for the entire class with one request when I use GraphQL.
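Once the courseSubmissions response is in hand, picking out the late or missing work is a short loop. A minimal sketch, assuming the response follows the usual GraphQL layout of data, then course, then submissionsConnection, then nodes, mirroring the query. The function name and the IDs in the demo are invented.

```python
from collections import defaultdict


def flagged_by_user(response):
    """Return {userId: [assignmentId, ...]} for late or missing submissions."""
    nodes = response["data"]["course"]["submissionsConnection"]["nodes"]
    flagged = defaultdict(list)
    for node in nodes:
        if node["missing"] or node["late"]:
            flagged[node["userId"]].append(node["assignmentId"])
    return dict(flagged)


# Invented IDs in the shape the query would return.
demo = {"data": {"course": {"submissionsConnection": {"nodes": [
    {"userId": "7", "assignmentId": "11", "missing": False, "late": False},
    {"userId": "7", "assignmentId": "12", "missing": True, "late": False},
    {"userId": "8", "assignmentId": "11", "missing": False, "late": True},
]}}}}
flagged = flagged_by_user(demo)
```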
You will probably want to make a couple of other GraphQL requests -- one to get the list of assignments in a course and one to get the list of students in a course. Then you can cross reference those to the userId and assignmentId delivered in the submissions payload.
I could have added it to the request itself. That's part of the beauty of GraphQL. But it's redundant since it includes the user information and assignment information that you request for each assignment. Depending on the size of the class, it's not that much extra. My large class took 2.6 s and delivered 20.4 kb (vs 2.3 s and 10.7 kb) when I requested some extra information about the student and assignment.
query courseSubmissions($courseId: ID) {
  course(id: $courseId) {
    submissionsConnection {
      nodes {
        userId
        assignmentId
        missing
        late
        assignment {
          name
        }
        user {
          sortableName
        }
      }
    }
  }
}
Why did I show it both ways? I'm asking for the bare minimum. You probably want due dates and other assignment information and that could make separate requests more efficient, especially if it's expensive (resource-wise) to fetch. I could send all three requests (submissions, assignments, users) in parallel to speed up the process.
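If you go with separate requests, the cross-referencing step amounts to two dictionary lookups per submission. The sketch below assumes you have already built `users` and `assignments` lookup tables keyed by the same IDs that appear in the submission nodes. All names and IDs here are made up.

```python
def label_submissions(nodes, users, assignments):
    """Attach names to submission nodes using lookup tables keyed by ID."""
    labeled = []
    for node in nodes:
        labeled.append({
            "student": users.get(node["userId"], "(unknown user)"),
            "assignment": assignments.get(node["assignmentId"], "(unknown assignment)"),
            "missing": node["missing"],
            "late": node["late"],
        })
    return labeled


# Hypothetical lookup tables built from the users and assignments requests.
users = {"7": "Student, A", "8": "Student, B"}
assignments = {"11": "Quiz 1", "12": "Quiz 2"}
nodes = [
    {"userId": "7", "assignmentId": "12", "missing": True, "late": False},
    {"userId": "9", "assignmentId": "11", "missing": False, "late": True},
]
report = label_submissions(nodes, users, assignments)
```

Using `.get()` with a fallback keeps the report from failing when a submission belongs to a student who is missing from the user list, such as a concluded enrollment.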
If you're not familiar with GraphQL, you can tack /graphiql onto the end of your dashboard URL to play around. If you can get it to work (not all information is available), it can really cut the time to get the results and reduce the amount of information retrieved.
@James , Thank you, outstanding, this is most helpful. My script went from 4+ hours of runtime down to 30 minutes.
GraphQL is on my to do list. I did some training on it last year but haven't used it yet.