Brian ( @bbennett2 ),
I was trying to go off of what I remembered from a couple of years ago. I may have misremembered some things, might have been confusing in some cases, or Canvas may have updated the way they do things. I'm sorry if I caused any confusion. I looked at things again this morning and here is an update.
First, let me clarify: the students can only receive their own scores, so the mean has to be calculated on the server side for students. It would be very bad for students to be able to see individual scores for other students.
I'm having a slow brain day and cannot think of any place within Canvas where the mean for an assignment is reliably (consistently) given to the instructor. If someone knows of one, let me know and I'll do some digging and see if I can find out how Canvas is returning that number.
In New Analytics, two means are provided.
- The class mean that the teacher sees at the top of New Analytics where it says "Average Course Grade".
- The mean for each assignment when you click on an assignment and the fly-out tray expands with the average grade.
It turns out that I was wrong: Canvas does supply these as part of an API request (just not one that you can easily make) for New Analytics. They are not calculated in the browser. I think I may have been thinking about the gradebook, where all calculations are made within the browser, including the averages for the assignment groups. I still don't remember seeing any place in the public documentation that returns the mean (except perhaps outcomes).
When you load New Analytics, it doesn't use the old REST API; it uses the GraphQL interface on a different host. It makes six queries before displaying the first page.
- The first query is named CheckCourse and gets the name of the course.
- The second query is named StudentMean and fetches the mean.
- The third query is named FilterQuery, despite there not being a filter in the query at all. Maybe they use it to decide what to filter? It contains a lot of stuff: assignments, sections, and students (including the current overall score, last participation time, and last page view time; sections the student is enrolled in, and information about the student that includes their last logout time and on time percentage).
- The fourth query is named CourseDetail and gets information about the course, assignments for the course (this includes a stats section which has max, mean, and min values), assignments for the students (akin to submissions), sections, and students (including the current overall score, last participation time, and last page view time).
- The fifth query is called LastUpdated and tells when the information in the New Analytics was last updated.
- The sixth query is called MessagingButtonQuery and contains information about the assignment for each student (akin to submissions) with the information needed to do the Message Students Who functionality. For each submission, it has their score (as a percentage) and whether it was late or missing. It then has the current score, section enrollment, and student name for each student.
That second query sounds like a quick way to get the mean for the whole class and the fourth one will get the average, minimum, and maximum for each assignment.
That sounds great except for two things and one of these is kind of a big thing.
These queries are not part of the REST API. They're not part of the Canvas API that's available through GraphQL, which can be automated using the REST API. They're part of the Canvas Analytics API. New Analytics doesn't call <instance>/api/v1/; it calls canvas-analytics-iad-prod.inscloudgate.net/v2/graphql. The authorization token you have for Canvas doesn't work with it. It sets up a session and passes that session-id as part of the request headers.
If I try to take one of the queries and copy it as fetch and then replay it, it comes back with a CORS error. In the browser's developer tools, I can adjust the Execution Context Selector to be "tool_content (lti_check_teacher_performance) canvas-analytics-iad-prod.inscloudgate.net" and then it will work. However, if I try fetching the min and max, then I get a 400 (bad request) status. I guess that request is just for the mean and not the other stats that it gives.
How do you get access to the information from New Analytics outside of the New Analytics interface in Canvas?
The API documentation mentions a Canvas API Gateway and says that "All Canvas services with public-facing APIs are moving (slowing) towards federating their APIs through a single GraphQL gateway." When I use that, I see that New Analytics is not one of the resources available.
If I try to send the token from the Canvas API Gateway to the analytics API, I get a 500 (internal server error) response status. I also tried sending a session-id from the browser and got the 500 status as well. It may be that I just don't have something set up correctly in the request or it's the security issues in place to keep people from doing that.
A few years ago (InstructureCon in 2017), I spent a bunch of time trying to figure out how to get into Canvas Studio from within Canvas without having an OAuth server. When they start using JWT, I get lost. I used that information in November 2018 to get into Roll Call attendance so that I could make the requests to fetch the information about attendance from within the Admin > Attendance screen. I wrote some JavaScript that I could run under Node that would make a sessionless launch that would get me into the Roll Call and then I could complete the form and make the calls. The process wasn't trivial.
You could use an automated browser like Puppeteer to launch New Analytics from within the browser and then capture the information.
All of that is a lot of overhead just to get a mean and you may not want it anyway.
You can also get the average on an assignment by loading SpeedGrader. This can be filtered rather than being for the whole course, but it includes a lot of information and it isn't available through the API, making it harder to automate. If you can get into Canvas, it saves the hassle of the sessionless LTI launch, but the amount of information that it sends may be prohibitive.
There is another non-API location that you can use to get the mean grade for the courses you're teaching. That's from the Dashboard when you click View Grades. The grades get delivered as part of the page, but then you could parse it and extract the means.
Of course, then we have the problem that the class average reported through Canvas Analytics doesn't match the class average reported from the View Grades page. The Canvas Analytics data is not current. When I look at my New Analytics, I see "As of March 15, 3:07 PM CDT" when it is currently March 16, 9:34 AM CDT. The View Grades page in Canvas is up-to-date and matches what I get if I export the gradebook and find the average.
When I say you may not want the mean from New Analytics, I'm not just talking about the delay of up to a day. I'm talking about the inaccuracy. We are on spring break this week and the class average is off a little: 74.81% (current) vs 74.37% (New Analytics). Okay, so maybe some students took a quiz in the middle of their spring break and the class average rose by 0.44 percentage points. I could buy that.
But then I go to a manually graded project that can no longer be turned in because it was due back in February. New Analytics shows that the average for that assignment is 75.8% with 3 missing submissions out of 20 students. That sounds reasonable, until I go to the actual gradebook. The average in the gradebook is 79.74% for 19 students if I don't include the Test Student. If I include the Test Student, I have 20 students and the average is 81%. When I look at the grades, only 2 are missing.
What's going on? I did have one student drop right before midterm and that student had a 0 on the assignment. That explains the third 0 for the assignment and it should fix the average, right? I summed up the scores for the 19 students still in the course, added the 0 for the student who dropped (a little math humor there), and then divided by 20. Now I get 75.75%, which Canvas could be displaying as 75.8%.
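That arithmetic can be checked with a quick sketch. The individual scores aren't listed here, so this assumes the sum of the 19 remaining students' scores can be approximated as 19 times the (already rounded) gradebook mean. The variable names are just for illustration.

```python
# Reproduce the New Analytics figure from the gradebook numbers quoted above.
gradebook_mean = 79.74  # percent, 19 enrolled students (Test Student excluded)
enrolled = 19

# Assumption: approximate the sum of scores from the rounded mean, because
# the individual scores are not available in this post.
total = gradebook_mean * enrolled

# Add the dropped student's 0 and divide by 20 students.
mean_with_dropped = (total + 0) / (enrolled + 1)

print(f"{mean_with_dropped:.2f}%")  # rounds to 75.8% at one decimal place
```

If the result lands near 75.8%, that supports the idea that New Analytics is still counting the dropped student's 0 for this assignment.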
If I look at an assignment that was due after my student dropped, New Analytics only shows 19 submissions. There are 12 missing submissions and 1 late submission. (It was due the Saturday before spring break but doesn't close until after spring break. Despite my telling the students that it was just like the content we had spent two days in class covering, and that it would be better if they did it now rather than forgetting it over spring break, it seems few listened.) Ok, 12 missing submissions should mean that 12 students have 0%, right? Nope, only 11 do in New Analytics. Back in Canvas, only 10 have the missing grade and a 0. In New Analytics, the average grade is 37.1%, but in Canvas, it's 40.44%.
When I go into Quiz Analytics and pull up the Student Analysis, only 1 student has attempted the quiz since March 13 (2 days before New Analytics was updated) and that was on March 16, so it would make sense if that submission was missing from New Analytics. That means that the 12 missing makes sense, but why does the histogram only show 11 missing? The one student who took the quiz after New Analytics was updated is included in the list of students missing submissions. The average of 37.1% is correct for the 12 missing grades, but the histogram is up to date except that there are 11 missing instead of 10. It turns out that the dropped student is listed as a missing submission for New Analytics, but the mean grade is computed without that student's score.
The point being that even if you could get to the New Analytics data to get the mean, you might not want it. It is more reliable to download all of the scores and compute the mean there. For small classes, it won't take much more time to download the scores and compute the mean than it would to make an API call and have Canvas return the mean. For large classes, it will, but the GraphQL query I gave helps a lot there since you don't have to download all of the information.
I still think the mean is the wrong number to look at in most cases. I don't want to know that the average is 37.1% or 40.44% on the quiz that only 9 students have taken. I either want to know that the mean of those 9 students who actually took the quiz is 85.37% or that the median of those 9 students is 80%. Although the median is more appropriate in this case, it is not helpful to me to know that the median of the 19 students is 0.
I say not helpful, but that isn't entirely accurate. It is helpful to know that most of the students haven't taken the quiz yet. I mean that it is not helpful to me as the instructor in assessing how much the students know. I don't directly grade on procrastination.
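Here is a small sketch of why those numbers diverge, using only the figures quoted above (9 students took the quiz with a mean of 85.37%, 19 students enrolled). The individual scores aren't listed, so every quiz taker is given the mean as a stand-in. That keeps the overall mean right, and the median of all 19 only depends on the zeros.

```python
from statistics import mean, median

takers = 9
enrolled = 19
mean_of_takers = 85.37  # percent, for the students who actually took the quiz

# Stand-in scores (assumption): each taker gets the mean, everyone else a 0.
scores = [mean_of_takers] * takers + [0] * (enrolled - takers)

mean_all = mean(scores)      # mean over everyone, zeros included
median_all = median(scores)  # middle value of 19 scores, 10 of which are 0

print(f"mean of all {enrolled}: {mean_all:.2f}%")
print(f"median of all {enrolled}: {median_all}%")
```

With 10 of the 19 scores being 0, the median of the whole class is 0 no matter what the 9 students who took the quiz scored.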
Switching gears ...
In one way of reading the documentation, Canvas already mentions that the score_statistics is only valid if you have a submission. The API documentation reads 'For “score_statistics” to be included, the “submission” option must also be set.' In the terse way they write, a person should read that as the score_statistics being tied to the submission and you must have a submission to get the score_statistics. That would keep someone from seeing how poorly (or well) the class did before they make their submission.
Another way to help clarify is that the Get a single assignment endpoint is a special case of the List assignments endpoint. This pattern is repeated in many of the APIs. The documentation for the more general case explains that submission is the current user's current submission and the score_statistics is "An object containing min, max, and mean score on this assignment. This will not be included for students if there are less than 5 graded assignments or if disabled by the instructor. Only valid if 'submission' is also included."
In other words, that particular call is not meant for the instructor, so it is unlikely to give what the instructor wants out of it.
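For anyone who wants to try that call as a student would, here is a minimal sketch of building the request URL with both include options set. The host name and IDs are placeholders, and the helper names are mine, not anything from Canvas.

```python
from typing import Optional
from urllib.parse import urlencode


def assignment_url(base_url: str, course_id: int, assignment_id: int) -> str:
    """Build the Get a single assignment URL with both include[] options."""
    # score_statistics is only returned when submission is also requested.
    params = [("include[]", "submission"), ("include[]", "score_statistics")]
    return (f"{base_url}/api/v1/courses/{course_id}"
            f"/assignments/{assignment_id}?{urlencode(params)}")


def extract_stats(assignment: dict) -> Optional[dict]:
    """Return score_statistics if Canvas included it, otherwise None."""
    return assignment.get("score_statistics")


# Placeholder host and IDs, for illustration only.
print(assignment_url("https://canvas.example.edu", 123, 456))
```

The square brackets get percent-encoded in the query string, which is still a valid way to send include[]. Expect extract_stats to return None when Canvas leaves the statistics out, which is the case described above.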
I remember feature requests from years ago to add the mean and/or median. It seems my long-term memory is much better than my short-term memory. One request that was created in 2016 is More descriptive statistics (at least, median and mean) for every column in "Grades" (gradebook). Someone mentioned being able to see the mean, but I don't even see that (it pre-dated the new gradebook). Even the old analytics page (add /analytics to the end of your course URL) doesn't have the mean, just the min, max, and median. Another request mentioned the ability to find the mean for an assignment by going to assignment details, but I'm not sure what that means, either (I don't see anything grade related on the assignment details page).
As far as the mean goes, which is what I've written about, you may not want what Canvas gives anyway. If you can live with the minimum, maximum, and median, then the Analytics API that Brian referred to is probably the quickest way. Unfortunately, you will have to sort through the data for every assignment to extract the one that you want.
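Sorting through that data might look something like the sketch below. It assumes you have already fetched the course-level assignment analytics (a list with one entry per assignment) and that each entry carries assignment_id, min_score, median, and max_score fields. Check those field names against what your instance returns. The sample rows are made up.

```python
from typing import Optional


def find_assignment_stats(rows: list, assignment_id: int) -> Optional[dict]:
    """Pick one assignment's min, median, and max out of the full list."""
    for row in rows:
        if row.get("assignment_id") == assignment_id:
            return {
                "min": row.get("min_score"),
                "median": row.get("median"),
                "max": row.get("max_score"),
            }
    return None  # assignment not present in the analytics data


# Made-up rows shaped like the assumed response, for illustration only.
sample_rows = [
    {"assignment_id": 101, "min_score": 4, "median": 8, "max_score": 10},
    {"assignment_id": 102, "min_score": 0, "median": 15, "max_score": 20},
]
print(find_assignment_stats(sample_rows, 102))
```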
If you want the mean, or any kind of specialized calculation like ignoring missing grades, then you should download the scores and calculate the statistics yourself.
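As a rough sketch of that do-it-yourself approach, the code below pages through the REST API's list of submissions for one assignment and then computes the mean and median over only the graded, non-missing ones. The base URL, IDs, and token are placeholders you would supply, the function names are mine, and the filter (skip anything with no score or flagged missing) is one possible choice rather than the only sensible one.

```python
import json
import re
from statistics import mean, median
from urllib.request import Request, urlopen


def fetch_submissions(base_url, course_id, assignment_id, token):
    """Page through the submissions for one assignment."""
    url = (f"{base_url}/api/v1/courses/{course_id}"
           f"/assignments/{assignment_id}/submissions?per_page=100")
    submissions = []
    while url:
        request = Request(url, headers={"Authorization": f"Bearer {token}"})
        with urlopen(request) as response:
            submissions.extend(json.load(response))
            link = response.headers.get("Link", "")
        # Canvas advertises the next page, if any, in the Link header.
        match = re.search(r'<([^>]+)>;\s*rel="next"', link)
        url = match.group(1) if match else None
    return submissions


def summarize(submissions, points_possible):
    """Mean and median percentage over graded, non-missing submissions."""
    percents = [
        100 * s["score"] / points_possible
        for s in submissions
        if s.get("score") is not None and not s.get("missing")
    ]
    if not percents:
        return None
    return {"count": len(percents),
            "mean": mean(percents),
            "median": median(percents)}


# Made-up records with only the fields the summary uses.
sample = [
    {"score": 8, "missing": False},
    {"score": 0, "missing": True},   # auto-zero for a missing submission
    {"score": None},                 # not graded yet
    {"score": 10},
]
print(summarize(sample, points_possible=10))
```

Keeping the fetch and the summary separate means you can change what counts (dropped students, the Test Student, excused work) without touching the download step.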