Obtaining and using Access Report data for an entire course

James
Community Champion

May 27, 2020, update (version 15):

Another week, another update. This time I installed a new linter to check for crud and subtle issues with the code. It turned up a lot of little things and at least one place where I was using an array when I thought I was using an object. Here are some of the changes for version 15:

  • The file is now sorted by the sortable name, when present. Version 14 sorted it by the Canvas User ID.
  • Promises now work in the code. Version 14 relied on Bottleneck's idle function to kick in, but it might have triggered before all of the fetches were done.
  • The script can now handle students enrolled in multiple sections of the course. You can generate a delimited list of sections, duplicate the information for each section, or just return the first section found.
  • The script now only tries to fetch usage information when there is a last_activity_at date. That is no guarantee that there is an access report, but I never found an access report without having a last activity date.
  • The script can now hide empty columns with no data. Previously, this was only for SIS data unless you edited the code.
  • The script can be run correctly without reloading the page. Previously it appended the data, so if you ran it twice without refreshing the page, the data would be there twice.

May 19, 2020, update (version 14):

I've done a major rewrite of the script.

  • It now uses a throttling library to play nicer with Canvas. Large courses were getting inconsistent results because the requests were failing. Requests shouldn't fail, but if they do, I throttle it even more and then try to repeat the failed requests.
  • I added the Last Activity and Total Activity Time from the People page to the report.
  • I added the current course score and current course grade to the report (the grade is only available if you're using a grading scheme).
  • I modified the quiz views to match what Canvas gives. In the data supplied, they count taking a quiz as a view, but they subtract that off in the online access report. My data now matches theirs.
  • I check the permissions of the user to make sure they can generate a report and don't add the button if they cannot. Previously, students could add the access report, it just didn't give them any information. Now it doesn't even show the button.

This is a major rewrite and updating will show a lot of changes. There is a good chance that any local customization you have done to the script will be lost. Those using a non-instructure.com domain for Canvas may need to reset their // @include line on line 5 (see the custom URL section below).

Synopsis

Canvas provides an Access Report for each student, but you have to click on each student's name from the People page to get it. This document will show you how to install a button on the People page that will create a .CSV file of the Access Report data for all students enrolled in the course. It will then show some ways you can obtain useful information from that data using Microsoft Excel. Although much of this data could be obtained through Canvas Data, this script makes it available to the instructor in real-time.

Quick Install

For those power users who are impatient, here are the quick install steps.

  1. Install a browser userscript manager: Tampermonkey for Chrome/Firefox/Safari
  2. Install the Access Report Data user script.
  3. Navigate to the People page and click on the "Access Report Data" button

If you run into problems, be sure to go back and read the instructions.

Note that the Tampermonkey extension for Safari requires payment. This is for the author of the extension that allows the script to run. I am not asking anything for the script itself.

Introduction

Canvas will give you an Access Report for each student that tells you how many times a student has viewed or participated in a particular content item. To obtain the list, you go to the Course Roster by clicking the People navigation link, then click on a student's name. Once you do that, you will get an item on the right-side navigation bar that says "Access Report for student's name". When you click on that, you get something like this:

Student Access Report

It gives you the type of content (as an icon), the name of the content, the number of times a student viewed the content, the number of times a student participated in the content, and the last time the student viewed (or participated) in the content. It is sorted by the time the content was last accessed so you can see what they have been working on most recently.

Unfortunately, this information is available for just one student at a time, so answering questions like "Who viewed the PowerPoint presentation I told them to read?" becomes difficult -- you need to go into each student individually to see that information.

I've written a User Script that solves that problem. It fetches a list of all of the students in the class and then obtains the Access Report for each one, compiling all of that data into a single Comma Separated Values (.CSV) file that can be opened with a spreadsheet like Microsoft Excel.

This document shows you how to install that script and then analyze the data that you get from it.

Installing the User Script

A user script is a JavaScript program that is run by the browser on the user's machine. Rather than a Canvas-supplied script, it's one that the user installs and runs on their own. The installation is per user and per machine, so you will need to install it on each machine that you want to use the script with.

Custom URLs

The script automatically runs on any page that matches https://*.instructure.com/courses/*/users. This is the main People page if your site is hosted by Instructure without a custom instance. If you have a custom URL, like canvas.university.edu, then you will need to modify the script to get it to work.

The specific steps to do this vary depending on your browser add-on. In Greasemonkey, click on the Greasemonkey pull-down, choose Manage User Scripts, find the Access Report Data, right click and choose Edit. In Tampermonkey, click on the Tampermonkey Icon, choose Dashboard, and then click on Access Report Data.

In either case, you need to change the // @include statement on line 5 to match your instance. In the case of canvas.university.edu, you should change it to https://canvas.university.edu/courses/*/users. The * is a wildcard that will match any course.
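As a rough illustration of how the wildcard behaves, the sketch below turns an @include-style pattern into a regular expression and tries a few URLs against it. The helper name includeToRegExp is made up for this example, and it assumes that * matches any run of characters; userscript managers have their own matching rules, so treat this as an approximation rather than a description of what Tampermonkey or Greasemonkey does internally.

```javascript
// Hypothetical helper: turn an @include-style glob into a RegExp.
// Assumption: "*" matches any run of characters; everything else is literal.
function includeToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + escaped + '$');
}

// The default pattern and the custom-domain pattern mentioned in the text.
const hosted = includeToRegExp('https://*.instructure.com/courses/*/users');
const custom = includeToRegExp('https://canvas.university.edu/courses/*/users');

console.log(hosted.test('https://myschool.instructure.com/courses/1785810/users')); // true
console.log(hosted.test('https://canvas.university.edu/courses/1785810/users'));    // false
console.log(custom.test('https://canvas.university.edu/courses/1785810/users'));    // true
```

The second result is the situation described above: the default pattern does not match a custom domain, so the button never appears until the // @include line is edited.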

Customization

There are several configuration variables that can be set within the source code to alter the functionality of the script.

  • showViewStudent will include the participations where a student clicked on a student's name from the People (roster) page and viewed their profile when it is set to true. The default is false, which will remove this information from the report before the CSV file is downloaded. This information was included prior to February 3, 2017, but faculty were getting confused when they did a pivot table and the names of their students showed up as titles along with assignments and content pages. Now users will have to explicitly set it to true to get that information.
  • quizParticipation is a flag to make the data in this report match the data displayed in the Canvas Access Report. Canvas counts quiz participations as views, but then subtracts them off when displayed. When this is true, it makes this report match Canvas. If false, then it displays what was in the JSON data, which was the previous behavior.
  • enrollmentStates is an array that says what kind of enrollments should be included. The default is 'active' and 'completed', but you may also use 'invited', 'rejected', and 'inactive' (inactive probably won't help much). See the List users in a course endpoint for additional information.
  • analytics is an array that provides an easy way to include the student analytics (last activity and total activity) or grade (current grade and current score).
  • disableMissing is a Boolean value that will remove empty columns from the output when true.
  • headingSpaces will allow you to remove the space in the headings and replace it with something else (empty string or underscore are common). This can cause "User ID" to become "UserID" or "User_ID". Some users reported problems with programs, like R, when there were spaces in the variable names (headings). The default is ' ' (a space), which is to leave the spaces.
  • multipleSections determines how students enrolled in multiple sections are handled. Setting the value to 1 will duplicate the information for each section, so a student enrolled in three sections will have three times the data. This is important if you are using the section in your pivot table, but will throw off the counts if you are not. Setting the values to 2 will take the first section that it finds and use it and ignore all of the other sections so that each item is included only once with no duplication. Setting it to a non-numeric string will use that string as a delimiter but not duplicate any items. The default is ', ' (comma followed by space) so that multiple sections will show up as a comma delimited list.
  • maxConcurrent is the maximum number of concurrent calls to make to Canvas. This is part of the throttling system. The higher this is, the faster the calls can be made, but the more likely you will abuse Canvas in a way that causes them to block your requests.
  • minTime is a delay in milliseconds injected between each call. If you make a lot of calls at the same time, Canvas imposes a penalty that can quickly exhaust the available limit before the maxConcurrent is reached. It is likely that this value will have more impact than maxConcurrent will.
  • debug is a Boolean flag to output debugging information. Included in the debugging output is the minimum x-rate-limit-remaining value and the maximum x-request-cost value. Those can be used to help tweak the maxConcurrent and minTime settings.
  • csvFields contains the headings, source field, any special formatting required, and allows you to disable columns that you do not want. The three sources are u (user data), s (section data), and a (access report data). This is under the advanced configuration section, which typically means don't mess with it if you don't know what you're doing, but any user should be able to change the name property.
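To make two of these settings more concrete, here is a minimal sketch of what headingSpaces and multipleSections do to the output. The function names formatHeading and sectionValues are hypothetical and are not taken from the script; they only mirror the behavior described in the list above.

```javascript
// Hypothetical helper: apply the headingSpaces setting to a column heading.
// The default ' ' leaves the heading unchanged.
function formatHeading(heading, headingSpaces) {
  return heading.split(' ').join(headingSpaces);
}

// Hypothetical helper: apply the multipleSections setting to one student's
// section names and return the value(s) used for the Section Name column.
//   1                  -> one value per section (each row is duplicated)
//   2                  -> only the first section found
//   non-numeric string -> a single delimited list
function sectionValues(sectionNames, multipleSections) {
  if (multipleSections === 1) return [...sectionNames];
  if (multipleSections === 2) return sectionNames.slice(0, 1);
  return [sectionNames.join(multipleSections)];
}

console.log(formatHeading('User ID', '_')); // User_ID
console.log(formatHeading('User ID', ''));  // UserID

const sections = ['Section A', 'Section B', 'Section C']; // made-up names
console.log(sectionValues(sections, ', ')); // [ 'Section A, Section B, Section C' ]
console.log(sectionValues(sections, 1));    // [ 'Section A', 'Section B', 'Section C' ]
console.log(sectionValues(sections, 2));    // [ 'Section A' ]
```

With a value of 1, every access report row for that student would be written three times, once per section, which is why the counts are thrown off unless the section is part of the pivot table.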

Export the Data

In the June 4, 2016, production release, Canvas consolidated the buttons that previously appeared on the People page under the administrative cog.

The script was updated on June 6, 2016, to reflect that change and move the Access Report Data into that same cog. Note that the demonstration videos have not been updated to reflect the change.

Open Settings

When you expand the menu by clicking on the cog, you can choose the Access Report Data item.

Click Access Report Data

Click on this and wait.

There is now a progress bar that appears once the list of students has been downloaded and it is fetching the access reports.

It has to make an API call for the list of students and then generate the Access Report for each student in the course. This only took 8 seconds for my class with 46 students and up to 112 content items, but we're only one month into the semester and it will probably slow down later. It took 12 seconds on a course with 81 active students and 126 content items. It took 15 seconds for a class with 35 students but 470 content items.

Limitations

Early on, really large classes were having timeout issues in Chrome, and switching to Firefox helped. I now suspect those problems were caused by failing requests, so I hope the changes in version 14 have resolved them. With version 14, I let Chrome run for 4.5 minutes in a class with 29,615 items without issues.

If you have a really large class, you may need to tweak the settings. The call to get the enrollments is relatively expensive compared to the call to get the usage (access report data). In a class of 300 students, it was fine, but in a class of over 10,000 students, it was timing out. It is supposed to retry failed attempts, but there was some data loss that I haven't tracked down yet. It's best to avoid exhausting the x-rate-limit-remaining header. Increase the minTime or decrease the maxConcurrent values. Normally, the minTime is the more important one, but if your requests are taking a long time to process (like the enrollments were), then you may decrease the number of concurrent requests. Reducing maxConcurrent from 40 to 20 avoided the lost data issue in the 10k+ course and only added 4 seconds to the download time for the class with 300 students.
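When tuning these values, it helps to watch the two response headers that the debug output summarizes. The following is a minimal sketch of that bookkeeping, not the script's actual code: createRateTracker is a hypothetical name, and the header values are invented for illustration. In a browser you would read them with response.headers.get('x-rate-limit-remaining') and response.headers.get('x-request-cost') after each fetch.

```javascript
// Hypothetical tracker: remember the lowest x-rate-limit-remaining and the
// highest x-request-cost seen across a batch of responses.
function createRateTracker() {
  let minRemaining = Infinity;
  let maxCost = 0;
  return {
    // headers is a plain object of header name -> string value
    record(headers) {
      const remaining = parseFloat(headers['x-rate-limit-remaining']);
      const cost = parseFloat(headers['x-request-cost']);
      if (!Number.isNaN(remaining)) minRemaining = Math.min(minRemaining, remaining);
      if (!Number.isNaN(cost)) maxCost = Math.max(maxCost, cost);
    },
    summary() {
      return { minRemaining, maxCost };
    },
  };
}

const tracker = createRateTracker();

// Made-up header values, for illustration only.
[
  { 'x-rate-limit-remaining': '700.0', 'x-request-cost': '0.05' },
  { 'x-rate-limit-remaining': '512.5', 'x-request-cost': '1.25' },
  { 'x-rate-limit-remaining': '630.2', 'x-request-cost': '0.08' },
].forEach((headers) => tracker.record(headers));

console.log(tracker.summary()); // { minRemaining: 512.5, maxCost: 1.25 }
```

If the minimum remaining value gets close to zero, that is the signal to increase minTime or decrease maxConcurrent.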

The course with 81 students actually had 233 students in it, but it was a resource course for faculty and only 81 had bothered to go into the course to do anything. That brings up an important note about the Access Report.

The Access Report only provides information on students who are doing something in the course.

Access Report data does not return information about who has not done something, but Excel has an option that will allow you to see this.

Another thing to note is that additional information is available through the Page View data, like the exact times when students did something and which browser they were using at the time. However, faculty don't generally have access to this information and you have to load the information for each student and sift through it to see which applies to the particular course. This takes way more than 8 seconds for little gain.

The Page View information is available through Canvas Data. However the data there is not current, running about a day or two behind real-time. It also has limited availability and requires additional resources to analyze. Page view data still occurs when a student does something. If they're not doing anything, there is nothing to record.

Finally, links to URLs that are contained within assignments or pages are not included here. The External URLs are just those that are linked to items in modules, not a link in a page full of instructions.

Raw Data

Okay, you've installed the script, clicked the button, and opened the file in Excel. Now what?

Well, you get a bunch of raw data that will need to be manipulated before you can tell anything useful. It looks pretty intimidating: you have a report that looks like this (and this is just the first 10 rows).

Excel Raw Data

Note that the SIS Login and SIS User ID may not be there, depending on the permissions you have within Canvas.

Where did all this come from?

It turns out that Canvas actually provides more information than it shows on the page you get when you ask for the Access Report. Canvas displays the HTML version with just the highlights and in a form that is easy for humans to read. It also provides the data in JavaScript Object Notation (JSON) format that is much easier for a computer to deal with. Canvas internally calls these items assets, and that gets reflected in the naming of the item.

Here's what that first entry looks like in JSON.

{
"id":123456789,
"asset_code":"discussion_topic_11319205",
"asset_group_code":"topics",
"user_id":1278402,
"context_id":1785810,
"context_type":"Course",
"last_access":"2016-02-12T21:31:11Z",
"created_at":"2016-02-10T20:12:45Z",
"updated_at":"2016-02-12T21:31:11Z",
"asset_category":"topics",
"view_score":2,
"participate_score":null,
"action_level":"view",
"summarized_at":null,
"display_name":"Discussion 4: Hypotheses and Errors",
"membership_type":"StudentEnrollment",
"readable_name":"Discussion 4: Hypotheses and Errors",
"asset_class_name":"discussion_topic"
}

What's displayed in the HTML version of the report is contained in the asset_class_name, readable_name, view_score, participate_score, and last_access fields. The user's name is displayed at the top of the report, but the user_id (1278402) can be used to identify the student (in this case Cubic Dream), and the user script merges the user information and the access data together into a usable format. By the way, all of the user data have been anonymized, but the assignments and access data are from my actual class. Cubic Dream isn't my student's real name and 1278402 isn't Cubic's real Canvas User ID.
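The merge can be pictured with a short sketch. This is not the script's code: toRow and the users lookup are hypothetical, the asset is trimmed to the fields used here, and the quiz adjustment is an assumption based on the quizParticipation description in the Customization section.

```javascript
// One entry from the JSON shown above, trimmed to the fields used here.
const asset = {
  asset_code: 'discussion_topic_11319205',
  user_id: 1278402,
  last_access: '2016-02-12T21:31:11Z',
  asset_category: 'topics',
  view_score: 2,
  participate_score: null,
  readable_name: 'Discussion 4: Hypotheses and Errors',
  asset_class_name: 'discussion_topic',
};

// Hypothetical user lookup keyed by Canvas User ID (anonymized name).
const users = {
  1278402: { name: 'Cubic Dream', sortable_name: 'Dream, Cubic' },
};

// Hypothetical merge of one asset with its user into one spreadsheet row.
function toRow(asset, users, quizParticipation = true) {
  const user = users[asset.user_id] || {};
  let views = asset.view_score || 0;            // null becomes 0
  const participations = asset.participate_score || 0;
  // Assumption: for quizzes, subtract participations from views so the
  // numbers match what the Canvas Access Report page displays.
  if (quizParticipation && asset.asset_class_name === 'quizzes/quiz') {
    views = Math.max(0, views - participations);
  }
  return {
    'User ID': asset.user_id,
    'Display Name': user.name || '',
    'Sortable Name': user.sortable_name || '',
    Category: asset.asset_category,
    Class: asset.asset_class_name,
    Title: asset.readable_name,
    Views: views,
    Participations: participations,
    'Last Access': asset.last_access,
    Code: asset.asset_code,
  };
}

const row = toRow(asset, users);
console.log(row['Display Name'], row.Views, row.Participations); // Cubic Dream 2 0
```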

Data Dictionary

In statistics, I emphasize the importance of having a good data dictionary. You need to know what the values represent.

  • User ID is the Canvas User ID. Everyone has one. Most people won't find it extremely interesting, but in the case of multiple students with the same name, it can help tell them apart.
  • Display Name is the name of the student that is displayed in Canvas. It's normally in First Last format, like Cubic Dream or Skinny Record.
  • Sortable Name may not be available to you. There is a feature that your Customer Success Manager can enable that will allow you to list names in Last, First format like Dream, Cubic or Record, Skinny. If it's enabled, this provides you with a quick way to sort alphabetically by last name.
  • Category corresponds to the asset_category of the Access Report data. It includes things like announcements, assignments, collaborations, conferences, external_urls, files, grades, home, modules, quizzes, roster, topics, and wiki.
  • Class corresponds to the asset_class_name of the Access Report data. It includes things like announcement, assignment, attachment, content_tag, discussion_topic, google_docs_collaboration, quizzes/quiz, student_enrollment, teacher_enrollment, and wiki_page. Generally, there is a one-to-one correspondence between category and class, but some things like the roster category are broken into student_enrollment and teacher_enrollment for the class name.
  • Title is the name of the content. In the Access Report data, it's called the readable_name. There was a display name as well, but it was sometimes blank and when it was there, it mostly matched the readable name.
  • Views is the number of times the student viewed the content. For discussions, this would be the number of times they went in and viewed the discussion. This is the closest most people will get to the number of messages they read, but it's only the number of times they went in; there is no guarantee they actually did anything past opening the page.
  • Participations is the number of times the student participated. For discussions, this would be the number of posts they made.
  • Last Access is the last time the student viewed or participated. It is converted to the local time according to the browser.
  • First Access is the first time the student viewed or participated. It is converted to the local time according to the browser.
  • Action is either view or participate. I'm not exactly sure on this, but I think it corresponds to the last access.
  • Code is a unique identifier for each piece of content. It contains the type of content plus the Canvas ID. For example, assignment_8556874 identifies this as an assignment with a Canvas ID of 8556874. That is not very useful for faculty, but if someone was trying to link all the data together for advanced reports, it may be.
  • Group Code is another way of categorizing the data that most people will ignore. The group doesn't refer to Groups in the normal sense, but is some way to organize the data. For example, there is an assignment_group_### group code field that appears to represent which assignment group the item belongs to. That means you could, with additional information, break it down by whether the assignment was homework, exams, projects, etc., depending on your assignment groups. On the other hand, every single code beginning with wiki_page_ belonged to the same wiki_ group code (at least for my course).
  • Context Type should be Course or Group and relates to the Context ID to determine exactly which course or group.
  • Context ID is the Canvas Course or Group ID for the course. It is used in conjunction with the Context Type.
  • Login ID is what the user uses to log in to Canvas. For us, it's their NetID, but it could be an email address.
  • Section Name is the name of the section the student is enrolled in. This may be a delimited list if the student is in more than one section. See the multipleSections configuration variable for more information.
  • Section ID is the Canvas ID for the section the student is enrolled in. This may be a delimited list if the student is in more than one section. See the multipleSections configuration variable for more information.
  • SIS Course ID is the Course ID supplied by your Student Information System (SIS). This column may not be there if the person requesting the Access Report doesn't have access to the SIS information from Canvas. It's a permissions issue and the script runs as the person calling it.
  • SIS Section ID is the Section ID supplied by your Student Information System (SIS). This column may not be there if the person requesting the Access Report doesn't have access to the SIS information from Canvas. It's a permissions issue and the script runs as the person calling it.
  • SIS User ID is what your SIS knows the person by. For us, it's an integer that uniquely identifies the user. This column may not be there if the instructor doesn't have access to the SIS information inside Canvas.
  • Last Activity is the last time the student accessed the course as explained in How do I use the People page in a course as an instructor? It is converted to the local time according to the browser.
  • Total Activity is the total time in decimal hours as explained in the Canvas Instructor Guide. Instead of giving hours:minutes:seconds, 3:12:15 becomes 3.20 hours.
  • Current Score is the current numeric score for the student in the course.
  • Current Grade is the current letter grade for the student in the course. This is only available if a grading scheme has been set.
  • Final Score is the score the student would get if all ungraded assignments were given a 0. This is disabled by default.
  • Final Grade is the letter grade the student would get if all ungraded assignments were given a 0. This is disabled by default.

Note that section information was added after the original script was released and is not included in the videos.
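For anyone who wants to check a Total Activity value by hand, the conversion from hours:minutes:seconds to decimal hours is simple arithmetic. The helper below is a sketch with a made-up name (toDecimalHours); it assumes rounding to two decimal places, which may not be exactly how the script rounds.

```javascript
// Hypothetical helper: convert an h:mm:ss duration to decimal hours,
// rounded to two decimal places (the rounding rule is an assumption).
function toDecimalHours(hms) {
  const [h, m, s] = hms.split(':').map(Number);
  const hours = h + m / 60 + s / 3600;
  return Math.round(hours * 100) / 100;
}

console.log(toDecimalHours('1:30:00').toFixed(2)); // 1.50
console.log(toDecimalHours('0:45:00').toFixed(2)); // 0.75
```

Decimal hours are easier to sum and average in a spreadsheet than a time string, which is presumably why the column is delivered that way.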


Drilling Down to Specifics

Some questions can be answered using just the raw data.

Format as a Table

If you're going to examine the raw data, you will want to turn it into a table first. To do this, go to Home > Format as Table or Insert > Table. This allows you to filter or sort on a column, which will greatly increase your productivity later.

Who participated in a discussion?

Let's say we wanted to know who participated (and how many times) in Discussion 3.

I purposely started this one off in a novice way, to show what people who may not be familiar with Excel can do. There are more efficient ways of doing this.

  • You can sort by the title by clicking in the Title column and going to Data > Sort > A-Z. An easier way is to click on the down arrow on the Title heading at the top of the table and choose Sort A to Z.
  • Doing that may overwhelm you with information so you can use Ctrl-F to find the one you want. But once I did that, it turned out that Discussions are in there twice, once as a discussion and once as an assignment. This is a problem when you have multiple contents with the same name, so sorting by title may not be the best solution.
  • You can filter the Category (choose topics) or Class (choose discussion_topic). To do this, click on the appropriate heading at the top and un-check the ones you don't want.
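The same filter can be expressed in a few lines of code for anyone working with the CSV outside of Excel. This is only a sketch: the rows are made up, they use the column names from the Data Dictionary, and participantsIn is a hypothetical helper. Note how filtering on Class avoids the duplicate entry that shows up when the discussion is also listed as an assignment.

```javascript
// Made-up rows using the column names from the Data Dictionary.
const rows = [
  { 'Display Name': 'Cubic Dream', Class: 'discussion_topic', Title: 'Discussion 3', Views: 4, Participations: 2 },
  { 'Display Name': 'Cubic Dream', Class: 'assignment', Title: 'Discussion 3', Views: 1, Participations: 0 },
  { 'Display Name': 'Skinny Record', Class: 'discussion_topic', Title: 'Discussion 3', Views: 3, Participations: 0 },
  { 'Display Name': 'Skinny Record', Class: 'discussion_topic', Title: 'Discussion 4: Hypotheses and Errors', Views: 2, Participations: 1 },
];

// Hypothetical helper: who participated in one piece of content, and how often.
function participantsIn(rows, className, title) {
  return rows
    .filter((r) => r.Class === className && r.Title === title && r.Participations > 0)
    .map((r) => ({ name: r['Display Name'], participations: r.Participations }));
}

console.log(participantsIn(rows, 'discussion_topic', 'Discussion 3'));
// [ { name: 'Cubic Dream', participations: 2 } ]
```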


Who viewed an external URL?

You'll want to sort by title to group the content together. You could use the Code to group, but it wouldn't be alphabetical.

I then used filters to select the external URLs, which can be found under Category (external_urls) or Class (content_tag), and found the one I was looking for.

Show Me!

Here is a video walk-through of the three items mentioned here.

Summary Reports

Sometimes you want a broader picture than just what a student did on a single assignment. To accomplish these, you'll need to create a Pivot Table.

Create a Pivot Table

To create a pivot table, you need to go to Insert > Pivot Table and click OK.

Once you're there, you will probably want to drag the Display Name or Sortable Name down to the Rows so you can break things down by students. Every one of the items in this section starts off the same way.

Pivot tables allow you to insert slicers to quickly filter the data or time slicers to view data over a particular time period (you must have a date/time field to do this, but we have first access and last access to pick from).

Analyzing Discussions

This whole Access Report project grew out of a desire to know how many times students had gone into the discussions and at least viewed them.

What I'd like to know is how many times did a student view or participate in a discussion this semester.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows
  3. Drag the Participations to the Values. Excel wants to do "Count of Participations" instead of "Sum of Participations". You can change this in several places, but double clicking on the heading of the table is probably the quickest. Change it from Count to Sum and then change "Sum of Participations" to something else. You might want to use "Participations", but that's already used, so you have to pick something else.
  4. That gives the participations for the entire course. You could add a filter on the right side, but a faster way is to choose Insert Slicer from the top. This gives a nicer interactive menu where you can immediately click and limit your data.
  5. After looking at Participations, you can do the same thing with Views. Drag it to the Values box, where, thankfully, it comes through as a sum. I would change the title from "Sum of Views" to something else, like "View".
  6. You can sort the data by clicking on the Row Labels pull down. The default A to Z and Z to A are for the data in that column, but you can choose More Sort Options and tell it to sort by another column.
  7. To break the report down by the discussion, you can add another slicer for the Title and then look at participations and views one discussion at a time.
  8. If you decide you want to look at all of the discussions, then drag Title to the Columns selector on the right side. You'll need to turn off the filter on the Title if you do this.

Here is a video that shows all of that in action.
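For readers who prefer code to pivot tables, the discussion summary built in the steps above amounts to a group-by on the student's name with a sum of Views and Participations, limited to one Category. The sketch below uses invented rows and a hypothetical summarize function; blank values are treated as zero.

```javascript
// Made-up rows using the column names from the Data Dictionary.
const rows = [
  { 'Display Name': 'Cubic Dream', Category: 'topics', Title: 'Discussion 3', Views: 4, Participations: 2 },
  { 'Display Name': 'Cubic Dream', Category: 'topics', Title: 'Discussion 4: Hypotheses and Errors', Views: 2, Participations: 1 },
  { 'Display Name': 'Cubic Dream', Category: 'quizzes', Title: 'Quiz 1', Views: 5, Participations: 1 },
  { 'Display Name': 'Skinny Record', Category: 'topics', Title: 'Discussion 3', Views: 3, Participations: '' },
];

// Hypothetical pivot: sum Views and Participations per student for one
// Category (the slicer). Blank or missing values count as zero.
function summarize(rows, category) {
  const totals = new Map();
  for (const r of rows) {
    if (category && r.Category !== category) continue;
    const t = totals.get(r['Display Name']) || { views: 0, participations: 0 };
    t.views += Number(r.Views) || 0;
    t.participations += Number(r.Participations) || 0;
    totals.set(r['Display Name'], t);
  }
  return totals;
}

const discussionTotals = summarize(rows, 'topics');
console.log(discussionTotals.get('Cubic Dream'));   // { views: 6, participations: 3 }
console.log(discussionTotals.get('Skinny Record')); // { views: 3, participations: 0 }
```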

Filtering by Time Period

The Course Roster (People) page shows you when the last activity of a student was and the total amount of time spent in the course. It does this in alphabetical order. Unfortunately, last activity could be just logging into the course; it doesn't mean they did anything else once they got there.

As a side note, if you have installed my Sort a Roster Canvancement, then you can click at the top of any column to sort by that column. You can sort by section, by the time they last accessed the course, or even by the total amount of time spent in the course.

You can filter the information in the Access Report Data spreadsheet by time.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows, and both Views and Participations to the Values. As before, change the heading "Sum of Views" to just "View" and the "Count of Participations" to Sum instead of Count and then change the title to "Participation".
  3. Click on Insert Timeline and choose Last Access.
  4. Change the slider from Months to Days and then highlight the date range to restrict the report to.
  5. You could add filters (through the Slicers) to limit what kind of activity you want to look at.

Remember that only students who have data for that time period will show up. There is no easy way to get a list of students who aren't doing a particular thing.

Here is a video that walks you through the report.
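The timeline filter boils down to keeping the rows whose Last Access falls inside a date range. Here is a minimal sketch with invented rows and a hypothetical activeBetween helper. It assumes the Last Access values are ISO 8601 strings; the CSV contains local times, so the parsing may need adjusting for your file.

```javascript
// Made-up rows; Last Access is assumed to be an ISO 8601 string here.
const rows = [
  { 'Display Name': 'Cubic Dream', Title: 'Discussion 4: Hypotheses and Errors', 'Last Access': '2016-02-12T21:31:11Z' },
  { 'Display Name': 'Cubic Dream', Title: 'Discussion 3', 'Last Access': '2016-02-10T20:12:45Z' },
  { 'Display Name': 'Skinny Record', Title: 'Discussion 3', 'Last Access': '2016-01-20T15:00:00Z' },
];

// Hypothetical helper: names of students with any Last Access in the range.
function activeBetween(rows, start, end) {
  const from = new Date(start).getTime();
  const to = new Date(end).getTime();
  const names = new Set();
  for (const r of rows) {
    const t = new Date(r['Last Access']).getTime();
    if (t >= from && t <= to) names.add(r['Display Name']);
  }
  return [...names].sort();
}

console.log(activeBetween(rows, '2016-02-08T00:00:00Z', '2016-02-14T23:59:59Z'));
// [ 'Cubic Dream' ]
```

As with the pivot table, a student with no rows in the range simply does not appear in the result.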

Quick Tables

A quick table is a table that shows you information about a particular area. What we're going to do here is break down the information by the Class. You could just as easily choose the Category and some might find it more useful.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows, the Title to the Columns, and Class (or Category) to the Filters.
  3. Drag Views and Participations to the Values. Rename Views to be V (yes, just a single letter). Change "Count of Participations" to Sum and then rename it to just be the letter P.
  4. Rotate the titles in Row 4 so that they are vertical. Do not just click on Row 4, it doesn't work; you need to select the cells and then do the rotation. To rotate the text, click on the Orientation icon from the Home screen and choose Rotate Text Up. You may also want to Right align all the text from columns B on, although it may not really matter if you use a single letter for Views and Participations.
  5. Be prepared to be wowed!
  6. Click inside the pivot table, then at the top click on PivotTable Tools > Analyze. On the left, choose the pulldown menu next to Pivot Table Options (don't click on the word Options). Then click Show Report Filter Pages (this won't be available if you forgot to put something in the Filters box on the right). Then choose the filter(s) to use and click OK.
  7. What you get is a page for each type of content. The name of the student is on the side and the name of the content is across the top. This allows you to quickly (hence the name Quick Table) look at discussions or quizzes or external URLs or anything else in the class that is available.

Here is a video that shows all of this in action.

Viewing Who Has Not Participated

As mentioned above, what we are looking at is the Access Report data and it doesn't include information about students who are not engaging in your course. Luckily, there is one checkbox in Excel that we can check to get that information. The student will have needed to do something, anything, so that their name is in the Access Report data, but then we can see what they have not done.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows
  3. Drag the Participations to the Values. Excel wants to do "Count of Participations" instead of "Sum of Participations". You can change this in several places, but double clicking on the heading of the table is probably the quickest. Change it from Count to Sum and then change "Sum of Participations" to something else. You might want to use "Participations", but that's already used, so you have to pick something else.
  4. Choose Insert Slicer from the top and add slicers for Category (or class) and Title. Choose the content you want to view. What you currently have is a list of those who have participated.
  5. To get the list of those who have not participated, click on the student's name from the Rows field and choose Field Settings. Then click on Layout & Print. Check the "Show items with no data" and click OK.
  6. If you like, you can go to Data > Sort > A to Z to bring those who have not participated to the top.

Here's a video showing how it works.
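What the "Show items with no data" checkbox does is a set difference: everyone who appears anywhere in the data, minus those who participated in the chosen content. The sketch below shows that idea with invented rows (the third student name is made up) and a hypothetical nonParticipants helper. As noted above, a student who has done nothing at all in the course is not in the data and cannot show up here either.

```javascript
// Made-up rows using the column names from the Data Dictionary.
const rows = [
  { 'Display Name': 'Cubic Dream', Title: 'Discussion 3', Views: 4, Participations: 2 },
  { 'Display Name': 'Skinny Record', Title: 'Discussion 3', Views: 3, Participations: 0 },
  { 'Display Name': 'Third Student', Title: 'Quiz 1', Views: 1, Participations: 1 },
];

// Hypothetical helper: students present in the data who have no
// participations recorded for the given title.
function nonParticipants(rows, title) {
  const everyone = new Set(rows.map((r) => r['Display Name']));
  const participated = new Set(
    rows
      .filter((r) => r.Title === title && Number(r.Participations) > 0)
      .map((r) => r['Display Name'])
  );
  return [...everyone].filter((name) => !participated.has(name)).sort();
}

console.log(nonParticipants(rows, 'Discussion 3'));
// [ 'Skinny Record', 'Third Student' ]
```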

Updating the Script

Tampermonkey should attempt to update this script automatically for you. In case you have turned off that functionality or would like to update sooner, you can click on the Tampermonkey icon and choose Check for userscript updates.

The update process keeps your local settings and shows the changes to the code for you to decide whether to upgrade.

Canvancements

This script is a Canvancement -- a Canvas Enhancement. The links in the document point to an installable version of the code, but there is an Access Report Data project page as well that contains the source code as well as a version that you can use to anonymize the names like I did for the videos. Other projects, like the Roster Sorter that was mentioned here can be found on the Canvancement website as well.

203 Comments
a_eberhard
Community Novice

I've been in data heaven all day.  Thank you so much for this!

shauna_vorkink
Community Contributor

 @James ​ Thank You so much for sharing!  I have many schools asking about how they can get this type of data faster than one student at a time.  For those users who are not DataBase Analyzers and have not delved into the intricacies of Canvas Data this will be fantastic.  Your instructions and videos are so helpful -  Thank You Thank You!

cward
Instructure Alumni

 @James ​ this is awesome! We are aware of the need to have more access to course analytics; Know that it's something we're actively working on!

James
Community Champion
Author

Deactivated user​,

That's good news. I'm sure that what Canvas eventually comes up with will be much cooler than this is because you do all kinds of accessibility testing, don't require browser add-ons, make it look nifty, and all that other jazz that makes Canvas great.

Plus, you have access to the underlying database and I only have access to what you make available. This was my first time using the .json data from within Canvas (rather than through the API), so I didn't even have the documentation to go off of.

I consider anything I write to be temporary until something better comes along. If anything I do sparks an idea in someone, all the better. People are free to take what I've done and run with it -- I released the code under the ISC license​ -- and I don't even really care about the attribution, I just don't want people to sue me.

sgriffith
Community Contributor

Mind blown. I wish there was an emoticon for that. Thank you so much for sharing!

James
Community Champion
Author

 @sgriffith ​, I think there is actually an emoticon for that. Maybe it was a meme. Regardless, I'm glad it was helpful.

kona
Community Coach

I think it looks a little something like this... :smileylaugh:

JamesBeingAwesome.jpg

haysb
Community Novice

Thank you for this!  Super super helpful.  Any thoughts on what I would add to call the course section as well?  I've played with adding 'section', 'u.section' but no dice.  I would love to be able to filter all of this awesome information by section as well.

James
Community Champion
Author

The section isn't included by default in the course users list. To get it, you would need to add include[]=enrollments to the url on line 26 of the accessReport() function

    var url = '/api/v1/courses/' + courseId + '/users?enrollment_type[]=student&per_page=100';

becomes

    var url = '/api/v1/courses/' + courseId + '/users?enrollment_type[]=student&include[]=enrollments&per_page=100';

The problem is the way the enrollments object is returned. It's an array, so you need to loop through each item, looking for the one that has role=StudentEnrollment, and then grab the course_section_id. That's just a number, though. If you have permission to read the SIS information, there will be a sis_section_id and a sis_source_id. The sis_source_id has the short name of the section as part of it, but nowhere else in the report is that given. But if that's the information you want, you could append it to the userData object, which is keyed off their Canvas User ID.
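As a rough sketch of that loop, assuming each user object comes back with an enrollments array once include[]=enrollments is added to the URL. The helper name sectionIdsForUser is made up for illustration and is not part of the script; it returns every student section it finds, since a student may be enrolled in more than one.

```javascript
// Hypothetical helper, not part of the Access Report script.
// Assumes `user.enrollments` is the array returned with include[]=enrollments.
function sectionIdsForUser(user) {
  const sectionIds = [];
  for (const enrollment of user.enrollments || []) {
    // Only student enrollments matter here; skip teachers, TAs, and so on.
    if (enrollment.role !== 'StudentEnrollment') {
      continue;
    }
    // A student can be in several sections; record each section only once.
    if (!sectionIds.includes(enrollment.course_section_id)) {
      sectionIds.push(enrollment.course_section_id);
    }
  }
  return sectionIds;
}

// Made-up sample data for illustration.
const sampleUser = {
  id: 42,
  enrollments: [
    { role: 'StudentEnrollment', course_section_id: 1001 },
    { role: 'TaEnrollment', course_section_id: 1002 },
    { role: 'StudentEnrollment', course_section_id: 1003 },
  ],
};
console.log(sectionIdsForUser(sampleUser));
```

The resulting list could then be attached to the entry for that user in userData, keyed by the Canvas User ID.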

To reliably add the section, you would need to make another API call, List Course Sections. The good news is that it's unlikely to need pagination (unlikely to have more than 100 sections in a course), but the AJAX calls are asynchronous, so you'll need to make sure it's processed before you start generating the CSV file. You would still need to get the enrollments, although you can include[]=students on the sections API call and possibly replace the get course API with the get sections API and get the same information plus the enrollments.

I wasn't concerned with sections when I generated the data, so I didn't even think to put it in there. I need to get caught up in grading, but if I find some time, I might revise it. But in the meantime, that's how you would need to go about getting it.

James
Community Champion
Author

 @haysb ​,

I went through and modified the report so that it now includes the Section information by grabbing the section information rather than the course information.

There are possibly four extra fields now, depending on your permissions:

  • Section Name
  • Section ID (Canvas ID)
  • SIS Section ID
  • SIS Course ID

To update a script, you can uninstall / reinstall, or follow these instructions:

Firefox / Greasemonkey

  1. Click on the Greasemonkey icon pull-down and choose Manage User Scripts
  2. Right click on Access Report Data and choose Find Updates. You may need to Force Find Updates if you've altered your local copy

Chrome / Tampermonkey

  1. Click on the Tampermonkey icon and choose Check for userscript updates

You may need to confirm the updates

d_ellis
Community Contributor

Fantastic, I love it! It would also be nice to create an activity/page-specific version - e.g. generate a list of which students have or have not viewed a specific page. You can derive that from this report (e.g. sort by Title) but a page-centric report would also be useful.

James
Community Champion
Author

 @d_ellis ​,

You can tell Excel to show who didn't participate by checking a box. I wasn't aware that was an option until your post, so thank you. I've gone back and updated the document to show that and added a video demonstrating it as well.

If I understand what you mean by page-centric (have a button on each page rather than just on the people page), then you are free to take what I've written and make a separate page-centric version, but it's not as easy as you might think​ and I won't be working on it for the reasons I gave in that post.

haysb
Community Novice

Hi James, thank you so much!  I really appreciate this :)

alyssa_beckwith
Community Novice

Hi,

I just followed the download and install instructions.  When I clicked on the Access Report Data link, the following error message was displayed:

171445_pastedImage_0.png

Please assist.

Thank you!

James
Community Champion
Author

 @alyssa_beckwith ​,

  • What operating system and browser are you using?
  • Do you have names of assignments or students using international or special characters? If so, can you let me know which characters so I can test?

The examples I found on the Internet for creating a .CSV file wanted you to URI encode the data, but it didn't work when I tried that. All it did was vastly increase the size of the file and make it come through on a single, unreadable line. I want it to use UTF-8 encoding, but none of my assignments have special characters outside the Latin1 range. The Latin1 range is mostly what we use in English with a few accented characters thrown in.

Anything copy/pasted from Word is likely to cause problems as it likes to use smart (curly) quotes and unicode. You might have a special character in a file name. It might be a name of a student.

alyssa_beckwith
Community Novice

Thanks for the quick response James!  I am using Windows 7 Enterprise, and Chrome as my browser.  None of the students' names and assignment names use any international or special characters.

James
Community Champion
Author

I'm able to reproduce your error (Windows 10 with Chrome) and it happened when I inserted a special character into an assignment name. So, I'm going to guess that you have some in there somewhere even though you may not be aware of it.

I'll work on it with my bogus UTF8 character and see if I can get the problem fixed.

James
Community Champion
Author

 @alyssa_beckwith ​,

I've updated the code to work with the special characters in my test case. I stopped using the btoa() function altogether and went with URI encoding. It makes the file bigger and less readable (to a human), but it handled the non-Latin1 characters. I feel there's probably a better way overall to do it, but I don't have time to look into it right now.

I'm sorry it took so long to get back to you, but UTF-8 support is always troublesome when you try to move information from one program to another. I could get the script to encode the special character, but then Excel was converting it into two characters instead of the one it was supposed to be. I had to include the byte order mark (BOM) in the download to tell it that it was a UTF-8 file (Excel seems to be ignoring the directive charset=utf-8). There is a library for creating and downloading files on the fly. I had avoided using it, but it probably takes care of all of the issues, so I may need to look into it in the future.
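For anyone curious, here is a minimal sketch of the idea: URI encode the CSV text and put the byte order mark in front of it so Excel treats the file as UTF-8. The function name csvDataUri is hypothetical and the script's actual code may differ in the details.

```javascript
// Hypothetical helper, not the script's actual code.
// Builds a data URI for a CSV download that Excel should read as UTF-8.
function csvDataUri(csvText) {
  // U+FEFF is the byte order mark; URI encoded as UTF-8 it becomes %EF%BB%BF.
  const byteOrderMark = '\uFEFF';
  return 'data:text/csv;charset=utf-8,' + encodeURIComponent(byteOrderMark + csvText);
}

// A name with an accented character and a line break between rows.
console.log(csvDataUri('Name,Title\r\nJosé,Quiz 1'));
```

The encoded version is longer than the original text, which is the size increase mentioned above, and a data URI can still run out of room for a very large report.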

With Chrome, you can click on the Tampermonkey icon and say Check for Userscript Updates to get the latest version.

d_ellis
Community Contributor

Ah, thank you, that's a clear summary of the obstacles. An addition to the API that allows a query for access stats at the page (rather than student) level would be helpful to overcoming some of them - although not the question of where in the UI it would fit.

alyssa_beckwith
Community Novice

Thanks James.  I will try that and let you know how it goes.

alyssa_beckwith
Community Novice

IT WORKED!!!  Thanks James for taking the time to find a solution for me.

James
Community Champion
Author

Glad to hear it! It also probably fixes issues for non-English installations.

I did play around with the FileSaver.js library last night. It converts things into blobs and allowed up to like 500 MB in a download, so I may switch over to that at some point to avoid the encoding issues and running out of space with a Data URI. That should be a behind the scenes thing. I also discovered that by default, Tampermonkey has automatic updates enabled.

nresearcher
Community Novice

This seems really helpful! I would love it if I could also access information about course instructors in a similar fashion, to see how they are using the Canvas site - is this possible?

shauna_vorkink
Community Contributor

 @nresearcher ​

This isn't exactly the same process as described above, but it does allow you to check up on how your instructors are utilizing Canvas. There was a presentation done this last year at InstructureCon about a tool that can be integrated with Canvas.

James
Community Champion
Author

 @nresearcher ​, it is possible as there is an access report for the instructor, but it's not really what I intended for this particular script. I was writing it for the instructor who wanted to see what their class was doing. It returns the entire class at a time, but I think what you're describing is returning all the instructors at a time or perhaps returning all courses for a single instructor at a time.

The access report is specific to what a user has done in a course, so the process would essentially be 1) gather a list of active courses, 2) get the instructors for those courses, 3) get the access page for each of those course/user combinations, 4) generate an Excel (.CSV) file. It's a different set of API calls, but essentially the same framework.

The link would need to go in the Admin page for the account maintenance page rather than on the people page within the course.

Most of the information in the access report is also found in the requests table from Canvas Data, so if you have access to that, it would probably be faster and more insightful. There is an issue of the data being a minimum of 18 hours old, but I don't know how much real-time analytics you need.

I know that AspirEDU makes a product called Instructor Insight that provides some of this for analysis for the instructor and  @Chris_Munzo ​ could provide more information about that.

Chris_Munzo
Partner

 @nresearcher ​ As  @James ​ mentioned, Alliance Partner - AspirEDU​ has an analytics solution called Instructor Insight that uses a nightly feed of data from Canvas to provide metrics on the performance and engagement of instructors in their courses.  This includes an assessment of class risk. There is a brief video demo on our web site. Let me know if you'd like to learn more.

James
Community Champion
Author

 @nresearcher ​,

Consider this proof of concept and not ready for prime time, but ...

I modified the report to generate mostly the same thing for instructors. It's not as well debugged as the one for students, but it works for me. It adds a button to the Managed Accounts > account name > Sub Accounts page called "Instructor Access Report Data". I put it there so that you could limit it by subaccount if you wanted to.

It will run on your main account, but it includes all sub-accounts as well. And if you have a large Canvas installation, it will most likely bomb. I ran it on our main account over a 12 Mbps DSL connection and it took 3.5 minutes to download the list of accounts. In total, it took 14 minutes to run and I had to tell Firefox to keep running the script. RAM usage swelled to over 4 GB and CPU usage to 14%. It had 3,152 courses and 199,207 rows of information. The file was 48 MB in size.

Running it on a sub-account with 85 courses was much more manageable.

I did, however, discover some CSV encoding bugs that I will need to fix in the main version as well. I thought I had checked for quotes, but there were some instructors who put "points" (with the quotes) in an assignment name and it caused problems with the CSV import.
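A minimal sketch of the usual CSV quoting rule, for reference: wrap a field in double quotes when it contains a quote, comma, or line break, and double any quotes inside it. The names csvField and csvRow are hypothetical and this is not necessarily how the script ended up handling it.

```javascript
// Hypothetical helpers illustrating standard CSV quoting.
function csvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  // Quote the field only when it contains a character that would confuse the parser.
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

function csvRow(fields) {
  return fields.map(csvField).join(',');
}

// An assignment name with "points" in quotes, like the one that caused trouble.
console.log(csvRow(['Smith, Pat', 'Essay worth "points"', 3]));
```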

What I really need to do is limit it to just the current term, but the "current term" isn't always singular for institutions and others don't put courses into terms (but they should). I also need to optimize some of the queries (especially on fetching the courses).

So, it works, sort-of, but if you would like to test it and provide feedback, you can.

The main page is on my Canvancement / Roster page, click on access-report-instructor.user.js to see the code and then the Raw button to install it. I don't want to provide a direct link here because I want people to consciously decide to install it and understand the ramifications of just what that means. I've only tested it with Firefox right now.

After writing the user script, I'm even more convinced that Canvas Data is the way to get at this information.

nresearcher
Community Novice

Thank you for these tips James (and others)! Unfortunately, I was unable to get the draft instructor report to run, but if you think that Canvas Data is the best way to get at this information (which we have access to), then I think we'll focus our efforts more on that option. However, if a "prime time" version ever becomes available, I think programs like NYCLA would find it really helpful in helping us evaluate and improve our programs.

James
Community Champion
Author

I definitely think Canvas Data is better for this kind of report. There is just too much information to get real-time unless you narrow it down by current courses (and that may still be too much depending on the institution) or by the instructor. That is a possibility, but not sure where you would add the button for it.

james_acevedo
Community Explorer

Thank you for sharing. Works nicely and collects a ton of info. Very useful. Will show this to others here to generate ideas how we can use.

shane_ohara
Community Champion

This is EXCELLENT, James. Thank you for sharing it with all of us.

Shane

caitlin_stiles
Community Contributor

Hi James- thank you so much for this. This is super interesting and exactly the kind of data we're looking for.

Any idea from a Canadian privacy stand-point where Greasemonkey stores their data? Canada has different privacy laws and I need to be sure that we're abiding by those. Any information would be helpful!

cward
Instructure Alumni

Caitlin,

Greasemonkey doesn't store any data remotely, it's a way for end users to add custom functionality to existing web pages using Javascript (you can read the Wikipedia article here explaining in more detail: Greasemonkey - Wikipedia, the free encyclopedia ). Any data you pull using James's script will be stored locally on your machine, pulled directly from Canvas.

Hope this helps,

Chris

James
Community Champion
Author

Thanks for jumping in with the explanation, Deactivated user​.

 @caitlin_stiles - everything Chris said is correct. The user script (the program code) is stored on your local machine. Greasemonkey and Tampermonkey fetch the information by issuing GET commands to a web server (hosted by Canvas) like what would happen if you clicked on a link. It then downloads the information to your machine to do processing. This is what Canvas does when you visit the People (Roster) page (as well as some other pages). It sends the information about the users in a computer-readable form and then runs a script on your local machine that formats and presents the information to you. The difference is that you're adding functionality via the user script run by Greasemonkey, while the code to display the Roster page comes automatically from Canvas when you download the page. The other difference is their code is written by professional programmers and goes through review and is of a higher caliber than what I write, which is intended to meet a need, but may not work for everyone (for example, I do not even pretend that it will work with MSIE or Edge browsers). They also make sure their code doesn't break something else - whereas mine may stop working upon any release that changes functionality. But if it does, let me know and I'll try to get it working again.

The user script doesn't save anything to your hard drive, unless you save the Excel (.CSV) file that it generates. The browser may cache the information that it downloads like it caches any page that you visit on the web, but nothing is sent to or stored on a remote server (Canvas probably logs the request for information, but the script doesn't store anything).

stelpstra
Community Champion

Awesome contribution!

f000f2p
Community Novice

Great contribution! The script gets all students access report like a gem. I have a question on obtaining a list of all instructors/TAs instead of students in a course site and fetch their access report data. I tried, but I cannot seem to make it work. Is it possible? Thank you!

James
Community Champion
Author

Yes, it is possible.

Are you talking about the script mentioned in this post (inside the current thread) or are you talking about for just a specific course?

The first one was rough and worked for me (small school) but took like 15 minutes to run. Dartmouth would probably crash it unless you use subaccounts and it still may crash then.

If you're talking about for just a single course and want the instructors, you probably need to use a different API call in line 26 of the main script to get the list. I had originally used the course enrollments but then found that /sections was faster for the purpose here.

f000f2p
Community Novice

Thank you for the prompt response! I am talking about for just a specific course. I have tried to use a different API call in line 26 of the main script (https://github.com/jamesjonesmath/canvancement/blob/master/roster/access-report/access-report.user.js), var url = '/api/v1/courses/' + courseId + '/enrollments?per_page=100', to get the list but still need to edit other parts of the script. What does the 'include[]=students' add to the section object? Does it add user_id? Thanks!

James
Community Champion
Author

The sections object is explained in the API. The include[]=students includes a list of the students with the list of the sections. It does provide the user ID, but only for students, not for instructors.

You'll need to switch to a completely different API call. You could use List users in a course​ and specify the enrollment_type

     var url = '/api/v1/courses/' + courseId + '/users?enrollment_type[]=teacher&enrollment_type[]=ta&per_page=100';

Within the sections API call I was using, the student IDs were contained in a field called students. It was an array within an array and the information I needed was in the inner array.
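Roughly, the nested loop over the sections response looks like the sketch below. The function name studentsFromSections and the output field names are made up for illustration, and it assumes each section object carries id, name, and a students array that may be missing or null when the section is empty.

```javascript
// Hypothetical sketch of walking the "array within an array".
// `sections` is assumed to be the parsed response from the sections call
// made with include[]=students.
function studentsFromSections(sections) {
  const rows = [];
  for (const section of sections) {
    // Outer array: sections. Inner array: the students in that section.
    for (const student of section.students || []) {
      rows.push({
        userId: student.id,
        name: student.name,
        sectionId: section.id,
        sectionName: section.name,
      });
    }
  }
  return rows;
}

// Made-up sample data for illustration.
const sampleSections = [
  { id: 1, name: 'Section A', students: [{ id: 10, name: 'Ann' }, { id: 11, name: 'Bob' }] },
  { id: 2, name: 'Section B', students: null },
  { id: 3, name: 'Section C', students: [{ id: 10, name: 'Ann' }] },
];
console.log(studentsFromSections(sampleSections));
```

A student enrolled in two sections shows up once per section, which is why the section information has to be handled separately from the per-student data.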

When you use the List users API, there is just a single array, you don't need to look inside of each item to get the list, the object returned is the list. There isn't a need for the nested loops or anything involving section or section.students -- with that API call, the udata is essentially the same thing as section.students. The problem with this call is that you won't know what section the TA or teacher is enrolled in. For us, that's not an issue as we put the teachers into the course via sis import rather than into a section and we don't generally have TAs, but it might be for some people.

To get the section from the List Users API, you'll need to add include[]=enrollments to the query string and then go through and iterate over the enrollments (which may be more than one per user) to determine which section(s) they belong in. It would be easier to remove the information about section, but some schools might want to know that. My initial release didn't have the section, it was added in response to a request, so you might be able to view the history to see what changed on February 17 when I added the section. You don't want to just take the previous version as other fixes have been applied.

f000f2p
Community Novice

That is great, thank you! On a separate note, I wonder whether it is possible to use a similar method (a user script) to pull Student Analysis data in Quiz Statistics for all quizzes at once? Currently, the tool allows faculty to obtain the Student Analysis data for one quiz at a time.

James
Community Champion
Author

I don't think a user script would be the place for that one. User scripts interact through the browser, so you would need to visit the page, simulate clicking on the button to generate the report, wait for the report to complete, and then download. However, user scripts run on a single page, so it's going to be challenging to get it to navigate to the quiz statistics page and do its thing while staying on the current page. The Access Report data doesn't do anything dynamic, so I can load the page through AJAX calls but I don't have to process anything. That's not the case with the Quiz Statistics.

It would be much easier to use the Quiz Reports API. Write a program that requests each of the reports be generated. Then keep checking back until all of the reports are finished. Finally download them all and combine them into a single report.

But all that is overlooking one major problem. Quizzes have different questions and so the Student Analysis quiz reports are basically incompatible with each other. To combine them, you would need to stack the question information while retaining the information at the beginning and end. Columns would contain: name, id, sis_id, section, section_id, section_sis_id, submitted, attempt, question_id, question_response, question_score, question_possible, question_correct. A 50 item quiz would have 50 entries for each student. It's beginning to look a lot like what Canvas Data does.
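To make the stacking concrete, here is a small sketch. It assumes the reports have already been parsed into one object per student attempt with an answers array, which is a hypothetical shape and not what Canvas hands you directly; stackQuizAttempts and the input field names are invented for the example, and only a few of the columns listed above are carried through.

```javascript
// Hypothetical sketch: turn one-row-per-attempt data into one row per question.
function stackQuizAttempts(quizTitle, attempts) {
  const rows = [];
  for (const attempt of attempts) {
    for (const answer of attempt.answers) {
      rows.push({
        quiz: quizTitle,
        // Student information is repeated on every question row.
        name: attempt.name,
        id: attempt.id,
        attempt: attempt.attempt,
        // Question information, one question per row.
        question_id: answer.question_id,
        question_response: answer.response,
        question_score: answer.score,
        question_possible: answer.possible,
      });
    }
  }
  return rows;
}

// Made-up sample data: one student, two questions.
const sampleAttempts = [
  {
    name: 'Ann', id: 10, attempt: 1,
    answers: [
      { question_id: 501, response: 'B', score: 1, possible: 1 },
      { question_id: 502, response: '42', score: 0, possible: 2 },
    ],
  },
];
console.log(stackQuizAttempts('Quiz 1', sampleAttempts).length);
```

Rows from different quizzes can then be appended to the same file because every row has the same columns.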

A normalized database structure would be more useful with one entry for name, id, sis_id, section, section_id, section_sis_id, submitted, attempt, score, number_correct, number_incorrect and then one to many key to another table that contains all of the question information. But you can only have one sheet at a time in CSV format.

For all of those reasons, I don't think a user script is the way to go.

f000f2p
Community Novice

Really appreciate your prompt response!

Canvas admin can write scripts to query against APIs to harness Canvas data with a token, but not sure what kind of API requests that faculty can make without a token. It does not appear that faculty can generate a token?

James
Community Champion
Author

From outside a browser that has already logged into Canvas, you can't do much with the API without a token.

However, unless the school has blocked access to the token generation page, then faculty can generate a token, students can generate a token, any user can generate a token.

It's not as easy as adding a button within Canvas, but several people have used the Google Sheets I've written to adjust due dates, count discussion posts, etc, and they all use a user-generated token.

d_ellis
Community Contributor

Is there an upper limit on course size with this report? I ask because I have tried it out in a course with 2500 students and it is returning a network error.

James
Community Champion
Author

There's not a hard limit coded into it, but there probably is a practical limit. For instance, it could be limited by the number of requests that it has to make. It has to generate a request for each student, so you're going to be sitting there hammering the Canvas servers with 2500 requests and that's after you obtain the list of all of the students, which was at least 25 requests. If you're hitting a server with 2500 requests within a couple of minutes, there's a good chance it will take efforts to slow you down.
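The arithmetic behind those numbers, as a quick sketch. The function name estimateRequests is made up, and it assumes the student list is fetched 100 per page with one access report request per student.

```javascript
// Hypothetical helper: estimate how many requests a run would make.
function estimateRequests(studentCount, perPage = 100) {
  // One request per page of the student list...
  const listRequests = Math.ceil(studentCount / perPage);
  // ...plus one access report request per student.
  const reportRequests = studentCount;
  return { listRequests, reportRequests, total: listRequests + reportRequests };
}

console.log(estimateRequests(2500));
// → { listRequests: 25, reportRequests: 2500, total: 2525 }
```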

The Canvas API Policy says that there is a rate limit.


"Applications that access the Canvas API must not place undue load on Canvas servers. Canvas has an automatic rate limiting provision that dynamically adjusts as more concurrent and/or expensive requests happen. When the rate limit is exceeded, API requests will fail. Rate limiting is enforced per access token, so that partners that perform requests on behalf of multiple end-users will not be throttled per access token that they hold."

Because of the overhead involved, I imagine that courses with lots of students will run into many issues with interactive data gathering done through the browser. A better way would be to gather it outside the browser through some other means. Unfortunately, some of the information isn't available outside the web interface.

d_ellis
Community Contributor

That's really good to know, thank you. When you say "other means" are you thinking of the Google Data Service?

I wonder if altering the code to create an access report for each section individually would get under the rate limiting provision. I may try to code that myself, if you think it would have a possibility of success.

James
Community Champion
Author

I was not thinking of Google Data service -- I have no experience there.

Google Sheets might seem reasonable, but they have a limit of 5 minutes execution time and the API calls are not asynchronous there unless I missed something. That would slow it down, but the call is not to the API, it's to a page within Canvas, which means that you need to be logged into Canvas and that's going to be difficult to do from a Google Sheet.

The information might be obtainable through Canvas Data, but I wrote the script primarily for the teachers who wouldn't have access to it or didn't want to wait a couple of days for the latest information to appear.

I was thinking of a headless browser like PhantomJS or CasperJS that you could use to log into Canvas and then make the calls. You could program delays into them. Speaking of which, you could probably add delays to the code I wrote between each new API call, but that involves setTimeout and would complicate things, so I didn't bother. For smaller courses, it's not a problem to get it all at once and they wouldn't want an extra delay.
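If someone wants to experiment with delays, one simple approach is to stagger the requests with the browser's setTimeout so that each one starts a fixed interval after the previous one. This is only a sketch: staggerRequests and fetchOne are hypothetical names, the 250 ms figure is arbitrary, and the existing code would still need changes to know when the last request has finished.

```javascript
// Hypothetical sketch: start one request every `delayMs` milliseconds.
// `fetchOne` stands in for whatever function requests a single student's report.
// `schedule` defaults to setTimeout and is a parameter only so it can be swapped out.
function staggerRequests(userIds, delayMs, fetchOne, schedule = setTimeout) {
  userIds.forEach((userId, index) => {
    // The first request fires right away, the second after delayMs, and so on.
    schedule(() => fetchOne(userId), index * delayMs);
  });
}

// Example use (left commented out so nothing runs on its own):
// staggerRequests([101, 102, 103], 250, (id) => console.log('fetching', id));
```

At 250 ms apart, 2500 students would take a little over 10 minutes just to send the requests, which is the trade-off for going easier on the server.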

Depending on the size of your sections, obtaining the report by section would be a great way to go as long as you took a break between each section. You would manually have to combine the reports together. You would also want to use a different API call to get the list of the students since you wouldn't want to download all of the students every time you processed a section.

I've got a revision that I've been meaning to test for probably a month now that adds a progress bar to the download so it looks like it's actually doing something. I also wrote some other code that parallelizes the calls when pagination is involved that might speed up obtaining the list of students from the API. I've got too many projects going on and not enough time to finish them.

My best guess though, is that you're just making so many requests that Canvas is shutting you down, and that you need to break it up somehow.

d_ellis
Community Contributor

I don't expect you to do any recoding on the basis of my edge case! I know enough Javascript to be dangerous so I may just try your suggestion to add a delay between calls - I'll report back if it works.

cjford
Community Member

Thanks for sharing, James. Cool stuff!

I'm curious about that request you're making for usage data (courses/[course_id]/users/[id]/usage.json). It doesn't go through the official API and I haven't been able to find any documentation about arbitrary JSON responses like that. Any chance you (or anyone else) could shed some light on that? Are there any other undocumented endpoints? How do you go about finding them?

James
Community Champion
Author

It depends on your definition of undocumented. They are technically documented in the source code, but what I did was to go to the page and look at what the link was doing. The browser's inspector (usually F12) is great for doing this, looking at what network calls were made and what parameters are passed. Many of the actual calls are made from within the JavaScript source files.

If you're going to ask if there is a list of undocumented calls somewhere, then I'm pretty sure the answer is going to be no, otherwise they would be documented.