Course total file size

RichC
Community Participant

Is there a way to work out the total file size of a course including any videos that have been added through Studio? 

This doesn't need to include any files uploaded by students.

I can get the 'files uploaded to course' size but this doesn't include any videos. When I'm in Studio for the course, I can't see any options to do with file size.

All I can think of is counting videos in the Course Studio and working out a rough size per video and multiplying them. This would take ages and not be very accurate.

Any ideas are greatly appreciated. Thanks

James
Community Champion

@RichC 

I've got a little more time now to dig into things.

What you want can be obtained through the Public API. There's a link at the top about how to integrate it with their OAuth.

I gave the link to the Public API documentation on Canvas' site, but if you take your Canvas Studio URL (something like <instance>.instructuremedia.com) and add /api/public/docs/ to the end of it, you will get one for your specific instance. In theory, there's an authorize button that allows you to try things out.

I was originally unsure whether to use the public API or the private one (the Arconauts API). The public API documentation doesn't show the size being returned, but when I tested it, the size was there.

Be aware that looking at the size of videos can be misleading. There may be multiple transcodings of each video (high, medium, low quality), and a video shared between different courses is stored only once. Together, those throw off the traditional notion of storage used per course.

Here's a general workflow to get the original upload size for all videos within a course. When I say <instance>, I'm talking about xxx.instructuremedia.com, where xxx is your Canvas instance. For us, our Canvas instance is richland.instructure.com so our Canvas Studio instance is richland.instructuremedia.com.

Start by getting the collection ID using the get courses endpoint. The {course_id} is your Canvas course ID. In my example, the Canvas course ID is 3903130, and the id property in the response tells me my collection ID is 24709.

GET <instance>/api/public/v1/courses/3903130

{
  "course": {
    "id": 24709,
    "name": "MATH 113 - Stats (SP24)",
    "type": "course_wide",
    "created_at": "2023-12-08T00:45:26Z",
    "course_id": 3903130,
    "owner": null
  }
}
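Here is a minimal Python sketch of that first step, in case it helps. The function names are my own, the https:// prefix is an assumption, and the OAuth step is left out entirely. It only builds the URL and pulls the collection ID out of a response shaped like the one above.

```python
import json


def course_url(instance: str, course_id: int) -> str:
    """Build the get-course URL.

    `instance` is something like "richland.instructuremedia.com".
    The "https://" scheme is an assumption.
    """
    return f"https://{instance}/api/public/v1/courses/{course_id}"


def collection_id_from_course(body: dict) -> int:
    """The collection ID is the `id` inside the `course` object."""
    return body["course"]["id"]


# The response from the example above, parsed as JSON.
sample_body = json.loads("""
{
  "course": {
    "id": 24709,
    "name": "MATH 113 - Stats (SP24)",
    "type": "course_wide",
    "created_at": "2023-12-08T00:45:26Z",
    "course_id": 3903130,
    "owner": null
  }
}
""")

collection_id = collection_id_from_course(sample_body)
print(course_url("richland.instructuremedia.com", 3903130))
print(collection_id)
```

Note that the course_id in the response is the Canvas course ID you asked about, while id is the Studio collection ID you need for the next request.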

Now that I know the collection ID, I can list the media in a collection. The {collection_id} is the ID that you just obtained from the course request. This returns an array of the media in the course. It will most likely be paginated, with a maximum of per_page=50. If you ask for more than 50, you will get erroneous results in the metadata.

GET <instance>/api/public/v1/collections/24709/media

{
  "media": [
    {
      "id": 42563,
      "title": "Intro to Data Prezi",
      "description": "this Prezi introduces the topic of data.",
      "duration": 333.927,
      "created_at": "2020-08-10T20:19:10Z",
      "thumbnail_url": "<truncated>/thumbnail?width=540&height=320",
      "transcoding_status": "transcoding_finished",
      "size": 36648249,
      "source": "upload",
      "owner": {
        "id": 168,
        "full_name": "James Jones",
        "display_name": "James Jones",
        "email": "james@richland.edu"
      }
    }
  ],
  "meta": {
    "current_page": 1,
    "last_page": 13,
    "total_count": 241
  }
}

There were 19 other media items on that page that I truncated to save space. The metadata at the bottom tells me there are 241 videos in total, with 12 more pages to go through. The pagination uses the page and per_page parameters.

What you're looking for is the size property. Here it's 36648249 bytes (about 36.6 MB). Also note that the source is "upload", which tells me it's an actual upload as opposed to a linked YouTube video.
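If you want the byte count in friendlier units, the arithmetic is simple. This is just a sketch with a helper name I made up. It shows both decimal megabytes and binary mebibytes, since the two differ by about 5%.

```python
def describe_size(size_bytes: int) -> str:
    """Format a byte count as decimal MB and binary MiB."""
    mb = size_bytes / 1_000_000        # 1 MB  = 10**6 bytes
    mib = size_bytes / (1024 * 1024)   # 1 MiB = 2**20 bytes
    return f"{size_bytes} bytes = {mb:.2f} MB ({mib:.2f} MiB)"


print(describe_size(36648249))
```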

Now you just need to iterate through all of the pages and get the information.
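A rough Python sketch of that loop is below. It assumes the response shape shown above (a media array plus meta.last_page). The function names are mine, and so is the https:// prefix. The actual HTTP call is passed in as fetch_page, a hypothetical function you would write yourself to make the authenticated request and return the parsed JSON for one page.

```python
from typing import Callable, Iterable, Iterator
from urllib.parse import urlencode

MAX_PER_PAGE = 50  # larger values gave erroneous metadata in my testing


def media_page_url(instance: str, collection_id: int, page: int,
                   per_page: int = MAX_PER_PAGE) -> str:
    """Build the URL for one page of a collection's media."""
    if not 1 <= per_page <= MAX_PER_PAGE:
        raise ValueError(f"per_page must be between 1 and {MAX_PER_PAGE}")
    query = urlencode({"page": page, "per_page": per_page})
    return (f"https://{instance}/api/public/v1/collections/"
            f"{collection_id}/media?{query}")


def iter_media(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every media item, following meta.last_page.

    fetch_page(page) is supplied by the caller and returns the parsed
    JSON body for that page.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["media"]
        if page >= body["meta"]["last_page"]:
            return
        page += 1


def total_upload_bytes(media: Iterable[dict]) -> int:
    """Sum `size` over items whose source is "upload"."""
    return sum(item.get("size") or 0
               for item in media
               if item.get("source") == "upload")
```

Usage would look like total_upload_bytes(iter_media(my_fetch)), where my_fetch requests media_page_url(...) for the given page. Keep in mind this totals the original upload sizes only, not the transcoded copies.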

You really need to watch throttling with the Canvas Studio API. If you make requests too quickly, you can get blocked. With the Canvas REST API, you start with x-rate-limit-remaining at 700 and can basically make requests as fast as you want, as long as you let one finish before starting the next. That is not true of the Canvas Studio API. There is no x-rate-limit-remaining in its response headers, just an x-runtime. I've done more exploring with the old (now unpublished) Arconauts API, where the limit started at 70 (I think) and would drop quickly.

In practice, I've found that sustained requests to the API should not exceed 1 request per second. Anything faster than that and it eventually runs out. For small requests, you might be able to exceed that. It is far more sensitive to throttling than the Canvas REST API.
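One simple way to hold yourself to that pace is to enforce a minimum gap between requests. The sketch below is my own helper, not anything provided by Canvas Studio, and the 1 second default is just the rule of thumb from my testing. The clock and sleep arguments exist only so it can be tested without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval: float = 1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one Throttle() and call its wait() method immediately before each request to the Studio API. The first call returns right away, and later calls sleep only as long as needed.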
