Hey. I am trying to build JSON objects to work with down the line. However, I am running into an issue and am hoping I can get some help.
I want to store all pages of a Canvas course in the JSON 'page_url' field, but I am only getting the last page in the iteration for each course. Any ideas on how to get ALL pages in a course stored in 'page_url' in my JSON file?
I've been wracking my brain for a solution but I'm lost, and every time I take 1 step forward, I take 3 steps back.
import requests
import json
import time
import re

# Start timer for code to run :)
# Opens file with all courseIDs
# And creates the report file (myfile.json)
start = time.time()
DataFile = open("PATH_TO_FILE","r")

#secret token and course ID
secret_token = "MY_SECRET_KEY"
course_id = DataFile.read().split()
htmlUrlVariable = None
titleVariable = None
bodyVariable = None

for c in course_id:
    print(c)

    ###### Code to GET course information
    Get_url = "https://INSTITUTION_URLapi/v1/courses/"+c+"/pages?per_pages=9999"
    headers = {'Authorization' : 'Bearer ' + secret_token}
    r = requests.get(Get_url,headers = headers)
    User_dict = r.json()

    for i in User_dict:
        pageIdToStr = str(i['page_id'])

        # Make second API call to page body to get <HTML> <body> code
        Get_url = "https://INSTITUTION_URLapi/v1/courses/"+c+"/pages/"+pageIdToStr
        headers = {'Authorization' : 'Bearer ' + secret_token}
        x = requests.get(Get_url,headers = headers)
        pages_dict = x.json()
        htmlUrlVariable = pages_dict['html_url']
        titleVariable = pages_dict['title']
        bodyVariable = pages_dict['body']
        finalURL = print(htmlUrlVariable)

    ###### Code to find sub-account ID
    Get_url = 'https://INSTITUTION_URLapi/v1/courses/'+c
    headers = {'Authorization' : 'Bearer ' + secret_token}
    e = requests.get(Get_url,headers = headers)
    subAccount_dict = e.json()
    courseNameVariable = subAccount_dict['name']
    subID = subAccount_dict["root_account_id"]
    subVariable = (str(subID)) # Variable needed to print to final report

    ###### Code to find course instructor with id role of 5020 (teacher role)
    Get_url = 'https://INSTITUTION_URLapi/v1/courses/'+c+'/enrollments?per_page=9999'
    headers = {'Authorization' : 'Bearer ' + secret_token}
    e = requests.get(Get_url,headers = headers)
    enroll_dict = e.json()

    # Iterate through the JSON array
    for item in enroll_dict:
        teacher = item["role_id"]
        if (teacher == 5020):
            facultyIdVariable = item["role_id"]
            instructorVariable = (item['user']['name'])
            instructorUserNameVariable = item['user']['login_id']

    #### Writes one course detail to JSON object. Used for testing
    canvasData = {
        'instructor': instructorVariable,
        'course_Name': courseNameVariable,
        'page_url': htmlUrlVariable,
        # 'body' : bodyVariable
    }
    json_object = json.dumps(canvasData, indent=4)
    with open("PATH_TO_FILE", "a") as outfile:
        outfile.write(json_object)

print(">>>>Process complete! It took", time.time()-start, "seconds to complete.\n")
Once you get everything working, 9999 isn't going to get all of your pages. There is a hard limit of 100 per request in most places. You really need to pay attention to the pagination documentation if you want to make sure that you're getting all of the pages.
The reason you're only getting the last URL is that you are storing the page URL in a string, which can only hold one value at a time. Each pass through the loop overwrites it, so it only remembers the last value you give it.
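For anyone following along, here is a rough sketch of what following the pagination could look like in Python. It assumes the Canvas API returns a `Link` response header with a `rel="next"` entry while more results remain, which the `requests` library exposes as `response.links`. The function name `fetch_all` and its `get` parameter are made up for this illustration; `get` is any callable that behaves like `requests.get`. Note also that the query parameter is spelled `per_page` (the first URL in the question uses `per_pages`).

```python
def fetch_all(get, url, headers, per_page=100):
    """Collect every item from a paginated list endpoint.

    `get` is assumed to behave like requests.get: it returns a response
    object with .json(), .raise_for_status() and a .links dict parsed
    from the Link header.
    """
    items = []
    params = {"per_page": per_page}
    while url:
        response = get(url, headers=headers, params=params)
        response.raise_for_status()
        items.extend(response.json())
        # Follow the rel="next" link; stop when there is none.
        url = response.links.get("next", {}).get("url")
        # The next URL already carries its own query string.
        params = None
    return items
```

With `requests` imported, a call might look like `fetch_all(requests.get, "https://INSTITUTION_URLapi/v1/courses/" + c + "/pages", headers)`, reusing the `headers` dictionary from the question.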
Thanks for the insight on pagination, James. I only used 9999 to capture as much as I could, but realistically 100 would suffice. It was more of a safety net than anything else. I'll change this.
I'm not a beginner with Python or JSON, but I'm not an expert either, so could you help point me in the right direction on how to store each page individually so I can dump them into a JSON document?
Python is a language I rarely use. In other languages, I would use some kind of array. In JavaScript, I would use something like const pageUrls = []; for initialization and then pageUrls.push(currentUrl); to add.
For Python, I would have to do the same thing you would -- Google how to use arrays in Python. I say array because I am familiar with JavaScript. It might be lists or something else in Python. That's how your previous Python experience can help.
@James , I think I figured it out. I created empty arrays, then appended my GET request data to go into the array. Problem solved. Thank you!!
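For later readers, the fix described here probably looks something like the sketch below: create an empty list before the page loop, append each URL inside it, and put the list in the JSON object. This is only an illustration, not the poster's actual code. The helper name `build_course_record` and the sample data are invented; the `html_url` key and the output keys come from the code in the question.

```python
import json


def build_course_record(instructor, course_name, pages):
    """Build one course record whose 'page_url' holds every page URL."""
    page_urls = []  # a list keeps every value, unlike a single string
    for page in pages:
        page_urls.append(page["html_url"])
    return {
        "instructor": instructor,
        "course_Name": course_name,
        "page_url": page_urls,
    }


# Made-up sample data standing in for the API responses.
sample_pages = [
    {"html_url": "https://example.invalid/courses/1/pages/syllabus"},
    {"html_url": "https://example.invalid/courses/1/pages/week-1"},
]
record = build_course_record("Example Instructor", "Example Course", sample_pages)
print(json.dumps(record, indent=4))
```

In the original script, the list would be reset at the top of each pass through `for c in course_id:` so that URLs from one course do not leak into the next.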
Rather than starting from scratch, I suggest that you review an existing Python library:
https://canvasapi.readthedocs.io/en/stable/getting-started.html
I use this API library regularly both to extract data from Canvas and to push new data up to Canvas. I use it to create modules, create pages, extract quiz results, and lots more.
The library automatically manages pagination, so you don't need to worry about the lower-level mechanics of the Canvas API.
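As a rough idea of what that looks like for the page URLs in the original question, here is a minimal sketch. It assumes the `canvasapi` package is installed and that the page objects returned by `get_pages()` carry an `html_url` attribute, as the API's page listing does. The helper name `collect_page_urls`, the URL, the token, and the course ID are placeholders for this example.

```python
def collect_page_urls(course):
    """Return the html_url of every page in a course.

    `course` is assumed to be a canvasapi Course (or anything with a
    get_pages() method); iterating the result walks all pages of results.
    """
    return [page.html_url for page in course.get_pages()]


def main():
    # Third-party import kept here so the helper above has no dependencies.
    from canvasapi import Canvas

    API_URL = "https://INSTITUTION_URL"  # placeholder
    API_KEY = "MY_SECRET_KEY"            # placeholder
    canvas = Canvas(API_URL, API_KEY)
    course = canvas.get_course(12345)    # hypothetical course ID
    for url in collect_page_urls(course):
        print(url)
```

Call `main()` once the placeholders are filled in with real values.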