I am developing a custom API application in .NET/MVC that makes the JSON requests for the various APIs. So far everything is working and I've pulled in many records; however, one of the largest data pulls is for the PageViews API.
My question is how to get the delta, that is, only the data that is new since the last request. The complete process pulled in over a million records and took a couple of days to run. I have improved the code since then, but the full call still takes a while.
I'd like to get just the new data. What I have done so far is create a new table that stores the total page count, so I can begin the next request from the last page called. However, I've seen that new data does not reliably appear at the beginning or the end of the results.
Has anyone run into this issue and found a solution? If not, has anyone done something similar to what I am attempting?
Thanks, George
Nevada State College
@geomark is this the API call you are working with: Users - List user page views
This call allows for a date range.
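A minimal sketch of using that date range to fetch only new page views, rather than tracking page numbers. It is written in Python for brevity; the same idea carries over to .NET. The `start_time` and `end_time` query parameters come from the List user page views call, while the function name, the stored `last_seen` watermark, and the `per_page` value of 100 are assumptions for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def page_views_url(base_url, user_id, last_seen, now=None):
    """Build a List user page views URL covering last_seen up to now.

    last_seen is the timestamp saved at the end of the previous run
    (hypothetical: store it wherever the page index was stored before).
    """
    now = now or datetime.now(timezone.utc)
    query = urlencode({
        "start_time": last_seen.isoformat(),
        "end_time": now.isoformat(),
        "per_page": 100,  # assumed page size
    })
    return f"{base_url}/api/v1/users/{user_id}/page_views?{query}"
```

After a successful run, saving the `end_time` that was used as the next `last_seen` keeps consecutive windows from leaving gaps.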
Can you post a link to the exact API call you are using?
Also, if you are working with large amounts of data, have you considered using Canvas Data Portal?
The wiki_page_fact table has a view_count field, but I'm not sure if this is what you're looking for.
@geomark again, not sure exactly what you are after, but have you looked at the analytics API, specifically this call: Analytics - User in a course level participation data
Response data includes number of page views by date:
{
"page_views": {
"2012-01-24T13:00:00-00:00": 19,
"2012-01-24T14:00:00-00:00": 13,
"2012-01-27T09:00:00-00:00": 23
},
"participations": [
{
"created_at": "2012-01-21T22:00:00-06:00",
"url": "https://canvas.example.com/path/to/canvas"
},
{
"created_at": "2012-01-27T22:00:00-06:00",
"url": "https://canvas.example.com/path/to/canvas",
}
]
}
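As a rough illustration of working with this response, the hourly `page_views` buckets can be rolled up into per-day totals. The function name is hypothetical, and the sketch assumes the keys are ISO 8601 timestamps shaped like the sample above, grouping by the date exactly as written in each key.

```python
from collections import defaultdict


def daily_page_views(page_views):
    """Sum hourly page view counts into one total per calendar date."""
    totals = defaultdict(int)
    for stamp, count in page_views.items():
        # The first 10 characters of an ISO 8601 timestamp are YYYY-MM-DD.
        totals[stamp[:10]] += count
    return dict(totals)
```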
There is also course level participation data: Analytics - Course level participation data
This also includes page views:
{
"page_views": {
"2012-01-24": {
"general": 200,
"grades": 25,
"files": 5,
"other": 10
},
"2012-01-27": {
"general": 251,
"assignments": 55,
"pages": 6
}
},
"participations": [
"2012-01-21",
"2012-01-27"
]
}
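For the course-level shape, where each date maps to per-category counts, a short sketch of collapsing the categories into a single count per day (the function name is hypothetical):

```python
def total_views_by_day(page_views):
    """Add up all category counts for each date in a course-level response."""
    return {day: sum(categories.values()) for day, categories in page_views.items()}
```

With the sample above, 2012-01-24 totals 240 and 2012-01-27 totals 312.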
Not sure if this helps you or not.
Hi Garth, Thanks for your reply.
It appears that we need to target the Canvas Data specifically, so I am attempting to re-write the application to support HMAC authentication.
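For anyone attempting the same thing, here is a hedged sketch of HMAC-SHA256 request signing using only the Python standard library. The message layout (method, host, content type, content MD5, path, sorted query, date, secret, joined by newlines) and the `HMACAuth key:signature` header format are assumptions that should be verified against the Canvas Data API documentation; the function names are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit


def sign_request(secret, method, url, timestamp, content_type="", content_md5=""):
    """Return a base64 HMAC-SHA256 signature for one request.

    The field order below is an assumption; check it against the docs.
    """
    parts = urlsplit(url)
    # Sort query parameters so the signature does not depend on their order.
    sorted_query = urlencode(sorted(parse_qsl(parts.query)))
    message = "\n".join([
        method, parts.netloc, content_type, content_md5,
        parts.path, sorted_query, timestamp, secret,
    ])
    digest = hmac.new(secret.encode("utf-8"), message.encode("utf-8"),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def auth_headers(key, secret, method, url, timestamp):
    """Build the request headers (header format is assumed, not confirmed)."""
    signature = sign_request(secret, method, url, timestamp)
    return {"Authorization": f"HMACAuth {key}:{signature}", "Date": timestamp}
```

In .NET, the equivalent building blocks would be `System.Security.Cryptography.HMACSHA256` and `Convert.ToBase64String`.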
I'm also using the CLI tool to download flat files but I'm running into an issue where it looks like it is not pulling all of our data. I'm attempting to understand the way it works so I can determine if we are receiving all of our historical data and how to process incremental data.
I have spent time studying the Canvas Data schemas, but have not dug in too deep yet.
One thing that I do know is that there can be a delay of several days in the data; I believe I was told the data could be up to three days behind production. Notice this statement on the data portal page:
"The most recent data in a given export is generally 24-36 hours older than the date given"
If you are trying to detect trends in your data, Canvas Data is a good tool.
If you want to analyze real-time data, it is not the tool to use.
I guess I assumed that page views would be a trending study, but I shouldn't assume : )
Regarding differentials, you might ask your CSM for more details.
Hopefully someone can jump in who has more experience with Canvas Data.
Hello again! I really appreciate your replies. It's been hard to come across answers, so I've been trying different things to get this going on our end. Hopefully, in the end, others will find my findings useful...and hopefully it works for us!
What I am testing now is a staging table and merge process. I start by loading the large flat file (downloaded directly from Canvas admin), which seems to contain all of our data. Then I have two nightly processes. One is the CLI application, which gets the new files and stages them. The other is a two-step process: the first step looks at the newly staged files and writes them to a temporary table; the second step merges the temp table into a live table and inserts only the delta data.
Hopefully this works. I will be testing it in the days to come and keeping an eye on it. As I mentioned before, I'd really like to get the Canvas Data API up and running, but it has been a struggle with no direct demos or examples in the .NET environment I am using.
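The merge step described above can be sketched roughly as follows, using SQLite from the Python standard library so the example is self-contained. The table names (`requests_stage`, `requests_live`), the columns, and the assumption that each row carries a unique `id` are all hypothetical; a SQL Server version would more likely use `MERGE` or `INSERT ... WHERE NOT EXISTS` with the same logic.

```python
import sqlite3


def create_tables(conn):
    """Create a hypothetical staging table and live table with the same shape."""
    conn.execute("CREATE TABLE IF NOT EXISTS requests_stage "
                 "(id TEXT, url TEXT, created_at TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS requests_live "
                 "(id TEXT PRIMARY KEY, url TEXT, created_at TEXT)")


def merge_delta(conn):
    """Copy staged rows not yet in the live table; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM requests_live").fetchone()[0]
    # INSERT OR IGNORE skips any staged row whose id already exists in live.
    conn.execute("INSERT OR IGNORE INTO requests_live (id, url, created_at) "
                 "SELECT id, url, created_at FROM requests_stage")
    # Empty the staging table so the next nightly load starts clean.
    conn.execute("DELETE FROM requests_stage")
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM requests_live").fetchone()[0]
    return after - before
```

Because rows already present are skipped, re-staging a file that overlaps an earlier load only adds the rows that are genuinely new.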
Thanks, George
@george_markaria I just ran across this post, and thought you might be interested:
Maybe you and @wre0001 can exchange some ideas.
Your approach sounds reasonable, given the tools available.
Would it be easier to simply refresh your tables, rather than go through the exercise of calculating the differential?
If you are not modifying the data on your end, refreshing the tables would simplify your logic.
Regarding the API, I have written a workflow server using .NET, which uses the Canvas API to automate many different tasks we have. To make the API calls I'm using System.Net.Http.HttpClientHandler.
Using HttpClientHandler I have created a base class with three methods to handle GET, POST and PUT calls.
From those three methods I am able to make any Canvas API call I need.
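The shape of such a base class might look roughly like this. It is a Python analogue built on `urllib.request`, not the .NET code described above; the class and method names are hypothetical, and it assumes the Canvas API accepts an access token in an `Authorization: Bearer` header and JSON request bodies.

```python
import json
import urllib.request


class CanvasClient:
    """Small base client exposing GET, POST and PUT helpers."""

    def __init__(self, base_url, token):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def build_request(self, method, path, payload=None):
        """Create the request object without sending it."""
        body = json.dumps(payload).encode("utf-8") if payload is not None else None
        req = urllib.request.Request(self.base_url + path, data=body, method=method)
        req.add_header("Authorization", f"Bearer {self.token}")
        if body is not None:
            req.add_header("Content-Type", "application/json")
        return req

    def _send(self, method, path, payload=None):
        """Send the request and decode the JSON response body."""
        with urllib.request.urlopen(self.build_request(method, path, payload)) as resp:
            return json.loads(resp.read().decode("utf-8"))

    def get(self, path):
        return self._send("GET", path)

    def post(self, path, payload):
        return self._send("POST", path, payload)

    def put(self, path, payload):
        return self._send("PUT", path, payload)
```

Keeping request construction separate from sending means the headers and body can be checked without touching the network.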
I hope that helps.