The documentation describes a filter option that "identifies a subset of data to fetch from a table", but notes: "(This feature is not currently implemented.)"
When is this going to be made available? This feature would be extremely beneficial: it would reduce the amount of data that needs to be moved and make our implementation much more resilient.
For efficiency, we use CSV format, and the ability to specify only the columns we actually need would both reduce the volume and mean that schema updates introducing new columns would not change the format of what our query returns unless we explicitly requested the new columns.
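To illustrate the point about format stability, here is a sketch of the client-side workaround we effectively have to do today (this is not the CD2 API; the column names are hypothetical). Pinning an explicit column list means the parsed output keeps the same shape even if the source CSV gains new columns:

```python
import csv
import io

# Hypothetical web_logs columns, for illustration only.
WANTED = ["timestamp", "user_id", "url"]

def project_columns(csv_text, wanted=WANTED):
    """Return rows containing only the requested columns, in a fixed order.

    Columns added by a schema update are simply ignored, so the shape of
    the output never changes unless `wanted` is edited explicitly.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[row[col] for col in wanted] for row in reader]

# A schema update adds a `session_id` column; the output is unchanged.
v1 = "timestamp,user_id,url\n2024-01-08T09:00:00Z,42,/courses/1\n"
v2 = "timestamp,user_id,session_id,url\n2024-01-08T09:00:00Z,42,abc,/courses/1\n"
assert project_columns(v1) == project_columns(v2)
```

Server-side column selection would give us the same guarantee without downloading the unwanted columns at all.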
Now that web_logs are available, the row-level where filter is of extreme importance. We are not really interested in trying to accumulate all the web_logs: I estimate that 30 days of web logs at our current activity rate is of the order of 7 TB in zipped format, blowing out to over 26 TB when decompressed. Allowing for the fact that our activity is concentrated on weekdays, that works out to roughly 1 TB of data per day, which is far too much to realistically move around and process, let alone store for extended periods.
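For what it's worth, the per-day figure follows directly from the monthly estimate (assuming roughly 22 weekdays in a 30-day window):

```python
# Back-of-envelope check of the volumes quoted above.
zipped_tb = 7      # ~30 days of web_logs, compressed
unzipped_tb = 26   # same window, decompressed
weekdays = 22      # roughly 22 weekdays in 30 days

per_weekday_tb = unzipped_tb / weekdays     # ~1.2 TB per active day
compression_ratio = unzipped_tb / zipped_tb # ~3.7x

assert 1.0 < per_weekday_tb < 1.3
assert 3.5 < compression_ratio < 4.0
```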
There are some subsets (activity where users are acting-as another user) that we would want to store for audit purposes, but that is a tiny fraction of the total. We would really like to be able to filter for this activity, which could easily be done; that would be a regular fetch and update.
It would also be useful to be able to filter for ad-hoc queries. We don't want to actually replicate the data, but being able to query by user_id over the last 30 days would enable us to make use of the logs in a meaningful way.
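Absent a server-side filter, an ad-hoc query like that means streaming every compressed file and discarding almost all of it client-side. A minimal sketch of that workaround (column names are hypothetical; the files still have to be downloaded in full, which is exactly the waste a where filter would avoid):

```python
import csv
import gzip

def rows_for_user(paths, user_id):
    """Stream gzipped web_logs CSVs and yield only one user's rows.

    A client-side stand-in for the missing server-side filter: nothing
    is stored or replicated, but every byte must still be transferred
    and decompressed locally. Column names are hypothetical.
    """
    for path in paths:
        with gzip.open(path, mode="rt", newline="") as fh:
            for row in csv.DictReader(fh):
                if row["user_id"] == user_id:
                    yield row
```

With a server-side filter, the same question would cost a tiny fraction of the transfer.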
Unfortunately, I am afraid we don't plan to implement filter options in the near future due to priorities and limitations in team capacity. We don't have a target date set for when this feature would be available.
I haven't dug deep into CD2 yet, but your installation must be stupendously huge if you're generating that much data. We are very large (87,000+ FTE) and our weekly logs compressed were, very approximately, a total of 10GB in the second week of January.
With that said, filtering would be amazing to have and would solve a lot of problems for us.
We are a bit larger - our staff number around 125,000 users, and there are over 550,000 students. We have over 4 million enrolments in over 40,000 courses.