The documentation describes a filter option that "identifies a subset of data to fetch from a table", but notes: "(This feature is not currently implemented.)"
When is this going to be made available? This feature would be extremely beneficial: it would reduce the amount of data that needs to be moved and make our implementation much more resilient.
For efficiency, we use CSV format, and the ability to specify only the columns we actually need would both reduce the volume and mean that schema updates introducing new columns would not change the format of what our query returns unless we explicitly requested the new columns.
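To illustrate the point about format stability, here is a sketch of the client-side workaround we effectively have to do today (this is not the CD2 API; the column names are hypothetical). Pinning an explicit column list means the parsed output keeps the same shape even if the source CSV gains new columns:

```python
import csv
import io

# Hypothetical web_logs columns, for illustration only.
WANTED = ["timestamp", "user_id", "url"]

def project_columns(csv_text, wanted=WANTED):
    """Return rows containing only the requested columns, in a fixed order.

    Columns added by a schema update are simply ignored, so the shape of
    the output never changes unless `wanted` is edited explicitly.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[row[col] for col in wanted] for row in reader]

# A schema update adds a `session_id` column; the output is unchanged.
v1 = "timestamp,user_id,url\n2024-01-08T09:00:00Z,42,/courses/1\n"
v2 = "timestamp,user_id,session_id,url\n2024-01-08T09:00:00Z,42,abc,/courses/1\n"
assert project_columns(v1) == project_columns(v2)
```

Server-side column selection would give us the same guarantee without downloading the unwanted columns at all.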
Now that web_logs are available, the row-level where filter is of extreme importance. We are not really interested in trying to accumulate all the web_logs: I estimate that 30 days of web logs at our current activity rate is of the order of 7 TB in zipped format, blowing out to over 26 TB when decompressed. Allowing for the fact that our activity is concentrated on weekdays, that works out to roughly 1 TB of data per day, which is far too much to realistically move around and process, let alone store for extended periods.
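For what it's worth, the per-day figure follows directly from the monthly estimate (assuming roughly 22 weekdays in a 30-day window):

```python
# Back-of-envelope check of the volumes quoted above.
zipped_tb = 7      # ~30 days of web_logs, compressed
unzipped_tb = 26   # same window, decompressed
weekdays = 22      # roughly 22 weekdays in 30 days

per_weekday_tb = unzipped_tb / weekdays     # ~1.2 TB per active day
compression_ratio = unzipped_tb / zipped_tb # ~3.7x

assert 1.0 < per_weekday_tb < 1.3
assert 3.5 < compression_ratio < 4.0
```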
There are some subsets (activity where users are acting-as another user) that we would want to store for audit purposes, but that is a tiny fraction of the total. We would really like to be able to filter for this activity, which could easily be done; that would be a regular fetch and update.
It would also be useful to be able to filter for ad-hoc queries. We don't want to actually replicate the data, but being able to query by user_id over the last 30 days would enable us to make use of the logs in a meaningful way.
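Absent a server-side filter, an ad-hoc query like that means streaming every compressed file and discarding almost all of it client-side. A minimal sketch of that workaround (column names are hypothetical; the files still have to be downloaded in full, which is exactly the waste a where filter would avoid):

```python
import csv
import gzip

def rows_for_user(paths, user_id):
    """Stream gzipped web_logs CSVs and yield only one user's rows.

    A client-side stand-in for the missing server-side filter: nothing
    is stored or replicated, but every byte must still be transferred
    and decompressed locally. Column names are hypothetical.
    """
    for path in paths:
        with gzip.open(path, mode="rt", newline="") as fh:
            for row in csv.DictReader(fh):
                if row["user_id"] == user_id:
                    yield row
```

With a server-side filter, the same question would cost a tiny fraction of the transfer.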
Unfortunately, I am afraid we don't plan to implement filter options in the near future due to priorities and limitations in team capacity. We don't have a target date set for when this feature would be available.
I haven't dug deep into CD2 yet, but your installation must be stupendously huge if you're generating that much data. We are very large (87,000+ FTE) and our weekly logs compressed were, very approximately, a total of 10GB in the second week of January.
With that said, filtering would be amazing to have and would solve a lot of problems for us.
We are a bit larger - our staff number around 125,000 users, and there are over 550,000 students. We have over 4 million enrolments in over 40,000 courses.