(Canvas Data 2) Is it possible to scope on SubAccounts?

Givan
Community Member

Hey Community,

I’m currently migrating my application to Canvas Data 2 and have run into a problem. I was wondering if you could help.

Example:

RootAccount

  • SubAccount1
    • Account1
    • Account2
  • SubAccount2
    • Account1
    • Account2

I want to retrieve all data from SubAccount1, and I thought this could be done with the scope query parameter. In the example above, that means retrieving data per SubAccount rather than from the RootAccount. The description of scope says: ‘Identifies the scope to access, e.g. a root account UUID for Canvas, or a district ID for Mastery’. Because it only says ‘e.g.’, I was wondering whether it is possible to scope my data to a SubAccount.

For context, I want to be able to retrieve the data from separate SubAccounts. That way I can filter on my active accounts, and whenever an inactive account is activated I can run an incremental query on that SubAccount alone instead of on the whole environment.

If it is not possible to scope on SubAccounts, is there another way to solve this? Or do I just have to accept that I need to retrieve all the data?

1 Solution
KeithSmith_au
Community Contributor

I don't think this is ever going to be possible or feasible. Canvas Data 2 operates as a Data Lake, where data is ingested (with some transformation) and then made available for efficient extract. Filtering is possible by institution (which is necessary) and by timestamp. Some other capabilities have been mooted for the future, but I do not expect they will ever get anywhere near the capability you are hoping for.

Scoping by sub-account is not a simple concept because:

  • Sub-accounts exist in a hierarchy, and you would expect to fetch everything from that sub-account down. If something you want is two or three layers down, its position in the hierarchy needs to be calculated dynamically to see whether the record should be included. The alternative is labelling each record with all its ancestors. This is not practical, partly because the data structure would not be efficient but, more importantly, because sub-accounts can be moved around in the hierarchy, which would affect any records previously ingested. The only way to achieve this would be to maintain a separate hierarchy matrix all the time, which would have to be joined in all extracts.
  • There are many elements that are not available directly at sub-account level but will be needed for meaningful use. Users is the obvious one (with the associated pseudonyms, communication_channels etc.). Would a scoped query be expected to fetch all records for these tables (which may be too much data), or to have some complicated way (via enrollments, account_users) of getting only those that are relevant? The latter would be both complicated and unreliable, as the eligible records would change all the time: new ones would be added and some already fetched would no longer qualify.
  • Even more complicated are items that are inherited and then modified. Roles and permissions depend both on what is present at the sub-account level, as either an override or a modifier, and on what is in the parent sub-accounts. If you are extracting from lower down the tree, what data would you expect to be resolved dynamically?
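To make the first point concrete, here is a minimal Python sketch of the "calculate the position in the hierarchy" step, done on your own side after extracting the full tables. It assumes account rows carry id and parent_account_id and course rows carry account_id; the sample rows and the function name descendant_account_ids are made up for illustration, and nothing here is part of the CD2 API.

```python
from collections import defaultdict, deque


def descendant_account_ids(accounts, top_id):
    """Return top_id plus the ids of every account nested beneath it."""
    # Index children by parent so the tree can be walked downwards.
    children = defaultdict(list)
    for row in accounts:
        children[row["parent_account_id"]].append(row["id"])

    found, queue = {top_id}, deque([top_id])
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in found:  # guards against bad data forming a loop
                found.add(child)
                queue.append(child)
    return found


# Hypothetical sample rows mirroring the example in the question.
accounts = [
    {"id": 1, "parent_account_id": None},  # RootAccount
    {"id": 2, "parent_account_id": 1},     # SubAccount1
    {"id": 3, "parent_account_id": 1},     # SubAccount2
    {"id": 4, "parent_account_id": 2},
    {"id": 5, "parent_account_id": 2},
    {"id": 6, "parent_account_id": 3},
]
courses = [
    {"id": 100, "account_id": 4},
    {"id": 101, "account_id": 6},
    {"id": 102, "account_id": 2},
]

scope = descendant_account_ids(accounts, 2)
scoped_courses = [c for c in courses if c["account_id"] in scope]
```

The set of ids has to be recomputed whenever a sub-account is moved, which is exactly why this is awkward to bake into the extract itself.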

The only real solution for giving sub-sets of users (whether defined by sub-account or otherwise) access to data warehouse information is something custom, built by the institution. That can be views tailored for specific tables, potentially using derived data that is continuously refreshed and specific to the organisation (we maintain a full hierarchy table that assigns every sub-account to its owning school), or separate tables into which data is selected (and transformed) to give an apparent view.
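As a rough illustration of the kind of derived hierarchy table mentioned above, the Python sketch below maps every account to its "owning" top-level sub-account, meaning the ancestor that sits directly under the root. It is not our actual implementation: the id and parent_account_id columns, the sample rows, and the name build_owner_table are assumptions for the example.

```python
def build_owner_table(accounts):
    """Map each account id to the ancestor sitting directly under the root."""
    parent = {row["id"]: row["parent_account_id"] for row in accounts}
    owner = {}
    for account_id in parent:
        current, seen = account_id, set()
        # Climb while the parent is not the root (the root's own parent is None).
        while (parent.get(current) is not None
               and parent.get(parent[current]) is not None):
            if current in seen:
                raise ValueError(f"cycle in hierarchy at account {current}")
            seen.add(current)
            current = parent[current]
        owner[account_id] = current
    return owner


# Hypothetical rows: 1 is the root, 2 and 3 are the top-level sub-accounts.
accounts = [
    {"id": 1, "parent_account_id": None},
    {"id": 2, "parent_account_id": 1},
    {"id": 3, "parent_account_id": 1},
    {"id": 4, "parent_account_id": 2},
    {"id": 5, "parent_account_id": 4},
    {"id": 6, "parent_account_id": 3},
]

owner = build_owner_table(accounts)
```

A table like this would be rebuilt after each refresh of the accounts data and joined to other tables through their account id, so that views can be filtered per owning sub-account.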

The number and combination of different approaches, together with their performance impacts, make it most unlikely that anything like this could be driven directly out of CD2.
