Hi Pete,
Understanding Data Factory is a big holdup for me at the moment. We've talked to Microsoft for some advice, but are still waiting on a response.
To the best of my knowledge, Azure Data Factory doesn't currently have any built-in or readily available support for Instructure's DAP/CD2. It may be possible to use Data Factory's existing HTTP Connector if you only intend to use snapshots, since then you wouldn't need the more complicated processing to track previous jobs and modify the request for incremental queries. Writing a custom Data Factory Connector may also be an option, but I'm not familiar enough with Data Factory to even know where to start on that.
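For reference, the difference between the two query types is roughly one field in the request body, something like this (field names are from memory, so check them against Instructure's DAP query API docs before relying on them):

    # Snapshot query body: a single request, no history to track
    snapshot_body = {"format": "jsonl"}

    # Incremental query body: same request, plus the timestamp of the last successful run
    incremental_body = {"format": "jsonl", "since": "2024-05-01T00:00:00Z"}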
Instead, I've written an Azure Durable Function using the monitor pattern (https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview?tabs=in-p...) to start a table query, poll for completion, and download the resulting data files to Azure Blob Storage.
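Simplified, the orchestrator is shaped roughly like the sketch below (the activity names StartTableQuery, CheckJobStatus, and DownloadToBlob are just placeholders for my own activity functions):

    from datetime import timedelta
    import azure.durable_functions as df

    def orchestrator_function(context: df.DurableOrchestrationContext):
        table = context.get_input()  # e.g. "submissions"

        # Activity function posts the DAP table query and returns the job id
        job_id = yield context.call_activity("StartTableQuery", table)

        # Monitor pattern: re-check the job on a timer until it finishes
        while True:
            status = yield context.call_activity("CheckJobStatus", job_id)
            if status in ("complete", "failed"):
                break
            next_check = context.current_utc_datetime + timedelta(minutes=1)
            yield context.create_timer(next_check)

        if status == "complete":
            # Activity function resolves the file URLs and copies them into Blob Storage
            yield context.call_activity("DownloadToBlob", job_id)

        return status

    main = df.Orchestrator.create(orchestrator_function)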
Presumably, the Function can be called from a Data Factory pipeline, which would then read the data from Blob Storage. Using a Function gives extra flexibility to save the job history and pass the last successful timestamp for a given table into an incremental query. But this is where I've stalled out for lack of familiarity with Data Factory.
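The job-history piece doesn't have to be anything fancy; I'm picturing a small state blob per table, roughly like this sketch (the container and blob names are made up, and it assumes the azure-storage-blob package):

    import json
    from azure.storage.blob import BlobClient

    def load_last_timestamp(conn_str, table):
        # One small JSON state blob per table, e.g. dap-state/submissions.json
        blob = BlobClient.from_connection_string(conn_str, "dap-state", f"{table}.json")
        try:
            state = json.loads(blob.download_blob().readall())
            return state.get("last_successful_at")
        except Exception:
            return None  # no prior run recorded, so fall back to a snapshot query

    def save_last_timestamp(conn_str, table, timestamp):
        # Overwrite the state blob after a successful run
        blob = BlobClient.from_connection_string(conn_str, "dap-state", f"{table}.json")
        blob.upload_blob(json.dumps({"last_successful_at": timestamp}), overwrite=True)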
I have a Python script that can run the whole process and dump the data into a database, but I'm struggling to map this to Data Factory components. My plan for now is just to run my script on our own hardware and load into an Azure-hosted database until I can figure out the Azure pieces.
For context, we're using an Azure SQL Managed Instance database (Microsoft SQL Server), and many of our non-Azure systems are still running Python 3.6, so I haven't been able to use Instructure's DAP Client and have instead been building my own. This has given me a bit more flexibility working in Azure, but it lacks some of the conveniences offered by their implementation.
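The core of my replacement is really just a handful of requests calls along these lines (endpoints, status values, and field names are written from memory, so treat this as a sketch and verify everything against the DAP API documentation):

    import time
    import requests

    BASE = "https://api-gateway.instructure.com"

    def get_token(client_id, client_secret):
        # Trade the DAP client credentials for a short-lived bearer token
        resp = requests.post(
            f"{BASE}/ids/auth/login",
            auth=(client_id, client_secret),
            data={"grant_type": "client_credentials"},
        )
        resp.raise_for_status()
        return resp.json()["access_token"]

    def wait_for_job(token, job_id, poll_seconds=30):
        # Poll the query job until it leaves the waiting/running states
        headers = {"Authorization": f"Bearer {token}"}
        while True:
            resp = requests.get(f"{BASE}/dap/job/{job_id}", headers=headers)
            resp.raise_for_status()
            job = resp.json()
            if job.get("status") not in ("waiting", "running"):
                return job
            time.sleep(poll_seconds)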