CD2: requests object skipped a number?

IanGoh
Community Contributor

I just happened to be running a request and got back:

{
    "id": "f6173a8b-e377-4f44-9919-3284cac40a88",
    "status": "complete",
    "objects": [
        {
            "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00000-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
        },
        {
            "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00001-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
        },
        {
            "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00003-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
        }
    ],
    "expires_at": "2023-07-12T17:53:50Z",
    "schema_version": 1,
    "at": "2023-07-11T17:01:02Z"
}

I didn't think anything was unusual until I was getting the request object URLs and noticed I had part-00000, part-00001, and part-00003. So what happened to part-00002?

LeventeHunyadi
Instructure

The names of the files returned by the API don't bear any special significance, and you should not rely on any pattern in them. A query operation returns a list of object identifiers, which together capture the entire result set. If you process all the objects the API call returns, you won't miss any output data. In particular, our own DAP client library completely ignores file names.
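As a rough illustration of treating the identifiers as opaque, here is a minimal Python sketch. It assumes the response body has already been parsed into a dictionary shaped like the one in the question; the helper name `object_ids` is hypothetical and is not part of the DAP client library.

```python
# Response shaped like the one in the question (metadata fields omitted).
response = {
    "id": "f6173a8b-e377-4f44-9919-3284cac40a88",
    "status": "complete",
    "objects": [
        {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00000-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"},
        {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00001-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"},
        {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00003-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"},
    ],
}


def object_ids(job: dict) -> list[str]:
    """Return every object identifier from a completed job (hypothetical helper).

    The identifiers are treated as opaque strings: nothing is parsed out of
    the file names, and no sequence of part numbers is assumed.
    """
    if job.get("status") != "complete":
        raise ValueError(f"job is not complete: {job.get('status')!r}")
    return [obj["id"] for obj in job["objects"]]


# Process whatever the API listed, in the order it was listed.
for identifier in object_ids(response):
    print(identifier)
```

Because the loop is driven by the list itself, a gap such as the missing part-00002 makes no difference: three identifiers are listed, so three objects are processed.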

Behind the scenes, these files are generated by independent parallel processes that don't communicate with one another. Occasionally, one of these processes is terminated and must be restarted. When this happens, the new process is assigned the next value in the sequence, which leaves a gap at the value that belonged to the terminated process. The API call returns when all processes have completed successfully and all data is ready to be returned.
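The numbering gap can be reproduced with a toy model. The Python sketch below is only an illustration of the description above, not the actual implementation: the function name `surviving_part_numbers` and the idea of a single shared counter are assumptions made for the example.

```python
from itertools import count


def surviving_part_numbers(restarts_per_worker: list[int]) -> list[int]:
    """Toy model: return the part numbers that end up in the final output.

    restarts_per_worker[i] is how many times worker i was terminated and
    restarted. Each start or restart takes the next value from one shared
    sequence, and a terminated worker's old value is never reused.
    """
    sequence = count()  # yields 0, 1, 2, ...
    # Every worker receives an initial number when it is first started.
    current = [next(sequence) for _ in restarts_per_worker]
    # Each restart abandons the old number and takes the next one.
    for worker, restarts in enumerate(restarts_per_worker):
        for _ in range(restarts):
            current[worker] = next(sequence)
    return sorted(current)


# Three workers, the third one restarted once: the value 2 is left out.
print(surviving_part_numbers([0, 0, 1]))  # [0, 1, 3]
```

In this model, the output in the question (parts 0, 1, and 3) corresponds to three workers where the one originally assigned 2 was restarted once. All of the data is still present; only the label changed.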
