[ML] Protect against multiple concurrent downloads of the same model #116869
+301
−142
The Tasks API is used to check for an existing model download when starting a model deployment, but started tasks (or tasks about to start) are not immediately visible in the Tasks API. This leads to a race condition where successive calls to start a model deployment may trigger multiple downloads. This is what happens in the Inference API, where the default endpoints are used and multiple inference calls will trigger the model download.
Download is a master node action, so there is only one node in the cluster where the download can occur. Once we get to the download action and have access to the local `taskManager`, checks for an existing download can be made more atomically. The change here is to look for a matching download task in the `taskManager` and, if one is found, register a listener on that task's completion.
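
As a rough illustration of the idea only (not the code in this PR), the check-then-register step can be sketched as a single synchronized operation on a node-local registry. `DownloadRegistry`, `registerOrJoin`, and `complete` are hypothetical names invented for this sketch; they are not part of the Elasticsearch `TaskManager` API, and the real change works with tasks and listeners rather than a plain map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a node-local registry of in-flight downloads.
// This is an assumption for illustration, not the Elasticsearch TaskManager.
final class DownloadRegistry {
    // Model id -> listeners waiting on the in-flight download of that model.
    private final Map<String, List<Consumer<String>>> inFlight = new HashMap<>();

    /**
     * Atomically checks for an existing download of modelId.
     * Returns true if the caller should start the download itself,
     * or false if its listener was attached to a download already in flight.
     */
    synchronized boolean registerOrJoin(String modelId, Consumer<String> onComplete) {
        List<Consumer<String>> waiting = inFlight.get(modelId);
        if (waiting != null) {
            // A matching download exists: wait on its completion instead of starting another.
            waiting.add(onComplete);
            return false;
        }
        waiting = new ArrayList<>();
        waiting.add(onComplete);
        inFlight.put(modelId, waiting);
        return true;
    }

    /** Called once by the download owner; notifies every listener that joined. */
    void complete(String modelId, String result) {
        List<Consumer<String>> waiting;
        synchronized (this) {
            waiting = inFlight.remove(modelId);
        }
        if (waiting != null) {
            // Listeners run outside the lock so a slow listener cannot block new registrations.
            for (Consumer<String> listener : waiting) {
                listener.accept(result);
            }
        }
    }
}
```

Because the lookup and the registration happen under the same lock, two concurrent callers cannot both conclude that no download exists, which is the gap left open when the check goes through an API that does not yet show newly started tasks.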