- Sources create Loaders for Servable Versions, then send the Loaders as Aspired Versions to the Dynamic Manager, which loads them and serves them to client requests.
- The Loader contains whatever metadata it needs to load the Servable.
- The Source uses a callback function to notify the manager of the Aspired Version.
- The manager applies the configured Version Policy to determine the next action, such as loading the new version or unloading a previously loaded one.
- If the manager determines that it’s safe, it gives the Loader the required resources and tells the Loader to load the new version.
- Clients ask the manager for the Servable, either specifying a version explicitly or just requesting the latest version. The manager returns a handle for the Servable.
- The Dynamic Manager applies the Version Policy and decides to load the new version.
- If there is enough memory, the Dynamic Manager tells the Loader to go ahead, and the Loader instantiates the TensorFlow graph with the new weights.
- A client requests a handle to the latest version of the model, and the Dynamic Manager returns a handle to the new version of the Servable.
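
The lifecycle above can be sketched as a small simulation. This is a minimal illustration in plain Python, not the TensorFlow Serving API (which is C++). The class and method names (`Loader`, `Source`, `DynamicManager`, `set_aspired_versions`, `get_handle`), the memory units, the model paths, and the toy version policy are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Loader:
    """Hypothetical Loader: the metadata needed to load one Servable version."""
    name: str
    version: int
    path: str            # pointer to the model data on disk (illustrative)
    memory_needed: int   # resource estimate in arbitrary units
    servable: Optional[dict] = None

    def load(self) -> None:
        # Stand-in for instantiating a graph with the new weights.
        self.servable = {"name": self.name, "version": self.version}

    def unload(self) -> None:
        self.servable = None


class DynamicManager:
    """Hypothetical manager: applies a toy version policy and serves handles."""

    def __init__(self, memory_budget: int) -> None:
        self.memory_budget = memory_budget
        self.loaded: Dict[str, Dict[int, Loader]] = {}

    def _memory_in_use(self) -> int:
        return sum(loader.memory_needed
                   for versions in self.loaded.values()
                   for loader in versions.values())

    def set_aspired_versions(self, name: str, loaders: List[Loader]) -> None:
        """Callback a Source invokes with the versions it wants served."""
        versions = self.loaded.setdefault(name, {})
        aspired = {loader.version for loader in loaders}
        # Load each new aspired version only if the resources allow it.
        for loader in sorted(loaders, key=lambda l: l.version):
            if loader.version in versions:
                continue
            if self._memory_in_use() + loader.memory_needed <= self.memory_budget:
                loader.load()
                versions[loader.version] = loader
        # Retire versions that are no longer aspired, but only once an
        # aspired version is actually serving (keeps the Servable available).
        if any(v in aspired for v in versions):
            for version in [v for v in versions if v not in aspired]:
                versions.pop(version).unload()

    def get_handle(self, name: str, version: Optional[int] = None) -> dict:
        """Return a handle to an explicit version, or to the latest one."""
        versions = self.loaded.get(name, {})
        if not versions:
            raise KeyError(f"no loaded version of {name!r}")
        chosen = max(versions) if version is None else version
        return versions[chosen].servable


class Source:
    """Hypothetical Source: wraps each new version in a Loader and notifies."""

    def __init__(self, aspired_callback: Callable[[str, List[Loader]], None]) -> None:
        self._notify = aspired_callback

    def on_new_version(self, name: str, version: int, path: str,
                       memory_needed: int) -> None:
        self._notify(name, [Loader(name, version, path, memory_needed)])


manager = DynamicManager(memory_budget=100)
source = Source(manager.set_aspired_versions)
source.on_new_version("model", 1, "/models/model/1", 40)
source.on_new_version("model", 2, "/models/model/2", 40)
handle = manager.get_handle("model")  # handle to the latest loaded version
```

In this sketch the manager loads the new version before retiring the old one, so a client asking for the latest version always gets a handle. A real version policy may make a different trade-off, for example unloading first to save resources.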