Streaming Literals #4101
cosmicBboy
started this conversation in
RFC Incubator
-
Very interesting feature imho; users have asked before whether something like this is possible in Flyte. Let's turn this into an RFC 👍
-
2023-11-09 Contributor's meetup notes: this entry meets the requirements for an RFC; @cosmicBboy feel free to write a new RFC.
-
@wild-endeavor is this connected to the Raw Literals Offload work?
-
Streaming Literals: Sharing partial state between tasks via a publisher/subscriber model.
Author: @kumare3
Intro
Today in Flyte we have Literals. A Literal has an implicit contract: it is materialized when the producer task produces it, and the consumer task cannot run until it is fully materialized. But there are cases in which we may want to materialize a Literal only partially, or not at all, and still start consuming it in its partial state. This can be generalized to a stream and a publisher/subscriber model.
Why this might be useful?
Code sample
Example
Let us use an example to motivate the use case. Consider a very large array (100 billion ints), for which the subsequent task wants to compute the sum of all the numbers.
This can be done in 3 ways:
Approach 1: Current Predominant Flyte Pattern
Approach 2: Supported by Flyte
Approach 3: Not possible with current Flyte compiler and Propeller model
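The original code samples for these approaches are not shown here, so the following is only a plain-Python sketch of the contrast between a fully materialized array and a streamed one. It does not use flytekit; `produce_all`, `produce_stream`, `consume_sum`, and the small `N` are illustrative assumptions. The point is that a streaming consumer can start summing before the producer has finished and never holds the whole array in memory.

```python
from typing import Iterable, Iterator, List

N = 1_000  # small stand-in for the 100 billion ints in the example


def produce_all(n: int) -> List[int]:
    """Materialized style: build the whole array before anyone consumes it."""
    return list(range(n))


def produce_stream(n: int) -> Iterator[int]:
    """Streaming style: yield elements one at a time as they are produced."""
    for i in range(n):
        yield i


def consume_sum(values: Iterable[int]) -> int:
    """Consumer: works on either a materialized list or a partial stream."""
    total = 0
    for v in values:
        total += v
    return total


# The consumer must wait for the full list to exist.
materialized_total = consume_sum(produce_all(N))
# The consumer pulls elements while the producer is still generating them.
streamed_total = consume_sum(produce_stream(N))
```

Both calls return the same sum; only the peak memory and the point at which the consumer can start differ.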
Analysis
RFC proposal
Approach 3 could make it possible to build partial streaming applications in Flyte. To achieve this, one option could be to build a new type called StreamLiteral.
A task that produces a StreamLiteral and the subsequent task that consumes it have a soft dependency, meaning they can be launched together. The compiler adds a streamID to every StreamLiteral. The StreamLiteral itself can hold a Literal as the datum carried by the stream.
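As a rough illustration of the soft dependency, here is a minimal in-process sketch, assuming a hypothetical `StreamLiteral` class backed by a `queue.Queue`. None of these names exist in Flyte today: the `uuid` stands in for the compiler-assigned stream ID, threads stand in for tasks, and the `_END` sentinel is an invented way of signalling end of stream.

```python
import queue
import threading
import uuid
from dataclasses import dataclass, field
from typing import Any, Iterator, List

_END = object()  # illustrative sentinel marking the end of the stream


@dataclass
class StreamLiteral:
    """Hypothetical type: a stream with an ID that carries Literal-like data."""

    # Stand-in for the stream ID the compiler would assign.
    stream_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    _q: queue.Queue = field(default_factory=queue.Queue, repr=False)

    def publish(self, datum: Any) -> None:
        self._q.put(datum)

    def close(self) -> None:
        self._q.put(_END)

    def subscribe(self) -> Iterator[Any]:
        # Block until the next datum arrives; stop at the sentinel.
        while True:
            item = self._q.get()
            if item is _END:
                return
            yield item


def producer(stream: StreamLiteral, n: int) -> None:
    for i in range(n):
        stream.publish(i)
    stream.close()


def consumer(stream: StreamLiteral, out: List[int]) -> None:
    out.append(sum(stream.subscribe()))


stream = StreamLiteral()
result: List[int] = []
# Soft dependency: both "tasks" are launched together rather than in sequence.
threads = [
    threading.Thread(target=consumer, args=(stream, result)),
    threading.Thread(target=producer, args=(stream, 100)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The consumer is started first on purpose, to show that it simply blocks until data arrives instead of requiring the producer to have finished.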
At runtime, the stream ID could be materialized using multiple options,
A few constraints on such a system could be:
In a static graph, all consumers are known, so the stream will not be closed until all consumers exit.
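One way to picture this constraint is sketched below, assuming a hypothetical `FanOutStream` that is given the full consumer set up front, as a static graph would allow. The RFC does not say how closing would be implemented; the per-consumer queues, the `finish` method, and the `closed` property are all illustrative assumptions.

```python
import queue
import threading
from typing import Any, Iterable, Iterator

_END = object()  # illustrative end-of-stream sentinel


class FanOutStream:
    """Hypothetical stream whose consumers are all known when it is created."""

    def __init__(self, consumer_ids: Iterable[str]) -> None:
        ids = list(consumer_ids)
        self._queues = {cid: queue.Queue() for cid in ids}
        self._active = set(ids)  # consumers that have not exited yet
        self._producer_done = False
        self._lock = threading.Lock()

    def publish(self, datum: Any) -> None:
        # Every known consumer receives every datum.
        for q in self._queues.values():
            q.put(datum)

    def finish(self) -> None:
        # The producer is done, but the stream stays open for slow consumers.
        with self._lock:
            self._producer_done = True
        for q in self._queues.values():
            q.put(_END)

    def subscribe(self, consumer_id: str) -> Iterator[Any]:
        try:
            while True:
                item = self._queues[consumer_id].get()
                if item is _END:
                    return
                yield item
        finally:
            # Record that this consumer has exited.
            with self._lock:
                self._active.discard(consumer_id)

    @property
    def closed(self) -> bool:
        # Closed only once the producer finished AND every consumer exited.
        with self._lock:
            return self._producer_done and not self._active


fan_out = FanOutStream(["sum", "count"])
for i in range(5):
    fan_out.publish(i)
fan_out.finish()

closed_before = fan_out.closed                     # no consumer has run yet
total = sum(fan_out.subscribe("sum"))
closed_mid = fan_out.closed                        # "count" is still pending
count = sum(1 for _ in fan_out.subscribe("count"))
closed_after = fan_out.closed                      # all consumers have exited
```

The example runs the consumers one after another to keep it deterministic; in a real system they would run concurrently with the producer.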