This feature request proposes adding asynchronous computation to Apache Beam's Interactive Beam API. This means allowing long-running tasks to execute in the background without blocking the user interface.
Motivation
Interactive Beam is a powerful tool for iterative pipeline development and debugging. However, long-running collect operations can block the interactive environment, hindering productivity and exploration. Introducing asynchronous computation would significantly improve the user experience by allowing developers to continue building the pipeline while computations are executed in the background.
Proposed Solution
Interactive Beam offers a compute API that runs asynchronously in the background and does not display any result in the interactive interface (e.g. Colab). This API introduces two new options:
wait_for_inputs: Whether to wait until the asynchronous dependencies are computed. Setting this to False schedules the computation immediately, but may result in the same pipeline stages running multiple times.
blocking: If False, the computation runs in a non-blocking fashion. In a Colab/IPython environment, this mode also provides controls for the running pipeline. If True, the computation blocks until the pipeline is done.
compute operations can subsequently be followed by collect operations on the same PCollection so that users can view the result.
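To make the intended semantics of the two options concrete, here is a minimal toy model built only on the Python standard library. It is not Beam code and does not reflect how Interactive Beam would implement the feature: `Node`, `compute`, `collect`, and `run_counts` are hypothetical stand-ins for a PCollection, the proposed compute API, the existing collect operation, and a stage-execution counter. In Interactive Beam itself the calls would presumably look something like `ib.compute(pcoll, blocking=False)` followed by `ib.collect(pcoll)`, but that exact signature is an assumption based on the description above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Toy model only: these names are hypothetical and are not part of Apache Beam.
_executor = ThreadPoolExecutor(max_workers=4)
_lock = threading.Lock()
run_counts = {}  # how many times each "stage" has actually executed


class Node:
    """Stand-in for a PCollection: a function applied to its inputs' results."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = tuple(inputs)
        self.future = None  # set once a computation has been scheduled


def _run(node, wait_for_inputs):
    args = []
    for dep in node.inputs:
        if wait_for_inputs and dep.future is not None:
            # Wait for the already scheduled upstream computation and reuse it.
            args.append(dep.future.result())
        else:
            # Do not wait: recompute the upstream stage as part of this run.
            args.append(_run(dep, wait_for_inputs))
    with _lock:
        run_counts[node.name] = run_counts.get(node.name, 0) + 1
    return node.fn(*args)


def compute(node, wait_for_inputs=True, blocking=False):
    """Schedule the node in the background; return without displaying a result."""
    node.future = _executor.submit(_run, node, wait_for_inputs)
    if blocking:
        node.future.result()  # block until the computation is done
    return node.future


def collect(node):
    """Return the node's result, computing it first if nothing was scheduled."""
    if node.future is None:
        compute(node, blocking=True)
    return node.future.result()


source = Node("source", lambda: list(range(5)))
squares = Node("squares", lambda xs: [x * x for x in xs], inputs=[source])

compute(source)                         # returns immediately
compute(squares, wait_for_inputs=True)  # reuses the in-flight "source" result
print(collect(squares))                 # [0, 1, 4, 9, 16]
print(run_counts)                       # {'source': 1, 'squares': 1}
```

Calling `compute(squares, wait_for_inputs=False)` instead would start right away without waiting on the scheduled `source` computation, at the cost of running the `source` stage a second time, which is the trade-off the option description mentions.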
Benefits
The ability to compute time-consuming PCollections asynchronously
Sink operations which do not produce any meaningful output can use compute instead of collect
Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam YAML
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Infrastructure
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner