dplutils.pipeline.stream.StreamingGraphExecutor.task_submit
- abstractmethod StreamingGraphExecutor.task_submit(task: PipelineTask, df_list: list[DataFrame]) → Any
Run or arrange for the running of task.
Implementations must override this method and arrange for the function of task to be called on a dataframe made from the concatenation of df_list. The return value is kept in a pending queue and passed as input to both task_resolve_output and is_task_ready, but is otherwise not inspected. Typically the return value is a handle to the remote result, a future, or an equivalent.
Note
PipelineTask expects a single DataFrame as input, while this function receives a batch of them. It MUST concatenate these into a single DataFrame prior to execution (e.g. with pd.concat(df_list)). This is not done in the driver code because the dataframes in df_list may not be local.
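The contract described above can be illustrated with a minimal, self-contained sketch. It does not import dplutils: ToyTask is a hypothetical stand-in for PipelineTask (assumed here to wrap a DataFrame-to-DataFrame function in a func attribute), and ThreadedSubmitter is a hypothetical class, not a real StreamingGraphExecutor subclass. The signatures of is_task_ready and task_resolve_output are likewise assumptions made for illustration; only their names come from the text above. The point shown is that concatenation happens inside the submitted work and that the returned future is the opaque handle.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class ToyTask:
    """Hypothetical stand-in for PipelineTask (assumption, not the dplutils class)."""

    name: str
    func: Callable[[pd.DataFrame], pd.DataFrame]


class ThreadedSubmitter:
    """Hypothetical executor fragment showing only the submit/ready/resolve trio."""

    def __init__(self, max_workers: int = 2):
        self.pool = ThreadPoolExecutor(max_workers=max_workers)

    def task_submit(self, task: ToyTask, df_list: list[pd.DataFrame]) -> Future:
        def run() -> pd.DataFrame:
            # Concatenate the batch inside the worker, then call the task function
            # on the single resulting DataFrame.
            return task.func(pd.concat(df_list))

        # The future is the handle that the caller keeps in its pending queue.
        return self.pool.submit(run)

    def is_task_ready(self, pending: Future) -> bool:
        # Assumed signature: receives the handle returned by task_submit.
        return pending.done()

    def task_resolve_output(self, pending: Future) -> pd.DataFrame:
        # Assumed signature: blocks until the result is available.
        return pending.result()


def add_doubled(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(doubled=df["x"] * 2)


submitter = ThreadedSubmitter()
batches = [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [3]})]
handle = submitter.task_submit(ToyTask("double", add_doubled), batches)
result = submitter.task_resolve_output(handle)
submitter.pool.shutdown()
```

With a thread pool the dataframes are already local, so the example only mirrors the shape of the contract. In a remote backend the same pattern would apply, with the concatenation running wherever the batch data lives.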