dplutils.pipeline.stream.StreamingGraphExecutor.split_batch_submit

dplutils.pipeline.stream.StreamingGraphExecutor.split_batch_submit#

abstractmethod StreamingGraphExecutor.split_batch_submit(batch: StreamBatch, max_rows: int) Any[source]#

Submit a task to split batch into at most max_rows

Similart to task_submit, implementations should arrange by whatever means make sense to take the dataframe reference in batch.data of StreamBatch, given its length in batch.length and split into a number of parts that result in no more than max_rows per part. The return value should be a list of objects that can be processed by is_task_ready() and task_resolve_output().