mlx.data.Stream.prefetch


Stream.prefetch(self: mlx.data._c.Stream, prefetch_size: int, num_threads: int) → mlx.data._c.Stream

Fetch samples in background threads.

This operation is the workhorse of data loading. It uses num_threads background threads and fetches prefetch_size samples so that they are ready to be used when needed.

Prefetch can be used both to parallelize operations and to overlap computation with data loading in a background thread.

# The final prefetch parallelizes the whole pipeline and ensures
# that images are available for training.
dset = (
  dset
  .load_image("image")
  .image_resize_smallest_side("image", 256)
  .image_center_crop("image", 256, 256)
  .batch(32)
  .prefetch(8, 8)
)

Parameters:
  • prefetch_size (int) – How many samples to prefetch.

  • num_threads (int) – How many background threads to launch.
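
The roles of the two parameters can be illustrated with a standard-library sketch. This is not how MLX Data implements prefetch (the real operation is provided by the native `mlx.data._c` module); the function name `prefetch`, the `fn` argument, and the bounded-queue design below are assumptions made purely for illustration. Here `num_threads` workers pull items from a shared iterator, and a queue capped at `prefetch_size` holds finished samples until the consumer asks for them.

```python
import queue
import threading


def prefetch(source, prefetch_size, num_threads, fn=lambda x: x):
    """Yield fn(item) for every item in source, computed in background threads.

    Hypothetical helper for illustration only. With more than one thread,
    results may be yielded in a different order than the input.
    prefetch_size is assumed to be >= 1.
    """
    source_iter = iter(source)
    source_lock = threading.Lock()  # serializes access to the shared iterator
    ready = queue.Queue(maxsize=prefetch_size)  # at most prefetch_size samples wait here
    done = object()  # sentinel; each worker emits exactly one when it exits

    def worker():
        try:
            while True:
                with source_lock:
                    try:
                        item = next(source_iter)
                    except StopIteration:
                        return
                # The per-sample work runs outside the lock, so workers overlap.
                ready.put(fn(item))
        finally:
            ready.put(done)

    for _ in range(num_threads):
        threading.Thread(target=worker, daemon=True).start()

    finished = 0
    while finished < num_threads:
        item = ready.get()  # blocks only if no sample is ready yet
        if item is done:
            finished += 1
        else:
            yield item


if __name__ == "__main__":
    # Sort the results because worker threads may finish out of order.
    squares = sorted(prefetch(range(10), prefetch_size=4, num_threads=3, fn=lambda x: x * x))
    print(squares)
```

In this sketch, a larger `prefetch_size` lets the workers run further ahead of the consumer at the cost of holding more samples in memory, while a larger `num_threads` increases how many samples are processed concurrently. Whether the same trade-offs apply in exactly this form to `Stream.prefetch` depends on its internal implementation.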