Saturday, October 14, 2023

Azure Batch for High Performance Batch Processes?

Batch processing is usually bound by two constraints: time and power. You need a lot of compute for a limited duration, and if that compute is not powerful enough, the work can take a very long time or run into timeouts. High-end hardware of this kind is often too costly for small and medium-sized organisations to keep on-premises.

What if there were a PaaS/IaaS service that let you run powerful batch processes while paying only for what you consume? What if you could scale the processing power up and down relative to demand?


Azure Batch lets you execute parallel tasks across multiple compute nodes and scale out as needed. A few concepts around Azure Batch are worth understanding.

  • Pool - A pool is a collection of compute nodes that share the same OS, VM size and common lifecycle events (e.g. start tasks). A pool can be scaled out or in dynamically.
  • Node - A node is a VM, but we do not get full access to it. A set of directories is available on the node for placing applications and executing them. You can have dedicated nodes as well as low-cost Spot VM nodes.
  • Application - An application is an executable package that is stored in the associated storage account. It can be versioned and added to jobs and tasks.
  • Job - A job is a collection of tasks that execute on a pool, either one after another or in parallel.
  • Task - A task is a single execution of a command. An application is run through a task's command.
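
To make the relationship between these concepts a little more concrete, here is a minimal sketch in plain Python. It is not the Azure Batch SDK: the class names, the pool and job IDs, the VM size, the start task script and the ffmpeg command lines are all made up for illustration, and the round-robin placement is only an assumption to show tasks spreading over nodes. The real service decides placement itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Task:
    """One execution of a command line."""
    task_id: str
    command_line: str


@dataclass
class Job:
    """A collection of tasks that targets one pool."""
    job_id: str
    pool_id: str
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Node:
    """A single VM in a pool, either dedicated or spot."""
    node_id: str
    is_spot: bool = False


@dataclass
class Pool:
    """Nodes that share the same VM size and start task."""
    pool_id: str
    vm_size: str
    start_task: str
    nodes: list[Node] = field(default_factory=list)


def assign_round_robin(job: Job, pool: Pool) -> dict[str, list[str]]:
    """Spread the job's tasks over the pool's nodes in turn (illustrative only)."""
    if job.pool_id != pool.pool_id:
        raise ValueError("job targets a different pool")
    if not pool.nodes:
        raise ValueError("pool has no nodes")
    plan: dict[str, list[str]] = {node.node_id: [] for node in pool.nodes}
    for task, node in zip(job.tasks, cycle(pool.nodes)):
        plan[node.node_id].append(task.task_id)
    return plan


# Hypothetical pool with one dedicated node and one spot node.
pool = Pool("render-pool", "Standard_D2s_v3", "install-ffmpeg.sh",
            nodes=[Node("node-0"), Node("node-1", is_spot=True)])
# Hypothetical job with five tasks, each running one command.
job = Job("encode-videos", pool.pool_id,
          tasks=[Task(f"task-{i}", f"ffmpeg -i input{i}.mp4 output{i}.webm")
                 for i in range(5)])

plan = assign_round_robin(job, pool)
print(plan)
# {'node-0': ['task-0', 'task-2', 'task-4'], 'node-1': ['task-1', 'task-3']}
```

The point of the sketch is the shape: a job points at a pool, a pool owns nodes, and each task is just a command that ends up running on some node.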


Common Use Cases

  • Video Processing and indexing
  • ETL processes
  • Congested batch workflows


The Batch service comes with default quotas on the resources you can use, but you can request an increase when your workload needs it.
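
Quotas matter most when you scale out, because the pool cannot grow past them. The sketch below shows one way to reason about it: size the pool from the task backlog, then clamp the result to the quota. The function name, its parameters and the quota of 20 nodes are assumptions for illustration only; your actual quota depends on your account, and Azure Batch's own autoscale feature uses a different formula syntax than this Python.

```python
import math


def target_node_count(pending_tasks: int, tasks_per_node: int,
                      min_nodes: int, quota_nodes: int) -> int:
    """Nodes needed for the backlog, clamped to [min_nodes, quota_nodes]."""
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    if min_nodes > quota_nodes:
        raise ValueError("min_nodes cannot exceed quota_nodes")
    wanted = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(wanted, quota_nodes))


# Hypothetical numbers: 4 tasks per node, at least 1 node, quota of 20 nodes.
for backlog in (0, 7, 400):
    nodes = target_node_count(backlog, tasks_per_node=4, min_nodes=1, quota_nodes=20)
    print(backlog, nodes)
# 0 1
# 7 2
# 400 20
```

With a backlog of 400 tasks the calculation asks for 100 nodes but gets capped at 20, which is the situation where a quota increase request becomes worthwhile.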
