Batch processing usually comes down to two constraints: time and power. You need heavy compute for a limited duration, and if that compute is underpowered, the work takes a long time or runs into timeouts. High-end hardware of this kind is often too costly for small and medium-sized organisations to keep on-premises.
What if a PaaS/IaaS service let you run powerful batch processes while paying only for what you consume? What if you could scale the processing power up and down with demand?
Azure Batch allows you to execute parallel tasks across multiple compute nodes and scale out as needed. The Batch service comes with certain default quotas, but you can request an increase when you need one.
A few Azure Batch concepts are worth understanding:
- Pool - A pool is a collection of compute nodes that share the same OS, the same size, and common lifecycle events (e.g. start tasks). A pool can be scaled out or in dynamically.
- Node - A node is a VM, but you do not get access to the full VM. Instead, a set of directories is available where you place applications and execute them. Nodes can be dedicated or low-cost Spot VMs.
- Application - An application is an executable package that is stored in the associated storage account. It can be versioned and added to jobs and tasks.
- Job - A job is a collection of tasks that run on a pool, either sequentially or in parallel.
- Task - A task is a single execution of a command. An application is run by invoking it from a task's command.
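To make the relationship between these concepts concrete, here is a minimal local sketch in Python. It is not the Azure Batch SDK: the `Pool`, `Job` and `Task` classes, the `max_nodes` field (a stand-in for a quota) and the use of worker threads in place of nodes are all assumptions made for illustration. It only shows the shape of the model: a job groups tasks, each task is one command, and the pool's node count decides how many tasks run at the same time.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Task:
    """One execution of a command line (hypothetical model, not the SDK)."""
    task_id: str
    command_line: str


@dataclass
class Job:
    """A collection of tasks that runs on a single pool."""
    job_id: str
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Pool:
    """A set of identical nodes whose count can be scaled out or in."""
    pool_id: str
    node_count: int
    max_nodes: int  # assumed stand-in for an account quota

    def resize(self, target: int) -> None:
        # Scale out or in, clamped between one node and the quota.
        self.node_count = max(1, min(target, self.max_nodes))

    def run(self, job: Job) -> dict[str, int]:
        # Each worker thread stands in for one node picking up tasks.
        def execute(task: Task) -> tuple[str, int]:
            # shell=True is used here only so that simple commands such as
            # "echo" work locally; the exit code is the task's result.
            done = subprocess.run(task.command_line, shell=True,
                                  capture_output=True)
            return task.task_id, done.returncode

        with ThreadPoolExecutor(max_workers=self.node_count) as nodes:
            return dict(nodes.map(execute, job.tasks))


pool = Pool(pool_id="demo-pool", node_count=2, max_nodes=4)
job = Job("demo-job", [Task(f"task-{i}", f"echo item {i}") for i in range(3)])

pool.resize(10)          # asks for 10 nodes, clamped to the assumed quota of 4
results = pool.run(job)  # maps each task id to its exit code
```

With one node the tasks run one after another; with more nodes they run in parallel, which is the same trade-off you tune when sizing a real pool.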
Common use cases
- Video Processing and indexing
- ETL processes
- Congested batch workflows