Batch processing usually runs up against two constraints: time and compute power. You need a lot of processing capacity for a limited duration, and if that capacity is not available, a job can take a very long time or hit timeouts. High-end hardware of this kind is often too costly for small and medium-sized organisations to keep on-premises.
What if there were a PaaS/IaaS service that let you run powerful batch processes while paying only for what you consume? What if you could scale the processing power up and down with demand?
- Pool - A pool is a collection of compute nodes that share the same OS, size and lifecycle events (e.g. start tasks). A pool can be scaled out or in dynamically.
- Node - A node is a VM, but you do not get access to the full VM. Instead, a set of directories is available where you can place applications and execute them. Nodes can be dedicated or low-cost spot VMs.
- Application - An application is an executable package that is stored in the associated storage account and can be versioned and added to jobs and tasks.
- Job - A job is a collection of tasks that run on a pool, either one after another or in parallel.
- Task - A task is an execution of a command. An application can be executed with a command.
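To make the relationships between these terms concrete, here is a minimal sketch in plain Python. It is a toy model, not the service's SDK: the class names mirror the concepts above, while `resize`, `assign`, the round-robin placement, and every identifier and value (`video-pool`, `standard_d2`, `install-ffmpeg.sh`, the `ffmpeg` command lines) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import ceil


@dataclass
class Task:
    """One execution of a command line."""
    task_id: str
    command: str


@dataclass
class Job:
    """A collection of tasks that targets one pool."""
    job_id: str
    pool_id: str
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Pool:
    """Compute nodes that share an OS, a size and a start task."""
    pool_id: str
    vm_size: str          # hypothetical size label
    node_count: int
    start_task: str = ""  # command run on each node when it joins the pool

    def resize(self, pending_tasks: int, tasks_per_node: int, max_nodes: int) -> int:
        """Scale out or in so the node count follows the pending work.

        Assumed policy: enough nodes for the backlog, at least 1, at most max_nodes.
        """
        wanted = ceil(pending_tasks / tasks_per_node)
        self.node_count = max(1, min(wanted, max_nodes))
        return self.node_count


def assign(job: Job, pool: Pool) -> dict[int, list[str]]:
    """Spread a job's tasks across the pool's nodes round-robin (illustrative only)."""
    plan: dict[int, list[str]] = {n: [] for n in range(pool.node_count)}
    for i, task in enumerate(job.tasks):
        plan[i % pool.node_count].append(task.task_id)
    return plan


# Hypothetical example: six encoding tasks, two tasks per node, at most four nodes.
pool = Pool(pool_id="video-pool", vm_size="standard_d2", node_count=1,
            start_task="install-ffmpeg.sh")
job = Job(job_id="encode-job", pool_id=pool.pool_id)
job.tasks = [Task(f"task-{i}", f"ffmpeg -i clip{i}.mp4 out{i}.webm") for i in range(6)]

pool.resize(pending_tasks=len(job.tasks), tasks_per_node=2, max_nodes=4)
plan = assign(job, pool)
print(plan)
```

With these made-up numbers the pool grows from one node to three, and each node receives two of the six tasks. A real service decides placement and scaling itself; the point here is only that a job groups tasks, a task wraps a command, and a pool supplies the nodes that run them.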
Common use cases
- Video processing and indexing
- ETL processes
- Congested batch workflows