Mule ESB has the ability to process messages in batches.
Within an application, we can initiate a batch job, which is basically a block of code
that splits a large message into individual records, performs actions upon
each record, then reports on the results and potentially pushes the processed
output to other systems or queues.
This functionality is particularly useful when working with a large set of data, for
example when a large set of data needs to be retrieved from or inserted into a
database.
A batch job is a top-level element in Mule which exists outside all Mule flows.
Generally, a batch job splits a large message into small parts called records,
which Mule processes asynchronously within the batch job; just as
flows process messages, batch jobs process records.
A batch job contains one or more batch steps which, in turn, can contain any
number of message processors or Mule components that act upon records as they
move through the batch job.
A batch job executes when triggered either by a batch executor in a Mule flow or
by a message source in the batch job's input phase; when triggered, it creates a
new batch job instance.
After all records are processed and have passed through all the batch steps, the batch job
instance ends, and the results are accumulated and summarised
in a report that reflects which records succeeded and which failed at the
time of processing. Source:- MuleSoft
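As a sketch of this structure, a batch job in Mule 3.x XML configuration has the following overall shape (job and step names here are placeholders, and the batch: namespace is assumed to be declared):

```xml
<!-- Top-level batch job: exists outside all Mule flows -->
<batch:job name="exampleBatchJob">
    <batch:input>
        <!-- optional one-way message source / processors that prepare the payload -->
    </batch:input>
    <batch:process-records>
        <batch:step name="Step1">
            <!-- message processors that act upon each record -->
        </batch:step>
    </batch:process-records>
    <batch:on-complete>
        <!-- runs once at the end; receives the batch job result -->
    </batch:on-complete>
</batch:job>

<!-- The same job can also be triggered from a flow via the batch executor -->
<flow name="triggerFlow">
    <batch:execute name="exampleBatchJob"/>
</flow>
```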
Batch processing is useful in particular scenarios :-
• When part of a large message fails, for example during a bulk database insertion, we
can continue with the rest.
• When integrating data sets, small or large, streaming or not, to
process records in parallel.
• When we need to handle large quantities of incoming data from an API into
a legacy system.
• When synchronising data sets between various business applications, for example
NetSuite and Salesforce.
There are basically 4 phases in a Mule batch job:-
Input phase: In the Input phase, we can place a one-way message source and/or
message processors to prepare the data that is actually going to be fed into the
job. Here we are processing synchronously at the message level, and this is an
optional phase.
Loading phase: This phase is automatic and implicit, and we don’t have to do
anything here. In this phase the payload from the Input phase is split into records and
stored in persistent queues.
Process phase: In this phase, each record is processed separately and
independently, and then moved across the steps in an asynchronous and
parallel manner.
On Complete phase: In this phase you get a result object that tells you how many
records were processed, how many succeeded, which ones failed (and in which
step), etc. This is useful for generating reports and sending out notifications.
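In Mule 3 the On Complete phase receives a BatchJobResult as the payload, so the report can be produced with a simple logger. A minimal sketch, assuming the Mule 3 batch module and MEL (the property names follow the BatchJobResult object):

```xml
<batch:on-complete>
    <!-- payload here is the batch job result; these MEL expressions read its counters -->
    <logger level="INFO"
            message="Total: #[payload.totalRecords], Successful: #[payload.successfulRecords], Failed: #[payload.failedRecords], Elapsed: #[payload.elapsedTimeInMillis] ms"/>
</batch:on-complete>
```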
Let’s consider that we have a simple Mule batch job as follows:-
Here, we can see there are 3 phases :- In the Input phase, it retrieves the data from
the database at a fixed interval of time.
In the Process phase, it executes 3 steps. In step 1, it inserts data into the database; in
step 2, it logs messages, if any; and in step 3, it logs failed messages, if any.
In the On Complete phase, it logs the number of failed records, the number of successful records,
and the elapsed time.
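The job described above could be sketched in Mule 3 XML configuration as follows. This is an illustrative sketch, not the exact original: the connector config names ("Source_DB", "Target_DB"), table names, and queries are hypothetical, and the accept-policy attribute is used to route only failed records into the last step:

```xml
<batch:job name="dbSyncBatchJob">
    <batch:input>
        <!-- Input phase: poll the database at a fixed interval -->
        <poll frequency="60000">
            <db:select config-ref="Source_DB">  <!-- "Source_DB" is a placeholder config -->
                <db:parameterized-query><![CDATA[SELECT * FROM source_table]]></db:parameterized-query>
            </db:select>
        </poll>
    </batch:input>
    <batch:process-records>
        <batch:step name="InsertStep">
            <!-- Step 1: insert each record into the target database -->
            <db:insert config-ref="Target_DB">
                <db:parameterized-query><![CDATA[INSERT INTO target_table (id, name) VALUES (#[payload.id], #[payload.name])]]></db:parameterized-query>
            </db:insert>
        </batch:step>
        <batch:step name="LogStep">
            <!-- Step 2: log each record that reaches this step -->
            <logger level="INFO" message="Processed record: #[payload]"/>
        </batch:step>
        <batch:step name="LogFailuresStep" accept-policy="ONLY_FAILURES">
            <!-- Step 3: only records that failed in an earlier step reach this step -->
            <logger level="WARN" message="Failed record: #[payload]"/>
        </batch:step>
    </batch:process-records>
    <batch:on-complete>
        <!-- On Complete phase: summarise the run -->
        <logger level="INFO"
                message="Failed: #[payload.failedRecords], Successful: #[payload.successfulRecords], Elapsed: #[payload.elapsedTimeInMillis] ms"/>
    </batch:on-complete>
</batch:job>
```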