Tuesday, May 18, 2021

Spring Batch Interview Questions

 What is Spring Batch?

Spring Batch is an open-source framework for batch processing, that is, the execution of a series of jobs. It provides classes and APIs for reading and writing resources, transaction management, job processing statistics, job restart, and partitioning techniques for processing high volumes of data.

 

What are features of Spring Batch?

The features of Spring Batch are as follows -

·       Transaction management

·       Chunk based processing

·       Declarative I/O

·       Start/Stop/Restart

·       Retry/Skip

·       Web based administration interface

What are disadvantages of Spring Batch?

The disadvantages of Spring Batch are as follows-

·       Spring Batch code is complex. If one does not understand the framework then it can be difficult to understand the flow.

·       Performance can be poor if the jobs are not properly implemented.

·       Exception handling can be complex.

·       Logs for Spring Batch may not reflect/return the exception or issue we are looking for.

What are use cases of Spring Batch?

·       We can automate complex logic and efficiently process data without user interaction. We can run these jobs daily

·       Bulk data can be processed efficiently

·       Data transformation and other operations can be performed in a transactional operation.

What is Spring Batch Admin ?

Spring Batch Admin provides a web-based user interface (UI) that allows you to manage Spring Batch jobs. Spring Cloud Data Flow is now the recommended replacement for managing and monitoring Spring Batch jobs.

 What is Job?

Job is the batch process to be executed in a Spring Batch application without interruption from start to finish. This Job is further broken down into steps.
A Job is made up of many steps and each step is a READ-PROCESS-WRITE task or a single operation task (tasklet).

What is Step in job?
A Step is an independent phase of a Job. Each chunk-oriented Step consists of an ItemReader, an optional ItemProcessor, and an ItemWriter. Note: A Job can have one or more Steps.

What is ItemReader?

An ItemReader reads data into a Spring Batch application from a particular source.

What is ItemWriter?

ItemWriter writes data from the Spring Batch application to a particular destination.

What is ItemProcessor?
After the ItemReader reads the input data, the ItemProcessor applies business logic to each item. The processed items are then written to a file or database by the ItemWriter.

What is Job Repository in Spring Batch?
Job Repository is used to persist all meta-data related to the execution of the Job.

How to configure the job in Spring Batch?
There are several ways to configure a Spring Batch Job. Here we use the builder API (JobBuilderFactory), which needs a JobRepository to create the Job. The Job below has three steps: one loads the notes, one loads the tasks, and one processes those tasks.

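@Bean
public Job employeeJob() {
    // loadNotes(), loadTasks() and processTasks() are assumed to be
    // Step bean methods defined elsewhere in this configuration class.
    return this.jobBuilderFactory.get("notesJob")
                     .start(loadNotes())
                     .next(loadTasks())
                     .next(processTasks())
                     .build();
}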

What is batch processing architecture?

A Job encapsulates the complete batch process and contains one or more Steps. The order in which the Steps run is defined in the job configuration, for example in JSL (Job Specification Language), the XML format defined by JSR-352.

 

What is execution context in Spring Batch?
The ExecutionContext holds state that Spring Batch persists during a run. If a batch run has to be restarted after an error such as a fatal exception, Spring Batch continues with the stored ExecutionContext.

What is StepScope in Spring Batch?
For objects declared with StepScope, Spring Batch uses the Spring container to create a new instance for each step execution.

What is Step Partition in Spring Batch?
A Spring Batch job normally runs in a single process. To spread the work over multiple threads or processes, we can partition a Step. With Step partitioning, a Step is divided into a number of child steps, which can run either on remote instances or on local execution threads.

What is Spring batch job launcher?
JobLauncher is the interface for running jobs. Its run method takes two parameters: the Job to run and the JobParameters for that run.

Example:

JobLauncher jobLauncher = context.getBean(JobLauncher.class);
Job testJob = context.getBean(TestJob.class);
jobLauncher.run(
    testJob,
    new JobParametersBuilder()
        .addString("inputFile", "file:./notes.txt")
        .addDate("date", new Date())
        .toJobParameters()
);

What is remote chunking in Spring Batch?
In Spring Batch remote chunking, the master Step reads the data and passes it to the slaves for processing.

What is the difference between Step vs Tasklet vs Chunk?

While Step is an independent phase of execution, Tasklet and Chunk are different processing structures used within a Step.

What are different types of process flow for Step execution?

  1. Tasklet Model
  2. Chunk Model

How to choose between Tasklet model and Chunk model?

Typically, we choose the Tasklet model when the Step performs a single, simple operation, and the Chunk model when the Step has to read, process, and write a large number of items.

How do I schedule a Spring Batch job?

  1. Enable Scheduling with @EnableScheduling annotation.
  2. Annotate method with @Scheduled annotation.

With this, the method runs on the schedule given in the @Scheduled annotation. To schedule a job, launch it from that method, for example through the JobLauncher.

For example: @Scheduled(cron = "0 */1 * * * ?") runs the method every minute.

 

How to implement security for Spring Batch ?

We need to implement a method to authenticate the user -

 
public Authentication authenticateUser(String username, String password) {
    ProviderManager providerManager = (ProviderManager)applicationContext.getBean("authenticationManager");
    Authentication authentication = providerManager.authenticate(new UsernamePasswordAuthenticationToken(username, password));
    setAuthentication(authentication);
    return authentication;
}
 



What is Spring Batch Partitioning ?

In some scenarios a single-threaded application may not give adequate performance. In such cases, Spring Batch partitioning is one way of scaling batch jobs to improve performance. In Spring Batch, "partitioning" means using multiple threads, each of which processes its own range of data. For example, suppose a table has 100 records with primary ids from 1 to 100 and you want to process all of them. With partitioning, the ids can be split into ranges such as 1-25, 26-50, 51-75 and 76-100, and each range can be processed by a separate thread.
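The range split described above can be sketched in plain Java. This is only an illustration with no Spring dependency: the class and method names are hypothetical, and in a real job the same ranges would typically be produced by a Partitioner implementation that stores each range in an ExecutionContext.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch: split an inclusive id range into contiguous
// sub-ranges, one per partition. Names here are hypothetical.
class RangePartitionSketch {

    // Returns {start, end} pairs that cover [minId, maxId] with no gaps or overlaps.
    static List<long[]> partition(long minId, long maxId, int gridSize) {
        List<long[]> ranges = new ArrayList<>();
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        for (long start = minId; start <= maxId; start += size) {
            long end = Math.min(start + size - 1, maxId);
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 100 records split across 4 partitions: 1-25, 26-50, 51-75, 76-100
        for (long[] range : partition(1, 100, 4)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

Each range could then be handed to its own worker thread, which reads only the rows whose id falls inside that range.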

Explain the conditions processing in Spring Batch?

Spring Batch follows the traditional batch architecture, in which a job launcher starts the job and a job repository stores the metadata about its execution. A job can have more than one step, and every step typically follows the sequence of reading data, processing it, and writing it. Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging, transaction management, job processing statistics, job restart, skip, and resource management.
For example, a step may read data from a CSV file, process it, and write it into the database. Spring Batch provides many ready-made classes to read and write CSV, XML, and databases.

Explain the Spring Batch framework architecture?

Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. High-volume batch jobs can leverage the framework in a highly scalable manner to process significant volumes of information.
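The framework is organized into the following components:
Application - contains all the jobs and the code we write using the Spring Batch framework.
Batch Core - contains the API classes needed to control and launch a batch job.
Batch Infrastructure - contains the common readers, writers, and services used by both the Application and Batch Core.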


What is ExecutionContext in Spring Batch?

An ExecutionContext is a set of key-value pairs containing information that is scoped to either a StepExecution or a JobExecution. Spring Batch persists the ExecutionContext, which helps when you want to restart a batch run, for example after a fatal error has occurred.
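The restart idea can be illustrated with a small plain-Java sketch. It is not Spring Batch code: a Map stands in for the persisted ExecutionContext, and the key name "reader.position" and the class name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch: a key-value context remembers how far a run got,
// so a restarted run can continue instead of starting over.
class RestartSketch {

    static final String POSITION_KEY = "reader.position"; // hypothetical key name

    // Copies items to output, starting at the position stored in the context.
    // Throws when index failAt is reached to simulate a fatal error
    // (pass a negative failAt to never fail).
    static void run(List<String> items, Map<String, Object> context,
                    int failAt, List<String> output) {
        int start = (Integer) context.getOrDefault(POSITION_KEY, 0);
        for (int i = start; i < items.size(); i++) {
            if (i == failAt) {
                throw new IllegalStateException("simulated failure at item " + i);
            }
            output.add(items.get(i));
            context.put(POSITION_KEY, i + 1); // record progress after each item
        }
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c", "d");
        Map<String, Object> context = new HashMap<>();
        List<String> output = new ArrayList<>();
        try {
            run(items, context, 2, output); // first run fails at the third item
        } catch (IllegalStateException e) {
            System.out.println("failed, saved position = " + context.get(POSITION_KEY));
        }
        run(items, context, -1, output); // restart resumes from the saved position
        System.out.println(output); // prints [a, b, c, d]
    }
}
```

Because the position is stored after every item, the second run picks up at "c" and no item is written twice.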



What are the typical processing strategies in Spring Batch?

·       Normal processing during offline

·       Concurrent batch or online processing

·       Parallel processing of many different batch runs or jobs at the same time

·       Partitioning



What is spring batch listener?

Spring Batch listeners are a way of intercepting the execution of a Job or a Step to perform some meaningful operation or to log the progress.
Commonly used listeners include JobExecutionListener, StepExecutionListener, ChunkListener, ItemReadListener, ItemProcessListener, ItemWriteListener and SkipListener.
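The interception idea can be sketched in plain Java as callbacks invoked before and after the work. The JobListener interface below is a hypothetical stand-in written for this example, not Spring Batch's own interface; in Spring Batch the comparable callbacks are beforeJob and afterJob on JobExecutionListener, which receive a JobExecution.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of listener-style interception around a job run.
class ListenerSketch {

    // Hypothetical listener interface for illustration only.
    interface JobListener {
        void beforeJob(String jobName);
        void afterJob(String jobName, boolean succeeded);
    }

    // Runs the work, notifying every listener before and after it.
    static boolean runJob(String jobName, Runnable work, List<JobListener> listeners) {
        for (JobListener listener : listeners) {
            listener.beforeJob(jobName);
        }
        boolean succeeded = true;
        try {
            work.run();
        } catch (RuntimeException e) {
            succeeded = false; // listeners still get the final status
        }
        for (JobListener listener : listeners) {
            listener.afterJob(jobName, succeeded);
        }
        return succeeded;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        JobListener logging = new JobListener() {
            public void beforeJob(String jobName) {
                log.add("starting " + jobName);
            }
            public void afterJob(String jobName, boolean succeeded) {
                log.add("finished " + jobName + ", success=" + succeeded);
            }
        };
        runJob("notesJob", () -> log.add("doing the work"), List.of(logging));
        log.forEach(System.out::println);
    }
}
```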


How can a person configure a job in Spring Batch framework?

A Job in Spring Batch contains a sequence of one or more Steps. Each Step can be configured with the attributes and elements required to execute it:
next : the next Step to execute.
tasklet : the task or chunk to execute. A chunk can be configured with an ItemReader, an ItemProcessor and an ItemWriter.
decision : decides which Step needs to be executed next.

What is difference between Remote Partitioning and Remote Chunking in Spring Batch?

Remote Partitioning allows data to be partitioned and processed in parallel. For example, if there are 30 rows, the first partition could hold rows 1-10, the second rows 11-20, and so on. The master Step holds the metadata that describes each partition; each slave executes the Step for the partition it is given and sends the result back to the master for aggregation.
In Remote Chunking, the master reads the data and sends it to its slaves as chunks of items. The slaves process the chunks, typically with an ItemProcessor followed by an ItemWriter, and report the outcome back to the master.


What is CommandLineJobRunner in Spring Batch?

CommandLineJobRunner is one of the ways to bootstrap a Spring Batch job. A script that launches a job needs a Java class with a main method as its entry point; CommandLineJobRunner provides that entry point, so the job can be started directly from the command line with its XML configuration.
The CommandLineJobRunner performs 4 tasks:
  1. Load the appropriate ApplicationContext.
  2. Parse command line arguments into JobParameters.
  3. Locate the appropriate job based on arguments.
  4. Use the JobLauncher provided in the application context to launch the job.
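The argument-parsing task can be sketched in plain Java. This is a simplified illustration, not the runner's actual implementation: it assumes the first argument is the job configuration path, the second is the job name, and the remaining arguments are key=value pairs. The class name and sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch: turn trailing key=value command line arguments
// into a parameter map, keeping them in the order given.
class JobArgsSketch {

    // args[0] = job configuration path, args[1] = job name,
    // args[2..] = key=value parameters (assumed layout for this sketch).
    static Map<String, String> parseParameters(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 2; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + args[i]);
            }
            // Split on the first '=' only, so values may contain '=' themselves.
            params.put(args[i].substring(0, eq), args[i].substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        String[] sample = {
            "notesJob.xml", "notesJob", "inputFile=file:./notes.txt", "run.date=2021-05-18"
        };
        System.out.println("job = " + sample[1]);
        System.out.println("parameters = " + parseParameters(sample));
    }
}
```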


What is Tasklet, and what is a Chunk?

Tasklet is a simple interface with a single method, execute. A Tasklet is used to perform a single task within a Step, such as running a query, deleting files, or setting up and cleaning up resources before or after other steps.

Spring Batch uses a 'chunk-oriented' processing style in its most common implementation. Chunk-oriented processing means reading the data one item at a time and building up chunks that are written out within a transaction boundary.
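The chunk-oriented loop can be sketched in plain Java as follows. This is a simplified illustration rather than the framework's implementation: an Iterator stands in for the ItemReader, a Function for the ItemProcessor, and a list of written chunks for the ItemWriter's destination. Transactions are only indicated by a comment.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-Java sketch of chunk-oriented processing: read up to chunkSize
// items one at a time, process them, then write the chunk as a unit.
class ChunkLoopSketch {

    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        List<List<O>> written = new ArrayList<>(); // stands in for the destination
        while (reader.hasNext()) {
            // Read phase: collect items one at a time until the chunk is full.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: apply the business logic to each item.
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // Write phase: a real step would write and commit the chunk in one transaction.
            written.add(processed);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c", "d", "e");
        // With a chunk size of 2, five items produce three chunks.
        System.out.println(run(input.iterator(), s -> s.toUpperCase(), 2)); // prints [[A, B], [C, D], [E]]
    }
}
```

A Tasklet, by contrast, would simply be one method call that does its single task and returns.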

 
