Tuesday, May 18, 2021

Spring Batch Interview Questions

What is Spring Batch?

Spring Batch is an open-source framework for batch processing – execution of a series of jobs. Spring Batch provides classes and APIs to read/write resources, transaction management, job processing statistics, job restart, and partitioning techniques to process high volume data.

 

What are features of Spring Batch?

The features of Spring Batch are as follows -

·       Transaction management

·       Chunk based processing

·       Declarative I/O

·       Start/Stop/Restart

·       Retry/Skip

·       Web based administration interface

What are disadvantages of Spring Batch?

The disadvantages of Spring Batch are as follows-

·       Spring Batch code is complex. If one does not understand the framework then it can be difficult to understand the flow.

·       Performance will not be good if the job is not properly implemented.

·       Exception handling can be complex.

·       Logs for Spring Batch may not reflect/return the exception or issue we are looking for.

What are use cases of Spring Batch?

·       We can automate complex logic and efficiently process data without user interaction. These jobs can be run on a schedule, for example daily.

·       Bulk data can be processed efficiently

·       Data transformation and other operations can be performed in a transactional operation.

What is Spring Batch Admin ?

Spring Batch Admin provides a web-based user interface (UI) that allows you to manage Spring Batch jobs. Spring Cloud Data Flow is now the recommended replacement for managing and monitoring Spring Batch jobs.

 What is Job?

Job is the batch process to be executed in a Spring Batch application without interruption from start to finish. This Job is further broken down into steps.
A Job is made up of many steps and each step is a READ-PROCESS-WRITE task or a single operation task (tasklet).

What is Step in job?
A Spring Batch Step is an independent part of a job. Each Step consists of an ItemReader, an ItemProcessor (optional) and an ItemWriter. Note: A Job can have one or more steps.

What is ItemReader?

An ItemReader reads data into a Spring Batch application from a particular source.

What is ItemWriter?

ItemWriter writes data from the Spring Batch application to a particular destination.

What is ItemProcessor?
After the ItemReader reads the input data, the ItemProcessor applies business logic to that data. The result is then written to a file or database by the ItemWriter.

What is Job Repository in Spring Batch?
Job Repository is used to persist all meta-data related to the execution of the Job.

How to configure the job in Spring Batch?
There are many ways to configure a Spring Batch Job. Here we use the builder style to define the job. The job needs a JobRepository to be configured. The Job below has three steps: one to load the notes, one to load the tasks and one to process those tasks.

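@Bean
public Job employeeJob() {
    // LoadNotes(), LoadTasks() and processTasks() return Step beans defined elsewhere.
    // A simple sequence of steps is finished with build(); end() is only needed for flow-based jobs.
    return this.jobBuilderFactory.get("notesJob")
                     .start(LoadNotes())
                     .next(LoadTasks())
                     .next(processTasks())
                     .build();
}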

What is batch processing architecture?

A Job represents the complete batch process and includes one or more steps. The steps of a job are set up to run in sequence, for example as specified in JSL (Job Specification Language).

 

What is execution context in Spring Batch?
If a batch run has to be restarted after an error such as a fatal exception, Spring Batch continues from the state stored in the ExecutionContext.

What is StepScope in Spring Batch?
For a bean declared with step scope (@StepScope), Spring Batch uses the Spring container to create a new instance of that bean for each step execution.

What is Step Partition in Spring Batch?
A Spring Batch job normally runs in a single process. To spread the work across several threads or processes, we can partition a Step. With step partitioning, a Step is divided into a number of child steps, which can run either as remote instances or as local execution threads.

What is Spring batch job launcher?
The Spring Batch JobLauncher is an interface for running jobs. Its run method takes two parameters: the Job to run and its JobParameters.

Example:

JobLauncher jobLauncher = context.getBean(JobLauncher.class);
Job testJob = context.getBean(TestJob.class);
jobLauncher.run(
        testJob,
        new JobParametersBuilder()
                .addString("inputFile", "file:./notes.txt")
                .addDate("date", new Date())
                .toJobParameters()
);

What is remote chunking in Spring Batch?
In Spring Batch remote chunking, the master step reads the data and passes it over to the slaves for processing.

What is the difference between Step vs Tasklet vs Chunk?

While Step is an independent phase of execution, Tasklet and Chunk are different processing structures used within a Step.

What are different types of process flow for Step execution?

  1. Tasklet Model
  2. Chunk Model

How to choose between Tasklet model and Chunk model?

Typically, when the Step execution task is simple, we choose Tasklet model and if the task processing is complex, we go for Chunk Model.

How do I schedule a Spring Batch job?

  1. Enable Scheduling with @EnableScheduling annotation.
  2. Annotate method with @Scheduled annotation.

With this, the method execution will happen at a schedule mentioned in @Scheduled annotation.

For example: @Scheduled(cron = "0 */1 * * * ?") runs the method once every minute.

 

How to implement security for Spring Batch ?

We will need to implement a method to authenticate the user -

 
public Authentication authenticateUser(String username, String password) {
    ProviderManager providerManager = (ProviderManager)applicationContext.getBean("authenticationManager");
    Authentication authentication = providerManager.authenticate(new UsernamePasswordAuthenticationToken(username, password));
    setAuthentication(authentication);
    return authentication;
}
 



What is Spring Batch Partitioning ?

In some scenarios a single-threaded application may not perform well enough. In such cases, Spring Batch partitioning is one way to scale batch jobs and improve performance. In Spring Batch, "partitioning" means using multiple threads (or processes), each of which processes a range of the data. For example, suppose a table has 100 records with primary ids from 1 to 100 and you want to process all of them. With partitioning, the ids can be split into ranges such as 1-25, 26-50, 51-75 and 76-100, and each range is processed by its own thread.
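The range-splitting idea can be sketched in plain Java. This is only an illustration under stated assumptions: RangePartitionSketch and its method are hypothetical names, and it does not use the Spring Batch API (there, a Partitioner implementation plays this role and returns one context per partition).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of range partitioning; RangePartitionSketch is a
// hypothetical name and not part of Spring Batch.
class RangePartitionSketch {

    // Splits the inclusive id range [minId, maxId] into at most gridSize contiguous ranges.
    static Map<String, long[]> partition(long minId, long maxId, int gridSize) {
        Map<String, long[]> partitions = new LinkedHashMap<>();
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        long start = minId;
        for (int i = 0; i < gridSize && start <= maxId; i++) {
            long end = Math.min(start + size - 1, maxId);
            partitions.put("partition" + i, new long[] {start, end});
            start = end + 1;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Each range could then be handed to its own worker thread.
        partition(1, 100, 4).forEach((name, range) ->
                System.out.println(name + ": " + range[0] + "-" + range[1]));
    }
}
```

With ids 1 to 100 and a grid size of 4, each worker gets 25 ids.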

Explain the conditions processing in Spring Batch?

Spring Batch follows the traditional batch architecture, where a job repository keeps track of jobs and their executions and a job launcher starts them. A job can have more than one step, and every step typically follows the sequence of reading data, processing it and writing it. Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management.
For example, a step may read data from a CSV file, process it and write it into the database. Spring Batch provides many ready-made classes to read/write CSV, XML and database data.

Explain the Spring Batch framework architecture?

Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems. High-volume batch jobs can leverage the framework in a highly scalable manner to process significant volumes of information.
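The architecture is layered into the following components:
Application - This component contains all the jobs and the code we write using the Spring Batch framework.
Batch Core - This component contains all the API classes that are needed to control and launch a Batch Job.
Batch Infrastructure - This component contains the common readers, writers and services used by both the application and the core.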


What is ExecutionContext in Spring Batch?

An ExecutionContext is a set of key-value pairs containing information that is scoped to either a StepExecution or a JobExecution. Spring Batch persists the ExecutionContext, which helps when you want to restart a batch run, for example after a fatal error has occurred.
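To make the restart idea concrete, here is a minimal plain-Java sketch. It assumes a Map stands in for the persisted ExecutionContext; RestartSketch and the "items.read" key are hypothetical names, not Spring Batch API.

```java
import java.util.List;
import java.util.Map;

// Plain-Java sketch of restarting from a saved key-value context.
// RestartSketch and the "items.read" key are hypothetical names.
class RestartSketch {

    // Processes items starting at the saved position; throws at index failAt (use -1 for no failure).
    static void run(List<String> items, Map<String, Object> context,
                    List<String> output, int failAt) {
        int position = (Integer) context.getOrDefault("items.read", 0);
        for (int i = position; i < items.size(); i++) {
            if (i == failAt) {
                throw new IllegalStateException("fatal error at item " + i);
            }
            output.add(items.get(i).toUpperCase());
            context.put("items.read", i + 1); // record progress after each item
        }
    }
}
```

If the first run fails at the third item, the context holds the value 2, so a second run with the same context resumes from the third item instead of starting over.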



What are the typical processing strategies in Spring Batch?

·       Normal processing during an offline batch window

·       Concurrent batch or online processing

·       Parallel processing of many different batch runs or jobs at the same time

·       Partitioning



What is spring batch listener?

Spring Batch listeners are a way of intercepting the execution of a Job or a Step to perform some meaningful operations or to log the progress.
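The interception idea can be sketched in plain Java as follows. The types here are hypothetical stand-ins written for illustration; Spring Batch has its own listener interfaces (such as JobExecutionListener and StepExecutionListener) with different signatures.

```java
import java.util.List;

// Plain-Java sketch of the listener idea; these types are hypothetical
// stand-ins, not Spring Batch's listener interfaces.
interface StepListenerSketch {
    void beforeStep(String stepName);
    void afterStep(String stepName, boolean succeeded);
}

class ListenerDemo {

    // Runs the step body, notifying the listener before and after it.
    static void runStep(String name, Runnable body, StepListenerSketch listener) {
        listener.beforeStep(name);
        boolean succeeded = false;
        try {
            body.run();
            succeeded = true;
        } finally {
            listener.afterStep(name, succeeded); // called even if the body throws
        }
    }

    // A listener that records each callback as a log line.
    static StepListenerSketch recording(List<String> log) {
        return new StepListenerSketch() {
            @Override
            public void beforeStep(String stepName) {
                log.add("before " + stepName);
            }

            @Override
            public void afterStep(String stepName, boolean succeeded) {
                log.add("after " + stepName + " succeeded=" + succeeded);
            }
        };
    }
}
```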


How can a person configure a job in Spring Batch framework?

A Job in Spring Batch contains a sequence of one or more Steps. Each Step can be configured with the attributes required to execute it:
next : the next step to execute
tasklet : the task or chunk to execute. A chunk can be configured with an ItemReader, ItemProcessor and ItemWriter.
decision : decides which steps need to be executed.

What is difference between Remote Partitioning and Remote Chunking in Spring Batch?

Remote Partitioning allows data to be partitioned and processed in parallel. For example, if we have 30 rows, the first partition would hold rows 1-10, the second rows 11-20, and so on. The master step holds the metadata describing each partition; the slaves read and process the data for their partitions and send the results back to the master for aggregation.
In Remote Chunking, the master step reads the data and passes it in chunks to its slaves for processing. The slaves process the items, and typically write them as well, and then report the result back to the master.


What is CommandLineJobRunner in Spring Batch?

CommandLineJobRunner is one of the ways to bootstrap your Spring Batch Job. A script launching the job needs a Java class with a main method as an entry point, and CommandLineJobRunner provides one, so you can start your job directly from the command line using its XML configuration.
The CommandLineJobRunner performs 4 tasks:

  1. Load the appropriate ApplicationContext.
  2. Parse command line arguments into JobParameters.
  3. Locate the appropriate job based on arguments.
  4. Use the JobLauncher provided in the application context to launch the job.
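The argument-parsing task can be sketched in plain Java. This is a simplified illustration that assumes parameters arrive as key=value pairs; ArgsToParametersSketch is a hypothetical name, and the real runner also handles the job path, the job name and typed parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of turning key=value arguments into job parameters.
// ArgsToParametersSketch is a hypothetical name; the real runner does more.
class ArgsToParametersSketch {

    static Map<String, String> parse(String[] args) {
        Map<String, String> parameters = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('='); // split on the first '=' only
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value but got: " + arg);
            }
            parameters.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return parameters;
    }
}
```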


What is Tasklet, and what is a Chunk?

The Tasklet is a simple interface with one method to execute. A tasklet can be used to perform single tasks like running queries, deleting files, etc. In Spring Batch, the tasklet is an interface that can be used to perform unique tasks like clean or set up resources before or after any step execution.

Spring Batch uses a ‘Chunk Oriented’ processing style within its most common implementation. Chunk Oriented Processing refers to reading the data one at a time and creating chunks that will be written out, within a transaction boundary.
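The read-one-at-a-time, write-in-chunks loop can be sketched in plain Java. ChunkSketch is a hypothetical name and the code is a simplification: the reader, processor and writer are ordinary Java functional types rather than Spring Batch's ItemReader, ItemProcessor and ItemWriter, and no real transaction is opened.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java sketch of chunk-oriented processing; ChunkSketch is hypothetical.
class ChunkSketch {

    // Reads items one at a time, processes each, and writes them in chunks of chunkSize.
    static <I, O> void run(Iterator<I> reader, Function<I, O> processor,
                           Consumer<List<O>> writer, int chunkSize) {
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // one write (and commit) per chunk
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk));     // final partial chunk
        }
    }
}
```

With five input items and a chunk size of 2, the writer is called three times: twice with two items and once with the remaining one.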

 

Spring Cloud Interview Questions

What are the benefits of using cloud ?

·       It is cost effective, thanks to economies of scale in acquiring equipment, utilities and maintenance.

·       It abstracts infrastructure complexity. We can focus our resources on our application's functional requirements instead of maintaining infrastructure or platforms.

·       It provides scalability. No matter how efficient an in-house IT team is, a cloud provider makes it much easier to scale dynamically.

·       Cloud providers offer a better service level than the best private infrastructure we could afford.

What is Spring Cloud ?

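·       Spring Cloud provides tools for building common patterns of distributed systems, such as configuration management, service discovery and circuit breakers, on top of Spring Boot. Since the cloud is used by almost every business organization for hosting, it was important for Spring to support it too.

·       Spring Cloud Stream App Starters are Spring Boot based Spring Integration applications that provide integration with external systems.

·       Spring Cloud Task is a short-lived microservices framework to quickly build applications that perform finite amounts of data processing.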

What is Spring Cloud Netflix?

Spring Cloud Netflix, part of the Spring Cloud platform, provides Netflix OSS integrations for Spring Boot apps.

Using annotations, you can quickly enable and configure the common cloud patterns inside your application and build large distributed systems using Netflix OSS components.

The patterns supported by Spring Cloud Netflix include Service Discovery (Eureka), Circuit Breaker (Hystrix), Intelligent Routing (Zuul) and Client-Side Load Balancing (Ribbon)

What are the key capabilities provided in Spring Cloud Netflix?

Spring Cloud Netflix has the following key capabilities.

1. Service Discovery - Spring Cloud Netflix framework provides the capability to register Eureka instances, which clients can discover using spring managed beans.

2. Circuit Breaker - Spring Cloud Netflix framework provides a simple annotation-driven method decorator to build Hystrix clients.

3. Declarative REST Client - Spring Cloud Netflix framework provides support for Feign which creates a dynamic implementation of an interface decorated with JAX-RS or Spring MVC annotations.

4. Client-Side Load Balancer - Spring Cloud Netflix framework provides support for Ribbon, a client-side load balancer provided in the Netflix OSS platform.

5. Router and Filter - Spring Cloud Netflix framework provides support for automatic registration of Zuul filters, and a simple convention over configuration approach to reverse proxy creation.

What does one mean by Service Registration and Discovery? How is it implemented in Spring Cloud ?

·       When we start a project, we usually have all the configurations in the properties file.

·       When more and more services are developed and deployed then adding and modifying these properties become more complex.

·       Some services might go down, while some the location might change. This manual changing of properties may create problems.

·       Eureka Service Registration and Discovery helps in such cases.

·       Because all services register with the Eureka server and lookups are done by calling the Eureka server, changes in service locations do not need to be handled manually. This is microservice registration and discovery with Spring Cloud using Netflix Eureka.

Mention few benefits for service discovery mechanism?

  • Availability : Service lookups are shared among all nodes of the service discovery cluster, so even if a node becomes unavailable, the other nodes take over.
  • Sharing Instances : Each node in the cluster shares instances of services.
  • Fault tolerant : If a service instance is not healthy, service discovery removes it from its table.
  • Load balanced : Service discovery ensures that service invocations are spread across all instances.
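Registration, removal of unhealthy instances and spreading of lookups can be sketched in plain Java. RegistrySketch is a hypothetical, single-node illustration and far simpler than Eureka; the round-robin choice is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of a service registry; RegistrySketch is hypothetical
// and far simpler than Eureka.
class RegistrySketch {
    private final Map<String, List<String>> instances = new HashMap<>();
    private final Map<String, Integer> counters = new HashMap<>();

    // Called by a service instance when it starts.
    void register(String service, String url) {
        instances.computeIfAbsent(service, k -> new ArrayList<>()).add(url);
    }

    // Called when an instance is found to be unhealthy.
    void deregister(String service, String url) {
        List<String> urls = instances.get(service);
        if (urls != null) {
            urls.remove(url);
        }
    }

    // Returns the next instance in round-robin order.
    String lookup(String service) {
        List<String> urls = instances.get(service);
        if (urls == null || urls.isEmpty()) {
            throw new IllegalStateException("no instances of " + service);
        }
        int next = counters.merge(service, 1, Integer::sum) - 1;
        return urls.get(next % urls.size());
    }
}
```

Callers only know the service name; the registry decides which instance URL they get.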

What is Netflix Feign?
Feign is a declarative web service client. It makes writing web service clients easier. To use Feign create an interface and annotate it.
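The "create an interface and annotate it" idea can be illustrated with a JDK dynamic proxy. This is only a sketch of the declarative style, not Feign itself: GetSketch, NotesClientSketch and DeclarativeClientSketch are hypothetical names, and instead of sending an HTTP request the generated implementation just returns the request line it would send.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;

// Hypothetical annotation standing in for a request-mapping annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface GetSketch {
    String value();
}

// The client is only an annotated interface; no implementation is written by hand.
interface NotesClientSketch {
    @GetSketch("/notes/{id}")
    String getNote(String id);
}

class DeclarativeClientSketch {

    // Builds an implementation at runtime that turns each call into a request line.
    static <T> T create(Class<T> api, String baseUrl) {
        Object proxy = Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api},
                (self, method, args) -> {
                    GetSketch get = method.getAnnotation(GetSketch.class);
                    if (get == null) {
                        throw new UnsupportedOperationException(method.getName());
                    }
                    String path = get.value().replace("{id}", String.valueOf(args[0]));
                    return "GET " + baseUrl + path; // a real client would send this over HTTP
                });
        return api.cast(proxy);
    }
}
```

The caller works with NotesClientSketch like any other Java interface and never builds URLs by hand.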

What are the advantages of Netflix Feign?
Netflix provides Feign as an abstraction over REST-based calls. Microservices can use it to communicate with each other without developers having to deal with the internal details of REST.

What is Eureka?

Eureka is a Service Discovery Server and Client provided in the Netflix OSS platform. Service Discovery is one of the key tenets of a microservice-based cloud architecture.

How do you include Eureka in your project?

To include the Eureka Client in your project, use spring-cloud-starter-netflix-eureka-client dependency and @EnableEurekaClient annotation on the Spring Boot Application class.

To include the Eureka server in your project, use spring-cloud-starter-netflix-eureka-server dependency and @EnableEurekaServer annotation on the Spring Boot Application class.

What is Hystrix?

Hystrix is a library developed by Netflix that implements the Circuit Breaker pattern.

In a microservice architecture, it is common to have multiple layers of service calls, i.e. one microservice can call multiple downstream microservices. A failure in any one of the lower-level services can cascade all the way up to the user.

Circuit Breaker pattern provides a fallback mechanism, which avoids cascading of failures up to the user.
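The pattern itself can be sketched in a few lines of plain Java. CircuitBreakerSketch is a hypothetical, simplified illustration: it counts consecutive failures and opens the circuit at a threshold, and it leaves out features a real library such as Hystrix provides (timeouts, thread isolation, a half-open state that retries after a delay).

```java
import java.util.function.Supplier;

// Plain-Java sketch of the Circuit Breaker pattern; CircuitBreakerSketch is
// hypothetical and omits timeouts and the half-open state.
class CircuitBreakerSketch {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    CircuitBreakerSketch(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    // Calls the remote operation unless the circuit is open; uses the fallback on failure.
    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get();       // short-circuit: the failing service is not called
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;     // a success resets the failure streak
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            return fallback.get();
        }
    }
}
```

Once the circuit is open, callers get the fallback immediately and the struggling downstream service is left alone.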

 

How do you include Hystrix in your project?

To include Hystrix in your project, use spring-cloud-starter-netflix-hystrix dependency and @EnableCircuitBreaker annotation on the Spring Boot Application class.

Use the @HystrixCommand annotation on the method for which fallback method has to be applied.

What is Zuul?

Zuul is a JVM-based router and server-side load balancer developed by Netflix and included in the Netflix OSS package.

How do you include Zuul in your project?

To include Zuul in your project, use the spring-cloud-starter-netflix-zuul dependency and the @EnableZuulProxy annotation on the Spring Boot Application class.


What are the different kinds of filters provided by Zuul?

Zuul provides the following filter types that correspond to the lifecycle of a request.

1. PRE Filters - Filters that execute before routing to the origin server.

2. ROUTING Filters - Filters that handle routing the request to an origin. Builds HTTP Request and calls the Origin server using Apache HttpClient or Netflix Ribbon.

3. POST Filters - Filters that execute after the request has been routed to the origin.

4. ERROR Filters - Filters that execute when an error occurs during any one of the phases.
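The order of the four phases can be sketched in plain Java. FilterLifecycleSketch is a hypothetical name and the code only models the ordering, not Zuul's filter classes. It assumes that post filters still run after an error in the pre or routing phase, which is how Zuul 1 is understood to behave.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the filter lifecycle; FilterLifecycleSketch is
// hypothetical and only models the order of the phases.
class FilterLifecycleSketch {

    // Returns the phases that ran for one request, in order.
    static List<String> handle(boolean routingFails) {
        List<String> trace = new ArrayList<>();
        try {
            trace.add("pre");                   // e.g. authentication, logging
            if (routingFails) {
                throw new IllegalStateException("origin unreachable");
            }
            trace.add("route");                 // forward the request to the origin
        } catch (RuntimeException e) {
            trace.add("error");                 // error filters handle the failure
        }
        trace.add("post");                      // e.g. add response headers (assumed to run after errors too)
        return trace;
    }
}
```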