What kind of challenges do distributed systems introduce?
When you implement a microservices architecture, there are challenges you need to deal with for every single microservice. Moreover, when you consider how the services interact with each other, further challenges appear. If you plan ahead to overcome some of them and standardize them across all microservices, it also becomes easier for developers to maintain the services.
Some of the most challenging things are testing, debugging, security, version management, communication (sync or async), state maintenance, etc. Some of the cross-cutting concerns which should be standardized are monitoring, logging, performance improvement, deployment, security, etc.
On what basis should microservices be defined?
It is a very subjective question, but to the best of my knowledge I can say that it should be based on the following criteria.
i) Business functionalities that change together in a bounded context
ii) The service should be testable independently.
iii) Changes can be made without affecting clients or dependent services.
iv) It should be small enough that it can be maintained by 2-5 developers.
v) Reusability of the service
How to tackle service failures when there are
dependent services?
In real-world systems, it happens that a particular service causes downtime while the other services keep functioning as intended. Under such conditions, the failing service and its dependent services get affected by the downtime.
In order to solve this issue, there is a concept in the microservices architecture pattern called the circuit breaker. Any service calling a remote service can go through a proxy layer which acts like an electrical circuit breaker. If the remote service is slow or down for ‘n’ attempts, the proxy layer should fail fast and keep checking the remote service for its availability again. The calling services should also handle the errors and provide retry logic. Once the remote service resumes, the services start working again and the circuit is closed.
This way, all other functionality works as expected; only the failing service and its dependent services get affected.
How can one achieve automation in microservice
based architecture?
This is related to
the automation for cross-cutting concerns. We can standardize some of the
concerns like monitoring strategy, deployment strategy, review and commit
strategy, branching and merging strategy, testing strategy, code structure
strategies etc.
For standards, we can follow the 12-factor application guidelines. If we follow them, we can definitely achieve great productivity from day one. We can also containerize our application to utilize the latest DevOps themes like dockerization. We can use Mesos, Marathon or Kubernetes for orchestrating Docker images. Once we have a dockerized codebase, we can use a CI/CD pipeline to deploy the newly created code. Within that, we can add mechanisms to test the application and make sure we measure the required metrics before deploying the code. We can use strategies like blue-green deployment or canary deployment to release the code, so that we understand its impact before it goes live on all of the servers at the same time.
What should one do so that troubleshooting
becomes easier in microservice based architecture?
In a monolith, where an HTTP request waits for a response, the processing happens in memory within a single process, which makes it comparatively easy to ensure that a transaction spanning multiple modules works as expected. This becomes challenging in the case of microservices because all services run independently, their datastores can be independent, and their REST APIs are deployed on different endpoints. Each service does its bit without knowing the context of the other microservices.
In this case, we can use the following measures to make sure we are able to trace the errors easily.
- Services should write logs, and log aggregators should push them to a centralized logging server. For example, use the ELK stack for analysis.
- A unique value per client request (correlation id) should be logged in all the microservices so that errors can be traced on the central logging server; a minimal filter for this is sketched after this list.
- One should have good monitoring in place for each and every microservice in the ecosystem, which can record application metrics, health checks of the services, traffic patterns and service failures.
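As an illustration, here is a minimal sketch of such a filter, assuming Spring Boot and SLF4J; the header name X-Correlation-Id and the class name are illustrative, not from the original text. It reads or generates a correlation id and puts it on the logging MDC so it appears in every log line:

import java.io.IOException;
import java.util.UUID;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

// Hypothetical filter: propagates a correlation id into the logging context (MDC)
@Component
public class CorrelationIdFilter extends OncePerRequestFilter {

    private static final String HEADER = "X-Correlation-Id"; // assumed header name

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        String correlationId = request.getHeader(HEADER);
        if (correlationId == null || correlationId.isEmpty()) {
            correlationId = UUID.randomUUID().toString();   // first hop generates the id
        }
        MDC.put("correlationId", correlationId);            // picked up by the log pattern
        response.setHeader(HEADER, correlationId);          // echoed back for clients
        try {
            chain.doFilter(request, response);
        } finally {
            MDC.remove("correlationId");                    // avoid leaking into other requests
        }
    }
}

Outgoing calls to other microservices would also need to forward the same header (for example via a RestTemplate interceptor) so the id travels across the whole call chain.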
How should microservices communicate with each
other?
It is an important design decision. The communication between services might or might not be necessary. It can happen synchronously or asynchronously, and it can happen sequentially or in parallel. So, once we have decided what our communication mechanism should be, we can pick the technology which suits it best.
Here are some of the
examples which you can consider.
A. Communication can be done using a queuing service like RabbitMQ, ActiveMQ or Kafka. This is called asynchronous communication (a publishing sketch follows this list).
B. Direct API calls can also be made to a microservice. With this approach, inter-service dependency increases. This is called synchronous communication.
C. Webhooks to push data to connected clients/services.
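For option A, a minimal publishing sketch using Spring AMQP's RabbitTemplate might look like the following; the exchange name, routing key and payload are assumptions for illustration only:

import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// Hypothetical publisher: pushes an event to the broker instead of calling the
// downstream service directly, keeping the two services loosely coupled.
@Service
public class OrderEventPublisher {

    @Autowired
    private RabbitTemplate rabbitTemplate;

    public void publishOrderPlaced(String orderId) {
        // "order-exchange" and "order.placed" are assumed names, not from the original text
        rabbitTemplate.convertAndSend("order-exchange", "order.placed", orderId);
    }
}

Consumers subscribe to the corresponding queue and react in their own time, so the publisher never blocks on the consumer's availability.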
How would you implement authentication in
microservice architecture?
There are mainly two
ways to achieve authentication in microservices architecture.
A. Centralized
sessions
All the microservices can use a central session store, and user authentication can be achieved through it. This approach works but has many drawbacks. The centralized session store must also be protected, and services should connect to it securely. The application needs to manage the state of the user, so this is called a stateful session.
B. Token-based
authentication/authorization
In this approach, unlike the traditional way, the information in the form of a token is held by the clients, and the token is passed along with each request. A server can check the token and verify its validity, e.g. expiry. Once the token is validated, the identity of the user can be obtained from it. However, the token must be signed (and possibly encrypted) for security reasons. JWT (JSON Web Token) is an open standard for this which is widely used, mainly in stateless applications. Alternatively, you can use OAuth-based authentication mechanisms.
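As an illustration, assuming the jjwt library and an already-shared signing secret (both assumptions, not from the original text), a resource service might validate an incoming token roughly like this:

import io.jsonwebtoken.Claims;
import io.jsonwebtoken.JwtException;
import io.jsonwebtoken.Jwts;

// Hypothetical token check: verify the JWT signature/expiry and read the user identity from it.
public class TokenVerifier {

    private final String secret;   // key management/rotation is out of scope for this sketch

    public TokenVerifier(String secret) {
        this.secret = secret;
    }

    public String extractUsername(String token) {
        try {
            Claims claims = Jwts.parser()
                    .setSigningKey(secret)   // signature check; an expired token throws as well
                    .parseClaimsJws(token)
                    .getBody();
            return claims.getSubject();      // e.g. the user id stored in the "sub" claim
        } catch (JwtException e) {
            throw new IllegalArgumentException("Invalid or expired token", e);
        }
    }
}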
What would be your logging strategy in a
microservice architecture?
Logging is a very important aspect of any application. If we have done proper logging in an application, it becomes easy to support its other aspects as well. For example, in order to debug issues or understand what business logic was executed, it becomes very critical to log the important details.
Ideally, you should follow these practices for logging.
A. In a microservice architecture, each request should have a unique value (correlation id), and this value should be passed to each and every microservice so the correlation id can be logged across the services. Thus, the requests can be traced.
B. Logs generated by all the services should be aggregated in a single location so that searching becomes easier. Generally, people use the ELK stack for this, so that it becomes easy for support staff to debug issues.
How would you manage application configuration in a microservice running in a container?
As container-based
deployment involves a single image per microservice, it is a bad idea to bundle
the configuration along with the image.
This approach is not at all scalable because we might have multiple environments, and we might also have to take care of geographically distributed deployments with different configurations.
Also, when an application and a cron application are part of the same codebase, additional care might be needed in production, as it can have repercussions on how the crons are architected.
To solve this, we can put all of our configuration in a centralized config service which the application can query for all of its configuration at runtime. Spring Cloud Config is one example of a service which provides this facility.
It also helps to secure the information, as the configuration might contain passwords, access to reports, or database access controls. Only trusted parties should be allowed to access these details for security reasons.
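A minimal sketch of such a config service using Spring Cloud Config, assuming the spring-cloud-config-server dependency and a backing Git repository configured via spring.cloud.config.server.git.uri (names are illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.config.server.EnableConfigServer;

// Hypothetical config server: serves configuration from a backing Git repository.
// Client services fetch their properties at startup by pointing spring.cloud.config.uri at this app.
@EnableConfigServer
@SpringBootApplication
public class ConfigServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(ConfigServerApplication.class, args);
    }
}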
What is container orchestration and how does
it help in a microservice architecture?
In a production
environment, you don’t just deal with the application code/application server.
You need to deal with API Gateway, Proxy Servers, SSL terminators, Application Servers,
Database Servers, Caching Services, and other dependent services.
In a modern microservice architecture, where each microservice runs in a separate container, deploying and managing these containers manually is very challenging and error-prone.
Container
orchestration solves this problem by managing the life cycle of a container and
allows us to automate the container deployments.
It also helps in scaling the application: it can easily bring up additional containers whenever there is high load on the application, and once the load goes down it can scale back down by bringing containers down. This helps adjust cost based on requirements.
Also, in some cases, it takes care of the internal networking between services so that you need not make any extra effort to do so. It also helps us to replicate or deploy Docker images at runtime without worrying about the resources. If you need more resources, you can configure that in the orchestration service and they will be available/deployed on the production servers within minutes.
Explain the API gateway and why one should use
it?
A. A gateway can authenticate requests by verifying the identity of the user, routing each and every request to the authentication service before routing it to the microservice, with the authorization details carried in a token.
B. Gateways are also responsible for load balancing the requests.
C. API Gateways are responsible for rate limiting certain types of requests to protect themselves from several kinds of attacks.
D. API Gateways can whitelist or blacklist the source IP addresses or given domains which can initiate the call.
E. API Gateways can also provide plugins to cache certain types of API responses to boost the performance of the application.
How will you ensure data consistency in
microservice based architecture?
One should avoid sharing a database between microservices; instead, APIs should be exposed to perform any change.
If there is any dependency between microservices, then the service holding the data should publish messages for any change in the data, which other services can consume to update their local state.
If strong consistency is required, then microservices should not maintain local state and instead should pull the data from the source of truth whenever required by making an API call.
What is event sourcing in microservices
architecture?
In the microservices architecture, it is possible that, due to service boundaries, you often need to update one or more entities on the state change of another entity. In that case, one publishes a message, and a new event gets created and appended to the already executed events. In case of failure, one can replay all events in the same sequence and arrive at the desired state. You can think of event sourcing as your bank account statement.
You will start your account with some initial money. Then all the credit and debit events happen, and the latest state is generated by applying all of the events one by one. When there are too many events, the application can create a periodic snapshot so that there isn't any need to replay all of the events again and again.
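A toy sketch of the bank-account analogy in plain Java (class and event names are illustrative) shows how the current state is derived purely by replaying events:

import java.util.ArrayList;
import java.util.List;

// Hypothetical event-sourced account: the balance is never stored directly,
// it is rebuilt by replaying the append-only list of credit/debit events.
public class Account {

    public static final class MoneyEvent {
        final String type;   // "CREDIT" or "DEBIT"
        final long amount;
        MoneyEvent(String type, long amount) { this.type = type; this.amount = amount; }
    }

    private final List<MoneyEvent> events = new ArrayList<>();

    public void credit(long amount) { events.add(new MoneyEvent("CREDIT", amount)); }
    public void debit(long amount)  { events.add(new MoneyEvent("DEBIT", amount)); }

    // Replay all events (or only those since the last snapshot) to derive the current state
    public long balance() {
        long balance = 0;
        for (MoneyEvent e : events) {
            balance += "CREDIT".equals(e.type) ? e.amount : -e.amount;
        }
        return balance;
    }
}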
How will you implement service discovery in microservices architecture?
Servers come and go in a cloud environment, and new instances of the same services can be deployed to cater to an increasing load of requests. So, it becomes absolutely essential to have a service registry & discovery mechanism that can be queried to find the address (host, port & protocol) of a given server. We may also need to locate servers for client-side load balancing (Ribbon) and for handling failover gracefully (Hystrix).
Spring Cloud solves
this problem by providing a few ready-made solutions for this challenge. There
are mainly two options available for the service discovery - Netflix Eureka
Server and Consul. Let's discuss both of these briefly:
Netflix Eureka Server
Eureka is a REST (Representational State Transfer) based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers. The main features of Netflix Eureka are:
- It provides a service registry.
- Zone-aware service lookup is possible.
- The eureka-client (used by microservices) can cache the registry locally for faster lookup. The client also has a built-in load balancer that does basic round-robin load balancing.
Spring Cloud
provides two dependencies - eureka-server and eureka-client. Eureka server
dependency is only required in eureka server’s build.gradle
build.gradle -
Eureka Server
compile('org.springframework.cloud:spring-cloud-starter-netflix-eureka-server')
On the other hand, each microservice needs to include the eureka-client dependency to enable eureka discovery.
build.gradle -
Eureka Client (to be included in all microservices)
compile('org.springframework.cloud:spring-cloud-starter-netflix-eureka-client')
Eureka server provides a basic dashboard for monitoring the various instances and their health in the service registry. The UI is written in FreeMarker and provided out of the box without any extra configuration.
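A minimal sketch of the registry side, assuming the eureka-server dependency above is on the classpath (the class name is illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

// One such application per zone acts as the service registry.
@EnableEurekaServer
@SpringBootApplication
public class EurekaServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(EurekaServerApplication.class, args);
    }
}

Each microservice then typically annotates its own main class with @EnableDiscoveryClient (as in the ClientApp example later in this chapter) and points eureka.client.serviceUrl.defaultZone at this registry.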
Consul Server
It is a REST-based
tool for dynamic service registry. It can be used for registering a new
service, locating a service and health checkup of a service.
You have the option
to choose any one of the above in your spring cloud-based distributed
application. In this book, we will focus more on the Netflix Eureka Server option.
How will you use config-server for your
development, stage and production environment?
If you have 3
different environments (develop/stage/production) in your project setup, then
you need to create three different config storage projects. So, in total, you
will have four projects:
config-server
It is the config-server that can be deployed in each environment. It is the Java Code without configuration storage.
config-dev
It is the git storage for your development configuration. All microservices in the development environment will fetch their configuration from this storage. This project has no Java code, and it is meant to be used with config-server.
config-qa
Same as config-dev but it’s meant to be used only in qa environment.
config-prod
Same as config-dev but meant for the production environment.
So depending upon the environment, we will use config-server together with either config-dev, config-qa or config-prod.
How does Eureka Server work?
There are two main
components in Eureka project: eureka-server and eureka-client.
Eureka Server
The central server (one per zone) that acts as a service registry. All microservices register with this eureka server during app bootstrap.
Eureka Client
Eureka also comes with a Java-based client component, the eureka-client, which makes interactions with the service much easier. The client also has a built-in load balancer that does basic round-robin load balancing. Each microservice in the distributed ecosystem must include this client to communicate and register with the eureka-server.
Typical use case for Eureka
There is usually one eureka server cluster per region (US, Asia, Europe, Australia) which knows only about instances in its region. Services register with Eureka and then send heartbeats to renew their leases every 30 seconds. If a service cannot renew its lease a few times in a row, it is taken out of the server registry in about 90 seconds. The registration information and the renewals are replicated to all the eureka nodes in the cluster. Clients from any zone can look up the registry information (this happens every 30 seconds) to locate their services (which could be in any zone) and make remote calls.
Eureka clients are
built to handle the failure of one or more Eureka servers. Since Eureka clients
have the registry cache information in them, they can operate reasonably well,
even when all the eureka servers go down.
What is Circuit Breaker Pattern?
Microservices often need to make remote network calls to other microservices running in different processes. Network calls can fail due to many reasons, including-
- The brittle nature of the network itself
- The remote process being hung, or
- More traffic on the target microservice than it can handle
This can lead to
cascading failures in the calling service due to threads being blocked in the
hung remote calls. A circuit breaker is a piece of software that is used to
solve this problem. The basic idea is very simple - wrap a potentially failing
remote call in a circuit breaker object that will monitor for
failures/timeouts. Once the failures reach a certain threshold, the circuit
breaker trips, and all further calls to the circuit breaker return with an
error, without the protected call being made at all. This mechanism can protect
the cascading effects of a single component failure in the system and provide
the option to gracefully downgrade the functionality.
Typical Circuit Breaker Implementation
Here a REST client calls the Recommendation Service, which in turn communicates with the Books Service using a circuit breaker call wrapper. As soon as the books-service API calls start to fail, the circuit breaker will trip (open) the circuit and will not make any further calls to book-service until the circuit is closed again.
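To make the idea concrete, here is a toy, hand-rolled sketch of the pattern in plain Java (this is not the Hystrix implementation discussed below; thresholds and names are illustrative):

import java.util.function.Supplier;

// Toy circuit breaker: counts consecutive failures and fails fast while the circuit is open.
public class SimpleCircuitBreaker {

    private final int failureThreshold;      // failures before the circuit opens
    private final long openIntervalMillis;   // how long to stay open before a trial call is allowed
    private int failureCount = 0;
    private long openedAt = 0;

    public SimpleCircuitBreaker(int failureThreshold, long openIntervalMillis) {
        this.failureThreshold = failureThreshold;
        this.openIntervalMillis = openIntervalMillis;
    }

    public <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get();           // fail fast: the remote call is not attempted at all
        }
        try {
            T result = remoteCall.get();
            failureCount = 0;                // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            failureCount++;
            if (failureCount >= failureThreshold) {
                openedAt = System.currentTimeMillis();   // trip the circuit
            }
            return fallback.get();
        }
    }

    private boolean isOpen() {
        boolean tripped = failureCount >= failureThreshold;
        boolean coolingDown = System.currentTimeMillis() - openedAt < openIntervalMillis;
        return tripped && coolingDown;       // after the cool-down one trial call gets through (half-open)
    }
}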
What are Open, Closed and Half-Open states of
Circuit Breaker?
The Circuit Breaker wraps the original remote calls inside it, and if any of these calls fail, the failure is counted. When the service dependency is healthy and no issues are detected, the circuit breaker is in the Closed state, and all invocations are passed through to the remote service.
If the failure count
exceeds a specified threshold within a specified time period, the circuit trips
into the Open State. In the Open State, calls always fail immediately without
even invoking the actual remote call. The following factors are considered for
tripping the circuit to Open State –
- An Exception thrown (HTTP 500 error, can not
connect)
- Call takes longer than the configured timeout
(default 1 second)
- The internal thread pool (or semaphore depending on configuration) used by hystrix for the command execution rejects the execution due to exhausted resource pool.
After a
predetermined period of time (by default 5 seconds), the circuit transitions
into a half-open state. In this state, calls are again attempted to the remote
dependency. Thereafter the successful calls transition the circuit breaker back
into the closed state, while the failed calls return the circuit breaker into
the open state.
What are use-cases for Circuit Breaker Pattern
and benefits of using Circuit Breaker Pattern?
- Synchronous communication over the network that
is likely to fail is a potential candidate for circuit breaker.
- A circuit breaker is a valuable place for monitoring; any change in the breaker state should be logged so as to enable deep monitoring of microservices. This makes it easy to troubleshoot the root cause of failure.
- All places where degraded functionality is acceptable to the caller if the actual server is struggling/down.
Benefits: -
- The circuit breaker can prevent a single
service from failing the entire system by tripping off the circuit to the
faulty microservice.
- The circuit breaker can help offload requests from a struggling server by tripping the circuit, thereby giving it time to recover.
- It provides a fallback mechanism whereby stale data can be served if the real service is down.
What is Hystrix?
Hystrix is Netflix's implementation of the circuit breaker pattern, which also employs the bulkhead design pattern by operating each circuit breaker within its own thread pool. It also collects many useful metrics about the circuit breaker's internal state, including -
- Traffic volume.
- Request volume.
- Error percentage.
- Hosts reporting
- Latency percentiles.
- Successes, failures, and rejections.
All these metrics
can be aggregated using another Netflix OSS project called Turbine. Hystrix
dashboard can be used to visualize these aggregated metrics, providing
excellent visibility into the overall health of the distributed system.
Hystrix can be used to specify the fallback method for execution in case the actual method call fails. This can be useful for graceful degradation of functionality in case of failure in remote invocation.
Add the hystrix library to build.gradle:
dependencies {
    compile('org.springframework.cloud:spring-cloud-starter-hystrix')
}
1) Enable Circuit Breaker in the main application
@EnableCircuitBreaker
@RestController
@SpringBootApplication
public class ReadingApplication {
    ...
}
2) Use the HystrixCommand fallback method execution
@HystrixCommand(fallbackMethod = "reliable")
public String readingList() {
    URI uri = URI.create("http://localhost:8090/recommended");
    return this.restTemplate.getForObject(uri, String.class);
}

public String reliable() {
    return "Cached recommended response";
}
- Using the @HystrixCommand annotation, we specify the fallback method to execute in case of an exception.
- The fallback method should have the same signature (return type) as the original method. This method provides a graceful fallback behavior while the circuit is in the open or half-open state.
What is the difference between using a Circuit
Breaker and a naive approach where we try/catch a remote method call and
protect for failures?
Let's say we want to handle service-to-service failure gracefully without using the Circuit Breaker pattern. The naive approach would be to wrap the REST call in a try-catch clause. But a Circuit Breaker does a lot more than a try-catch can accomplish -
- Circuit Breaker does not even try calls once
the failure threshold is reached, doing so reduces the number of network
calls. Also, several threads consumed in making faulty calls are freed up.
- Circuit breaker provides fallback method
execution for gracefully degrading the behavior. Try catch approach will
not do this out of the box without additional boiler plate code.
- Circuit Breaker can be configured to use a
limited number of threads for a particular host/API, doing so brings all
the benefits of bulkhead design pattern.
So instead of
wrapping service to service calls with try/catch clause, we must use the
circuit breaker pattern to make our system resilient to failures.
How will you ignore certain exceptions in
Hystrix fallback execution?
@HystrixCommand
annotation provides attribute ignoreExceptions that can be used to provide a
list of ignored exceptions.
Code
@Service
public class HystrixService {

    @Autowired
    private LoadBalancerClient loadBalancer;

    @Autowired
    private RestTemplate restTemplate;

    @HystrixCommand(fallbackMethod = "reliable",
            ignoreExceptions = {IllegalStateException.class,
                    MissingServletRequestParameterException.class,
                    TypeMismatchException.class})
    public String readingList() {
        ServiceInstance instance = loadBalancer.choose("product-service");
        URI uri = URI.create("http://product-service/product/recommended");
        return this.restTemplate.getForObject(uri, String.class);
    }

    public String reliable(Throwable e) {
        return "Cloud Native Java (O'Reilly)";
    }
}
In the above example, if the actual method call throws IllegalStateException, MissingServletRequestParameterException or TypeMismatchException, then hystrix will not trigger the fallback logic (the reliable method); instead, the actual exception will be wrapped inside HystrixBadRequestException and re-thrown to the caller. This is taken care of by the javanica library under the hood.
What is Strangulation Pattern in microservices
architecture?
Strangulation is
used to slowly decommission an older system and migrate the functionality to a
newer version of microservices.
Normally one
endpoint is Strangled at a time, slowly replacing all of them with the newer
implementation. Zuul Proxy (API Gateway) is a useful tool for this because we
can use it to handle all traffic from clients of the old endpoints but redirect
only selected requests to the new ones.
Let’s take an
example use-case:
/src/main/resources/application.yml
zuul:
  routes:
    first:
      path: /first/**
      url: http://first.example.com   # (1)
    legacy:
      path: /**
      url: http://legacy.example.com  # (2)
1) Paths under /first/** have been extracted into a new service with the external URL http://first.example.com.
2) The legacy app is mapped to handle all requests that do not match any other pattern (/first/**).
This configuration is for the API Gateway (zuul reverse proxy): we are slowly strangling the selected endpoints /first/** away from the legacy app hosted at http://legacy.example.com towards the newly created microservice with the external URL http://first.example.com.
How does Hystrix implement Bulkhead Design
Pattern?
The bulkhead
implementation in Hystrix limits the number of concurrent calls to a
component/service. This way, the number of resources (typically threads) that
are waiting for a reply from the component/service is limited.
Let's assume we have
a fictitious web e-commerce application as shown in the figure below. The
WebFront communicates with 3 different components using remote network calls
(REST over HTTP).
- Product catalogue Service
- Product Reviews Service
- Order Service
Now let's say that, due to some problem in the Product Reviews Service, all requests to this service start to hang (or time out), eventually causing all request-handling threads in the WebFront Application to hang while waiting for an answer from the Reviews Service. This would make the entire WebFront Application non-responsive. The resulting behavior would be the same if the request volume is high and the Reviews Service is taking a long time to respond to each request.
The Hystrix Solution
Hystrix's implementation of the bulkhead pattern would limit the number of concurrent calls to each component and would have saved the application in this case by gracefully degrading the functionality. Assume we have 30 request-handling threads in total and there is a limit of 10 concurrent calls to the Reviews Service. Then at most 10 request-handling threads can hang when calling the Reviews Service; the other 20 threads can still handle requests and use the Products and Orders Services. This approach will keep our WebFront responsive even if there is a failure in the Reviews Service.
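Assuming the javanica annotations used earlier in this chapter, limiting concurrency for calls to the Reviews Service might look roughly like this (the pool name, sizes and URL are illustrative):

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class ReviewsClient {

    @Autowired
    private RestTemplate restTemplate;

    // A dedicated thread pool (bulkhead) for reviews calls: at most 10 concurrent calls,
    // so a hung Reviews Service cannot exhaust the WebFront's request-handling threads.
    @HystrixCommand(
            fallbackMethod = "defaultReviews",
            threadPoolKey = "reviewsPool",
            threadPoolProperties = {
                    @HystrixProperty(name = "coreSize", value = "10"),
                    @HystrixProperty(name = "maxQueueSize", value = "5")
            })
    public String fetchReviews(String productId) {
        return restTemplate.getForObject("http://reviews-service/reviews/" + productId, String.class);
    }

    public String defaultReviews(String productId) {
        return "[]";   // degraded behavior: render the product page without reviews
    }
}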
How to handle versioning of microservices?
There are different ways to handle the versioning of your REST API to allow older consumers to still consume the older endpoints. The ideal practice is that any non-backward-compatible change in a given REST endpoint shall lead to a new versioned endpoint.
The different mechanisms of versioning are:
- Add version in the URL itself
- Add version in API request header
The most common approach to versioning is URL versioning itself. A versioned URL looks like the following:
Versioned URL
https://&lt;host&gt;:&lt;port&gt;/api/v1/...
As an API developer you must ensure that only backward-compatible changes are accommodated in a single version of the URL. Consumer-Driven Contract tests can help identify potential issues with API upgrades at an early stage.
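A sketch of URL-based versioning with Spring MVC, where both versions live side by side (paths and payload shapes are illustrative):

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// v1 consumers keep working while v2 introduces the non-backward-compatible change.
@RestController
public class ProductController {

    @GetMapping("/api/v1/products/{id}")
    public ProductV1 getProductV1(@PathVariable String id) {
        return new ProductV1(id, "Blue T-Shirt");             // legacy shape
    }

    @GetMapping("/api/v2/products/{id}")
    public ProductV2 getProductV2(@PathVariable String id) {
        return new ProductV2(id, "Blue T-Shirt", "Apparel");  // breaking change: new mandatory field
    }

    static class ProductV1 {
        public final String id, name;
        ProductV1(String id, String name) { this.id = id; this.name = name; }
    }

    static class ProductV2 {
        public final String id, name, category;
        ProductV2(String id, String name, String category) {
            this.id = id; this.name = name; this.category = category;
        }
    }
}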
Is it a good idea to share a common database
across multiple microservices?
In a microservices architecture, each microservice shall own its private data, which can only be accessed by the outside world through the owning service. If we start sharing a microservice's private datastore with other services, then we will violate the principle of Bounded Context.
Practically we have
three approaches -
- Database server per microservice - Each microservice will have its own database server instance. This approach has the overhead of maintaining the database instance and its replication/backup, hence it is rarely used in practical environments.
- Schema per microservice - Each microservice owns a private database schema which is not accessible to other services. It is the most preferred approach for RDBMS databases (MySQL, Postgres, etc.).
- Private table per microservice - Each microservice owns a set of tables that must only be accessed by that service. It's a logical separation of data. This approach is mostly used for hosted database-as-a-service solutions (Amazon RDS).
What are best practices for microservices
architecture?
Microservices
Architecture can become cumbersome & unmanageable if not done properly.
There are best practices that help design a resilient & highly scalable
system. The most important ones are
Partition correctly
Get to know the domain of your business; that's very important. Only then will you be able to define the bounded context and partition your microservices correctly based on business capabilities.
DevOps culture
Typically, everything from continuous integration all the way to continuous delivery and deployment should be automated. Otherwise, it is a big pain to manage a large fleet of microservices.
Design for stateless operations
We never know where a new instance of a particular microservice will be spun up for scaling out or for handling failure, so maintaining state inside a service instance is a very bad idea.
Design for failures
Failures are inevitable in distributed systems, so we must design our system to handle failures gracefully. Failures can be of different types and must be dealt with accordingly, for example -
- Failure could be transient due to inherent
brittle nature of the network, and the next retry may succeed. Such
failures must be protected using retry operations.
- Failure may be due to a hung service which can
have cascading effects on the calling service. Such failures must be
protected using Circuit Breaker Patterns. A fallback mechanism can be used
to provide degraded functionality in this case.
- A single component may fail and affect the
health of the entire system, bulkhead pattern must be used to prevent the
entire system from failing.
Design for versioning
We should try to make our services backward compatible; explicit versioning must be used to cater to different versions of the REST endpoints.
Design for asynchronous communication b/w
services
Asynchronous
communication should be preferred over synchronous communication in inter
microservice communication. One of the biggest advantages of using asynchronous
messaging is that the service does not block while waiting for a response from
another service.
Design for eventual consistency
Eventual consistency
is a consistency model used in distributed computing to achieve high
availability that informally guarantees that, if no new updates are made to a
given data item, eventually all accesses to that item will return the last
updated value.
Design for idempotent operations
Since networks are brittle, we should always design our services to accept repeated calls without any side effects. We can add a unique identifier to each request so that a service can ignore a duplicate request sent over the network due to network failure/retry logic.
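A minimal sketch of an idempotent endpoint keyed by a client-supplied request id (the header name and the in-memory store are assumptions; a real service would persist the processed ids):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class PaymentController {

    // Remembers results of already-processed requests; in-memory only for illustration.
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    @PostMapping("/payments")
    public ResponseEntity<String> pay(@RequestHeader("Idempotency-Key") String requestId,
                                      @RequestBody String paymentRequest) {
        // A retry with the same key returns the original result instead of charging twice
        String receipt = processed.computeIfAbsent(requestId, id -> chargeCard(paymentRequest));
        return ResponseEntity.ok(receipt);
    }

    private String chargeCard(String paymentRequest) {
        return "receipt-for-" + paymentRequest;   // placeholder for the actual side effect
    }
}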
Share as little as possible
In monolithic
applications, sharing is considered to be a best practice but that's not the
case with Microservices. Sharing results in a violation of Bounded Context
Principle, so we shall refrain from creating any single unified shared model
that works across microservices. For example, if different services need a
common Customer model, then we should create one for each microservice with
just the required fields for a given bounded context rather than creating a big
model class that is shared in all services. The more dependencies we have
between services, the harder it is to isolate the service changes, making it
difficult to make a change in a single service without affecting other
services. Also, creating a unified model that works in all services brings
complexity and ambiguity to the model itself, making it hard for anyone to
understand the model.
In a way, we want to violate the DRY principle in microservices architecture when it comes to domain models.
How will you implement caching for
microservices?
Caching is a
technique of performance improvement for getting query results from a service.
It helps minimize the calls to network, database, etc. We can use caching at
multiple levels in microservices architecture -
- Server-Side Caching - Distributed caching software like Redis/Memcache/etc. is used to cache the results of business operations. The cache is distributed, so all instances of a microservice see the same values from the shared cache. This type of caching is opaque to clients (a sketch follows this list).
- Gateway
Cache - central API gateway
can cache the query results as per business needs and provide improved
performance. This way we can achieve caching for multiple services at one
place. Distributed caching software like Redis or Memcache can be used in
this case.
- Client-Side Caching - We can set cache headers in the HTTP response and allow clients to cache the results for a pre-defined time. This will drastically reduce the load on servers since the client will not make repeated calls to the same resource. Servers can inform the clients when information changes, so changes in the query result can also be handled. E-Tags can be used for cache validation. If the end client is a microservice itself, then Spring Cache support can be used to cache the results locally.
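For the server-side case mentioned above, a minimal Spring Cache sketch might look like this (the cache name, method and URL are illustrative; the backing store could be Redis or a simple in-memory cache, and @EnableCaching must be present on a configuration class):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class ProductCatalogService {

    @Autowired
    private RestTemplate restTemplate;

    // The first call hits the downstream service; subsequent calls with the same id
    // are served from the "products" cache until the entry is evicted.
    @Cacheable("products")
    public String getProduct(String productId) {
        return restTemplate.getForObject("http://product-service/products/" + productId, String.class);
    }
}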
What is a good tool for documenting
Microservices?
Swagger is a very good open-source tool for documenting the APIs provided by microservices. It provides very easy-to-use interactive documentation.
By using swagger annotations on REST endpoints, API documentation can be auto-generated and exposed over a web interface. Internal and external teams can use the web interface to see the list of APIs, their inputs & error codes. They can even invoke the endpoints directly from the web interface to get the results.
Swagger UI is a very
powerful tool for your microservices consumers to help them understand the set
of endpoints provided by a given microservice.
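Assuming the springfox-swagger2 and springfox-swagger-ui dependencies (an assumption; other Swagger integrations exist), a minimal setup might look like the following, with the base package being illustrative:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import springfox.documentation.builders.PathSelectors;
import springfox.documentation.builders.RequestHandlerSelectors;
import springfox.documentation.spi.DocumentationType;
import springfox.documentation.spring.web.plugins.Docket;
import springfox.documentation.swagger2.annotations.EnableSwagger2;

// Exposes auto-generated API docs (and the interactive Swagger UI) for this microservice.
@Configuration
@EnableSwagger2
public class SwaggerConfig {

    @Bean
    public Docket api() {
        return new Docket(DocumentationType.SWAGGER_2)
                .select()
                .apis(RequestHandlerSelectors.basePackage("com.example.orders"))   // assumed package
                .paths(PathSelectors.any())
                .build();
    }
}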
What are the tools and libraries available for
testing microservices?
Important Tools and
Libraries for testing Spring-based Microservices are -
JUnit
the standard test runner
TestNG
the next generation test runner
Hamcrest
declarative matchers and assertions
Rest-assured
for writing REST API driven end-to-end tests
Mockito
for mocking dependencies
Wiremock
for stubbing third-party services
Hoverfly
creates API simulations for end-to-end tests
Spring Test and Spring Boot Test
for writing Spring integration tests - includes MockMVC, TestRestTemplate, WebClient-like features
JSONassert
an assertion library for JSON
Pact
the Pact family of frameworks provides support for Consumer Driven Contracts testing
Selenium
Selenium automates browsers; it is used for end-to-end automated UI testing
Gradle
Gradle helps build, automate and deliver software, faster
IntelliJ IDEA
IDE for Java development
Using spring-boot-starter-test
We can just add the below dependency in the project's build.gradle:
testCompile('org.springframework.boot:spring-boot-starter-test')
This starter will import two Spring Boot test modules - spring-boot-test & spring-boot-test-autoconfigure - as well as JUnit, AssertJ, Hamcrest, Mockito, JSONassert, Spring Test, Spring Boot Test and a number of other useful libraries.
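A minimal integration test using this starter might look like the following (the class name and endpoint are illustrative):

import static org.assertj.core.api.Assertions.assertThat;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.test.context.junit4.SpringRunner;

// Boots the application on a random port and exercises a real HTTP endpoint.
@RunWith(SpringRunner.class)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class ProductApiIntegrationTest {

    @Autowired
    private TestRestTemplate restTemplate;   // pre-configured to target the random port

    @Test
    public void productEndpointResponds() {
        String body = restTemplate.getForObject("/api/v1/products/42", String.class);   // assumed endpoint
        assertThat(body).isNotNull();
    }
}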
What is the difference between Orchestration
and Choreography in microservices context?
In Orchestration, we
rely on a central system to control and call other Microservices in a certain
fashion to complete a given task. The central system maintains the state of
each step and sequence of the overall workflow. In Choreography, each Microservice
works like a State Machine and reacts based on the input from other parts. Each
service knows how to react to different events from other systems. There is no
central command in this case.
Orchestration is a tightly coupled approach and is considered an anti-pattern in a microservices architecture, whereas Choreography's loosely coupled approach should be adopted wherever possible.
Example
Let's say we want to develop a microservice that will send product recommendation emails in a fictitious e-shop. In order to send recommendations, we need access to the user's order history, which lies in a different microservice.
In the Orchestration approach, this new recommendation microservice will make synchronous calls to the order service and fetch the relevant data, and then calculate the recommendations based on the user's past purchases. Doing this for a million users will become cumbersome and will tightly couple the two microservices.
In the Choreography approach, we will use event-based asynchronous communication, where whenever a user makes a purchase, an event will be published by the order service. The recommendation service will listen to this event and start building the user's recommendations. This is a loosely coupled approach and highly scalable. The event, in this case, does not describe the action, but just the data.
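A sketch of the choreography side using Spring AMQP (the queue name and event payload are assumptions):

import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Service;

// The recommendation service reacts to purchase events published by the order service;
// the two services never call each other directly.
@Service
public class OrderPlacedListener {

    @RabbitListener(queues = "recommendation.order-placed")   // assumed queue name
    public void onOrderPlaced(String orderPlacedEvent) {
        // The event carries only data (user id, purchased items); this service decides what to do with it.
        updateRecommendationsFor(orderPlacedEvent);
    }

    private void updateRecommendationsFor(String orderPlacedEvent) {
        // recompute recommendations for the user referenced in the event
    }
}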
How frequently should a microservice be released into production?
There is no right answer to this question; there could be a release every ten minutes, every hour or once a week. It all depends on the extent of automation you have at the different levels of the software development lifecycle - build automation, test automation, deployment automation and monitoring. And, of course, on the business requirements - how small and low-risk the changes are that you are making in a single release.
In an ideal world
where boundaries of each microservices are clearly defined (bounded context),
and a given service is not affecting other microservices, you can easily
achieve multiple deployments a day without major complexity.
Examples of deployment/release frequency
- Amazon is on record as making changes to
production every 11.6 seconds on average in May of 2011.
- Github is well known for its aggressive
engineering practices, deploying code into production on an average 60
times a day.
- Facebook releases to production twice a day.
- Many Google services see releases multiple
times a week, and almost everything in Google is developed on mainline.
- Etsy Deploys More Than 50 Times a Day.
What are Cloud-Native applications?
Cloud-Native Applications (CNA) is a style of application development that encourages easy adoption of best practices in the area of continuous delivery and distributed software development. These applications are designed specifically for a cloud computing architecture (AWS, Azure, Cloud Foundry, etc).
DevOps, continuous
delivery, microservices, and containers are the key concepts in developing
cloud-native applications.
Spring Boot, Spring
Cloud, Docker, Jenkins, Git are a few tools that can help you write
Cloud-Native Application without much effort.
Microservices
It is an
architectural approach for developing a distributed system as a collection of
small services. Each service is responsible for a specific business capability,
runs in its own process and communicates via HTTP REST API or messaging (AMQP).
DevOps
It is collaboration
between software developers and IT operations with a goal of constantly
delivering high-quality software as per customer needs.
Continuous Delivery
It is all about automated delivery of low-risk small changes to production, constantly. This makes it possible to collect feedback faster.
Containers
Containers (e.g. Docker) offer logical isolation to each microservice, thereby eliminating the problem of "it runs on my machine" forever. They are much faster and more efficient compared to virtual machines.
How will you develop microservices using Java?
Spring Boot along with Spring Cloud is a very good option to start building microservices using the Java language. There are a lot of modules available in Spring Cloud that provide boilerplate code for different design patterns of microservices, so Spring Cloud can really speed up the development process. Also, Spring Boot provides out-of-the-box support for embedding a servlet container (tomcat/jetty/undertow) inside an executable jar (uber jar), so that these jars can be run directly from the command line, eliminating the need to deploy war files into a servlet container.
You can also use
Docker container to ship and deploy the entire executable package onto a cloud
environment. Docker can also help eliminate "works on my machine"
problem by providing logical separation for the runtime environment during the
development phase. That way you can gain portability across on-premises and
cloud environment.
How to achieve zero-downtime during the
deployments?
As the name
suggests, zero-downtime deployments do not bring outage in a production
environment. It is a clever way of deploying your changes to production, where
at any given point in time, at least one service will remain available to
customers.
Blue-green deployment
One way of achieving
this is blue/green deployment. In this approach, two versions of a single
microservice are deployed at a time. But only one version is taking real
requests. Once the newer version is tested to the required satisfaction level,
you can switch from older version to newer version.
You can run a
smoke-test suite to verify that the functionality is running correctly in the
newly deployed version. Based on the results of smoke-test, newer version can
be released to become the live version.
Changes required in client code to handle
zero-downtime
Let's say you have two instances of a service running at the same time, and both are registered in the Eureka registry. Further, both instances are deployed using two distinct hostnames:
/src/main/resources/application.yml
spring.application.name: ticketBooks-service
---
spring.profiles: blue
eureka.instance.hostname: ticketBooks-service-blue.example.com
---
spring.profiles: green
eureka.instance.hostname: ticketBooks-service-green.example.com
Now the client app that needs to make API calls to the books-service may look like below:
@RestController
@SpringBootApplication
@EnableDiscoveryClient
public class ClientApp {

    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    @RequestMapping("/hit-some-api")
    public Object hitSomeApi() {
        return restTemplate().getForObject("https://ticketBooks-service/some-uri", Object.class);
    }
}
Now, when ticketBooks-service-green.example.com goes down for an upgrade, it gracefully shuts down and deletes its entry from the Eureka registry. But these changes will not be reflected in the ClientApp until it fetches the registry again (which happens every 30 seconds). So for up to 30 seconds, the ClientApp's @LoadBalanced RestTemplate may send requests to ticketBooks-service-green.example.com even though it is down.
To fix this, we can
use Spring Retry support in Ribbon client-side load balancer. To enable Spring
Retry, we need to follow the below steps:
Add spring-retry to
build.gradle dependencies
compile("org.springframework.boot:spring-boot-starter-aop")
compile("org.springframework.retry:spring-retry")
Now enable
spring-retry mechanism in ClientApp using @EnableRetry annotation, as shown
below:
@EnableRetry
@RestController
@SpringBootApplication
@EnableDiscoveryClient
public class ClientApp {
    ...
}
Once this is done, Ribbon will automatically configure itself to use retry logic, and any failed request to ticketBooks-service-green.example.com will be retried against the next available instance (in round-robin fashion) by Ribbon. You can customize this behaviour using the below properties:
/src/main/resources/application.yml
ribbon:
  MaxAutoRetries: 5
  MaxAutoRetriesNextServer: 5
  OkToRetryOnAllOperations: true
  OkToRetryOnAllErrors: true
How to achieve zero-downtime deployment
(blue/green) when there is a database change?
The deployment scenario becomes complex when there are database changes during the upgrade. There can be two different scenarios:
1. The database change is backward compatible (e.g. adding a new table column).
2. The database change is not compatible with an older version of the application (e.g. renaming an existing table column).
- Backward
compatible change:
This scenario is easy to implement and can be fully automated using
Flyway. We can add the script to create a new column and the script will
be executed at the time of deployment. Now during blue/green deployment,
two versions of the application (say v1 and v2) will be connected to the
same database. We need to make sure that the newly added columns allow
null values (btw that’s part of the backward compatible change). If
everything goes well, then we can switch off the older version v1, else
application v2 can be taken off.
- Non-compatible
database change:
This is a tricky scenario, and may require manual intervention in-case of
rollback. Let's say we want to rename first_name column to fname in the
database. Instead of directly renaming, we can create a new column fname
and copy all existing values of first_name into fname column, keeping the
first_name column as it is in the database. We can defer non-null checks
on fname to post-deployment success. If the deployment goes successful, we
need to migrate data written to first_name by v1 to the new column (fname)
manually after bringing down v1. If the deployment fails for v2, then we need to do the reverse.
Complexity may be
much more in a realistic production app, such discussions are beyond the scope
of this book.
How to maintain ACID in microservice
architecture?
ACID is an acronym
for four primary attributes namely atomicity, consistency, isolation, and
durability ensured by the database transaction manager.
Atomicity
In a transaction
involving two or more entities, either all of the records are committed or none
are.
Consistency
A database
transaction must change affected data only in allowed ways following specific
rules including constraints/triggers etc.
Isolation
Any transaction in
progress (not yet committed) must remain isolated from any other
transaction.
Durability
Committed records
are saved by a database such that even in case of a failure or database
restart, the data is available in its correct state.
In a distributed
system involving multiple databases, we have two options to achieve ACID
compliance:
- One way to achieve ACID compliance is to use a
two-phase commit (a.k.a 2PC), which ensures that all involved services
must commit to transaction completion or all the transactions are rolled
back.
- Use eventual consistency, where multiple databases owned by different microservices become eventually consistent through asynchronous messaging. Eventual consistency is a specific form of weak consistency.
2 Phase Commit
should ideally be discouraged in microservices architecture due to its fragile
and complex nature. We can achieve some level of ACID compliance in distributed
systems through eventual consistency and that should be the right approach to
do it.