Tuesday, December 31, 2019

MSA Data Mgmt patterns -> Event sourcing


Pattern: Event sourcing
Event sourcing is a great way to atomically update state and publish events. The traditional way to persist an entity is to save its current state. Event sourcing uses a radically different, event-centric approach to persistence. A business object is persisted by storing a sequence of state-changing events. Whenever an object’s state changes, a new event is appended to the sequence of events. Because appending an event is a single operation, it is inherently atomic. An entity’s current state is reconstructed by replaying its events.
To see how event sourcing works, consider the Order entity. Traditionally, each order maps to a row in an ORDER table along with rows in another table like the ORDER_LINE_ITEM table. But when using event sourcing, the Order Service stores an Order by persisting its state changing events: Created, Approved, Shipped, Cancelled. Each event would contain sufficient data to reconstruct the Order’s state.

Events are persisted in an event store. Not only does the event store act as a database of events, it also behaves like a message broker. It provides an API that enables services to subscribe to events. Each event that is persisted in the event store is delivered by the event store to all interested subscribers. The event store is the backbone of an event-driven microservices architecture.
In this architecture, requests to update an entity (either an external HTTP request or an event published by another service) are handled by retrieving the entity’s events from the event store, reconstructing the current state of the entity, updating the entity, and saving the new events.
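For example, the Order Service handles a request to update an Order as follows: it loads the Order’s events from the event store, replays them to rebuild the current state, applies the update, and appends the resulting events to the event store.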
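A minimal Java sketch of that load, replay, update, append cycle may help. The class and method names (EventSourcedOrder, replay, approve) are illustrative assumptions, events are reduced to a bare enum, and the event store itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event types for the Order entity; the names follow the text.
enum OrderEventType { CREATED, APPROVED, SHIPPED, CANCELLED }

// Rebuilds an Order's current state by replaying its stored events in order.
final class EventSourcedOrder {
    private String status = "NONE";
    private final List<OrderEventType> newEvents = new ArrayList<>();

    // Reconstruct state from the event history loaded from the event store.
    static EventSourcedOrder replay(List<OrderEventType> history) {
        EventSourcedOrder order = new EventSourcedOrder();
        for (OrderEventType event : history) {
            order.apply(event);
        }
        return order;
    }

    // Handle an update request: validate against current state, then record a new event.
    void approve() {
        if (!status.equals("CREATED")) {
            throw new IllegalStateException("Only a created order can be approved");
        }
        record(OrderEventType.APPROVED);
    }

    private void record(OrderEventType event) {
        apply(event);
        newEvents.add(event);
    }

    // In this simplified model the status is just the name of the last event.
    private void apply(OrderEventType event) {
        status = event.name();
    }

    String status() { return status; }

    // Events produced by this request; the caller would append them to the event store.
    List<OrderEventType> newEvents() { return List.copyOf(newEvents); }
}
```

A caller would replay the stored history, invoke approve(), and then append newEvents() to the store in one operation.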


Other benefits of event sourcing

As you can see, event sourcing addresses a challenge of implementing an event-driven architecture. Additional significant benefits of persisting business events include the following:
·         100% accurate audit logging - Auditing functionality is often added as an afterthought, resulting in an inherent risk of incompleteness. With event sourcing, each state change corresponds to one or more events, providing 100% accurate audit logging.
·         Easy temporal queries - Because event sourcing maintains the complete history of each business object, implementing temporal queries and reconstructing the historical state of an entity is straightforward.

Event sourcing also has several drawbacks:
  • It is a different and unfamiliar style of programming, so there is a learning curve.
  • The event store is difficult to query, since typical queries must reconstruct the state of the business entities, which is likely to be complex and inefficient. As a result, the application must use Command Query Responsibility Segregation (CQRS) to implement queries, which in turn means that applications must handle eventually consistent data.

Related patterns
  • The Saga and Domain event patterns create the need for this pattern.
  • CQRS must often be used with event sourcing.
  • Event sourcing implements the Audit logging pattern.




MSA Data Mgmt patterns -> Domain event


Domain event

Context
A service often needs to publish events when it updates its data. These events might be needed, for example, to update a CQRS view. Alternatively, the service might participate in a choreography-based saga, which uses events for coordination.

Problem
How does a service publish an event when it updates its data?

Solution
Organize the business logic of a service as a collection of DDD aggregates that emit domain events when they are created or updated. The service publishes these domain events so that they can be consumed by other services.
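As a rough illustration, an aggregate can collect its domain events as it changes and let the service drain and publish them after saving. The names below (OrderAggregate, OrderCreatedEvent, pullEvents) are assumptions made for this sketch, not part of any framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain event carrying the data that consumers need.
record OrderCreatedEvent(String orderId, long totalCents) {}

// Aggregate that records a domain event whenever it is created.
final class OrderAggregate {
    private final String id;
    private final List<Object> pendingEvents = new ArrayList<>();

    private OrderAggregate(String id) {
        this.id = id;
    }

    // Factory method: creating the aggregate also records the matching event.
    static OrderAggregate create(String id, long totalCents) {
        OrderAggregate order = new OrderAggregate(id);
        order.pendingEvents.add(new OrderCreatedEvent(id, totalCents));
        return order;
    }

    String id() { return id; }

    // The service drains these after saving the aggregate and hands them to a publisher.
    List<Object> pullEvents() {
        List<Object> drained = List.copyOf(pendingEvents);
        pendingEvents.clear();
        return drained;
    }
}
```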

Related patterns
·         The Saga and CQRS patterns create the need for this pattern
·         The Aggregate pattern is used to structure the business logic
·         The Transactional outbox pattern is used to publish events as part of a database transaction
·         Event sourcing is sometimes used to publish domain events

MSA Data mgmt patterns -> Command Query Responsibility Segregation (CQRS)

Problem

Once we implement database-per-service, queries that need to join data from multiple services can no longer be run against a single database. How, then, do we implement queries in a microservice architecture?

 Solution

 CQRS suggests splitting the application into two parts — the command side and the query side.

  • The command side handles the Create, Update, and Delete requests.
  • The query side handles the query part by using the materialized views.
The event sourcing pattern is generally used along with it to create events for any data change. Materialized views are kept updated by subscribing to the stream of events.
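The query side can be pictured as a small projection that folds events into a precomputed view. The sketch below assumes a hypothetical OrderPlaced event and keeps the view in memory; a real read model would live in its own datastore.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event published by the command side.
record OrderPlaced(String customerId, long totalCents) {}

// Query-side materialized view: total spend per customer, updated from events.
final class CustomerSpendView {
    private final Map<String, Long> totalByCustomer = new HashMap<>();

    // Called by the event subscription for each event, in arrival order.
    void on(OrderPlaced event) {
        totalByCustomer.merge(event.customerId(), event.totalCents(), Long::sum);
    }

    // Queries read the precomputed view and never touch the command-side store.
    long totalFor(String customerId) {
        return totalByCustomer.getOrDefault(customerId, 0L);
    }
}
```

Because the view is updated only when events arrive, a query issued right after a command may briefly see stale data, which is the eventual consistency trade-off of CQRS.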

MSA Data mgmt patterns - API Composition


API Composition

You have applied the Microservices architecture pattern and the Database per service pattern. As a result, it is no longer straightforward to implement queries that join data from multiple services.
This pattern is a direct solution to the problem of implementing complex queries in a microservices architecture.
In this pattern, an API Composer invokes other microservices in the required order. And after fetching the results it performs an in-memory join of the data before providing it to the consumer.
As evident, the downside to this pattern is the use of inefficient in-memory joins on potentially large datasets.

Problem

How to implement queries in a microservice architecture?

Solution

Implement a query by defining an API Composer, which invokes the services that own the data and performs an in-memory join of the results.
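A minimal sketch of such a composer, assuming two hypothetical service clients modelled as plain Supplier functions (in practice these would be HTTP or RPC calls). The record and class names are illustrative only.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

record CustomerRow(String id, String name) {}
record OrderRow(String id, String customerId) {}
record CustomerWithOrders(CustomerRow customer, List<OrderRow> orders) {}

// API composer: calls each owning service, then joins the results in memory.
final class CustomerOrdersComposer {
    private final Supplier<List<CustomerRow>> customerService;
    private final Supplier<List<OrderRow>> orderService;

    CustomerOrdersComposer(Supplier<List<CustomerRow>> customerService,
                           Supplier<List<OrderRow>> orderService) {
        this.customerService = customerService;
        this.orderService = orderService;
    }

    List<CustomerWithOrders> fetchAll() {
        List<CustomerRow> customers = customerService.get();
        // Index orders by customer id so the join is a lookup per customer.
        Map<String, List<OrderRow>> ordersByCustomer = orderService.get().stream()
                .collect(Collectors.groupingBy(OrderRow::customerId));
        return customers.stream()
                .map(c -> new CustomerWithOrders(c, ordersByCustomer.getOrDefault(c.id(), List.of())))
                .collect(Collectors.toList());
    }
}
```

Both result sets are held in memory at once, which is exactly where the pattern becomes expensive for large datasets.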

Example

An API Gateway often does API composition.

This pattern has the following benefits:
  • It is a simple way to query data in a microservice architecture
This pattern has the following drawbacks:
  • Some queries would result in inefficient, in-memory joins of large datasets.

MSA Data mgmt patterns - Saga Pattern

Problem

When each service has its own database and a business transaction spans multiple services, how do we ensure data consistency across services? For example, for an e-commerce application where customers have a credit limit, the application must ensure that a new order will not exceed the customer’s credit limit. Since Orders and Customers are in different databases, the application cannot simply use a local ACID transaction.

Solution

The Saga pattern is the solution to implementing business transactions spanning multiple microservices.

Saga is basically a sequence of local transactions. For every transaction performed within a Saga, the service performing the transaction publishes an event. The subsequent transaction is triggered based on the output of the previous transaction. And if one of the transactions in this chain fails, the Saga executes a series of compensating transactions to undo the impact of all the previous transactions. It can be implemented in two ways:

  1. Choreography — When there is no central coordination, each service produces and listens to another service’s events and decides if an action should be taken or not.
  2. Orchestration — An orchestrator (object) takes responsibility for a saga’s decision making and sequencing business logic.

The Saga pattern is one of the microservice design patterns that allow you to manage such transactions using a sequence of local transactions. Each transaction is followed by an event that triggers the next transaction step.

If one transaction fails, the saga pattern triggers a rollback transaction compensating for the failure.
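The orchestration variant can be sketched in a few lines of Java. This is a simplified, in-process illustration: SagaStep and SagaOrchestrator are hypothetical names, each step is a Runnable standing in for a local transaction in some service, and a failure is signalled by a RuntimeException.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// One saga step: a local transaction plus the compensating action that undoes it.
record SagaStep(String name, Runnable action, Runnable compensation) {}

final class SagaOrchestrator {
    // Runs steps in order; on failure, compensates completed steps in reverse order.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step);
            } catch (RuntimeException failure) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}
```

A production saga would also persist its progress so that compensation can resume after a crash; that is omitted here.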


Example: Managing multiple eCommerce transactions with a Saga pattern

Here’s an example of an eCommerce application that consists of multiple transactions for orders, payment, inventory, shipping, and notifications. Once an order is generated for a specific product, the next transaction for the payment and the inventory update is initialized.



If the transaction for inventory update fails, for example, due to unavailability of a product, a rollback is triggered. If the transaction for inventory update is successful, further transactions are initialized.

Moreover, Saga transactions don’t need to know about the command or the role of other transactions. This allows developers to build simplified business logic with clear separation of concerns.

This pattern is suggested for applications where ensuring data consistency is critical without tight coupling. Likewise, it’s less suitable for applications with tight coupling.


MSA Data mgmt patterns - Shared Database per Service

Problem

We have said that one database per service is ideal for microservices, but that is practical mainly when the application is greenfield and developed with DDD. If the application is a monolith that is being broken into microservices, denormalization is not that easy. What is a suitable architecture in that case?

Solution

A shared database per service is not ideal, but that is the working solution for the above scenario. Most people consider this an anti-pattern for microservices, but for brownfield applications, this is a good start to break the application into smaller logical pieces. This should not be applied for greenfield applications. In this pattern, one database can be aligned with more than one microservice, but it has to be restricted to 2-3 maximum, otherwise scaling, autonomy, and independence will be challenging to execute.


Shared Database per Service
Use a (single) database that is shared by multiple services. Each service freely accesses data owned by other services using local ACID transactions.

Example
The OrderService and CustomerService freely access each other’s tables. For example, the OrderService can use the following ACID transaction to ensure that a new order will not violate the customer’s credit limit.
BEGIN TRANSACTION
SELECT ORDER_TOTAL
  FROM ORDERS WHERE CUSTOMER_ID = ?
SELECT CREDIT_LIMIT
  FROM CUSTOMERS WHERE CUSTOMER_ID = ?
INSERT INTO ORDERS …
COMMIT TRANSACTION
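Provided the transaction runs at a sufficiently strict isolation level (for example, serializable), the database will guarantee that the credit limit will not be exceeded even when simultaneous transactions attempt to create orders for the same customer.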

The benefits of this pattern are:
·         A developer uses familiar and straightforward ACID transactions to enforce data consistency
·         A single database is simpler to operate.

The drawbacks of this pattern are:
·         Development time coupling - a developer working on, for example, the OrderService will need to coordinate schema changes with the developers of other services that access the same tables. This coupling and additional coordination will slow down development.
·         Runtime coupling - because all services access the same database, they can potentially interfere with one another. For example, if a long-running CustomerService transaction holds a lock on the ORDER table, then the OrderService will be blocked.
·         A single database might not satisfy the data storage and access requirements of all services.

MSA Database Patterns - Database per Service

Problem

There is a problem of how to define database architecture for microservices. Following are the concerns to be addressed:

·         Services must be loosely coupled. They can be developed, deployed, and scaled independently.
·         Business transactions may enforce invariants that span multiple services.
·         Some business transactions need to query data that is owned by multiple services.
·         Databases must sometimes be replicated and shared to scale.
·         Different services have different data storage requirements.

 Solution

To address the above concerns, design one database per microservice. It must be private to that service and accessed only through the microservice’s API; other services cannot access it directly.

There are a few different ways to keep a service’s persistent data private. You do not need to provision a database server for each service. For example, if you are using a relational database then the options are:
·         Private-tables-per-service – each service owns a set of tables that must only be accessed by that service
·         Schema-per-service – each service has a database schema that’s private to that service
·         Database-server-per-service – each service has its own database server.
Private-tables-per-service and schema-per-service have the lowest overhead. Using a schema per service is appealing since it makes ownership clearer. Some high throughput services might need their own database server.

Using a database per service has the following benefits:
·         Helps ensure that the services are loosely coupled. Changes to one service’s database do not impact any other services.
·         Each service can use the type of database that is best suited to its needs. For example, a service that does text searches could use ElasticSearch. A service that manipulates a social graph could use Neo4j.

Using a database per service has the following drawbacks:
·         Implementing business transactions that span multiple services is not straightforward. Distributed transactions are best avoided because of the CAP theorem. Moreover, many modern (NoSQL) databases don’t support them.
·         Implementing queries that join data that is now in multiple databases is challenging.
·         Complexity of managing multiple SQL and NoSQL databases.

There are various patterns/solutions for implementing transactions and queries that span services:
·         Implementing transactions that span services - use the Saga pattern.
·         Implementing queries that span services:
o    API Composition - the application performs the join rather than the database. For example, a service (or the API gateway) could retrieve a customer and their orders by first retrieving the customer from the customer service and then querying the order service to return the customer’s most recent orders.
o    Command Query Responsibility Segregation (CQRS) - maintain one or more materialized views that contain data from multiple services. The views are kept up to date by services that subscribe to events that each service publishes when it updates its data. For example, the online store could implement a query that finds customers in a particular region and their recent orders by maintaining a view that joins customers and orders. The view is updated by a service that subscribes to customer and order events.

It is a good idea to create barriers that enforce this modularity. You could, for example, assign a different database user id to each service and use a database access control mechanism such as grants. Without some kind of barrier to enforce encapsulation, developers will always be tempted to bypass a service’s API and access its data directly.

MSA Decomposition Pattern - Decompose by Subdomain

Problem

Decomposing an application using business capabilities might be a good start, but you will come across so-called "God Classes" which will not be easy to decompose. These classes will be common among multiple services. For example, the Order class will be used in Order Management, Order Taking, Order Delivery, etc. How do we decompose them?

Solution

For the "God Classes" issue, DDD (Domain-Driven Design) comes to the rescue. It uses subdomains and bounded context concepts to solve this problem. DDD breaks the whole domain model created for the enterprise into subdomains. Each subdomain will have a model, and the scope of that model will be called the bounded context. Each microservice will be developed around the bounded context.

Note: Identifying subdomains is not an easy task. It requires an understanding of the business. Like business capabilities, subdomains are identified by analyzing the business and its organizational structure and identifying the different areas of expertise.

The subdomains of an order management application include:
·         Product catalog service
·         Inventory management services
·         Order management services
·         Delivery management services

MSA Decomposition Pattern: Decompose by business capability


Pattern: Decompose by business capability

Problem

Microservices is all about making services loosely coupled, applying the single responsibility principle. However, breaking an application into smaller pieces has to be done logically. How do we decompose an application into small services?

Solution

One strategy is to decompose by business capability. A business capability is something that a business does in order to generate value. The set of capabilities for a given business depends on the type of business. For example, the capabilities of an insurance company typically include sales, marketing, underwriting, claims processing, billing, compliance, etc. Each business capability can be thought of as a service, except it’s business-oriented rather than technical.

 e.g.

  • Order Management is responsible for orders.
  • Customer Management is responsible for customers.

Microservice Architecture Design patterns


What are Microservices?
Microservices is an architectural style that structures an application as a collection of small autonomous services, modeled around a business domain. In a Microservice Architecture, each service is self-contained and implements a single business capability.
Principles Used to Design Microservice Architecture
The principles used to design Microservices are as follows:
1.    Independent & Autonomous Services
2.    Scalability
3.    Decentralization
4.    Resilient Services
5.    Real-Time Load Balancing
6.    Availability
7.    Continuous delivery through DevOps Integration
8.    Seamless API Integration and Continuous Monitoring
9.    Isolation from Failures
10. Auto-Provisioning







Application architecture patterns
  • Monolithic architecture -  architect an application as a single deployable unit
  • Microservice architecture - architect an application as a collection of independently deployable, loosely coupled services

Decomposition
  • Decompose by business capability - define services based on business capabilities
  • Decompose by subdomain - define services based on DDD subdomains
  • Self-contained Service - design services to handle synchronous requests without waiting for other services to respond
  • Service per team

Refactoring to microservices

Data management
  • Database per Service - each service has its own private database, which prevents tight coupling
  • Shared database - services share a database; a temporary solution during migration
  • Saga - use sagas, which are sequences of local transactions, to maintain data consistency across services
  • API Composition - implement queries by invoking the services that own the data and performing an in-memory join
  • CQRS - implement queries by maintaining one or more materialized views that can be efficiently queried
  • Domain event - publish an event whenever data changes
  • Event sourcing - persist aggregates as a sequence of events
Transactional messaging
  • Transactional outbox
  • Transaction log tailing
  • Polling publisher
Testing
  • Service Component Test - a test suite that tests a service in isolation using test doubles for any services that it invokes
  • Consumer-driven contract test - a test suite for a service that is written by the developers of another service that consumes it
  • Consumer-side contract test - a test suite for a service client (e.g. another service) that verifies that it can communicate with the service
Deployment patterns
  • Multiple service instances per host - deploy multiple service instances on a single host
  • Service instance per host - deploy each service instance in its own host
  • Service instance per VM - deploy each service instance in its own VM
  • Service instance per Container - deploy each service instance in its own container
  • Serverless deployment - deploy a service using serverless deployment platform
  • Service deployment platform - deploy services using a highly automated deployment platform that provides a service abstraction
Cross cutting concerns
  • Microservice chassis - a framework that handles cross-cutting concerns and simplifies the development of services
  • Externalized configuration - externalize all configuration such as database location and credentials
Communication style
  • Remote Procedure Invocation - use an RPI-based protocol for inter-service communication
  • Messaging - use asynchronous messaging for inter-service communication
  • Domain-specific protocol  - use a domain-specific protocol
External API
  • API gateway - a single entry point that routes requests to microservices
  • Backend for front-end - a separate API gateway for each kind of client (for example, different gateways for the web app, mobile app, and admin app), which keeps client-specific logic out of the front ends. For example, a mobile client gets a compact JSON response tailored for mobile.
Service discovery
  • Client-side discovery - client queries a service registry to discover the locations of service instances
  • Server-side discovery - router queries a service registry to discover the locations of service instances
  • Service registry - a database of service instance locations
  • Self registration - service instance registers itself with the service registry
  • 3rd party registration - a 3rd party registers a service instance with the service registry
Reliability
  • Circuit Breaker - temporarily stops calling a failing service, which prevents cascading failures. Tools: Resilience4j, Netflix Hystrix (legacy)
Security
  • Access Token - a token that securely stores information about the user and is exchanged between services
Observability
  • Log aggregation - aggregate application logs. Tools: ELK (Elasticsearch + Logstash + Kibana)
  • Application metrics - instrument a service’s code to gather statistics about operations. Tools: Prometheus, Grafana, Micrometer (Java)
  • Audit logging - record user activity in a database
  • Distributed tracing - track API calls across services. Tools: Zipkin, OpenTelemetry
  • Exception tracking - report all exceptions to a centralized exception tracking service that aggregates and tracks exceptions and notifies developers.
  • Health check API - service API (e.g. HTTP endpoint) that returns the health of the service and is intended to be pinged, for example, by a monitoring service
UI patterns
  • Server-side page fragment composition - build a webpage on the server by composing HTML fragments generated by multiple, business capability/subdomain-specific web applications
  • Client-side UI composition - Build a UI on the client by composing UI fragments rendered by multiple, business capability/subdomain-specific UI components

API Gateway Pattern

How to explain in interview

“API Gateway is a single-entry point for all client requests. It handles routing, authentication, rate-limiting, and API aggregation before forwarding to backend microservices.”

Problems it solves

  • Removes client-to-multiple-service complexity
  • Centralized auth
  • Prevents exposing internal microservices
  • Supports canary release & throttling

Use case

Mobile app + web app → need lightweight, optimized API responses.

Java tools

  • Spring Cloud Gateway
  • Netflix Zuul
  • Kong / Nginx

Follow-up question

If the API Gateway goes down, the whole system fails. How do you avoid that?
Answer: Run gateway in active-active mode behind a load balancer, with autoscaling & distributed config.


Backend For Frontend (BFF)

Interview explanation

“BFF creates separate gateways for mobile, web, and partner applications so that each frontend gets a tailored response.”

When needed

  • Mobile app needs fewer fields
  • Web UI needs detailed data

Tools

  • Spring Boot BFFs
  • GraphQL BFF (used by Netflix)

Follow-up

Difference between API Gateway and BFF?
Answer: BFF is client-specific; Gateway is system-wide.


Saga Pattern – (MOST IMPORTANT for architect interviews)

Interview explanation

“Saga handles distributed transactions across microservices using either choreography (events) or orchestration (central controller). It replaces 2PC in microservices.”

Why needed

  • DB per service means no cross-service transactions
  • Ensures consistency using compensating actions

Example

Order → Payment → Inventory
If inventory fails → compensate Payment → compensate Order.

Implementations

  • Choreography: Kafka events
  • Orchestration: Camunda / Temporal / Axon

Follow-up

When to use choreography?
Lightweight, fewer services, event-driven.

When to use orchestration?
Complex workflows, error handling, timeouts.


CQRS (Command Query Responsibility Segregation)

Interview explanation

“CQRS separates write (commands) and read (queries) models to improve performance and scalability, and to optimize read-heavy systems.”

Benefits

  • Read DB can be NoSQL & denormalized
  • Commands enforce validations
  • Event sourcing integrates naturally

Use case

  • Banking
  • Wallet applications
  • Order tracking dashboards

Follow-up

Downsides of CQRS?
More complexity, eventual consistency, duplicate databases.


Event Sourcing

Interview explanation

“Instead of saving final state, we store all events. System state can be rebuilt by replaying events.”

Benefits

  • Perfect audit history
  • Rollback to past states
  • High write performance

Use case

  • Ledger systems
  • Audit-driven domains
  • Payment & transactions

Java tools

  • Axon
  • EventStoreDB

Follow-up

How do you keep a growing event store manageable?
Snapshot after a fixed number of events, so that loading an entity replays only the events recorded since the latest snapshot.
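A small sketch of snapshotting, using a hypothetical account whose events are signed amounts. Everything is kept in memory for illustration, and the class name and fields are assumptions; a real system would store snapshots alongside the event stream.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshottingAccount {
    private final List<Long> events = new ArrayList<>(); // deposits (+) and withdrawals (-)
    private final int snapshotEvery;
    private long snapshotBalance = 0;
    private int snapshotVersion = 0; // number of events already folded into the snapshot

    SnapshottingAccount(int snapshotEvery) {
        this.snapshotEvery = snapshotEvery;
    }

    // Append an event and take a snapshot once enough events have accumulated.
    void append(long amount) {
        events.add(amount);
        if (events.size() - snapshotVersion >= snapshotEvery) {
            snapshotBalance = balance();
            snapshotVersion = events.size();
        }
    }

    // Starts from the latest snapshot and replays only the events after it.
    long balance() {
        long balance = snapshotBalance;
        for (int i = snapshotVersion; i < events.size(); i++) {
            balance += events.get(i);
        }
        return balance;
    }

    // How many events a load has to replay on top of the snapshot.
    int eventsReplayedOnLoad() {
        return events.size() - snapshotVersion;
    }
}
```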


Strangler Fig Pattern (Migration Pattern)

Interview explanation

“We gradually replace a monolith by routing specific functionality to a new microservice until the monolith is fully replaced.”

When needed

  • Legacy → Microservices migration
  • Zero downtime rollout

Example

Move inventory functionality from monolith → microservice → redirect traffic via gateway.
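The routing decision at the gateway can be sketched as a prefix table that grows as functionality is migrated. StranglerRouter and the backend labels are hypothetical names for this example; real gateways express the same idea in route configuration.

```java
import java.util.Map;

// Routes migrated path prefixes to new services; everything else stays on the monolith.
final class StranglerRouter {
    private final Map<String, String> migratedPrefixes;

    StranglerRouter(Map<String, String> migratedPrefixes) {
        this.migratedPrefixes = Map.copyOf(migratedPrefixes);
    }

    // Assumes prefixes do not overlap, so the first match is the only match.
    String route(String path) {
        for (Map.Entry<String, String> entry : migratedPrefixes.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "monolith";
    }
}
```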

Follow-up

How do you ensure backward compatibility?
Versioned APIs and backward-compatible schema changes.


Database per Service Pattern

Interview explanation

“Each microservice owns its database. This decouples services, enables independent scaling, and avoids distributed locking.”

Challenges solved

  • No shared schema
  • No runtime coupling
  • Independent release cycles

Hard question

How do you maintain data integrity without foreign keys?
Using Saga, events, or domain-driven consistency guarantees.


Outbox Pattern

Architect-level explanation

“Outbox ensures reliable event publication using a single local transaction: write to DB + write event into outbox table.”

Then a background job publishes events to Kafka.

Why needed

  • Prevents lost events during crashes
  • Ensures atomicity without 2PC

Follow-up

How do you implement Outbox efficiently?
With Debezium CDC + Kafka; avoids custom polling.
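To make the mechanics concrete, here is an in-memory stand-in for the two tables and the relay. It is only a sketch: OutboxStore is a hypothetical class, a synchronized method plays the role of the local database transaction, and the broker is any Consumer of strings rather than a real Kafka producer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// In-memory stand-in for a database with an orders table and an outbox table.
final class OutboxStore {
    private final List<String> orders = new ArrayList<>();
    private final Deque<String> outbox = new ArrayDeque<>();

    // Stand-in for one local transaction: the business row and the event row are written together.
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.addLast("OrderCreated:" + orderId);
    }

    // Relay: publish pending outbox rows, removing each only after a successful send.
    synchronized int relay(Consumer<String> broker) {
        int sent = 0;
        while (!outbox.isEmpty()) {
            broker.accept(outbox.peekFirst()); // may throw; the row then stays for the next run
            outbox.removeFirst();
            sent++;
        }
        return sent;
    }
}
```

Because a row is removed only after the send succeeds, a crash between the two steps can publish the same event twice, so consumers should be idempotent.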


Circuit Breaker Pattern

Interview explanation

“Circuit breaker stops calling failing services and gives fallback immediately. This avoids cascading failures.”

Java tools

  • Resilience4j (recommended)
  • Hystrix (deprecated but still asked)

State transitions

  • Closed → Open → Half-Open → Closed
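These state transitions can be sketched as a small state machine. This is a simplified illustration and not the Resilience4j API: SimpleCircuitBreaker, its thresholds, and the explicit nowMillis parameter (used so the behaviour is deterministic) are all assumptions of the example.

```java
import java.util.function.Supplier;

final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // nowMillis is passed in by the caller so the example needs no real clock.
    synchronized <T> T call(Supplier<T> remote, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openMillis) {
                return fallback.get(); // fail fast while the circuit is open
            }
            state = State.HALF_OPEN; // wait time elapsed: allow one trial call
        }
        try {
            T result = remote.get();
            state = State.CLOSED; // success closes the circuit and resets the count
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }

    synchronized State state() { return state; }
}
```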

Follow-up

Difference between retry & circuit breaker?
Retry handles temporary failures; circuit breaker handles persistent failures.

Bulkhead Pattern

Interview explanation

“Bulkhead isolates resources (thread pools, connections) so failure in one service doesn’t affect others.”

Example

If Payment API is slow → only its pool is consumed → Order API still runs.
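One lightweight way to build such a compartment is a semaphore that caps concurrent calls to a single dependency. The sketch below uses java.util.concurrent.Semaphore; the SemaphoreBulkhead class and its reject-immediately policy are assumptions made for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Rejects immediately when the compartment is full instead of queueing callers.
    <T> T call(Supplier<T> downstream, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();
        }
        try {
            return downstream.get();
        } finally {
            permits.release();
        }
    }
}
```

Giving the Payment API its own bulkhead instance means that slow payment calls can exhaust only their own permits, while calls guarded by other instances keep running.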


Retry + Timeout Pattern

Interview explanation

“Retries must be combined with timeouts & backoff to avoid retry storms.”

Java

  • Resilience4j Retry
  • Spring Retry

Follow-up

How many retries?
Based on SLA, usually 2–3 with backoff.
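A bare-bones retry loop with exponential backoff might look like the following. BackoffRetry is a hypothetical helper, not the Resilience4j or Spring Retry API, and it assumes the supplied operation enforces its own timeout (for example, through the HTTP client's request timeout).

```java
import java.util.function.Supplier;

final class BackoffRetry {
    // Retries on RuntimeException, waiting base, 2*base, 4*base, ... between attempts.
    static <T> T call(Supplier<T> operation, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                sleep(baseDelayMillis << (attempt - 1));
            }
        }
        throw last; // all attempts failed: surface the last error
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }
}
```

Adding random jitter to each delay is a common refinement that keeps many clients from retrying in lockstep.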


Sidecar + Service Mesh Pattern

Interview explanation

“Service mesh offloads observability, security, and traffic control to sidecars instead of writing logic inside services.”

Features

  • mTLS
  • Retries
  • Circuit breaking
  • Canary & blue-green rollout

Tools

  • Istio / Linkerd

Follow-up

How is service mesh different from API gateway?
Gateway is north-south traffic; mesh is east-west traffic.


Centralized Configuration Pattern

Interview explanation

“Configuration is stored centrally and fetched at runtime.”

Java tools

  • Spring Cloud Config
  • Consul
  • AWS Parameter Store

Benefits

  • Secure secret handling
  • No restart needed
  • Versioned config

Distributed Logging & Tracing Pattern

Interview explanation

“Microservices generate logs in multiple nodes, so we need centralized log aggregation and distributed tracing.”

Tools

  • ELK
  • OpenTelemetry
  • Zipkin
  • Jaeger

Follow-up

How do you trace one request across 20 microservices?
Pass correlation ID through all services (propagated via headers).
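The propagation rule is simple: reuse the incoming ID if there is one, otherwise create it at the edge. In this sketch the header name X-Correlation-Id is a common convention assumed for the example, and CorrelationIds is a hypothetical helper; tracing libraries such as OpenTelemetry handle this automatically with their own headers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class CorrelationIds {
    // Assumed header name; teams pick their own or rely on a tracing standard.
    static final String HEADER = "X-Correlation-Id";

    // Builds the headers for an outgoing call from the headers of the incoming request.
    static Map<String, String> outgoingHeaders(Map<String, String> incoming) {
        Map<String, String> outgoing = new HashMap<>();
        String id = incoming.get(HEADER);
        // Reuse the caller's ID so every hop logs the same value; mint one at the first hop.
        outgoing.put(HEADER, id != null ? id : UUID.randomUUID().toString());
        return outgoing;
    }
}
```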


Aggregator Pattern

Interview explanation

“Aggregator composes data from multiple microservices into a single response.”

Use case

  • Dashboard API
  • Combining Order + User + Inventory data

Anti-Corruption Layer (ACL)

Interview explanation

“ACL isolates microservices from legacy systems using translation adapters.”

Use case

Legacy SOAP → ACL → New REST Microservice.
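A translation adapter of this kind can be very small. The legacy field names and status code below are invented for illustration, as are the record and class names; the point is only that legacy conventions stop at the adapter.

```java
// Shape of the data as the hypothetical legacy system returns it.
record LegacyCustomerRecord(String custNm, String statusCd) {}

// Shape of the data as the new microservice's domain model expects it.
record CustomerProfile(String name, boolean active) {}

// Anti-corruption layer: the only place that knows the legacy field names and codes.
final class LegacyCustomerTranslator {
    static CustomerProfile toDomain(LegacyCustomerRecord legacy) {
        // Assumption for this sketch: the legacy system pads names and uses "A" for active.
        return new CustomerProfile(legacy.custNm().trim(), "A".equals(legacy.statusCd()));
    }
}
```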

Service Registry and Discovery:

    • Pattern: Service Discovery
    • Description: A service registry is used to keep track of the available microservices and their locations. Services register themselves with the registry, and clients can discover and communicate with services through the registry.
    • Benefits: Enables dynamic scaling and seamless service discovery.
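The register and discover operations can be sketched with an in-memory map. InMemoryServiceRegistry is a hypothetical class for illustration; real registries such as Eureka or Consul add health checks, leases, and replication on top of the same idea.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class InMemoryServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    // Called by a service instance on startup (self registration).
    void register(String service, String address) {
        instances.computeIfAbsent(service, key -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called on shutdown, or by a health checker when an instance stops responding.
    void deregister(String service, String address) {
        List<String> addresses = instances.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    // Clients (or a router) look up the current locations before calling a service.
    List<String> lookup(String service) {
        return List.copyOf(instances.getOrDefault(service, List.of()));
    }
}
```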

Event Sourcing:

    • Pattern: Event Sourcing
    • Description: Instead of storing only the current state of data, all changes to the state are captured as a sequence of events. The system's state can be reconstructed at any point by replaying the events.
    • Benefits: Provides a reliable audit trail, supports scalability, and allows for building event-driven architectures.

Bulkhead Pattern:
    • Pattern: Bulkhead Pattern
    • Description: Segregates components into isolated pools to prevent the failure of one component from affecting others. For example, thread pools can be used to isolate requests to a specific microservice.
    • Benefits: Enhances system resilience by isolating failures and limiting their impact.

Choreography vs. Orchestration:
    • Pattern: Choreography and Orchestration
    • Description: Defines how microservices collaborate to achieve a specific goal. Choreography is a decentralized approach where each service communicates directly with others. Orchestration is a centralized approach where a central component (orchestrator) coordinates the interactions.
    • Benefits: Flexibility with choreography and centralized control with orchestration.

API Composition:
    • Pattern: API Composition
    • Description: An API composer (often the API gateway or a dedicated service) calls the microservices that own the data and combines their results in memory. Each microservice keeps ownership of its own data and exposes it only through its API, which helps maintain independence and scalability.
    • Benefits: Reduces dependencies between microservices, allowing them to evolve independently.

API Composition Patterns

Aggregator Pattern

One component calls multiple microservices and aggregates data.

Ideal for:

  • Dashboard APIs
  • Mobile apps

Strangler Fig Pattern

Used for migrating monolith → microservices.

Gradually replace old modules with new microservices.

Anti-Corruption Layer

Protect microservices from legacy systems.

Example:

  • Converting old SOAP response → REST JSON
  • Data format mapping