What does a Solution Architect do?
A solution architect designs the end-to-end technical solution for a business problem.
Key Responsibilities
- Understand requirements (functional + non-functional)
- Create HLD + LLD
a. HLD → What components? Why?
b. LLD → How each module works internally
- Choose tech stack & cloud components
- Define API contracts
- Define scaling, HA, performance, security
- Review code and architecture
- Work with dev, DevOps, business, security teams
NFRs (Non-Functional Requirements)
Important NFRs in design rounds:
- Scalability (vertical/horizontal) - The ability of a system to handle increasing load by adding resources.
a. Vertical scaling: Adding more CPU/RAM to a single machine.
b. Horizontal scaling: Adding more machines (preferred for cloud/distributed systems).
- Availability (SLA, failover) - The system continues to serve requests even during failures. Achieved by:
a. Multi-AZ deployments
b. Redundancy
c. Load balancers
d. Health checks & failover mechanisms
- Performance (latency, throughput)
a. Latency: time for one request
b. Throughput: requests per second
- Consistency (strong vs eventual)
a. Strong consistency - After an update is made to the data, it is immediately visible to every subsequent read. Put simply, reads always reflect the latest write.
b. Eventual consistency - After an update is made to the data, it becomes visible to subsequent reads eventually, not necessarily immediately. The data is replicated asynchronously, so all copies converge on the same value once replication catches up.
- Security (IAM, auth, encryption)
- Fault tolerance (retry, circuit breaker) - The system should continue functioning even if components fail.
- Maintainability
- Observability (metrics, logs, traces)
Examples:
| Choice | Pros | Cons |
| Monolith | Simple, fewer network calls | Hard to scale independently |
| Microservices | Scalable, flexible | Complexity, distributed issues |
| SQL | Strong ACID | Harder to scale horizontally |
| NoSQL | Fast, scalable | Often only eventual consistency |
“A solution architect balances business requirements and engineering constraints, designs a scalable & secure solution, and makes trade-offs considering cost, performance, and complexity.”
Architect Decision Framework (CRISP)
Use this CRISP Framework for every architecture decision:
C - Constraints: Tech limits, team skill, timeline, data size
R - Requirements: Functional + NFRs
I - Impact: Latency, cost, reliability, operability
S - Scalability: Handling growth (10x, 100x)
P - Patterns: Use suitable architecture patterns
A load balancer distributes incoming traffic across multiple servers to provide:
- High availability
- Fault tolerance
- Scalability
- Even workload distribution
Types: L4 (works on TCP/UDP), L7 (HTTP/HTTPS), Global Load Balancers.
| Function | Explanation |
| Distributes Traffic | Spreads requests across multiple servers (round-robin, least connections, IP hash). |
| Health Checks | Sends traffic only to healthy backend servers. |
| Failover | If one instance goes down, the LB routes to other healthy instances. |
| SSL Termination | Can terminate HTTPS before traffic reaches the backend. |
| Global Traffic Routing | Cloud load balancers can route based on geography, latency, etc. |
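The traffic distribution, health check, and failover rows can be illustrated with a minimal in-memory sketch. It assumes a hypothetical `RoundRobinBalancer` class and made-up backend names; a real load balancer would drive `mark_down` from periodic health probes rather than manual calls.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer: round-robin over healthy backends only."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # assume all healthy at start
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # In a real LB this would be triggered by a failed health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        self.healthy.add(backend)

    def next_backend(self):
        # Skip unhealthy backends; give up after one full pass.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_three = [lb.next_backend() for _ in range(3)]
lb.mark_down("app-2")  # failover: traffic now skips app-2
after_failure = [lb.next_backend() for _ in range(4)]
```

Here `first_three` cycles through all three backends, while `after_failure` alternates between `app-1` and `app-3` only.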
What is an API Gateway?
An API Gateway is a full-featured API management layer that sits in front of microservices and controls how APIs are accessed by clients.
It is used in microservices architectures as a single entry point.
| Feature | Explanation |
| Routing | Routes API requests to the correct microservice. |
| Authentication & Authorization | JWT, OAuth2, API Keys, SSO, IAM. |
| Rate Limiting | Protects services from overload. |
| Throttling / Quotas | Ensures fair usage. |
| Request Transformation | Rewrites headers, body, URL. |
| API Versioning | /v1 → service1, /v2 → service2. |
| Logging & Monitoring | API analytics, tracking, tracing. |
| Circuit Breaker / Retry / Timeout | Resilience patterns. |
| Security Policies | CORS, WAF, IP whitelisting. |
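Rate limiting is often implemented with a token bucket. The sketch below is one possible approach, not how any particular gateway does it; the `TokenBucket` class, its parameters, and the injected fake clock are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Hypothetical limiter: `rate` tokens per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a gateway would typically answer HTTP 429 here


# Fake clock so the example is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]  # two allowed, third rejected
fake_now[0] += 1.0                          # one second later: one token refilled
after_wait = bucket.allow()
```

The bucket allows short bursts up to `capacity` while holding the long-run rate to `rate` requests per second.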
Examples:
- Kong
- Apigee
- AWS API Gateway
- Azure API Management
- NGINX API Gateway
- Zuul / Spring Cloud Gateway
Idempotency: executing the same request multiple times produces the same result as executing it once.
Used in:
- Payment APIs
- Retry mechanisms
- Messaging
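For payment APIs, a common way to get this property is an idempotency key supplied by the client. The following is a minimal sketch under that assumption; `PaymentService`, its fields, and the key format are hypothetical, and a real service would keep the key store in durable storage.

```python
class PaymentService:
    """Hypothetical payment handler that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.charges = []   # side effects actually performed

    def charge(self, idempotency_key, account, amount):
        # A repeated key returns the stored response instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((account, amount))
        response = {"status": "charged", "charge_id": len(self.charges)}
        self._results[idempotency_key] = response
        return response


svc = PaymentService()
first = svc.charge("key-123", "acct-1", 50)
retry = svc.charge("key-123", "acct-1", 50)  # client retry after a timeout
```

The retry returns the same response as the first call, and only one charge is recorded.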
✔️ Idempotent / Non-Idempotent HTTP Methods
| HTTP Method | Idempotent? | Why |
| GET | ✔️ Yes | Reading does not change state |
| PUT | ✔️ Yes | Replacing a resource with the same data gives the same result |
| DELETE | ✔️ Yes | Resource stays deleted after the first call |
| HEAD / OPTIONS | ✔️ Yes | Metadata only |
| POST | ❌ No | Creates a new resource each time |
The following design principles are the foundation of designing scalable, maintainable, high-quality systems.
Expect every architecture interview to test them.
1. SOLID Principles (Core Software Design Principles)
1. Single Responsibility Principle (SRP)
A class/module should have only one reason to change.
- Leads to high cohesion
- Easier testing and maintainability
2. Open/Closed Principle (OCP)
Software should be open for extension but closed for modification.
- Add new behavior without altering existing code
- Uses interfaces, inheritance, strategy pattern
3. Liskov Substitution Principle (LSP)
Child classes must be substitutable for their parent classes without breaking functionality.
4. Interface Segregation Principle (ISP)
Prefer smaller, specific interfaces over large, general ones.
Clients should not depend on methods they do not use.
5. Dependency Inversion Principle (DIP)
Depend on abstractions, not concrete implementations.
This improves modularity, testability, and flexibility.
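As a small illustration of Dependency Inversion, the sketch below has a service depend on an abstraction so that a test double can be swapped in. The names `Notifier`, `EmailNotifier`, `FakeNotifier`, and `OrderService` are invented for the example.

```python
from typing import Protocol


class Notifier(Protocol):
    """The abstraction that high-level code depends on."""

    def send(self, message: str) -> None: ...


class EmailNotifier:
    """One concrete implementation (here it only records what it would send)."""

    def __init__(self):
        self.sent = []

    def send(self, message: str) -> None:
        self.sent.append(f"email: {message}")


class FakeNotifier:
    """Test double that satisfies the same abstraction."""

    def __init__(self):
        self.sent = []

    def send(self, message: str) -> None:
        self.sent.append(message)


class OrderService:
    # Depends on the Notifier abstraction, not on a concrete class.
    def __init__(self, notifier: Notifier):
        self.notifier = notifier

    def place_order(self, order_id: str) -> None:
        self.notifier.send(f"order {order_id} placed")


fake = FakeNotifier()
OrderService(fake).place_order("A1")
```

Because `OrderService` never names a concrete notifier, swapping email for another channel or a fake requires no change to it, which also shows the Open/Closed Principle at work.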
DRY (Don’t Repeat Yourself) - Avoid duplication.
Duplicate logic → bugs + high maintenance.
KISS (Keep It Simple, Stupid) - Prefer simple architectures over complex ones.
Avoid over-engineering.
YAGNI (You Aren’t Gonna Need It) - Do not build features until necessary.
Prevents wasted effort and unnecessary complexity.
3. Coupling & Cohesion
Loose Coupling - Components should have minimal dependencies on each other.
Helps in scaling, modifying, testing.
High Cohesion - Each component should do one focused task.
Improves maintainability.
“High cohesion + loose coupling = resilient and maintainable architecture.”
4. Separation of Concerns (SoC)
Break the system into distinct sections, each handling one concern.
Examples:
- UI layer
- API layer
- Business logic layer
- Storage layer
Prevents mixing responsibilities.
5. 12-Factor App Principles (Cloud Architecture)
Must-know for architect roles.
- Codebase
- Dependencies
- Config in environment variables
- Backing services
- Build, release, run
- Processes (stateless)
- Port binding
- Concurrency
- Disposability
- Dev/prod parity
- Logs as event streams
- Admin processes
6. Scalability Principles
Scale Out (Horizontal) → Add servers
Scale Up (Vertical) → Increase resources
Key concepts:
- Stateless services
- Load balancing
- Sharding
- Caching
- CQRS
7. Performance & Optimization Principles
1. Reduce latency: use caching, CDN, async calls
2. Increase throughput: parallelism, streaming
3. Favor asynchronous I/O over synchronous
4. Apply back-pressure in event systems
8. Security by Design
- Least privilege
- Secure by default
- Encryption at rest + in transit
- Zero Trust
- API rate limiting
- Secrets management (Vault, KMS)
9. Fail-Fast, Fault-Tolerance & Resilience
Patterns:
- Circuit Breaker
- Retry with backoff
- Bulkhead
- Timeouts
- Health checks
- Graceful degradation
10. Observability Principles
Three pillars:
- Logs
- Metrics
- Traces
Tools:
- Prometheus
- Grafana
- Jaeger
- ELK stack
11. API Design Principles
Good APIs are:
- Predictable
- Consistent
- Versioned
- Stateless
- Idempotent
- Secure
- Well-documented
Use REST, GraphQL, or gRPC based on need.
12. Maintainability Principles
- Modular architecture
- Readable code
- Automation (CI/CD)
- Monitoring
- Unit/e2e testing
- Clear boundary definitions
13. Architectural Trade-Offs
Architects constantly balance:
| Area | Trade-off |
| Consistency vs Availability | CAP theorem |
| Performance vs Cost | Cloud cost optimization |
| Security vs Usability | Login friction |
| Build vs Buy | Time vs customization |
| Standardization vs Flexibility | Team skillsets |
14. Design for Change (Evolutionary Architecture)
- Feature toggles
- Canary releases
- Blue-green
- Backward compatibility
- Strangler Fig pattern
Distributed System Concepts
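CAP Theorem
A distributed system can guarantee at most TWO of the following at the same time:
- Consistency
- Availability
- Partition Tolerance
Network partitions cannot be ruled out in a real distributed system, so Partition Tolerance is effectively mandatory. The practical choice is between Consistency and Availability while a partition lasts.
Examples:
- CP → HBase
- AP → Cassandra
- CA → Not possible in a distributed system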
Queues: Point-to-point, worker processing
Examples: SQS, RabbitMQ
Streams: Publish-subscribe with replay
Examples: Kafka
Synchronous vs Asynchronous Communication
Sync = blocking
Async = messaging/event-driven
Interview tip:
“Asynchronous communication improves resilience and decoupling.”
Database Sharding & Partitioning
Sharding → Split data across nodes
Partitioning → Split data inside the same DB
Both can improve read/write throughput.
Leader Election
Used in distributed systems to choose a:
- Master node
- Coordinator
- Lock holder
Tools and algorithms: ZooKeeper (coordination service); Raft, Paxos (consensus algorithms)
Event Sourcing + CQRS
CQRS = separate read/write models
Event Sourcing = state is a sequence of events
Always set timeouts for network calls.
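Consistent Hashing
Used in caching clusters and load balancing. When a node is added or removed, only a small share of keys is remapped instead of the whole key space.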
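A minimal sketch of a consistent-hash ring with virtual nodes follows. The `HashRing` class, the node names, the choice of SHA-256, and the count of 100 virtual nodes are all assumptions for illustration, not taken from any specific cache or load balancer.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        self.vnodes = vnodes
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.remove("cache-c")
after = {k: ring.get(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

After `cache-c` is removed, the only keys in `moved` are those that previously lived on `cache-c`; keys on the other two nodes stay where they were.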
Microservices Core Patterns
1. API Gateway Pattern
What it is:
A single entry point for all microservices. Clients never directly call backend services.
What it does:
- Routing to correct microservice
- Authentication / Authorization
- Request/Response transformation
- Rate limiting & throttling
- Logging & monitoring
- Load balancing (L7)
Why used:
Avoid exposing internal services directly; central control.
Examples:
Kong, Nginx, AWS API Gateway, Spring Cloud Gateway
Interview Tip:
“API Gateway simplifies client communication and enforces cross-cutting concerns in one place.”
2. Service Registry & Discovery Pattern
What it is:
Dynamic discovery of microservice endpoints.
How it works:
- Services register themselves at startup (self-registration)
- Other services use the registry to find them (lookup)
Why used:
Microservices scale up/down frequently → IPs change → avoid hardcoding.
Tools:
Eureka, Consul, Zookeeper, Kubernetes DNS
3. Circuit Breaker Pattern
What it is:
A protection mechanism to stop calling a failing service.
How it works:
- If failures exceed threshold → circuit opens
- Requests fail fast (no waiting)
- After cooldown → half-open → test requests
- If recovered → closed
Why used:
Prevents cascading failures in distributed systems.
Tools:
Resilience4j, Hystrix
Interview Tip:
“It improves fault tolerance by isolating failures.”
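The closed → open → half-open → closed cycle can be sketched as a small state machine. This is an illustrative toy, not the Resilience4j or Hystrix API; the `CircuitBreaker` class, its thresholds, and the injected clock are assumptions.

```python
import time


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    """Hypothetical breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, cooldown=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func):
        if self.state == "open":
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("failing fast")  # downstream is not called
            self.state = "half-open"  # cooldown elapsed: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.state = "closed"
        return result


now = [0.0]  # fake clock for a deterministic example
breaker = CircuitBreaker(failure_threshold=2, cooldown=10.0, clock=lambda: now[0])


def failing():
    raise ConnectionError("downstream unavailable")


for _ in range(2):
    try:
        breaker.call(failing)
    except ConnectionError:
        pass
state_after_failures = breaker.state

now[0] += 10.0                          # cooldown elapsed
recovered = breaker.call(lambda: "ok")  # trial call succeeds
state_after_recovery = breaker.state
```

Two consecutive failures open the circuit; once the cooldown has passed, a single successful trial call closes it again.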
4. Bulkhead Pattern
What it is:
Isolates resources (threads, memory, connection pools) per service or function.
Why used:
Prevents one service/thread pool from exhausting resources and crashing others.
Example:
Each microservice gets its own thread pool → if one fails, others unaffected.
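One lightweight way to approximate a bulkhead is a per-dependency concurrency cap. The sketch below uses a semaphore rather than separate thread pools; the `Bulkhead` class and the `payments` / `reports` compartments are hypothetical.

```python
import threading


class Bulkhead:
    """Hypothetical bulkhead: caps concurrent calls to one dependency."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func):
        # Reject immediately instead of queueing when the compartment is full.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full")
        try:
            return func()
        finally:
            self._slots.release()


payments = Bulkhead(max_concurrent=1)
reports = Bulkhead(max_concurrent=1)


def slow_payment_call():
    # While the single payments slot is taken, another payments call is rejected...
    try:
        payments.call(lambda: "second payment")
        nested = "accepted"
    except RuntimeError:
        nested = "rejected"
    # ...but the reports compartment is unaffected.
    return nested, reports.call(lambda: "report ok")


outcome = payments.call(slow_payment_call)
```

A saturated `payments` compartment rejects extra work while `reports` keeps serving, which is the isolation the pattern is after.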
5. Sidecar Pattern
What it is:
A helper container that runs alongside the main application container.
What the sidecar does:
- Logging
- Proxying
- Service mesh tasks
- Monitoring
- Security policies
Why used:
Separates responsibilities; no need to embed infrastructure code inside the service.
Examples:
Envoy, Istio sidecar proxy
6. Strangler Fig Pattern
What it is:
A safe strategy to migrate a monolith to microservices gradually.
How it works:
- Route one functionality from monolith → new microservice
- Slowly replace modules one-by-one
- Monolith shrinks over time
Why used:
No big-bang rewrite; low risk.
7. Saga Pattern
What it is:
A way to manage distributed transactions across microservices.
Two types:
- Choreography Saga: services communicate via events
- Orchestration Saga: central coordinator orchestrates steps
Why used:
Avoid 2-phase commit. Ensures eventual consistency.
Use Case:
Order creation → payment → inventory → shipping
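An orchestration saga for the order flow above can be sketched as a list of steps, each paired with a compensating action. The `run_saga` helper and the step names are hypothetical, and the actions are stubs; a real orchestrator would also persist saga state so it can resume after a crash.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; on failure, undo completed steps in reverse."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"{name}: done")
            completed.append((name, compensate))
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
    return True, log


def reserve_inventory():
    raise RuntimeError("out of stock")  # simulated failure in the third step


steps = [
    ("create_order", lambda: None, lambda: None),
    ("charge_payment", lambda: None, lambda: None),
    ("reserve_inventory", reserve_inventory, lambda: None),
    ("schedule_shipping", lambda: None, lambda: None),
]
ok, log = run_saga(steps)
```

When inventory reservation fails, the payment and the order are compensated in reverse order and shipping is never attempted.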
8. CQRS (Command Query Responsibility Segregation)
What it is:
Separate the write model and read model into independent systems.
Why used:
- Reads need speed → denormalized data
- Writes need correctness → normalized data
- High performance at massive scale
Example:
Writes → PostgreSQL
Reads → Elasticsearch
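The split between a write model and a read model can be shown in memory, without either database. In this sketch `OrderWriteModel` and `CustomerSpendReadModel` are hypothetical, and the read model is updated synchronously for simplicity; in practice the update is usually asynchronous, which is where eventual consistency comes in.

```python
class OrderWriteModel:
    """Hypothetical command side: validates and stores normalized rows."""

    def __init__(self, publish):
        self.orders = {}  # order_id -> {"customer_id", "total"}
        self._publish = publish

    def place_order(self, order_id, customer_id, total):
        if total <= 0:
            raise ValueError("total must be positive")
        self.orders[order_id] = {"customer_id": customer_id, "total": total}
        self._publish({"type": "OrderPlaced", "customer_id": customer_id, "total": total})


class CustomerSpendReadModel:
    """Hypothetical query side: denormalized view built for one query."""

    def __init__(self):
        self.total_spend = {}  # customer_id -> running total

    def apply(self, event):
        if event["type"] == "OrderPlaced":
            cid = event["customer_id"]
            self.total_spend[cid] = self.total_spend.get(cid, 0) + event["total"]

    def spend_for(self, customer_id):
        return self.total_spend.get(customer_id, 0)


read_model = CustomerSpendReadModel()
write_model = OrderWriteModel(publish=read_model.apply)  # synchronous here for simplicity
write_model.place_order("o1", "c1", 40)
write_model.place_order("o2", "c1", 60)
spend = read_model.spend_for("c1")
```

The query side answers "how much has this customer spent?" from a precomputed total instead of scanning orders.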
9. Event Sourcing
What it is:
Instead of storing final state, store a sequence of events that produced the state.
Why used:
- Full audit history
- Replay events to rebuild state
- Works well with CQRS
Use Case Examples:
Bank account:
Deposit + Withdraw events → final balance
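The bank account example can be sketched by folding over an event log. The `Event` type and the `apply` / `replay` helpers are illustrative names, and the amounts are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposit" or "withdraw"
    amount: int


def apply(balance, event):
    """Return the new balance after one event."""
    if event.kind == "deposit":
        return balance + event.amount
    if event.kind == "withdraw":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, initial=0):
    """Rebuild state by applying every event in order."""
    balance = initial
    for event in events:
        balance = apply(balance, event)
    return balance


event_log = [Event("deposit", 100), Event("withdraw", 30), Event("deposit", 5)]
current_balance = replay(event_log)        # state rebuilt from the full log
balance_after_two = replay(event_log[:2])  # state as of an earlier point
```

Replaying a prefix of the log gives the balance at any earlier point, which is what makes the audit history useful.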
10. Aggregator Pattern
What it is:
A single service or layer that calls multiple microservices and returns a combined response.
Why used:
Reduces client calls → improves performance.
Example:
Mobile app needs:
- User Profile
- Orders
- Notifications
Aggregator fetches all → returns single response.
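A minimal sketch of the mobile app example, assuming three hypothetical fetch functions that stand in for real service calls. It fans the calls out in parallel with a thread pool and merges the results; the 2-second timeout is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for calls to three downstream services.
def fetch_profile(user_id):
    return {"id": user_id, "name": "Ada"}


def fetch_orders(user_id):
    return [{"order_id": 1}, {"order_id": 2}]


def fetch_notifications(user_id):
    return ["Your order shipped"]


def dashboard(user_id):
    # Call the services in parallel and merge them into one response.
    with ThreadPoolExecutor(max_workers=3) as pool:
        profile = pool.submit(fetch_profile, user_id)
        orders = pool.submit(fetch_orders, user_id)
        notifications = pool.submit(fetch_notifications, user_id)
        return {
            "profile": profile.result(timeout=2),
            "orders": orders.result(timeout=2),
            "notifications": notifications.result(timeout=2),
        }


response = dashboard(42)
```

The client makes one call and gets all three pieces back; total latency is close to that of the slowest downstream call rather than the sum of all three.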
11. Database per Service Pattern
What it is:
Each microservice has its own database (schema or physical DB).
Why used:
- Loose coupling
- Independent scaling
- Independent schema evolution
- Avoid cross-service locking
Important Rule:
❌ No sharing a database directly
✔ Communicate via APIs or events
12. Anti-Corruption Layer (ACL)
What it is:
Adapter layer to protect microservices from legacy system complexity.
Why used:
Prevents legacy models & logic from polluting new microservices.
Example:
Mapping old SOAP XML → new REST JSON.
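The XML-to-JSON mapping might look like the sketch below. The legacy element names (`CUST_NO`, `CUST_NM`, `STAT_CD`) and the target fields are invented for illustration, and a real SOAP payload would also carry an envelope and namespaces that are omitted here.

```python
import json
import xml.etree.ElementTree as ET


def legacy_customer_to_json(xml_text):
    """Translate a hypothetical legacy XML payload into the new service's model."""
    root = ET.fromstring(xml_text)
    customer = {
        # Legacy field names and codes stay on this side of the boundary.
        "id": int(root.findtext("CUST_NO")),
        "name": root.findtext("CUST_NM").strip().title(),
        "active": root.findtext("STAT_CD") == "A",
    }
    return json.dumps(customer, sort_keys=True)


legacy_xml = (
    "<Customer><CUST_NO>42</CUST_NO>"
    "<CUST_NM> ADA LOVELACE </CUST_NM><STAT_CD>A</STAT_CD></Customer>"
)
new_json = legacy_customer_to_json(legacy_xml)
```

Only this adapter knows the legacy names and status codes, so the new service works with `id`, `name`, and `active` alone.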
13. Retry Pattern
What it is:
Automatically retry failed operations with backoff.
Why used:
Many network errors in distributed systems are transient.
Guidelines:
- Retry only idempotent operations
- Use exponential backoff
- Use max retry limit
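The three guidelines can be combined in a small helper. This is a sketch: the `retry` function and its defaults are assumptions, and the random jitter is a common addition that the guidelines above do not mention.

```python
import random
import time


def retry(func, max_attempts=4, base_delay=0.1, max_delay=2.0,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Retry an idempotent call with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # retry limit reached
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))  # jitter spreads out retry storms


attempts = []


def flaky():
    # Fails twice with a transient error, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient network error")
    return "ok"


delays = []
result = retry(flaky, sleep=delays.append)  # record delays instead of sleeping
```

Only the exception types listed in `retry_on` are retried; anything else propagates immediately, which keeps non-transient errors from being masked.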
14. Timeout Pattern
What it is:
Every external call has a maximum waiting time.
Why used:
Avoid threads waiting forever → improves system health.
Interview Tip:
“Timeout + Retry + Circuit Breaker = resilient system.”
15. Distributed Logging
What it is:
Centralize logs of all microservices in one place.
Why used:
Easy troubleshooting & debugging.
Tools:
ELK Stack, EFK, Splunk
16. Distributed Tracing
What it is:
Tracking a user request across multiple microservices using trace IDs.
Why used:
Identify bottlenecks, failures, slow services.
Tools:
Jaeger, Zipkin, OpenTelemetry
17. Idempotent Consumer Pattern
What it is:
A consumer that can receive the same message multiple times without applying its effect more than once.
Why used:
Ensures safe retries in event-driven systems.
Examples:
- Kafka offset checks
- Upserts instead of inserts
- Ignore if already processed
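The "ignore if already processed" approach can be sketched with a set of seen message IDs. The `IdempotentConsumer` class and the message shape are hypothetical; in production the set of processed IDs would live in a durable store and be updated atomically with the side effect.

```python
class IdempotentConsumer:
    """Hypothetical consumer that skips messages it has already processed."""

    def __init__(self):
        self.processed_ids = set()  # in production: a durable store
        self.balance = 0

    def handle(self, message):
        if message["id"] in self.processed_ids:
            return False  # duplicate delivery: ignore
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "amount": 10},
    {"id": "m2", "amount": 5},
    {"id": "m1", "amount": 10},  # redelivered by an at-least-once broker
]
handled = [consumer.handle(m) for m in deliveries]
```

The redelivered `m1` is acknowledged but not applied again, so the balance reflects each message exactly once.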
1. Architecture Styles
Monolithic Architecture
Definition:
Entire application packaged as a single deployment unit.
Advantages:
- Simple to develop & deploy
- Easy for small teams
Disadvantages:
- Parts of the application cannot be scaled independently
- One failure can bring the entire app down
Example:
Spring Boot application packaged as a single WAR/JAR.
Interview tip:
Explain why monoliths are good for early stages and why microservices come later.
Microservices Architecture
Definition:
Application broken into small, independent services, each with its own DB and CI/CD pipeline.
Key principles:
- Loose coupling
- High cohesion
- Independent deployability
- Polyglot (any language/DB)
Patterns:
- API Gateway
- Service Registry
- Circuit Breaker
- Event-Driven Architecture
- Saga Pattern
Interview tip:
Explain that microservices solve organizational scaling, not only technical scaling.
Event-Driven Architecture
Definition:
Services communicate asynchronously through events.
Tools: Kafka, RabbitMQ, AWS SNS/SQS
Benefits:
- Loose coupling
- High scalability
- Better performance for large workloads
Interview tip:
Explain the difference between Event Notification and Event-Carried State Transfer.
Serverless Architecture
Definition:
Running functions without managing servers.
Platforms: AWS Lambda, Azure Functions
Advantages:
- Auto scaling
- Pay per use