Why should one use WebSocket over HTTP?
The WebSocket protocol is a newer, more efficient way of establishing real-time
communication between a client and a server than traditional HTTP. It opens a
persistent, two-way communication channel over which data can be exchanged
continually, eliminating the need for the client to poll the server with
repeated requests. Given that modern web applications demand fast responses,
WebSockets provide superior performance by reducing response latency and
providing a smoother flow of data.
That higher throughput not only improves the user experience but can also lead
to substantial cost reductions through more efficient use of network resources.
In addition, a single long-lived WebSocket connection carries many messages
over one TCP connection, avoiding the per-request connection and header
overhead of repeated HTTP calls. Ultimately, using the WebSocket protocol
instead of HTTP-based request/response can have far-reaching benefits for
application speed, performance, and cost.
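As a brief illustration, here is a minimal client-side sketch, assuming a runtime that provides the standard WebSocket API (such as a browser) and a hypothetical endpoint URL:

```typescript
// Minimal sketch: subscribing to real-time updates over a single WebSocket
// connection instead of polling an HTTP endpoint. The endpoint and message
// shape are hypothetical.
const socket = new WebSocket("wss://example.com/updates");

socket.addEventListener("open", () => {
  // One long-lived TCP connection; no per-request handshake from here on.
  socket.send(JSON.stringify({ type: "subscribe", channel: "prices" }));
});

socket.addEventListener("message", (event) => {
  // The server can push data at any time without a new client request.
  console.log("update received:", event.data);
});

socket.addEventListener("close", () => {
  console.log("connection closed");
});
```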
Explain separation of concerns in software
architecture.
Separation of Concerns (SoC) is a fundamental design principle that
advocates for the division of a system into distinct sections—each addressing a
particular set of functionalities.
Elements of SoC
· Single Responsibility:
Modules, classes, and methods should only have one reason to change. For
example, a User class should handle user data, but not also display
user data.
· Low Coupling:
Components should minimize their interdependence.
· High Cohesion:
Elements within a component should pertain to the same functionality.
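The following sketch illustrates these elements with hypothetical User and UserView classes, keeping data handling and presentation in separate, loosely coupled, highly cohesive units:

```typescript
// Illustrative sketch (names are hypothetical): the User class owns user data,
// while presentation lives in a separate class, so each has one reason to change.
class User {
  constructor(public name: string, public email: string) {}

  changeEmail(newEmail: string): void {
    this.email = newEmail;
  }
}

class UserView {
  // Presentation concern kept out of the domain class.
  render(user: User): string {
    return `${user.name} <${user.email}>`;
  }
}

const user = new User("Ada", "ada@example.com");
console.log(new UserView().render(user));
```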
Define a system quality attribute and
its importance in software architecture.
System quality attributes, often referred to as non-functional requirements,
complement functional requirements in shaping a system's architecture and
design. They define characteristics such as reliability, maintainability, and
performance.
Developers aim to address these attributes from the conceptual stages onward,
so they shape the software's foundation and guide architectural
decision-making throughout the development cycle.
Importance of Quality Attributes
Holistic User Experience
While functional requirements capture what the
system should do, quality attributes capture how well it should do it.
Together, these define a user's complete experience with the system.
Design Focus
Quality attributes help guide the system design,
ensuring that not only the main features but also the system's overall
robustness, security, and usability are addressed.
Balanced Decisions
Engineering decisions must often balance
competing objectives. Quality attributes help communicate these objectives,
making it easier for architects and developers to make informed decisions.
Technical Compatibility
Different quality attributes can complement or
contradict each other, and it's important to balance them to ensure a cohesive,
efficient system.
Common System Quality Attributes
1. Performance: Describes the system's responsiveness, throughput, and resource consumption levels, typically under specific conditions. For instance, the system might need to perform optimally when handling a large number of concurrent users or a heavy workload.
2. Reliability: Refers to the system's ability to perform consistently and accurately, without unexpected failures. Systems with high reliability often integrate fault tolerance mechanisms and have a defined recovery strategy in place, like data backups or redundant components.
3. Availability: This attribute specifies the system's uptime and accessibility. It's often expressed as a percentage of time the system is expected to be operational, for example "99.99% uptime."
4. Security: A mandatory system attribute that addresses the protection of data and resources from unauthorized access, breaches, or corruption. It is vital for systems where data confidentiality, integrity, and availability are paramount.
5. Maintainability: Represents how easily a system's components can be maintained or modified. It focuses on the efficiency of repairs, upgrades, and adaptations. Key metrics include time for updates, code complexity after changes, and the number of errors introduced by a modification.
6. Portability: Defines a system's adaptability to run across different environments, such as diverse hardware, operating systems, or cloud providers. A more portable system is generally preferred as it offers flexibility and future-proofing.
7. Scalability: Refers to the system's ability to accommodate growing workloads. It might be realized through vertical scaling (upgrading hardware) or horizontal scaling (adding more instances).
8. Usability: Emphasizes the system's ease of use and intuitive operation, catering to user experience aspects.
9. Interoperability: Describes a system's capability to communicate and share data with other systems or components, and its compatibility with different technologies.
10. Testability: The degree to which a system facilitates the creation of test cases and the execution of testing processes.
11. Flexibility: Represents the system's capacity to adapt to new situations through customization.
Key Performance Indicators
Adherence to each quality attribute can be measured, typically using
quantitative metrics or key performance indicators (KPIs):
· Performance: Utilization metrics, response times, and throughput.
· Reliability: Often measured via failure rates or mean time between failures (MTBF).
· Availability: Can be measured using uptime metrics, such as "five nines" (99.999%); a short downtime-budget calculation follows this list.
· Security: Can be evaluated using penetration testing results, compliance indicators, security frameworks adhered to, and the success rates of specific security protocols.
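To make availability targets concrete, here is the downtime-budget calculation mentioned above; the formula is standard and the printed figures are approximate:

```typescript
// Quick illustration: converting an availability target into an annual downtime budget.
const minutesPerYear = 365.25 * 24 * 60;

function downtimeBudgetMinutes(availability: number): number {
  return (1 - availability) * minutesPerYear;
}

console.log(downtimeBudgetMinutes(0.9999).toFixed(1));  // "99.99%"  -> ~52.6 minutes/year
console.log(downtimeBudgetMinutes(0.99999).toFixed(1)); // "99.999%" -> ~5.3 minutes/year
```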
Architectural Decisions
Architectural patterns and styles, as well as
design strategies, are thoroughly informed by quality attributes, ensuring that
the completed software system best meets its operational goals.
For instance, a system focusing on high
availability like a cloud-based ERP might adopt a microservices architecture
and utilize load balancers and auto-scaling clusters.
In contrast, a system that requires high
reliability, such as a medical equipment monitoring system, might employ a
modular architecture with strict data consistency mechanisms and undergo
stringent testing procedures.
Describe the concept of a software
architectural pattern.
A Software Architectural Pattern is
a proven, structured solution to a recurring design problem. These patterns
offer a blueprint for conceptualizing systems and addressing common challenges
in software architecture. They provide a vocabulary for developers and ensure
commonly-faced problems are solved in a consistent manner.
Common Architectural Patterns
1. Layers: Segregates functionality based on roles like presentation, domain logic, and data access.
2. MVC: Divides an application into three interconnected components: Model, View, and Controller, each with specific responsibilities.
3. REST: Utilizes common HTTP verbs and status codes, along with stateless communication, for easy data transfer in client-server setups.
4. Event-Driven: Emphasizes communication through events, with publishers and subscribers decoupled from one another (see the small publish/subscribe sketch after this list).
5. Microkernel: Centralizes core operations in a lightweight kernel, while other services can be dynamically loaded and interact via messaging.
6. Microservices: Distributes applications into small, independently deployable services that communicate via network calls.
7. Space-Based: Leverages a distributed in-memory data grid for data sharing and event-driven workflows.
8. Client-Server: Divides an application into client-side and server-side components, with the server providing resources or services.
9. Peer-to-Peer (P2P): Gives nodes equal roles, promoting decentralized communication and resource sharing.
10. Domain-Driven Design (DDD): Encourages close alignment between development and a domain model, integrating logic and data into one unit.
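To make the event-driven item above concrete, here is a minimal publish/subscribe sketch; the EventBus class and event names are illustrative inventions, not any particular framework's API:

```typescript
// Minimal sketch of the event-driven style: publishers and subscribers only
// share an event bus, not references to each other.
type Handler = (payload: unknown) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  publish(event: string, payload: unknown): void {
    for (const handler of this.handlers.get(event) ?? []) {
      handler(payload);
    }
  }
}

const bus = new EventBus();
bus.subscribe("order.placed", (order) => console.log("billing saw:", order));
bus.subscribe("order.placed", (order) => console.log("shipping saw:", order));
bus.publish("order.placed", { id: 42 });
```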
Best Practices for Architectural Patterns
· Understanding before Application: Ensure you are truly solving a problem specific to your
context, and not adopting a pattern prematurely or needlessly complicating your
design.
· Design Flexibility: A good architecture allows for future changes and is not overly rigid or prescriptive without reason.
· Code Reusability: Aim to minimize duplication through deliberate design and code-reuse strategies.
· Separation of Concerns:
Each component should have a clear, singular role, and should need to
understand as little as possible about the rest of the system.
· Scalability: The
architecture should be able to scale with complexity, requirements, and user
load.
· Maintainability:
It should be relatively easy to debug, enhance, and maintain the system.
· Security and Compliance: Your architecture should account for security standards in your
domain, including data protection laws and best practices.
· Clear Communication:
Developers should share a consistent vocabulary to understand the architecture,
especially when collaborating on the system.
Real-World Examples
· MVC: It is widely
used in web applications, where the model represents data, the view displays
the data, and the controller handles user inputs.
· Microservices:
This architecture is prevalent in cloud-based systems like Netflix and Amazon.
Services are loosely coupled and focus on specific business functionalities.
· Event-Driven: Used
in various applications like chat systems, stock trading platforms, and IoT
solutions where real-time data processing is crucial.
· Space-Based: In-memory data grid platforms such as GigaSpaces and Hazelcast follow this style for high-throughput, low-latency processing.
· Client-Server:
Common in web and mobile applications, where the server serves as a centralized
resource.
· P2P: Popular in file-sharing applications such as BitTorrent and in decentralized blockchain networks such as Bitcoin.
· DDD: Many
enterprise-level applications, CRM systems, and portfolio management tools
leverage this model for a more domain-focused design approach.
What is CAP theorem?
The CAP theorem is an important concept in distributed systems. It stands for
Consistency, Availability, and Partition Tolerance. The theorem states that
when a network partition occurs, a distributed system can fully guarantee at
most two of these three properties, so designers must trade consistency
against availability. These three elements are at the core of any distributed
system, and understanding how they work together is essential for designing a
reliable one.
Consistency
Consistency in distributed
systems refers to the guarantee that all nodes in a network have the same view
of data. This means that when a user requests data from one node, they should
get the same response regardless of which node they query. Consistent systems
ensure that all nodes will have access to the most up-to-date version of the
data, making them highly reliable.
Availability
The availability element of CAP
theorem guarantees that users will always be able to access their data on
demand. This means that if a user sends a request to one node in the network,
they can expect to receive a response within a certain amount of time. If one
node fails or goes offline, another should be able to step in and provide
service without any interruption or significant delay.
Partition Tolerance
Partition tolerance refers to how a distributed system handles network outages
or communication errors between nodes, continuing to operate even when the
nodes are split into groups that cannot reach each other. A partition-tolerant
system is designed with redundancy built in, so that if some nodes become
unreachable, the remaining nodes can pick up the slack and keep things running
smoothly. In other words, even if some parts of the network become unavailable
due to an outage or error, the rest can still provide service without
interruption or severe degradation in quality.
What is shared-nothing architecture, and
where does it apply?
Shared-nothing architecture is a distributed computing model where each node
operates independently, without sharing memory or disk storage with other
nodes. This approach enhances scalability and fault tolerance.
Shared-nothing architecture applies to systems requiring high scalability and
availability, such as big data processing and distributed databases. By
avoiding shared resources, this architecture minimizes bottlenecks and
improves system reliability and performance.
What is eventual consistency?
Eventual consistency is a consistency model in which changes to the data
eventually take effect, but not necessarily immediately. Replicas may hold
divergent states until all updates have been synchronized across every node in
the system. In essence, this is because distributed systems cannot always
guarantee that operations on the same data are applied in the same order on
different nodes in a network. Accepting eventual consistency can be beneficial
in certain scenarios, since it allows higher availability and better
scalability where strong consistency cannot be guaranteed.
What is Sticky
Session load balancing?
Sticky session load balancing (also called session affinity) routes all
requests from a given user to the same web server, so that any
session-specific data held on that server remains available across subsequent
requests. It goes beyond traditional load balancing, which would otherwise
send each request to whichever server is least loaded regardless of where the
user's session state lives.
The ability of any given load balancer to handle sticky sessions depends on
its capabilities and on which aspects of a request it treats as "session
dependent", such as a session cookie or the client's IP address. With affinity
in place, clients avoid the extra latency and repeated data retrieval that
would result from different machines handling each request in the same
session.
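As an illustration of one common affinity strategy, here is a hedged sketch of IP-hash routing; the backend addresses and hash function are illustrative, and real load balancers typically implement this (or cookie-based affinity) for you:

```typescript
// Hypothetical sketch of IP-hash session affinity: the same client address
// always maps to the same backend, keeping its session state on one server.
const backends = ["app-1:8080", "app-2:8080", "app-3:8080"];

function hash(value: string): number {
  let h = 0;
  for (const ch of value) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple, deterministic string hash
  }
  return h;
}

function pickBackend(clientIp: string): string {
  return backends[hash(clientIp) % backends.length];
}

console.log(pickBackend("203.0.113.7")); // always the same backend for this IP
```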
What is the BASE property of a system?
The acronym stands for "Basically Available, Soft state, Eventual consistency"
and is used to describe how distributed data stores are managed. Basically
Available means the system prioritizes availability, so every read returns a
response, even if that response is stale data or a failure rather than the
latest value. Soft state means the state of the system may change over time
even without new input, because replicas are still converging and temporarily
inconsistent states are allowed. Finally, Eventual consistency means that an
update may not be reflected everywhere at once, but given time all replicas
will eventually become consistent.
Explain
cache stampede?
Cache stampede (also called dog-piling) is a situation where a popular cache
entry is missing or expires and many processes or threads miss the cache at
the same time, all recomputing the value or querying the backing store
simultaneously, resulting in a burst of redundant work and performance
degradation.
There are several situations that can lead to a cache stampede, including:
- Expiration of a hot key: when a frequently requested entry expires, every concurrent request misses the cache and falls through to the backing store until the entry is repopulated.
- Invalidation: if one process invalidates data that other processes are actively reading, those processes all have to go back to the backing store at once, which can lead to high latency and contention.
- Cold start: an empty cache after a restart or deployment sends all traffic to the backing store at the same time.
To avoid cache stampedes, common techniques include request coalescing or
locking, so that only one caller recomputes a missing value while the others
wait (or are served slightly stale data); adding jitter to expiration times so
that hot keys do not all expire together; and refreshing popular entries in
the background before they expire. Where invalidation is necessary, care
should be taken to invalidate only the minimum amount of data.
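One widely used mitigation is request coalescing, sketched below under the assumption of a single-process in-memory cache; loadFromDatabase is a hypothetical stand-in for the expensive operation:

```typescript
// Hedged sketch of request coalescing: concurrent callers for the same key
// share one in-flight load instead of each hitting the backend.
const cache = new Map<string, unknown>();
const inFlight = new Map<string, Promise<unknown>>();

async function loadFromDatabase(key: string): Promise<unknown> {
  return { key, loadedAt: Date.now() }; // placeholder for a slow query
}

async function getWithCoalescing(key: string): Promise<unknown> {
  if (cache.has(key)) return cache.get(key);

  let pending = inFlight.get(key);
  if (!pending) {
    // Only the first caller triggers the expensive load; the others await it.
    pending = loadFromDatabase(key).then((value) => {
      cache.set(key, value);
      inFlight.delete(key);
      return value;
    });
    inFlight.set(key, pending);
  }
  return pending;
}
```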
What does cohesion mean in software
architecture?
Cohesion is a measure of how well the elements of a software module work
together. A highly cohesive module is one in which the elements are closely
related and contribute to a single purpose; in other words, they work together
well. A low-cohesion module, on the other hand, is one in which the elements
are largely unrelated to each other. This can lead to problems because the
elements may not work together as well as they should.
There are several factors that
can contribute to cohesion, including functionality, data organization, and
control structure. Functionality is probably the most important factor. A
highly cohesive module is one that performs a single, well-defined task. For
example, a module that calculates payroll taxes would be considered highly
cohesive because it has a very specific function. On the other hand, a module
that contains a collection of unrelated functions would be considered
low-cohesive because it lacks a clear purpose.
Data organization is also
important for cohesion. A highly cohesive module is typically organized around
a single data object or entity. For example, a module that stores customer
information would be considered highly cohesive because all of the data in the
module is related to customers. A low-cohesive module, on the other hand, would
be one in which the data is organized in an arbitrary or illogical manner. This
can make it difficult to understand and use the data properly.
Finally, control structure can
also affect cohesion. A highly cohesive module typically has a simple control
structure with few branching points. This makes it easy to follow the flow of
execution through the code. A low-cohesive module, on the other hand, may have
a complex control structure with many branching points. This can make the code
difficult to understand and maintain.
Can you name some performance testing metrics?
For a software architect, performance testing is an important part of making
sure the software you develop runs smoothly and efficiently. Performance
testing allows you to measure how well your system performs against various
metrics such as load time, response time, and throughput; a small measurement
sketch follows the list below.
- Load Time – Load time is a measure of how long it takes for the system
under test to complete a single action or task. It’s typically measured in
milliseconds (ms). The lower the load time, the better the system will
perform when multiple users are accessing it at once.
- Response Time – Response time measures how quickly the system responds to
user input. This includes both the time it takes for the server to respond
to requests as well as any visual feedback given by the application
itself. A good response time should be below one second (1s) so that users
don’t get frustrated waiting for results.
- Throughput – Throughput measures how many tasks can be completed in a
given amount of time and is often expressed as requests per second (rps).
This metric is useful for measuring scalability and can help identify
potential bottlenecks in your architecture. Generally, higher throughput
indicates better performance and more efficient code execution.
- Memory Usage – Memory usage measures how much memory your system uses
while performing certain tasks, such as loading webpages or running
database queries. Measuring memory usage can help spot areas where
optimization is needed and ensure that your system doesn’t become
overwhelmed by too many requests at once.
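Here is the measurement sketch mentioned above: it times a hypothetical handleRequest function to estimate average response time and throughput; the numbers depend entirely on the simulated workload:

```typescript
// Rough sketch of collecting two of these metrics for a hypothetical
// handleRequest function: per-request response time and overall throughput.
async function handleRequest(): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulated work
}

async function measure(totalRequests: number): Promise<void> {
  const started = Date.now();
  const times: number[] = [];

  for (let i = 0; i < totalRequests; i++) {
    const t0 = Date.now();
    await handleRequest();
    times.push(Date.now() - t0);
  }

  const elapsedSeconds = (Date.now() - started) / 1000;
  const avgResponseMs = times.reduce((a, b) => a + b, 0) / times.length;
  console.log(`average response time: ${avgResponseMs.toFixed(1)} ms`);
  console.log(`throughput: ${(totalRequests / elapsedSeconds).toFixed(1)} requests/second`);
}

measure(200);
```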
SOLID stands for what? What are its
principles?
SOLID stands for five basic principles of object-oriented programming and
design. The principles were described by Robert C. Martin around 2000 in his
paper Design Principles and Design Patterns, and the acronym itself was coined
later by Michael Feathers. The five principles are the Single Responsibility
Principle (SRP), Open/Closed Principle (OCP), Liskov Substitution Principle
(LSP), Interface Segregation Principle (ISP), and Dependency Inversion
Principle (DIP). These principles help software architects create code that is
easier to understand, maintain, and extend.
- Single Responsibility
Principle (SRP)
The Single Responsibility Principle states that
every class should be responsible for one thing only, or in other words, it
should have only one reason to change. This helps prevent code from becoming
overly complicated or hard to read while also making it easier to debug any
issues that may arise. By limiting the scope of a class, developers can create
more modular code that can easily be reused in different contexts.
- Open/Closed Principle
(OCP)
The Open/Closed Principle states that classes
should be open for extension but closed for modification. This allows
developers to extend the functionality of their code without needing to make
changes to existing classes or functions. It also helps ensure code stability
since new features can be added without breaking existing ones.
- Liskov Substitution
Principle (LSP)
The Liskov Substitution principle states that any
parent class should be able to be substituted with any child class without
affecting the correctness of the program. This principle helps ensure that
objects behave as expected when they are passed around between different parts
of an application or library.
- Interface Segregation
Principle (ISP)
The Interface Segregation Principle states that
interfaces should not be too generic but instead should provide specific
interfaces tailored to each client’s needs. This allows developers to create
interfaces that are focused on specific tasks while still providing enough
flexibility so they can easily be adapted if needed in the future.
- Dependency Inversion
Principle (DIP)
The Dependency Inversion principle states that
high-level modules should not depend on low-level modules but rather both
should depend on abstractions instead. This helps ensure loose coupling between
components which makes them easier to test, maintain, and reuse across multiple
projects or applications.
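As one concrete illustration of the last principle, here is a hedged Dependency Inversion sketch; the Logger, ConsoleLogger, and OrderService names are invented for the example:

```typescript
// Sketch of the Dependency Inversion Principle: the high-level OrderService
// depends on a Logger abstraction, not on a concrete logger.
interface Logger {
  log(message: string): void;
}

class ConsoleLogger implements Logger {
  log(message: string): void {
    console.log(message);
  }
}

class OrderService {
  constructor(private logger: Logger) {} // depends on the abstraction

  placeOrder(orderId: string): void {
    // ...business logic would go here...
    this.logger.log(`order ${orderId} placed`);
  }
}

// The concrete logger is chosen at composition time and is easy to swap in tests.
new OrderService(new ConsoleLogger()).placeOrder("A-123");
```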
Describe YAGNI in more detail?
YAGNI stands for “You Ain’t
Gonna Need It.” YAGNI is a principle that was first introduced in Extreme
Programming (XP). It encourages software developers to focus on the current
requirements of a project and not code for future possibilities that may never
be needed. This approach helps to prevent over-engineering and excessive
complexity when developing software applications. In other words, it is about
writing code only when there is a need for it now or in the near future – not
speculatively coding for potential needs down the line.
Practicing YAGNI means being
mindful about which features and functions you include in your application
design. For example, if you don’t have a clear use case for a feature then it’s
probably best not to include it until you actually need it later down the line.
This way, you won’t be wasting valuable resources on features that may never
get used anyway.
Similarly, if you have identified
a feature that could potentially be useful but there isn’t an immediate need
right now then put off coding until there is one – this way your application
remains lean and manageable while still fulfilling its purpose effectively.
How is YAGNI different from KISS principle?
YAGNI allows developers to avoid creating unnecessary code or features that
will not be used by users or customers. YAGNI also helps to keep development
costs low as developers do not have to spend time developing features that may
never be used.
KISS helps developers strive to create easy-to-understand
solutions with minimal complexity. This means that developers should focus on
making sure their code is well-structured and organized so that users can
easily understand how it works without having any technical knowledge about
coding.
The main difference between these two principles lies in
their primary focus – while KISS emphasizes keeping things simple, YAGNI
emphasizes avoiding unnecessary work. While both are important considerations
when building software applications, their end goals differ slightly – KISS
focuses on making sure existing features remain as easy-to-use as possible
while YAGNI focuses on making sure new features don’t become a burden before
they're even necessary. Both are essential aspects of successful software architecture
that should be taken into account when designing new applications or updating
existing ones.
What are the DRY and DIE principles?
The DRY ("Don't Repeat Yourself") principle states that "every piece of
knowledge must have a single, unambiguous, authoritative representation within
a system." This concept is based on the idea that duplication can lead to
confusion and errors in software development. By following the DRY principle,
developers can avoid writing redundant code that could lead to problems down
the line.
The DIE ("Duplication Is Evil") principle was created as an extension of the
DRY principle. It states that code duplication should be avoided whenever
possible, as it can lead to inconsistencies between different parts of the
system or between multiple versions of the same codebase. Duplication also
increases maintenance costs and can make debugging more difficult, because
there may be multiple places where changes need to be made in order to fix a
bug or update a feature.
By following both the DRY and DIE principles, developers
can create more robust and reliable software architectures that are easier to
maintain over time. For example, if a developer is creating a web application
with multiple features that require similar functionality, they should use
abstraction techniques like functions or classes instead of repeating code
across different parts of the application. This ensures that any changes made
in one part of the system will automatically be reflected in other parts as
well, reducing bugs and speeding up development time.
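For instance, a repeated tax calculation can be captured once, as in this small sketch (the rate and function names are illustrative):

```typescript
// Small sketch of applying DRY: the tax calculation lives in one function
// instead of being repeated wherever an order total is needed.
const TAX_RATE = 0.2;

function totalWithTax(netAmount: number): number {
  return netAmount * (1 + TAX_RATE);
}

// Both call sites reuse the single, authoritative definition.
const invoiceTotal = totalWithTax(100);
const cartTotal = totalWithTax(59.99);
console.log(invoiceTotal, cartTotal);
```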
Explain the Single Responsibility Principle?
The Single Responsibility Principle (SRP) is a principle of object-oriented
programming that states that each module or class in a program should have only
one responsibility. This means that each module or class should be responsible
for only one type of functionality, such as data access, input/output handling,
or business logic. The idea behind this principle is that it makes code more
organized and easier to maintain by breaking down large programs into smaller
modules or classes with specific responsibilities.
The Single Responsibility Principle is an important
concept because it makes code easier to read and understand. If a module has
multiple responsibilities, it can become difficult to keep track of all the
different parts and how they interact with each other. By limiting each module
to just one responsibility, it makes reading and understanding code much
easier. Additionally, if changes are needed in one area of the program, then it
can be done without affecting other areas of the program since the modules are
isolated from each other. This makes maintenance much simpler since any changes
made won't affect any other part of the system.
In addition to making code easier to read and maintain, SRP also helps manage
complexity and supports scalability. By keeping modules focused on a single
purpose instead of having them handle multiple tasks at once, it becomes
easier to test, optimize, and scale the parts of the system that need it,
rather than working around one large module burdened with multiple
responsibilities.
What are the principles behind 12 factor
app?
The twelve-factor app is a set
of best practices created by Heroku in 2011 to help software teams better
manage their applications. These principles provide guidance on how to design
and deploy applications that can scale quickly and reliably. The twelve factors
include:
- Codebase: The codebase should be version controlled so that it can
easily be tracked and maintained.
- Dependencies: All application dependencies should be explicitly declared in the
configuration files instead of relying on implicit dependencies from
external sources.
- Config: All configuration should be stored in environment variables instead
  of being hardcoded into the application codebase (see the short sketch after
  this list).
- Backing Services: Applications should rely on external backing services like
databases or third-party APIs instead of storing data locally within the
application itself.
- Build, Release and Run: Applications should use a specific build process for creating
deployable artifacts that can then be released into production
environments. These artifacts should also include scripts for running the
application in production environments.
- Processes: Applications should run as one or more processes (e.g., web server,
database server). These processes should run independently of each other
and not share any resources between them.
- Port Binding: Applications should bind themselves to a port so that they
can accept requests from outside sources without requiring any manual
steps or configuration changes.
- Concurrency: Applications should be designed to handle multiple tasks
concurrently by using multiple processes or threads within each process
where appropriate.
- Disposability: Applications should start up quickly and terminate gracefully
when necessary, so that they can easily be stopped/started during
maintenance operations or if there is an unexpected failure in the system.
- Dev/Prod Parity: Development environments should closely match production
environments so that issues discovered during testing can be quickly
reproduced in production environments if necessary.
- Logs: Applications must log all output from their processes in order to
  diagnose issues quickly during development or debugging operations.
- Admin Processes: Administrative tasks such as database migrations or backups
must also run as separate processes from the main application(s).
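Here is the short sketch referenced in the Config factor above, assuming a Node.js runtime where process.env is available; the variable names and defaults are hypothetical:

```typescript
// Sketch of the Config factor: settings come from environment variables
// rather than being hardcoded into the codebase.
const config = {
  databaseUrl: process.env.DATABASE_URL ?? "postgres://localhost:5432/dev",
  port: Number(process.env.PORT ?? 3000),
  logLevel: process.env.LOG_LEVEL ?? "info",
};

console.log(`starting on port ${config.port} with log level ${config.logLevel}`);
```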
How are fault tolerance and fault resilience
different?
Fault tolerance and fault resilience are both strategies for dealing with
system errors. Fault tolerance is the process of anticipating errors in a
system and making sure that those errors don't cause significant damage to the
system as a whole. This type of error handling is focused on prevention,
meaning that it relies on careful planning to make sure that no single
component of the system can cause catastrophic failure. It's important to note
that while fault tolerance is an effective strategy, it can be costly in terms
of time and resources.
Fault resilience, on the other hand, is focused more on
recovery than prevention. When an error occurs in a resilient system, it will
be able to recover quickly without any major disruption to its operations.
While this type of error handling may not completely prevent errors from
occurring, it will help ensure that any downtime due to errors or failures is
minimal. As such, it can be a cost-effective way of ensuring continuity despite
occasional issues.
Does CQRS work without Event Sourcing?
CQRS is an architectural pattern that separates read operations from write
operations. In other words, it allows for separate optimization of writing data
into a database and reading data from a database.
Event sourcing is an architectural pattern that ensures
every change to an application's state is recorded as an event. The events are
stored in a log or sequence of records that can be used to reconstruct an
application's state at any given time.
Yes! It is possible to use CQRS without event sourcing;
however, there are certain advantages that come with combining these two
patterns together. For example, when both CQRS and event sourcing are employed
together, developers have more control over which parts of the system need to
be updated when making changes—which can help reduce complexity and improve
performance by reducing redundant work. Additionally, using both patterns
together makes it easier for developers to audit changes made over time by providing
visibility into each step of the process.
What is CQRS pattern?
The Command and Query Responsibility Segregation (CQRS) pattern is an
architectural pattern used for separating read operations from write
operations within an application or system. The goal of this separation is to
improve scalability by allowing both sets of operations to scale independently
of each other. CQRS also improves flexibility, since each set of operations
can be implemented with its own approach, and it can improve responsiveness by
reducing latency in query responses, especially when separate data stores are
used for each type of operation.
The core idea behind CQRS is that when a user interacts
with an application or system, they will either be issuing a command or making
a query. Commands are typically issued to modify data while queries are issued
to retrieve data. By separating these two types of operations into two
different models, each model can be optimized independently for its own
specific purpose. This can lead to improved performance since querying and
writing can take place in two separate databases which can be tuned and scaled
independently from one another.
By separating commands from queries, applications are
able to better utilize resources such as memory and processing power since they
won’t have to run both types of operations simultaneously on the same hardware
or platform.
Additionally, using separate models for commands and
queries allows developers more flexibility when it comes time to make changes
as they can update one model without affecting the other. Finally, using CQRS
makes applications more reliable since queries will not have any impact on
commands which means that query responses will not be affected by any delays
caused by writing data.
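A minimal sketch of the idea follows, with in-memory maps standing in for the separate write and read stores; in a real system the read model would usually be updated asynchronously:

```typescript
// Hedged sketch of CQRS: commands mutate state through one model, queries read
// from a separate read model. All names are illustrative.
interface CreateProductCommand {
  id: string;
  name: string;
}

const writeModel = new Map<string, { id: string; name: string }>();
const readModel = new Map<string, { id: string; name: string }>();

// Command side: validates input and changes state.
function handleCreateProduct(command: CreateProductCommand): void {
  if (writeModel.has(command.id)) throw new Error("product already exists");
  const product = { id: command.id, name: command.name };
  writeModel.set(command.id, product);
  // In a real system the read model would be updated asynchronously,
  // e.g. by consuming events; here it is synchronous for brevity.
  readModel.set(command.id, product);
}

// Query side: never mutates anything.
function getProduct(id: string): { id: string; name: string } | undefined {
  return readModel.get(id);
}

handleCreateProduct({ id: "p-1", name: "Widget" });
console.log(getProduct("p-1"));
```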
What is Event Sourcing pattern?
The Event Sourcing pattern is a technique used in software architecture in
which changes to an application's state are saved as a sequence of events. The
pattern is frequently combined with the Command Query Responsibility
Segregation (CQRS) principle, which dictates that application functions for
writing data (commands) should be separate from those for reading data
(queries).
Event Sourcing takes these 'write' operations one step further and stores them
as a log or record list of all events following each other in succession. This
provides an immutable record of every change and makes it possible to
reconstruct any prior state or replay sequences of past events.
Essentially, the Event Sourcing Pattern allows developers
to take advantage of even more granular control over how their applications are
modified and operated. Thus, it has become an invaluable difference-maker when
creating architecture solutions that need to run at maximum efficiency without
sacrificing accuracy or reliability.
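A minimal sketch of the idea, using an in-memory append-only log for a hypothetical account aggregate:

```typescript
// Hedged sketch of event sourcing: state changes are stored as an append-only
// list of events, and current state is rebuilt by replaying them.
type AccountEvent =
  | { type: "Deposited"; amount: number }
  | { type: "Withdrawn"; amount: number };

const eventLog: AccountEvent[] = [];

function deposit(amount: number): void {
  eventLog.push({ type: "Deposited", amount });
}

function withdraw(amount: number): void {
  eventLog.push({ type: "Withdrawn", amount });
}

// Replaying the log reconstructs the balance at any point in time.
function currentBalance(events: AccountEvent[]): number {
  return events.reduce(
    (balance, event) =>
      event.type === "Deposited" ? balance + event.amount : balance - event.amount,
    0,
  );
}

deposit(100);
withdraw(30);
console.log(currentBalance(eventLog)); // 70
```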
Can you tell me what Unit tests, Integration tests, Smoke tests and Regression
tests are, and how they differ from each other?
Unit tests focus on individual pieces of code, while integration tests assess
how different components interact with each other; smoke tests detect major
issues before moving on to more in-depth testing; and regression tests make
sure no changes have broken existing functionality in your application's
codebase. (A minimal unit-test sketch follows the list below.)
- Unit Test
A Unit Test is a type of software testing that
focuses on individual units or components of code and verifies their
correctness. The purpose of a unit test is to validate that each unit of the
software performs as designed. This type of test isolates a section of code and
determines if its output is correct under certain conditions. These tests are
usually conducted using automated tools like JUnit, NUnit, and Jasmine.
- Integration Test
Integration Tests are used to verify the
functionality between different units or components within the system being
tested. They are designed to evaluate how well different parts of an
application work together as a whole. Unlike Unit Tests which focus on
individual components, Integration Tests focus on multiple components
interacting with each other in order to verify that all elements are working
properly together. These tests can be performed manually or through automation
tools like Selenium WebDriver and Cucumber.
- Smoke Test
A Smoke Test is a type of testing used to identify
any major problems with an application before performing further in-depth
testing. It is often referred to as "Build Verification Testing"
because its primary purpose is to determine if a build (stable version) is
ready for further testing or not. This type of testing involves running basic
functional tests against the application in order to ensure that its most
critical functions are working correctly before moving forward with more
detailed testing procedures such as regression tests and integration tests.
- Regression Test
Regression Testing is used to verify that changes
made have not broken existing functionality in an application's codebase. It
helps ensure that new updates have not caused any unwanted side effects in
areas where they were not intended or expected. Regression Tests can be either
manual or automated depending on the complexity and size of the project being
tested, but they are typically done using some sort of automation tool such as
Selenium WebDriver or Robot Framework.
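Here is the minimal unit-test sketch mentioned above, using Node's built-in assert module so it needs no test framework; tools like Jasmine or JUnit follow the same arrange-act-assert shape:

```typescript
// Tiny unit-test sketch: the unit under test is a small, isolated function,
// and assertions verify its output for known inputs.
import { strictEqual } from "node:assert";

function add(a: number, b: number): number {
  return a + b;
}

strictEqual(add(2, 3), 5);
strictEqual(add(-1, 1), 0);
console.log("all unit tests passed");
```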
What is Sharding,
why it is important?
Sharding is the process of dividing up a large database into multiple
smaller databases, called shards, each of which contains only a fraction of the
data contained in the original database. These shards are then stored on
separate servers so that they can be accessed independently. The goal of
sharding is to increase performance by reducing the amount of data that needs
to be processed when accessing or updating information in the database. It also
allows for scalability since more shards can be added as needed without having
to scale up the entire server infrastructure.
Sharding is an important technique because it can help
improve overall performance and scalability by reducing contention on
resources. When data is stored in multiple shards, there are fewer requests
competing for resources at any given time, which means that queries can be
processed faster and more efficiently.
Furthermore, adding additional shards as your datasets
grow will help minimize downtime due to high load on servers and allow you to
scale quickly as your business needs grow. Finally, sharding can also simplify
maintenance, since each shard contains only a portion of the data from the
original dataset and most operations touch just one shard.
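As a hedged sketch of key-based routing, the following maps a customer ID to one of several shards; the connection strings are hypothetical, and production systems often prefer consistent hashing so that adding shards moves as little data as possible:

```typescript
// Sketch of key-based sharding: a customer ID deterministically maps to one
// of several database shards.
const shards = [
  "postgres://shard-0.example.com/app",
  "postgres://shard-1.example.com/app",
  "postgres://shard-2.example.com/app",
];

function shardFor(customerId: number): string {
  // Simple modulo routing; each key always lands on the same shard.
  return shards[customerId % shards.length];
}

console.log(shardFor(1001)); // shard chosen by key, not by load
console.log(shardFor(1002));
```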
Describe the GOD class. The GOD class should
be avoided for what reasons?
A GOD class (or God object) is a computer programming concept used to describe
a class that is overly large or too tightly coupled to the rest of the system.
It usually refers to classes that contain far too many methods and properties,
often running to thousands of lines of code, which makes them unwieldy and
difficult to maintain. A single change in such a class can cause unexpected
side effects because so much of the system depends on it. As such, developers
often refer to these classes as "GOD classes" since they seem omnipresent and
difficult to control.
The biggest problem with GOD classes is that they tend to
make the code rigid and hard to maintain. If you have one big class that does
everything, then changing one part of it affects the whole system. This can
lead to bugs and other issues when changes are made down the line.
Additionally, tight coupling between components leads to bad design decisions
which can be hard to refactor later on in development.
Furthermore, having large classes also makes debugging
more difficult since there's more code to go through when trying to find the
source of an issue. And if your application relies on multiple instances of
these God objects throughout its structure, tracking down issues becomes even
harder since they could be caused by any number of things across multiple
locations in your codebase.