Somesh Chatterjee

Saturday, May 11, 2019

Problems of the 'new' keyword

How do we create an object in most languages? By using the 'new' keyword, and there's a big problem lurking in there! Some languages skip the keyword, of course: in Python you create an instance of a class with 'ClassName()' instead of 'new ClassName()'. The underlying problem remains the same.

The 'new' keyword couples our code to concrete classes.

Let's take this simple example:

interface IShape
{
  // Interface members are implicitly public.
  string Name { get; }
}
class Square : IShape
{
  public string Name { get; private set; }
  public Square()
  {
    Name = "Square";
  }
}
class Circle : IShape
{
  public string Name { get; private set; }
  public Circle()
  {
    Name = "Circle";
  }
}
class ShapeHolder
{
  public IShape Shape { get; private set; }
  public ShapeHolder()
  {
    // The concrete type is hard-coded here.
    Shape = new Square();
  }
}

In the example above, even though we have used the interface 'IShape', ShapeHolder is tightly coupled to the Square class. It can never hold an instance of Circle. This may seem fine, but it causes two major problems:

Unit Tests: There's no way to test ShapeHolder independently of the Square class. Any test written for ShapeHolder ends up working with a concrete instance of Square. If Square were later modified to use network access or a database call, the unit tests of ShapeHolder might start failing or take a huge amount of time, because the test environment won't have the necessary setup.
Extensibility: The above code doesn't allow runtime polymorphism. If a new shape comes along in the future and must work with ShapeHolder, we will have to modify the ShapeHolder class too. This may in turn force changes in other classes. This is a violation of the Open/Closed Principle.

The main way to solve this is Dependency Inversion. Instead of ShapeHolder being responsible for creating the Square, the shape should be passed in to the class, either as a constructor parameter (dependency injection) or through something like the Factory design pattern. Now we can change at runtime which IShape type the ShapeHolder holds. For unit tests, this can be a simple MockShape class.
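
A minimal sketch of the constructor-parameter approach might look like the following. MockShape is a hypothetical test double (only its name comes from the paragraph above), and the Program class exists purely to demonstrate the wiring.

```csharp
using System;

interface IShape
{
    string Name { get; }
}

class Square : IShape
{
    public string Name { get; } = "Square";
}

// Hypothetical test double: no network, no database, just a fixed name.
class MockShape : IShape
{
    public string Name { get; } = "Mock";
}

class ShapeHolder
{
    public IShape Shape { get; }

    // The shape is supplied by the caller, so ShapeHolder never calls 'new' itself.
    public ShapeHolder(IShape shape)
    {
        Shape = shape ?? throw new ArgumentNullException(nameof(shape));
    }
}

static class Program
{
    static void Main()
    {
        var production = new ShapeHolder(new Square());
        var underTest = new ShapeHolder(new MockShape());

        Console.WriteLine(production.Shape.Name); // Square
        Console.WriteLine(underTest.Shape.Name);  // Mock
    }
}
```

The 'new' calls have not disappeared; they have moved to the edge of the program (here, Main), where swapping one IShape for another does not require touching ShapeHolder.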

Friday, May 10, 2019

Space Based Architecture

Photo credit: https://en.wikipedia.org/wiki/Space-based_architecture

Like CQRS (Command Query Responsibility Segregation), this architecture is used for high scalability and performance.
In a typical setup there is a client, an application server and a database. Whenever load increases, we can increase the number of servers handling the client requests. Once we fix that, we soon find that the application server handling all the business logic is now the bottleneck. Once we scale that up, we pass the bottleneck on to the database.
Unlike the previous layers, scaling the database is very tough. Traditional databases are not really designed to be scaled out. Sure, there are solutions out there which try to address this, but it’s difficult and expensive.
This pattern tries to address that bottleneck: the database. In this architecture, there are 5 components, namely the processing unit and the four parts of the middleware:

Processing Unit
Middleware, which consists of:
1. Message Grid
2. Data Grid
3. Processing Grid
4. Deployment Manager

Processing Unit: The component which houses the business application. It’s complete with its own persistence store (the database) and is self-sufficient.
Message Grid: It’s responsible for routing the various requests from clients to the different processing units.
Data Grid: It’s responsible for keeping the data in sync across all the individual persistence stores in the processing units.
Processing Grid: Often a request is split between multiple processing units to get even better throughput. The processing grid is responsible for this.
Deployment Manager: As the load increases, it’s responsible for spinning up more processing units. Similarly, when the load decreases, it’s responsible for bringing the number of processing units back down.
The reason this architecture effectively solves the database bottleneck problem is that there is no common repository. Every processing unit has its own persistence store, so the database never becomes a bottleneck.
Also, as the load increases/decreases, the system scales accordingly.
So, both CQRS and Space Based Architecture support scalability by having multiple instances of the database. Space Based Architecture also scales the system dynamically based on the load, but this additional power comes with additional complexity. If the load fluctuates a lot, Space Based Architecture is the better option.
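
To make the roles concrete, here is a toy, single-process sketch. Everything in it is an assumption made for illustration: the class names, the in-memory dictionaries standing in for each unit's persistence store, and the round-robin routing. A real deployment would rely on a distributed data grid product, and the processing grid (splitting one request across units) is not modeled at all.

```csharp
using System;
using System.Collections.Generic;

// A processing unit holds business logic plus its own private store.
class ProcessingUnit
{
    private readonly Dictionary<string, string> _store = new Dictionary<string, string>();

    public void ApplyWrite(string key, string value) => _store[key] = value;

    public string Read(string key) => _store.TryGetValue(key, out var value) ? value : null;

    public IEnumerable<KeyValuePair<string, string>> Snapshot() => _store;
}

// Toy middleware playing the message grid, data grid and deployment manager roles.
class Middleware
{
    private readonly List<ProcessingUnit> _units = new List<ProcessingUnit>();
    private int _next;

    public int UnitCount => _units.Count;

    // Deployment manager role: start a unit and seed it with the current data.
    public void ScaleUp()
    {
        var unit = new ProcessingUnit();
        if (_units.Count > 0)
            foreach (var entry in _units[0].Snapshot())
                unit.ApplyWrite(entry.Key, entry.Value);
        _units.Add(unit);
    }

    // Deployment manager role: retire a unit, always keeping at least one.
    public void ScaleDown()
    {
        if (_units.Count > 1) _units.RemoveAt(_units.Count - 1);
    }

    // Data grid role: replicate every write to each unit's own store.
    public void Write(string key, string value)
    {
        foreach (var unit in _units) unit.ApplyWrite(key, value);
    }

    // Message grid role: route each read to the next unit in round-robin order.
    public string Read(string key)
    {
        if (_units.Count == 0) throw new InvalidOperationException("No processing units running.");
        var unit = _units[_next % _units.Count];
        _next = (_next + 1) % _units.Count;
        return unit.Read(key);
    }
}

static class Program
{
    static void Main()
    {
        var middleware = new Middleware();
        middleware.ScaleUp();
        middleware.Write("cart:42", "3 items");
        middleware.ScaleUp(); // load went up: a second unit is seeded with the existing data

        Console.WriteLine(middleware.Read("cart:42")); // served by the first unit
        Console.WriteLine(middleware.Read("cart:42")); // served by the second unit
    }
}
```

Both reads return the same value even though they are answered by different units, and no shared database is consulted along the way.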

A More Beautiful Question

This is an amazing book and it’s all about how to innovate. The author raises a very interesting point: we are always running after answers, but to come up with something truly innovative, we must ask the ‘More Beautiful Question’.
So what does it mean to ask the more beautiful question? It can be broken down into 3 main steps:

Why?
As we grow up, our minds become more rigid and accept a lot of things because that’s how it has always been. We stop asking this basic question: why? The application is slow? Um, that’s because it’s a large system which needs to serve multiple requests concurrently. No, we need to step back and ask why the application is slow, and keep asking until we reach the root cause.
What if?
Now that we have identified the probable root cause, the next question is ‘what if?’. In this step we must forget about all constraints and about what’s practical and impractical. We should just come up with every solution we can think of; as stated before, these do not have to be practical. This is important to understand: an absurd idea may not be of much value in itself, but it can lead to more and more ideas, some of which may actually be viable.
How?
Now that we have brainstormed a lot of solutions to the given problem, it is time to figure out which of the options are actually practical. Which of these can really be achieved given the current constraints? One or two may actually fit the bill.

What happened in this exercise is that we kept the practical aspects aside until the last stage and let our thoughts move freely beyond any restrictions. By taking a step back and not chasing after answers, we have allowed ourselves to think beyond the obvious and come up with some potentially unique solutions.

CQRS

Software architecture has come a long way from the standard 3 tier architecture. A lot of the previous architectures were optimized for the constraints present at that time, like limited resources, but with time such restrictions no longer apply.
The most promising of the new architectures is the Domain Centric Architecture, which places the domain right at the center of the architecture. But more on domain driven design later.
Another interesting concept is CQRS (Command Query Responsibility Segregation). This is useful for creating a system optimized for performance. The principle is simple: keep commands and queries separate.
A command is anything that changes the state of the system; a query, on the other hand, requests some information from the system. A query should never alter the state of the system, and a command should ideally not return any information. Of course, a strict distinction is not always possible (a command often returns whether it was successful or not), but keeping them separate leads to a lot of benefits.
Code level:
This separation is worth striving for even at the code level, where it is usually called Command Query Separation (CQS). A function responsible for fetching some information should never alter the state of the class. As the codebase grows, a public facing getter API can be called from many different modules. It’s next to impossible to control that, and the calling modules may not have enough information to realize that a simple get call ended up inadvertently altering the state of the system.
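
As a small illustration of the code-level idea, consider a message inbox. The Inbox class and its method names are made up for this sketch. A single "get the next message" method that also removes the message would be a getter with a hidden side effect; here the read and the removal are kept apart.

```csharp
using System;
using System.Collections.Generic;

class Inbox
{
    private readonly Queue<string> _messages = new Queue<string>();

    // Query: reports state without changing it.
    public int Count => _messages.Count;

    // Query: safe to call any number of times, from any module.
    public string PeekOldest() => _messages.Count > 0 ? _messages.Peek() : null;

    // Command: changes state and returns nothing.
    public void Add(string message) => _messages.Enqueue(message);

    // Command: removes the oldest message; callers query first if they need its content.
    public void RemoveOldest()
    {
        if (_messages.Count > 0) _messages.Dequeue();
    }
}

static class Program
{
    static void Main()
    {
        var inbox = new Inbox();
        inbox.Add("a");
        inbox.Add("b");

        Console.WriteLine(inbox.PeekOldest()); // a
        Console.WriteLine(inbox.PeekOldest()); // a (asking twice changed nothing)
        inbox.RemoveOldest();
        Console.WriteLine(inbox.PeekOldest()); // b
    }
}
```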
Architecture level:
A lot of the features in a piece of software are read heavy, and usually a smaller set of features actually updates the data. If we keep the two separate, we can scale them independently, with the ‘Query’ module optimized for fast reads and the ‘Command’ module optimized for writes. In a slightly more advanced version of CQRS, the two stacks can have independent databases. Every write into the main database is then synchronized to the database meant for reads. For a short duration these databases will be out of sync, but this leads to highly performant systems whose performance holds up better as you scale.
There are downsides too. Implementing CQRS makes the architecture complex, so it should not be used if performance and scalability are not primary needs of the software. Also, the query and command stacks will temporarily hold data that is out of sync, which adds further complexity to the system.
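
The two-database variant, including its out-of-sync window, can be sketched in memory as follows. The order domain, the class names and the explicit SyncTo call are all assumptions chosen for illustration; in a real system the synchronization would typically happen asynchronously through replication or messaging.

```csharp
using System;
using System.Collections.Generic;

// Query side: a separate store that only answers reads.
class OrderQueries
{
    private readonly Dictionary<int, decimal> _readStore = new Dictionary<int, decimal>();

    // Called only by the synchronization step, never by clients.
    internal void Apply(int orderId, decimal amount) => _readStore[orderId] = amount;

    public bool TryGetAmount(int orderId, out decimal amount) =>
        _readStore.TryGetValue(orderId, out amount);
}

// Command side: owns the write store and queues changes for the read side.
class OrderCommands
{
    private readonly Dictionary<int, decimal> _writeStore = new Dictionary<int, decimal>();
    private readonly Queue<KeyValuePair<int, decimal>> _pending =
        new Queue<KeyValuePair<int, decimal>>();

    public void PlaceOrder(int orderId, decimal amount)
    {
        _writeStore[orderId] = amount;
        _pending.Enqueue(new KeyValuePair<int, decimal>(orderId, amount));
    }

    // Copies queued changes to the read store; until this runs, reads lag behind writes.
    public void SyncTo(OrderQueries readSide)
    {
        while (_pending.Count > 0)
        {
            var change = _pending.Dequeue();
            readSide.Apply(change.Key, change.Value);
        }
    }
}

static class Program
{
    static void Main()
    {
        var commands = new OrderCommands();
        var queries = new OrderQueries();

        commands.PlaceOrder(1, 25.0m);
        Console.WriteLine(queries.TryGetAmount(1, out _)); // False: not synchronized yet

        commands.SyncTo(queries);
        Console.WriteLine(queries.TryGetAmount(1, out var amount) && amount == 25.0m); // True
    }
}
```

The first read misses the order even though the write succeeded. That short window is the out-of-sync period mentioned above, and client code has to be written with it in mind.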

Technical Expertise

I was going through a course on software architecture and I came across an interesting concept. It’s nothing too uncommon but it’s something I haven’t paid attention to.
In the simplest terms, it states that our knowledge is like a pyramid. What we know forms a small part at the top of the pyramid. What we know that we don’t know forms the next layer. But what really takes up the biggest chunk is the stuff we don’t know that we don’t know.

An example of this: suppose I need to create an application which should run across iOS, Android and Windows Phone. Given this requirement I might blindly pick a framework like ‘Xamarin’ and start implementing. What I did not know is that there is another way: creating a ‘hybrid’ app using the ‘webview’ component present in all three SDKs. At the end of the day Xamarin might still be the better choice based on the requirements, but by not knowing that another approach exists, we miss out on an opportunity to create a simpler solution.
Also, as we grow in our careers we need to move things out of the bottom layer of the pyramid and at least into the middle layer, so that whenever an opportunity presents itself, we know that a way to tackle the problem already exists. This increases our ‘technical breadth’, which becomes more useful than ‘technical depth’ as we go higher up the technical career path.

Mind Scripts

As we saw in the last post, getting the first impression right is very important. But often we face precarious situations where that’s not easy. We may be interviewing for our dream job at our dream company, so we are nervous, and being nervous can lead to a very bad first impression. How do we overcome this?
In his book ‘The Full Facts Book of Cold Reading’, Ian Rowland gives a very simple solution: ‘Mind Scripts’.
So what are ‘Mind Scripts’? It’s nothing but repeating a small positive statement about the upcoming encounter. In this scenario it could be something like: ‘I am prepared, I’ll do well and they will like me.’ Keep repeating this in your mind until the initial conversation starts. This has a good effect on our mindset and body language.
You should also check out the book for other interesting things, like how to ‘foretell’ someone’s past and future using some clever tricks.

First Impressions are Key

Roger is smart, intelligent, proud and lazy.
Smith is lazy, proud, intelligent and smart.
So, of the two, who do you think is the better person? Roger, right? Isn’t it obvious?
Well, not really. They both have the same qualities. The only difference is that for Roger, adjectives like ‘smart’ and ‘intelligent’ came first, and for Smith they came later. So our first impression of Roger was that of a smart person and that of Smith was of a lazy person.
So, you see, even simple statements like the ones above can create an impact, and this is not to be taken lightly. In his book ‘The Social Animal’, Elliot Aronson highlights a number of psychological experiments where the impact is clearly seen. For example, assume Roger and Smith are students in your class. Initially, Roger does well, but over time Roger’s grades fall and Smith’s catch up. You are more likely to believe that Roger is the true genius of the two but is a little distracted for the time being, whereas Smith is not that bright but is working hard these days.
