The term „module“ is used often in software development, just like „component“. But what is a module anyway? Most developers are hard pressed to give a concise definition.
We believe, though, that precise definitions are fundamental to fluent software production. Precision averts misunderstandings. Misunderstandings (like errors) are friction; they are waste to be eliminated.
In Flow Design a module is a source code container for logic and data which helps to create evolvability.
Modularization thus is the discipline of decomposing software into modules so that it becomes and stays easy to understand and change. Refactoring changes how logic is wrapped up into modules.
Here’s some logic as an example – but not yet wrapped into a module of its own, although it has a very specific purpose. Rather it’s separated from other logic by whitespace and comments:
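As a minimal sketch of such unwrapped logic, here is a palindrome check written in Python for illustration. The hard-coded sample input and the rule of ignoring case and non-alphanumeric characters are assumptions, not taken from the original program:

```python
# Hypothetical sample input; a real program would read it from the command line.
text = "race car"

# --- check whether the text is a palindrome ---
# Assumption: case and non-alphanumeric characters (e.g. spaces) are ignored.
letters = [c.lower() for c in text if c.isalnum()]
is_palindrome = letters == letters[::-1]

# --- report the result ---
print(f"'{text}' is a palindrome: {is_palindrome}")
```

Only the comments tell a reader where the palindrome logic starts and ends; nothing in the code itself gives it a name or an interface.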
To design software means thinking about it in terms of modules. Design answers the question: Which modules on which levels should there be to create the desired behavior and at the same time make the resulting code easy to evolve?
Flow Design distinguishes modules on five different levels: functions, classes, libraries, components, and services. Each level contains modules of the lower levels. To explain them, though, we start at the bottom of the hierarchy. That’s the module category you’re most familiar with:
Functions (or more generally: subroutines) are the smallest modules. They are the only modules to contain logic directly. Their purpose is to attach a label to chunks of logic and provide a clear interface to it – at least in terms of input and output. Functions provide meaning. Instead of interpreting stretches of logic you just look at the function name and know right away what the logic is all about. At least that’s how it should be 😉 That’s what the SRP and other principles are about. Here’s the above logic refactored into its own function:
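A minimal Python sketch of what such a function might look like. The name `check_for_palindrome` mirrors the CheckForPalindrome() method mentioned later in the text; the normalization rule (ignore case and non-alphanumeric characters) is an assumption:

```python
def check_for_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards.

    Assumption: case and non-alphanumeric characters are ignored.
    """
    letters = [c.lower() for c in text if c.isalnum()]
    return letters == letters[::-1]


print(check_for_palindrome("race car"))  # True
print(check_for_palindrome("stressed"))  # False
```

The function name now carries the meaning, and the signature states input and output: a string goes in, a boolean comes out.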
With functions the contract is tied together with the implementation. A function’s signature is right where the contained logic is.
Although you will find that Flow Design uses techniques from Functional Programming (FP) it’s not the same. Flow Design is less strict about dealing with state or immutability. But of course you’re welcome to use Flow Design and interpret it in a more FP-like manner.
There are a few programming languages which let you write just logic without the need to put it into a module. The majority of code even in these languages will be wrapped in functions, though. We thus find it ok to say all logic needs to be put into modules starting with functions. Without functions there simply is no re-use. Without functions there is no composability. Without functions there is no meaning closely bound to logic.
A class is the next higher level module above function. With a class you can wrap up several functions plus data. That’s the encapsulation object-orientation is focused on: to hide details of how behavior is created behind an implicit interface of public methods.
When should functions be bound together in a class? If their cohesion is higher than the cohesion with other functions. How do you measure cohesion? Sometimes it’s just in the eye of the beholder 😉 You then need to understand the subtleties of the domain. At other times, though, cohesion is more obvious, e.g. when several functions share state or access the same API. Classes allow you to hide such details.
Regarding the palindrome example: Methods Main() and CheckForPalindrome() could be separated into different classes simply because their responsibilities differ widely. Main() is about getting data from the command line and writing data to the console. Its logic is API-focused. The CheckForPalindrome() logic, though, is focused on the domain, which is checking strings for a certain property.
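Sketched in Python under the same assumptions as before, the separation might look like this. The class names `Program` and `Palindromes` are illustrative; only `Palindromes` is named in the text:

```python
class Palindromes:
    """Domain logic: checking strings for the palindrome property."""

    def check_for_palindrome(self, text: str) -> bool:
        # Assumption: case and non-alphanumeric characters are ignored.
        letters = [c.lower() for c in text if c.isalnum()]
        return letters == letters[::-1]


class Program:
    """API-focused logic: command line arguments in, console output out."""

    @staticmethod
    def main(args: list[str]) -> None:
        text = " ".join(args)
        result = Palindromes().check_for_palindrome(text)
        print(f"'{text}' is a palindrome: {result}")
```

A script entry point would call `Program.main(sys.argv[1:])`; the domain class knows nothing about the command line or the console.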
What might look a bit overengineered with just one method per class will become more reasonable when domain logic gets added. For example, checking palindromes could be extended to so-called semordnilaps, i.e. words that spell a different word when read backwards. Since a semordnilap cannot be detected by looking at a string alone, some „database“ is needed, i.e. state is introduced which should be encapsulated.
The implicit contract of the class consists of CheckForPalindrome() and the constructor, which needs to be supplied with the semordnilaps. How the semordnilaps are handled is hidden from clients of the Palindromes class.
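A hedged Python sketch of such a class follows. How the original handles semordnilaps is not shown, so the details here are assumptions: the constructor receives a collection of known words, and a text counts as a hit if it is a palindrome or if its reversal is one of the known words.

```python
from typing import Iterable


class Palindromes:
    def __init__(self, known_words: Iterable[str]):
        # Encapsulated state: the "database" needed to detect semordnilaps.
        self._known_words = {word.lower() for word in known_words}

    def check_for_palindrome(self, text: str) -> bool:
        letters = "".join(c.lower() for c in text if c.isalnum())
        reversed_letters = letters[::-1]
        if letters == reversed_letters:
            return True
        # A semordnilap spells a different known word when reversed.
        return reversed_letters in self._known_words


palindromes = Palindromes(["stressed", "desserts"])
print(palindromes.check_for_palindrome("stressed"))  # True ("desserts" is known)
print(palindromes.check_for_palindrome("hello"))     # False
```

Clients see only the constructor and the method; whether the words live in a set, a file, or a real database stays an internal detail.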
Libraries bind together classes like classes bind together functions. But what distinguishes libraries from classes is their opaqueness. The clients of „just“ a class do white box re-use. Developers of client code can look at the source code of the classes they use at any time. Unless…
Unless the classes are made available via libraries. Then client developers cannot see the source code of the classes. Libraries in many languages are binary, in others they are at least not easy to read (e.g. because they are minified). So we call this special property of libraries „opaqueness“.
It’s this property which makes libraries the ideal modules for re-use. They are true black boxes to be used only through a specific contract – which still is implicit, because it’s part of the classes in the library.
When to bundle up classes in libraries? Cohesion could be high because classes belong together with regard to their domain. Or it could be high because classes are commonly used together.
Or they need to be deployed together. Libraries are modules which can be independently deployed. This is not true for classes or functions. That’s another reason why libraries have been the modules of re-use for decades.
One of the often used terms in software development is „component“ – but few developers know a definition for it. Most use it interchangeably with „library“.
But components deserve to be distinguished from libraries. Although they share a lot of properties with them, they should be different.
Flow Design offers a simple definition of component: A component is a library with a separated contract.
So far contracts were implicit; they were bound together with their implementations. Components depart from this. With components contracts become explicit entities of their own – preferably even wrapped up in their own libraries.
This has several benefits:
- If a contract exists separately from any implementation it can be implemented multiple times. Typical implementations are „the real thing“, i.e. the service you want to provide, and a mockup of the service for testing purposes.
- If a contract exists separately it can exist before any implementation, and clients of it can already be built. In fact clients and implementations can be built in parallel and even by different teams.
Components are the prerequisites for true „industrial“ software development. They allow for a flexible division of labor and true independent development using mockups and testbeds.
A component can tie together any number of libraries to hide their services behind an explicit contract.
Here’s what this would mean for the palindrome example:
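A minimal Python sketch of the three parts, shown in one listing although each would live in its own library. `IPalindromes` is the interface name used in the text; the `Client` class and its `report` method are hypothetical:

```python
from abc import ABC, abstractmethod


# --- contract (its own library) ---
class IPalindromes(ABC):
    @abstractmethod
    def check_for_palindrome(self, text: str) -> bool:
        ...


# --- implementation (its own library; references the contract) ---
class Palindromes(IPalindromes):
    def check_for_palindrome(self, text: str) -> bool:
        letters = [c.lower() for c in text if c.isalnum()]
        return letters == letters[::-1]


# --- client (its own library; references the contract only) ---
class Client:
    def __init__(self, palindromes: IPalindromes):
        self._palindromes = palindromes

    def report(self, text: str) -> str:
        found = self._palindromes.check_for_palindrome(text)
        verdict = "is" if found else "is not"
        return f"'{text}' {verdict} a palindrome"
```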
The contract is referenced by the client as well as the implementation. But the client does not (!) know the implementation. It depends on the abstraction only (Inversion of Control, IoC), the interface IPalindromes, and does not care about details.
That way it can be run with different implementations, for example against some fake implementation:
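A sketch of such a fake in Python, assuming a contract named `IPalindromes` as in the text and a hypothetical `Client` that receives its implementation through the constructor:

```python
from abc import ABC, abstractmethod


class IPalindromes(ABC):  # the contract
    @abstractmethod
    def check_for_palindrome(self, text: str) -> bool:
        ...


class FakePalindromes(IPalindromes):
    """Test double: returns a canned answer and records what it was asked."""

    def __init__(self, answer: bool):
        self._answer = answer
        self.received: list[str] = []

    def check_for_palindrome(self, text: str) -> bool:
        self.received.append(text)
        return self._answer


class Client:  # hypothetical client; knows only the contract
    def __init__(self, palindromes: IPalindromes):
        self._palindromes = palindromes

    def report(self, text: str) -> str:
        found = self._palindromes.check_for_palindrome(text)
        verdict = "is" if found else "is not"
        return f"'{text}' {verdict} a palindrome"


fake = FakePalindromes(answer=True)
print(Client(fake).report("xyz"))  # 'xyz' is a palindrome
```

The client cannot tell the fake from a real implementation, which is what makes it testable in isolation.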
Whether the final implementation exists during development of the client is not important.
Since the client of a component contract does not know any implementation of it, there needs to be some authority to bring both together. In the example, that’s what the component.app program does. It references the client and all parts of the component, creates the implementation and constructs the client by injecting it:
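The wiring can be sketched in Python as follows. Contract, implementation and client are repeated in compact form to keep the listing self-contained; `build_app` is a hypothetical name for the composition step:

```python
from abc import ABC, abstractmethod


class IPalindromes(ABC):  # contract
    @abstractmethod
    def check_for_palindrome(self, text: str) -> bool:
        ...


class Palindromes(IPalindromes):  # implementation
    def check_for_palindrome(self, text: str) -> bool:
        letters = [c.lower() for c in text if c.isalnum()]
        return letters == letters[::-1]


class Client:  # knows only the contract
    def __init__(self, palindromes: IPalindromes):
        self._palindromes = palindromes

    def report(self, text: str) -> str:
        found = self._palindromes.check_for_palindrome(text)
        return f"'{text}' {'is' if found else 'is not'} a palindrome"


def build_app() -> Client:
    """The app: the only place that knows both client and implementation."""
    implementation = Palindromes()  # create the implementation
    return Client(implementation)   # inject it into the client


print(build_app().report("race car"))  # 'race car' is a palindrome
```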
That’s the sole purpose (read: single responsibility) of the app.
Services (or µServices if you like) form the top level of the module hierarchy. They are like components in that their contract is separate. But there must be more to them. As with „component“, the term „service“ is much used. But what actually is a service? Flow Design defines it very simply: A service is a component with a platform neutral contract. This has two implications:
- A platform neutral contract has no specific form. It could be a WSDL file or written on a piece of paper, it could use HTTP and XML or iron.io Queues and JSON. You just need to be able to implement it in many programming languages/on many platforms, e.g. with .NET or on the JVM or with Ruby or Go.
- Since a client cannot rely on a service contract to be implemented on the same platform it cannot rely on the implementation to run in the same operating system process. Thus services are autonomous. Each service implementation runs in its own operating system process. Otherwise it would not be possible to replace it with an alternative implementation of the same contract but on a different platform.
Here’s the palindrome example as a service running in one terminal window and being called from the command line of another:
The contract of the service is simple:
The palindrome service is an HTTP GET based service listening on port 1234, taking the text to analyze from the query string parameter named text and returning either 1 or 0 in the HTTP response. Example: http://localhost:1234?text=race%20car
This verbal contract then is implemented as a program and uses the above palindrome library as its implementation. A client can then be a standard tool like curl or a custom client with its own process.
In this example the service is made available using the lightweight HTTP framework Nancy for .NET. As you can see the contract is just a wrapper around using the palindrome library:
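Because the contract is platform neutral, the same service can be sketched on a different platform. The following is not the Nancy implementation but an illustrative Python version using only the standard library; the function `check_for_palindrome` stands in for the palindrome library (without semordnilap support), and `create_server` is a hypothetical helper name:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def check_for_palindrome(text: str) -> bool:
    # Stand-in for the palindrome library (assumption: no semordnilaps).
    letters = [c.lower() for c in text if c.isalnum()]
    return letters == letters[::-1]


class PalindromeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Contract: the text comes from the query string parameter "text".
        query = parse_qs(urlparse(self.path).query)
        text = query.get("text", [""])[0]
        # Contract: the response body is either 1 or 0.
        body = b"1" if check_for_palindrome(text) else b"0"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def create_server(port: int = 1234) -> HTTPServer:
    # Contract: the service listens on port 1234 by default.
    return HTTPServer(("localhost", port), PalindromeHandler)
```

Calling `create_server().serve_forever()` starts the service in its own process. The handler is only a thin wrapper: everything it does beyond translating HTTP to a function call and back is delegated.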
Once the service process runs you can call it from a process based on any development platform as long as it can deal with the HTTP contract, e.g.
curl --get "http://localhost:1234?text=stressed"
This looks simple on purpose. There is no more to the service module level. It’s just a container for code with a special property: a separate, platform neutral contract.
If the services you find in the literature are not that simple because they are depending on tons of infrastructure, then that’s fine – but it does not change the fundamental nature of them. Contracts can be very complicated. Hosting environments for services can be complicated. But at the core it’s still just a container for logic with a separate platform neutral contract to improve evolvability.
Whatever software is supposed to do for users, it’s done solely by logic. No modules are needed for that. Modules enter the scene only to make understanding and modifying the code easier for humans. Modules only serve the non-functional requirements evolvability and productivity.
That does not make them less important than logic. Quite the contrary! Without modules software development does not scale and is not sustainable.
Many established principles like SRP or „high cohesion, low coupling“ are modularization principles.
But it’s not enough to have just one kind of module, e.g. classes. Because then containment hierarchies cannot be (physically) expressed. That’s why Flow Design makes a hierarchy of modules explicit.
Due to the physical embedding and separation, meaning can be tangibly attached on different levels of abstraction. Also, the different kinds of modules allow for different physical levels of decoupling: decoupling between functions is weaker than between libraries or services.