The purpose of this article is to familiarize payment gateway software users with the benefits of high availability and fault tolerance. These two concepts are vitally important for players in the merchant services industry, because transaction processing (especially real-time processing) has become mission-critical for most businesses, such as online stores, online ticket booking and purchase systems, and hotel booking sites.
More and more transnational online businesses are emerging that target customers around the world. The payment systems of such businesses must be available 24/7, with as few maintenance windows as possible. Fault tolerance is in itself another reason to use a cluster architecture for transaction processing: if a node of the system fails, customers should not even notice.
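To make the idea of an unnoticed node failure concrete, here is a minimal sketch of failover across cluster nodes. It assumes each node can be modeled as a callable that either returns a response or raises an error; `NodeUnavailable`, `process_with_failover`, `failed_node`, and `healthy_node` are hypothetical names used only for illustration, not part of any particular payment gateway.

```python
from typing import Callable, Sequence


class NodeUnavailable(Exception):
    """Raised by a (hypothetical) node that cannot accept the request."""


def process_with_failover(request: dict,
                          nodes: Sequence[Callable[[dict], dict]]) -> dict:
    """Try each node in order and return the first successful response."""
    last_error = None
    for node in nodes:
        try:
            return node(request)
        except NodeUnavailable as exc:
            # Remember the failure and move on to the next node.
            last_error = exc
    raise RuntimeError("all nodes are unavailable") from last_error


def failed_node(request: dict) -> dict:
    # Simulates a node that is down.
    raise NodeUnavailable("node A is down")


def healthy_node(request: dict) -> dict:
    # Simulates a node that processes the transaction.
    return {"status": "approved", "amount": request["amount"], "node": "B"}


if __name__ == "__main__":
    # The caller receives a normal response although the first node failed.
    print(process_with_failover({"amount": 100}, [failed_node, healthy_node]))
```

In a real payment system, retrying on another node would also have to be safe against double charging, which this sketch does not address.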
Why are high availability and fault tolerance important?
There are several reasons to consider high availability and fault tolerance in advance, while you are still designing the architecture of your future payment ecosystem:
- Downtime reduction. You should minimize downtime, because your clients might be located around the world, in different time zones. On the other hand, your payment software still needs to be updated from time to time, and this generally requires server restarts or downtime. If your system is designed with high availability in mind, you can serve all your customers 24/7 and still perform maintenance operations and updates when necessary.
- Physical hardware failures. You might experience physical hardware failures, including database storage failures. For these situations, you must develop an effective backup strategy. Sometimes, however, a lengthy recovery from backup is simply not an option, because the system will not be functional while the recovery runs. For such cases, maintaining a cluster of databases, all of which are active at any time, is the only way.
- Division of labor. You might (sometimes or regularly) have a consistently high volume of authorizations in progress while somebody drops a large file for processing, or while a resource-intensive settlement process has to run. Running all of these processes on the same server can significantly impair the ability to perform real-time authorizations. Therefore, it might be necessary to segment the functionality, so that some nodes of the cluster are dedicated to authorization, some handle settlement and batch file processing, and others handle reports and data export.
- Resource utilization optimization. Similar to functionality-based segmentation, a need for customer-based segmentation sometimes arises. Customers with specific behavior patterns, or customers from specific time zones, can be directed to nodes “reserved” for them (always or at a certain time of the day).
Suppose some of a company’s customers process transactions 24/7, while other customers process large volumes at a particular time of the day or month (for example, merchants doing recurring billing through real-time transactions). In such a case, one node should be dedicated to merchants who process transactions 24 hours a day and have evenly distributed processing patterns, while dedicated nodes could be used for merchants who only process at certain times. Based on their processing times, these merchants could be scheduled in such a sequence that the servers are never overloaded with transaction volume.
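The functionality-based and customer-based segmentation described above can be sketched as a small routing function. This is only an illustration under stated assumptions: the node names, the `WORKLOAD_POOLS` and `RESERVATIONS` tables, the merchant ID, and the `select_node` function are all hypothetical, and a real gateway would typically load such rules from configuration.

```python
from dataclasses import dataclass

# Hypothetical node pools, one per type of work.
WORKLOAD_POOLS = {
    "authorization": ["auth-1", "auth-2"],
    "settlement": ["batch-1"],
    "reporting": ["report-1"],
}


@dataclass(frozen=True)
class Reservation:
    """A node reserved for one merchant during [start_hour, end_hour).

    Windows that wrap around midnight are not handled in this sketch.
    """
    merchant_id: str
    node: str
    start_hour: int
    end_hour: int

    def covers(self, merchant_id: str, hour: int) -> bool:
        return (merchant_id == self.merchant_id
                and self.start_hour <= hour < self.end_hour)


# Hypothetical example: a recurring-billing merchant gets its own node
# between 02:00 and 04:00.
RESERVATIONS = [
    Reservation("recurring-billing-merchant", "auth-burst-1", 2, 4),
]


def select_node(workload: str, merchant_id: str, hour: int) -> str:
    """Pick a node by workload type, honoring merchant reservations."""
    if workload == "authorization":
        for reservation in RESERVATIONS:
            if reservation.covers(merchant_id, hour):
                return reservation.node
    pool = WORKLOAD_POOLS[workload]
    # Deterministic spread of merchants across the pool.
    index = sum(merchant_id.encode("utf-8")) % len(pool)
    return pool[index]


if __name__ == "__main__":
    print(select_node("authorization", "recurring-billing-merchant", 3))
    print(select_node("authorization", "recurring-billing-merchant", 10))
    print(select_node("settlement", "any-merchant", 10))
```

Settlement and reporting work never lands on the authorization nodes, and the reserved node is used only inside the merchant's processing window, so bursts of recurring billing do not compete with evenly distributed traffic.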
If you are serious about payment processing and customer service, you have to be serious about high availability.