Game-Changing Intervention Removes the Pain for Ralawise

Two years of operational pain fixed in one week

Success can bring its own problems

Ralawise pride themselves on exceptional customer service but, due to significant sales growth, delivering it had become a real challenge.

Ralawise’s unique selling point is that any order placed before 5pm arrives the next day. This inevitably results in a large inrush of orders as the deadline approaches.

The IT systems supporting the order processing and fulfilment functions had become increasingly challenged, with order bottlenecks affecting guaranteed despatch times. For a couple of years they had been struggling to the point where the IT issues were creating major operational problems.

As sales volumes had risen, the IT systems had become increasingly overloaded and unstable during the daily peak periods as the order deadline approached. Order processing and pick-list generation had become slower and less reliable.

Picking instructions were reaching the warehouses later and later, so more staff had to stay on to complete picking before the courier cut-off for next-day delivery. Ensuring all orders caught that 10pm cut-off had become a significant problem.

People were suffering

The whole situation had created huge challenges for the IT staff who had to nurse the system along, as well as the warehouse staff who were under pressure to get the picking done at short notice.

Everyone was working late and the staff were incredibly stressed. Despite best efforts, Ralawise were struggling to fulfil the promise they had made to their customers. Some orders did not make it, which was putting the company’s reputation at risk.

Living with the pain

Having grown from small beginnings, Ralawise were using a mid-range ERP system that had, over the years, evolved into a complex enterprise IT solution through the addition of third-party extensions and customisations to its logic and business rules.

For almost two years, IT had been desperately trying to manage the overload issue, with staff monitoring the systems, stopping jobs and restarting servers throughout the evening. It was becoming increasingly difficult to find time to work on other projects, and the whole situation had become untenable.

An unknown cause

Over the previous two years there had been numerous attempts to identify the root cause of the problem. Different providers had been brought in to investigate the issue, all without success.

All operational measures indicated the infrastructure had capacity to spare, so investigations focussed on searching for hidden flaws in the coding of system customisations. These reviews had borne no fruit. There was no explanation for what was happening, and no one could work out what to change to fix things.

Extra help

A senior advisor to Ralawise, Peter Taylor, had previously used SQC’s services as CIO at the Charity Commission, where he had seen SQC in action first-hand, identifying and solving obscure problems in a complex IT system.

He knew SQC could succeed where others had failed: taking ownership of problems, getting to the root cause of mysterious issues and identifying how to fix them.

On Peter’s recommendation, the Ralawise CIO, Adam Barwick, commissioned an investigation by SQC, with the aim of identifying and resolving the issue.


Understanding the issue

Introductions made, SQC spent time in the Ralawise IT department’s office gaining insight into the nature of the problem and how it manifested itself within the different parts of the IT estate.

SQC observed, first-hand, the highly predictable early signs of trouble: queue lengths started to increase, followed by ‘controlled chaos’ as services began to lock up, whilst the IT team intervened to keep things moving as best they could.

Familiarisation went beyond the core ERP system. It covered the solution as a whole, including the warehouse management system and the devices used in the picking process. The aim was to obtain a holistic picture of the business process, the systems used to support it, and how these elements interacted, especially during the problematic parts of the day.

Searching for an answer

There were no obvious causes. Volumes had grown, but other organisations were processing much higher volumes through similar ERP systems. There were no obvious signs of infrastructure stress; nothing was telling the IT team ‘add more horsepower’.

With no obvious suspects, the search for the root cause ranged far and wide. It involved looking at all of the ‘innocent parties’, examining each one in turn, considering what might be happening, and checking those theories out: a ‘could it be’ approach that considered nothing out of bounds. The search included reviews of APIs, locking, performance metrics and network conditions. None of these revealed any prime suspects. All plausible, and some highly implausible, angles were considered. With everything else eliminated, attention returned to the database server. It was now the prime suspect, even though all was ‘green’.

The guilty party

Throughout two painful years there had been many eyes on the database server. All of the operating-system-level metrics showed that the machine had ample capacity; even on the busiest and most problematic occasions there was memory and CPU to spare. The available database statistics showed low levels of activity but, with the CPU sitting idle, this was put down to a lack of demand from the clients rather than a database problem. Hence the searches had gone elsewhere, looking for other bottlenecks.

Through a process of eliminating other candidates, SQC developed a hypothesis that it was the database causing the slow order processing, and that blind spots in the monitoring were the reason this had not been seen. Having brought in a specialist database monitoring tool and applied fine-grained processor monitoring, the team converged on a clear explanation of the behaviour and why it was ‘invisible’ to the IT team. SQC informed Ralawise that it was their SQL server, running on a virtual machine, that was struggling with the volume. Ralawise were dubious, as they could see no signs of any issue, but agreed to trial a change.
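
For readers who want to see how such a blind spot can be surfaced, the sketch below is a minimal illustration only, not SQC’s actual tooling (which the case study does not describe). It assumes Python with the pyodbc package and a placeholder connection string, and queries SQL Server’s scheduler and wait-statistics views, which can show work queuing for CPU inside the database engine even while host-level dashboards report the machine as idle.

    # Sketch only: look for internal CPU pressure that host metrics can miss.
    # Assumes pyodbc and a reachable SQL Server; the connection string is a
    # placeholder, not Ralawise's configuration.
    import pyodbc

    CONN_STR = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=your-sql-server;DATABASE=master;Trusted_Connection=yes;"
    )

    SCHEDULERS = """
        SELECT scheduler_id, current_tasks_count, runnable_tasks_count
        FROM sys.dm_os_schedulers
        WHERE status = 'VISIBLE ONLINE';
    """

    TOP_WAITS = """
        SELECT TOP 10 wait_type, wait_time_ms, signal_wait_time_ms
        FROM sys.dm_os_wait_stats
        ORDER BY wait_time_ms DESC;
    """

    with pyodbc.connect(CONN_STR) as conn:
        cur = conn.cursor()

        # A sustained runnable_tasks_count above zero means tasks are queuing
        # for CPU schedulers even though the OS-level CPU graph may look idle.
        for row in cur.execute(SCHEDULERS):
            print(f"scheduler {row.scheduler_id}: "
                  f"{row.current_tasks_count} tasks, "
                  f"{row.runnable_tasks_count} runnable")

        # High signal waits (time spent waiting for a CPU after the resource
        # wait has finished) are another pointer towards processor starvation.
        for row in cur.execute(TOP_WAITS):
            print(row.wait_type, row.wait_time_ms, row.signal_wait_time_ms)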

Immediate results

SQC recommended that Ralawise double the processing power of their SQL server by adding CPUs to the virtual machine. The change was made on a Saturday, and everybody waited to see what would happen on the Monday, always the busiest and most problematic day of the week.

“We’ve not seen any database overload issues, nor have we seen the 2,000 per hour problem reappear. When the volume is there, we’ve been hitting 8,000 order lines an hour, and regularly do 5,000-6,000.”

“Cyber Monday was our biggest day this year with 46.7k order lines and we were able to process them through the system without issue. A massive win for us.”

“We’ve also seen various other processes improve, notably the invoicing process which used to take 4-5hrs is now taking around 2.5-3hrs.”

Adam Barwick, CIO, Ralawise
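
A practical footnote on changes like this: when vCPUs are added to a virtual machine, it is worth confirming that the database engine can actually see and schedule on them, since edition licensing and affinity settings can cap what gets used. A minimal check, again assuming Python with pyodbc and a placeholder connection string rather than anything from the Ralawise estate:

    # Sketch only: confirm how many CPUs SQL Server is actually scheduling on.
    import pyodbc

    CONN_STR = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=your-sql-server;DATABASE=master;Trusted_Connection=yes;"
    )

    with pyodbc.connect(CONN_STR) as conn:
        row = conn.cursor().execute(
            "SELECT cpu_count, scheduler_count, virtual_machine_type_desc "
            "FROM sys.dm_os_sys_info;"
        ).fetchone()
        # cpu_count is what the OS presents to SQL Server; scheduler_count is
        # what the engine is using. A gap between the two after adding vCPUs
        # points to licensing or affinity limits rather than the hypervisor.
        print(row.cpu_count, row.scheduler_count, row.virtual_machine_type_desc)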

Additional input

After the project, SQC followed up with secondary recommendations covering changes to business processes, database tuning, and moving reporting from the active database into the secondary one.
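
The case study does not say how that secondary database is kept in step with the active one. Purely as an illustration of the reporting move, if the secondary were exposed as a SQL Server readable secondary (for example via an Always On availability group), reporting connections could be steered towards it by declaring read-only intent, as in this sketch; the listener name, database and table are hypothetical.

    # Sketch only: route reporting queries to a readable secondary so the
    # active (primary) database stays free for order processing.
    # 'ag-listener', 'Orders' and 'dbo.OrderLines' are made-up names.
    import pyodbc

    REPORTING_CONN_STR = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=ag-listener;DATABASE=Orders;"
        "ApplicationIntent=ReadOnly;"    # ask for a read-only (secondary) replica
        "Trusted_Connection=yes;"
    )

    with pyodbc.connect(REPORTING_CONN_STR) as conn:
        cur = conn.cursor()
        cur.execute("SELECT COUNT(*) FROM dbo.OrderLines;")
        print(cur.fetchone()[0])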

Ralawise is the UK’s leading wholesale distributor of ‘blank canvas’ products, supplying businesses that apply custom embroidery, print and embellishment. Founded in the Batson family home in the 1980s, Ralawise has grown significantly over the years and today employs over 500 team members and operates across multiple countries, whilst remaining family-owned.

  • Turnover £169 million (2024)
  • 100+ different brands stocked
  • 4,000+ styles of products stocked
  • 600,000+ orders shipped per annum
  • 8 distribution centres