ghp June 2015

The problem is basically one of ‘data-rich, information-poor’. The current methods involve data scientists spending weeks, and sometimes months, focusing on a specific repeatable issue and building the perfect model. Apart from this, can anything else be done? The answer is yes. There are technologies emerging into the mainstream from a highly scientific background that can either automate or greatly assist with the joining, cleaning and transformation of disparate and unstructured data, and some that have predictive analytics algorithms which can work with dirty and incomplete data. There are different ways to resolve the same problem, each with its own strengths and weaknesses.

The technology at the company I represent, Warwick Analytics, is one of the potential new approaches to address this. It combines two technologies. The first is known as ‘automated information retrieval’ (or AIR): an automated, exhaustive approach to pattern mining and pruning. Significantly, it does not clean data, in case that strips out a signal needed in the future. It does have a downside in that it requires access to significant distributed computing power (whether on premise or in the cloud), but where this is available it can be extremely useful, as it makes no assumptions and can work with text including typos and industry-specific terms. Its output is to structure and classify the data.

The second technology is a home-grown, non-statistical predictive analytics algorithm called RCASE (“Root Cause Analysis Solver Engine”), which was built to deal with dirty and/or incomplete data. Combining both technologies in an iterative workflow means that all of the data can be fed in, and the system will mine the historian for signals to explain or predict the clusters of seemingly unrelated incidents that AIR has generated.
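To make the cluster-then-explain workflow concrete, here is a toy sketch in Python. It is an illustration only, not the actual AIR or RCASE algorithms: the cluster flags, the variable name `reactor_temp` and the simple threshold search (a “decision stump”) are all invented for the example. The idea is that once incidents have been grouped into clusters, each process variable in the historian can be scanned for a threshold that separates a cluster’s incidents from the rest, making it a candidate explanatory signal.

```python
# Toy illustration of the cluster-then-explain workflow (not the
# AIR/RCASE implementation; all names and data are hypothetical).

def best_split(values, in_cluster):
    """Find the single threshold on one process variable that best
    separates incidents in a cluster from the rest (a decision stump).
    Returns (accuracy, threshold)."""
    best = (0.0, None)
    for t in sorted(set(values)):
        # Predict "in cluster" whenever the reading is at or above t.
        correct = sum(1 for v, y in zip(values, in_cluster)
                      if (v >= t) == y)
        acc = correct / len(values)
        if acc > best[0]:
            best = (acc, t)
    return best

# Hypothetical historian data: one reading per incident, plus a flag
# saying whether the clustering step put the incident in cluster A.
reactor_temp = [71.0, 72.5, 79.8, 80.2, 81.0, 70.4]
in_cluster_a = [False, False, True, True, True, False]

acc, threshold = best_split(reactor_temp, in_cluster_a)
# A perfect split (acc == 1.0) suggests "reactor_temp >= threshold"
# as a candidate root-cause signal for cluster A, to be validated
# in the next iteration with more data.
```

In a real deployment this scan would run over many variables and clusters at once, and any candidate signal would be checked against its expected false-positive and false-negative rates before being acted upon.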
One of the key benefits is that the alarm thresholds are explicitly predicted in terms of the number of false positives and false negatives, so only signals that are tolerable or valuable need be retrieved.

The predictive analytics solution is known as SigmaGuardian and is being used at many healthcare establishments. For example, it was recently implemented at a major pharmaceutical company to reduce impurity levels in the production of a haemorrhoid treatment. The company wanted to identify where the underlying reasons for the impurity formation lay. Although the initial dataset was small and contained some inaccuracies, SigmaGuardian was able to employ its information retrieval technology to extract as much information from the data as possible, including logical inferences and derivative variables. From the first iteration, its ‘non-statistical’ predictive algorithm was able to generate and validate some strong signals without any hypotheses. Some of the results were expected and some were a surprise to the team. This led to a second iteration with more data. Warwick Analytics generated root causes to explain the factors which predicted both high and low yields. A Design of Experiments (“DoE”) was created to validate the results. These concurred with the findings from SigmaGuardian, and the relevant process changes could be implemented.

The Site Operational Excellence Lead commented: “The pharmaceutical industry is highly regulated and we have to verify and document everything we do. So finding and validating the root causes which drove yield improvements was a key advancement, and we are very pleased with the analysis that SigmaGuardian provided.

“We did not expect the results to be as good as they were, particularly with the limited data we provided.
Also the speed of calculation and the ease of interpreting the results were impressive too.”

In terms of healthcare provision, there is another use case for predictive analytics in hospitals. University Hospitals Coventry and Warwickshire NHS Trust (UHCW) also used SigmaGuardian to improve its stroke treatment metrics (especially the then 80:90 stroke target) and to deliver sustainable savings of over £400,000 per annum in the costs of running the stroke department. UHCW was falling behind on its 80:90 stroke target and was also experiencing sub-optimal outcomes, budget challenges, incomplete data and an unclear action plan.

Patient flows were monitored and analysed using the software. Matching these against a hospital system-map and patient data, the software identified the bottlenecks of the hospital system, enabling UHCW to prioritise and reallocate resources to the appropriate points in the system. Without significant new investment, UHCW was able to rapidly improve outcomes for stroke victims while also producing significant savings in the costs of running the stroke unit. Within less than a year, the hospital was achieving well over the NICE quality standard for stroke units and is one of the best-performing hospitals in the UK.

So there are many opportunities for improvement in care quality, production quality and yield. Yes, we are challenged with data, but there are still predictive analytics out there which can work.