Typically, researchers will have a model they wish to train and will therefore need to collect data with which to train it; an important part of this is ensuring that, as far as possible, biases are eliminated from the data. Importantly, the stage at which the model is considered ‘good enough’ during training is decided by the researchers in advance, depending on the question the model was created to answer.
By ‘good enough’ we mean the point at which the model is judged to have learnt enough. Machine learning, by its very nature, cannot guarantee the performance of its algorithms, so it is necessary for researchers to stipulate probabilistic bounds on the performance they will accept.
Consider a very simple example: suppose you wish to create a model that can distinguish between wine, juice and beer. Your data might consist of, say, sugar content, colour (perhaps measured with a spectrometer) and alcohol content.
The quality and quantity of the data are directly related to how well the model will perform after training. If more data were gathered about juice than about the other categories, there would clearly be an imbalance: the model would tend to guess ‘juice’ more often, simply because that would give it a greater chance of being right. Once the model has been trained, the next step is to evaluate it, which must be done on data it has not seen before; otherwise it could simply recall the answers it was trained on.
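As a very rough illustration of that workflow, the sketch below trains a small classifier on a handful of invented measurements. The figures, the 90 per cent target and the use of scikit-learn are our own assumptions rather than part of any real study; the point is simply that a test set is held back, the surplus of ‘juice’ examples is counteracted with balanced class weights, and the result is checked against a performance bound agreed before training.

# A minimal sketch of the drinks example, assuming each row is
# (sugar g/L, colour index, alcohol % vol); every value is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.array([[2.0, 0.9, 12.5],    # wine
              [110.0, 0.6, 0.0],   # juice
              [105.0, 0.5, 0.0],   # juice (the over-represented class)
              [98.0, 0.7, 0.0],    # juice
              [4.0, 0.4, 5.0],     # beer
              [3.5, 0.8, 13.0]])   # wine
y = np.array(['wine', 'juice', 'juice', 'juice', 'beer', 'wine'])

# Hold data back for evaluation so the model cannot simply recall the answers.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=0)

# class_weight='balanced' counteracts the surplus of 'juice' examples.
model = LogisticRegression(class_weight='balanced', max_iter=1000)
model.fit(X_train, y_train)

# The 'good enough' bound is agreed before training begins.
TARGET_ACCURACY = 0.9
print('good enough:', model.score(X_test, y_test) >= TARGET_ACCURACY)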
Within the field of ML lie neural networks (NNs). The first neural network was created in 1943 by the neurophysiologist Warren McCulloch and the mathematician Walter Pitts, whose paper sought to describe how neurons in the brain work; they built an approximate model of those neurons using electrical circuits.
The first computer-based neural network came in 1959, with a system from Stanford University called ADALINE, which detected and cancelled echo on telecommunications lines using adaptive filters. This approach is still in use today.
While neural networks provided a novel approach to information processing and computation, computing power was at a premium at the time.
Marvin Minsky, who helped lay the foundations of neural network computation, also discouraged research in the discipline in Perceptrons, the book he co-authored with Seymour Papert. Some say this led to the ‘AI winter’ of reduced research funding, publishing and progress in neural networks that lasted from the 1970s to the early 2000s.
However, since then, as computing power became more readily available, interest in neural networks has increased once again. This additional computing power has allowed more computationally demanding techniques, such as ‘deep learning’, to be deployed and to generate compelling results.
Despite the unpopularity of neural networks in computer science, some researchers continued to pursue them through the AI winter, recognising their great potential.
Neural networks are particularly useful when the problem being analysed involves a degree of uncertainty; they tend to work best where conventional computational approaches have failed to produce robust models.
There are numerous examples of neural networks being used in medicine to this end. One important example is in the diagnosis and surgical planning of horizontal strabismus, an anomaly in which the eyes lose alignment with one another: one eye may fixate on the frontal point while the other turns inwards, outwards, upwards or downwards.
The treatment for this condition is usually surgery on the eye muscles to restore binocular vision. The surgery involves either weakening one set of muscles or increasing muscular torque in another, depending on the patient’s condition. There are four muscles that can be operated on, and intervention often involves surgery to two or three of them. The factors that influence the surgeon’s choice are complex, demanding both theoretical knowledge and practical experience.
When planning the surgery, the patient’s condition is assessed in a number of ways, including measurements of visual acuity, a refraction exam, binocular fixation, a fundoscopic exam, and checks of the primary and secondary gaze positions (leftwards, rightwards, upwards and downwards). The critical measurement is how many millimetres of tightening or loosening should be applied to the muscles to restore the best possible binocular vision.
Researchers have analysed data from 114 surgical interventions and collated it with an expert judgement of the best course of action for each patient. This data was then used to train a neural network to determine which muscles should be operated on and how many millimetres of tightening or loosening should be applied to each.
The efficacy of the trained NN is measured by comparing the network’s predictions with the expert judgement. The model’s error was 0.5mm for tightening a muscle and 0.7mm for loosening. The model therefore shows great promise over previous formula-based techniques, which relied on averaging patient values and so failed to capture the variability of individual cases.
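For readers who prefer to see the shape of such a model in code, the sketch below trains a small multi-output regression network on randomly generated stand-in data and reports its error in millimetres. It is emphatically not the researchers’ network: the feature count, network size and data are assumptions made purely for illustration.

# A rough sketch of this kind of model, not the study's actual network.
# The features and targets are random stand-ins for the 114 real records.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(114, 8))    # e.g. acuity, refraction, gaze measurements
y = rng.normal(size=(114, 2))    # millimetres of tightening / loosening

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
net.fit(X_train, y_train)

# Agreement with the expert plan is reported in millimetres, as in the study.
errors = mean_absolute_error(y_test, net.predict(X_test),
                             multioutput='raw_values')
print('mean error per output (mm):', errors)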
The trained model can then be used as a decision support aid for surgeons, which should improve the outcome of these surgical interventions.
The potential of neural networks as a basis for clinical decision support is clear. They are able to represent complex relationships in data that are not immediately obvious to human inspection.
Depending on the algorithm used in the learning process, neural networks are also tolerant of noise and error in the training data. Depending on the size and quality of the data set, they can tolerate up to 15 per cent error and still produce a robust result. In some cases, introducing noise to the dataset can actually improve the performance of the network.
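A simple way to exploit this tolerance is to augment the training set with slightly jittered copies of its examples. The sketch below assumes NumPy arrays X and y of training inputs and labels; the noise level and number of copies are illustrative choices, not recommendations.

# A minimal sketch of noise injection as data augmentation.
import numpy as np

def augment_with_noise(X, y, noise_std=0.05, copies=1, seed=0):
    """Append jittered copies of the training examples to X and y."""
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [X], [y]
    for _ in range(copies):
        X_parts.append(X + rng.normal(scale=noise_std, size=X.shape))
        y_parts.append(y)          # labels are unchanged by the jitter
    return np.concatenate(X_parts), np.concatenate(y_parts)

# Example: double the training set before fitting a network.
# X_train, y_train = augment_with_noise(X_train, y_train, copies=1)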
However, the major problem with using these networks is the availability of data of sufficient quality and quantity. If the data is somehow unrepresentative, or there is insufficient data to cover all eventualities in the area of study, the resulting models can be brittle or unrepresentative of the real-world situation.
Medical decision making
Clinical decision support systems (CDSS) are designed to facilitate clinical decision-making by healthcare providers. The landscape of CDSS providers is rather mixed. Our own experience is with groups providing large-scale resources of some 20,000 medical algorithms across multiple medical disciplines, available online, via XML forms, or as a downloadable Android or iOS application.
Other providers, such as MDCalc and QxMD, offer around 200 different medical calculators covering a range of medical specialties, either as a web tool or as a downloadable app for iOS, Android and Windows.
There are numerous examples of one-off CDSS designed to cover one or two specific illnesses; generally these are better designed, built and tested than mass-produced CDSS. There is no agreed, standardised way of designing CDSS, and each application tends to be unique to the specific condition it is intended to assist with.
Therefore, there is very little continuity in the user interface or functionality between CDSS, which can cause usability issues for the end-user. Further, there are very few agreed standards governing the use of such technology and, as such, many medical organisations do not have policies in place to guide users in its safe use.
However, the potential benefits of CDSS for clinicians are significant; in theory, CDSS offer clinicians more reliable decision-making, greater performance and fewer medical errors. Combined with machine learning, and in particular with neural networks, these benefits are greatly amplified. Such systems are known as non-knowledge-based CDSS (as opposed to their knowledge-based counterparts, which consist of a knowledge base and a set of if-then rules). Non-knowledge-based CDSS rely on some form of artificial intelligence, and are typically used post-diagnosis to suggest patterns for clinicians and researchers to investigate more deeply.
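To make the contrast concrete, a knowledge-based CDSS can be pictured as a small knowledge base of if-then rules applied to a patient record, as in the toy fragment below. The thresholds and advice text are invented purely for illustration and are not clinical guidance.

# A toy knowledge base of if-then rules; the values are illustrative only.
RULES = [
    (lambda p: p['temperature_c'] >= 38.0, 'Possible fever: consider infection work-up'),
    (lambda p: p['systolic_bp'] < 90, 'Hypotension: review fluid status'),
]

def advise(patient):
    """Return the advice of every rule whose condition the patient meets."""
    return [advice for condition, advice in RULES if condition(patient)]

print(advise({'temperature_c': 38.6, 'systolic_bp': 120}))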
A note on combining ML and distributed ledger technology
Distributed ledger (or blockchain) technology has been called the ‘next big thing’. Its power lies both in the way machines in a blockchain can maintain their anonymity and in the way it alleviates security concerns, given how difficult a blockchain system is to penetrate.
Blockchain is the underlying mechanism of cryptocurrencies. The mechanism is such that the privacy of data transfers and their authentication are guaranteed. Security and privacy in the medical discipline are of the utmost importance.
A blockchain is a database composed of unchangeable digital records, called blocks, stored in a chain. Each block contains cryptographically secured data, including information that links it to the previous block.
The database is shared between several parties, so that everyone has a consistent view of it, and the integrity of the records is established by consensus amongst all authenticating parties. Distributed ledger technologies can make the services they are applied to more transparent and trustworthy, without compromising security or privacy.
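The chaining idea itself is simple enough to sketch in a few lines of standard-library Python, as below. A real distributed ledger adds consensus, digital signatures and peer-to-peer replication on top of this, and the data values here are placeholders.

# A minimal sketch of hash chaining; real ledgers add consensus and signatures.
import hashlib, json, time

def make_block(data, previous_hash):
    """Build a block whose hash depends on its data and on its predecessor."""
    block = {'timestamp': time.time(), 'data': data,
             'previous_hash': previous_hash}
    payload = json.dumps(block, sort_keys=True).encode()
    block['hash'] = hashlib.sha256(payload).hexdigest()
    return block

genesis = make_block('genesis', previous_hash='0' * 64)
record = make_block('anonymised patient record reference', genesis['hash'])

# Tampering with the first block changes its hash and breaks the link stored
# in the next block, which is what makes the records effectively unchangeable.
print(record['previous_hash'] == genesis['hash'])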
If blockchain were fully implemented in a country’s public health service, it could, in theory, allow researchers to investigate and gather patient data without compromising patient anonymity.
The possibilities would then be virtually endless. For instance, researchers could look for patterns of disease commonality in order to investigate unknown factors linked to a particular illness.
Additionally, combining such an infrastructure with neural networks would be a fruitful area in which to explore pattern recognition for medical uses, as well as a way to enhance the NN applications mentioned in this article.
If you would like more information on this, or any other research mentioned in this article, please contact the authors at acalderon@cardiffmet.ac.uk or sthorne@cardiffmet.ac.uk at the Department of Computing, Cardiff Metropolitan University, Cardiff, U.K.