Machine learning can predict which strains of bacteria are likely to cause food poisoning outbreaks, research has found.
Researchers at the University of Edinburgh’s Roslin Institute used software that compares genetic information from bacterial samples isolated from both animals and people. The software learns the DNA signatures associated with E. coli samples that have caused outbreaks of infection in people. It can then pick out the animal strains that carry these signatures and are therefore likely to be a threat to human health.
Most E. coli strains live in the guts of people and animals without causing illness, but E. coli O157 is linked with more serious human infections. Cows also carry E. coli O157 and serve as the main reservoir for these toxic bacteria. The animals excrete the bacteria in their faeces but do not become ill, which makes it difficult to spot which herds and animals are carrying strains that are likely to cause disease in people.
The team trained the software on DNA sequences from strains isolated from cattle herds and human infections in the UK and the US. Once trained, the software is able to predict whether an E. coli strain is likely to have come from a cow or a person. Using this approach, the team predicts that fewer than ten per cent of E. coli O157 cattle strains are likely to have the potential to cause human disease. Interventions to stop the spread of the disease – such as vaccines – could be targeted at herds with these strains to minimise the risk of outbreaks in people, the team says.
E. coli O157 causes stomach cramps, vomiting and severe diarrhoea in infected people. A recent outbreak of the illness in Scotland resulted in the death of a child and a further 19 cases of serious food poisoning. The infection is believed to have originated from unpasteurised cheese.
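For readers who want a feel for the train-then-predict idea behind the host prediction described above, the following is a minimal sketch only. It is not the team's software: it swaps in a simple nearest-centroid classifier on short DNA "words" (k-mers), and every function name, the k-mer length and the toy sequences are hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import sqrt

K = 3  # k-mer length; an arbitrary choice for this toy example


def kmer_profile(seq, k=K):
    """Return the relative frequency of each k-mer in a DNA string."""
    counts = defaultdict(int)
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def centroid(profiles):
    """Average several k-mer profiles into one 'signature' per host."""
    summed = defaultdict(float)
    for profile in profiles:
        for kmer, freq in profile.items():
            summed[kmer] += freq
    return {kmer: value / len(profiles) for kmer, value in summed.items()}


def distance(a, b):
    """Euclidean distance between two sparse k-mer profiles."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(key, 0.0) - b.get(key, 0.0)) ** 2 for key in keys))


def train(labelled_sequences):
    """Build one signature per host label from (sequence, host) pairs."""
    by_host = defaultdict(list)
    for seq, host in labelled_sequences:
        by_host[host].append(kmer_profile(seq))
    return {host: centroid(profiles) for host, profiles in by_host.items()}


def predict(model, seq):
    """Return the host whose signature is closest to this sequence."""
    profile = kmer_profile(seq)
    return min(model, key=lambda host: distance(profile, model[host]))


# Made-up toy sequences, not real E. coli data.
training_data = [
    ("ATATATATATAT", "cattle"),
    ("ATATTATATATA", "cattle"),
    ("GCGCGCGCGCGC", "human"),
    ("GCGGCGCGCGCG", "human"),
]
model = train(training_data)
cattle_like = predict(model, "ATATATTATA")
human_like = predict(model, "GCGCGGCGC")
```

In this toy setting, an isolate taken from cattle whose sequence is nonetheless predicted as "human" would be the kind of strain flagged as a potential threat to people.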
Researchers say their approach could be adapted to test samples of other types of bacteria isolated from animals – such as salmonella and campylobacter – to identify strains with the potential to cause human disease.
The study, funded by Food Standards Scotland and the Food Standards Agency, is published in the Proceedings of the National Academy of Sciences. The research was a collaboration between scientists at The Roslin Institute, Public Health England (Colindale), the University of Glasgow and the United States Department of Agriculture. The Roslin Institute receives strategic support from the Biotechnology and Biological Sciences Research Council.
Professor David Gally, of the University of Edinburgh’s Roslin Institute, said: “Our findings indicate that the most dangerous E. coli O157 strains may in fact be very rare in the cattle reservoir, which is reassuring. The study highlights the potential of machine learning approaches for identifying these strains early and preventing outbreaks of this infectious disease.
“We hope that by investigating the genes the software uses to discriminate the strains, we can learn why certain isolates are more of a threat to human health.”
For further information, please contact:
Dr Jen Middleton, Press and PR Office
Mobile: 07795 640662
Work: +44 (0)131 650 6514
Email: jen.middleton@ed.ac.uk