App helps fight food poisoning in Las Vegas with machine learning
nEmesis continuously collects tweets throughout Las Vegas and connects the tweets to food venues. These tweets are evaluated by the language model to determine which are self-reports of symptoms of foodborne illness. (Photos courtesy of Adam Sadilek)
(This press release from the National Science Foundation was issued March 7, 2016)
It's happened to many of us. We eat at a restaurant with less than ideal hygiene and come down with a nasty case of food poisoning.
Foodborne illness afflicts 48 million people annually in the U.S. alone, or roughly one out of every six Americans; about 128,000 individuals are hospitalized each year, and 3,000 die from the illness. And many of these sufferers write about it on Twitter.
Computer science researchers from the University of Rochester have developed an app for health departments that uses natural language processing and artificial intelligence to identify food poisoning-related tweets, connect them to restaurants using geotagging and identify likely hot spots.
The team presented the results of its research at the 30th Association for the Advancement of Artificial Intelligence (AAAI) conference in Phoenix, Arizona, in February. The project was supported by grants from the National Science Foundation (NSF), the National Institutes of Health and the Intel Science and Technology Center for Pervasive Computing.
Location-based epidemiology is nothing new. John Snow, credited as the world's first epidemiologist, used maps of London in 1854 to identify the source (a neighborhood well) of the cholera epidemic that was ravaging the city, and in the process discovered the connection between the disease and water sources.
However, as the researchers showed, it's now possible to deduce the source of outbreaks using publicly available social media content and deep learning algorithms trained to recognize the linguistic traits associated with a disease -- "I feel nauseous," for instance.
"We don't need to go door to door like John Snow did," says Adam Sadilek, a researcher who worked on the project at the University of Rochester and who is now at Google Research. "We can use all this data and mine it automatically."
Testing the app in Las Vegas
The work presented at AAAI described a recent collaboration with the Las Vegas health department, where officials used the researchers' app, called nEmesis, to improve the city's inspection protocols.
Typically, cities (including Las Vegas) use a random system to decide which restaurants to inspect on any given day. The research team convinced Las Vegas officials to replace their random system with a list of possible sites of infection derived using their smart algorithms.
In a controlled experiment, half of the inspections were performed using the random approach and half were done using nEmesis, without the inspectors knowing that any change had occurred in the system.
"Each morning we gave the city a list of places where we knew that something was wrong so they could do an inspection of those restaurants," Sadilek said.
For three months, the system automatically scanned an average of 16,000 tweets from 3,600 users each day. About 1,000 of those tweets snapped to a specific restaurant, and of those, approximately 12 contained content that likely signified food poisoning. The researchers used these tweets to generate a list of highest-priority locations for inspections.
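The steps described above (snap each geotagged tweet to a venue, flag likely illness reports, rank venues) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the nEmesis implementation: the venue list, the 50-meter snapping radius, and the keyword check standing in for the trained language model are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical venues as (name, latitude, longitude). Not real data.
VENUES = [
    ("Venue A", 36.1147, -115.1728),
    ("Venue B", 36.1215, -115.1739),
]

# Assumption: a crude keyword list stands in for the trained language model.
SYMPTOM_TERMS = ("food poisoning", "nauseous", "throwing up", "stomach ache")


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def snap_to_venue(lat, lon, venues=VENUES, max_dist_m=50.0):
    """Return the nearest venue's name if it lies within max_dist_m, else None."""
    nearest = min(venues, key=lambda v: haversine_m(lat, lon, v[1], v[2]))
    if haversine_m(lat, lon, nearest[1], nearest[2]) <= max_dist_m:
        return nearest[0]
    return None


def looks_like_illness(text):
    """Placeholder classifier: True if any symptom term appears in the text."""
    lowered = text.lower()
    return any(term in lowered for term in SYMPTOM_TERMS)


def rank_venues(tweets):
    """Count flagged tweets per venue; return (venue, count) pairs, highest first."""
    counts = Counter()
    for lat, lon, text in tweets:
        venue = snap_to_venue(lat, lon)
        if venue is not None and looks_like_illness(text):
            counts[venue] += 1
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        (36.1147, -115.1728, "I feel nauseous"),
        (36.1215, -115.1739, "Pretty sure that was food poisoning"),
    ]
    print(rank_venues(sample))
```

A real system would replace the keyword check with a classifier trained on labeled tweets; the ranking step would stay much the same.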
Analyzing the results of the experiment, they found the tweet-based system led to citations for health violations in 15 percent of inspections, compared to 9 percent using the random system. Some of the inspections led to warnings; others resulted in closures.
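To put the reported figures in proportion, the arithmetic can be worked through in a few lines. The inputs are the numbers quoted in this release; the variable names are only illustrative, and the daily counts are the stated averages rather than exact totals.

```python
# Daily averages reported for the three-month study.
tweets_per_day = 16000
snapped_per_day = 1000   # tweets matched to a specific restaurant
flagged_per_day = 12     # snapped tweets likely signifying food poisoning

snapped_share = snapped_per_day / tweets_per_day    # share of all tweets tied to a venue
flagged_share = flagged_per_day / snapped_per_day   # share of snapped tweets flagged

# Citation rates per inspection, in percent.
adaptive_rate_pct = 15
random_rate_pct = 9

diff_points = adaptive_rate_pct - random_rate_pct     # percentage-point gap
relative_lift = adaptive_rate_pct / random_rate_pct   # ratio of the two rates

print(f"Snapped to a venue: {snapped_share:.2%} of tweets")
print(f"Flagged as illness: {flagged_share:.2%} of snapped tweets")
print(f"Citation gap: {diff_points} percentage points")
print(f"Relative lift: {relative_lift:.2f}x")
```

In other words, only a small fraction of tweets carries a usable signal, yet the tweet-based list produced citations about two-thirds more often than random selection.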
The researchers estimate that these improvements to the efficacy of the inspections led to 9,000 fewer food poisoning incidents and 557 fewer hospitalizations in Las Vegas during the course of the study.
"nEmesis has proved to be a useful tool for quickly and accurately identifying facilities in need of support, education, or regulation by the health department," says Lauren DiPrete, senior environmental health specialist for the Southern Nevada Health District.
"Adaptive inspections allow us to focus our limited resources on the restaurants with problems," says Brian Labus, a visiting research assistant professor at the University of Nevada Las Vegas's School of Community Health Sciences. "The sooner we find out about a problem, the sooner we can intervene and keep people from getting sick."
The research received an Innovative Applications of Artificial Intelligence award at AAAI as one of the best deployed applications of AI with measurable benefits.
Just beginning to scratch the surface
"Adaptive inspections are significantly more effective and can make a real dent in the statistics," Sadilek said. "This case shows how you can use public data to improve public health."
Not only that, the approach can be applied to a range of other public health problems.
"This happens to be restaurants, but the method can also be used for bedbugs," he says. "Similarly, you can look at what people tweet about after they visit their doctor or hospital. We're just beginning to scratch the surface of what's possible."
nEmesis serves as an example of how researchers can leverage social media users as a kind of distributed sensor network, where each person observes and reports on some aspect of the world, according to Henry Kautz, director of the Institute for Data Science at the University of Rochester and the principal investigator on the NSF-funded project that developed the app.
"Each report is very noisy, but the aggregate results can be reliable," Kautz says. "The approach can be used for health, environmental protection, public safety and many other applications."
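Kautz's point about noisy reports adding up to a reliable aggregate can be illustrated with a small calculation. The sketch below is a toy model, not the researchers' method: it assumes each report is independently correct with the same probability (60 percent here, a made-up figure) and asks how often a simple majority of reports gets the right answer.

```python
from math import comb


def majority_accuracy(p_correct, n_reports):
    """Probability that a strict majority of n independent reports is correct.

    Assumes every report is correct with probability p_correct,
    independently of the others (a simplifying assumption).
    """
    needed = n_reports // 2 + 1
    return sum(
        comb(n_reports, k) * p_correct ** k * (1 - p_correct) ** (n_reports - k)
        for k in range(needed, n_reports + 1)
    )


if __name__ == "__main__":
    # Hypothetical per-report accuracy of 60 percent.
    for n in (1, 11, 101):
        print(n, round(majority_accuracy(0.6, n), 3))
```

Under these assumptions a single report is right only 60 percent of the time, while a majority of 101 such reports is right well over 95 percent of the time. Real tweets are not independent, so the gain in practice would be smaller.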
-- Aaron Dubrow, National Science Foundation
nEmesis web interface. The top window shows a portion of the list of food venues ranked by the number of tweeted illness self-reports by patrons. The bottom window provides a map of the selected venue and allows the user to view the specific tweets that were classified as illness self-reports.
The top 20 most significant negatively and positively weighted features in the researchers' language model.