
How IBM’s Watson Went From the Future of Health Care to Sold Off for Parts

Most likely, you’re familiar with Watson from the IBM computer system’s appearance on Jeopardy! in 2011, when it beat former champions Ken Jennings and Brad Rutter. Watson’s time on Jeopardy! was fun viewing, but it was also a very savvy public debut of a product that IBM wanted to sell: Watson Health.

Watson Health was supposed to change health care in a lot of important ways, by providing insight to oncologists about care for cancer patients, delivering insight to pharmaceutical companies about drug development, helping to match patients with clinical trials, and more. It sounded revolutionary, but it never really worked. Recently, Watson Health was, essentially, sold for parts: Francisco Partners, a private equity firm, bought some of Watson’s data and analytics products for what Bloomberg News said was more than $1 billion.

On Friday’s episode of What Next: TBD, I spoke with Casey Ross, technology correspondent for Stat News, who has been covering Watson Health for years, about how Watson went from being the future of health care to being sold for scraps. Our conversation has been edited and condensed for clarity.

Lizzie O’Leary: I look at the amount of money that went into pulling this together. Acquisition after acquisition. It was billions of dollars, and it sold for a billion in the end. Is there any way to read that as anything but a failure?

Casey Ross: Financially, certainly not. They spent way more money building this than they got back. Just the acquisitions alone cost them $5 billion. That it was sold so many years later, after so much effort—7,000 employees at one point—means that this will be seen as a total failure, and that they needed to just cut their losses and move on.

Why did IBM want to get into the health data business? What problem did they think Watson would help solve?

There’s a tremendous amount of information that is collected every day on the care of hundreds of millions of people. However, there is currently no way to connect that information, to link it to an individual across all the domains in which they get care, and then to develop a holistic picture of who they are, of what their diseases are, of what the best treatments are, and how to ensure that they get the best care at the lowest possible cost. There is no connectivity right now that can do that at scale. The people in the technology sector look at it and say, “This has to be fixed, and we’re going to fix it.”

Google, Microsoft, a lot of very big companies are extremely interested in health care. What is so attractive for these big tech companies about health care?

It’s one of the biggest parts of our economy. It’s a $3 trillion business that has legacy technology infrastructure that should be embarrassing. Tech companies are drawn to audacious challenges like this, and ones where they can make—if they’re successful—a ton of money.

That’s how things are today, but the same problems have been around since the advent of digitized data. In 2012, IBM closed a deal with Memorial Sloan Kettering, one of the preeminent cancer centers in the country, to train an AI to make treatment recommendations. What was the goal? What were they trying to do?

They were really trying to democratize the expertise of Memorial Sloan Kettering’s oncologists, to make that expertise available to patients all over the world and to develop this standardized engine for providing optimal treatment recommendations, customized to a patient, in front of a doctor, thousands of miles away. It was a beautiful notion. They were trying to say, “Well, let’s make it more objective. Let’s look at all of the data, and let’s tell every physician, for this patient in front of you, this is how they should be treated.”

So you get your biopsy results, and things don’t look good, but you’re not just getting the expertise or the biases of your particular oncologist. You’re getting the wealth of thousands of oncologists distilled into an algorithm?

Yes, you are getting all of that data, across so many different physicians, crunched down into a very digestible format and recommendation that could then lead to the best treatment for that patient.

Reading your reporting, it sounds like this was incredibly important to IBM. In 2015, Ginni Rometty, who was the CEO at the time, went on Charlie Rose. She said health care was “our moonshot.” How much of IBM’s hopes were hung on this thing?

The company made a huge bet that this could be the bridge to a different kind of future for IBM, which at the time was coming off several years of quarterly revenue declines. They were trying to use Watson as a bridge to a different future where IBM wasn’t this old guard hardware company that everybody knew so well, but was operating on the cutting edge of artificial intelligence. Health care was the biggest, the buzziest use case. This was where they were going to really show the surpassing value of their technology.

To do that, IBM needed massive amounts of data on which to train Watson. It got that data through acquisitions, eventually spending some $5 billion buying a series of health data companies. What were those companies?

Truven, Phytel, Explorys, and Merge. Truven had the biggest insurance database in the nation, with 300 million covered lives; Explorys provided a clinical data set of actual electronic health records kept by health systems, representing about 50 million or so patients; Phytel added on top of that; and Merge had a huge imaging database. They had all this data, and the idea was: Expose Watson to that, and it finds patterns that physicians and anyone else can’t possibly find when looking at that data, given all the variables in it.

Except that was not the reality. One of IBM’s high-profile partnerships with MD Anderson Cancer Center in Texas fell apart. A doctor involved said that there wasn’t enough data for the program to make good recommendations, and that Watson had trouble with the complexity of patient files. The partnership was later audited and shelved. What went wrong?

If you think about it, knowing what we know now or what we’ve learned through this, the notion that you’re going to take an artificial intelligence tool, expose it to data on patients who were cared for on the Upper East Side of Manhattan, and then use that information and the insights derived from it to treat patients in China, is ridiculous. You need to have representative data. The data from New York is just not going to generalize to different kinds of patients all the way across the world.

What was happening in a clinical setting? What was happening to patients?

Our window through the reporting was talking to physicians. We got concerns from them that the recommendations that it was giving were just not relevant. Maybe it would suggest a particular kind of treatment that wasn’t available in the locality in which it was making the recommendation, or the recommendation did not at all square with the treatment protocols that were in use at the local institution, or, more commonly, especially in the U.S. and Europe, “you’re not telling me anything I don’t already know.” That was the big credibility gap for physicians. It was like, “Well duh. Yeah, I know that that’s the chemotherapy I should pursue. I know that this treatment follows that one.”

You got a hold of an internal IBM presentation from 2017 where a doctor at a hospital in Florida told the company this product was a piece of shit.

Seeing that written down in an internal document, which was circulated among IBM executives, was a shocking thing to see. It really underscored the extent of the gap between what IBM was saying in public and what was happening behind the scenes.

There were a lot of internal discussions, even a presentation, that indicated that the technology was not as far along as they’d hoped, that it wasn’t able to accomplish what they set out to accomplish in cancer care. There were probably a lot of people that believed, that truly did believe, that they would get there or that it was closer than maybe some people realized. I think the marketing got way ahead of the capabilities.

It’s very hard to listen to you and not think about Theranos , even though this is not a one-to-one parallel in any way. When you are trying to move by leaps and bounds with technology in the health care sector, it feels like a reminder that all things are not created equal, that making big leaps with people’s health is a much riskier proposition.

That underscores the central theme of this story: When you try to combine the bravado of the tech culture and the notion that you can achieve these huge audacious goals in a domain where you’re dealing with people’s lives and health and the most sacrosanct aspects of their existence and their bodies, you need to have evidence to back up that you can do what you say you can do.

Why did they keep trying to rescue this product that they seemed to know internally was failing?

I think they had so much invested in it that it really was, for them, too big to fail. It had 7,000 employees. They’d invested so much time, energy, and marketing in the success of the product that they really needed it to succeed.

Instead, it failed. But Watson’s fate certainly doesn’t mean that AI in health care is going away. Just recently, Microsoft and a large group of hospitals announced a coalition to develop AI solutions in health care. If you had to pin down a moral to the story, is it that AI in health care isn’t ready for prime time, or that IBM did it wrong?

I think it’s both of those. This will be a case study for business schools for decades. When you look at what IBM did and the strategy mistakes, the tactical errors that they made in pursuing this product, they made a lot of unforced errors here. It’s also true that the generation of technology that they had was nowhere near ready to accomplish the things that they set out to accomplish and promised that they could accomplish. I don’t think that the failure of Watson means that artificial intelligence isn’t ready to make significant improvements and changes in health care. I think it means the way that they approached it is a cautionary tale that lays out how not to do it.

Does the failure of Watson Health make you worry that it’s going to shut down other avenues for innovation? Will such a spectacular belly flop impede progress?

I don’t think so. There were so many mistakes that were made, that were learned from, that, if anything, it will facilitate faster learning and better decision making by other parties that are now poised to disrupt health care and make the progress that IBM failed to achieve. There’s a saying that pioneers often end up with arrows in their backs, and that’s what happened here. They’re an example, a spectacular example, of wrongheaded decision making and missteps that didn’t have to happen. By learning from that, I think advancement and progress and true benefits will be faster coming.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.


What Ever Happened to IBM’s Watson?

IBM’s artificial intelligence was supposed to transform industries and generate riches for the company. Neither has panned out. Now, IBM has settled on a humbler vision for Watson.


By Steve Lohr

Published July 16, 2021. Updated July 17, 2021.

A decade ago, IBM’s public confidence was unmistakable. Its Watson supercomputer had just trounced Ken Jennings, the best human “Jeopardy!” player ever, showcasing the power of artificial intelligence. This was only the beginning of a technological revolution about to sweep through society, the company pledged.

“Already,” IBM declared in an advertisement the day after the Watson victory, “we are exploring ways to apply Watson skills to the rich, varied language of health care, finance, law and academia.”

But inside the company, the star scientist behind Watson had a warning: Beware what you promise.

David Ferrucci, the scientist, explained that Watson was engineered to identify word patterns and predict correct answers for the trivia game. It was not an all-purpose answer box ready to take on the commercial world, he said. It might well fail a second-grade reading comprehension test.

His explanation got a polite hearing from business colleagues, but little more.

“It wasn’t the marketing message,” recalled Mr. Ferrucci, who left IBM the following year.

It was, however, a prescient message.

IBM poured many millions of dollars in the next few years into promoting Watson as a benevolent digital assistant that would help hospitals and farms as well as offices and factories. The potential uses, IBM suggested, were boundless, from spotting new market opportunities to tackling cancer and climate change. An IBM report called it “the future of knowing.”

IBM’s television ads included playful chats Watson had with Serena Williams and Bob Dylan. Watson was featured on “60 Minutes.” For many people, Watson became synonymous with A.I.


  • December 2020 (Revised April 2021)
  • HBS Case Collection

IBM Watson at MD Anderson Cancer Center

Format: Print | Language: English | Pages: 27

About The Author


Shane M. Greenstein

Related Work

  • Faculty Research
  • August 2021
  • IBM Watson at MD Anderson Cancer Center  By: Shane Greenstein, Mel Martin and Sarkis Agaian
  • IBM Watson at MD Anderson Cancer Center  By: Shane Greenstein and Mel Martin


MD Anderson Benches IBM Watson in Setback for Artificial Intelligence in Medicine


Virginia “Ginni” Rometty, chief executive officer of International Business Machines Corp. (IBM). Photographer: David Paul Morris/Bloomberg

It was one of those amazing “we’re living in the future” moments. In an October 2013 press release , IBM declared that MD Anderson, the cancer center that is part of the University of Texas, “is using the IBM Watson cognitive computing system for its mission to eradicate cancer.”

Well, now that future is past. The partnership between IBM and one of the world’s top cancer research institutions is falling apart. The project is on hold, MD Anderson confirms, and has been since late last year. MD Anderson is actively requesting bids from other contractors who might replace IBM in future efforts. And a scathing report from auditors at the University of Texas says the project cost MD Anderson more than $62 million and yet did not meet its goals. The report, however, states: "Results stated herein should not be interpreted as an opinion on the scientific basis or functional capabilities of the system in its current state."

“When it was appropriate to do so, the project was placed on hold,” an MD Anderson spokesperson says. “As a public institution, we decided to go out to the marketplace for competitive bids to see where the industry has progressed.”

Also on Forbes:

The disclosure comes at an uncomfortable moment for IBM. Tomorrow, the company’s chief executive, Ginni Rometty, will make a presentation to a giant health information technology conference detailing the progress Watson has made in healthcare, and announcing the launch of new products for managing medical images and making sure hospitals deliver value for the money, as well as new partnerships with healthcare systems. The end of the MD Anderson collaboration looks bad.  Even if the decision is as much a result of MD Anderson's mismanagement or red tape--which it may be--it is still a setback for a field without any big successes.

But IBM defended the MD Anderson product, known as the Oncology Expert Advisor or OEA. It says the OEA’s recommendations were accurate, agreeing with experts 90% of the time. “The OEA R&D project was a success, and likely could have been deployed had MD Anderson chosen to take it forward,” says an IBM spokesperson.

Watson, IBM’s language-based computing project, gripped the world’s imagination in 2011 when the supercomputer won an exhibition of the game show Jeopardy! against the show’s two highest-rated players. In March 2012, IBM signed a deal with Memorial Sloan Kettering Cancer Center in New York to develop a commercial product that would use the same technology to analyze the medical literature and help doctors choose treatments for cancer patients.

MD Anderson, Memorial’s longtime rival, entered the fray after this agreement was already in place. Lynda Chin, the former chair of the MD Anderson Department of Genomic Medicine and the wife of MD Anderson president Ronald DePinho, set up a collaboration with IBM to develop a separate project. Chin left MD Anderson for another job within the University of Texas system in 2015.

In a strange twist, MD Anderson would pay for the whole thing, eventually giving $39.2 million to IBM and $21.2 million to PricewaterhouseCoopers, which was hired to create a business plan around the product. According to the Washington Post , at least $50 million of the money came from Low Taek Jho, a flamboyant Malaysian financier whose business dealings are reportedly now under investigation by the U.S. Department of Justice .

Usually, companies pay research centers to do research on their products; in this case, MD Anderson paid for the privilege, although it would have apparently also owned the product. This was a “very unusual business arrangement,” says Vinay Prasad, an oncologist at Oregon Health & Science University.

According to the audit report, Chin went around normal procedures to pay for the expensive undertaking. The report notes "a consistent pattern of PwC fees set just below MD Anderson’s Board approval threshold," and its appendix seems to indicate this may have occurred with payments to IBM, too.* She also didn’t get approval from the information technology department.

It seems “very strange” that the IT department was bypassed, and “very unusual” that payments were not based on measurable deliverables, says John Halamka, the chief information officer at Beth Israel Deaconess Medical Center in Boston. He also notes that payments seem to have been made from donations that had not yet been received.

Despite all this drama, initial reports on the MD-Anderson/Watson collaboration were positive. In 2015 the Washington Post said MD Anderson doctors-in-training were amazed by the machine’s recommendations. “I was surprised,” one told the newspaper. “Even if you work all night, it would be impossible to be able to put this much information together like that.”

But inside the University of Texas, the project was apparently seen as one that missed deadlines and didn’t deliver. The audit notes that the focus of the project was changed several times, first focusing on one type of leukemia, then another, then lung cancer. The initial plan was to test the product out in pilots at two other hospitals. That never happened.

MD Anderson changed the software it uses for managing electronic medical records, switching to a system made by Epic Systems of Madison, Wis. It has blamed this new system for a $405 million drop in its net income. According to the audit report, the Watson product doesn’t work with the new Epic system, and must be revamped in order to be re-tested. The information in the MD Anderson/Watson product is also now out of date.

In September, IBM stopped supporting the product, according to the audit, which was produced last November. The Cancer Letter and the Houston Chronicle reported on the audit last week. Forbes obtained a copy of a request for proposals confirming that MD Anderson is actively looking for a company to take on IBM’s role. In a statement, MD Anderson said that it was not excluding companies that had previously worked with it from the job, implying that it might choose to go with IBM to reboot the project.

Meanwhile, IBM now sells a product it developed with Memorial Sloan Kettering. The goal, as with the MD Anderson product, is to help doctors select treatments. Without a computer, this is done with a so-called “tumor board,” a group of experts who meet weekly. IBM points to a dozen studies presented at academic meetings showing that Watson’s recommendations agree with those of tumor boards.

When IBM CEO Rometty makes her announcements tomorrow at HIMSS, the health-tech conference, the question for doctors and investors will be this: are they more like the Memorial Sloan Kettering effort, which seems to have resulted in a real product? Or are they like the mess that seems to have happened at MD Anderson?

*A previous version of this story said the payments below the board threshold were made to IBM.

Matthew Herper


IBM: Making Machine Learning and AI Accessible

Watson Warriors: An Experiential Experience

Watson Warriors is an interactive game competition designed to win the hearts and minds of data scientists from around the world.

The goal of the experience is to educate and evangelize the power of IBM's Watson AI stack.

IBM had a problem

Resonating with future data scientists.

IBM and the advantages of Watson were not being effectively evangelized and were not part of enough conversations. This was especially true when it came to emerging businesses and college campuses, both strategic targets for IBM. Watson is the gold standard, and IBM needed it to be recognized as such.

Vintage IBM magazine ad

IBM needed to shift this perception and get Watson (and therefore IBM) in the most relevant conversations among current and future tech workforces. Watson and IBM needed to be showcased at the decision makers' table.

Making IBM cool

A gamified marketing platform.

Developed in partnership with TechData and IBM, Launch dreamed up Watson Warriors, an interactive data science experience that leverages the power of Watson AI. Launch’s team of experts immersed themselves in all things Watson: the software, the hardware (Power9 ac922), the accelerator (WMLA), and data from the Weather Company - resulting in a live-game event that pits seasoned and budding data scientists against each other to effectively solve problems using Watson AI and Power9 hardware.

IBM Watson web page on a laptop

A marketing event like no other

Watson Warriors logo

With historical weather data supplied by The Weather Company, combined with a comprehensive dataset of all wildfires in the U.S. between 1992 and 2015, players compete head-to-head in a series of challenges that step through the development of a wildfire prediction model. From creating a project and connecting to a GitHub repo, to leveraging the V4 tool to train a model that can distinguish images of smoke from fog, these challenges highlight the advantages of Watson AI that make a data scientist’s work more productive.

Watson Warriors competitors

The game interface is a reactive web-hosted application that connects directly to a Watson Studio environment, so actions players take using Watson Studio tools are reflected and scored in real time and displayed on a global leaderboard for all players to see.

Data for good

At the Beta Event in Seattle, players navigated a series of nine data science challenges that leveraged a massive dataset supplied by The Weather Company. This data helped players build, improve, and deploy data science models at scale to solve macro problems, specifically the prevention of forest fires.

Watson Warriors competitors using laptops at a table

Taking the show on the road

Or maybe not….

Like most plans in 2020, the Watson Warriors roadshow was derailed by COVID-19. Despite the change of course, the team’s resilience prevailed, and the entire experience was shifted to online and virtual events within a matter of days.

Watson Warriors roadshow mockup

Watson Warriors events continued without missing a beat.

Winning hearts and minds

Watson Warriors was a huge success and showcased the power of IBM Watson in a fun and educational way. Whether in person or online, the game illustrated the advantages of Watson, making machine learning and AI concepts accessible to players of all backgrounds and experiences.

How IBM Watson Overpromised and Underdelivered on AI Health Care

After its triumph on Jeopardy!, IBM’s AI seemed poised to revolutionize medicine. Doctors are still waiting.

Conceptual photo-illustration imagining IBM’s AI Watson as a concerned doctor, with the Watson logo standing in for the doctor’s face.

In 2014, IBM opened swanky new headquarters for its artificial intelligence division, known as IBM Watson. Inside the glassy tower in lower Manhattan, IBMers can bring prospective clients and visiting journalists into the “immersion room,” which resembles a miniature planetarium. There, in the darkened space, visitors sit on swiveling stools while fancy graphics flash around the curved screens covering the walls. It’s the closest you can get, IBMers sometimes say, to being inside Watson’s electronic brain.

One dazzling 2014 demonstration of Watson’s brainpower showed off its potential to transform medicine using AI—a goal that IBM CEO Virginia Rometty often calls the company’s moon shot. In the demo, Watson took a bizarre collection of patient symptoms and came up with a list of possible diagnoses, each annotated with Watson’s confidence level and links to supporting medical literature.

Within the comfortable confines of the dome, Watson never failed to impress: Its memory banks held knowledge of every rare disease, and its processors weren’t susceptible to the kind of cognitive bias that can throw off doctors. It could crack a tough case in mere seconds. If Watson could bring that instant expertise to hospitals and clinics all around the world, it seemed possible that the AI could reduce diagnosis errors, optimize treatments, and even alleviate doctor shortages—not by replacing doctors but by helping them do their jobs faster and better.

Project: Oncology Expert Advisor

MD Anderson Cancer Center partnered with IBM Watson to create an advisory tool for oncologists. The tool used natural-language processing (NLP) to summarize patients’ electronic health records, then searched databases to provide treatment recommendations. Physicians tried out a prototype in the leukemia department, but MD Anderson canceled the project in 2016—after spending US $62 million on it.

Outside of corporate headquarters, however, IBM has discovered that its powerful technology is no match for the messy reality of today’s health care system. And in trying to apply Watson to cancer treatment, one of medicine’s biggest challenges, IBM encountered a fundamental mismatch between the way machines learn and the way doctors work.

IBM’s bold attempt to revolutionize health care began in 2011. The day after Watson thoroughly defeated two human champions in the game of Jeopardy! , IBM announced a new career path for its AI quiz-show winner: It would become an AI doctor. IBM would take the breakthrough technology it showed off on television—mainly, the ability to understand natural language—and apply it to medicine. Watson’s first commercial offerings for health care would be available in 18 to 24 months, the company promised.

In fact, the projects that IBM announced that first day did not yield commercial products. In the eight years since, IBM has trumpeted many more high-profile efforts to develop AI-powered medical technology—many of which have fizzled, and a few of which have failed spectacularly. The company spent billions on acquisitions to bolster its internal efforts, but insiders say the acquired companies haven’t yet contributed much. And the products that have emerged from IBM’s Watson Health division are nothing like the brilliant AI doctor that was once envisioned: They’re more like AI assistants that can perform certain routine tasks.

“Reputationally, I think they’re in some trouble,” says Robert Wachter , chair of the department of medicine at the University of California, San Francisco, and author of the 2015 book The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age (McGraw-Hill). In part, he says, IBM is suffering from its ambition: It was the first company to make a major push to bring AI to the clinic. But it also earned ill will and skepticism by boasting of Watson’s abilities. “They came in with marketing first, product second, and got everybody excited,” he says. “Then the rubber hit the road. This is an incredibly hard set of problems, and IBM, by being first out, has demonstrated that for everyone else.”

Since 2011, IBM Watson has announced a multitude of projects in health care.

How have they fared?

At a 2017 conference of health IT professionals, IBM CEO Rometty told the crowd that AI “is real, it’s mainstream, it’s here, and it can change almost everything about health care,” and added that it could usher in a medical “golden age.” She’s not alone in seeing an opportunity: Experts in computer science and medicine alike agree that AI has the potential to transform the health care industry. Yet so far, that potential has primarily been demonstrated in carefully controlled experiments. Only a few AI-based tools have been approved by regulators for use in real hospitals and doctors’ offices. Those pioneering products work mostly in the visual realm, using computer vision to analyze images like X-rays and retina scans. (IBM does not have a product that analyzes medical images, though it has an active research project in that area.)

Looking beyond images, however, even today’s best AI struggles to make sense of complex medical information. And encoding a human doctor’s expertise in software turns out to be a very tricky proposition. IBM has learned these painful lessons in the marketplace, as the world watched. While the company isn’t giving up on its moon shot, its launch failures have shown technologists and physicians alike just how difficult it is to build an AI doctor.

The Jeopardy! victory in 2011 showed Watson’s remarkable skill with natural-language processing (NLP). To play the game, it had to parse complicated clues full of wordplay, search massive textual databases to find possible answers, and determine the best one. Watson wasn’t a glorified search engine; it didn’t just return documents based on keywords. Instead it employed hundreds of algorithms to map the “entities” in a sentence and understand the relationships among them. It used this skill to make sense of both the Jeopardy! clue and the millions of text sources it mined.

Project: Cognitive Coaching System

The sportswear company Under Armour teamed up with Watson Health to create a “personal health trainer and fitness consultant.” Using data from Under Armour’s activity-tracker app, the Cognitive Coach was intended to provide customized training programs based on a user’s habits, as well as advice based on analysis of outcomes achieved by similar people. The coach never launched, and Under Armour is no longer working with IBM Watson.

“It almost seemed that Watson could understand the meaning of language, rather than just recognizing patterns of words,” says Martin Kohn , who was the chief medical scientist for IBM Research at the time of the Jeopardy! match. “It was an order of magnitude more powerful than what existed.” What’s more, Watson developed this ability on its own, via machine learning. The IBM researchers trained Watson by giving it thousands of Jeopardy! clues and responses that were labeled as correct or incorrect. In this complex data set, the AI discovered patterns and made a model for how to get from an input (a clue) to an output (a correct response).
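To make that training setup concrete, here is a minimal, hypothetical sketch of supervised learning over labeled clue-and-candidate pairs, in the spirit of what is described above. It uses an off-the-shelf scikit-learn text classifier and a handful of invented examples; IBM's actual system combined hundreds of specialized algorithms and is not reproduced here.

# Minimal illustrative sketch (not IBM's pipeline): learn to score candidate
# responses to clues from examples labeled correct (1) or incorrect (0).
# The clues, candidates, and labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = [
    ("This 'Father of Geometry' wrote the Elements", "Euclid", 1),
    ("This 'Father of Geometry' wrote the Elements", "Pythagoras", 0),
    ("This playwright of 'Hamlet' was born in Stratford-upon-Avon", "William Shakespeare", 1),
    ("This playwright of 'Hamlet' was born in Stratford-upon-Avon", "Christopher Marlowe", 0),
]

# Represent each (clue, candidate) pair as one text string.
texts = [f"{clue} [CANDIDATE] {candidate}" for clue, candidate, _ in examples]
labels = [label for _, _, label in examples]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# At answer time, score each candidate and pick the most confident one.
clue = "This 'Father of Geometry' wrote the Elements"
for candidate in ["Euclid", "Archimedes"]:
    score = model.predict_proba([f"{clue} [CANDIDATE] {candidate}"])[0][1]
    print(candidate, round(score, 2))

In practice the feature engineering, candidate generation, and evidence scoring were far richer than this, but the basic loop of labeled examples in, scoring model out is the same idea.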

Long before Watson starred on the Jeopardy! stage, IBM had considered its possibilities for health care. Medicine, with its reams of patient data, seemed an obvious fit, particularly as hospitals and doctors were switching over to electronic health records. While some of that data can be easily digested by machines, such as lab results and vital-sign measurements, the bulk of it is “unstructured” information, such as doctor’s notes and hospital discharge summaries. That narrative text accounts for about 80 percent of a typical patient’s record—and it’s a stew of jargon, shorthand, and subjective statements.

Kohn, who came to IBM with a medical degree from Harvard University and an engineering degree from MIT, was excited to help Watson tackle the language of medicine. “It seemed like Watson had the potential to overcome those complexities,” he says. By turning its mighty NLP abilities to medicine, the theory went, Watson could read patients’ health records as well as the entire corpus of medical literature: textbooks, peer-reviewed journal articles, lists of approved drugs, and so on. With access to all this data, Watson might become a superdoctor, discerning patterns that no human could ever spot.

“Doctors go to work every day—especially the people on the front lines, the primary care doctors—with the understanding that they cannot possibly know everything they need to know in order to practice the best, most efficient, most effective medicine possible,” says Herbert Chase , a professor of medicine and biomedical informatics at Columbia University who collaborated with IBM in its first health care efforts. But Watson, he says, could keep up—and if turned into a tool for “clinical decision support,” it could enable doctors to keep up, too. In lieu of a Jeopardy! clue, a physician could give Watson a patient’s case history and ask for a diagnosis or optimal treatment plan.

Chase worked with IBM researchers on the prototype for a diagnostic tool, the thing that dazzled visitors in the Watson immersion room. But IBM chose not to commercialize it, and Chase parted ways with IBM in 2014. He’s disappointed with Watson’s slow progress in medicine since then. “I’m not aware of any spectacular home runs,” he says.

He’s one of many early Watson enthusiasts who are now dismayed. Eliot Siegel , a professor of radiology and vice chair of information systems at the University of Maryland, also collaborated with IBM on the diagnostic research. While he thinks AI-enabled tools will be indispensable to doctors within a decade, he’s not confident that IBM will build them. “I don’t think they’re on the cutting edge of AI,” says Siegel. “The most exciting things are going on at Google , Apple, and Amazon.”

As for Kohn, who left IBM in 2014, he says the company fell into a common trap: “Merely proving that you have powerful technology is not sufficient,” he says. “Prove to me that it will actually do something useful—that it will make my life better, and my patients’ lives better.” Kohn says he’s been waiting to see peer-reviewed papers in the medical journals demonstrating that AI can improve patient outcomes and save health systems money. “To date there’s very little in the way of such publications,” he says, “and none of consequence for Watson.”

AI’s First Foray Into Health Care

Doctors are a conservative bunch—for good reason—and slow to adopt new technologies. But in some areas of health care, medical professionals are beginning to see artificially intelligent systems as reliable and helpful. Here are a few early steps toward AI medicine.

Robotic Surgery: Currently used only for routine steps in simple procedures like laser eye surgery and hair transplants.
Image Analysis: Experts are just beginning to use automated systems to help them examine X-rays, retina scans, and other images.
Genetic Analysis: With genome scans becoming a routine part of medicine, AI tools that quickly draw insights from the data are becoming necessary.
Pathology: Experimental systems have proved adept at analyzing biopsy samples, but aren’t yet approved for clinical use.
Clinical-Decision Support: Hospitals are introducing tools for applications like predicting septic shock, but they haven’t yet proved their value.
Virtual Nursing: Rudimentary systems can check on patients between office visits and provide automatic alerts to physicians.
Medical Administration: Companies are rushing to offer AI-enabled tools that can increase efficiency in tasks like billing and insurance claims.
Mental Health: Researchers are exploring such applications as monitoring depression by mining mobile phone and social media data.

In trying to bring AI into the clinic, IBM was taking on an enormous technical challenge. But having fallen behind tech giants like Google and Apple in many other computing realms, IBM needed something big to stay relevant. In 2014, the company invested US $1 billion in its Watson unit , which was developing tech for multiple business sectors. In 2015, IBM announced the formation of a special Watson Health division , and by mid-2016 Watson Health had acquired four health-data companies for a total cost of about $4 billion. It seemed that IBM had the technology, the resources, and the commitment necessary to make AI work in health care.

Today, IBM’s leaders talk about the Watson Health effort as “a journey” down a road with many twists and turns. “It’s a difficult task to inject AI into health care, and it’s a challenge. But we’re doing it,” says John E. Kelly III, IBM senior vice president for cognitive solutions and IBM research. Kelly has guided the Watson effort since the Jeopardy! days, and in late 2018 he also assumed direct oversight of Watson Health. He says the company has pivoted when it needs to: “We’re continuing to learn, so our offerings change as we learn.”

Project: Sugar.IQ

Medtronic and Watson Health began working together in 2015 on an app for personalized diabetes management. The app works with data from Medtronic’s continuous glucose monitor, and helps diabetes patients track how their medications, food, and lifestyle choices affect their glucose levels. The FDA-approved app launched in 2018.

The diagnostic tool, for example, wasn’t brought to market because the business case wasn’t there, says Ajay Royyuru, IBM’s vice president of health care and life sciences research. “Diagnosis is not the place to go,” he says. “That’s something the experts do pretty well. It’s a hard task, and no matter how well you do it with AI, it’s not going to displace the expert practitioner.” (Not everyone agrees with Royyuru: A 2015 report on diagnostic errors from the National Academies of Sciences, Engineering, and Medicine stated that improving diagnoses represents a “moral, professional, and public health imperative.”)

In an attempt to find the business case for medical AI, IBM pursued a dizzying number of projects targeted to all the different players in the health care system: physicians, administrative staff, insurers, and patients. What ties all the threads together, says Kelly, is an effort to provide “decision support using AI [that analyzes] massive data sets.” IBM’s most publicized project focused on oncology, where it hoped to deploy Watson’s “cognitive” abilities to turn big data into personalized cancer treatments for patients.

In many attempted applications, Watson’s NLP struggled to make sense of medical text—as have many other AI systems. “We’re doing incredibly better with NLP than we were five years ago, yet we’re still incredibly worse than humans,” says Yoshua Bengio , a professor of computer science at the University of Montreal and a leading AI researcher. In medical text documents, Bengio says, AI systems can’t understand ambiguity and don’t pick up on subtle clues that a human doctor would notice. Bengio says current NLP technology can help the health care system: “It doesn’t have to have full understanding to do something incredibly useful,” he says. But no AI built so far can match a human doctor’s comprehension and insight. “No, we’re not there,” he says.

IBM’s work on cancer serves as the prime example of the challenges the company encountered. “I don’t think anybody had any idea it would take this long or be this complicated,” says Mark Kris , a lung cancer specialist at Memorial Sloan Kettering Cancer Center, in New York City, who has led his institution’s collaboration with IBM Watson since 2012.

The effort to improve cancer care had two main tracks. Kris and other preeminent physicians at Sloan Kettering trained an AI system that became the product Watson for Oncology in 2015. Across the country, preeminent physicians at the University of Texas MD Anderson Cancer Center, in Houston, collaborated with IBM to create a different tool called Oncology Expert Advisor. MD Anderson got as far as testing the tool in the leukemia department, but it never became a commercial product.

Both efforts have received strong criticism. One excoriating article about Watson for Oncology alleged that it provided useless and sometimes dangerous recommendations (IBM contests these allegations ). More broadly, Kris says he has often heard the critique that the product isn’t “real AI.” And the MD Anderson project failed dramatically: A 2016 audit by the University of Texas found that the cancer center spent $62 million on the project before canceling it. A deeper look at these two projects reveals a fundamental mismatch between the promise of machine learning and the reality of medical care—between “real AI” and the requirements of a functional product for today’s doctors.

Watson for Oncology was supposed to learn by ingesting the vast medical literature on cancer and the health records of real cancer patients. The hope was that Watson, with its mighty computing power, would examine hundreds of variables in these records—including demographics, tumor characteristics, treatments, and outcomes—and discover patterns invisible to humans. It would also keep up to date with the bevy of journal articles about cancer treatments being published every day. To Sloan Kettering’s oncologists, it sounded like a potential breakthrough in cancer care. To IBM, it sounded like a great product. “I don’t think anybody knew what we were in for,” says Kris.

Watson learned fairly quickly how to scan articles about clinical studies and determine the basic outcomes. But it proved impossible to teach Watson to read the articles the way a doctor would. “The information that physicians extract from an article, that they use to change their care, may not be the major point of the study,” Kris says. Watson’s thinking is based on statistics, so all it can do is gather statistics about main outcomes, explains Kris. “But doctors don’t work that way.”

In 2018, for example, the FDA approved a new “tissue agnostic” cancer drug that is effective against all tumors that exhibit a specific genetic mutation. The drug was fast-tracked based on dramatic results in just 55 patients, of whom four had lung cancer. “We’re now saying that every patient with lung cancer should be tested for this gene,” Kris says. “All the prior guidelines have been thrown out, based on four patients.” But Watson won’t change its conclusions based on just four patients. To solve this problem, the Sloan Kettering experts created “synthetic cases” that Watson could learn from, essentially make-believe patients with certain demographic profiles and cancer characteristics. “I believe in analytics; I believe it can uncover things,” says Kris. “But when it comes to cancer, it really doesn’t work.”
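For illustration only, here is a rough sketch of what a "synthetic case" could look like in code: a made-up patient record whose demographics and tumor characteristics are generated at random and paired with the treatment an expert rule would recommend. The field names, values, and the rule itself are invented for this example, not Sloan Kettering's actual process.

# Illustrative sketch of "synthetic cases": make-believe patient records with
# chosen demographics and tumor characteristics, each paired with the treatment
# the experts want the system to learn. All fields and the rule are invented.
import random

GUIDELINE = {
    # Hypothetical rule of the kind experts might encode: tumors with the
    # target mutation get the targeted drug, others get standard chemotherapy.
    True: "targeted_therapy_X",
    False: "standard_chemotherapy",
}

def make_synthetic_case(rng: random.Random) -> dict:
    has_mutation = rng.random() < 0.3
    return {
        "age": rng.randint(35, 85),
        "sex": rng.choice(["F", "M"]),
        "cancer_type": "lung",
        "stage": rng.choice(["II", "III", "IV"]),
        "target_mutation_present": has_mutation,
        # The "label" the model is supposed to learn to reproduce:
        "recommended_treatment": GUIDELINE[has_mutation],
    }

rng = random.Random(42)
synthetic_cases = [make_synthetic_case(rng) for _ in range(1000)]
print(synthetic_cases[0])

The appeal of synthetic cases is that experts can encode a new guideline immediately, without waiting for real patients; the obvious limitation is that the system then learns the rule the experts wrote down rather than discovering anything new.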

Do You Agree?

Several studies have compared Watson for Oncology’s cancer treatment recommendations to those of hospital oncologists. The concordance percentages indicate how often Watson’s advice matched the experts’ treatment plans.

The realization that Watson couldn’t independently extract insights from breaking news in the medical literature was just the first strike. Researchers also found that it couldn’t mine information from patients’ electronic health records as they’d expected.

At MD Anderson, researchers put Watson to work on leukemia patients’ health records—and quickly discovered how tough those records were to work with. Yes, Watson had phenomenal NLP skills. But in these records, data might be missing, written down in an ambiguous way, or out of chronological order. In a 2018 paper published in The Oncologist , the team reported that its Watson-powered Oncology Expert Advisor had variable success in extracting information from text documents in medical records. It had accuracy scores ranging from 90 to 96 percent when dealing with clear concepts like diagnosis, but scores of only 63 to 65 percent for time-dependent information like therapy timelines.

In a final blow to the dream of an AI superdoctor, researchers realized that Watson can’t compare a new patient with the universe of cancer patients who have come before to discover hidden patterns. Both Sloan Kettering and MD Anderson hoped that the AI would mimic the abilities of their expert oncologists, who draw on their experience of patients, treatments, and outcomes when they devise a strategy for a new patient. A machine that could do the same type of population analysis—more rigorously, and using thousands more patients—would be hugely powerful.

But the health care system’s current standards don’t encourage such real-world learning. MD Anderson’s Oncology Expert Advisor issued only “evidence based” recommendations linked to official medical guidelines and the outcomes of studies published in the medical literature. If an AI system were to base its advice on patterns it discovered in medical records—for example, that a certain type of patient does better on a certain drug—its recommendations wouldn’t be considered evidence based, the gold standard in medicine. Without the strict controls of a scientific study, such a finding would be considered only correlation, not causation.

Kohn, formerly of IBM, and many others think the standards of health care must change in order for AI to realize its full potential and transform medicine. “The gold standard is not really gold,” Kohn says. AI systems could consider many more factors than will ever be represented in a clinical trial, and could sort patients into many more categories to provide “truly personalized care,” Kohn says. Infrastructure must change too: Health care institutions must agree to share their proprietary and privacy-controlled data so AI systems can learn from millions of patients followed over many years.

According to anecdotal reports , IBM has had trouble finding buyers for its Watson oncology product in the United States. Some oncologists say they trust their own judgment and don’t need Watson telling them what to do. Others say it suggests only standard treatments that they’re well aware of. But Kris says some physicians are finding it useful as an instant second opinion that they can share with nervous patients. “As imperfect as it is, and limited as it is, it’s very helpful,” Kris says. IBM sales reps have had more luck outside the United States, with hospitals in India, South Korea, Thailand, and beyond adopting the technology. Many of these hospitals proudly use the IBM Watson brand in their marketing, telling patients that they’ll be getting AI-powered cancer care.

In the past few years, these hospitals have begun publishing studies about their experiences with Watson for Oncology. In India, physicians at the Manipal Comprehensive Cancer Center evaluated Watson on 638 breast cancer cases and found a 73 percent concordance rate in treatment recommendations; its score was brought down by poor performance on metastatic breast cancer. Watson fared worse at Gachon University Gil Medical Center, in South Korea, where its top recommendations for 656 colon cancer patients matched those of the experts only 49 percent of the time. Doctors reported that Watson did poorly with older patients, didn’t suggest certain standard drugs, and had a bug that caused it to recommend surveillance instead of aggressive treatment for certain patients with metastatic cancer.
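The concordance figures in these studies are essentially the fraction of cases in which Watson's top recommendation matched the local experts' treatment plan. A toy sketch of that calculation is below; the case data are invented placeholders.

# Minimal sketch of a concordance rate: the share of cases where the system's
# recommendation matches the experts' plan. The cases below are made up.
cases = [
    {"watson": "FOLFOX", "tumor_board": "FOLFOX"},
    {"watson": "surveillance", "tumor_board": "FOLFIRI"},
    {"watson": "capecitabine", "tumor_board": "capecitabine"},
]

matches = sum(1 for c in cases if c["watson"] == c["tumor_board"])
concordance = matches / len(cases)
print(f"Concordance: {concordance:.0%}")  # 67% for these three invented cases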

These studies aimed to determine whether Watson for Oncology’s technology performs as expected. But no study has yet shown that it benefits patients. Wachter of UCSF says that’s a growing problem for the company: “IBM knew that the win on Jeopardy! and the partnership with Memorial Sloan Kettering would get them in the door. But they needed to show, fairly quickly, an impact on hard outcomes.” Wachter says IBM must convince hospitals that the system is worth the financial investment. “It’s really important that they come out with successes,” he says. “Success is an article in the New England Journal of Medicine showing that when we used Watson, patients did better or we saved money.” Wachter is still waiting to see such articles appear.

Sloan Kettering’s Kris isn’t discouraged; he says the technology will only get better. “As a tool, Watson has extraordinary potential,” he says. “I do hope that the people who have the brainpower and computer power stick with it. It’s a long haul, but it’s worth it.”

Some success stories are emerging from Watson Health—in certain narrow and controlled applications, Watson seems to be adding value. Take, for example, the Watson for Genomics product, which was developed in partnership with the University of North Carolina, Yale University, and other institutions. The tool is used by genetics labs that generate reports for practicing oncologists: Watson takes in the file that lists a patient’s genetic mutations, and in just a few minutes it can generate a report that describes all the relevant drugs and clinical trials. “We enable the labs to scale,” says Vanessa Michelini , an IBM Distinguished Engineer who led the development and 2016 launch of the product.

Watson has a relatively easy time with genetic information, which is presented in structured files and has no ambiguity—either a mutation is there, or it’s not. The tool doesn’t employ NLP to mine medical records, instead using it only to search textbooks, journal articles, drug approvals, and clinical trial announcements, where it looks for very specific statements.
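As a rough illustration of that kind of structured matching, the sketch below looks up a patient's listed mutations in a small curated table of drugs and trials and assembles a report. The table entries, variant names, and trial identifier are placeholders, not IBM's actual knowledge base or interface.

# Illustrative sketch of the genomics use case: match a structured list of a
# patient's mutations against a curated knowledge base of drugs and trials.
# The knowledge-base contents and trial ID below are invented examples.
from typing import Dict, List

KNOWLEDGE_BASE: Dict[str, Dict[str, List[str]]] = {
    "EGFR L858R": {"drugs": ["osimertinib", "erlotinib"], "trials": ["NCT-EXAMPLE-001"]},
    "BRAF V600E": {"drugs": ["dabrafenib + trametinib"], "trials": []},
}

def report(patient_mutations: List[str]) -> List[str]:
    lines = []
    for mutation in patient_mutations:
        entry = KNOWLEDGE_BASE.get(mutation)
        if entry is None:
            lines.append(f"{mutation}: no known actionable therapy")
        else:
            lines.append(f"{mutation}: drugs={entry['drugs']}, trials={entry['trials']}")
    return lines

print("\n".join(report(["EGFR L858R", "TP53 R175H"])))

Because the input is an unambiguous list of variants, the hard NLP problems that plagued the oncology products largely disappear; the work shifts to keeping the curated table current, which is what the genetics labs pay for.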

IBM’s partners at the University of North Carolina published the first paper about the effectiveness of Watson for Genomics in 2017. For 32 percent of cancer patients enrolled in that study, Watson spotted potentially important mutations not identified by a human review, which made these patients good candidates for a new drug or a just-opened clinical trial. But there’s no indication, as of yet, that Watson for Genomics leads to better outcomes.

The U.S. Department of Veterans Affairs uses Watson for Genomics reports in more than 70 hospitals nationwide, says Michael Kelley , the VA’s national program director for oncology. The VA first tried the system on lung cancer and now uses it for all solid tumors. “I do think it improves patient care,” Kelley says. When VA oncologists are deciding on a treatment plan, “it is a source of information they can bring to the discussion,” he says. But Kelley says he doesn’t think of Watson as a robot doctor. “I tend to think of it as a robot who is a master medical librarian.”

Most doctors would probably be delighted to have an AI librarian at their beck and call—and if that’s what IBM had originally promised them, they might not be so disappointed today. The Watson Health story is a cautionary tale of hubris and hype. Everyone likes ambition, everyone likes moon shots, but nobody wants to climb into a rocket that doesn’t work.

So Far, Few Successes

IBM began its effort to bring Watson into the health care industry in 2011. Since then, the company has made nearly 50 announcements about partnerships that were intended to develop new AI-enabled tools for medicine. Some collaborations worked on tools for doctors and institutions; some worked on consumer apps. While many of these alliances have not yet led to commercial products, IBM says the research efforts have been valuable, and that many relationships are ongoing. Here’s a representative sample of projects.

[Table: a year-by-year list of IBM Watson health care partnerships, by date, project, and current status. The partner names did not survive extraction; most entries are marked “No tool in use” or “No app available.” Named projects include Watson for Oncology, Watson for Clinical Trial Matching, the Sugar.IQ app, Watson for Genomics, and Watson for Genomics from Quest Diagnostics.]

This article appears in the April 2019 print issue as “IBM Watson, Heal Thyself.”




Ethics of Medical AI: The Case of Watson for Oncology

Danish translation forthcoming in: 8 Cases i Medicinsk Etik

25 pages. Posted: 8 Aug 2019. Last revised: 5 Dec 2019.

Ezio Di Nucci

University of Copenhagen

Rasmus Thybo Jensen

Aaro Tupasela

University of Copenhagen - Faculty of Law

Date Written: August 5, 2019

Let’s be honest: one of the big motivators for studying medicine is its job prospects, namely plenty of well-paid, safe jobs. That is why medical artificial intelligence (medical AI) should scare you: it is coming after your jobs. In this chapter we will discuss IBM Watson for Oncology (from now on just Watson for short) as a case study in the emergence of medical AI. We will analyse the most interesting ethical and philosophical questions raised by medical AI in general and by Watson in particular. Watson is “a decision-support system that ranks cancer therapeutic options” based on machine learning algorithms, which are computer systems that are, according to cognitive scientists, able to “figure it out on their own, by making inferences from data”. So you can double down on your fear already, dear medics: those machines are coming after your jobs, and they are also coming after the jobs of their own programmers; that’s how greedy they are. They clearly won’t stop until they have taken over the whole world, which is in fact what technophobes and their extremist friends, the techno-apocalyptists, are afraid of. How does Watson work? Based primarily on its access to up-to-date medical research publications and patients’ health records, Watson’s algorithm, developed by IBM engineers together with oncologists from the Memorial Sloan Kettering Cancer Center in New York, generates cancer treatment recommendations that oncologists can review and use in consultation with patients.
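To make the chapter’s framing concrete, here is a minimal, purely illustrative sketch of what “a decision-support system that ranks cancer therapeutic options” can look like in code. Nothing below comes from IBM or from the authors: the class, the scores, and the weighting are invented stand-ins for a proprietary, machine-learned ranker.

from dataclasses import dataclass

# Hypothetical sketch only; Watson's real pipeline is proprietary and learned,
# not a fixed weighted sum. Names and numbers here are invented for illustration.
@dataclass
class TherapyOption:
    name: str
    evidence_score: float  # support in the published literature (assumed input)
    fit_score: float       # match to this patient's record (assumed input)

def rank_therapies(options: list[TherapyOption]) -> list[TherapyOption]:
    """Return options ordered from most to least recommended."""
    return sorted(options,
                  key=lambda o: 0.6 * o.evidence_score + 0.4 * o.fit_score,
                  reverse=True)

if __name__ == "__main__":
    candidates = [
        TherapyOption("chemotherapy regimen A", evidence_score=0.8, fit_score=0.6),
        TherapyOption("targeted therapy B", evidence_score=0.7, fit_score=0.9),
    ]
    for option in rank_therapies(candidates):
        print(option.name)  # the oncologist reviews this ranked list with the patient

The point of the sketch is only that the system’s output is a ranked list that an oncologist reviews in consultation with the patient, not a decision the machine takes on its own.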

Keywords: Medical AI; Watson for Oncology; Bioethics

Suggested Citation

Ezio Di Nucci (Contact Author)

University of Copenhagen (email)

Nørregade 10, Copenhagen DK-1165, Denmark

University of Copenhagen, Faculty of Law (email)

Studiestraede 6, Copenhagen DK-1455, Denmark


Researchers use large language models to help robots navigate


The method uses language-based inputs instead of costly visual data to direct a robot through a multistep navigation task.

Someday, you may want your home robot to carry a load of dirty clothes downstairs and deposit them in the washing machine in the far-left corner of the basement. The robot will need to combine your instructions with its visual observations to determine the steps it should take to complete this task.

For an AI agent, this is easier said than done. Current approaches often utilize multiple hand-crafted machine-learning models to tackle different parts of the task, which require a great deal of human effort and expertise to build. These methods, which use visual representations to directly make navigation decisions, demand massive amounts of visual data for training, which are often hard to come by.

To overcome these challenges, researchers from MIT and the MIT-IBM Watson AI Lab devised a navigation method that converts visual representations into pieces of language, which are then fed into one large language model that achieves all parts of the multistep navigation task.

Rather than encoding visual features from images of a robot’s surroundings as visual representations, which is computationally intensive, their method creates text captions that describe the robot’s point-of-view. A large language model uses the captions to predict the actions a robot should take to fulfill a user’s language-based instructions.

Because their method utilizes purely language-based representations, they can use a large language model to efficiently generate a huge amount of synthetic training data.

While this approach does not outperform techniques that use visual features, it performs well in situations that lack enough visual data for training. The researchers found that combining their language-based inputs with visual signals leads to better navigation performance.

“By purely using language as the perceptual representation, ours is a more straightforward approach. Since all the inputs can be encoded as language, we can generate a human-understandable trajectory,” says Bowen Pan, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this approach.

Pan’s co-authors include his advisor, Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing, MIT director of the MIT-IBM Watson AI Lab, and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL); Philip Isola, an associate professor of EECS and a member of CSAIL; senior author Yoon Kim, an assistant professor of EECS and a member of CSAIL; and others at the MIT-IBM Watson AI Lab and Dartmouth College. The research will be presented at the Conference of the North American Chapter of the Association for Computational Linguistics.

Solving a vision problem with language

Since large language models are the most powerful machine-learning models available, the researchers sought to incorporate them into the complex task known as vision-and-language navigation, Pan says.

But such models take text-based inputs and can’t process visual data from a robot’s camera. So, the team needed to find a way to use language instead.

Their technique utilizes a simple captioning model to obtain text descriptions of a robot’s visual observations. These captions are combined with language-based instructions and fed into a large language model, which decides what navigation step the robot should take next.

The large language model outputs a caption of the scene the robot should see after completing that step. This is used to update the trajectory history so the robot can keep track of where it has been.

The model repeats these processes to generate a trajectory that guides the robot to its goal, one step at a time.

To streamline the process, the researchers designed templates so observation information is presented to the model in a standard form — as a series of choices the robot can make based on its surroundings.

For instance, a caption might say “to your 30-degree left is a door with a potted plant beside it, to your back is a small office with a desk and a computer,” etc. The model chooses whether the robot should move toward the door or the office.

“One of the biggest challenges was figuring out how to encode this kind of information into language in a proper way to make the agent understand what the task is and how they should respond,” Pan says.
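The loop described above (caption the current view, present it as a set of templated choices, let the language model pick the next action, and fold the result back into a text-only trajectory history) can be sketched in a few lines. This is a minimal reconstruction from the article, not the authors’ code; the prompt wording, the stopping rule, and the callable signatures are assumptions.

from typing import Callable

def navigate(instruction: str,
             observe: Callable[[], str],      # captioning model: current view -> text
             choose: Callable[[str], str],    # stand-in for the large language model
             execute: Callable[[str], None],  # moves the robot or simulator
             max_steps: int = 20) -> list[str]:
    history: list[str] = []                   # the trajectory is kept entirely as text
    for _ in range(max_steps):
        scene = observe()
        # Template: present the observation as a set of choices the agent can make.
        prompt = (
            f"Instruction: {instruction}\n"
            f"Trajectory so far: {' | '.join(history) if history else '(start)'}\n"
            f"You see: {scene}\n"
            "Reply with the next action, or 'stop' if the goal is reached:"
        )
        action = choose(prompt).strip()
        if action.lower() == "stop":
            break
        execute(action)
        # The paper's model also predicts the caption of the scene expected after
        # the step; appending the action (or that predicted caption) is what keeps
        # the trajectory history up to date for the next prompt.
        history.append(action)
    return history

Because everything that flows through the loop is text, the same function could drive a simulator or a physical robot, as long as something supplies captions and executes the chosen actions.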

Advantages of language

When they tested this approach, they found that while it could not outperform vision-based techniques, it offered several advantages.

First, because text requires fewer computational resources to synthesize than complex image data, their method can be used to rapidly generate synthetic training data. In one test, they generated 10,000 synthetic trajectories based on 10 real-world, visual trajectories.
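That augmentation step can be sketched the same way. The helper below is hypothetical and only illustrates why working in text makes data generation cheap; the prompt wording is an assumption, and the default counts (10 real trajectories, 1,000 variants each) are chosen to match the 10,000 figure reported above.

from typing import Callable

def synthesize(real_trajectories: list[str],
               llm: Callable[[str], str],
               variants_per_trajectory: int = 1000) -> list[str]:
    """Ask a language model to write textual variations of real trajectories."""
    synthetic: list[str] = []
    for trajectory in real_trajectories:
        for _ in range(variants_per_trajectory):
            prompt = (
                "Below is a robot navigation trajectory written as text "
                "(scene captions and actions).\n"
                f"{trajectory}\n"
                "Write a plausible variation with different rooms, objects, and "
                "phrasing, keeping the same step-by-step format:"
            )
            synthetic.append(llm(prompt))
    return synthetic  # cheap to produce because no images ever need to be rendered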

The technique can also bridge the gap that can prevent an agent trained with a simulated environment from performing well in the real world. This gap often occurs because computer-generated images can appear quite different from real-world scenes due to elements like lighting or color. But language that describes a synthetic versus a real image would be much harder to tell apart, Pan says.

Also, the representations their model uses are easier for a human to understand because they are written in natural language.

“If the agent fails to reach its goal, we can more easily determine where it failed and why it failed. Maybe the history information is not clear enough or the observation ignores some important details,” Pan says.

In addition, their method could be applied more easily to varied tasks and environments because it uses only one type of input. As long as data can be encoded as language, they can use the same model without making any modifications.

But one disadvantage is that their method naturally loses some information that would be captured by vision-based models, such as depth information.

However, the researchers were surprised to see that combining language-based representations with vision-based methods improves an agent’s ability to navigate.

“Maybe this means that language can capture some higher-level information that cannot be captured with pure vision features,” he says.

This is one area the researchers want to continue exploring. They also want to develop a navigation-oriented captioner that could boost the method’s performance. In addition, they want to probe the ability of large language models to exhibit spatial awareness and see how this could aid language-based navigation.

This research is funded, in part, by the MIT-IBM Watson AI Lab.


