New Research Reveals Uptake of AI-powered Messaging in Healthcare Settings

Clinicians with access to AI-generated drafts used them about 20 percent of the time, and doing so cut response times by roughly seven percent.
A new study from NYU Stern, NYU Tandon, and NYU Langone Health, published in npj Digital Medicine, offers one of the first data-driven looks at how generative AI might help healthcare providers manage message overload, and why many are hesitant to adopt the technology.
The research team included NYU Stern Professor Batia M. Wiesenfeld, as well as NYU Langone Health’s Adam C. Szerencsy, William R. Small, Vincent Major, Safiya Richardson, Antoinette Schoenthaler, and Devin Mann, and was led by NYU Tandon Professor Oded Nov and NYU Langone’s Soumik Mandal.
Over a ten-month period from October 2023 through August 2024, the team observed more than 55,000 patient messages sent to healthcare providers through a secure online patient portal. An embedded generative AI tool automatically produced a draft reply for each incoming patient message; providers could choose to start with the draft, begin a reply from scratch, or use their usual reply interface.
According to the published results, providers chose to “Start with Draft” in 19.4 percent of cases where a draft was shown. Adoption rose modestly over the course of the study as the system’s prompting improved. Using a draft shaved roughly 7 percent off response times (a median of 331 seconds versus 355 seconds when drafting from scratch), but in many cases the time saved was offset by time spent reviewing, editing, or ignoring drafts.
By analyzing tens of thousands of messages, the researchers found that certain qualities made drafts more likely to be used. Shorter, more readable, and more informative drafts tended to be preferred. Tone also mattered: messages that sounded slightly more human and empathetic were more likely to be adopted, though the ideal balance differed by role. Physicians leaned toward concise, neutral text, while support staff were more receptive to messages with a warmer tone. These preferences hint at a future where AI systems could adapt their writing style based on the user’s role or communication history.
Still, the study shows how hesitant healthcare providers remain to rely on AI-generated language at all. The authors suggest several possible reasons, including suboptimal alignment with clinical workflows and the cognitive cost of reviewing a constant stream of AI output, much of which may be irrelevant. Simply generating text for every message, they argue, can create clutter that undermines the very efficiency such tools are meant to provide.
The researchers see ample opportunity ahead. Future systems may need to learn each user’s style, selectively generate drafts only for messages likely to benefit, and continuously adapt prompt strategies.
"When people and AI have to work together in fields like healthcare, we are discovering many unintended consequences of trying to substitute AI for human work. One paradox is that GenAI tends to make easy tasks easier but hard tasks become much more difficult," explains Wiesenfeld. "In addition, organizations may need to redesign workflows and reallocate tasks across roles when leveraging GenAI to unlock greater value for its customers."
____
This piece was adapted from an article that originally appeared on NYU Tandon's website.