Knowledge workers who used generative AI to help complete a series of realistic consulting tasks were significantly more productive and produced higher-quality results than a control group, according to a recently released study from Harvard Business School.
One group of 385 participating consultants was given a set of 18 consulting tasks that were deemed to be within the frontier of artificial intelligence capabilities.
Participants with access to the large language model GPT-4 completed 12.2% more tasks on average and completed tasks 25.1% more quickly than those without access to AI, the study, conducted in tandem with Boston Consulting Group, found.
Additionally, the responses produced by the consultants with access to generative AI were rated more than 40% higher in quality than those of the control group.
“Firstly, our results lend support to the optimism about AI capabilities for important high-end knowledge work tasks such as fast idea generation, writing, persuasion, strategic analysis, and creative product innovation,” the Harvard Business School study said.
Meanwhile, the study found that on a task deemed to be outside the frontier of AI capabilities, participants using AI were less likely to produce correct solutions than those in the control group.
The paper about the study is labeled a “working paper,” which means it is in draft form.
Implications for legal industry
The study’s release comes amid growing interest in generative AI among knowledge workers in legal departments and at law firms in the U.S. and abroad.
For example, more than half of in-house legal professionals surveyed earlier this year said they believe generative AI should be used for legal work, the Thomson Reuters Institute found.
Additionally, a LexisNexis survey of lawyers across several major countries found legal professionals were already using generative AI for legal research and drafting documents, among other use cases.
The results from the Harvard Business School study could provide further fuel for both legal departments and law firms to move forward with generative AI implementation plans in hopes of boosting efficiency and improving the quality of their work.
Additionally, the findings may help some legal professionals move beyond their skepticism about whether emerging AI can help with high-level legal work.
Background
In addition to working with Boston Consulting Group, researchers from Harvard Business School collaborated on the study with others from the MIT Sloan School of Management and the Wharton School of the University of Pennsylvania.
The study featured 758 consultants from Boston Consulting Group who participated in one of two experiments.
The first experimental task involved developing new product ideas. The second experimental task consisted of analyzing brand performance.
The study noted that the experimental tasks were complex, knowledge-intensive and selected by industry experts to replicate real-world workflows.
Among the participants who were given access to GPT-4, some were provided a supplementary prompt engineering overview to increase their familiarity with AI.
These additional materials included instructional videos and documents that outlined and illustrated effective usage strategies.
Other AI benefits
The study found that receiving access to GPT-4 plus the prompt engineering overview produced a consistently “more pronounced positive effect” compared to just having access to GPT-4.
Additionally, the study participants for each experimental task were divided into two skill categories based on an initial assessment task.
Those who were categorized as bottom-half-skill performers saw a larger boost in performance from having access to AI (43%) when completing the experimental task compared to the bump experienced by top-half-skill performers (17%).
“Thus, AI seems to both level performance differences across levels of ability and raise the quality of work for inside-the-frontier tasks,” the study said.
AI drawbacks
The recent academic study also produced some results indicating there are areas in which emerging AI may not be as useful for knowledge workers.
For example, the second experiment required study participants to analyze a company’s brand performance using insights and financial data.
The participants’ responses were evaluated based on correctness, and there was “a noticeable dip in performance” among those with access to AI compared to the control group.
Subjects in the control group answered the exercise correctly about 84.5% of the time, while those in the two AI access categories scored an average of 19 percentage points lower.
The results indicate that there is a so-called “jagged frontier” where tasks that appear to be of similar difficulty may be performed either better or worse by humans using AI.
“The experiments show that the shape and position of the frontier are vital to understanding the impact of AI on work,” the academic study said. “On tasks within the frontier, AI significantly improved human performance. Outside of it, humans relied too much on the AI and were more likely to make mistakes.”