M.B.A. Students vs. ChatGPT: Who Comes Up With More Innovative Ideas?
We put humans and AI to the test. The results weren’t even close.
How good is AI at generating new ideas?
The conventional wisdom has been not very good. Identifying opportunities for new ventures, generating a solution for an unmet need, or naming a new company are unstructured tasks that seem ill-suited for algorithms. Yet recent advances in AI, and specifically the advent of large language models like ChatGPT, are challenging these assumptions.
We have taught innovation, entrepreneurship and product design for many years. For the first assignment in our innovation courses at the Wharton School, we ask students to generate a dozen or so ideas for a new product or service. As a result, we have heard several thousand new venture ideas pitched by undergraduate students, M.B.A. students and seasoned executives. Some of these ideas are awesome, some are awful, and, as you would expect, most are somewhere in the middle.
The library of ideas, though, allowed us to set up a simple competition to judge who is better at generating innovative ideas: the human or the machine.
In this competition, which we ran together with our colleagues Lennart Meincke and Karan Girotra, humanity was represented by a pool of 200 randomly selected ideas from our Wharton students. The machines were represented by ChatGPT (GPT-4), which we asked to generate 100 ideas using the same instructions we gave the students: “generate an idea for a new product or service appealing to college students that could be made available for $50 or less.”
In addition to this vanilla prompt, we also asked ChatGPT for another 100 ideas after providing a handful of examples of successful ideas from past courses (in other words, a trained GPT group), providing us with a total sample of 400 ideas.
Collapsible laundry hamper, dorm-room chef kit, ergonomic cushion for hard classroom seats, and hundreds more ideas miraculously spewed from a laptop.
The academic literature on ideation postulates three dimensions of creative performance: the quantity of ideas, the average quality of ideas, and the number of truly exceptional ideas.
First, on the number of ideas per unit of time: Not surprisingly, ChatGPT easily outperforms us humans on that dimension. Generating 200 ideas the old-fashioned way requires days of human work, while ChatGPT can spit out 200 ideas with about an hour of supervision.
Next, to assess the quality of the ideas, we market tested them. Specifically, we took each of the 400 ideas and put them in front of a survey panel of customers in the target market via an online purchase-intent survey. The question we asked was: “How likely would you be to purchase based on this concept if it were available to you?” The possible responses ranged from definitely wouldn’t purchase to definitely would purchase.
The responses can be translated into a purchase probability using simple market-research techniques. The average purchase probability of a human-generated idea was 40%, that of vanilla GPT-4 was 47%, and that of GPT-4 seeded with good ideas was 49%. In short, ChatGPT isn’t only faster but also on average better at idea generation.
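To make that translation concrete, here is a minimal sketch of one common way to do it. It assumes a five-point answer scale and evenly spaced weights. Neither the scale labels nor the weights are given in this article, so treat them as hypothetical placeholders rather than the mapping used in the study.

```python
from statistics import mean

# ASSUMPTION: five answer options with evenly spaced weights.
# The article does not state the actual scale or weights.
ASSUMED_WEIGHTS = {
    "definitely would not purchase": 0.0,
    "probably would not purchase": 0.25,
    "might or might not purchase": 0.5,
    "probably would purchase": 0.75,
    "definitely would purchase": 1.0,
}


def purchase_probability(responses):
    """Average the assumed weight of every survey response for one idea."""
    if not responses:
        raise ValueError("need at least one response")
    return mean(ASSUMED_WEIGHTS[answer] for answer in responses)


# Made-up responses for a single idea, for illustration only.
example_responses = [
    "definitely would purchase",
    "probably would purchase",
    "might or might not purchase",
    "probably would not purchase",
]

print(purchase_probability(example_responses))
```

Averaging these per-idea scores within each group would then yield group-level figures comparable in kind to the 40%, 47% and 49% reported above.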
Still, when you’re looking for great ideas, averages can be misleading. In innovation, it’s the exceptional ideas that matter: Most managers would prefer one idea that is brilliant and nine ideas that are flops over 10 decent ideas, even if the average quality of the latter option might be higher. To capture this perspective, we investigated only the subset of the best ideas in our pool, specifically the top 10%. Of these 40 ideas, five were generated by students and 35 were created by ChatGPT (15 from the vanilla ChatGPT set and 20 from the ChatGPT set seeded with examples). Once again, ChatGPT came out on top.
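The top-decile comparison is simple to reproduce once every idea has a score. The sketch below ranks a pool of scored ideas, keeps the best 10%, and counts how many came from each source. The function name, group labels and scores are all invented for illustration and are not the study's data.

```python
from collections import Counter


def top_decile_counts(ideas):
    """Count sources among the top 10% of (source, score) pairs by score."""
    if not ideas:
        return Counter()
    keep = max(1, len(ideas) // 10)  # a pool of 400 ideas keeps 40
    best = sorted(ideas, key=lambda pair: pair[1], reverse=True)[:keep]
    return Counter(source for source, _ in best)


# Made-up pool of 20 scored ideas from two hypothetical groups.
toy_ideas = (
    [("group_a", s) for s in (0.30, 0.35, 0.40, 0.45, 0.50,
                              0.55, 0.60, 0.65, 0.70, 0.90)]
    + [("group_b", s) for s in (0.32, 0.37, 0.42, 0.47, 0.52,
                                0.57, 0.62, 0.67, 0.72, 0.95)]
)

print(top_decile_counts(toy_ideas))
```

With 400 ideas the cutoff is 40, which matches the split reported above: 5 + 15 + 20 = 40.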
We believe that the 35-to-5 victory of the machine in generating exceptional ideas (not to mention the dramatically lower production costs) has substantial implications for how we think about creativity and innovation.
First, generative AI has brought a new source of ideas to the world. Not using this source would be a sin. It doesn’t matter if you are working on a pitch for your local business-plan competition or if you are seeking a cure for cancer—every innovator should develop the habit of complementing his or her own ideas with the ones created by technology. Ideation will always have an element of randomness to it, and so we cannot guarantee that your idea will get an A+, but there is no excuse left if you get a C.
Second, the bottleneck for the early phases of the innovation process in organizations now shifts from generating ideas to evaluating ideas. Using a large language model, an innovator can produce a spreadsheet articulating hundreds of ideas, which likely include a few blockbusters. This abundance then demands an effective selection mechanism to find the needles in the haystack.
To date, these models appear to perform no better than any single expert in their ability to predict commercial viability. Using a sample of a dozen or so independent evaluations from potential customers in the target market—a wisdom of crowds approach—remains the best strategy. Fortunately, screening ideas using a purchase intent survey of customers in the target market is relatively fast and cheap.
Finally, rather than thinking about a competition between humans and machines, we should find a way in which the two work together. This approach in which AI takes on the role of a co-pilot has already emerged in software development. For example, our human (pilot) innovator might identify an open problem. The AI (co-pilot) might then report what is known about the problem, followed by an effort in which the human and AI independently explore possible solutions, virtually guaranteeing a thorough consideration of opportunities.
The human decision maker is likely ultimately responsible for the outcome, and so will likely make the screening and selection decisions, informed by customer research and possibly by the opinion of the AI co-pilot. We predict such a human-machine collaboration will deliver better products and services to the market, and improved solutions for whatever society needs in the future.
Christian Terwiesch and Karl Ulrich are professors of operations, information and decisions at the Wharton School of the University of Pennsylvania, where Terwiesch also co-directs the Mack Institute for Innovation Management.
Multinationals like Starbucks and Marriott are taking a hard look at their Chinese operations—and tempering their outlooks.
For years, global companies showcased their Chinese operations as a source of robust growth. A burgeoning middle class, a stream of people moving to cities, and the creation of new services to cater to them—along with the promise of the further opening of the world’s second-largest economy—drew companies eager to tap into the action.
Then Covid hit, isolating China from much of the world. Chinese leader Xi Jinping tightened control of the economy, and U.S.-China relations hit a nadir. After decades of rapid growth, China’s economy is stuck in a rut, with increasing concerns about what will drive the next phase of its growth.
Though Chinese officials have acknowledged the sputtering economy, they have been reluctant to take more than incremental steps to reverse the trend. Making matters worse, government crackdowns on internet companies and measures to burst the country’s property bubble left households and businesses scarred.
Now, multinational companies are taking a hard look at their Chinese operations and tempering their outlooks. Marriott International narrowed its global revenue per available room growth rate to 3% to 4%, citing continued weakness in China and expectations that demand could weaken further in the third quarter. Paris-based Kering, home to brands Gucci and Saint Laurent, posted a 22% decline in sales in the Asia-Pacific region, excluding Japan, in the first half amid weaker demand in Greater China, which includes Hong Kong and Macau.
Pricing pressure and deflation were common themes in quarterly results. Starbucks, which helped build a coffee culture in China over the past 25 years, described it as one of its most notable international challenges as it posted a 14% decline in sales from that business. As Chinese consumers reconsidered whether to spend money on Starbucks lattes, competitors such as Luckin Coffee increased pressure on the Seattle company. Starbucks executives said in their quarterly earnings call that “unprecedented store expansion” by rivals and a price war hurt profits and caused “significant disruptions” to the operating environment.
Executive anxiety extends beyond consumer companies. Elevator maker Otis Worldwide saw new-equipment orders in China fall by double digits in the second quarter, forcing it to cut its outlook for growth out of Asia. CEO Judy Marks told analysts on a quarterly earnings call that prices in China were down roughly 10% year over year, and she doesn’t see the pricing pressure abating. The company is turning to productivity improvements and cost cutting to blunt the hit.
Add in the uncertainty created by deteriorating U.S.-China relations, and many investors are steering clear. The iShares MSCI China exchange-traded fund has lost half its value since March 2021. Recovery attempts have been short-lived. And now some of those concerns are creeping into the U.S. market. “A decade ago China exposure [for a global company] was a way to add revenue growth to our portfolio,” says Margaret Vitrano, co-manager of large-cap growth strategies at ClearBridge Investments in New York. Today, she notes, “we now want to manage the risk of the China exposure.”
Vitrano expects improvement in 2025, but cautions it will be slow. Uncertainty over who will win the U.S. presidential election and the prospect of higher tariffs pose additional risks for global companies.
For now, China is inching along at roughly 5% economic growth—down from a peak of 14% in 2007 and an average of about 8% in the 10 years before the pandemic. Chinese consumers hit by job losses and continued declines in property values are rethinking spending habits. Businesses worried about policy uncertainty are reluctant to invest and hire.
The trouble goes beyond frugal consumers. Xi is changing the economy’s growth model, relying less on the infrastructure and real estate market that fueled earlier growth. That means investing aggressively in manufacturing and exports as China looks to become more self-reliant and guard against geopolitical tensions.
The shift is hurting western multinationals, with deflationary forces amid burgeoning production capacity. “We have seen the investment community mark down expectations for these companies because they will have to change tack with lower-cost products and services,” says Joseph Quinlan, head of market strategy for the chief investment office at Merrill and Bank of America Private Bank.
Another challenge for multinationals from outside China is stiffer competition as Chinese companies innovate and expand, often with the backing of the government. Local rivals are upping the ante across sectors by building on their knowledge of local consumer preferences and the ability to produce higher-quality products.
Some global multinationals are having a hard time keeping up with homegrown innovation. Auto makers including General Motors have seen sales tumble and struggled to turn profitable as Chinese car shoppers increasingly opt for electric vehicles from BYD or NIO that are similar in price to internal-combustion-engine cars from foreign auto makers.
“China’s electric-vehicle makers have by leaps and bounds surpassed the capabilities of foreign brands who have a tie to the profit pool of internal combustible engines that they don’t want to disrupt,” says Christine Phillpotts, a fund manager for Ariel Investments’ emerging markets strategies.
Chinese companies are often faster than global rivals to market with new products or tweaks. “The cycle can be half of what it is for a global multinational with subsidiaries that need to check with headquarters, do an analysis, and then refresh,” Phillpotts says.
For many companies and investors, next year remains a question mark. Ashland CEO Guillermo Novo said in an August call with analysts that the chemical company was seeing a “big change” in China, with activity slowing and competition on pricing becoming more aggressive. The company, he said, was still trying to grasp the repercussions, and the change has created uncertainty in its 2025 outlook.
Few companies are giving up. Executives at big global consumer and retail companies show no signs of reducing investment, with most still describing China as a long-term growth market, says Dana Telsey, CEO of Telsey Advisory Group.
Starbucks executives described the long-term opportunity as “significant,” with higher growth and margin opportunities in the future as China’s population continues to move from rural to suburban areas. But they also noted that their approach is evolving and they are in the early stages of exploring strategic partnerships.
Walmart sold its stake in August in Chinese e-commerce giant JD.com for $3.6 billion after an eight-year noncompete agreement expired. Analysts expect it to pump the money into its own Sam’s Club and Walmart China operation, which have benefited from the trend toward trading down in China.
“The story isn’t over for the global companies,” Phillpotts says. “It just means the effort and investment will be greater to compete.”
Corrections & Amplifications
Joseph Quinlan is head of market strategy for the chief investment office at Merrill and Bank of America Private Bank. An earlier version of this article incorrectly used his old title.