This short article serves as an introduction to Thibault Schrepel and Sandy Pentland’s latest working paper, “Competition between AI Foundation Models: Dynamics and Policy Recommendations” (open-access).
***
Learning Curve
The dynamics between the foundation models underlying Generative AI (ChatGPT, Bard, Midjourney, Hugging Face…) are defined not only by their design, but also by their ability to grow their user base. The more users they attract, the better the training, which improves the quality of the fine-tuning, attracts new users, increases the capacity to afford expensive training, and so on. In other words, foundation models are subject to positive feedback loops (i.e., increasing returns). [Note 1: W. Brian Arthur, “Increasing Returns and the New World of Business,” Harvard Business Review, July 1, 1996: 100-109; W. Brian Arthur, “Competing Technologies, Increasing Returns, and Lock-in by Historical Events,” The Economic Journal 99, no. 394 (March 1989): 116.] As a result, foundation models appear to compete for the market rather than in the market. However, a closer look reveals that the strength of the feedback loops differs depending on the type of foundation model.
| Type of foundation model | Strength of the returns | Limits |
| General public foundation models | Significant increasing returns | The 10 millionth user improves the fine-tuning less than the first [Note 2: OpenAI, “GPT-4 Technical Report,” ArXiv:2303.08774, March 15, 2023.] |
| Ecosystem foundation models | Moderate increasing returns | Increasing returns limited to the industry: the model is not easily transferable to another industry |
| Personal foundation models | Small increasing returns | The model cannot be perfectly adjusted for other users; what matters most is the fine-tuning |
Turning first to general public foundation models, returns tend to increase rapidly for several reasons. First, the more a model is used, the more user inputs it can process and learn from. This dynamic makes the model better over time, which can attract new users and thus move the model further along the learning curve (defined as the relationship between the number of users and the ability to improve the service by learning from them). [Note 3: Hal R. Varian, “Artificial Intelligence, Economics, and Industrial Organization,” in The Economics of Artificial Intelligence: An Agenda, ed. Ajay Agrawal, Joshua Gans, and Avi Goldfarb, 2019, 399–422. (“There is a concept that is circulating among lawyers and regulators called “data network effects”. The model is that a firm with more customers can collect more data and use this data to improve its product. This is often true—the prospect of improving operations is that makes ML attractive—but it is hardly novel. And it is certainly not a network effect! This is essentially a supply-side effect known as ‘learning by doing,’ also known as the ‘experience curve’ or ‘learning curve’.”)] But the learning curve will flatten over time, since the 10 millionth user will improve the model proportionally less than the first user.
Second, the more users they have, the more revenue general public foundation model providers can generate and the more they can pay for access to exclusive databases. Since several companies such as Reddit, Stack Overflow, and Twitter have started licensing access to their databases for the purpose of training foundation models, one can expect large foundation model players to pay high fees and integrate them. [Note 4: KeyserSosa, “An Update Regarding Reddit’s API,” Reddit, April 18, 2023, https://perma.cc/48CC-HT8M; Staff, “News/Media Alliance AI Principles,” News/Media Alliance, April 20, 2023, https://perma.cc/D98J-WNQG; Maria Diaz, “Stack Overflow Joins Reddit and Twitter in Charging AI Companies for Training Data,” ZDNET, April 21, 2023, https://perma.cc/4BSC-GPYJ.] Small players will not be able to get the same access, which means that big players will have a competitive advantage, attract more users, and so on.
Third, the more users a foundation model has, the greater its reputation and the opportunity to partner with user-facing products. For example, one could imagine a search engine (let us call it Bing) partnering with a widely used foundation model (let us call it ChatGPT). Fourth, the more users, the more money the model can generate, and therefore the more the company can advertise and acquire new users. Here, the ability of large tech companies to push their foundation models to billions of users should translate into significant increasing returns. In general, companies with an existing user base will have an advantage if they can add a foundation model to their existing products. These companies are well-positioned to meet the distribution challenge that comes with the scalability of foundation models.
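The flattening learning curve mentioned in the first reason above can be illustrated with a small sketch. It is not taken from the working paper: the logarithmic shape and the names `model_quality` and `marginal_gain` are assumptions chosen only because a logarithm is a simple concave function. Any curve with diminishing marginal returns would make the same point.

```python
import math


def model_quality(users: int, scale: float = 1.0) -> float:
    """Hypothetical concave learning curve: quality grows with log(1 + users).

    The functional form is an assumption for illustration only.
    """
    return scale * math.log1p(users)


def marginal_gain(users: int) -> float:
    """Quality added by one more user when `users` users already exist."""
    return model_quality(users + 1) - model_quality(users)


if __name__ == "__main__":
    # Compare the contribution of the first user with later ones.
    for existing in (0, 1_000, 10_000_000):
        print(f"existing users={existing:>10,d}  marginal gain={marginal_gain(existing):.2e}")
```

Under this assumed curve, each additional user still improves the model, but the improvement shrinks as the user base grows, which is the sense in which the 10 millionth user matters less than the first.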
When it comes to ecosystem foundation models, increasing returns are limited to each use case. A model fine-tuned to help judges write court decisions cannot be used to help sporting goods companies with commercial strategies. Compared to general public foundation models, ecosystem foundation models are even more dependent on the quality and exclusivity of the data on which they are trained. But within each industry or use case, the more users, the better the model, which increases the incentive to use the model. A company that already provides a key service to an industry will be well positioned to push a foundation model to its users and benefit from positive feedback loops. Overall, we should expect more ecosystem foundation models to survive than general public foundation models (since use-case specificity will drive demand and models are not easily transferable across use cases), with dominant foundation models per use case likely to emerge.
Personal foundation models benefit from relatively smaller increasing returns. Companies that already have a strong reputation (such as Apple) and access to online users (such as Google) will benefit from an early advantage. However, the models underlying personal foundation models cannot be well tuned for other users. Because fine-tuning to each user remains key to their relevance and user experience, increasing returns remain smaller than they are for other types of foundation models. Moreover, individuals produce large collections of data with clear copyrights attached, e.g., usage of services, personal data stored locally, etc. Personal foundation models can thus be easily trained to assist each individual in a space where quality matters. One might expect strong competition in this space, with relatively low barriers to entry. Given that the personal development industry generated $41.81 billion in 2021 in the United States alone, [Note 5: https://www.grandviewresearch.com/industry-analysis/personal-development-market] we see a strong incentive for new players to effectively enter the market and compete to provide individuals with tailored advice on health, work, financial decisions, learning experiences, leisure activities, etc. That might explain why Google, in a leaked memo, described “scalable Personal AI” as a major reason why the company is not positioned to win the foundation AI model race. [Note 6: Maya Posch, “Leaked Internal Google Document Claims Open Source Ai Will Outcompete Google And OpenAI,” Hackaday, May 5, 2023, https://perma.cc/YH5F-ZXE3; Dylan Patel and Afzal Ahmad, “Google ‘We Have No Moat, And Neither Does OpenAI’,” Semi Analysis, May 4, 2023, https://perma.cc/FE3V-2MUS.]
Expected dynamics
When it comes to competitive dynamics within each type of foundation model, the higher the increasing returns, the more likely that type of foundation model is to be dominated by a handful of firms. In the presence of high increasing returns, the quality of foundation model design remains central to the ability to retain users, but competition is not based on quality alone: history matters. Random events and initial competitive advantages could well translate into sustained dominance. There is a multiplicity of possible outcomes.
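The idea that history matters under increasing returns can be sketched with a toy adoption model, loosely in the spirit of the increasing-returns literature cited earlier. It is not the paper's model: two models of identical quality compete, each new user picks one with probability proportional to its current user count raised to a power, and the exponent `gamma` is a hypothetical knob for the strength of the feedback loop (0 means no feedback, values above 1 mean strong feedback).

```python
import random


def simulate_adoption(steps: int, gamma: float, seed: int) -> list[float]:
    """Return the final user shares of two competing models.

    Each new user joins model i with probability proportional to
    (current users of i) ** gamma. `gamma` is an assumed parameter
    standing in for the strength of increasing returns.
    """
    rng = random.Random(seed)
    users = [1, 1]  # both models start with a single user
    for _ in range(steps):
        weights = [count ** gamma for count in users]
        p_first = weights[0] / (weights[0] + weights[1])
        chosen = 0 if rng.random() < p_first else 1
        users[chosen] += 1
    total = sum(users)
    return [count / total for count in users]


def average_leader_share(steps: int, gamma: float, seeds) -> float:
    """Average share held by whichever model ends up ahead, across seeds."""
    leaders = [max(simulate_adoption(steps, gamma, seed)) for seed in seeds]
    return sum(leaders) / len(leaders)


if __name__ == "__main__":
    for gamma in (0.0, 1.0, 2.0):
        for seed in range(3):
            first, second = simulate_adoption(5_000, gamma, seed)
            print(f"gamma={gamma:.1f} seed={seed}: shares {first:.2f} / {second:.2f}")
```

In this sketch the two models are identical by construction, so any gap between them comes from early random draws. Running it with several seeds shows which model ends up ahead varying from run to run, and raising `gamma` tends to widen the leader's final share.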
If our analysis is correct, the larger players in the general public foundation models space will initially improve their foundation models faster than smaller competitors thanks to positive feedback loops. They will acquire dominance that way. That being said, the feedback loops from which they benefit will strengthen less rapidly over time. Should they sustain dominance, their market position will not necessarily correlate with superior foundation models, as smaller players will also be able to benefit from sufficient increasing returns to achieve similar quality. The ability of large players to lock in the ecosystem will need to be closely monitored, along with other variables such as network effects among developers, consumer inertia, dynamic capabilities, etc. [Note 7: Aaron Holmes and Jon Victor, “OpenAI Considers Creating an App Store for AI Software,” The Information, June 20, 2023, https://perma.cc/J9Z9-2PQZ (OpenAI is reportedly considering the creation of an app store. The creation of an app store could lead to a network effect that will make OpenAI’s market position more robust. The failure of OpenAI to successfully deploy its API).] Conversely, the more limited the increasing returns, the less robust market shares will be. A company or open-source project with a significantly better model will be able to break the initial cycle of feedback loops in the short to medium term and regain competitive advantage. The competitive dynamics in the personal foundation model space are therefore likely to remain intense over time.
When it comes to competition between models, although they serve different purposes and are likely to coexist, we see a competitive dynamic to attract investment and achieve scalability. Venture capitalists are currently investing in general-purpose models such as ChatGPT and Bing. Investments in specialized and process improvement models such as BloombergGPT and GitHub Copilot are now emerging. The winner-take-all effect of these general-purpose LLMs should quickly attract more investment. We anticipate a longer lead time for ecosystem and personal foundation models to attract investment and scale up. There are two reasons for this. First, lower increasing returns mean that investors have less hope of capturing the market (i.e., betting on the winning horse). Second, these types of foundation AI models require a change in behavior on the part of users, e.g., to provide private data, be willing to rely on their input, etc. In short, we think these two types of foundation AI models will attract investment away from general public foundation AI in the not-too-distant future.
“Competition between AI Foundation Models: Dynamics and Policy Recommendations” provides policy recommendations based on these dynamics. These recommendations are directed to enforcers (e.g., antitrust agencies), economic ministries, and parliaments (willing to regulate AI). Thank you very much for your interest!
Thibault Schrepel and Sandy Pentland
@ProfSchrepel & @Alex_Pentland