On the Dangers of Stochastic Parrots

Authors
Emily M. Bender
Timnit Gebru
Angelina McMillan-Major
Shmargaret Shmitchell
Issue Date
2021-03-01
Type
proceedings-article
Language
en_US
Abstract
The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.
Citation
FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, March 2021, Pages 610–623