From Sam Bowman's paper "Eight Things to Know about Large Language Models": https://cims.nyu.edu/~sbowman/eightthings.pdf
Below are some selections from the list (quoted verbatim):
—
1. LLMs predictably get more capable with increasing investment, even without targeted innovation
There are substantial innovations that distinguish these three models [GPT, GPT-2, and GPT-3], but they are almost entirely restricted to infrastructural innovations in high-performance computing rather than model-design work that is specific to language technology.
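To make "predictably" concrete: the claim rests on scaling laws, where pretraining loss falls as a smooth power law in training compute, so labs can fit the curve on cheap small runs and extrapolate to budgets they haven't spent yet. A minimal sketch of that workflow (synthetic numbers, standard Kaplan-style functional form; not code from the paper):

```python
# Illustrative sketch, not code from the paper: fit a power law
# L(C) = a * C**(-b) on small training runs, then extrapolate.
# All numbers below are synthetic.
import numpy as np

# Hypothetical (training compute in FLOPs, pretraining loss) from small runs.
compute = np.array([1e18, 1e19, 1e20, 1e21])
loss = np.array([3.10, 2.85, 2.62, 2.41])

# A power law is linear in log-log space: log L = log a - b * log C,
# so an ordinary least-squares line fit recovers the exponent.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), deg=1)

def predicted_loss(c):
    # slope comes out negative, so loss keeps falling as compute grows
    return np.exp(log_a) * c ** slope

# Extrapolate two orders of magnitude past the largest run "trained" above.
print(f"predicted loss at 1e23 FLOPs: {predicted_loss(1e23):.2f}")  # ~2.04
```

The catch, and the bridge to point 2, is that this predicts loss, not which specific behaviors arrive at a given loss.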
2. Specific important behaviors in LLMs tend to emerge unpredictably as a byproduct of increasing investment
[LLM developers] are justifiably confident that they’ll get a variety of economically valuable new capabilities, but they can make few confident predictions about what those capabilities will be or what preparations they’ll need to make to be able to deploy them responsibly.
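One toy picture of why this is hard to predict (my gloss, not the paper's): even when per-step reliability improves smoothly with scale, a behavior that chains many steps flips from "almost never works" to "usually works" over a narrow range:

```python
# Toy model, my illustration rather than the paper's: a capability that
# chains k steps succeeds end-to-end only if every step does, so success
# is p**k. Smooth gains in per-step reliability p then show up as an
# abrupt jump in the observed capability.
k = 10  # hypothetical number of chained steps

for p in [0.5, 0.7, 0.8, 0.9, 0.95, 0.99]:
    print(f"per-step reliability {p:.2f} -> task success {p ** k:.3f}")
# 0.50 -> 0.001, 0.90 -> 0.349, 0.99 -> 0.904: near-zero for most of the
# range, then a late, sharp climb.
```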
4. There are no reliable techniques for steering the behavior of LLMs
In particular, models can misinterpret ambiguous prompts or incentives in unreasonable ways, including in situations that appear unambiguous to humans, leading them to behave unexpectedly.
6. Human performance on a task isn’t an upper bound on LLM performance
[T]hey are trained on far more data than any human sees, giving them much more information to memorize and potentially synthesize.