Limitations of language models
While generative AI tools like language models can be powerful and versatile, they have significant limitations that users must understand to use them effectively. Below are key constraints, with detailed explanations and examples.
Knowledge limit
Language models are trained on datasets containing information up to a specific cut-off date. This means they lack awareness of events, discoveries, or developments that occurred after that date. [1]
- Good Fit: Asking about the historical significance of the Renaissance or the basics of quantum mechanics. These topics are well within the training data.
- Limitation: Asking for insights on the latest research paper published this month or for predictions about ongoing political events. The model won't have this information unless you include it in the prompt.
For tasks requiring up-to-date information, you must provide the latest context within your query (e.g., current events, recent technology updates, or new legal policies). The model cannot spontaneously update itself with new knowledge.
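As a minimal sketch of supplying recent context in the query, the helper below prepends dated, user-provided snippets to a question. The function name, its parameters, and the prompt wording are illustrative assumptions, not part of any particular model's interface; the resulting string can be pasted into or sent to whichever tool you use.

```python
from datetime import date


def build_prompt_with_context(question: str, context_items: list[str], as_of: date) -> str:
    """Prepend user-supplied, dated context to a question.

    Hypothetical helper: it only builds text and does not call any model.
    """
    if not context_items:
        # Without context, the model can only fall back on its training data.
        raise ValueError("Provide at least one context item for time-sensitive questions.")

    # Number the snippets so the answer can refer back to them.
    numbered = "\n".join(f"[{i}] {item}" for i, item in enumerate(context_items, start=1))

    return (
        f"Today's date: {as_of.isoformat()}\n"
        "Use only the context below for anything that may postdate your training data.\n"
        "If the context does not cover the question, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}"
    )


# Example usage with placeholder context text.
example_prompt = build_prompt_with_context(
    question="What changed in the policy?",
    context_items=["<paste the relevant excerpt of the new policy here>"],
    as_of=date.today(),
)
```

Stating the current date and asking the model to admit when the context is insufficient are design choices in this sketch; they do not guarantee a correct answer, so the output still needs review.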
Biased training data
Language models are trained on vast datasets sourced from the internet, books, and other materials. These datasets inevitably reflect the biases present in their sources. As a result, the model's outputs can unintentionally reinforce stereotypes, cultural biases, or inaccuracies. [2]
Examples
- Gender bias: Default assumptions about roles, e.g., associating "nurse" with women or "engineer" with men.
- Cultural bias: Overrepresentation of certain cultural perspectives while neglecting others.
- Political bias: Responses that reflect the political leanings of the data sources.
AI can inadvertently perpetuate harm when biased outputs are used uncritically. It's essential for users to review and contextualize the responses, especially in sensitive or high-stakes scenarios.
Hallucinations
Language models may confidently present false information, especially for queries beyond their knowledge boundary. This phenomenon, known as hallucination, occurs when the model generates plausible-sounding but incorrect or misleading answers. [3, 4]
- Example: If you ask a language model about the health benefits of a fictional fruit, it may provide a detailed response based on its general knowledge of fruits, even though the fruit doesn't exist.
- Mitigation: When using language models, be cautious with answers that seem too specific or detailed, especially in niche or fictional contexts. Cross-check the information with reliable sources to verify its accuracy.
Always evaluate outputs critically, especially when you use the model for research, teaching, or publishing, and independently verify any claim that seems implausible.
To mitigate hallucinations, you can provide additional context or constraints in your prompt to guide the model towards more accurate responses. Furthermore, you can use techniques like chain-of-thought prompting to encourage the model to explain its reasoning step by step. [5]
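The two ideas above, explicit constraints and a request for step-by-step reasoning, can be sketched as a small prompt builder. The function name, parameters, and exact instruction wording are hypothetical and chosen only for illustration; the code assembles text and does not call a model.

```python
def build_constrained_prompt(question: str, constraints: list[str], step_by_step: bool = True) -> str:
    """Combine a question with explicit constraints and an optional
    chain-of-thought style instruction. Hypothetical helper for illustration.
    """
    parts = [f"Question: {question}"]

    if constraints:
        # Constraints narrow what the model should rely on or produce.
        parts.append("Constraints:")
        parts.extend(f"- {c}" for c in constraints)

    if step_by_step:
        # Chain-of-thought style instruction: ask for the reasoning first.
        parts.append("Explain your reasoning step by step before giving the final answer.")

    # Give the model an explicit alternative to inventing an answer.
    parts.append('If you are not sure, say "I don\'t know" instead of guessing.')
    return "\n".join(parts)


# Example usage based on the fictional-fruit scenario described earlier.
example_prompt = build_constrained_prompt(
    question="What are the health benefits of this fruit?",
    constraints=[
        "Rely only on the source text I provide.",
        "State clearly if the fruit is not mentioned in the source text.",
    ],
)
```

Visible reasoning makes errors easier to spot, but a step-by-step explanation can itself be wrong, so cross-checking against reliable sources still applies.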
References & Footnotes
1. Martino, A., Iannelli, M., & Truong, C. (2023). Knowledge injection to counter large language model (LLM) hallucination. In C. Pesquita, H. Skaf-Molli, V. Efthymiou, S. Kirrane, A. Ngonga, D. Collarana, R. Cerqueira, M. Alam, C. Trojahn, & S. Hertling (Eds.), The Semantic Web: ESWC 2023 Satellite Events (Vol. 13998, pp. 182–185). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-43458-7_34
2. Ntoutsi, E., Fafalios, P., Gadiraju, U., Iosifidis, V., Nejdl, W., Vidal, M., Ruggieri, S., Turini, F., Papadopoulos, S., Krasanakis, E., Kompatsiaris, I., Kinder-Kurlanda, K., Wagner, C., Karimi, F., Fernandez, M., Alani, H., Berendt, B., Kruegel, T., Heinze, C., … Staab, S. (2020). Bias in data-driven artificial intelligence systems – an introductory survey. WIREs Data Mining and Knowledge Discovery, 10(3), e1356. https://doi.org/10.1002/widm.1356
3. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2024). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 3703155. https://doi.org/10.1145/3703155
4. Raunak, V., Menezes, A., & Junczys-Dowmunt, M. (2021). The curious case of hallucinations in neural machine translation. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1172–1183. https://doi.org/10.18653/v1/2021.naacl-main.92
5. Ji, Z., Yu, T., Xu, Y., Lee, N., Ishii, E., & Fung, P. (2023). Towards mitigating LLM hallucination via self reflection. Findings of the Association for Computational Linguistics: EMNLP 2023, 1827–1843. https://doi.org/10.18653/v1/2023.findings-emnlp.123