1. Introduction
The era of large language models (LLMs) has transformed text generation, enabling unprecedented fluency and syntactic coherence. Despite these advances, a critical analysis reveals that the linguistic output of these systems still presents significant pragmatic and rhetorical deficiencies. Human language is not just a set of grammatical rules; it is a complex system of social cues, intentions, and stylistic choices that current models struggle to replicate. This article explores these unresolved issues, arguing that they arise from a purely statistical treatment of language that overlooks its social and communicative functions.
2. Pragmatic and rhetorical issues of AI language
2.1 Rhetorical saturation: a language that "screams"
One of the first and most evident shortcomings of AI-generated texts is the unnatural use of certain rhetorical figures. AI, trained on a vast and indiscriminate corpus of online data, learns that some expressions are frequently associated with attention-seeking contexts, such as marketing or sensationalist journalism. This leads the models to replicate them excessively, a phenomenon that can be called rhetorical saturation.
Consider hyperboles or figures like emphatic epanorthosis. Epanorthosis is a rhetorical figure that involves returning to a recently made statement to correct it or, more often, to intensify it. For example: "It was a good meal, or rather, it was exceptional!" or "He wasn't angry, in fact, he was furious," but also "It's not a problem, but an opportunity to grow." The effectiveness of this figure lies precisely in its rarity and its ability to capture attention, creating an effect of surprise and emotional impact. However, algorithms do not perceive this pragmatic value. For them, epanorthosis is not a rhetorical move to be used sparingly, but a statistical pattern to be replicated whenever the opportunity arises. This leads to the production of texts that, while technically correct, are unnaturally emphatic and lacking in nuance, as if they were constantly "screaming" for attention.
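The idea of saturation can be made measurable, at least crudely. The sketch below counts surface markers of emphatic epanorthosis and normalizes them by text length. The marker list and the function names (count_markers, markers_per_100_words) are assumptions introduced here for illustration, not an established inventory or metric; a serious study would rely on annotated data, since a phrase such as "in fact" is not always used as epanorthosis.

```python
import re

# Hypothetical surface markers of emphatic epanorthosis (an assumption for
# illustration): "or rather", "in fact", and the "not ... but ..." frame
# within a single sentence.
MARKER_PATTERNS = [
    re.compile(r"\bor rather\b", re.IGNORECASE),
    re.compile(r"\bin fact\b", re.IGNORECASE),
    re.compile(r"\bnot\b[^.!?]*?\bbut\b", re.IGNORECASE),
]

WORD_PATTERN = re.compile(r"[A-Za-z']+")


def count_markers(text: str) -> int:
    """Count how many times any marker pattern occurs in the text."""
    return sum(len(pattern.findall(text)) for pattern in MARKER_PATTERNS)


def markers_per_100_words(text: str) -> float:
    """Return marker occurrences per 100 words, or 0.0 for empty text."""
    n_words = len(WORD_PATTERN.findall(text))
    if n_words == 0:
        return 0.0
    return 100.0 * count_markers(text) / n_words
```

Comparing this rate between a human-written corpus and a machine-generated one would give a first, rough indication of whether the figure is being overused.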
2.2 The lack of an authorial voice
In addition to the unbalanced use of rhetorical figures, artificial intelligence has great difficulty in maintaining a coherent authorial "voice." Human language is intrinsically linked to the speaker's identity, manifesting through lexical choices, syntactic preferences, and a unique rhythm. In contrast, AI tends to produce a homogeneous and "anonymous" language [2].
This deficiency is not surprising given how these models work. They are trained to predict the next token from the preceding context, not to maintain a long-term stylistic intention [3]. The result is a text that can subtly change tone or style from one paragraph to the next, appearing "sterile" and impersonal to the reader.
2.3 The cognitive impact: learning an unnatural language
The issue is not just about the aesthetic quality of the generated text; it extends to its cognitive and social impact. As young people are increasingly exposed to AI-generated content through search engines, social media, and educational platforms, the risk is that they will internalize a distorted model of language [4].
This language, characterized by rhetorical saturation and the lack of an authorial voice, could become their norm. Learning a "perfect" language that is devoid of the imperfections and nuances that make it human could compromise their ability to understand and produce authentic, subtle, and pragmatically effective communication. In this sense, AI is not just a production tool but also a linguistic model to imitate, one that, if not corrected, could influence the very evolution of human language.
3. Perspectives and future actions
To overcome the pragmatic and rhetorical gaps identified above, the scientific community must adopt an approach that goes beyond mere statistical optimization. It is no longer enough to teach AI what to say; it must also learn how and when to say it, with a sensitivity closer to that of a human.
3.1 Beyond quantity: training on quality
The current training of language models is based primarily on enormous quantities of largely unfiltered data. One solution lies in the creation and use of higher-quality training corpora. This could include:
Annotated corpora. Developing datasets where rhetorical figures, tone, and pragmatic intentions are explicitly tagged. This way, AI could learn not only to recognize epanorthosis but also to understand in which contexts it is most effective and how frequently it should be used.
Training for parsimony. Implementing mechanisms in the training process that penalize the excessive use of certain expressions. AI would thus learn that for some rhetorical figures, moderation is an added value, and their impact diminishes as their frequency increases.
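The parsimony idea above can be sketched as an extra term added to a training loss. This is a minimal illustration under stated assumptions, not a description of how any existing model is trained: the function names, the hinge form of the penalty, and the default values of target_rate and weight are all hypothetical choices made here.

```python
def parsimony_penalty(figure_count: int, n_words: int,
                      target_rate: float = 0.01, weight: float = 5.0) -> float:
    """Penalize only the usage that exceeds a tolerated rate.

    figure_count: occurrences of the rhetorical figure in the generated text.
    n_words:      length of the generated text in words.
    target_rate:  tolerated occurrences per word (hypothetical default).
    weight:       strength of the penalty (hypothetical default).
    """
    if n_words <= 0:
        return 0.0
    rate = figure_count / n_words
    # Hinge: no penalty at or below the tolerated rate, linear growth above it.
    return weight * max(0.0, rate - target_rate)


def penalized_loss(base_loss: float, figure_count: int, n_words: int,
                   target_rate: float = 0.01, weight: float = 5.0) -> float:
    """Add the parsimony penalty to an ordinary training loss value."""
    return base_loss + parsimony_penalty(figure_count, n_words,
                                         target_rate, weight)
```

With these defaults, one occurrence in 200 words costs nothing, while six occurrences in the same span raise the loss. The hinge leaves sparing use untouched, which matches the claim that the figure is valuable precisely when it is rare.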
3.2 Models for coherence and voice
To address the lack of an authorial voice and long-term coherence issues, model architectures will need to evolve. We could see the development of:
Stylistic modules. Creating specific modules within the model's architecture that are dedicated solely to maintaining a consistent style and tone throughout the entire text. This would allow the model to have a long-term stylistic "memory," avoiding shifts in register from one paragraph to the next.
Discourse planning systems. Instead of generating text word by word, future LLMs could adopt a preliminary planning approach, outlining the structure and argumentative goals before beginning the writing process. This would ensure greater logical and thematic coherence in extended texts.
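The two proposals above share one mechanism: information fixed before writing begins is passed to every generation step. The sketch below illustrates that control flow only. StyleProfile, PlanStep, write_with_plan, and the prompt layout are hypothetical names and conventions introduced here, and generate stands in for any text generator supplied by the caller; no real model API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StyleProfile:
    """Style decisions fixed once for the whole text (hypothetical fields)."""
    register: str  # e.g. "formal"
    person: str    # e.g. "third"


@dataclass(frozen=True)
class PlanStep:
    """One argumentative goal in the outline."""
    goal: str


def write_with_plan(plan: List[PlanStep], style: StyleProfile,
                    generate: Callable[[str], str]) -> List[str]:
    """Generate one paragraph per plan step, repeating the same style
    constraints and the position in the outline in every prompt."""
    paragraphs = []
    for index, step in enumerate(plan, start=1):
        prompt = (f"[style: {style.register}, {style.person} person] "
                  f"[step {index}/{len(plan)}] {step.goal}")
        paragraphs.append(generate(prompt))
    return paragraphs
```

Calling write_with_plan with a three-step plan and a generator that simply echoes its prompt returns three strings, each carrying the same style header. The stylistic "memory" here lives outside the generator, which is a simpler arrangement than the dedicated architectural module the text envisions.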
3.3 The crucial role of human feedback
Finally, Reinforcement Learning from Human Feedback (RLHF) will need to become more sophisticated. Human evaluation should no longer focus only on grammatical correctness or logical coherence but also include judgments on the naturalness and authenticity of the text. In this way, humans can guide AI toward a deeper understanding of the nuances that make communication truly human.
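One simple way to picture this broader evaluation is a reward that combines several human ratings instead of one. The sketch below is an assumption-laden illustration: the criterion names, the weights, and the function combined_reward are hypothetical, and ratings are assumed to lie between 0 and 1. In practice, RLHF typically fits a reward model to human preference comparisons; the weighted average here only shows how naturalness and authenticity could enter the signal.

```python
from typing import Dict, Optional

# Hypothetical criteria and weights, chosen only for illustration.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "correctness": 0.3,
    "coherence": 0.3,
    "naturalness": 0.2,
    "authenticity": 0.2,
}


def combined_reward(ratings: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Return the weighted average of human ratings (each assumed in [0, 1])."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * ratings[name] for name in weights)
    return weighted_sum / total_weight
```

Under these weights, a text rated flawless for correctness and coherence but poor for naturalness and authenticity scores lower than a text rated uniformly good, which is the ordering the proposal above calls for.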
4. Conclusions
The current limitations of AI language are not insurmountable; they represent a crucial research opportunity. Addressing pragmatic and rhetorical issues will make AI not only a more sophisticated text generator but also a more natural and effective communication partner. This step, however, will require a concerted effort across computer science, linguistics, and the cognitive sciences to teach machines not only to speak but to communicate in a truly human way.
Bibliography
[1] Bambini, V. (2017). Pragmatica della comunicazione umana. Carocci editore.
[2] Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency.
[3] Schmid, H. (2018). Natural language processing. Oxford University Press.
[4] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. In Advances in Neural Information Processing Systems.
*Board Member, SRSN (Roman Society of Natural Science); Past Editor-in-Chief, Italian Journal of Dermosurgery