The impact of GenAI – an interview with Anastassia Shaitarova
In the era of generative AI (GenAI), language technology has reshaped communication, learning, and content creation. As GenAI models evolve, their influence on natural language – human communication in its spoken, written, and signed forms – raises important questions about lexical diversity, syntactic structure, and potential language shifts.
We recently interviewed Anastassia Shaitarova from the University of Zurich’s Department of Computational Linguistics. She shared insights from her PhD research on how machine-generated text shapes natural language – research to which some of SwissGlobal’s internal translations contributed.
What was the primary focus of your PhD research?
My research focused on how machine-generated language – whether through neural machine translation (NMT, like DeepL or Google Translate) or large language models (LLMs, like ChatGPT and Google’s Gemini) – influences human language. This topic has gained relevance alongside rapid AI advancements.
My work was part of a broader interdisciplinary effort in Switzerland to examine the future of language. I specifically focused on the potential effects of AI-driven language on human linguistic practices.
What hypothesis did you start with, and what were your key findings?
I hypothesised that generative models would contribute to “lexical impoverishment” in natural language. This theory suggests that because these models rely on frequency-based word selection, they might lead to more standardised and potentially less diverse language.
My findings showed mixed results, especially when comparing NMT with newer LLMs. While some NMT systems exhibited lexical impoverishment, advanced LLMs like GPT-4 demonstrated a surprising degree of lexical diversity, surpassing human language in some cases, especially outside of a constrained translation task.
How did you measure lexical impoverishment, and did newer language models perform differently than older ones?
I used corpus-linguistic methods to analyse various corpora of human and machine translations, as well as machine-generated and human-written texts. I examined many linguistic features, including word frequency, syntactic complexity, morphological diversity, and text readability.
Lexical impoverishment is often traceable in the output of NMT systems, which favour more frequent, simplified word choices. It is also quite apparent in German texts generated by GPT-3.5, an earlier ChatGPT model. However, some NMT systems produce text that is as diverse as human translations in certain genres, and GPT-4 shows a much broader vocabulary than its predecessor. This indicates that lexical impoverishment might no longer be the main problem of generated texts.
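By way of illustration (an editorial aside, not part of the interview): one of the simplest corpus measures of lexical diversity mentioned above is the type-token ratio. A minimal Python sketch, with the function name and example sentences our own invention:

```python
import re

def lexical_stats(text: str) -> dict:
    """Toy corpus measures: type-token ratio and mean word length."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    types = set(tokens)
    return {
        "ttr": len(types) / len(tokens),                       # lexical diversity
        "mean_word_len": sum(map(len, tokens)) / len(tokens),  # word complexity
    }

# A varied sentence scores a higher type-token ratio than a repetitive one
varied = "The quick brown fox jumps over the lazy dog"
repetitive = "the cat and the dog and the cat and the dog"
print(lexical_stats(varied)["ttr"] > lexical_stats(repetitive)["ttr"])  # True
```

Real studies use more robust, length-insensitive measures (the type-token ratio shrinks as texts grow), but the intuition is the same: impoverished output reuses a smaller set of frequent words.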
You mentioned that modern models often train on simplified language. How does this relate to broader trends in language simplification?
Yes, one factor here is that these models are trained to choose the most probable next word, which can promote a simplified style. If much of the text in circulation is produced by always selecting the most probable outcome, the language may lose some lexical diversity. That data is then used to train the next model, and the cycle continues. Previous research has observed this trend in NMT.
What are the main differences between machine-generated and human-written texts?
The main differences between human and LLM-generated texts include punctuation, word length, sentence structure, and lexical diversity. For instance, generated texts tend to have longer words and sentences, resulting in lower readability scores than human-written text.
Differences in dependency length (how words relate within a sentence) further highlight these distinctions, with human text generally demonstrating more nuanced sentence complexity.
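As an aside (our own example, not from the interview): dependency length is simply the distance between a word and its syntactic head. A toy sketch, with the head indices annotated by hand rather than produced by a parser:

```python
def mean_dependency_length(heads: list[int]) -> float:
    """Mean absolute distance between each token and its syntactic head.
    heads[i] is the index of token i's head; the root points to itself."""
    dists = [abs(i - h) for i, h in enumerate(heads) if h != i]
    return sum(dists) / len(dists)

# "She quickly read the report"
# She -> read, quickly -> read, read = root, the -> report, report -> read
heads = [2, 2, 2, 4, 2]
print(mean_dependency_length(heads))  # 1.5
```

In practice the head indices would come from a dependency parser; longer mean distances typically indicate more complex, long-range sentence structure.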
How does syntax vary across systems like DeepL, Microsoft, and Google?
While lexical differences are often system-dependent, all NMT systems mirror the source text’s syntax much more closely than human translators do. Among the MT engines I tested, DeepL produces more syntactically diverse output than Google or Microsoft. However, larger LLMs like GPT-4 have improved on this, offering more syntactic diversity by working with a broader context than traditional NMT systems.
What differences did you observe between NMT systems and LLMs regarding translation quality?
The primary difference is contextual awareness. NMT systems operate mainly on a sentence-by-sentence basis, which limits their ability to interpret broader text structures. LLMs analyse entire paragraphs or even larger segments, leading to more coherent translations. LLMs can combine or split sentences to achieve more fluent output, which traditional NMT systems can’t do as effectively. Although research on using LLMs for translation is still underway, there are reports of human readers preferring LLM translations to NMT translations, specifically citing a more natural syntactic flow.
Did you find notable differences in lexical diversity across different versions of GPT models?
Absolutely. For example, GPT-4 has shown significantly increased lexical diversity compared to its predecessors. This diversity supports various expressions and vocabulary, including many nouns and adjectives. The increased diversity aligns with the model’s exposure to extensive, varied data sources during training. However, this extended vocabulary can sometimes be excessive, resulting in overused phrases and occasionally even meaningless expressions.
How do lexical items contribute to identifying machine-generated text?
When used without specific instructions, LLMs can exhibit typical lexical patterns. For example, studies have shown that certain adjectives are disproportionately present in ChatGPT-generated text. This is partly due to feedback-driven training, which influences how models prioritise certain word types. The excessive use of the verb “delve” by ChatGPT recently made headlines. This case was traced to OpenAI’s human quality raters in Nigeria, where ‘delve’ is used much more frequently in business English than in any other anglophone country.
In German, I examined discourse connectives and saw that ChatGPT favours longer items with a higher semantic weight, like “darüber hinaus” and “des Weiteren” (‘furthermore’ in both cases). Machine-generated text is often more complicated lexically, while human writers are more selective in word choices, making the text more cohesive and natural.
Was there a noticeable difference in variety across genres or domains in machine-generated text?
Yes, genre consistency remains a challenge for machine-generated text. While LLMs like GPT-4 exhibit notable diversity, they still struggle to capture the stylistic nuances inherent to specific genres.
This limitation often stems from the models’ frequency bias toward more common expressions, which affects domain-specific content.
Do you think machine-generated text could shape human language? How might you approach future research on this topic?
The influence of machine-generated text on human language is difficult to assess. People are increasingly exposed to GenAI outputs without realising it. This could lead to “priming effects”, where humans mimic AI-generated language in their writing.
My initial psycholinguistic experiments suggest that exposure to machine-generated language impacts cognitive processing, potentially shaping individual language patterns over time.
Future research should investigate these priming effects, particularly within professional fields like translation, where post-editing of machine translations is common.
In the long term, we might even see shifts in natural language as humans adapt to language norms introduced by generative models. This potential feedback loop raises fascinating questions about the future evolution of language in the age of AI.
What SwissGlobal says
We see GenAI and LLMs as groundbreaking technologies that open new opportunities while presenting challenges that demand critical thinking and research. It’s essential to evaluate use cases carefully, educate users about potential risks like vocabulary misuse, and promote informed, responsible applications. These technologies are here to stay, offering valuable benefits when used thoughtfully and with awareness.
About Anastassia Shaitarova
Anastassia Shaitarova is a fourth-year PhD student in the Department of Computational Linguistics at the University of Zurich. Her research is conducted within the framework of the Swiss-wide consortium NCCR Evolving Language, where she explores the impact of generative AI on natural language. In 2020, she completed her Master’s degree in multilingual text analysis at the same department. She has contributed to various projects in areas such as machine translation, language models, and natural language processing applications.
Learn more about Anastassia Shaitarova.