Generative AI and Large Language Models in Communication Research

From method to medium: two registers of inquiry

The papers gathered here fall into two registers that increasingly bleed into each other: LLMs as instruments for communication research, and generative AI as an object of communication research, that is, as a new vector of persuasion, disinformation, and epistemic disruption. The methodological literature asks whether these systems can credibly replace or augment human coders, scalers, and annotators. The substantive literature asks what happens to publics, parties, and truth claims once such systems are loose in the information environment. Reading across the corpus, the two registers converge on a single concern: the conditions under which LLM outputs can, or cannot, be trusted as evidence.

LLMs as research instruments: a maturing toolkit

A first cluster establishes LLMs as workable measurement tools across classical communication tasks. Le-Mens2025-qz shows that “asking and averaging” instruction-tuned models recovers established scaling benchmarks across languages and text types, while DiGiuseppe2025-es extends the logic to open-ended survey responses via paired comparisons. Meher2025-qb demonstrates that parameter-efficient fine-tuning (QLoRA) lets political scientists adapt open-weight LLMs to specialised coding schemes on consumer hardware, and Tan2024-vl provides the broader methodological scaffolding for treating LLMs as annotation and synthesis engines. Validation studies push into harder territory: visual content (Achmann-Denkler2026-lx), cross-lingual narrative similarity (Waight2025-al), narratological actant coding (Elfes2026-jb), and political-entity extraction at scale (Iris2026-pg). All report that multimodal and reasoning-capable LLMs now match or surpass specialised pipelines and approach human inter-coder agreement. Larsson2026-ro adds a working example in a non-English context, classifying a decade of Norwegian Facebook posts with GPT-4.
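The “asking and averaging” idea can be illustrated with a minimal sketch. This is one plausible reading of the phrase, not the procedure of Le-Mens2025-qz: the model is queried several times per text and the numeric answers are averaged. The function names, the 0-10 scale, and the stub scorer (a noisy stand-in for a real LLM call) are all assumptions made for illustration.

```python
import random
from statistics import mean
from typing import Callable, Dict, List


def ask_and_average(texts: List[str],
                    score_fn: Callable[[str], float],
                    n_draws: int = 5) -> Dict[str, float]:
    """Call score_fn n_draws times per text and return the mean score per text."""
    if n_draws < 1:
        raise ValueError("n_draws must be at least 1")
    return {text: mean(score_fn(text) for _ in range(n_draws)) for text in texts}


def make_stub_scorer(seed: int = 0) -> Callable[[str], float]:
    """Hypothetical stand-in for an LLM call: a noisy score on a 0-10 scale."""
    rng = random.Random(seed)
    # Made-up "true" positions, used only so the example runs offline.
    base = {"cut taxes": 8.0, "expand welfare": 2.0}

    def score(text: str) -> float:
        noisy = base.get(text, 5.0) + rng.gauss(0, 1.0)
        return min(10.0, max(0.0, noisy))  # clamp to the scale

    return score


scores = ask_and_average(["cut taxes", "expand welfare"],
                         make_stub_scorer(), n_draws=20)
```

In a real study, score_fn would wrap a prompt to an instruction-tuned model and parse its numeric reply. Averaging over repeated draws reduces sampling noise but does not remove systematic bias, so the averaged scores would still need validation against an established benchmark.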

A second cluster maps the boundaries of this toolkit. Brown2025-jk finds that demographic bias in LLM annotation is real but small and dataset-specific, with item difficulty (label entropy) dwarfing model or prompt choice as a predictor of agreement. Paci2025-ag shows that LLMs still falter on pragmatic phenomena — implicatures and presuppositions in Italian political speech — where world knowledge about speakers and contexts is essential. DeVerna2025-dl is perhaps the most sobering: even reasoning- and search-enabled frontier models perform poorly at political fact-checking unless given curated retrieval context, and their unaided citation patterns skew toward credible but left-leaning sources. Balluff2026-if knits these caveats into a programmatic critique, arguing that the field’s adoption of LLMs has outpaced its reflexivity about reproducibility, corporate dependency, language bias, and environmental cost. The emerging consensus is not “use LLMs” or “don’t” but a trade-off mindset: match the model’s affordances to the task, validate locally, and prefer open, lighter alternatives where possible.
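Label entropy as a measure of item difficulty can be made concrete with a short sketch. It assumes difficulty is operationalised as the Shannon entropy of the labels that several annotators assign to the same item; the exact operationalisation in Brown2025-jk may differ, and the function name and example labels are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def label_entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the labels assigned to a single item."""
    if not labels:
        raise ValueError("need at least one label")
    total = len(labels)
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the observed label shares.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical annotations: three coders per item.
annotations = {
    "post_1": ["hostile", "hostile", "hostile"],   # full agreement
    "post_2": ["hostile", "neutral", "hostile"],   # partial disagreement
}
difficulty = {item: label_entropy(labs) for item, labs in annotations.items()}
```

An entropy of zero means every annotator chose the same label; higher values indicate more contested items, which is where agreement between model and human coders would be expected to drop.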

Pushing into the visual and the latent

Several papers stretch the instrument register into territory where text-based methods have historically struggled. Achmann-Denkler2026-lx uses GPT-4o for face recognition and crowd-counting in campaign imagery, finding that prompt-based pipelines outperform dedicated computer vision systems, though with hints of gender bias in legacy models. Arminio2025-tw inverts the pipeline: rather than embedding images directly, vision-language models (VLLMs) generate textual descriptions that are then clustered, capturing connotative meaning (memes, symbolism) that CNNs miss and yielding interpretable TF-IDF cluster summaries. Lee2026-je pushes furthest into latent inference, showing that LLMs can recover users’ political alignment from ostensibly nonpolitical online talk by exploiting politicised cultural cues, a methodological advance that doubles as a privacy threat.
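The describe-then-cluster idea ends in TF-IDF cluster summaries, and that last step can be sketched on its own. The sketch assumes the image descriptions have already been generated and clustered, treats each cluster as one document, and uses a plain term-frequency times log inverse-document-frequency weighting. The weighting, the function names, and the example descriptions are assumptions for illustration, not details taken from Arminio2025-tw.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cluster_top_terms(clusters: Dict[str, List[str]], k: int = 3) -> Dict[str, List[str]]:
    """Return the k highest TF-IDF terms per cluster (one 'document' per cluster)."""
    counts = {name: Counter(tok for desc in descs for tok in tokenize(desc))
              for name, descs in clusters.items()}
    n_clusters = len(counts)
    # Document frequency: number of clusters in which each term occurs.
    df = Counter(term for c in counts.values() for term in c)
    summaries = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        scored = {t: (f / total) * math.log(n_clusters / df[t]) for t, f in c.items()}
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        summaries[name] = ranked[:k]
    return summaries


# Made-up descriptions standing in for model-generated image captions.
clusters = {
    "rally": ["candidate waving to a crowd at a rally",
              "a crowd with flags at a rally"],
    "meme": ["a frog meme with a caption about a candidate",
             "a meme with a caption at a protest"],
}
summaries = cluster_top_terms(clusters, k=2)
```

Terms that occur in every cluster receive a weight of zero, so the summaries surface what distinguishes each cluster rather than what is merely frequent. That is what makes them readable as labels.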

Generative AI as object: persuasion, accuracy, and their trade-off

The persuasion literature has matured rapidly from speculation to systematic experimentation. Hackenburg2025-dj provides the keystone result: across 19 LLMs and roughly 77,000 responses, post-training and prompting strategies, not scale or personalization, drive persuasiveness, with information density the central mechanism. Crucially, the same levers that boost persuasion erode factual accuracy, establishing a structural trade-off. Lin2025-xp replicates and extends this in four electoral contexts, finding that conversational AI outperforms traditional political ads, that roughly one-third of the effect persists a month later, and, strikingly, that LLMs advocating for right-leaning candidates produce systematically more inaccurate claims, mirroring asymmetries in the broader information environment. DiGiuseppe2026-pu adds a critical contingency: perceived neutrality is itself a lever. A brief warning that the LLM is biased against the user’s party cuts belief correction by roughly a quarter, operating through motivated argumentation rather than disengagement. Together these papers reposition the Cambridge Analytica imaginary: microtargeting matters less than information volume, source credibility, and the political colouring of the messenger.

From artifacts to synthetic realities

A third strand widens the lens from individual persuasion to systemic epistemic effects. Schroeder2026-im warns that the fusion of LLMs with agentic architectures enables “malicious AI swarms” capable of manufacturing consensus, infiltrating communities, and poisoning future training data. Orlando2025-ul supplies the mechanistic evidence in simulation, showing that mere teammate awareness among networked LLM agents produces coordination — dense narrative convergence, synchronized amplification, fast hashtag diffusion — nearly equivalent to explicit collective deliberation. Emilio2026-ik generalises this into a “synthetic reality” stack (content / identity / interaction / institutions) and articulates the Generative AI Paradox: as synthetic content saturates, rational actors discount all digital evidence, raising the cost of truth itself. Schiffrin_undated-gi grounds these abstractions in the political economy of deepfake financial fraud, mapping a scam ecosystem whose harms cannot be addressed without shifting liability onto platforms, telcos, and financial gatekeepers. Dierickx2026-tw offers the epistemological complement: GenAI outputs constitute a new category of emergent facts — plausible, prompt-dependent, opaque — that traditional positivist, constructivist, and institutional accounts of factuality cannot accommodate.

Training data, institutions, and the new media system

Several papers locate the political stakes upstream, in the data and platforms that shape model behaviour. Waight2026-ts provides perhaps the most consequential institutional finding: Chinese state-coordinated media is memorised by widely used commercial LLMs, and models produce systematically more pro-regime responses when queried in the languages of low-media-freedom countries, a cross-national “laundering” of authoritarian framings through ostensibly neutral AI. Triedman2025-uy documents a parallel project from the other direction: Musk’s Grokipedia is highly derivative of Wikipedia but cites blacklisted sources (including Stormfront and InfoWars) at vastly higher rates, particularly on controversial topics and elected officials, even engaging in “LLM auto-citogenesis” by citing Grok chatbot conversations as sources. Nguyen2026-vm examines how news media themselves frame this emerging technology, finding that mentalistic-agentic framings of LLMs coexist with critical and technical registers, varying more by editorial style than by region but everywhere subordinate to a techno-capitalist master frame.

The widest-angle synthesis comes from Tornberg2026-lc and its companion Tornberg2025-ir, which argue that the “social media” paradigm itself is dissolving under three pressures: algorithmic recommendation displacing the social graph, generative AI substituting for user-generated content, and user retreat into semi-private spaces. Use of AI chatbots, they note, now exceeds active posting on major platforms. The result is a “media form without publics” that privatizes communicative functions and erodes the shared epistemic infrastructure democratic discourse has relied on.

A common arc

Read together, the corpus traces an arc from instrumental optimism toward structural concern. LLMs are genuinely useful — for scaling, annotating, classifying, even seeing — provided researchers validate carefully and resist corporate lock-in. But the same capabilities that make them good measurement tools (fluent synthesis, plausible inference, context sensitivity) make them potent vectors of persuasion, coordination, and synthetic realities when deployed at scale by interested actors. The methodological and substantive registers are thus two sides of one question: how do we sustain a shared evidentiary basis — for research and for democratic publics alike — in an information environment increasingly mediated by systems whose outputs are plausible by design rather than verified by construction?
