Blog: LLMs in Qualitative Analysis: Navigating the Intersection of Efficiency and Ethical Use
Large Language Models (LLMs) are rapidly becoming part of research practices, and qualitative analysis is no exception. They enable researchers to broaden the scope of their work, analyze large volumes of textual and visual data, and save time on repetitive tasks. The question is no longer whether these advanced tools will be used, but how they can be integrated transparently and ethically into research workflows.
Qualitative analysis is a complex practice that relies heavily on human interpretation, reflexivity, and judgment. From initial coding to theme development and the writing of analytical claims supported by evidence, the human element remains essential. Models for using LLMs in qualitative analysis vary widely:
- Integrations with traditional analysis applications, which can provide a secure working environment but are limited to predefined features.
- Dedicated qualitative AI applications, which offer speed but can be costly and may rely too heavily on algorithmic delegation.
- General-purpose LLMs, which support a more flexible, iterative, and human-AI collaborative approach to qualitative analysis.
Considering the third model, a general-purpose LLM can serve as a co-pilot rather than as a tool to which qualitative analysis is fully delegated. In this arrangement, the human researcher determines the analytical strategy and research objectives, while the AI contributes rapid processing, pattern detection, and summarization. The researcher then re-enters the process to conduct reflexive checking, adjust the analytical direction, and provide context-sensitive interpretation. Within this collaborative approach, two main modes can be followed.
- Scenario 1: AI as Validator. The human researcher performs the primary coding, and the AI subsequently reviews it for consistency, potential omissions, and possible contradictions. Here, consistency refers to the extent to which similar data excerpts are coded similarly across the dataset. The human researcher retains authority over interpretation and final coding decisions. This mode is particularly appropriate for high-stakes deductive coding, in which codes are established in advance based on theory or prior frameworks and where strong human control is required.
- Scenario 2: AI as Initial Annotator. The human researcher provides examples, or sample excerpts that demonstrate how data may be understood and labeled. The AI then generates initial open codes, meaning provisional labels that emerge from the data itself rather than from a predetermined coding scheme. The researcher reviews, validates, and refines these outputs, transforming them into more analytically meaningful categories where appropriate. This model is especially effective for rapid inductive coding, where patterns emerge from the data, and for overcoming blank-page syndrome, that is, the difficulty of initiating analysis when no initial coding structure is yet in place.
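The division of labor in both scenarios can be sketched as a simple data flow in which every AI-generated code must pass through an explicit human decision before it counts as part of the analysis. The sketch below is a hypothetical illustration only; the class, field, and status names are assumptions, not an established tool or standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodedExcerpt:
    """One data excerpt, its code, and the provenance of that code."""
    excerpt: str
    code: str
    source: str              # "human" (Scenario 1) or "ai" (Scenario 2)
    status: str = "pending"  # becomes "validated" or "revised" after review

def human_review(item: CodedExcerpt, accept: bool,
                 revised_code: Optional[str] = None) -> CodedExcerpt:
    """The researcher retains final authority: every item is reviewed."""
    if accept:
        item.status = "validated"
    else:
        item.code = revised_code or item.code
        item.status = "revised"
    return item

# Scenario 2: the AI proposes an initial open code; the researcher refines it
# into a more analytically meaningful label before it enters the analysis.
proposal = CodedExcerpt(
    excerpt="I never know when the workday actually ends.",
    code="time pressure",
    source="ai",
)
final = human_review(proposal, accept=False, revised_code="boundary erosion")
```

The key design point is that no record ever moves from "pending" to "validated" or "revised" without a call to the human review step, mirroring the principle that interpretation and final coding decisions remain with the researcher.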
To make this collaborative model effective, careful prompt design is essential. Well-structured prompts help general-purpose LLMs produce outputs that are more consistent, transparent, and analytically useful. One helpful approach is the ACTOR framework proposed by Kien Nguyen-Trung, which offers a structured way to communicate with AI in qualitative research.
- A - Actor (Role): Specify the AI's operational domain and expertise (e.g., "You are an expert qualitative research assistant").
- C - Context (Definitions): Provide strict academic definitions and explicitly state your research questions (e.g., "Define 'open coding' as the process of assigning initial labels to segments of qualitative data").
- T - Task (Action): State the precise analytical action required (e.g., "Perform initial coding").
- O - Output (Format): Dictate the exact structure of the deliverable (e.g., "Present findings strictly in a 4-column table").
- R - Reference (Data): Bound the AI to a specific data source so it does not make up data (e.g., "Use only the interview transcripts provided in Documents 1-10 and do not draw on outside sources or assumptions").
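As an illustration, the five ACTOR components can be assembled programmatically when many prompts are generated across interviews or coding rounds. The sketch below is a minimal example; the function name, parameter names, and section labels are assumptions for illustration, not part of the ACTOR framework itself.

```python
# Minimal sketch: assembling an ACTOR-style prompt from its five components.
# Function, parameter, and section names here are illustrative assumptions.

def build_actor_prompt(actor: str, context: str, task: str,
                       output: str, reference: str) -> str:
    """Combine the five ACTOR components into a single prompt string."""
    sections = [
        f"Role: {actor}",
        f"Context: {context}",
        f"Task: {task}",
        f"Output format: {output}",
        f"Data reference: {reference}",
    ]
    return "\n\n".join(sections)

prompt = build_actor_prompt(
    actor="You are an expert qualitative research assistant.",
    context=("Define 'open coding' as the process of assigning initial "
             "labels to segments of qualitative data."),
    task="Perform initial coding on the provided transcripts.",
    output=("Present findings strictly in a 4-column table: "
            "excerpt, code, definition, rationale."),
    reference=("Use only the interview transcripts provided in Documents "
               "1-10; do not draw on outside sources or assumptions."),
)
```

Keeping each component in a separate, labeled section makes prompts easier to audit and to report in a methods section, since each element of the instruction can be traced back to a deliberate analytical choice.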
However, this collaborative approach also requires careful attention to the risks of AI-generated content. These risks include unsupported claims, overly neat summaries, reduced emotional nuance, and the marginalization of minority or unusual voices. In addition, LLMs may impose generic categories, reproduce biases from their training data, create false confidence through polished language, and raise privacy concerns. To address these challenges and use LLMs ethically in the qualitative analysis process, the LEAP-PVD checklist offers a practical framework for ethical AI integration:
- L - Legal consideration: Ensure that the use of AI complies with relevant laws and institutional rules.
- E - Ethical guidelines: Follow established research ethics standards throughout the study.
- A - Application: Submit a clear application to the Ethics Review Board (ERB) explaining how AI will be used.
- P - Permission: Obtain explicit consent from participants to analyze their anonymized data with LLMs.
- P - Privacy: Remove or anonymize personal data before uploading any material to AI tools.
- V - Verification: Carefully review and validate all AI-generated outputs.
- D - Disclosure: Clearly report the use of AI in the final publication.
To conclude, LLMs are likely to become an important part of qualitative research practice, but their most valuable role is not to automate interpretation. Rather, they can support researchers in managing scale, iteration, and organization, while interpretation, reflexivity, and analytical judgment remain firmly in human hands.
Resources:
- Evaluation of Large Language Models Within GenAI in Qualitative Research
- ChatGPT in Thematic Analysis: Can AI Become a Research Assistant in Qualitative Research?
- LLM-Assisted Thematic Analysis: Opportunities, Limitations, and Recommendations
- The Use of Generative AI in Qualitative Analysis: Inductive Thematic Analysis With ChatGPT
- Leveraging Large Language Models for Thematic Analysis: A Case Study in the Charity Sector
- Generative AI in Qualitative Data Analysis: Introducing the Guided AI Thematic Analysis Framework
- Thematic Analysis With ChatGPT - Full Analysis From Codes to Themes
- Written by Hamayoon Behmanush (https://www.hamayoon.me/)