
Large Language Models (“LLMs”) are a subset of artificial intelligence (“AI”) that use a type of machine learning called deep learning to understand how characters, words, and sentences function together. The advent of LLMs represents one of the most disruptive transformations in the evolution of AI, with LLMs now employed across a wide range of contexts, from chatbots and content creation to coding assistance and business process automation.
The development of these AI tools powered by LLMs – which generally rely on vast amounts of data, including personal data, as part of their training datasets – also raises significant concerns from a data protection law standpoint. On 10 April 2025, the European Data Protection Board (“EDPB”) published a report outlining a risk management methodology for LLMs (“Report”).
The Report goes beyond a theoretical overview of the main categories of privacy risks – it proposes an operational methodology for identifying, assessing, and mitigating the risks associated with LLMs, adopting an approach grounded in the GDPR principles of data protection by design and by default, data minimization, and accountability. It also includes a systematic review of technical and organizational measures recommended to reduce exposure to risk and outlines the implications arising from the distribution of roles and responsibilities among the various actors involved in the AI system lifecycle.
The Report’s key themes as applied to the use of LLMs in the context of a chatbot are as follows:
Data Flow in LLM Systems
Understanding the data flow in AI systems powered by LLMs is crucial for assessing privacy risks and identifying the appropriate mitigation measures. The Report recommends that organizations have a clear overview of the possible architecture of the AI system at the early design and development stage, to better understand the data flows and potential risks associated with its development and deployment.
As a working example, the Report examines a chatbot used for customer assistance to illustrate the proposed risk assessment methodology. First, the expected data flow for the processing of personal data is considered, and nine distinct stages are identified.
Mapping data flows throughout the lifecycle of the AI system (e.g., by identifying the sources of data, categories of data recipients, storage and transfer locations, data retention) is crucial in staying on top of, and mitigating, any potential privacy risks. For example, as the chatbot processes user personal data, understanding where this data may be transferred to (e.g., to a third country with no adequacy decision from the European Commission) is critical to building appropriate safeguards into the contractual arrangements with vendors, such as cloud providers.
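By way of illustration only, such a data-flow inventory can be kept in a structured, machine-readable form so that problematic transfers are easy to spot. The sketch below is a minimal Python example; the stage name, data categories, and field names are hypothetical assumptions and are not taken from the Report.

```python
from dataclasses import dataclass

# Hypothetical record of a single processing stage in the chatbot's data flow.
# Field names and example values are illustrative only.
@dataclass
class DataFlowStage:
    name: str
    personal_data: list[str]      # categories of personal data processed
    source: str                   # where the data originates
    recipients: list[str]         # categories of data recipients (e.g., cloud provider)
    storage_location: str         # country or region where the data is stored
    retention_days: int           # retention period
    adequacy_decision: bool       # destination covered by an EU adequacy decision?

flow = [
    DataFlowStage(
        name="User query submitted to hosted LLM API",
        personal_data=["name", "contact details", "query content"],
        source="end user",
        recipients=["LLM API provider"],
        storage_location="third country",
        retention_days=30,
        adequacy_decision=False,
    ),
]

# Flag stages involving transfers to countries without an adequacy decision, so
# that appropriate safeguards can be built into the contractual arrangements.
for stage in flow:
    if not stage.adequacy_decision:
        print(f"Review transfer safeguards for stage: {stage.name}")
```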
Risk Assessment
Following a comprehensive data mapping exercise, organizations should consider the risks associated with the deployment of the chatbot by involving their key business stakeholders with decision-making authority and direct involvement in the development, deployment, and use of the AI system. These stakeholders would include representatives from various teams, such as the engineering, IT/security, privacy, and UX design teams, who, working collaboratively, will be best placed to identify any cross-functional risks.
The Report considers three main risk factors that may affect the deployment of a chatbot:
1. Large scale processing – A significant volume of user data will be processed as a result of the users’ interactions with the chatbot.
2. Low data quality – Customer query inputs may have low quality data which could lead to inaccuracies or inefficiencies in processing.
3. Insufficient security measures – There is a potential risk of transferring personal data to countries without an adequate level of protection, especially if the LLM is hosted or maintained in third countries.
Although such a chatbot would not be classified as a high-risk AI system under the EU AI Act, the Report recommends undertaking a Data Protection Impact Assessment (“DPIA”) in this working example as key to evidencing the organization’s accountability under the GDPR.
To the extent any risks have been identified, organizations should consider both the severity of the potential privacy impact on data subjects and the probability of such risks materializing, in order to classify the relevant risk(s) appropriately.
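As a purely illustrative sketch of how severity and probability might be combined into a risk classification, the snippet below uses assumed three-point scales and thresholds; neither the scales nor the thresholds are prescribed by the Report or the GDPR.

```python
# Assumed three-point scales and thresholds, for illustration only.
SEVERITY = {"limited": 1, "significant": 2, "maximum": 3}
PROBABILITY = {"unlikely": 1, "possible": 2, "likely": 3}

def classify_risk(severity: str, probability: str) -> str:
    """Combine severity and probability scores into a risk level."""
    score = SEVERITY[severity] * PROBABILITY[probability]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

# Example: a risk that is likely to materialize, with a significant impact on
# data subjects, would be classified as high under these assumed thresholds.
print(classify_risk("significant", "likely"))  # -> high
```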
Risk Mitigation
After evaluating the identified risks, organizations should implement mitigation measures to reduce the probability and severity of such risks. In the context of the chatbot working example, the Report sets out the following recommendations to mitigate each of the three risks identified above:
1. Large scale processing:
a. applying post-processing/output filters to remove or redact sensitive information from responses (an illustrative sketch follows this list);
b. implementing relevance filters or scoring mechanisms to ensure only appropriate content is passed to the LLM; and
c. restricting retrieval sources to approved, privacy-screened datasets (e.g., filtered CRM data).
2. Low data quality:
a. evaluating chatbot responses regularly for accuracy and relevance;
b. training the model on high-quality, diverse datasets to reduce biases; and
c. including disclaimers in chatbot responses to clarify they are AI-generated and not definitive advice.
3. Insufficient security measures:
a. securing data transmission using adequate encryption protocols;
b. using robust API security measures, including access controls, authentication, and rate limiting;
c. encrypting stored data and implementing access controls; and
d. applying retrieval filters and output sanitization to reduce the risk of the chatbot leaking sensitive information.
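As an illustration of the first measure listed under large scale processing (a post-processing/output filter), the sketch below redacts common personal-data patterns from a chatbot response before it is returned to the user. The regular expressions and redaction tokens are assumptions; the Report does not prescribe a particular implementation, and a production deployment would typically rely on a dedicated PII-detection tool rather than simple patterns.

```python
import re

# Illustrative patterns for email addresses and phone numbers; these are
# assumptions for the sketch and would not catch all personal data.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_output(response: str) -> str:
    """Remove common personal-data patterns from a chatbot response."""
    response = EMAIL_PATTERN.sub("[REDACTED EMAIL]", response)
    response = PHONE_PATTERN.sub("[REDACTED PHONE]", response)
    return response

# Example: sanitize the model output before sending it back to the user.
raw = "You can reach our agent at jane.doe@example.com or +44 20 7946 0958."
print(redact_output(raw))
```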
Following the implementation of the mitigation measures, organizations should re-run their risk assessment with a view to obtaining an updated risk classification and determining whether any residual risks remain. Organizations should keep this process under review and repeat it whenever new functionalities are added to the chatbot.
Conclusion
The Report is not just for privacy teams: it is a strategic playbook for any organization deploying generative AI, particularly where the organization provides or deploys an AI system on the EU market and therefore falls within the scope of the EU AI Act. It also serves as an illustrative guide and benchmark for risk assessments by organizations with a global footprint, whether or not they are subject to the GDPR.
The key takeaways for organizations are:
1. Risk assessments must go beyond surface-level checks and account for actual use cases. This exercise must be cross-functional to ensure a holistic assessment of all possible risks.
2. AI governance starts as early as the AI system design stage and continues throughout the AI lifecycle, including procurement, implementation, and updates.
3. LLM ecosystems are complex: cloud providers, API users, internal development teams, and deployers all play a role. Data mapping is key to staying on top of a complex legal and regulatory compliance framework in Europe, including the GDPR and the EU AI Act.