Risks and limitations
Using GenAI safely and responsibly in research (support) requires taking numerous legal, policy and ethical considerations and risks into account. Below, we provide key principles for the use of AI in research at UU, as well as the main risks of using AI and tips for risk mitigation.
Plagiarism and authorship
Utrecht University's position on research integrity is described in the UU Code of Conduct for Scrupulous Academic Practice and Integrity and in the Netherlands Code of Conduct for Research Integrity (2018). The latter defines plagiarism as "the use of another person's ideas, work methods, results or texts without appropriate acknowledgement". Whether, and to what extent, AI-generated text counts as plagiarism is currently still a grey area. It therefore remains your responsibility to ensure that submitted work reflects your own effort and complies with the rules on authorship described in the Netherlands Code of Conduct for Research Integrity.
Originality
When your work is expected to be original, it's important that the audience can reasonably assume that you created the content yourself. This expectation is lower in certain cases, like writing a standard instruction manual or a user guide. Even in those situations, if AI is used to help create the content, it鈥檚 still important for a person to review it to make sure it鈥檚 accurate.
Transparency
If you use GenAI in your workflow or research in a 'substantial' way (for example, use that goes beyond basic text editing support), be transparent about your use of GenAI tools. Although 'substantial' as a term cannot be simply defined, the example above provides a starting point for considering this.
As research support staff, you may sometimes work with material (for example text) that was written by a researcher to whom you are providing support. If you enter their work in a GenAI tool, make sure to ask whether the researcher you are supporting feels comfortable with you using a GenAI tool and specify which tool you intend to use (and how).
As a researcher, acknowledge the use of the source/platform appropriately and transparently, as you would any other piece of evidence or material in your submission. How you do so will vary widely depending on how you use AI tools in specific instances and/or on the conventions of your discipline. More information is available on citing GenAI in your work.
Tips on Research Integrity
- Ensure that any work you submit remains a reflection of your own effort.
- In case of substantial use, be transparent (inside and outside the organisation) about your use of GenAI.
Privacy and data security
Using GenAI involves processing large amounts of data, which can in many cases contain personal data. Depending on the AI tool you use, the data with which you prompt the model could eventually be used by the model developers to train and improve the models on which the tool is based. As a result, that data can 'leak', e.g. be reproduced as output for another user and become publicly available, even though that was never the intention.
Anything done with personal data using any computing tools, including AI, is subject to strict regulations, and the legal definition of personal data is much broader than the lay understanding of the term. For further information, please consult the relevant intranet pages. The General Data Protection Regulation (GDPR, or AVG in Dutch) sets rules for handling personal data. If you use GenAI with personal data, you must follow these rules. Academic research enjoys some exceptions, but these do not remove most of the key requirements.
Before you input data into any AI tool, consult a relevant officer of your department to determine whether the data you intend to input is personal data and what conditions apply: every organisational unit in the faculties and corporate offices of UU has experts on data governance and protection who can help you. See the intranet page for more information.
Tips on protecting privacy and data security
- When using GenAI, treat the information you enter as if you were posting it on a public site (e.g., a social network or a public blog).
- Do not use any personal or sensitive data as input for GenAI tools, unless you are certain that data protection law is complied with.
- Many risks pertaining to GenAI are not immediately visible. In many cases, it is necessary to assess the impact of your use of GenAI on data protection and privacy. You are not on your own: UU has experts who are available to assist and advise you with this process.
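As a rough illustration of the first two tips, a simple pre-prompt screen for obvious personal-data patterns can catch accidental slips before text reaches a GenAI tool. The sketch below uses two illustrative regular expressions only (emails and phone-like numbers); it is not a substitute for a proper data protection assessment or expert advice:

```python
import re

# Illustrative patterns only; real personal data is far broader than this,
# and a proper assessment requires your data protection experts.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b(?:\+?\d[\d \-]{8,}\d)\b"),
}

def flag_personal_data(text: str) -> list[str]:
    """Return the kinds of possible personal data found in `text`."""
    return [kind for kind, pat in PATTERNS.items() if pat.search(text)]

# If anything is flagged, stop and consult an expert before prompting.
hits = flag_personal_data("Contact j.doe@uu.nl or +31 30 253 3550.")
```

A screen like this errs on the side of false positives by design: it is cheaper to double-check a flagged prompt than to retract leaked data.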
Intellectual property
When using GenAI tools, it is crucial to be aware of potential intellectual property (IP) issues. For any AI-generated content, questions may arise about who owns the IP rights to that content. Current laws do not yet provide clear answers to AI-related IP issues. In case of substantial use of GenAI (see 'Research Integrity'), keep a track record of your workflow (e.g. earlier text drafts and revisions, literature summaries, etc.) so that you can demonstrate that your work is original if needed (e.g. if there is doubt as to the origin of the work).
Similar to personal data, works of others entered into an AI tool can also accidentally 'leak'. Works of others that you input into an AI tool are protected by the copyright of their authors and/or publishers, and inputting them may constitute a copyright violation. Read the licences under which those works are published: while older licences do not explicitly account for use in AI tools, they still contain relevant general provisions, and more recent licences may already contain AI-specific clauses.
Tips on avoiding legal risks
Ensure that the training data of your GenAI model does not infringe on any third-party IP rights and, until clear laws are in place, err on the side of caution. To mitigate these risks:
- When GenAI is used 'substantially' (see 'Research Integrity'), for example in publications or grant proposals, make sure to be transparent about the way it was used.
- Keep a track record of interactions with GenAI.
- Carefully select your GenAI tool, taking into account IP and data security risks (see also Effective use of GenAI).
- Some tools let you opt out of having your data used for model training (usually under user settings). Others require an explicit opt-in and do not use your data for model training unless you have given consent. When possible, opt out of model training or use tools that do not use input data for training.
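A track record of GenAI interactions can be kept very simply, for example as an append-only JSONL log. The sketch below is a minimal illustration; the file name and the tool name are hypothetical placeholders:

```python
import datetime
import json
import pathlib

LOG = pathlib.Path("genai_log.jsonl")  # hypothetical log location

def log_interaction(tool: str, prompt: str, output: str) -> dict:
    """Append one prompt/output pair, with tool name and UTC timestamp, to a JSONL log."""
    entry = {
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tool": tool,
        "prompt": prompt,
        "output": output,
    }
    with LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

entry = log_interaction("ExampleGPT", "Summarise my draft.", "Here is a summary...")
```

One JSON object per line keeps the log append-only and easy to inspect later, which is exactly what you need if the originality of your work is ever questioned.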
Bias
GenAI models are only as good as the data they are trained on. If the training data is biased or incomplete, the GenAI's output will also be biased or incomplete. For example, a model trained only on text from western countries will reflect the wording, cultural biases, and ideas prevalent in those countries. If, for instance, you happen to be writing a proposal about mental health conditions that are more prevalent in non-western countries, this could be an issue. It is crucial to use GenAI tools that are trained on diverse and representative data, and/or to screen the output of such tools carefully for potential biases.
Tips on avoiding bias
- Understand the GenAI鈥檚 training data as much as possible. Diverse and representative data lead to fairer outputs.
- Stay critical and continuously assess the GenAI鈥檚 output for biased patterns.
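When training data (or its metadata) is available, checking its diversity can be as simple as counting. The sketch below is a toy illustration: the region labels and the 50% dominance threshold are made up for the example, not a real corpus or an established cutoff:

```python
from collections import Counter

# Hypothetical metadata: the region of origin of each training document.
regions = ["EU", "EU", "EU", "EU", "North America", "EU", "Asia", "EU"]

counts = Counter(regions)
total = len(regions)
shares = {region: n / total for region, n in counts.items()}

# Flag any region that dominates the corpus (threshold is illustrative).
dominant = [r for r, share in shares.items() if share > 0.5]
```

In practice, tool providers rarely publish training data in full, which is why the second tip, critically assessing the output itself, remains essential.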
Hallucination
GenAI recognises patterns and produces statistically likely output without understanding the content the way a human does. It can therefore generate output that seems plausible but is factually inaccurate or even nonsensical, a phenomenon often referred to as 'hallucination'. The AI might, for example, confidently describe a non-existent psychological theory called 'Cognitive Resonance Theory' and attribute it to a fictional psychologist. Or, when asked for references, it might cite publications that do not actually exist. Since GenAI models seldom express uncertainty, they tend to state such things baldly as facts, making inaccuracies harder to detect.
The output of a GenAI model can also vary substantially depending on how you phrase your prompt, instruction, or question. Effective use may therefore require refining a prompt to obtain more relevant or precise output. Note that because these models sample their output probabilistically, the exact same output is not guaranteed over multiple queries, even with an identical prompt.
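A toy illustration of why identical prompts can yield different outputs: GenAI models typically sample each next token from a probability distribution, with a 'temperature' parameter controlling how random that sampling is. The sketch below uses made-up token scores, not a real model, and shows that sampling varies while greedy decoding (temperature 0) always picks the top-scoring token:

```python
import math
import random

def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Sample one token; temperature 0 means greedy (always the top score)."""
    if temperature == 0:
        return max(logits, key=logits.get)
    # Softmax with temperature: higher temperature flattens the distribution.
    weights = [math.exp(score / temperature) for score in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

# Toy next-token scores for one and the same prompt.
logits = {"results": 2.0, "findings": 1.8, "data": 0.5}

rng = random.Random(42)
sampled = [sample_token(logits, temperature=1.0, rng=rng) for _ in range(10)]
greedy = [sample_token(logits, temperature=0.0, rng=rng) for _ in range(10)]
```

Because "results" and "findings" have similar scores here, sampling at temperature 1.0 can return either, while greedy decoding is fully repeatable.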
Tips on fact checking GenAI outputs
- Finetune your prompts to obtain relevant and accurate output and keep a record of your prompts.
- Always critically evaluate GenAI outputs and check for hallucinations.
Over-reliance
While GenAI can speed up processes and help generate novel ideas, over-reliance can lead to a lack of oversight and critical thinking. Some GenAI tools are marketed as time-saving 'AI Research Assistants' (e.g., Elicit, Scite, Scholarcy). If you choose to use them, do so with caution and be aware of their limitations. Be the human-in-the-loop (or, even better, in the driver's seat) at all times and evaluate the accuracy of the AI's output.
Tips on maintaining balance with GenAI
- Use (and promote) a balanced approach, keeping in mind and emphasizing that GenAI is a tool to complement human expertise rather than replace it entirely.
- Form your own GenAI peer review community: for longer and/or more important output, apply the many-pairs-of-eyes principle that you would also apply to scientific publications.
Environmental impact
Running, and especially training, GenAI models requires significant computational power, leading to high energy consumption and a substantial carbon footprint. Data centres use roughly 1-2% of the world's electricity, and an average ChatGPT prompt uses around 10 times more electricity than an average browser search, roughly the equivalent of lighting an LED light bulb for an hour [1]. Aside from energy use, GenAI indirectly requires large amounts of fresh water for server cooling. Considering the larger picture, the global demand for AI could result in 4.2 to 6.6 billion cubic metres of water withdrawal by 2027 [2], which is 4 to 6 times the annual water withdrawal of Denmark, or half that of the United Kingdom.
Though using GenAI can have many benefits for your work, it is important to balance maximising the benefits of AI, while taking the environmental impact into account. Learning where GenAI adds most value in your specific workflow and being well informed about its resource consumption can help to find a middle ground between these conflicting values.
[1] de Vries, A. (2023). The growing energy footprint of artificial intelligence. Joule. https://doi.org/10.1016/j.joule.2023.09.004
[2] Li, P., Yang, J., Islam, M. A., & Ren, S. (2023). Making AI less 'thirsty': Uncovering and addressing the secret water footprint of AI models.
Tips on reducing your environmental impact
- Ask yourself whether GenAI is really needed for the task, or whether a less energy-intensive solution (such as a browser search or, in a research context, a standard machine learning approach) would be sufficient or even better suited.
- Large, general-purpose models are powerful, but also use a lot of energy. Use smaller, task-specific models when possible for simple tasks (e.g., grammar checks or translation).
Guiding Principles
- Human accountability and human oversight: always remain the 'human in the loop' when working with GenAI. Ensure content generated by AI is factually accurate.
- Data protection and security: Familiarise yourself with the UU data security policies and know how to find expert help on data security in the organisation.
- Transparency; show and tell: proactively learn about the benefits and dangers of the GenAI tools you use or provide, and proactively communicate how you used them.
- Diversity, non-discrimination and fairness: Be aware of bias in GenAI data and output. Make sure output is in line with the UU values regarding Equality, Diversity and Inclusion.
- Environmental and societal well-being: Whenever possible, choose GenAI tools and instruments designed with climate and social considerations in mind. Do not use GenAI if your purpose can be achieved as effectively by other means not involving the use of GenAI.