4. What am I doing here?
Create a great speech to start with; https://chat.openai.com/
Clone my voice; https://myvoice.speechify.com/
Create an Avatar; https://www.lightxeditor.com/
Create a personal AI Chatbot, with access to my ‘digital life’, so people can
interact with me; https://www.personal.ai/
And why are you here?
9. Ai vs. AI
Who am I?
Can you describe a sunset using
only numbers?
What is the one question you
would like to ask humans?
Do you believe in karma?
Source: https://www.designboom.com/art/ai-weiwei-81-questions-artificial-intelligence-picadilly-lights-exhibition-01-15-2024/
10. Technology will save us. Artificial intelligence will make us happy and
bring us beauty and eternal life. Blockchain technology will make trade
fair. Lab-grown meat will make the livestock industry unnecessary.
Biotechnology will cure cancer and dementia. We have great faith in
technological progress.
Companies, governments and independent researchers could
deploy powerful AI systems to handle everything from business
to warfare. Those systems could do things that we do not want
them to do. And if humans tried to interfere or shut them down,
they could resist or even replicate themselves so they could
keep operating.
What do you believe?
Or....
11. Jene van der Heide
Coordinator Research WDCC
Knowledge Base programme lead:
- Data Driven & High Tech
- Wageningen Modelling Group
AI Taskforce
Jene.vanderheide@wur.nl
Generative AI
The Impact of GenAI on Researchers
12. Artificial Intelligence: machine systems that aim to simulate human intelligence
Machine Learning: automatically learns and improves from experience
Deep Learning: multilayered artificial neural networks
Generative AI: AI capable of generating text, images, or other media, using generative models
Foundation models: general-purpose technologies that function as platforms for a wave of AI applications
13. The “classic” three steps of Artificial Intelligence evolution, moving from Narrow and General to Super Artificial Intelligence. Image credit: Blagoj Delipetrev
18. We encourage lecturers, researchers,
PhD/EngD candidates, students
(including professional learners) and
support staff at WUR to use
generative AI (GenAI) methods
responsibly.
19. But, how to deal
with GenAI responsibly
at WUR?
20. Today
Your needs and expectations
GenAI quiz
How to deal with GenAI at WUR
Back to your needs and expectations
21. What are your needs and
expectations?
Wooclap – so grab your phone!
34. We encourage lecturers, researchers,
PhD/EngD candidates, students
(including professional learners) and
support staff at WUR to use
generative AI (GenAI) methods
responsibly.
35. But, how to deal
with GenAI responsibly
at WUR?
37. Important principles:
Transparency: on the use of GenAI, depending on the type of use;
Verification: of the correctness of the generated output, with
attention to correct source references and check on biases;
Respect / Privacy: respect personal and confidential information and the
rules of data ownership by not entering such data on platforms that are
not managed on WUR’s own servers;
Knowledge and responsibility: for the proper use of GenAI (initially
as help and support) and for the output you publish. Make sure you
make a well-considered choice for certain (situation-related) use of
GenAI applications. Inform and train yourself.
Important restrictions:
Blackbox
No language understanding
Inaccurate, erroneous or misleading information (hallucinations)
Bias and discrimination
Out of date
Plagiarism
Poor reproducibility
Security / confidentiality of the entered data is not guaranteed
General guiding
principles
- to be published -
39. Principles and guidelines for WUR Doctoral candidates
(work in progress)
GenAI should be treated as a research support tool (in line with the Netherlands Code
of Conduct for Research Integrity) and integrated into the training and supervision of
Doctoral candidates.
Doctoral candidates (like all researchers) remain accountable for the accuracy, integrity and
originality of their research that includes the use of GenAI. The misuse of GenAI can constitute
falsification, plagiarism or ‘aigiarism’ and should be sanctioned as ‘research misconduct’.
GenAI can be interpreted as a tool or working procedure in the Code, the use of which
should be reported and cited honestly and transparently in PhD theses taking into
account the various risks outlined in the Guidelines for the Use of GenAI at WUR.
Authorship Statement for each chapter in the thesis itself.
I’m a PhD
What else?
I’m a supervisor
I’m a researcher
41. Be aware:
• Broadening the gap between
richer and poorer (accessibility)
• Working conditions for employees
of AI developing big tech
companies
• Carbon footprint
• Hallucination
• Bias and discrimination
• Privacy / Data security
• Decrease in writing skills
• Blurred authenticity
• Copyright / Trademarks
• How to credit AI
• GenAI needs you
42.
Were you hesitant to share yourself to create a digital self?
Were you impressed?
Do you believe you need an avatar like you need an email address?
When I started studying Spatial planning in ’94
Introductory statement at WUR
What rules should we have?
Agree on the proposed rules?
Example on how to discuss with students what is and is not allowed.
Advice to teachers: make it as explicit as possible. Add rules and regulations in course guide.
Examples of risks.
Hallucination: Confidently stating a falsehood as truth. Requires knowledge on the subject and critical reflection to counteract.
Privacy: All input and output is used for training (unless opted-out, as is possible now in ChatGPT via user Settings, though this is still a faith-based approach). Example: Putting in text with research questions/knowledge gaps from NL. Someone in Australia can ask for knowledge gaps in your field and get your input as result. Input cannot be traced back to source, but scientific or corporate advantages can be lost this way. Also risk of proprietary information leaking. Requires careful evaluation of input allowed, and boundaries related to data storage on AI-side.
Decrease in writing skills: Is writing a prompt and copying the output still writing, or does it diminish our writing skills? Debatable.
Blurred authenticity: Does the user writing the prompt claim authorship of a text, or should the AI? Is the AI tool or co-author? More in this in later slide.
Copyright: Already discussed related to artistic AI, but also applicable to other fields, such as audio and text.
How to credit AI? Should AI use be disclosed, or do we accept its use like a calculator without mention?
ChatGPT needs you: ChatGPT is a personal assistant, not a professor. You get out what you put in. If you put in low quality prompts, you will get out low quality results. Quality is your responsibility. Learning how to write great prompts will take time and energy, but it will pay off.
Welcome. This presentation aims to take you along the why (why is this interesting, and why are we talking about it?), the what (what is it actually, and what can we (not) expect from it?), and the how (how can we at WUR exploit generative AI to ‘explore the potential of nature to improve the quality of life’, for research, to start with?).
I am director of Wageningen Data Competence Center where we (try to) coordinate and support our efforts in exploiting the potential of data, to improve … Thanks to Jene for (co-)organizing this meeting!
Only last week our national government presented a broad vision on generative AI. The government will proceed with a diverse action programme in society. All aspects are relevant for us. In this presentation I will focus on content, since I hope that can guide us in all other aspects, such as legal issues, education and responsible use, and because I think expertise in this field will help us influence the societal debate. At WUR, we are a kind of societal microcosm, of course.
So, why? What you see here is … You may know these examples. Astonishing. There seems to be huge potential for everything involving computers. Imagine connecting it to a text-based command interface of something that executes those commands, i.e. including it in an online real-time loop of a machine working from text-based commands. Climate control? Logistics? Production process control? Imagination has no limits here. However: (1) energy consumption on the order of TWh/yr (nation level); (2) restricted access to, and doubtful legal background of, training data; and (3) fault rates can be low but are in general nonzero, depending strongly on the search or training space. Exams, as an example, have a very limited search space, namely the material to be learned by students. The three (energy – data – errors) are closely related. Why is that? These things are not intelligent! Turning to the what.
Now, what. I have been talking about search space, or training space. Why is this relevant? This space contains the possibilities within which we want to find our way. Generative AI works, in a sense, like a navigator: like finding your way on a map. Points in space can be characterized by coordinates, such as x, y, z. Depending on your problem you may need 1, 2, 3 or 500+ coordinates. For instance, if you are on the right road to your destination, you only need one dimension, namely distance. If you want to hit a moving bird, you need four: x, y, z and time.
The picture shows ...
This is a way, given data and the problem to be solved, to determine how many and which coordinates you need. A well-known method is principal component analysis (PCA). Here, linear (so relatively simple) relationships are used to find the right coordinates. An example is shown. Depending on the desired accuracy, one or two coordinates suffice.
So, the point here is: finding the right coordinates from you data helps you solve your problem.
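The PCA idea above can be sketched in a few lines. This is an illustrative example with synthetic data (the numbers are invented, not from the slides): three measured coordinates that in reality lie along a single direction, so one principal coordinate should capture almost all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-D data that really lives on a 1-D line plus a little noise,
# so one principal coordinate should suffice.
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.05 * rng.normal(size=(200, 3))

Xc = X - X.mean(axis=0)                 # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)         # fraction of variance per coordinate

coords_1d = Xc @ Vt[0]                  # the data expressed in one coordinate
print(explained[0])                     # close to 1: one coordinate suffices
```

The fitted directions (rows of `Vt`) are exactly the “right coordinates” the slide refers to: project onto the first one, and you solve a 3-D problem in one dimension.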
There is a famous publication from, by now, seven years ago that explains how the principle of representation with coordinates can be used for large language models and generative AI. Here, attention refers to the idea of using distance to estimate probability: short distance means high probability. For distance you need coordinates, and the paper describes a way to determine them. It is nonlinear, but in principle very similar to PCA.
This shows a schematic of the machinery inside LLMs and GenAI. It is called a Transformer. A bit more complex, but there is a part that determines the coordinates, called embedding, and a part that determines distances, called attention. Which is all I wanted to point out here: these are the fundamental processes that determine the output. And one more thing, of course: not knowing beforehand which coordinates to choose, this entire system needs to be trained, using data, computing time, funding and human feedback. So it really is only about fitting existing data. Stream one (left) is for training; stream two (right) is for use. I am not aware of all the details involved, but I guess we have researchers in the audience who are!
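The embedding-plus-attention idea can be illustrated with a toy numerical sketch (invented numbers, and a deliberately stripped-down version of what a real Transformer computes): tokens as coordinate vectors, scaled dot products as a closeness score, and a softmax turning scores into probabilities.

```python
import numpy as np

# Three "tokens" embedded as 4-d coordinate vectors (toy numbers):
# tokens 0 and 1 point in nearly the same direction; token 2 does not.
E = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.9, 0.1, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# Attention scores: scaled dot products measure closeness in coordinate space.
scores = E @ E.T / np.sqrt(E.shape[1])

# Softmax: short distance (high similarity) becomes high probability.
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)

# Output: a probability-weighted mixture over the value vectors.
out = weights @ E
```

Token 0 ends up attending mostly to itself and to its near neighbour, token 1, and hardly to the distant token 2 — which is the “distance means probability” point of the slide in miniature.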
This method clearly is amazing, but not perfect. First, it is almost prohibitively expensive to develop for mortals like us, which makes us dependent on a few closed companies. So, unless we want to become entirely dependent, we will need to be clever and develop something smarter and less expensive. Then, obviously, it has shortcomings, as the example shows. So, not only because of this dependence, but also for reliable scientific applications, I would say we cannot rely on this. The search space is still too big to narrow down with current methods responsibly, reliably and affordably. The machine is not intelligent enough. But all the same, as you can see, it has been trained to be very polite (and wrong)! And, I am tempted to say, to deceive you into believing it is intelligent.
Are such obvious mistakes being noticed? That has been explored, somehow, overseas. We do notice them, but evidently not all the time: 4% never recognize mistakes, 19% rarely, 37% only sometimes, and just 40% often. In science, evidently, this is not acceptable, as it is, by the way, in commanding critical systems. So the example I used before still lies somewhere in the imagination. Results may be devastating if we carelessly adopted generative AI in such circumstances: not just automatically driving cars accidentally causing traffic victims, but serious mistakes in scientific research, or endangering whatever text-controlled system you may want to operate with it. So far the what. With all due respect to the computer boys who made this, I think we should not be too impressed, and not at all content with the current state of the art!
Now the how. I think we can distinguish two ways of combining AI with Science. For the first, I refer to a recent paper published in Nature. It uses the expression AI 4 Science, and shows various ways how AI can be included in different parts of the scientific process. The different parts are indicated: from data collection and curation (data labelling), to representation (to be interpreted here as visualization), generating scientific hypotheses (such as by nonsupervised clustering algorithms) and experiment design and simulations (chemical synthesis planning, controlling experiments). Like we do, as recently published with examples in a news article on our intranet. The distinction is not strict, but AI and Science can be combined in multiple ways. Not excluding each other.
The alternative I’d like to call Science 4 AI. Or: AI on scientific steroids. Imagine a neural network consisting of nodes. Each node has a kind of activation function that needs to be trained to represent behaviour captured in data; these are usually functions of some general type. Now imagine having data from the solar system: images showing unidentified moving objects. We could use a neural network with such general functions to try to predict their behaviour. Will they collide with Earth? But instead, would it not be easier to include what we already know about the movements of planets, by including special functions, such as one representing the force of gravity? Then we would only have to find parameters for these functions, rather than having to represent their behaviour as well. So we limit the search space to parameters, instead of functions too. Examples of this kind, in our domain, are being actively investigated at WUR, for instance as part of the investment theme D3C2. For more info, please ask Ioannis and Yvette.
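The Science 4 AI idea can be sketched with a toy example (hypothetical, invented numbers, not a WUR project): instead of training a generic network, we fix the known physical form a = GM / r² and fit only its single parameter GM from noisy observations. Because the model is linear in GM, least squares gives it in closed form.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy "observations" of gravitational acceleration at various distances.
GM_true = 3.986e14                      # assumed value, m^3/s^2 (Earth-like)
r = rng.uniform(7e6, 4e7, size=50)      # distances in metres
a_obs = GM_true / r**2 + rng.normal(0.0, 1e-3, size=50)

# Science 4 AI: keep the known physical form a = GM / r**2 and fit only
# its one parameter. Linear in GM, so ordinary least squares is exact:
x = 1.0 / r**2
GM_fit = float(x @ a_obs / (x @ x))

print(GM_fit / GM_true)                 # close to 1
```

One parameter instead of thousands of network weights: that is the search-space reduction the slide argues for.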
To conclude with, let us go and find out!
And participate in the societal debate, based on our specific expertise.
Any questions?
Example of literature crawlers.
Scite is broadly used in the US, assesses content of papers and helps find new related articles etc. Is a paid service, but broadly accepted.
Elicit scans literature based on a research question as input and provides papers (with additional information such as Q1/Q2 ranking, funding sources, commentary in other papers, etc.) as well as a summary based on the top 4 results. The example shown was adjusted to the audience’s chair-group topic.
ResearchRabbit focusses on citations. It takes a paper as input and lists citations and cited-by relations as a web. Helpful for following progress on specific topics in a field.
Three examples of errors related to prompt engineering. Ask the audience to think for a moment for each of the three examples what they think the result will be, what the pitfalls could be, and how it could be improved. Briefly discuss audience answers before revealing answers in slides.
Example prompt engineering. Implicit meaning of words results in wildly different results. Consider words carefully.
“Condense” implies same content needs to be maintained, but using fewer words. Is better suited for such tasks.
Example on pitfall of AIs: Bias and lack of context. How should the AI evaluate this, from what viewpoint and context? Objective review needs to be prompted by asking for pros and cons (or similar tricks).
Example of prompt engineering with lack of context. More information needs to be given to potentially generate the desired output.