The Heisenberg Uncertainty Principle and Bias in Generative AI

In this edition of my GenAI Demystified series, I consider a [mostly] metaphorical parallel between the Heisenberg Uncertainty Principle and bias in Generative AI.

If you have been following along, you will know that the aim of my “GenAI Demystified” series is to help a wide range of readers fill their AI toolboxes with not only tools but also knowledge, approaches, and a healthy dose of skepticism. Equipped with such a “tool-kit” [“tool-kit” being a metaphor for a set of skills and knowledge], one can navigate the complex web of AI technologies.

 

The Heisenberg Uncertainty Principle

The Heisenberg uncertainty principle [posited by Werner Heisenberg] is a fundamental principle of quantum mechanics. The long and short of it is that physicists agree there is a limit to the precision with which certain pairs of physical properties of a particle can be known. For example, a particle’s position and its momentum cannot both be measured with arbitrary precision at the same time. Another way of thinking of it: the more precisely one property is measured, the less precisely the other can be known. In the context of Generative AI, the interesting thing to me about the Heisenberg uncertainty principle is the “observer effect”.
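
For reference, the canonical form of the uncertainty relation bounds the product of the standard deviations of position and momentum:

$$ \sigma_x \, \sigma_p \;\ge\; \frac{\hbar}{2} $$

where $\hbar$ is the reduced Planck constant. No improvement in instrumentation gets around this bound; it is a property of the quantum system itself, not of the measuring device.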

 

The Observer Influences the Observation

In quantum mechanics, the act of observation or measurement can influence the state of a quantum system. This is known as the observer effect: the observer of an experiment influences the experimental observation. In Heisenberg’s original framing, the very act of measuring one property of a particle, such as its position, disturbs another property, such as its momentum. [Strictly speaking, the observer effect and the uncertainty principle are distinct ideas; the uncertainty relation holds even for ideal, disturbance-free measurements, though Heisenberg himself motivated it with a measurement-disturbance thought experiment.] The observer effect highlights the inherently probabilistic nature of quantum mechanics and the limits on our ability to simultaneously know certain properties of a quantum system with absolute precision.

 

Parallel between the observer effect and human interaction with Generative AI

Okay, before you conclude that I’ve flipped my lid: I understand that quantum mechanics is not Generative AI. Bear with me for a minute. The Heisenberg uncertainty principle is well accepted, and I would like you to consider a parallel between the observer effect in quantum mechanics and the influence of human prompts (and other human inputs) on the responses generated by AI models. In both cases, the act of observation or interaction plays a role in shaping the outcome.

 

When it comes to Generative AI, the responses generated by a model are influenced by the prompts and other inputs provided by humans. For example, the choice of words, context, and framing of a prompt can significantly impact the generated response. This means that the same AI model can produce different outputs based on variations in the prompts it receives; the sketch below makes this concrete.
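
Here is a minimal sketch of the idea, assuming the OpenAI Python SDK; the model name is just a placeholder, and any hosted LLM would behave similarly. The same underlying question is asked twice, once neutrally and once with a leading framing:

```python
# A minimal sketch: the same question, framed two ways, will often yield
# noticeably different answers. Assumes the OpenAI Python SDK is installed
# and OPENAI_API_KEY is set in the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

prompts = [
    "Is remote work good for productivity?",        # neutral framing
    "Explain why remote work hurts productivity.",  # leading framing
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; substitute any chat model
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    print(f"PROMPT: {prompt}")
    print(response.choices[0].message.content)
    print("-" * 60)
```

The second prompt presupposes its own conclusion, and the model will generally oblige. Nothing about the underlying facts changed; the framing did.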

 

However, it's important to note that while humans influence the generated responses, this does not mean that GenAI responses are an accurate representation of truth. AI models like GPT are trained on vast amounts of data and learn patterns from that data, but they don't possess true understanding or consciousness. The responses they generate are based on statistical patterns and associations in the training data. See my post on “Semantic Understanding and Reasoning in GenAI Models” for more details on this topic: https://www.rogercornejo.com/genai-demystified/2024/6/4/semantic-understanding-and-reasoning-in-genai-models

 

Human-in-the-Loop

While prompt chaining and other human interaction can shape the responses of Generative AI models, one must critically evaluate and verify the information those models provide to ensure accuracy and reliability. In my opinion, there is no substitute for the human domain expert, so I advise careful use of GenAI responses.

 

Bias in the context of Data Science, Machine Learning, and Generative AI:

In my view, bias in the context of Data Science, Machine Learning, and Generative AI is an important topic that warrants a closer look. In these fields, bias can be unintentional or, sadly, intentional (i.e., a bad actor’s unethical creation or use of GenAI). At any rate, bias is essentially encapsulated in the human influences present in the model inputs, the training and/or augmented data, the algorithms themselves, and the human decision-making applied to the results. Thus, bias can be injected into your results from various sources, including biased prompting, biased training data, biased algorithm design, and/or biased interpretation of results.

 

Understanding bias in Data Science, Machine Learning, and Generative AI is crucial for ensuring ethical use of these technologies. For example, if generative AI foundation models were trained to exclude a diversity of facts and thought patterns, and people relied on GenAI to represent truth or be comprehensive, you can easily see where things could go terribly wrong. I’m not saying that I have any examples of this, but one can imagine the problems a user would encounter if they expect truth and the model “lies” to them because of unethical biases introduced by the model creators**. In my view, truth is important because, by definition, truth is not evolving or in process; truth is not tentative or without finality [human interpretation of truth is what wavers]. Thus, any attempt to render truth as untruth (or vice versa) is subversive. My warning here is that generative AI should not be treated as authoritative or considered the final arbiter of truth. Think of it this way: generative AI models built on the transformer architecture are probabilistic mechanisms, and the probability of a token is not knowledge or truth; the short sketch below shows the mechanism in miniature. This is the warning that should be understood by all.

**[Side Bar Note - On the possibility of intentional/unethical biases introduced in the model: In this hypothetical scenario, the model is promoted as a truth source but used by the creators to influence societal behavior by seeding the model with a skewed version of authoritative information. I’m certain that social engineers (in whatever capacity they operate) understand well how generative AI can be used as a means of social control.  Control the information, control society.]
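
To make the “probabilistic mechanism” point concrete, here is a toy sketch in pure Python. The numbers are invented for illustration; a real transformer computes scores like these over a vocabulary of tens of thousands of tokens at every step:

```python
import math
import random

# Toy next-token scores (logits) for the prompt "The capital of France is".
# A real model produces one score per vocabulary token; these are made up.
logits = {"Paris": 6.0, "Lyon": 3.5, "London": 2.0, "Berlin": 1.0}

# Softmax turns raw scores into a probability distribution.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

for tok, p in probs.items():
    print(f"{tok}: {p:.3f}")

# Generation is sampling from this distribution: a weighted draw,
# not a lookup of a verified fact.
next_token = random.choices(list(probs), weights=list(probs.values()))[0]
print("sampled next token:", next_token)
```

Even when “Paris” dominates the distribution, the mechanism is a weighted draw over plausible continuations. The probabilities encode patterns in the training data, not verified facts, which is exactly why a high-probability answer can still be wrong.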

 

Let's consider again the role of prompting: one aspect that concerns me is how heavily Generative AI responses depend on the human influences present in the prompts. It raises the question of whether we tend to favor Generative AI responses simply because they align with our own preferences or expectations. This inherent confirmation bias could be a significant contributing factor to the popularity of generative AI. However, this phenomenon is a double-edged sword: while it's natural to appreciate responses that resonate with us, it is equally important to seek responses that can be objectively validated as correct.

 

Reducing bias requires careful and ethical consideration of training data, algorithm design, evaluation methodologies, and ongoing monitoring to detect and mitigate biases. By actively working to identify and address bias, we can strive for unbiased systems that benefit a wider cross-section of use cases. 

 

Bias Types – A Brief Breakdown:

Here are some examples of the kinds of bias that might be introduced by the human in the context of Data Science, Machine Learning, and Generative AI:

 

1. Algorithmic bias: Algorithmic bias refers to the bias that can be present in machine learning models due to the data used for training. I’m primarily concerned with any skewing of the training data, but you’ll often hear people raise the related concern that the training data is biased or reflects societal prejudices. Either way, if the training data is skewed, the model will learn and perpetuate that skew, leading to biased outcomes or predictions. For example, if a facial recognition system is trained primarily on data from a specific demographic group, it may perform poorly on individuals from underrepresented groups, leading to biased outcomes and potential harm (the first sketch below shows this mechanism on synthetic data). I acknowledge the difficulty of eliminating data skew, and I’m more or less happy to live with it as long as it is unintentional (i.e., no bad actors, or folks influenced by bad actors, feeding the model).

 

2. Confirmation bias in data analysis: Confirmation bias can also occur in data analysis, where analysts may selectively interpret or favor data that aligns with their preconceived notions or desired outcomes. This can lead to biased conclusions or biased interpretations of the data, hindering objective and unbiased decision-making. 

 

3. Observer bias in model evaluation: Observer bias can influence the evaluation of machine learning models. The subjective expectations or preferences of the evaluator can unintentionally influence the assessment of model performance, leading to biased judgments or interpretations of the model's effectiveness.

 

4. Feedback loop bias in Generative AI: Generative AI models, such as language models, can be influenced by biased or unrepresentative training data. If model-generated outputs, which reflect the biases in the original training data, are then folded back into future training data, those biases are perpetuated and amplified with each cycle (a small simulation below shows how quickly this compounds).
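
As promised in item 1, here is a small, self-contained sketch of how training-data skew shows up in results. The data and the two “groups” are entirely synthetic; the point is only the mechanism: a model fit mostly on group A generalizes poorly to the underrepresented group B.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_group(n, offset):
    # Synthetic two-class data; each group's features sit in a different
    # region (shifted by `offset`), so one linear boundary can't serve both.
    y = rng.integers(0, 2, size=n)
    X = rng.normal(scale=0.7, size=(n, 2)) + y[:, None] * 2.0 + offset
    return X, y

# Skewed training set: 95% group A, 5% group B.
X_a, y_a = make_group(950, offset=0.0)
X_b, y_b = make_group(50, offset=3.0)
model = LogisticRegression().fit(np.vstack([X_a, X_b]),
                                 np.concatenate([y_a, y_b]))

# Balanced test sets reveal the performance gap.
X_a_test, y_a_test = make_group(500, offset=0.0)
X_b_test, y_b_test = make_group(500, offset=3.0)
print("accuracy on group A:", model.score(X_a_test, y_a_test))
print("accuracy on group B:", model.score(X_b_test, y_b_test))
```

The model serves the majority group well and the minority group barely better than a coin flip, not out of malice but because the optimization was dominated by the data it saw most.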

 

These examples highlight the importance of addressing and mitigating biases in Data Science, Machine Learning, and Generative AI. It involves careful and ethical consideration of training data, evaluation methodologies, and continuous monitoring to ensure fairness, accuracy, and ethical use of these technologies.
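
And here is the small simulation promised in item 4. The setup is hypothetical: a “model” whose outputs tilt slightly toward the majority view in its training corpus, and whose outputs are then mixed back into the next round of training data.

```python
# Hypothetical feedback loop: the corpus starts 60/40 between two views.
# Each generation, the model over-represents the majority view slightly
# (a stand-in for favoring high-probability patterns), and its outputs
# become half of the next training corpus.
share_a = 0.60        # fraction of the corpus expressing view A
amplification = 1.1   # mild tilt toward the majority view

for generation in range(10):
    output_share_a = min(1.0, share_a * amplification)  # model output skews
    share_a = 0.5 * share_a + 0.5 * output_share_a      # retrain on the mix
    print(f"generation {generation}: view A = {share_a:.2%} of corpus")
```

A mild tilt per generation compounds: view A grows from 60% toward near-total dominance of the corpus in about ten cycles. This compounding is the core of the feedback-loop concern, and it is why training new models on AI-generated text draws so much scrutiny.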

 

Conclusion:

The observer effect associated with the Heisenberg uncertainty principle, where the observer influences the observation, can be conceptually applied to the field of Generative AI in terms of unintended (or intended) confirmation bias, observer bias, and biases built into the Generative AI Transformer Model.

 

In the context of Generative AI, unintended confirmation bias can arise when the prompts or inputs provided to the model unintentionally reinforce existing biases or preconceived notions. Just as the act of observation can influence the outcome in quantum mechanics, the choice of prompts or inputs can shape the generated responses. If the prompts consistently reinforce certain biases, the model may inadvertently produce biased or skewed outputs, perpetuating existing biases in the data it was trained on.

 

Observer bias in Generative AI refers to the influence of the human evaluator or user on the interpretation and evaluation of the model's outputs. Just as the observer's expectations or beliefs can influence the observed outcome in quantum mechanics, the evaluator's subjective expectations or preferences can unintentionally introduce bias in assessing the quality or appropriateness of the generated responses. This can lead to biased judgments or interpretations of the model's performance.

 

Moreover, biases can be built into the Generative AI Transformer Model itself. If the training data used to train the model contains biases or reflects societal prejudices, the model can learn and perpetuate those biases in its generated outputs. This can occur due to biases present in the data collection process or biases inherent in the training algorithms. These biases can manifest in the form of skewed language, stereotypes, or discriminatory behavior in the generated responses.

 

Bias (unintentional or otherwise) is inherent in the human experience simply because facts don’t interpret themselves. Since data and information are human products, there is no way to completely get around bias. Because biases in Generative AI are essentially baked in at multiple stages, reducing bias involves diverse and representative training data, bias detection and mitigation techniques during model development, ongoing evaluation and monitoring, and ethical use of the technology. By acknowledging bias and actively working to mitigate it, we can strive for Generative AI systems that produce less biased and more reliable outputs.

 
