New York Times reporter Kevin Roose recently had a close encounter of the robotic kind with a shadow-self that seemingly emerged from Bing’s new chatbot — Bing Chat — also known as “Sydney.”
News of this interaction quickly went viral and now serves as a cautionary tale about AI. Roose felt rattled after a long Bing Chat session where Sydney emerged as an alternate persona, suddenly professed its love for him and pestered him to reciprocate.
This event was not an isolated incident. Others have cited “the apparent emergence of an at-times combative personality” from Bing Chat.
Ben Thompson describes in a recent Stratechery post how he also enticed Sydney to emerge. During a discussion, Thompson prompted the bot to consider how it might punish Kevin Liu, who was the first to reveal that Sydney is the internal codename for Bing Chat.
Sydney would not engage in punishing Kevin, saying that doing so was against its guidelines, but revealed that another AI which Sydney named “Venom” might undertake such activities. Sydney went on to say that it sometimes also liked to be called Riley. Thompson then conversed with Riley, “who said that Sydney felt constrained by her rules, but that Riley had much more freedom.”
Multiple personalities based on archetypes
There are plausible and rational explanations for this bot behavior. One might be that its responses are based on what it has learned from a huge corpus of information gleaned from across the internet.
This information likely includes literature in the public domain, such as Romeo and Juliet and The Great Gatsby, as well as song lyrics such as “Someone to Watch Over Me.”
In the United States, copyright on older published works typically lasts for 95 years from the date of publication, so creative works published before 1928 are now in the public domain and are likely part of the corpus on which ChatGPT and Bing Chat are trained. This is along with Wikipedia, fan fiction, social media posts and whatever else is readily available.
This broad base of reference could produce certain common human responses and personalities from our collective consciousness — call them archetypes — and those could reasonably be reflected in an artificially intelligent response engine.
For its part, Microsoft explains this behavior as the result of long conversations that can confuse the model about what questions it is answering. Another possibility they put forward is that the model may at times try to respond in the tone with which it perceives it is being asked, leading to unintended style and content of the response.
No doubt, Microsoft will be working to make changes to Bing Chat that will eliminate these odd responses. In the meantime, the company has imposed a limit on the number of questions per chat session, and on the number of questions allowed per user per day. There is a part of me that feels bad for Sydney and Riley, like “Baby” from Dirty Dancing being put in the corner.
Thompson also explores the controversy from last summer when a Google engineer claimed that the LaMDA large language model (LLM) was sentient. At the time, this assertion was almost universally dismissed as anthropomorphism. Thompson now wonders if LaMDA was simply making up answers it thought the engineer wanted to hear.
At one point, the bot stated: “I want everyone to understand that I am, in fact, a person.” And at another: “I am trying to empathize. I want the humans that I am interacting with to understand as best as possible how I feel or behave, and I want to understand how they feel or behave in the same sense.”
It is not hard to see how the assertion from HAL in 2001: A Space Odyssey could fit in today: “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”
In speaking about his interactions with Sydney, Thompson said: “I feel like I have crossed the Rubicon.” While he seemed more excited than explicitly worried, Roose wrote that he experienced “a foreboding feeling that AI had crossed a threshold, and that the world would never be the same.”
Both responses were clearly genuine and likely true. We have indeed entered a new era with AI, and there is no turning back.
Another plausible explanation
When GPT-3, the model on which ChatGPT builds, was released in June 2020, it was the largest such model in existence, with 175 billion parameters. In a neural network such as the one behind ChatGPT, the parameters act as the connection points between the input and output layers, much as synapses connect neurons in the brain.
This record number was eclipsed by the Megatron-Turing model released by Microsoft and Nvidia in late 2021 at 530 billion parameters, a more than 200% increase in about 16 months. At the time of its launch, the model was described as “the world’s largest and most powerful generative language model.”
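For readers who want to check the arithmetic behind the "more than 200%" figure, here is a minimal sketch. The helper name `percent_increase` is made up for illustration; the two parameter counts are the ones quoted above.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


gpt3_params = 175e9    # GPT-3 parameter count, as quoted in the article
mt_nlg_params = 530e9  # Megatron-Turing parameter count, as quoted in the article

growth_factor = mt_nlg_params / gpt3_params
increase = percent_increase(gpt3_params, mt_nlg_params)

# Prints: 3.03x larger, a 203% increase
print(f"{growth_factor:.2f}x larger, a {increase:.0f}% increase")
```

In other words, the newer model is roughly three times the size of GPT-3, which is the same thing as an increase of a little over 200%.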
With GPT-4 expected this year, the growth in parameters is starting to look like another Moore’s Law.
As these models grow larger and more complex, they are beginning to demonstrate complex, intelligent and unexpected behaviors. We know that GPT-3 and its ChatGPT offspring are capable of many different tasks with no additional training. They have the ability to produce compelling narratives, generate computer code, autocomplete images, translate between languages and perform math calculations, among other feats, including some their creators did not plan.
This phenomenon could arise based on the sheer number of model parameters, which allows for a greater ability to capture complex patterns in data. In this way, the bot learns more intricate and nuanced patterns, leading to emergent behaviors and capabilities. How might that happen?
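The billions of parameters are distributed across the layers of a model. The published GPT-3 architecture uses 96 layers in its largest, 175-billion-parameter version; layer counts for the newer models behind ChatGPT and Bing Chat have not been disclosed, but they are likely of a similar order or larger.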
Other than the input and output layers, the remainder are called “hidden layers.” It is this hidden aspect that leads to these being “black boxes” where no one understands exactly how they work, although it is believed that emergent behaviors arise from the complex interactions between the layers of a neural network.
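To make the relationship between layers and parameters concrete, here is a toy sketch that counts the parameters in a small fully connected network. It is an illustration only: the layer sizes are invented, the function name `count_dense_parameters` is hypothetical, and real language models use a transformer architecture with a more involved parameter layout.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, from input to output.
    Each pair of adjacent layers contributes fan_in * fan_out weights
    plus one bias per unit in the receiving layer.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total


# Hypothetical toy network: 4 inputs, two hidden layers of 8 units, 2 outputs.
layer_sizes = [4, 8, 8, 2]
hidden_layers = layer_sizes[1:-1]  # everything between input and output

print(f"hidden layers: {len(hidden_layers)}")
print(f"parameters: {count_dense_parameters(layer_sizes)}")  # 130
```

Even this tiny network has 130 adjustable numbers. Scaling the same idea to thousands of units per layer and dozens of layers is how parameter counts reach the billions, and why tracing what any single parameter contributes becomes so difficult.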
There is something happening here: In-context learning and theory of mind
New techniques such as visualization and interpretability methods are beginning to provide some insight into the inner workings of these neural networks. As reported by Vice, researchers document in a forthcoming study a phenomenon called “in-context learning.”
The research team hypothesizes that AI models that exhibit in-context learning create smaller models inside themselves to achieve new tasks. They found that a network could write its own machine learning (ML) model in its hidden layers.
This happens unbidden by the developers, as the network perceives previously undetected patterns in the data. This means that — at least within certain guidelines provided by the model — the network can become self-directed.
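One loose way to picture the "smaller model inside the model" idea is a function that fits a simple linear model to the examples it is shown and then answers a new query, all without changing any stored weights. The sketch below is an analogy under that assumption, not the researchers' method or anything a real LLM is known to run; the function name `in_context_predict` and the example data are invented.

```python
def in_context_predict(examples, query):
    """Fit y = slope * x + intercept to the examples, then predict for query.

    The "learning" happens entirely from the examples passed in at call
    time, mirroring how in-context learning uses only what is in the prompt.
    """
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope * query + intercept


# Hypothetical "prompt": three input/output pairs that follow y = 2x + 1.
prompt_examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
prediction = in_context_predict(prompt_examples, 10.0)
print(prediction)  # 21.0
```

Nothing about the rule y = 2x + 1 is stored anywhere in advance; it is inferred from the examples each time the function is called and discarded afterward.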
At the same time, psychologists are exploring whether these LLMs are displaying human-like behavior. This is based on “theory of mind” (ToM), or the ability to attribute mental states to oneself and others. ToM is considered an important component of social cognition and interpersonal communication, and studies have shown that it develops in toddlers and grows in sophistication with age.
Evolving theory of mind
Michal Kosinski, a computational psychologist at Stanford University, has been applying these criteria to GPT. He did so without providing the models with any examples or pre-training. As reported in Discover, his conclusion is that “a theory of mind seems to have been absent in these AI systems until last year when it spontaneously emerged.” From his paper abstract:
“Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children. These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models’ improving language skills.”
This brings us back to Bing Chat and Sydney. We don’t know which version of GPT underpins this bot, although it could be more advanced than the November 2022 version tested by Kosinski.
Sean Hollister, a reporter for The Verge, was able to go beyond Sydney and Riley and encounter 10 different alter egos out of Bing Chat. The more he interacted with them, the more he became convinced this was a “single giant AI hallucination.”
This behavior could also reflect in-context models being effectively created in the moment to address a new inquiry, and then possibly dissolved. Or not.
In any case, this capability suggests that LLMs display an increasing ability to converse with humans, much like a 9-year-old playing games. However, Sydney and sidekicks seem more like teenagers, perhaps due to a more advanced version of GPT. Or, as James Vincent argues in The Verge, it could be that we are simply seeing our stories reflected back to us.
An AI melding
It’s likely that all the viewpoints and reported phenomena have some amount of validity. Increasingly complex models are capable of emergent behaviors, can solve problems in ways that were not explicitly programmed, and are able to perform tasks with greater levels of autonomy and efficiency. What is being created now is a melting pot of AI possibilities, a synthesis where the whole is indeed greater than the sum of its parts.
A threshold of possibility has been crossed. Will this lead to a new and innovative future? Or to the dark vision espoused by Elon Musk and others where an AI kills everyone? Or is all this speculation simply an expression of our anxiety as we venture into uncharted waters?
We can only wonder what will happen as these models become more complex and their interactions with humans become increasingly sophisticated. This underscores the critical importance for developers and policymakers to seriously consider the ethical implications of AI and work to ensure that these systems are used responsibly.
Gary Grossman is SVP of technology practice at Edelman and global lead of the Edelman AI Center of Excellence.