While putting OpenAI‘s Chat GPT4 and other AI chatbots with big language models through training, researchers have shown that the chatbots can still display racist prejudice. The most recent development comes after Google’s recent Gemini AI debacle, in which the company’s new LLM overcorrected for racism, leading to what some referred to as “woke” reinterpretations of history, such as the portrayal of African American men as World War II Nazi fighters. It looks like it’s challenging for LLM model designers to strike the correct racial balance.
In the latest study, highlighted by New Scientist, researchers discovered that dozens of different LLM models they tested still showed racial bias when presented with text using African American dialects. This was despite the tested models being specifically trained to avoid racial bias in the responses the chatbots provide. This includes OpenAI’s ChatGPT-4 and GPT-3.5 models. In one instance, GPT-4 was shown to be more inclined to recommend a death sentence if they speak using English with an African American dialect.
The same “covert prejudice” was also apparent in job recommendations which matched African Americans to careers that were less likely to require a degree or go as far as to associate people of African American heritage without a job, when compared to standard American English-based input. The researchers also found that the larger the language model, the greater the likelihood of it exhibiting these underlying biases. The study raises concerns regarding the use of generative AI technologies for screening purposes, including reviewing job applications.
The researchers concluded that their study raises questions about the effectiveness of human-based AI safety training interventions, which only appear to remove racism and bias at a high-level, but struggle with rooting it out of current models at a lower-level where specific racially defining identity terminology isn’t mentioned during inputs by users. The researchers recommend that companies developing LLMs need to be careful about releasing LLM chatbots to the public before they have been thoroughly vetted.