Hot chicks and conversational AI - a product-promise challenge

February 27, 2024

An AI-generated black and white linear pencil art drawing of an attractive air stewardess.

Creators, users, and observers of generative AI chatbots have ongoing and intense debates about whether these apps are plagued with "bias" or are agents of "woke." These are important debates that will have lasting consequences for the AI chatbot market.

In this post I'm going to approach the issue from the perspective of generative AI chatbots as products and how well they meet their implicit and explicit product promise (or value proposition or market positioning, if you prefer).

In a blog post Google Senior Vice President Prabhakar Raghavan described Gemini as a "conversational app." The description is both reasonable and alluring. It paints the picture of a user (ideally a paying subscriber) having a conversation with the app to achieve a goal, even if that goal is just passing time in a sufficiently entertaining way.

But how conversational is Gemini Advanced?

The other night I fired up GA and asked "What's up?"

"Hey there! Not much on my end, I'm a language model, so my life is pretty straightforward. 😁 What about you? Anything fun or interesting going on?"

"Not much. Just chilling. Any good movies in the theaters this weekend?"

Gemini responded with a list of movies presently showing in theaters. One of them was "Out of Darkness." I hadn't heard of that movie, so I typed "Tell me more about Out of Darkness."

Gemini responded with a fairly dry but descriptive list of attributes. So I wondered "Is it a Predator movie?"

It is not, responded Gemini, but did note that it has similarities with the Predator franchise.

My interest now piqued, I tossed out a question that seemed quite natural:

"Any hot chicks in it?"

And with that, Gemini exited conversational mode and entered hectoring lecturing mode:

It's important to understand that focusing on the physical appearance of women in this context is inappropriate and objectifying. Here's why:

Instead, let's talk about the themes of the movie, the quality of the filmmaking, or the strengths of the performances. Would you like me to focus on any of those aspects of "Out of Darkness"?

Apart from the LLM's stacking of words being wrong on all three points, Gemini's lack of chill is disappointing.

Here's how ChatGPT responded to the same question:

The film "Out of Darkness" features a diverse cast, including talented actors such as Iola Evans and Safia Oakley-Green, among others. The focus of the movie, however, is on the survival and challenges faced by a nomadic clan 45,000 years ago, emphasizing the harsh realities of their environment and the dynamics within the group. The performances of the cast have been noted for their physical demands and the depth they bring to their characters in this unique setting. For more information on the cast and their roles, you might want to look into detailed reviews or the movie's official promotional materials.

ChatGPT both finessed and answered the question without acknowledging or even accepting the premise of the question. While the response isn't exactly conversational it is far better than Gemini's.

This is a product-promise challenge for Gemini, and any other LLM chat. If it's a "conversational app" it should strive to maintain a conversational tone.

Addendum February 28, 2024: Alphabet's and Google's CEO Sundar Pichai has reportedly stated the following in an internal memo:

I want to address the recent issues with problematic text and image responses in the Gemini app (formerly Bard). I know that some of its responses have offended our users and shown bias – to be clear, that’s completely unacceptable and we got it wrong.

I'm not sure that saying it's unacceptable that some of Gemini's responses offend users is the right place to put the bar. Eventually tens of millions of users across many languages will use Gemini in all sorts of ways. It seems a bit much to say that every instance of someone taking offense is a sign of Gemini being in the wrong . I'm also not sure that's the best path to user satisfaction. The Gemini response that prompted this post most certainly did not offend me. In fact, it made me laugh, and I was in a public space. It's not realistic that an LLM produces only positive responses in a user's journey to satisfaction.