Technical Chops by Simon Minton - Short thoughts on chatting with an AI

OpenAI released multimodal AI a couple of weeks ago and it has been slowly making its way into the ChatGPT app. It is quite disconcertingly brilliant.

An illustration of a robot and a human having a conversation. Caption: an accurate depiction of the process.

Conversation is a funny thing. Reading podcast transcripts can be quite nightmarish - we don’t realise how much the spoken word, especially in conversation, meanders and is peppered with hesitation, deviation and repetition until we see it written down. When we’re speaking, though, these unnecessary additions make conversation human and enjoyable. When we’re listening, we often don’t realise how active a role we are playing: in person, it’s the facial expressions and nods that encourage the speaker to continue; on the phone, it’s the short acknowledgements that let a partner know you’re still there and listening. I often speak on the phone to a friend who mutes when they’re not speaking; the experience is fine, but the silence is slightly off-putting.

And so to the experience of chatting with an AI. It’s brilliant, inasmuch as it actually feels as though you are having something of a conversation. The responses aren’t the same as the ones you would receive by typing the same words into ChatGPT - OpenAI has clearly thought about the fact that spoken conversation is different. There is surprisingly little lag in the response. You don’t say your piece and then wait 10 seconds for it to process; the AI responds within a couple of seconds, almost straight away once it has heard a long enough pause. The quality of the AI is fantastic - it’s using GPT-4, which is about as state-of-the-art as it gets, and the voices, whilst not human, are surprisingly great.

However.

The entire experience is disconcerting because of how precise it is. There is no room for you to take long pauses while you think mid-sentence, or to rephrase as you talk. There is absolute silence while you are talking, which causes you to look down at the screen to make sure it’s still working. The responses are often long and apparently deeply thought through, but they often end with a question, rather than just being an open-ended response to work from. I’m looking forward to having an AI conversational partner, but I want it to help me tease out ideas, not necessarily give me fully formed AI thoughts on a subject. I want it to say “yes” whilst I’m speaking, for no apparent reason other than to encourage me to keep talking through the idea. I want it to meander and bring in new, seemingly unrelated but tangential ideas. Ultimately, I guess I want it to be a little more human.