Back when I studied sociology in Berlin in the mid-2000s, my best mates happened to be computational linguists. I recall a few times when they would invite a whole gaggle of students around to their dinky flatshare to help them with assignments. They’d fill their bathtub with bottles of beer on ice, order pizza, and hand out reams and reams of printed texts and highlighter pens. Our job was to go through our stack and highlight every occurrence of a specific word. We were not required to understand why – I recall the texts barely made any sense to me, but I was told not to miss a single word because that could muck up their whole thesis. So fuelled by pizza and beer, we were there for hours. This memory came back to me when I first used an LLM – so THAT was what this was all about… in a way.
Today you can’t click far without running into stories, opinions and podcasts on the wide spectrum between AI enthusiasm and apocalypse. I’ve enjoyed the odd podcast on the matter myself, but even from a public management angle – I can’t get myself to have a strong opinion on the good, the bad and the ugly of it all yet. But I do have one nagging concern at the thought of us – and our public services – beginning to rely on LLM-based AI that I don’t see discussed much out there, so I’ll lay it out here today.
The long arc of human-computer interaction
I came up in User Experience and Usability – for years my job was to invite a human to sit in front of a computer and complete a certain task, usually on some website or app, while I observed their intuitive behaviour. The goal was to find ways to make the website or app easier to use, so that all different kinds of people could do what the site was meant to help them do – usually buy something or find some kind of information.
Most users were nervous when they felt they couldn’t complete a task or find a piece of information – they’d apologise, make awkward jokes or put themselves down:
“Sorry, it’s probably just me. I’m sure I’d be faster if I used it more often”
… was a phrase I heard in almost every session. That’s partly down to the lab setting: having someone watch over your shoulder while you look for a raincoat online is not exactly comfortable. What I’d always say to reassure them was:
“No, it’s not you – it’s the website. The website should work for everyone, no matter their background. It’s not us who should learn to use the website better – the site should adapt to us.”
I’d get a smile and a sigh of relief at that statement, and we could continue with more confident and clear feedback.
There’s a long arc from the very first introduction of commercial computers into people’s homes, when it was understood that it was up to the human to educate themselves on the computer’s language and the way it works – to the idea that tech should meet users’ needs. And now we have the astounding capacity of LLMs and AI that adapts to US, directly. Brave new world!
Tech that adapts to us
You could think that the emergence of LLMs and their capacity for adaptation makes UX and usability research redundant. Such systems learn our language, preferences and quirks, and are supposed to get better at understanding, anticipating and responding to us (at least on a communication level) – so it should no longer be necessary to have intermediary researchers who translate between the design and development of a system or service and its users.
To some extent that’s probably true. But the thing is: the development and operation of a service and its underlying tech is a LOT more than design requirements interfacing with human behaviour.
In other words: even in a world where a lot of our tech has an LLM core that learns and adapts on its own, it still lives in a system of human governance and management (for now).
I’m explicitly not thinking far ahead here, into a world of artificial general intelligence – my imagination can only reach so far…
My contention is that despite 50 years of human-computer interaction, and nearly 30 years of usability as a structured practice, there is still a humanities gap at the centre of software development and related governance systems like product management. A gap in which we are over-confident in engineering practices, and under-value human-centred methods.
Let me explain how I come to that conclusion, before we circle back to LLMs.
Tech and the humanities
Most of “tech” is developed iteratively: you don’t know all the facts, limitations and needs to start with. So you start small, design and develop a prototype that can be tested (ideally with real users), learn from that what works and what doesn’t, adjust, improve, and continue that cycle until you’ve got something you can release.
As I’ve explained before – that’s rarely how we actually go about it in practice, especially in the public service. We might go through the same motions, and you’ll be hard-pushed to find a business case or strategy paper that doesn’t make declarations about “customers” being “at the centre” of it all…
But at the end of the day, decisions are made around financial questions and technical feasibility. Every now and then a user issue gets a look-in and changes the course of a project. And I don’t want to paint everyone with a broad brush here: yes, this is a much bigger issue in the public service than in private companies. But even they fundamentally have their bottom line to think about. In my experience, private companies have developed a genuine conviction that a good customer experience serves that bottom line – but cash always has the final word.
Practically, that means that as a usability specialist, you spend much of your working life chasing after product owners, project managers and stakeholders – journey map in hand – trying to draw their attention to usability issues that could be prevented, and rarely getting more than a polite nod.
I’m exhausted just thinking about how much time I’ve spent trying to convince the people who hired me to “champion human-centricity” to make even just one decision based on user research.
When push comes to shove, success is defined as a product or service that’s shipped at a certain time, within a certain scope and budget. And as long as the lights are on – it doesn’t matter how many preventable user issues were caused by rushed compromises and decision-making that treats developer advice as essential, and user research as optional.
“Humanities are not science”
To me, this issue that has plagued my working life goes back to that old chestnut: social sciences and the humanities are not “real” science. They don’t need funding (I will never get over that one) – and they are not an area worth investing in.
I’ve had screaming discussions with people at parties over this strange line that some people smugly defend – for REASONS. When you press them on what marks a “real” science, their arguments tend to reveal this conviction for what it is: a belief based on ignorance of the methods and practice of the social sciences.
When they hear of an “experiment” in physics, they think of the Large Hadron Collider. But their only frame of reference for an “experiment” in the social sciences is the Milgram experiment.
There’s your problem.
I’ll stick with my own experience to illustrate this further: as someone who has observed literally hundreds of people interacting with digital technology in a commercial lab setting, a few things hold reliably true:
After 5-6 participants, you’ll start to see the main patterns of behaviour shaping up and repeating. So, about 70-80% of what people do with the same task and the same prototype will be more or less the same.
8 out of 10 participants will claim something along the lines of: “I’m a bit unusual, because I’m a real visual person.”
20-30% of what people do will be something that utterly surprises and baffles you – in no world could you have predicted that someone was going to do that.
Human behaviour is complex and diverse. But there are patterns. They are observable, there is cause and effect, and they can be replicated.
And to be clear, I’m not saying that what I did for most of my career was scientific – more often than not, in commercial settings, you wouldn’t pass muster, and there’s no peer review. But the principles of the methods have the same basis:
People be people
If you believe that there is some kind of value in systemically involving the study of human behaviour in your tech development – you’d need to make that the rule, not the exception. But after all these years, the opposite is true:
There’s a serious ceiling effect when it comes to the inclusion of people-based practices in the development and operation of technology – in our public systems especially.
The default setting is engineering – the influence of people-based advice and design remains limited and conditional.
Why is that?
The ways of working that we have adopted – project management, business cases, strategy papers, even agile to some extent – strive to make our work repeatable, consistent and predictable, and to reduce complexity. So the (simplified) 80-20 bell-curve variation in human behaviour means that the needs of the ones at the far extremes are just not what the system is built for. In our way of working, there’s an inbuilt skew towards the mass. That might be OK for a commercial entity, but not for a public service that fundamentally needs to meet everyone, wherever they are.
And what does that have to do with AI now?
If you’ve read anything on AI and its risks, you’ll be aware of the (potentially dangerous) distortions in how LLMs process information, stemming from the societal and systemic biases inherent in their training materials. You could say that in its process of generation, AI mirrors the world in its status quo, warts and all. Perhaps even zoomed in on the warts.
When AI enters the world of digital public services, it encounters an ecosystem with mechanisms, incentives and blind spots, formed and institutionalised by its preceding technological logic and underlying cultural code. An ecosystem that, after 20 years of attempts, has not managed to secure a systemic spot for people-centred practices in its machinery.
If you believe, like many people do, that one of the biggest risks of LLMs and AI is that they reinforce existing power structures, biases and misconceptions, and threaten critical thinking – then their utilisation in digital government systems that can’t currently handle simple user research in any reliable way is… a concern.
Managing this concern, imho, has to start with acknowledging the current gap in the recognition and utilisation of the humanities within tech development and management: the study of the variation in human behaviour can serve as a robust and required input into the development of technology, just as much as architecture, testing and digital strategy do.