Some opinions and thoughts about (generative) AI
Look, this is all massively complicated and ethically fraught. I've played around with what is laughably being called AI since ChatGPT and Stable Diffusion were first released and have come to this conclusion:
I don't think AI should be used for creative purposes, but it can be useful for programmatically re-contextualising large bodies of text as a step in a bigger creative process.
I have four core beliefs about the topic:
The scraping that took place in order to develop the models is fundamentally unethical. Copyright was outright ignored, and the models built from that data do not cite their sources. It feels too late to protest this for text1, but it still most definitely feels right to protest it for image generation. Pictures take a long time for humans to make and carry a unique style. Time, effort, and style, along with less material for machines to scrape, mean that the fingerprints of the original artists are still there in the generated material.
To that point, arguably what defines being human is our ability to turn a nascent thought into something that can be communicated. We express thoughts and feelings using words and pictures. Asking a computer to generate an email, design a flyer, or frankly do anything creative means outsourcing that creativity. It may sound hysterical to claim that an AI-written Amazon review costs humanity some sliver of its soul, but in my opinion any creative endeavour is an attempt to express a human belief or opinion. When an AI service is involved, is the opinion still true? Is it valid? Or is it soulless?
Anything produced by generative AI can be wrong, and it's often impossible to know when it is wrong. Worse, when the output is wrong it's always confidently wrong. In my own experiments, models have misidentified flora and fauna, failed basic logical and mathematical challenges, and cited made-up sources. No context is given and the result is presented as fact. The outputs simply can't be trusted.
I am ok using LLMs as part of a process so long as the results are verified and what is ultimately delivered isn't created by AI. In my professional work as an analyst of survey data, AI models are proving to be useful for understanding comments that people leave in surveys and reviews. When I'm presented with 10,000 verbatim results, it's often useful for me to programmatically instruct a (local) LLM to turn each comment (e.g. "I had a lousy experience and I couldn't find the customer service desk") into an action ("Make the customer service desk easier to find"), then to do my own analysis on these actions. The one thing that AI models are generally good at is summarisation or rephrasing existing text; for market research data this is proving really valuable. No human jobs are being lost when I do this, and the comments aren't being used to further train the model. I do a lot of checking and nothing delivered to my clients is created by AI. It's essentially just a very efficient tagging tool.
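The tagging workflow described above can be sketched in a few lines of Python. This is a minimal illustration only: the prompt wording, the function names, and the stubbed model call are all my own inventions for the example, not any real pipeline. In practice the `ask_llm` callable would dispatch the prompt to a locally hosted model; here it is injected as a parameter so the surrounding code stays self-contained.

```python
# Sketch: turn survey verbatims into suggested actions via a local LLM.
# The model call is deliberately pluggable -- nothing here talks to a
# real service. Every LLM output still needs human verification before
# anything is delivered.

PROMPT_TEMPLATE = (
    "Rewrite the following survey comment as a single, concrete action "
    "the business could take. Reply with the action only.\n\n"
    "Comment: {comment}"
)

def comment_to_action(comment: str, ask_llm) -> str:
    """Ask the model to rephrase one verbatim comment as an action."""
    prompt = PROMPT_TEMPLATE.format(comment=comment.strip())
    return ask_llm(prompt).strip()

def tag_comments(comments, ask_llm):
    """Map every comment to an action; essentially an efficient tagger."""
    return [comment_to_action(c, ask_llm) for c in comments]

if __name__ == "__main__":
    # Stub model for illustration -- a real run would send `prompt` to a
    # local LLM endpoint instead of returning a canned string.
    def fake_llm(prompt: str) -> str:
        return "Make the customer service desk easier to find"

    actions = tag_comments(
        ["I had a lousy experience and I couldn't find the "
         "customer service desk"],
        fake_llm,
    )
    print(actions[0])
```

Keeping the model call behind a plain function makes it easy to batch 10,000 comments, swap models, or replay the run for checking, without the analysis code knowing anything about how the model is hosted.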
My ethics and opinions are my own and, as this is an emerging area, they are in flux. I think it is right to be skeptical of the companies producing these models, right to question how the training data was gathered and who it belonged to, and right to always prefer human-created content over anything generated by a machine. But I also think that in some circumstances the models can be useful, and we should keep an open mind on that so long as nobody is harmed. You may well disagree, and that's ok.
Always happy to be challenged on this. Contact me if you think I'm wrong.
1. The output of text-based generative AI is no longer obviously in the style of any particular author or institution. It was still inherently wrong that the entire web was scraped without permission (the gall of Mustafa Suleyman saying: 'I think that with respect to content that’s already on the open web, the social contract of that content since the ‘90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” if you like, that’s been the understanding.'). But it has happened now and the models have moved on. Still: the gall.