I was very interested to read Norm Smith's latest blog post on 'Battlefields and Warriors', 'The Impacts of AI', which looks at generative AI within gaming, specifically in terms of scenario generation. To make sense of my post, it will definitely be worth your while to read his first.
The central claim is that generative AI does not really create; it basically just steals stuff and reformats it. This felt testable, so I have had a go. Before I link to the results, here are a few things I held in mind:
If I were asked, even if only by myself, to create a Wars of the Roses scenario, to what degree would I fall back on things which I had read and then slightly re-package them? How would I come up with a creative solution which wasn't also a very unlikely scenario, i.e. the scenario had to be both creatively different and highly plausible? What, in terms of generating non-historical scenarios for historical miniatures games, would I consider to be sufficiently different to be not merely copying? In Norm's test, the differences between his own internet-published games and the AI-generated scenario seem to be minimal (he didn't link to his own scenario directly)... what would happen when I tried it? In short, I would try to apply the same standard of judgement to AI behaviour and performance as I would to human behaviour and performance.
The results are here. Essentially, I asked the LLM (ChatGPT5 - Thinking mode) to create three scenarios. I have included the prompts I used, but to summarize:
Scenario One - a Wars of the Roses scenario for about 5000 combatants per side, each with a roughly even chance of winning.
Scenario Two - a scenario appropriate for Norm's own Piggy Longton campaign.
Scenario Three - a Wars of the Roses scenario, no further guidance.
And then I posted them up as a page on this blog, with no changes and some very light formatting.
I am no expert on the Wars of the Roses. I would say I have more than a passing acquaintance, but no expertise at all, so I don't know if these scenarios are minimally repackaged historical battles, although no battles immediately spring to mind. I also don't know if they are repackaged human-written and published scenarios; again, nothing immediately sprang to mind, but my knowledge is even more limited here. So, if anyone can see through a thinly disguised historical or imaginative scenario, please let me know! The challenge, to be clear, is not to find points of contact; it is to find something which has basically been copied and very minimally changed, i.e. if a human had written it, you would have considered that the human had copied it.
Disclaimer: I don't use AI to generate the scenarios I play personally.
Well, this is interesting. My Battle of Solden Hill scenario and battle reports saw an unusual increase in the number of hits...
That is interesting. I would not have expected it to look up anything and instead just generate from its training data; I wonder if it did, for some reason. None of the scenarios seem that similar to it at first blush, except perhaps in the structure.
I had a read through the scenarios and they look pretty good to me! A lot more detail than any scenarios I have ever come up with I think!
Yes, generative AI has a lot to say for itself, if nothing else...
Hi John, very interesting. Thanks for reference back to my post.
If I may, I will limit my comments to your 2nd Scenario as I am very familiar with the subject - ‘making a scenario compatible with the piggy longton setting’. I should say for those who don’t know, that Piggy Longton is my fictitious Imaginations ‘Parish’ and since the story line is unique to my blog, one would expect that for AI to do the task, it would have to visit my blog. When I asked it for A Piggy Longton scenario, it pretty much just lifted one of my scenarios.
However, the BIG difference here is that the AI is told not to copy and so ... on the face of it, it didn’t copy in any way that I could detect. If I just found the scenario on the internet, I would not readily see it as being Piggy Longton related at all.
If I HAD to look for similarities, I would say there are some elements of my ‘Save the Treasury’ scenario (a raid against the treasury, full of the annual tax collection and the need to take several turns to fill the wagons) and also the Attack on Beacon Farm scenario, which involves raiding to steal winter feed, plus it involves a ‘fire’ rule.
It mentions St. Guthlac Chapel and I have an Osric’s Chapel, but again, that is a bit tenuous - so I would say overall, the AI has worked to the brief of not copying, certainly in any sort of overt way and not really anything I could detect if I came across the scenario in the wild!
Overall, the scenario gave the type of romp in the kind of place that accords with what I do with Piggy Longton. I thought the scenario instructions made sense as you read them, but by the end of it all, putting that on a table would confuse me, certainly without a map for context.
Thanks Norm. I don't imagine the AI would have to visit your blog to generate a Piggy Longton scenario - given the uniqueness of Piggy Longton, it would probably do it from training data. The AI did claim this, although that of course does not make it certain (but I also didn't notice any live searches when it was doing so). I think the "Anglo-Saxon sounding chapel" probably is the biggest tell there, although it wouldn't, to my mind, come close to what is normally meant by copying.
I did note on reading all three sets that some words stuck out, ‘FICKLE’, ‘FRICTION CARDS’, ‘TEMPO’ and the term ‘BLOCKS’ for units. I would imagine that someone might recognise them as being peculiar to a narrow range of rules. I know Tempo is a Baccus thing.
To persistently use the term blocks, when we might expect to see ‘units’, might suggest a narrow source for the scenario creation with regard to that one point, i.e. the A.I. has found blocks in a scenario format and is happy with it.
Perhaps yes, blocks struck me as a little peculiar for the period too. Tempo is a Baccus thing, although there have never been any published Wars of the Roses or medieval rules under the Polemos banner, so that would be strange if it were the inspiration...blocks makes me think of Commands and Colors, but I have never played the Medieval version, or Battlelore, so don't know how similar the concepts are...I guess 'blocks' are sometimes used to describe WotR troops, so not entirely surprising the AI has found that in its weights for WotR...I wonder if it would do the same for a Napoleonic scenario? I may try it.
That is an interesting experiment. I've seen LLMs generate some pretty decent scenarios, and one chap even posted a very convincing text based battle he'd 'fought' against ChatGPT. Tbh I'm generally fairly dismissive of the "intelligence" aspect of Generative AI: there isn't any intelligence as we would understand it in humans. However, like AI encryption, the way they generate output has moved beyond our ability to understand how the algorithms work as they have evolved. They really have learned how to do stuff we can't do, and I think trying to figure out how they work via black box experiments is going to be quite challenging. Frankly, the whole thing is terrifying.
Yes, there are issues of fundamental process unknowability here. I do a bit more on this for w*rk, but in this context, testing expectations versus output over several iterations is about as good as it gets. What has surprised everyone I guess is the degree to which next word prediction effectively mimics the outputs of (lots of) human thought, although TBF it was equally unknowable at a fundamental level how human intelligence was working and it may well have more in common with weighted next word prediction than we realized (or at least, cared to admit). And yes - a bit terrifying!