
TECHNICAL AND TOPICAL

 Can ChatGPT Write An Insurance Product?

I wanted to know whether I, and many others serving the insurance industry, would be out of a job in two years. I teamed up with my internet-generation friend Sam Bysh (currently doing a PhD in the philosophy of physics in Chicago) and we embarked on a mission to find out.

 

I realise others have looked into this already and also written articles. But I wanted to see the results for myself because I had a couple of questions nagging away at me:

 

  • First, could I assume that ChatGPT’s capabilities had been tested to their limits, when the article writer would, understandably, want to reach a conclusion that didn’t threaten their livelihood? Further, the overlap between highly specialised drafting skills and the technical nous needed to test ChatGPT properly is probably quite small; certainly, I can say from my own experience that there is little overlap between the skills of those who write policy wordings and those with AI know-how.

 

  • Second, should anyone really expect ChatGPT to take on tasks that go significantly beyond chatting, given that the clear purpose of ChatGPT (clue in the name) is to correspond in a way that mimics natural human conversation? Chatting in a human manner is an incredibly impressive task for a computer, but whether an intelligence is artificial or natural, the purpose to which that intelligence is turned matters a good deal. Want legal advice? Don’t ask a medic. Want to teach a child? Don’t ask a university professor. Want a succinct and easy-to-understand summary of something? Don’t ask a lawyer (only joking, lawyers). As for human intelligence, so for artificial intelligence.

 

Method

 

Sam and I combined our different expertise and, over the last four months or so, set about testing the capabilities of ChatGPT in the field of drafting insurance policy wordings.

 

We set ourselves a goal: to get ChatGPT to produce a directors’ and officers’ (D&O) policy in which a given set of insuring clauses, extensions, definitions, exclusions, claims conditions and general conditions, generated from our prompts, was at least 60% satisfactory. We chose 60% to give it a decent chance, on the assumption that it would then take relatively little time for a wordings professional to turn it into a “95% product”. Why D&O? Because I know it inside out and back to front, so it would be easy to critique the ChatGPT output. Our efforts and results are outlined below:

 

1. Our first attempt was to provide ChatGPT with example D&O policies. However, ChatGPT was unable to handle documents of this length, even when we used a couple of the ChatGPT extensions designed for this express purpose.

2. Next, using tailored prompts, we asked ChatGPT directly to generate a D&O policy from a specific list of clauses. Initially, ChatGPT responded with something akin to a policy schedule, with none of the text details filled in. After asking for the text to be filled in, ChatGPT responded with something like a summary document. In both instances it also warned us that drafting contracts was a legal matter and needed lawyers’ involvement.

3. We needed to go back to the drawing board. Clearly, asking ChatGPT to do all this in one go was too much. How about changing the prompts and asking ChatGPT to write a D&O policy section by section? Finally, this approach bore some fruit. However, while some clauses were written to something like 60% completion, others were entirely incorrect. Where it did not understand what was asked of it, ChatGPT simply guessed, and the resulting clauses were far wide of the mark. There was still more work to do in finessing the prompts given to ChatGPT.

 

4. After going through the generated clauses line by line and beefing up the initial prompts, we regenerated the policy. This was much more successful. We did, indeed, achieve a D&O policy of around 60% accuracy: something like an abridged D&O policy. And yet the improvements seemed to be achieved largely by ChatGPT repeating, almost verbatim, the more detailed instructions given to it; the nuanced comprehension promised by this large language model was certainly not in evidence.
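The section-by-section approach can be sketched in code. To be clear, this is a hypothetical illustration rather than our actual tooling: the section names, prompt text and the `generate` stub are all assumptions, and in practice `generate` would be a call to the ChatGPT API carrying the detailed, clause-level instructions we ended up writing.

```python
# Hypothetical sketch of section-by-section policy generation: one detailed
# prompt per policy section, with the outputs stitched into a single draft.
# The generate() stub stands in for a real ChatGPT API call.

SECTIONS = [
    "Insuring Clauses",
    "Extensions",
    "Definitions",
    "Exclusions",
    "Claims Conditions",
    "General Conditions",
]

def generate(prompt: str) -> str:
    """Stub for a ChatGPT API call; returns a placeholder clause."""
    return f"[draft text responding to: {prompt}]"

def build_policy(detailed_instructions: dict) -> str:
    """Assemble a draft D&O policy one section at a time."""
    parts = []
    for section in SECTIONS:
        # Each section gets a base request plus any detailed instructions
        # we have written for it (the "beefed-up" prompts of step 4).
        prompt = (
            f"Draft the '{section}' section of a D&O insurance policy. "
            + detailed_instructions.get(section, "")
        )
        parts.append(f"{section}\n{generate(prompt)}")
    return "\n\n".join(parts)

# Example: supply extra detail for just one section (illustrative only).
draft = build_policy(
    {"Exclusions": "Include a conduct exclusion with a carve-back."}
)
print(draft)
```

The point of the structure, as we found, is that the quality of each section tracks the detail of its instructions almost one-for-one; the loop saves stitching effort, not drafting effort.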

 

Observations

 

Let’s take stock. After at least 25 hours of Sam’s time and about 8 of mine, we did end up with a D&O policy that was about 60% complete. On the one hand, any labour-saving technology should be expected to demand an initial outlay of time. Yet, in assessing the usefulness of such a technology, we are surely most interested in two things: the novel outcomes achievable, and the transferability of our results. For if this technology is to add anything useful, it must be able to achieve substantially novel results, and be transferable to other kinds of insurance policy. And, although some successes were achieved, the lack of nuanced “understanding” demonstrated by ChatGPT leads us to conclude that it is fundamentally unable to do what we asked of it: to generate a new insurance policy. Indeed, I can write a new D&O policy from scratch in less time than it took us to feed the information to ChatGPT, and I could re-use that as a template just as readily as the output we achieved.

 

If we were to ask ChatGPT to write a policy for another area (say, a Commercial Crime policy) based on what we have learnt, it is likely that the same (or a greater) outlay of time would be required to engineer the prompts and raise the initial results to the 60% level.


If our aim was to generate something that could serve as a useful example for someone who doesn’t know anything about D&O policies, this would be a good outcome. Or, if our aim was to generate an off-the-shelf SME D&O policy, again, perhaps this would be a useful outcome (although, as mentioned above, I can do that). However, as something that can be incorporated into the workflow of the business of Coverage Matters, that of writing bespoke policy wordings, ChatGPT is clearly not there yet. Again, though, perhaps this should be unsurprising: ChatGPT is not an AI developed to write insurance policies.

 

From playing with ChatGPT it seems clear that it represents a step-change in the ability of computers to act like humans in certain limited situations or contexts. It does not seem that long ago that Clippy, Microsoft’s paperclip assistant, was the cutting edge of android intelligence, and anyone who has ever had to deal with automated phone systems when trying to reach customer services will know how humanlike computers can fail to be. But, just as almost all intelligent human beings would do a less than perfect job of writing an insurance policy, so too ChatGPT.

 

While it represents a great leap forward, ChatGPT isn’t designed to write insurance policies. The same seems true of other software currently on the market, although there are products that do quite an impressive job of reviewing insurance policies. When somebody builds an AI model whose purpose is to write bespoke insurance policies from scratch, that will potentially be a game changer. Notwithstanding the continued need for human oversight of AI-generated material, an AI model trained on the millions of insurance policies available in the market would certainly disrupt the business as it currently exists. The question is when, not if, that day will come.

 

Rest assured I will be monitoring this every step of the way.
