# Introduction

The clmplus package provides practitioners with a fast and user-friendly implementation of the modeling framework derived in our paper: Pittarello G., Hiabu M., and Villegas A., Chain ladder Plus: a versatile approach for claims reserving (pre-print, 2022).

We were able to connect the well-known hazard models developed in life insurance to claims development in non-life run-off triangles. The flexibility of this approach goes beyond the methodological novelty: we hope to provide a user-friendly set of tools built on the point of contact between non-life and life insurance in actuarial science.

This vignette is organized as follows:

• We show the connection between the age-period representation and run-off triangles.

• We replicate the chain-ladder model with an age model. As shown in the paper, the model resulting from the clmplus approach requires fewer parameters than the standard GLM approach.

• We show an example where adding a cohort effect can improve the model fit.

Throughout this tutorial, we work with the AutoBIPaid run-off triangle from the ChainLadder package.

## One discipline, one language

Consider the data set we chose for this tutorial.

The run-off triangle representation is displayed below:

library(ChainLadder)

data("AutoBI")
dataset=AutoBI$AutoBIPaid
dataset
#>      [,1] [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]
#> [1,] 1904 5398  7496  8882  9712 10071 10199 10256
#> [2,] 2235 6261  8691 10443 11346 11754 12031    NA
#> [3,] 2441 7348 10662 12655 13748 14235    NA    NA
#> [4,] 2503 8173 11810 14176 15383    NA    NA    NA
#> [5,] 2838 8712 12728 15278    NA    NA    NA    NA
#> [6,] 2405 7858 11771    NA    NA    NA    NA    NA
#> [7,] 2759 9182    NA    NA    NA    NA    NA    NA
#> [8,] 2801   NA    NA    NA    NA    NA    NA    NA

colnames(dataset)=c(0:(dim(dataset)[1]-1))
rownames(dataset)=c(0:(dim(dataset)[1]-1))

Practitioners in general insurance refer to the x axis of this representation as development years. Similarly, the y axis is called accident years. The third dimension that matters is the diagonals: the calendar years.

There is a one-to-one correspondence between the age-period representation and run-off triangles. In terms of notation, life insurance actuaries use the following terminology:

• ages are development years.

• cohorts are accident years.

• periods are calendar years.

Indeed, the age-period representation of the run-off triangle is the following:

#>      [,1] [,2] [,3] [,4]  [,5]  [,6]  [,7]  [,8]
#> [1,] 1904 2235 2441 2503  2838  2405  2759  2801
#> [2,]   NA 5398 6261 7348  8173  8712  7858  9182
#> [3,]   NA   NA 7496 8691 10662 11810 12728 11771
#> [4,]   NA   NA   NA 8882 10443 12655 14176 15278
#> [5,]   NA   NA   NA   NA  9712 11346 13748 15383
#> [6,]   NA   NA   NA   NA    NA 10071 11754 14235
#> [7,]   NA   NA   NA   NA    NA    NA 10199 12031
#> [8,]   NA   NA   NA   NA    NA    NA    NA 10256

Observe that the y axis is now the development years (or age) component. Calendar years (or periods) are displayed on the x axis. Accident years (or cohorts) are on the diagonals.

# Replicate the chain-ladder with the clmplus package

clmplus is an out-of-the-box set of tools to compute the claims reserve. We now show how to replicate the chain ladder model. Observe that the run-off triangle data structure first needs to be initialized to an RtTriangle object.

rtt <- RtTriangle(dataset)

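As an aside on the age-period representation introduced in the previous section, the mapping between the two layouts can be sketched in a few lines of base R. This is only an illustration under our own naming (`tri`, `age.period` are hypothetical names, and the values are copied from the printed triangle); it is not the routine that clmplus uses internally. The cell for accident year i and development year j is moved to row j and column i + j - 1, so that calendar years run along the x axis.

```r
# Cumulative AutoBIPaid triangle, copied from the printed output
# (rows: accident years, columns: development years).
tri <- matrix(c(
  1904, 5398,  7496,  8882,  9712, 10071, 10199, 10256,
  2235, 6261,  8691, 10443, 11346, 11754, 12031,    NA,
  2441, 7348, 10662, 12655, 13748, 14235,    NA,    NA,
  2503, 8173, 11810, 14176, 15383,    NA,    NA,    NA,
  2838, 8712, 12728, 15278,    NA,    NA,    NA,    NA,
  2405, 7858, 11771,    NA,    NA,    NA,    NA,    NA,
  2759, 9182,    NA,    NA,    NA,    NA,    NA,    NA,
  2801,   NA,    NA,    NA,    NA,    NA,    NA,    NA
), nrow = 8, byrow = TRUE)

n <- nrow(tri)

# Age-period layout: rows are ages (development years),
# columns are periods (calendar years).
age.period <- matrix(NA_real_, nrow = n, ncol = n)
for (i in seq_len(n)) {            # accident year (cohort)
  for (j in seq_len(n - i + 1)) {  # observed development years only
    age.period[j, i + j - 1] <- tri[i, j]
  }
}

age.period
```

In this layout each cohort sits on a diagonal: `diag(age.period)` returns the first accident year, which is the first row of the original triangle.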
Building on the results in the paper, we replicate the chain ladder with an age model. The fit on the RtTriangle object is performed with the clmplus method, specifying a hazard model.

a.model=clmplus(RtTriangle = rtt, hazard.model = "a")
#> StMoMo: The following ages have been zero weighted: 1
#> StMoMo: The following years have been zero weighted: 1
#> StMoMo: The following cohorts have been zero weighted: -7 -6 -5 -4 -3 -2 -1 7
#> StMoMo: Start fitting with gnm
#> StMoMo: Finish fitting with gnm

We show the consistency of our approach by comparing our estimates with those obtained with the Mack chain ladder method as implemented in the ChainLadder package.

mck.chl <- MackChainLadder(dataset)
ultimate.chl=mck.chl$FullTriangle[,dim(mck.chl$FullTriangle)[2]]
diagonal=rev(t2c(mck.chl$FullTriangle)[,dim(mck.chl$FullTriangle)[2]])

Estimates are gathered in a data.frame for ease of comparison.

data.frame(ultimate.cost.mack=ultimate.chl,
ultimate.cost.clmplus=a.model$ultimate.cost,
reserve.mack=ultimate.chl-diagonal,
reserve.clmplus=a.model$reserve
)
#>   ultimate.cost.mack ultimate.cost.clmplus reserve.mack reserve.clmplus
#> 0           10256.00              10256.00      0.00000         0.00000
#> 1           12098.24              12098.24     67.23865        67.23865
#> 2           14580.19              14580.19    345.18727       345.18727
#> 3           16323.69              16323.69    940.68770       940.68770
#> 4           17628.86              17628.86   2350.85562      2350.85562
#> 5           16237.77              16237.77   4466.77443      4466.77443
#> 6           18285.24              18285.24   9103.24335      9103.24335
#> 7           17281.44              17281.44  14480.43832     14480.43832

cat('\n Total reserve:', sum(a.model$reserve))
#>
#>  Total reserve: 31754.43
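For readers who want to see where these figures come from, the classical chain-ladder point estimates can also be reproduced by hand. The following base R sketch uses volume-weighted development factors on the triangle values printed earlier; the names `tri`, `f`, `full`, `latest` and `reserve` are ours and purely illustrative, and the code involves neither clmplus nor ChainLadder.

```r
# Cumulative AutoBIPaid triangle, copied from the printed output.
tri <- matrix(c(
  1904, 5398,  7496,  8882,  9712, 10071, 10199, 10256,
  2235, 6261,  8691, 10443, 11346, 11754, 12031,    NA,
  2441, 7348, 10662, 12655, 13748, 14235,    NA,    NA,
  2503, 8173, 11810, 14176, 15383,    NA,    NA,    NA,
  2838, 8712, 12728, 15278,    NA,    NA,    NA,    NA,
  2405, 7858, 11771,    NA,    NA,    NA,    NA,    NA,
  2759, 9182,    NA,    NA,    NA,    NA,    NA,    NA,
  2801,   NA,    NA,    NA,    NA,    NA,    NA,    NA
), nrow = 8, byrow = TRUE)

n <- ncol(tri)

# Volume-weighted development factors: for each development year j,
# use only the accident years observed in both columns j and j + 1.
f <- sapply(seq_len(n - 1), function(j) {
  rows <- seq_len(n - j)
  sum(tri[rows, j + 1]) / sum(tri[rows, j])
})

# Fill the lower triangle column by column.
full <- tri
for (j in seq_len(n - 1)) {
  rows <- (n - j + 1):n
  full[rows, j + 1] <- full[rows, j] * f[j]
}

# Reserve = projected ultimate minus latest observed cumulative payment.
latest  <- sapply(seq_len(n), function(i) tri[i, n - i + 1])
reserve <- full[, n] - latest

round(reserve, 5)
sum(reserve)
```

The per-accident-year reserves and their total should agree with the table above up to rounding.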

## Claims reserving with GLMs compared to hazard models

We fit the standard GLM model with the apc package. As shown in the paper the chain-ladder model can be replicated by fitting an age-cohort model.

library(apc)

ds.apc = apc.data.list(cum2incr(dataset),
data.format = "CL")

ac.model.apc = apc.fit.model(ds.apc,
model.family = "od.poisson.response",
model.design = "AC")
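As a cross-check that does not rely on the apc package, the same accident-year plus development-year structure can be sketched with base R's `glm` and a quasi-Poisson family fitted to the incremental payments. The triangle values are copied from the output printed earlier; the names `tri`, `inc`, `long`, `obs`, `fut` and `fit.glm` are ours, and the dummy-variable parameterisation below is not the canonical parameterisation that apc reports.

```r
# Cumulative AutoBIPaid triangle, copied from the printed output.
tri <- matrix(c(
  1904, 5398,  7496,  8882,  9712, 10071, 10199, 10256,
  2235, 6261,  8691, 10443, 11346, 11754, 12031,    NA,
  2441, 7348, 10662, 12655, 13748, 14235,    NA,    NA,
  2503, 8173, 11810, 14176, 15383,    NA,    NA,    NA,
  2838, 8712, 12728, 15278,    NA,    NA,    NA,    NA,
  2405, 7858, 11771,    NA,    NA,    NA,    NA,    NA,
  2759, 9182,    NA,    NA,    NA,    NA,    NA,    NA,
  2801,   NA,    NA,    NA,    NA,    NA,    NA,    NA
), nrow = 8, byrow = TRUE)
n <- nrow(tri)

# Incremental payments (first column unchanged, then first differences).
inc <- tri
inc[, 2:n] <- tri[, 2:n] - tri[, 1:(n - 1)]

# Long format with accident year (cohort) and development year (age) factors.
long <- data.frame(
  y  = as.vector(inc),
  ay = factor(as.vector(row(inc))),
  dy = factor(as.vector(col(inc)))
)
obs <- long[!is.na(long$y), ]  # upper triangle: used for fitting
fut <- long[is.na(long$y), ]   # lower triangle: to be predicted

# Over-dispersed Poisson GLM with log link and two factors.
fit.glm <- glm(y ~ ay + dy, family = quasipoisson(link = "log"), data = obs)

# Predicted future incremental payments, summed into a total reserve.
fut$pred <- predict(fit.glm, newdata = fut, type = "response")
total.reserve.glm <- sum(fut$pred)

length(coef(fit.glm))  # number of fitted parameters
total.reserve.glm
```

The coefficient count makes the parameter comparison concrete: this GLM estimates one intercept plus seven accident-year and seven development-year effects.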

Inspect the model coefficients derived from the output:


ac.model.apc$coefficients.canonical[,'Estimate']
#>        level    age slope cohort slope     DD_age_3     DD_age_4     DD_age_5
#>   7.41596168   0.74105900   0.16519698  -1.16411707  -0.02909890  -0.17467013
#>     DD_age_6     DD_age_7     DD_age_8  DD_cohort_3  DD_cohort_4  DD_cohort_5
#>  -0.17533888   0.17408744  -0.55360421   0.02140672  -0.07364998  -0.03603392
#>  DD_cohort_6  DD_cohort_7  DD_cohort_8
#>  -0.15911660   0.20095088  -0.17521544

ac.fcst.apc = apc.forecast.ac(ac.model.apc)

data.frame(reserve.mack=ultimate.chl-diagonal,
reserve.apc=c(0,ac.fcst.apc$response.forecast.coh[,'forecast']),
reserve.clmplus=a.model$reserve
)
#>   reserve.mack reserve.apc reserve.clmplus
#> 0      0.00000     0.00000         0.00000
#> 1     67.23865    67.23865        67.23865
#> 2    345.18727   345.18727       345.18727
#> 3    940.68770   940.68770       940.68770
#> 4   2350.85562  2350.85562      2350.85562
#> 5   4466.77443  4466.77443      4466.77443
#> 6   9103.24335  9103.24335      9103.24335
#> 7  14480.43832 14480.43832     14480.43832

Our method is able to replicate the chain-ladder results with no need to add the cohort component.

a.model$model.fit$ax
#>           1           2           3           4           5           6
#>          NA  0.02366899 -1.01313612 -1.72538123 -2.48027780 -3.34130515
#>           7           8
#> -3.99615988 -5.18978418
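To see how these seven age parameters relate to the familiar development factors, the sketch below converts them by hand. It rests on two assumptions of ours: that the printed `ax` values are log hazard rates, and that a hazard rate μ maps to a development factor through f = (2 + μ) / (2 - μ), which corresponds to approximating the exposure by the average of two consecutive cumulative payments. The values are copied from the outputs printed in this vignette, and the names `ax`, `mu`, `f.hazard` and `f.cl` are illustrative only.

```r
# Fitted age effects copied from the printed output (the NA for age 1 is dropped).
ax <- c(0.02366899, -1.01313612, -1.72538123, -2.48027780,
        -3.34130515, -3.99615988, -5.18978418)

# Assumption: ax are log hazard rates, and f = (2 + mu) / (2 - mu).
mu <- exp(ax)
f.hazard <- (2 + mu) / (2 - mu)

# Volume-weighted chain-ladder factors computed directly from the triangle.
tri <- matrix(c(
  1904, 5398,  7496,  8882,  9712, 10071, 10199, 10256,
  2235, 6261,  8691, 10443, 11346, 11754, 12031,    NA,
  2441, 7348, 10662, 12655, 13748, 14235,    NA,    NA,
  2503, 8173, 11810, 14176, 15383,    NA,    NA,    NA,
  2838, 8712, 12728, 15278,    NA,    NA,    NA,    NA,
  2405, 7858, 11771,    NA,    NA,    NA,    NA,    NA,
  2759, 9182,    NA,    NA,    NA,    NA,    NA,    NA,
  2801,   NA,    NA,    NA,    NA,    NA,    NA,    NA
), nrow = 8, byrow = TRUE)
n <- ncol(tri)
f.cl <- sapply(seq_len(n - 1), function(j) {
  rows <- seq_len(n - j)
  sum(tri[rows, j + 1]) / sum(tri[rows, j])
})

# Side-by-side comparison of the two sets of factors.
data.frame(f.hazard = f.hazard, f.chain.ladder = f.cl)
```

Under these assumptions the two columns coincide up to the rounding of the printed `ax` values, which is one way to read the statement that seven age parameters are enough to replicate the chain ladder.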

Further inspection can be performed with the clmplus package, which provides graphical tools to inspect the fitted effects. Since we model the rate in continuous time, a line plot is the consistent choice.

plot(a.model)

# The beneficial effect of adding the cohort component

From a statistical perspective, a model with fewer parameters is clearly desirable. Nevertheless, our approach goes far beyond that.

By adding the cohort effect we are able to improve our modeling.

We show these results by inspecting the residuals plots.

#make it triangular
plotresiduals(a.model)

Clearly, the red and blue areas suggest some trends that the model wasn’t able to catch. Consider now the age-cohort model and its residuals plot.

ac.model <- clmplus(rtt,
hazard.model="ac",
gk.fc.model='a')
#> StMoMo: The following ages have been zero weighted: 1
#> StMoMo: The following years have been zero weighted: 1
#> StMoMo: The following cohorts have been zero weighted: -7 -6 -5 -4 -3 -2 -1 7
#> StMoMo: Start fitting with gnm
#> StMoMo: Finish fitting with gnm
plotresiduals(ac.model)

Without extrapolating a period component, we are already able to improve the fit. Note, however, that the cohort component for cohort m is extrapolated.


plot(ac.model)

## Extrapolation of a period component

Similarly, it is possible to add a period component and choose an age-period model or an age-period-cohort model.

ap.model = clmplus(rtt,
hazard.model = "ap",
ckj.fc.model='a',
ckj.order = c(0,1,0))
#> StMoMo: The following ages have been zero weighted: 1
#> StMoMo: The following years have been zero weighted: 1
#> StMoMo: The following cohorts have been zero weighted: -7 -6 -5 -4 -3 -2 -1 7
#> StMoMo: Start fitting with gnm
#> StMoMo: Finish fitting with gnm
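The `ckj.order` argument above reads like the usual ARIMA (p, d, q) order, in which case c(0,1,0) would denote a random walk for the period effect. We have not verified here how clmplus treats the constant (drift) term, so the base R sketch below shows both variants. The `kappa` series is made up for illustration and is not output of clmplus; all names are hypothetical.

```r
# Hypothetical fitted period effects (NOT clmplus output).
kappa <- c(0.10, 0.04, -0.02, -0.05, -0.09)
h <- 1:3  # number of future calendar periods to extrapolate

# ARIMA(0,1,0) without drift: every forecast equals the last observed value.
fc.no.drift <- rep(tail(kappa, 1), length(h))

# ARIMA(0,1,0) with drift: the last value plus h times the average increment.
drift <- mean(diff(kappa))
fc.drift <- tail(kappa, 1) + drift * h

data.frame(horizon = h, no.drift = fc.no.drift, with.drift = fc.drift)
```

Either way, the extrapolated period effects are driven entirely by the last observed diagonals, which is why models with a period component need more care on a small triangle.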

apc.model = clmplus(rtt,hazard.model = "apc",
gk.fc.model='a',
ckj.fc.model='a',
gk.order = c(1,1,0),
ckj.order = c(0,1,0))
#> StMoMo: The following ages have been zero weighted: 1
#> StMoMo: The following years have been zero weighted: 1
#> StMoMo: The following cohorts have been zero weighted: -7 -6 -5 -4 -3 -2 -1 7
#> StMoMo: Start fitting with gnm
#> StMoMo: Finish fitting with gnm
plotresiduals(ap.model)

The age-period model does not show any serious improvement over the age-cohort model. It is worth noting once more that the age-cohort model does not require extrapolating a period component. In a similar fashion, we plot the residuals of the age-period-cohort model below, which seems to lead to a small improvement.

plotresiduals(apc.model)

The effects of the age-period-cohort model are plotted below.

plot(apc.model)

# Conclusions

In this vignette we wanted to show the flexibility of our modeling approach with respect to the well-known chain-ladder model.

• By modeling the hazard rate we are able to replicate the chain-ladder results with fewer parameters. Indeed, we model the age component directly and add a cohort effect if needed.

• Going from an age model to an age-cohort model may lead to a serious improvement in the model results.