### Data setting

We use data from 271 regions of the 27 EU member countries (all members except Croatia) from 1995 to 2011. The countries included in the study are: Austria, Belgium, Bulgaria, Republic of Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden and the UK. The years analysed are constrained by data availability. Data are obtained from EUROSTAT [25].

### Econometric model

Our models build on the well-known β-convergence hypothesis [26–29], in its conditional form. In contrast to more standard studies, however, we do not specify cross-sectional models but spatio-temporal ones, i.e. dynamic panel models. Furthermore, we are interested not only in (conditional) β-convergence but also in σ-convergence.

In particular, we have specified the following model:

$$ \begin{array}{l} \log \left({y}_{ijt}\right)={\alpha}_i+{\beta}_{jt} \log \left({y}_{ijt-1}\right)+{\gamma}_{1jt} \log \left( gdpp{c}_{jt}\right)+{\gamma}_2 \log \left( gdpp{c}_{jt-1}\right)+\\ {}{\gamma}_3 \log \left( gdpp{c}_{jt-2}\right)+{\gamma}_{4jt} \log \left( Gin{i}_{jt}\right)+{\gamma}_{5jt} \log \left( Gin{i}_{jt-1}\right)+{\gamma}_6 \log \left( emph{t}_{ijt}\right)+\\ {}{\gamma}_7 \log \left(uni{v}_{ijt}\right)+{\gamma}_8 \log \left(um{y}_{ijt}\right)+{\gamma}_9 \log \left(uf{y}_{ijt}\right)+{\gamma}_{10} \log \left( rand{d}_{jt}\right)+{\gamma}_{11} \log \left( bp{g}_{jt}\right)+\\ {}{\gamma}_{12} \log \left( pubex{p}_{jt}\right)+{\gamma}_{13}\left(I>2003\right)+{\gamma}_{14}\left(I>2006\right)+{\gamma}_{15}\left(I>2007\right)+{S}_i+{\tau}_t+{u}_{ijt}\end{array} $$

(1)

$$ \begin{array}{l} \log \left( Gin{i}_{jt}\right)={\delta}_{0j}+{\delta}_1 \log \left({y}_{jt-1}\right)+{\delta}_2 \log \left({y}_{jt-2}\right)+{\delta}_3 \log \left({y}_{jt-3}\right)+{\delta}_4 \log \left( gdpp{c}_{jt-1}\right)+{\delta}_5 \log \left( gdpp{c}_{jt-2}\right)+\\ {}{\delta}_6 \log \left( gdpp{c}_{jt-3}\right)+{\delta}_7 rat{e}_{jt-1}+{\delta}_8 \log \left( Gin{i}_{jt-1}\right)+{\delta}_9\left(I>2003\right)+{\delta}_{10}\left(I>2006\right)+{\delta}_{11}\left(I>2007\right)+\\ {}{S}_j^{\prime }+{\tau}_j^{\prime }+{v}_{jt}\end{array} $$

(2)

where *y* denotes one of the five dependent variables we chose: life expectancy at birth (in years); all-cause mortality; and three cause-specific mortalities: ischaemic heart disease mortality; cancer mortality; and larynx, trachea, bronchus and lung cancer mortality (cause-specific mortality was standardised as a death rate per 100,000 inhabitants, 3-year average). The rationale for these variables is as follows. First, as in most previous studies on health (in line with the seminal article of Sen et al. [30]), we use life expectancy at birth (in years). Second, rather than relying on total mortality alone, we prefer to use several cause-specific mortalities as well, because total mortality combines many phenomena, which could undermine it as an indicator of social ill-being [31]. In particular, we chose the causes of mortality most associated with socioeconomic deprivation in the literature [22–24]: ischaemic heart disease mortality; cancer mortality; and larynx, trachea, bronchus and lung cancer mortality.

The Gini index is one of the main explanatory variables of this model. According to Eurostat [25], it is defined as the relationship of cumulative shares of the population, arranged according to the level of equivalised disposable income, to the cumulative share of the equivalised total disposable income received by them. More conveniently, it can be expressed as twice the covariance between income and its (fractional) rank, divided by mean income. Note that, because there could be bidirectional causation between the health variables (i.e. the dependent variables) and income inequality, the Gini index (the main explanatory variable in Eq. (1)) could be an endogenous variable. Although this bidirectional causation remains controversial among authors, the evidence (from a few papers) shows that unhealthy societies can have an important effect on persistently low economic growth and, perhaps, on inequality [32, 33]. Moreover, macroeconomic theory holds that countries with poorer health conditions find it harder to achieve sustained economic growth than countries with better health [34]. For this reason, we specify a model of simultaneous equations.
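The covariance characterisation of the Gini index mentioned above can be checked numerically. The following is a minimal sketch, not the procedure used by Eurostat or in this study: it assumes unweighted individual incomes, normalises the covariance by mean income (the usual convention), and uses hypothetical function names and toy figures.

```python
from statistics import mean


def gini_mean_difference(incomes):
    """Gini index as the mean absolute difference over twice the mean."""
    n = len(incomes)
    mu = mean(incomes)
    total = sum(abs(a - b) for a in incomes for b in incomes)
    return total / (2 * n * n * mu)


def gini_covariance(incomes):
    """Gini index as 2 * cov(income, fractional rank) / mean income."""
    n = len(incomes)
    ordered = sorted(incomes)
    mu = mean(ordered)
    # Fractional rank k/n of the k-th smallest income (assumption: no weights).
    ranks = [k / n for k in range(1, n + 1)]
    rank_mean = mean(ranks)
    # Population covariance (divides by n, not n - 1).
    cov = sum((y - mu) * (r - rank_mean) for y, r in zip(ordered, ranks)) / n
    return 2 * cov / mu


# Toy incomes, purely illustrative.
incomes = [1, 2, 3, 4]
print(gini_mean_difference(incomes), gini_covariance(incomes))
```

With these conventions the two functions agree, which illustrates why the covariance form is a convenient shortcut.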

The subscript *i* denotes region (*i = 1,…,271*); *j*, country (*j = 1,…,27*); and *t*, year (*t = 1995, 1996,…, 2011*); α, β and γ denote unknown parameters; *S* denotes spatial random-effects (see below); and *u* is a normally distributed disturbance term. Some data are missing for the five dependent variables, mainly at the beginning of the period and specifically for some regions in Belgium, Denmark, Italy, Poland, Romania and Slovenia.

Socioeconomic inequalities in health are proxied by the Gini index (Gini) (data available only at the country level) and the Gross Domestic Product per capita (GDP per capita, gdppc) (data available regionally). Note that we assume that the effects, if any, of GDP per capita on socioeconomic inequalities in health are distributed over time. Hence, we include the current level (*t*) and two lags (*t-1* and *t-2*) of GDP per capita (gdppc_{jt-1} and gdppc_{jt-2}). In the equation corresponding to the Gini index (Eq. (2)) we additionally include the lag of the growth rate of GDP (rate).
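The distributed-lag terms can be assembled mechanically from a yearly series. The sketch below is only illustrative: it assumes each unit's series is stored as a year-to-value dictionary (a hypothetical layout, not the authors' data format), and the function name and figures are made up.

```python
import math


def lagged_log_terms(series, year, n_lags=2):
    """Return [log x_t, log x_{t-1}, ..., log x_{t-n_lags}] for one unit.

    `series` maps year -> value. A year with no observation yields None,
    which is how gaps in an unbalanced panel carry over to lagged terms.
    """
    terms = []
    for lag in range(n_lags + 1):
        value = series.get(year - lag)
        terms.append(math.log(value) if value is not None else None)
    return terms


# Hypothetical GDP per capita figures for a single unit (illustrative only).
gdppc = {2000: 100.0, 2001: 110.0, 2002: 121.0}
print(lagged_log_terms(gdppc, 2002))  # current level and both lags available
print(lagged_log_terms(gdppc, 2001))  # second lag falls before the series starts
```

Rows whose lagged terms are unavailable would then have to be dropped or treated as missing by the estimation routine.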

Moreover, we consider additional variables that may secondarily contribute to socioeconomic inequalities in health. Some of these variables are available at the regional level and others at the country level. The panel that we create with these data is unbalanced, because data were not available for the entire period or for all regions. Further details on the dataset can be found in Maynou et al. [21].

*Regional level:*

| Variable | Description and data availability |
| --- | --- |
| *Empht*: high-tech employment | Employment in technology and knowledge-intensive sectors (thousands of employees), 1999–2011. |
| *Univ*: percentage of university students | Ratio of the sum of level 5 and 6 students (tertiary education) over total population from 1999 to 2011. Data are missing for Germany, Greece, Spain and the United Kingdom. These countries do not report all data on education to EUROSTAT. |
| *Umy*: youth male unemployment rate | Unemployment rate for young males (15–24 years old) from 1999 to 2011 on average for the regions of the EU. For some regions, some data are missing for some years, mainly for the latter period. |
| *Ufy*: youth female unemployment rate | Unemployment rate for young females (15–24 years old) from 1999 to 2011. |

*Country level:*

| Variable | Description and data availability |
| --- | --- |
| *RandD*: R&D | Ratio of R&D over the country’s GDP. For some regions, some data are missing for some years, mainly for the first period. Data available from 1995 to 2011. |
| *Bpg*: external balance | The ratio of exported goods minus imported goods over the country’s GDP. All data available from 1995 to 2011, except for the first years of the period in Greece. |
| *Pubexp*: public expenditure rate | Ratio of goods and services bought by the State over the country’s GDP. All data available from 1995 to 2011. |

Finally, we included three dummy variables, taking the value 1 for 2004 onwards (corresponding to the first expansion of the EU within the study period, in 2004), for 2007 onwards (corresponding to the second expansion, in 2007), and for 2008 onwards (corresponding to the first year of the financial crisis, in 2008). These are the indicators (I>2003), (I>2006) and (I>2007) in Eqs. (1) and (2).
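To make the coding of these step dummies concrete, here is a minimal sketch. The function name is hypothetical; the thresholds are the ones appearing in the indicator terms (I>2003), (I>2006) and (I>2007) of Eqs. (1) and (2).

```python
def step_dummies(year, thresholds=(2003, 2006, 2007)):
    """Return one 0/1 indicator per threshold, equal to 1 when year > threshold."""
    return tuple(int(year > threshold) for threshold in thresholds)


# Each dummy switches on in the year after its threshold and stays on.
for year in (2003, 2004, 2007, 2008):
    print(year, step_dummies(year))
```

Because the indicators never switch off, each coefficient measures a permanent shift in level from that year onwards, on top of the earlier shifts.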

In order to analyse σ-convergence, we used the coefficient of variation of each health variable. It is important to note, however, that instead of calculating the coefficient of variation on the original variables, we calculated it on the fitted values from the model (Eqs. (1) and (2)).^{Footnote 1}
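As an illustration of how σ-convergence is read from the coefficient of variation, the sketch below computes it year by year. It is a simplified stand-in rather than the authors' code: the fitted values are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population version, an assumption)."""
    return pstdev(values) / mean(values)


def sigma_convergence_path(fitted_by_year):
    """Map each year to the cross-regional coefficient of variation."""
    return {
        year: coefficient_of_variation(values)
        for year, values in sorted(fitted_by_year.items())
    }


# Hypothetical fitted life expectancies for four regions (illustrative only).
fitted = {
    1995: [72.0, 75.0, 78.0, 81.0],
    2003: [74.0, 76.5, 79.0, 81.5],
    2011: [77.0, 78.5, 80.0, 81.5],
}
for year, cv in sigma_convergence_path(fitted).items():
    print(year, round(cv, 4))
```

A coefficient of variation that falls over time, as in this toy example, is what would be interpreted as σ-convergence; a rising one would indicate divergence.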

Some of the coefficients have subscripts. This is because we specify (dynamic) random coefficient panel data models [35] or, in mixed models terminology, we allow some of the coefficients to be random-effects [36]. In other words, we allow them to differ across the various levels we have considered. Thus, for example, *β* varies per year,

$$ {\beta}_t=\beta +{\nu}_t $$

and also per country,

$$ {\beta}_{jt}=\beta +{\upsilon}_{jt} $$

With respect to the other explanatory variables, the random-effects are associated with different levels depending on the final model.^{Footnote 2}

When the random-effects vary by country, we assume they are identical and independent Gaussian random variables with constant variance, i.e. *υ*_{jt} ~ *N*(0, *σ*^{2}_{υ}). When the random-effects vary by year, we assume a random walk of order 1 (i.e. independent increments) for the Gaussian random-effects vector [37].

$$ \varDelta {\upsilon}_{jt}={\upsilon}_{jt}-{\upsilon}_{jt+1}\kern2em \varDelta {\upsilon}_{jt}\sim N\left(0,{\sigma}_{\upsilon}^2\right) $$
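To give a feel for what the random walk of order 1 implies for a year-varying effect, the sketch below simulates one path with independent Gaussian increments. It is illustrative only: the function name, the starting value of zero and the seed are assumptions, and it does not reproduce how INLA handles this prior internally.

```python
import random


def simulate_rw1(n_years, sigma, seed=0):
    """Simulate a first-order random walk with N(0, sigma^2) increments.

    The sign convention of the increment (forward or backward difference)
    is immaterial here because the increments are symmetric around zero.
    The path starts at 0 by assumption; the prior leaves the level free.
    """
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n_years - 1):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


# One effect per year for a 17-year panel (1995 to 2011); sigma is illustrative.
effects = simulate_rw1(n_years=17, sigma=0.05)
print([round(value, 3) for value in effects])
```

Neighbouring years receive similar values, so the effect evolves smoothly over time instead of jumping freely from one year to the next.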

### Spatio-temporal adjustment

We took into account the spatio-temporal extra-variability present in our model (i.e. spatial heterogeneity and spatial and temporal dependence) by introducing some structure into the model. Heterogeneity was captured by the random-effect associated with the intercept (*α*), which varies at the region level, i, in the response-variable equation and at the country level, j, in the Gini equation. Temporal dependency is approximated through the random walk of order 1, which is linked to the random-effects associated with the temporal trend (τ in Eqs. (1) and (2)) and also to those parameters varying at the year level, t. Note also that we allow this temporal trend to vary per country.

For spatial dependency, we follow the recent work of Lindgren et al. [38] and specify a Matérn structure [39] for the corresponding random-effect (S_{i} or S_{j}, in the response-variable equations and in the Gini equation, respectively). In short, we use a Gaussian Markov Random Field (GMRF) representation, constructed explicitly through a stochastic partial differential equation (SPDE) whose solution is a Gaussian Field (GF) with a Matérn covariance function [39].
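For reference, the Matérn covariance and the associated SPDE are usually written as follows. This is the standard form from the SPDE literature, not a specification taken from this study: the symbols (distance *h*, smoothness ν, scale κ, marginal variance σ², spatial dimension *d*, Gaussian white noise 𝒲) are not defined in the original text, and Δ here denotes the Laplacian, not the difference operator of the random walk.

$$ C(h)=\frac{\sigma^2}{2^{\nu -1}\varGamma \left(\nu \right)}{\left(\kappa h\right)}^{\nu }{K}_{\nu}\left(\kappa h\right) $$

$$ {\left({\kappa}^2-\varDelta \right)}^{\alpha /2}x\left(\mathbf{s}\right)=\mathcal{W}\left(\mathbf{s}\right),\kern2em \alpha =\nu +d/2 $$

Here *K*_{ν} is the modified Bessel function of the second kind. Stationary solutions *x*(**s**) of the second equation have the covariance given in the first, which is what allows the continuous field to be approximated by a sparse GMRF.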

### Inference

We preferred to relax the assumption of strict exogeneity, allowing weak exogeneity of the lagged dependent variable, that is to say, that current shocks only affect future values of the dependent variable [40]. By doing this, we are able to obtain consistent estimates of the parameters of interest (even with fixed T). It is important to point out that this relaxation involves two requirements: first, a large N, obtained in our case by using regional data; and second, independent and identically distributed error terms. The latter can only be achieved through the space-time adjustment explained above, which imposes a certain structure on the original disturbance term.

Inferences were performed within a Bayesian framework, following the Integrated Nested Laplace Approximation (INLA) approach [41, 42]. It is important to point out that both equations were estimated simultaneously, in order to address the potential endogeneity of the Gini index.

All analyses were carried out with the free software R (version 2.15.3) [43], using the INLA library [37, 42].