The wider determinants of inequalities in health: a decomposition analysis

Background: The common starting point of many studies scrutinizing the factors underlying health inequalities is that material, cultural-behavioural, and psycho-social factors affect the distribution of health systematically through income, education, occupation, wealth or similar indicators of socioeconomic structure. However, little is known regarding if and to what extent these factors can assert systematic influence on the distribution of health of a population independent of the effects channelled through income, education, or wealth.

Methods: Using representative data from the German Socioeconomic Panel, we apply Fields' regression-based decomposition techniques to decompose variations in health into its sources. Controlling for income, education, occupation, and wealth, we assess the relative importance of the explanatory factors over and above their effect on the variation in health channelled through the commonly applied measures of socioeconomic status.

Results: The analysis suggests that three main factors persistently contribute to variance in health: the capability score, cultural-behavioural variables and, to a lesser extent, the materialist approach. Of the three, the capability score illustrates the explanatory power of interaction and compound effects as it captures the individual's socioeconomic, social, and psychological resources in relation to his/her exposure to life challenges.

Conclusion: Models that take a reductionist perspective and do not allow for the possibility that health inequalities are generated by factors over and above their effect on the variation in health channelled through one of the socioeconomic measures are underspecified and may fail to capture the determinants of health inequalities.


Introduction
There is no shortage of empirical evidence illustrating the existence of health inequalities, and the association between socio-economic position and health inequalities is well established [1][2][3]. Reducing health inequalities, especially socioeconomic health inequalities, has therefore been on the agenda of policy-makers in a number of countries [4][5][6] and international organisations [7,8].
Nevertheless, the underlying mechanisms that determine health inequalities are not fully understood [9,10], which makes it hard for policy-makers to create well-targeted public policy strategies. On the conceptual level, various factors have been proposed to generate socioeconomic health inequalities including material factors, cultural-behavioural factors, and psycho-social factors [11,12]. Other important factors are ethnic- [13,14] and gender-based differences [15]. In health economics, the relative importance of these factors is commonly assessed by decomposition methods based on the concentration index [16]. This process separates the contributions of individual factors to income-related health inequality, in which each contribution is the product of the sensitivity of health with respect to that factor and the degree of income-related inequality associated with that particular factor [16]. Various authors have contributed to this literature, refining decomposition methods and their interpretation [17,18]. As an alternative to income-related health inequalities, education- [19], occupation- [1] and wealth-related health inequalities [20] have been assessed. Studies emerging from public health and epidemiology have used multiple regression analyses differentiated by education level or occupation [21][22][23] to assess the importance of different sets of factors [9]. A common starting point of both strands in the literature is that factors affecting health are rooted in or at least channelled through income, education, occupation, wealth or a similar indicator of socioeconomic structure. In the deepest sense, these approaches build on two schools of thought: the Marxian and the Weberian. In a very simplified framework, the Marxian approach concentrates on the antagonistic class relationship based on the distribution of means of production.
In this framework, socioeconomic health inequalities emerge because of the different exposures to material factors, the most important being differences in work-related strains between the bourgeoisie and the working class. In the Weberian tradition, classes are numerous and reflect the hierarchical structure in a number of dimensions such as prestige or status related to occupation, education, income, and other sources of power [24].
We use Fields' decomposition techniques [25] to decompose health inequalities, rather than socioeconomic inequalities in health, into their sources. This allows us to assess the relative importance of different sets of factors over and above their effect on the variation in health channelled through one of the measures of socioeconomic structure previously outlined. Our approach does not deny the especially worrisome nature of socioeconomic health inequalities, nor do we suggest that income or other measures of socioeconomic structure are unimportant. Rather, we investigate if and to what extent different sets of factors can assert systematic influence on the distribution of health of a population independent of the effects channelled through income, education, or wealth. To our knowledge, no decomposition study has so far attempted to scrutinise the effect of different sets of explanatory factors over and above the influence channelled through income, education, and wealth.
Previous work on health inequalities in Germany was conducted from an epidemiological perspective within the framework of large, comparative European research projects [26][27][28][29]. In addition, income-related health inequalities were analysed in a comparative manner across countries applying approaches derived from health economics [30,31]. The particularities of health inequalities within Germany were discussed in various studies [32,33]. Overall, these studies suggest a moderate socio-economic gradient in health inequalities [34].

Factors and Approaches
As outlined, different sets of explanatory factors have been proposed to account for health inequalities. We concentrate on three widely discussed sets of factors, namely, material, cultural-behavioural, and psycho-social factors, while also taking into account the life-course perspective. We also assess the capability approach as formulated by Hall and Taylor [35]. Due to limited space, we cannot fully develop the underlying theories and models, but various excellent overviews of the models and theories and their respective strengths and weaknesses exist and can be consulted [11,35].
The materialist approach explains health inequalities through differences in an individual's socio-economic position. The basic idea is that different social hierarchical positions in socio-economic stratification are linked to differential exposures to the material world, which can be either conducive or unconducive to health (e.g., noise, pollution, material working conditions). Various authors stressed that factors referring to the public infrastructure may determine the private resources available for health production and should also be considered as (neo-) material factors [36,37].
The psycho-social approach argues that individuals from lower socio-economic backgrounds experience more negative life events [38], less social support [39], less autonomy at work [40,41], less job security and therefore suffer from poorer health [42]. Various underlying mechanisms are assumed, but the core argument is that stress negatively affects health by reducing resilience and increasing vulnerability to illness [43]. Siegrist [44][45][46] puts forward a different variant of the psychosocial explanation, arguing that harmful stress is triggered by a perceived lack of reciprocity in the workplace, i.e., when rewards from employment or other central social roles are threatened or lost, persons become more vulnerable to addiction and other types of high risk behaviour due to biological processes in the brain [11].
The cultural-behavioural approach stresses that culture determines or frames behavioural choices, including decisions affecting health, e.g., engaging in higher risk lifestyles that may include drinking, smoking, or an unhealthy diet. Cultural-behavioural factors are often implicitly motivated by Bourdieu's concept of habitus [47]. Habitus is expressed in daily lifestyle decisions, partialities, body awareness, and consumption patterns. Differences in access to cultural, economic, and social capital are central to the class-specific development of habitus patterns. In line with Bourdieu's notion of habitus is the well-documented relationship between high educational attainment and health-promoting behaviours.
The life-course perspective adds a temporal dimension and explains health inequalities as the result of differences in increasing and decreasing bundles of factors, which influence health at different times in an individual's life [48]. Thus, health is no longer solely the result of current conditions and individual lifestyle choices but is also determined by past living conditions and events [49].
Recently, Hall and Taylor [35] put forward the capability approach. Hall and colleagues follow the general analytical foundations formulated by Amartya Sen and others, arguing that the study of wellbeing, i.e., in our case 'health', should consider factors beyond material or narrow socioeconomic factors by focusing more broadly on the capabilities of people to realize functions (such as 'good health') they value [50,51]. This approach broadens the perspective on social relations as these can affect health independent of their relationship with people's income, employment, or wealth [35]. Hall et al. define an explicit micro-level explanatory mechanism and argue that an individual's health status is a function of individual capabilities and life challenges over time. Capabilities and challenges are, in turn, determined by socio-economic position, social connectedness, emotional disposition, collective imaginaries (understood as cultural and societal norms), self-determination, and stress. Hall and Taylor hypothesize that the observed socio-economic gradient in health inequalities is generated by differences in individual balances of capabilities and challenges.

Data and Variables
The data for the analysis is taken from the 2006 German Socioeconomic Panel (GSOEP). The GSOEP is an annual panel with approximately 20,000 individuals aged 16 and older from over 11,000 households throughout Germany. Each person in the sample is interviewed individually. The GSOEP collects data on a broad range of issues, such as population and demography, education, training and qualification, earnings and income, health, basic orientation and satisfaction with specific aspects of life.
Our dependent variable physical health is measured through the reliable and validated Short Form-12 Health Survey (SF-12v2), which uses 12 questions to measure functional health and well-being from the individual's point of view and allows scores for physical and mental health to be generated. 1 Four subscales are used to generate a physical health score, and four are used to generate a mental health score respectively. The physical score mainly refers to evaluations of one's ability to perform physical activity. We use the physical health scores derived from the 2006 GSOEP as the dependent variable. As the discourses on mental health are very different [52], extending our analysis to mental health would require us to scrutinize an even greater variety of approaches. Therefore, we limit our analysis to physical health.

Control Variables
To assess the relative importance of different sets of factors over and above their effect on the variation in health channelled through one of the socioeconomic measures commonly applied in decomposition analyses, we control for income, wealth, education and occupation. Wealth is measured using two binary variables: whether the person is a property owner and whether the person holds financial assets. To measure the impact of the level of wealth, we include two continuous variables: one for the monetary value of the property and one increasing in the value of financial assets. Occupation of the individual is operationalised using the Cambridge Scale of Occupations. Education is captured using five binary variables that take a value of one when the individual has general elementary education, a mid-vocational, vocational or higher vocational qualification, or higher education, respectively. The reference cases are workers without any school qualification. To control for the personal circumstances and working arrangement of each individual, the analysis also includes information on marital status, the number of children living in the household, whether the respondent has immigrant status, and whether the individual works part-time.

Explanatory Variables

Material Factors
Working conditions are captured through a binary variable, which takes a value of one if the individual works in poor conditions. Moreover, we use a five-category ordinal variable on self-assessed pollution and noise, ranging from no impact to very strong impact, and a binary variable on whether there is strong social coherence in the neighbourhood. We approximate infrastructural conditions by using two binary variables, which take a value of one if the individual needs more than twenty minutes to reach the doctor's practice or the nearest public transportation site, respectively, and we also include a self-assessed distance in kilometres to the nearest big city. Differences caused by different exposures to the health care system are measured via the individual's mandatory, voluntary, or private health insurance arrangement.

Psycho-social approach
To capture the psycho-social approach focusing on social support and the nature of living conditions, we include two binary variables, which take a value of one if the person lacks social or personal support (e.g., no one to confide in, no one to support his/her career). Furthermore, we use an ordinal variable increasing in perceived job security to approximate secure living conditions, an ordinal variable that captures the degree of job autonomy, and an ordinal variable for the level of the perceived time pressure at work. Siegrist's [44,45] reciprocity notion is conceptualized using interaction terms between the lifestyle variables of smoking and alcohol and a binary variable, which takes a value of one if the individual does not believe that his/her efforts at the workplace are adequately rewarded in terms of pay or direct appreciation.

Cultural-behavioural factor
We include four arguably reliable lifestyle variables from the 2006 GSOEP: how often the individual exercises, whether the individual smokes, whether the individual regularly drinks hard liquor, and how the individual's weight compares to the norm. The inclusion of the weight status is a proxy for the quantity and quality of food intake. To incorporate Bourdieu's notion of culturally framed behaviours, we include further interactions between smoking and alcohol consumption and a binary variable on low educational attainment (lower or mid-vocational training) in the model.

Capability approach
The capability approach has not been operationalised for decomposition analysis of health inequalities yet. Our understanding is that an individual is equipped with resources that can be applied to life challenges. The discrepancy between the magnitude of resources enabling the capability to deal with negative life events and the sum of challenges decides the individual's ability to sustain good health. To capture the distance between resources and challenges, we calculate individual scores for resources and challenges and then subtract the latter from the former to generate one variable representing the capability approach. This approach departs from the original formulation by Sen and others by adding up several dimensions/variables of capabilities. It is, however, necessary to generate a measure that approximates what Hall and Taylor [35] consider the overall 'balance' between challenges and capabilities and its subsequent distribution across the population. Hence, our measure captures the combined effect of (different) capabilities and challenges, approximating the 'wear and tear' a person suffers in daily life [35].
Our resource score is computed by adding up the following variables: social status (transformed into a five-category ordinal scale using quintiles), supportive confidantes (two binary variables), trust in democracy (a binary variable indicating whether the person belongs to the top 50 per cent of trustful citizens, based on individuals' assessments on a ten-point scale ranging from zero (no trust) to ten (high trust)), and an ordinal variable capturing how well the person deals with stress on a seven-category scale.
The challenge score is constructed by adding up binary variables for poor working conditions, lack of advancement chances, lack of job security, and an ordinal variable increasing in lack of autonomy at work on a five-point scale.
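The construction of the capability variable described above can be sketched as follows. This is a minimal illustration, not the authors' code: the class name `Respondent`, all field names, and the numeric coding of the ordinal scales (1 to 5 and 1 to 7) are assumptions made for the example, since the text does not report the exact coding.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Resources (hypothetical field names; higher values mean more resources)
    social_status_quintile: int   # assumed coding 1..5
    has_confidant: int            # 0/1
    has_career_support: int       # 0/1
    high_trust_in_democracy: int  # 0/1, top 50 per cent of trust ratings
    stress_coping: int            # assumed coding 1..7
    # Challenges (hypothetical field names; higher values mean more challenges)
    poor_working_conditions: int  # 0/1
    no_advancement_chances: int   # 0/1
    no_job_security: int          # 0/1
    lack_of_autonomy: int         # assumed coding 1..5


def resource_score(r: Respondent) -> int:
    """Sum of the resource items."""
    return (r.social_status_quintile + r.has_confidant + r.has_career_support
            + r.high_trust_in_democracy + r.stress_coping)


def challenge_score(r: Respondent) -> int:
    """Sum of the challenge items."""
    return (r.poor_working_conditions + r.no_advancement_chances
            + r.no_job_security + r.lack_of_autonomy)


def capability_score(r: Respondent) -> int:
    """Resources minus challenges: the single capability variable."""
    return resource_score(r) - challenge_score(r)


example = Respondent(3, 1, 1, 1, 5, 1, 0, 1, 2)
print(capability_score(example))
```

Because the items are simply added, each item's influence on the score depends on its scale range; a different coding of the ordinal items would shift the balance between resources and challenges.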

Modelling
We only include individuals 16 years of age and older who are employed either part- or full-time to a) test various theoretical explanations that explicitly draw on mechanisms based in working life and b) reduce potential endogeneity caused when people leave the working population due to health problems [31]. 2 Restricting the sample by age and labour market status reduces it to 11,388 individuals. Of these individuals, 11,067 have a valid physical health score. After excluding the observations with missing values in the explanatory variables, 3,500 individuals in our final sample are female and 3,980 are male.

Methods
Research in the decomposition of factors is rooted in and driven by research applied to income inequality. In 1982, Shorrocks [54] developed a method that decomposes inequality in income by sources of factor components. Later, Morduch and Sicular [55] and Fields [25] extended Shorrocks' [54] approach to a regression-based decomposition of inequality. They expressed household income as a linear function of explanatory variables and used the regression coefficients to calculate the decomposed variance for all variables in the model. The regression-based decomposition had the advantages that (1) it yielded an exact allocation of contributions to the identified factors, (2) it provided measures of uncertainty around the decomposed values that are part of standard regression analysis, and (3) it allowed for the analysis of multiple factors.
Given that our aim is to scrutinise the direct impact of various sets of explanatory variables on health, we choose to depart from the standard concentration index approach [56,2] and follow Fields' [25] decomposition method. Following Fields' method, health is first regressed on a range of explanatory variables using a standard least squares regression model of the form:

$$Y_i = \beta_0 + \sum_{k=1}^{K} \beta_k X_{ik} + \varepsilon_i \qquad (1)$$

where $Y_i$ is the health of individual $i$, $X_k$ is a vector of variables $X_1, X_2, \ldots, X_K$ thought to determine health (there are $k = 1, 2, \ldots, K$ variables included in $X_k$), and $\beta_k$ is a vector of coefficients $\beta_1, \beta_2, \ldots, \beta_K$ pertaining to each variable $k$. $\varepsilon$ is an error term with a mean value of zero and a variance of unity, and $\beta_0$ is the intercept term. The estimated coefficients are denoted by $(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_K)$ and the residual term is given by

$$\hat{\varepsilon}_i = Y_i - \hat{\beta}_0 - \sum_{k=1}^{K} \hat{\beta}_k X_{ik}.$$

To derive the decomposition according to Fields [25], we then first take the variance of the left and right hand sides of equation (1), which is written as:

$$\operatorname{Var}(Y) = \sum_{k=1}^{K} \operatorname{cov}(\hat{\beta}_k X_k, Y) + \operatorname{cov}(\hat{\varepsilon}, Y) \qquad (2)$$

Dividing (2) by the variance of $Y$ then yields:

$$1 = \sum_{k=1}^{K} \frac{\operatorname{cov}(\hat{\beta}_k X_k, Y)}{\operatorname{Var}(Y)} + \frac{\operatorname{cov}(\hat{\varepsilon}, Y)}{\operatorname{Var}(Y)} = \sum_{k=1}^{K} s(X_k) + s(\hat{\varepsilon}) \qquad (3)$$

The equation partitions the full variance of $Y$ into the share that is explained by the covariance between each of the $X$ factors and the $Y$ values. Fields [25] calls the proportions denoted by $s(X_k)$ "relative factor inequality weights" or s-weights. Each s-weight can be described as the share of the variance in health explained by the determinant $k$, holding all other determinants constant. Dividing the individual s-weights for each $k$ by the model $R^2$ (the proportion of variance explained by all determinants $X_k$ taken together) gives the share of each factor in the explained variation of the linear regression, the so-called p-weights. 4 Each p-weight, assigned to a variable $k$, gives the share of overall R-squared explained by this variable $k$. Because the s-weights sum to R-squared, the p-weights sum to one. The s-weights can therefore be interpreted as a "little R-squared" for each of the variables $k$, and the p-weights as each variable's share of the overall R-squared.
Formally, this is given by:

$$p(X_k) = \frac{s(X_k)}{R^2}, \qquad R^2 = \sum_{k=1}^{K} s(X_k) \qquad (4)$$

Furthermore, Fields [25] shows that under six decomposition conditions (Additional File 1), the s-weights and p-weights are the same for any measure of dispersion that is continuous, symmetric, and takes value zero when all Y are identical (namely the Gini coefficient, the Theil index, and the Atkinson index).
Three points are important when interpreting the decomposition results. First, Fields' [25] approach decomposes the predicted value of Y rather than the actual value of Y. Thus, using this approach, we quantify the relative importance of determinants of explained inequality in Y. Second, Fields [25] also pointed out that the weights can take negative values. 5 Third, this method relies on the linearity of the model.
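A minimal sketch of the regression-based decomposition in Python with NumPy is given below. It assumes an ordinary least squares fit with an intercept and computes each s-weight as the estimated coefficient times the covariance of that regressor with Y, divided by the variance of Y. The function name `fields_decomposition` and the simulated data are illustrative assumptions; this is not the authors' code and does not use GSOEP data.

```python
import numpy as np


def fields_decomposition(X, y):
    """Return (s_weights, p_weights, r_squared) for an OLS fit of y on X.

    X: array of shape (n, K) with the explanatory variables (no constant).
    y: array of shape (n,) with the outcome.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # OLS with an intercept column
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    slopes = beta[1:]

    # cov(X_k, Y) and Var(Y), both with the same (population) normalisation
    y_centred = y - y.mean()
    cov_xy = (X - X.mean(axis=0)).T @ y_centred / n
    var_y = y.var()

    s_weights = slopes * cov_xy / var_y   # s(X_k)
    r_squared = s_weights.sum()           # the s-weights sum to R-squared
    p_weights = s_weights / r_squared     # shares of the explained variance
    return s_weights, p_weights, r_squared


# Simulated example (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 1.0 + X @ np.array([0.5, -0.3, 0.0]) + rng.normal(size=500)
s, p, r2 = fields_decomposition(X, y)
print(s, p, r2)
```

An individual s-weight can be negative when the estimated coefficient and the covariance of that regressor with Y have opposite signs, which is the second interpretive point raised above.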

Descriptive statistics
The level of inequality in physical health is descriptively quantified using two commonly used measures: the Gini coefficient and the Theil index. The Gini and Theil entropy measures illustrate that inequality in health increases with age. This is true for both men and women. However, the level of inequality is generally higher among females (Table 1).
Furthermore, the Theil index, which is more sensitive to inequality at the top of the distribution, has lower values indicating lower inequality among the healthier individuals. The Lorenz curve illustrates these findings graphically (Figures 1 and 2).
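For readers who want to reproduce such descriptive measures, the following sketch computes a Gini coefficient and a Theil T index for a vector of strictly positive scores. The function names are illustrative, and the choice of the Theil T variant is an assumption, since the text does not state which Theil measure was used.

```python
import numpy as np


def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1) / n


def theil_t(values):
    """Theil T index; requires strictly positive values."""
    v = np.asarray(values, dtype=float)
    ratio = v / v.mean()
    return float(np.mean(ratio * np.log(ratio)))


scores = [42.0, 55.5, 38.2, 60.1, 49.7]  # made-up scores, not GSOEP data
print(gini(scores), theil_t(scores))
```

Both measures equal zero when all individuals have the same score.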
Summary statistics on all binary and continuous variables for women and men can be found in Tables 2 and 3, respectively. About 17 percent of females hold a higher education degree in the youngest age group (16 to 35 years). The rate increases to 30 percent in women aged 55 to 65 years. The rate is lower in men aged 16 to 35 years (ca. 14 percent) but considerably higher in the oldest age group (ca. 41 percent), illustrating a change in the gender-related difference in education between the generations. About 20 percent of women and men in the youngest age group work in poor conditions, compared to more than 30 percent of both sexes between 45 and 55 years of age. The feeling that nobody supports an individual's own career is quite similar in men and women (about 50 percent of the oldest age group), while the dissatisfaction is slightly higher among females. Obesity steadily increases with age in both sexes and peaks in the oldest age group (15 percent of women and more than 20 percent of men).
Smoking and a low level of education coincide for more than 20 percent of females and for more than 30 percent of males in the age group 16 to 35 years. This co-occurrence decreases in the subsequent age groups.

Decomposition results
In Table 4, we report the percentage contribution of each variable to the total sum of squares for women (i.e., the sum of the s-weights per factor of each of the competing approaches) by applying Fields' [25] regression techniques. The smoking status in the age group of 16 to 35 years of age explains, for example, approximately two percent of the total variance in health. Weight problems explain almost three percent of the total variation in health in individuals aged 56 to 65 years. Figure 3 illustrates the relative importance of the different approaches towards health inequalities in the four age groups. The height of the bar indicates the overall explained sum of squares. It ranges between approximately 10 (in the age group 36 to 45 years) and 22 percent (in the oldest age group).
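The aggregation of variable-level s-weights into a contribution per approach can be sketched as below. The variable names, the mapping of variables to approaches, and all numeric weights are made-up placeholders chosen for illustration; they are not values from Table 4.

```python
from collections import defaultdict


def sum_by_approach(s_weights, approach_of):
    """Add up variable-level s-weights within each theoretical approach."""
    totals = defaultdict(float)
    for variable, weight in s_weights.items():
        totals[approach_of[variable]] += weight
    return dict(totals)


# Placeholder weights and grouping (hypothetical, not from the paper)
toy_s_weights = {"smokes": 0.02, "obese": 0.01,
                 "poor_work": 0.03, "capability": 0.04}
toy_grouping = {"smokes": "cultural-behavioural",
                "obese": "cultural-behavioural",
                "poor_work": "material",
                "capability": "capability"}

by_approach = sum_by_approach(toy_s_weights, toy_grouping)
print(by_approach)
print(sum(by_approach.values()))  # overall explained share in this toy case
```

Summing the approach totals gives the overall explained share, which corresponds to the height of each bar in a stacked chart of contributions by approach.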
Cultural-behavioural variables are most influential in the youngest generation (16 to 35 years). In the subsequent two age groups, their relevance in terms of total R-squared is lower but increases again in the fourth age group. Other variables, such as material variables and the capability score become relevant in later age groups. The variables summarised under the material perspective play a prominent role in the last age group, but their impact is low between the ages of 16 and 55. The capability score (i.e., the distance between challenges and resources) is a strong and increasing explanatory factor between 35 and 65 years but is far less important to the first age group. The contribution of psycho-social variables is substantial in the second age group. For all other age groups, the contributions are modestly relevant. The importance of the control variables is high in the first and last age group, but it contributes to explained variance on a low level between 36 and 55 years of age.
In Table 5, the explained variance of health steadily increases in the first three age groups (from about 10 percent to 18 percent) but slightly decreases in the last age group (to about 16 percent).
For men, behavioural variables contribute to variance in health to a substantial degree. The effects are largest in the last two age groups. The capability approach also exerts a considerable effect on total R-squared. It accounts for roughly equal shares of the explained variance in health in the first two age groups and makes a high contribution in the last two age groups. Total contribution to R-squared is highest in the fourth age group. The psycho-social approach is most influential in the first age group, and the impact varies between low and medium contribution for the other age groups.
The importance of the control variables steadily increases between the ages of 45 and 55 and is the most influential factor in the third age group. In fact, control variables contribute more than seven percent to the explanation of the total sum of squares. Interestingly, the control variables play a less important role for the last age group. Figure 4 illustrates the contributing factors by approach.
Each of the decompositions is based on an ordinary regression model, which provides information on the direction and magnitude of the coefficients. In the regression models, behavioural variables (obesity in females and hard liquor consumption in men) have a comparably high statistical significance at young age. The same holds for obesity in the older age groups as it has a direct impact on health. 6 The capability variable is statistically significant throughout the models. All coefficients of the underlying regression models for women and men used in the decomposition can be found in Additional File 2.

[Figure: Lorenz curve, physical health of men]

Discussion
[...] underlying health inequalities. The analysis suggests that three main factors persistently contribute to variance in health over and above their effect on the variation in health channelled through one of the socioeconomic measures: the capability score, cultural-behavioural variables, and to a lesser extent the materialist approach.
Of the three, the capability score illustrates the importance of interaction and compound effects as it captures the individual's socioeconomic, social, and psychological resources in relation to his/her exposure to life challenges. On one hand, the high explanatory power of the variable suggests that independent challenges for health (e.g., low household income or education) can be tackled if the individual has access to a high level of resources (e.g., high social capital). On the other hand, it illustrates the difficulties of maintaining good health if the individual faces many challenges while resources are low. Moreover, our analysis shows that for men and women, the difference between resources and challenges is particularly important during the last stage of work life (46 to 65 years). This may suggest that individuals become increasingly vulnerable when they face a high disparity of challenges and resources over a long time or in older age. Individuals may, for example, feel more strained as the challenges posed by job and family life persist while the efficiency in the production of their own health decreases, so that the ability to maintain good health declines.
The results for the materialist variables complement the findings for the capability score. Our analysis for women and men suggests that materialist variables are important for health inequalities during the last stage of work life. However, considering the relative importance of the capability score compared to materialist variables, this perspective suggests that for old age, inequalities are even better captured if one considers vulnerability in terms of the interacted/compound effects of the discrepancy between resources and challenges.
With regard to the cultural-behavioural variables, our analysis suggests that these increasingly contribute to R-squared for individuals from older age groups (36 to 65 years). Therefore, being overweight is a key contributor to health inequalities for both sexes during later stages of work life. Moreover, our analysis shows that cultural-behavioural variables are important in the youngest generation, especially for women. Again we see that weight problems contribute to health inequalities in women while the consumption of hard liquor explains a high share of variation in health in men. The control variables add explanatory power across all models and especially in the first and last age group for women and in the middle age groups for men. Therefore, they should be considered in fully specified analyses.
There are a number of limitations to our analysis. First, we do not claim that the selection of variables exhaustively reflects the notions of the presented theories. Building on previous studies, we try to operationalise the approaches as closely as possible, but some choices in operationalisation will remain ultimately normative. For example, we did not include the interaction between smoking and low levels of household income in the decomposition analysis because we considered the cultural-behavioural effect to be better captured by the interaction between smoking and low levels of education. Although this view was informed by our literature review, the decision remains subjective. Second, the approaches are operationalised with different numbers of variables. This might marginally affect the contribution to variance in the analysis. However, we are confident that this shortcoming is moderate. For example, the capability approach, which is operationalised with one variable, is a major contributor in the analysis. Overall, our aim is to give an impression of the relative importance of the variables. We do not claim that the relative percentages should be taken at face value because they might be slightly distorted by collinearity. Last, in an econometric sense, we do not apply methods that allow conclusions about causal mechanisms. Therefore, we have to be cautious not to overstate our results.

Conclusions
Models that take a reductionist perspective and do not allow for the possibility that health inequalities are generated by factors over and above their effect on the variation in health channelled through one of the socioeconomic measures are underspecified and may fail to capture the determinants of health inequalities. This was particularly evident when we modelled the distance between resources and challenges in life, which in our analysis is a very important factor contributing to inequality in health.

1 The GSOEP version of the SF-12v2 deviates from the original SF-12v2 to a limited degree with regard to formulation, order of questions, and general layout. See Anderson et al. for more specific information (2007: 172).
2 We chose an age of 16 as the lower cut-off point for inclusion in the sample because 16 is the minimum age for legal work in Germany. About 20% of youths between the age of 16 and 18 are in the labour force.
3 We believe that the working population over age 65 represents a selected sample of likely very healthy individuals. To avoid any distortions in the analysis, we decided to use the official retirement age as the upper cut-off for the last age group.
4 [...]
5 [...] This implies that a negative value arises whenever the two beta coefficients have opposite signs (i.e., whenever controlling for multiple factors within a regression framework would reverse the sign from the simple regression).
6 The significance level of the single coefficient of dummy variables (household income, education, insurance etc.) in the regression models is low. However, their combined effect (e.g., five dummy variables for education) is most of the time significant (using an F-test).

Additional material
Additional file 1: A1. Decomposition conditions. Lists six conditions (based on Shorrocks (1982, 1983) and Fields (2004)) under which the s-weights and p-weights are the same for any measure of dispersion that is continuous, symmetric, and takes the value zero when all Y are identical (the Gini coefficient, Theil index, and Atkinson index).

Authors' contributions
LS participated in the design and concept of the study, carried out the statistical analysis, and drafted the manuscript. DSK participated in the design of the study, reviewed the literature and drafted the manuscript. RB participated in the revision of the manuscript and approved the final manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.