### Multi-level regression

Because the NFHS data are stratified [12], children are naturally nested within mothers, mothers within households, households within Primary Sampling Units (PSUs), and PSUs within states. In view of this hierarchically clustered structure, the paper uses a multilevel regression model to estimate the parameters for nutritional status among children and so avoid the likely under-estimation of parameters from a single-level model [51]. Siblings are expected to share certain characteristics of the mother and the household (for example, mother's education and household economic status), and children from a particular community or village have in common community-level factors such as the availability of health facilities and outcomes. It can therefore be reasonably asserted that unobserved heterogeneity in the outcome variable is also correlated at the cluster levels [52–54]. This poses an estimation problem for conventional OLS estimators, which give efficient estimates only when the community-level and household-level covariates are uncorrelated with the individual- and maternal-effects covariates.

Researchers have adopted fixed effects models to estimate nutrition models and control for unobservable variables at the cluster level. The difficulty with this approach is that, once the fixed effect is differenced away, the effects of variables that do not vary within a cluster are lost in the estimation process [54]. To allow for contextual effects in our analysis of the impact of household socio-economic status on child undernutrition, we adopt the alternative approach of multilevel models.

Broadly, following the practice in contemporary literature, we test two types of multilevel models: the variance components (or random intercept) models and the random coefficients (or random slopes) models. As above, STATA routines for hierarchical linear models, which use maximum likelihood estimators for linear mixed models, were used for both model forms.

The variance components model corrects for the problem of correlated observations within a cluster by introducing a random effect for each cluster. In other words, subjects within the same cluster are allowed to share a random intercept. We consider two clusters, community and household, since in most cases NFHS provides information on the children of one mother chosen from a particular household. Thus, we have,

$z_{ij}=\beta^{\prime}x_{ij}+\delta_{i}+\mu_{ij}$

where $z_{ij}$ is the HAZ score for the child(ren) from the $j$th household in the $i$th community, and $\beta$ is a vector of regression coefficients corresponding to the effects of the fixed covariates $x_{ij}$, which are the observed characteristics of the child, the household and the community. Here $\delta_{i}$ is a random community effect denoting the deviation of community $i$'s mean z-score from the grand mean, and $\mu_{ij}$ is a random household effect that represents the deviation of household $ij$'s mean z-score from the $i$th community mean. The error terms $\delta_{i}$ and $\mu_{ij}$ are assumed to be normally distributed with zero mean and variances $\sigma^{2}_{c}$ and $\sigma^{2}_{h}$ respectively. As per our arguments above, these variances are non-zero and are estimated by the variance components models. To the extent that the greater homogeneity of within-cluster observations is not explained by the observed covariates, $\sigma^{2}_{c}$ and $\sigma^{2}_{h}$ will be larger [55].
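The two-component structure of this model can be illustrated with a small simulation. The sketch below is not the STATA maximum likelihood routine used in the paper: it draws synthetic data from the random intercept equation with a single hypothetical covariate and recovers the two variances with a simple moment-based (one-way ANOVA) estimator for balanced clusters. All parameter values, function names and sample sizes are illustrative assumptions.

```python
import random
import statistics


def simulate_haz(n_communities=400, households_per_community=8,
                 beta=0.3, sigma_c=0.5, sigma_h=1.0, seed=0):
    """Draw z_ij = beta * x_ij + delta_i + mu_ij (all values hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_communities):
        delta_i = rng.gauss(0.0, sigma_c)        # random community intercept
        for _ in range(households_per_community):
            x_ij = rng.gauss(0.0, 1.0)           # one illustrative covariate
            mu_ij = rng.gauss(0.0, sigma_h)      # random household effect
            rows.append((i, x_ij, beta * x_ij + delta_i + mu_ij))
    return rows


def variance_components(rows, beta):
    """Moment-based estimates of (sigma2_c, sigma2_h) for balanced clusters.

    Assumes beta is known, so only the random part delta_i + mu_ij is left.
    """
    residuals = {}
    for i, x_ij, z_ij in rows:
        residuals.setdefault(i, []).append(z_ij - beta * x_ij)
    groups = list(residuals.values())
    n = len(groups[0])                           # households per community
    within = statistics.fmean([statistics.variance(g) for g in groups])
    # The variance of community means estimates sigma2_c + sigma2_h / n.
    between = statistics.variance([statistics.fmean(g) for g in groups])
    return max(between - within / n, 0.0), within


rows = simulate_haz()
sigma2_c_hat, sigma2_h_hat = variance_components(rows, beta=0.3)
print(f"sigma2_c ~ {sigma2_c_hat:.3f}, sigma2_h ~ {sigma2_h_hat:.3f}")
```

With the assumed inputs, the estimates should lie near the true values of 0.25 and 1.0; a full analysis would estimate the coefficients and the variances jointly, as the mixed-model routines do.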

To evaluate the appropriateness of the multilevel models, we test whether the variances of the random part are different from zero over households and communities. The resulting estimates from the models can be used to assess the intra-class correlation (ICC), i.e., the extent to which child undernutrition is correlated within households and communities, before and after we have accounted for the observed effects of the covariates $x_{ij}$. An ICC significantly different from zero suggests the appropriateness of random effect models [54]. The ICC describes the proportion of variation that is attributable to the higher-level source of variation. The correlations between the anthropometric outcomes of children in the same community and in the same family are respectively:

$\rho_{c}=\sigma^{2}_{c}/(\sigma^{2}_{c}+\sigma^{2}_{h})$
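As a minimal sketch, the community-level correlation in the expression above can be computed directly from the two estimated variance components. The function name and the input values are illustrative assumptions, not estimates from the paper.

```python
def community_icc(sigma2_c, sigma2_h):
    """Share of the random-part variance that lies between communities."""
    total = sigma2_c + sigma2_h
    if total <= 0:
        raise ValueError("the variance components must sum to a positive value")
    return sigma2_c / total


# Hypothetical variance components: a quarter of the variance is between communities.
print(community_icc(0.25, 0.75))
```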

Following this, the total variability in the individual HAZ scores can be divided into its two components: variance in children's nutritional status among households within communities, and variance among communities. By including covariates at each level, the variance components models allow us to examine the extent to which observed differences in the anthropometric scores are attributable to factors operating at each level. Thus, the variance components model described above introduces a random intercept at each level or cluster, assuming a constant effect of each covariate on the outcome across the clusters.

If, additionally, we allow the effect of certain covariates to vary across the clusters (for example, a differential impact of household socio-economic status or mother's education across households and/or communities), we need to introduce a random effect for the slopes as well, leading to a random coefficients model. Under these assumptions, the covariance of the disturbances, and therefore the total variance at each level, depends on the values of the predictors [55].
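The dependence of the variance on the predictors can be made concrete for the simplest case of one predictor with a random intercept and a random slope at a given level. If the random part at that level is $u_{0}+u_{1}x$, its variance is $\mathrm{Var}(u_{0})+2x\,\mathrm{Cov}(u_{0},u_{1})+x^{2}\,\mathrm{Var}(u_{1})$. The sketch below evaluates this expression; the function name and all numerical values are hypothetical.

```python
def level_variance(x, var_intercept, var_slope, cov_intercept_slope):
    """Variance of u0 + u1 * x for a random intercept u0 and random slope u1."""
    return var_intercept + 2.0 * x * cov_intercept_slope + x ** 2 * var_slope


# For a 0/1 dummy predictor, the variance differs between the two groups.
for x in (0, 1):
    print(x, level_variance(x, var_intercept=0.2, var_slope=0.1,
                            cov_intercept_slope=0.05))
```

When the slope variance and the covariance are both zero, the expression reduces to the constant variance of the random intercept model.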

As mentioned earlier, a subset of 24,896 children from the hierarchically clustered NFHS-3 dataset has been considered for the analysis. Hence, our multilevel models are based on observations on 24,896 children from 18,078 households distributed across 2,440 communities/clusters (PSUs). Separate levels for children and mothers were not considered necessary, since their numbers were almost unitary relative to the number of households.
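The average cluster sizes implied by these counts follow from simple division, as in the short sketch below. Only the three sample counts come from the text; the variable names are illustrative.

```python
children, households, communities = 24_896, 18_078, 2_440

children_per_household = children / households
households_per_community = households / communities
children_per_community = children / communities

print(f"children per household:   {children_per_household:.2f}")
print(f"households per community: {households_per_community:.2f}")
print(f"children per community:   {children_per_community:.2f}")
```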

The analysis is presented in the form of five models, in addition to a conventional OLS model that ignores the cluster random effects and serves primarily as a comparison. Model_Null is the null model, in which the HAZ score is the dependent variable and no covariates are included. In the later models, other covariates are introduced in a phased manner alongside the poorest and richest household asset quintiles: Model_Kids introduces the child-specific predictors (being purely individual attributes), Model_Moms introduces the mother-specific covariates, and Model_Full is the full model with all the covariates at their respective levels. These models are three-level random intercept models with the two clusters, community and household. In Model_Random_Slope, we introduce a random coefficient for socio-economic status at the household level; we settled for the random coefficient in the form of wealth quintile dummies. The covariates included as controls in our analytical models, with the primary aim of isolating the effect of income or socio-economic status (SES) on chronic child undernutrition, are described below. In the multilevel framework, most of these variables can be classified as individual-specific, household-specific or community-specific covariates.

### Other explanatory variables used as controls

Apart from the above-mentioned asset index, other determinants of childhood malnutrition are chosen based on approaches in the literature and are presented in the conceptual framework (Figure 1) of the study [47]. We consider certain individual characteristics of the child as the proximate covariates of chronic malnutrition. As in other studies, these predisposing factors include the child's age in months; a quadratic term in age, to account for the non-linear relationship between age and HAZ [38]; sex of the child; birth order; size of the child at birth (as a proxy for birth weight) [57]; incidence of recent illness; complete doses of immunization; and recommended feeding practice. The last is denoted by exclusive breastfeeding for infants below six months of age and by the introduction of nutritional supplements, with or without breastmilk, after six months. In view of the information provided by NFHS on child feeding, we considered a child to have been introduced to supplementary food wherever the child was reported to have been given any foodstuff, irrespective of breastfeeding status, on the day preceding the survey date.
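The feeding-practice indicator described above can be sketched as a small rule. This is an illustrative coding only: the function and argument names are hypothetical, exclusive breastfeeding is assumed to mean breastfed with no other foodstuff reported on the preceding day, and a child aged exactly six months is assumed to fall in the older group.

```python
def recommended_feeding(age_months, breastfed_yesterday, any_food_yesterday):
    """Return True if reported feeding matches the age-specific recommendation."""
    if age_months < 6:
        # Assumed coding of exclusive breastfeeding: breastfed, no other foodstuff.
        return breastfed_yesterday and not any_food_yesterday
    # Six months and older: any foodstuff on the day before the survey counts,
    # irrespective of breastfeeding status.
    return any_food_yesterday


print(recommended_feeding(3, True, False))   # infant, exclusively breastfed
print(recommended_feeding(9, False, True))   # older child given foodstuff
```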

The controls for the mother's characteristics include years of education, body mass index (BMI), anemia status, autonomy in seeking medical help for herself [58, 59] and place of birth of the child of interest. At the household level, apart from the asset quintile, a control was included for household ethnicity, since a large number of earlier studies found a significant linkage between scheduled tribe/scheduled caste households and childhood undernutrition [2, 14]. Community characteristics are regarded as the distant covariates of child malnutrition in the model and include rural-urban place of residence and state. Keeping in mind the large-scale variation in childhood mortality and morbidity, the states are included in each of the models as controls, or as fixed effects in the multilevel models.
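The grouping of covariates and the phased model sequence described above can be summarized in a small data structure. The sketch below rests on several assumptions that the text does not confirm: the variable names are hypothetical, each model is taken to retain the covariates of the previous one, and the state dummies are placed in the community block for simplicity even though the text states that states enter each model as controls.

```python
# Hypothetical variable names, grouped by the blocks described in the text.
COVARIATE_BLOCKS = {
    "wealth": ["poorest_quintile", "richest_quintile"],
    "child": ["age_months", "age_months_sq", "sex", "birth_order",
              "size_at_birth", "recent_illness", "fully_immunized",
              "recommended_feeding"],
    "mother": ["education_years", "bmi", "anemia",
               "autonomy_medical_help", "place_of_birth"],
    "household": ["ethnicity"],
    "community": ["urban_residence", "state"],
}

# Assumed cumulative build-up of the four random intercept models.
MODEL_BLOCKS = {
    "Model_Null": [],
    "Model_Kids": ["wealth", "child"],
    "Model_Moms": ["wealth", "child", "mother"],
    "Model_Full": ["wealth", "child", "mother", "household", "community"],
}


def covariates_for(model_name):
    """Flatten the covariate blocks that enter the named model."""
    return [name
            for block in MODEL_BLOCKS[model_name]
            for name in COVARIATE_BLOCKS[block]]


for model_name in MODEL_BLOCKS:
    print(model_name, len(covariates_for(model_name)))
```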