A reverse factual analysis of the association between smoking and memory decline in China

Background Whether smoking accelerates memory decline has been a topic of significant research. However, randomised controlled trials are difficult to carry out and would not comply with research ethics, while observational comparisons based on the most readily available data can easily lead to wrong conclusions if left unadjusted. The raw memory difference between smokers and non-smokers may not represent the real difference attributable to smoking. Methods In response to these limitations, we adopt the propensity score method to match the samples and address selection bias and confounding bias among elderly people aged 60 years and over, using data from the Chinese Longitudinal Healthy Longevity Survey (2011). The respondents are divided into non-smokers, people who used to smoke but no longer do, and people who used to smoke and still do. To balance the propensity-score-weighted distributions of pretreatment covariates across groups, we use generalized boosted models to estimate the multiple-treatment propensity scores. Results Compared with non-smokers, people who used to smoke and still do show decreases of 0.0283, 0.0735 and 0.0091 in self-evaluated memory, daily living activities and cognitive function, respectively. People who used to smoke but no longer do show a decrease of 0.0224 in daily living activities, but increases of 0.0054 and 0.0104 in self-evaluated memory and cognitive function, respectively. Conclusion Propensity score matching (PSM) has considerable utility for controlling pre-treatment imbalances on observed covariates in non-randomised or observational data.


Background
China is the world's largest producer and consumer of tobacco [1]. It has 350 million smokers and accounts for 37 % of global tobacco production [2]. The increasing number of smokers and the consequent effects have gained public attention [3]. Many scholars have found smoking to be associated with a range of factors, e.g. substance abuse or dependence, increased work time, social isolation, negative life events, family breakdown, child abuse, behavioural problems, family history of smoking and anxiety, etc. [4][5][6][7][8][9]. Passive smoking also affects the Chinese elderly across all life stages, including the risk of depression, impairment of daily living ability, the odds of self-reported chronic diseases, and the impact of cognitive function on social participation [10].
In recent years, researchers have paid more attention to the negative impacts of smoking on working memory [11,12]. A longitudinal study of eight long-term smokers found that the decline of their memory, cognitive function, and attention ability was closely related to smoking [13]. Compared with non-smokers, smokers perform worse in cognition and memory and, in the long run, are more likely to suffer from depression and anxiety [14][15][16][17][18]. Among adolescents, smoking does more serious damage to working memory [19,20]. For smokers between ages 43 and 53 who smoke more than 20 cigarettes a day, memory decline is faster than in youths [21]. Elderly people who smoke or used to smoke, compared with those who have never smoked, show more severely declining memory and cognitive functions as well as a greater risk of Alzheimer's disease [22][23][24]. The proposed reason is that the harmful substances in tobacco or nicotine impair sleep quality and consequently damage memory and cognitive functions [25]. Smokers' working memory ability and cognitive efficiency are significantly lower than those of non-smokers, so people should pay attention to smoking and memory impairment [26]. However, some researchers find that the working memory and ability of short-term smokers were improved compared with those of non-smokers [27].
Memory is also affected by aging. Whether smoking accelerates memory decline has been a topic of significant research. In some studies, scholars used randomised controlled trials, so the behaviours and results of smokers and non-smokers could be easily observed. Randomised controlled trials need to recruit a large number of participants, who are then randomly assigned to smoking and non-smoking groups. Nevertheless, this type of experiment is not easy to carry out, and does not comply with research ethics. In this case, observation is the most appropriate method. However, based on the most readily observed data, it is easy to draw the wrong conclusions without adjustment. For example, when comparing the best memory of the smoking group with the poorest memory of the non-smoking group, we would come to the conclusion that smoking is harmless to memory. The reason is that an observational study does not use randomised grouping, and it is randomisation that weakens the influence of confounding variables across the treatment and control groups. Without it, systematic bias arises easily. The Propensity Score Matching (PSM) method can address this problem and reduce the interference of confounding factors between the two groups. This research study employed the PSM method to match samples, using the propensity score to control covariates and to address the estimation bias caused by self-selection.

Sample
Research data for this study came from the latest wave of the Chinese Longitudinal Healthy Longevity Survey (CLHLS), conducted in 2011. The CLHLS conducted face-to-face interviews in 1998, 2002, 2005, 2008, and 2011, using internationally compatible questionnaires. The survey design investigated each centenarian in the sampled counties based on the voluntary principle, focusing on the oldest old, i.e. ages 80 and older, from 631 counties of 22 provinces in China. In order to yield a comparable sample, the CLHLS included younger elderly people aged 65-79 and their adult children aged 35-64 from 2002 onward. Questionnaires covered the basic conditions of respondents, their social and economic backgrounds and family structures, as well as respondents' self-evaluation of health status and quality of life, lifestyle, disease, health and other detailed information. The survey project was supported by Demographic Analysis of Health Longevity in China and Duke University in 1998, the United Nations Population Fund (UNFPA) and China Social Sciences Foundation in 2002, and the China Natural Sciences Foundation and Hong Kong Research Grants Council in 2004. The CLHLS is currently the most representative micro-panel data related to elderly health, and has the largest global sample of centenarians according to a report in Science. The survey data are of high quality in terms of sample loss, accuracy of respondent age, and reliability and validity of main variables [28,29].
In China, the age benchmark of sixty years is usually referred to as 'a cycle of sixty years'. China is located in the Asia Pacific region, which generally considers people aged 60 years and older as elderly; the normal retirement age in this region is 60 years. Therefore, in this paper, the study subjects are elderly people born in 1951 or earlier, who had reached 60 years of age or above by 2011.

Measures
In measuring the memory status of the elderly, this research study used multiple indicators to represent the different dimensions of memory in order to analyse the influence of smoking on people's memory.

Self-evaluation memory
Self-evaluation memory is a comprehensive measurement index, including respondents' subjective and objective memory status.

Activities of daily living
Memory aging appears to be a normal process for adults. Although it tends to inconvenience the elderly, generally speaking, it does not have a great impact on their working, learning and living. This research asked the elderly questions to measure their daily living abilities. The questions covered dressing, bathing, talking, waking up, doing housework, cooking, grocery shopping, taking medicine, etc.
Survey question: 'Does health or memory cause difficulty in the completion of daily activities?' Response coding: 'not difficult' = 1 (representing good memory); 'difficult but could still complete', 'difficult and need help', 'cannot complete' = 0 (representing general or bad memory).
If any one of a respondent's answers was 'difficult but could still complete', 'difficult and need help' or 'cannot complete', that respondent was defined as having general or bad memory.
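The coding rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the idea of passing one respondent's answers as a list are assumptions made here, not the survey's actual variable layout.

```python
# Response labels are quoted from the text; everything else is hypothetical.
NOT_DIFFICULT = "not difficult"
DIFFICULT_RESPONSES = {
    "difficult but could still complete",
    "difficult and need help",
    "cannot complete",
}


def code_daily_living(responses):
    """Return 1 if every activity is 'not difficult', otherwise 0."""
    if not responses:
        raise ValueError("at least one response is required")
    for response in responses:
        if response in DIFFICULT_RESPONSES:
            return 0  # a single reported difficulty is enough
        if response != NOT_DIFFICULT:
            raise ValueError(f"unexpected response: {response!r}")
    return 1
```

A respondent who reports no difficulty on every item is coded 1; one 'difficult and need help' among otherwise easy items yields 0.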

Cognitive function
The aging process is accompanied by the decline of cognitive function. Cognitive function degradation is often an early symptom of Alzheimer's disease, brain atrophy, and Parkinson's disease, which have been difficult problems for the elderly across many countries. The questionnaire drew on the internationally popular Mini-Mental State Examination (MMSE) to assess respondents' orientation skills, immediate recall, delayed recall, structural imitation and calculation ability. Scores range from 0 to 31. Educational background also influences MMSE scores [30]. Taking into account the low level of education among the elderly, this research followed the coding of Cui et al. [30]. If respondents did not have a formal education, 18 points or less constituted impaired cognitive function. If respondents had 1-6 years of education, 21 points or less indicated impaired cognitive function. If respondents had more than 6 years of education, 25 points or less qualified as impaired cognitive function.
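The education-adjusted cutoffs described above translate directly into a small function. The sketch below is illustrative: the function and argument names are invented here, and it returns a boolean flag rather than guessing how the 0/1 outcome variable was oriented in the analysis, which the text does not state.

```python
def below_mmse_cutoff(mmse_score, years_of_education):
    """Return True if the score falls at or below the education-specific cutoff.

    Cutoffs follow the text: 18 (no formal education), 21 (1-6 years),
    25 (more than 6 years). The 0-31 score range is taken from the text.
    """
    if not 0 <= mmse_score <= 31:
        raise ValueError("score must lie between 0 and 31")
    if years_of_education < 0:
        raise ValueError("years of education cannot be negative")
    if years_of_education == 0:
        cutoff = 18
    elif years_of_education <= 6:
        cutoff = 21
    else:
        cutoff = 25
    return mmse_score <= cutoff
```

For example, a score of 20 is below the cutoff for a respondent with three years of schooling but not for a respondent with no formal education.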

Variables
The core independent variable in this study was respondent smoking behaviour. Respondents were divided into non-smokers, people who used to smoke but no longer do, and people who used to smoke and still do. In order to investigate the effects of smoking on the memory of the elderly, some major demographic characteristics, i.e. social and economic status, family relationship and support, and lifestyle variables, were controlled. American researchers found that male smokers are weaker than female smokers in memorizing people's names, and that there is no gender difference in the effects of long-term smoking on memory ability; both males and females suffer memory deficits [31]. If the habit of smoking continues with aging, it causes memory disorders [32]. Therefore, demographic variables included in this research were age, gender (male coded as 1, female coded as 0), residence (city coded as 1, rural coded as 0), current marital status (married coded as 1; other, i.e. unmarried, divorced or widowed, coded as 0), and memory-related disease diagnosis (yes coded as 1, no coded as 0). Blau and Duncan [33] observed that the level of education was, like occupation and income, an important indicator of social and economic status. Given the low education of the elderly and the 6-year primary school completion requirement in China, socioeconomic status variables included years of education (more than 6 years of education coded as 2; 1-6 years coded as 1; no education coded as 0), employment before 60 years old (had a job coded as 1, other coded as 0), and self-evaluated economic level (middle, more than middle and very high coded as 1; lower and poor coded as 0).
Cognitive neuroscientists at Michigan State University found that exercise, especially aerobic exercise, could improve long-term memory. In other words, people who do not exercise may not have very good memory. Researchers at Illinois State University found that older people who often exercise have better memories, and that neural activities associated with cognitive activities make them more active and effective. In addition, alcohol paralyses the brain and inhibits nerve cells, which results in sluggish reactions and affects the hippocampus in the brain. The hippocampus plays a key role in memory; consequently, alcohol causes declining memory [34]. Therefore, lifestyle variables included regular exercise (yes coded as 1, no coded as 0), and drinking habits (do not drink coded as 2, used to drink but not now coded as 1, regular drinker coded as 0).
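To make the coding scheme concrete, the sketch below encodes one respondent record into the numeric covariates listed above. The dictionary keys, category labels and function names are hypothetical placeholders chosen for this illustration; only the numeric codes come from the text.

```python
# Numeric codes follow the text; all field names and labels are assumptions.
DRINKING_CODES = {
    "do not drink": 2,
    "used to drink but not now": 1,
    "regular drinker": 0,
}
ECONOMIC_CODES = {
    "very high": 1,
    "more than middle": 1,
    "middle": 1,
    "lower": 0,
    "poor": 0,
}


def encode_education(years):
    """0 = no education, 1 = 1-6 years, 2 = more than 6 years."""
    if years < 0:
        raise ValueError("years of education cannot be negative")
    if years == 0:
        return 0
    return 1 if years <= 6 else 2


def encode_covariates(record):
    """Map a raw respondent record (a dict) to the coded covariates."""
    return {
        "age": record["age"],
        "male": 1 if record["gender"] == "male" else 0,
        "city": 1 if record["residence"] == "city" else 0,
        "married": 1 if record["marital_status"] == "married" else 0,
        "memory_disease": 1 if record["memory_disease"] else 0,
        "education": encode_education(record["years_of_education"]),
        "employed_before_60": 1 if record["employed_before_60"] else 0,
        "economic_level": ECONOMIC_CODES[record["economic_level"]],
        "regular_exercise": 1 if record["regular_exercise"] else 0,
        "drinking": DRINKING_CODES[record["drinking"]],
    }
```

An unknown economic or drinking label raises a `KeyError` instead of being silently coded as 0, which makes data-entry problems visible early.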

Reverse factual analysis
PSM is a method based on counterfactual (reverse factual) analysis, first applied in biology, and was put forward by Paul Rosenbaum and Donald Rubin in 1983. After the 1990s, the method was also applied in health economics and other social sciences. In observational studies, data bias and confounding variables exist for various reasons. The basic idea of this method is that when studying the effect of a policy or behaviour, comparing similar treatment and control groups effectively reduces sample selection bias. Compared with traditional matching methods, PSM reduces multiple dimensions to one single dimension, which lowers computational difficulty and the sample size requirement, and improves the probability of matching success [35][36][37][38]. Propensity score methods do not require modelling the mean of the outcomes. Accordingly, this research only used pretreatment covariates and treatment assignments of study participants to implement propensity score adjustments, which avoided bias from misspecification of the outcome model [39,40]. Numerous studies have shown that the propensity score method can be extended to multiple treatment cases [41][42][43][44][45][46].
In the multiple treatment setting, Generalized Boosted Model (GBM) can capture complex and nonlinear relationships on pre-treatment covariates through an iterative process, finding the propensity score leading to the best balance between treatment and control groups [47].
That is, GBM can deal with continuous and discrete variables.
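As a reminder of what is being estimated, the propensity score in a setting with several treatment levels can be written as below. The notation (T for the smoking group, X for the pre-treatment covariates, M for the number of groups) is introduced here for illustration only and does not come from the original text.

```latex
p_t(x) = \Pr(T = t \mid X = x), \qquad t = 1, \dots, M,
\qquad \sum_{t=1}^{M} p_t(x) = 1 .
```

With three smoking groups, M = 3. A boosted model supplies an estimate of each p_t(x) without requiring a parametric form for how the covariates enter.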

Sample selection and description
Research samples were selected based on the following principles: (1) Samples were selected according to respondent birth year; samples born after 1951 were eliminated; (2) Samples with missing variable values were deleted. Ultimately, there were 3311 samples, including 1881 non-smokers, 1000 smokers and 430 respondents who used to smoke but no longer did. Table 1 shows descriptive statistics of all sample data, and compares the three groups. The gender distribution was relatively balanced, with 1652 males and 1659 females. Among males, 852 smoked, accounting for 51.57 % of the male subsample; only 433 did not smoke, accounting for 26.21 %; and 367 used to smoke but no longer did, accounting for 22.22 %. Among females, only 148 smoked, equivalent to 8.92 %; 1448 were non-smokers, i.e. 87.28 %; and 63 used to smoke but no longer did, accounting for 3.80 %. Most respondents lived in rural areas, had 1-6 years of education, and mainly engaged in farming before 60 years of age. Furthermore, most were married, with relatively uniform self-evaluated economic levels.
For all samples, the mean values of self-evaluated memory, daily living abilities, and cognitive function were 0.1444, 0.5630 and 0.3328, respectively. Non-smokers were the best in cognitive function. People who used to smoke but no longer did were the best in self-evaluated memory. People who smoked were the best in daily living abilities. An F test showed that differences in self-evaluated memory among the three groups were not significant at the 0.1 % level, while differences in daily living abilities were significant at the 10 % level.

Balancing the treated and controlled groups
Although each of the three groups performed best on one particular indicator, we could not infer from these raw comparisons whether smoking impairs memory. The relationship may be endogenous, so comparing the three groups directly would lead to estimation bias, because the residual error may include 'disease' factors which are related to smoking but cannot be controlled by observable variables. Therefore, the effects of smoking on memory may be exaggerated or understated [48]. As a result, this research used generalized boosted regression to estimate propensity scores and weighted the comparison cases to estimate the average treatment effect on the treated group.
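One common way to turn estimated propensity scores into weights for the average treatment effect on the treated is sketched below. It assumes the scores have already been estimated, gives members of the target group a weight of 1, and gives members of any other group g the ratio p_target(x) / p_g(x). Whether this matches the weighting actually used in the analysis is an assumption; the function name, group labels and data layout are hypothetical.

```python
def att_weights(groups, scores, target):
    """Compute ATT-style weights from multi-group propensity scores.

    groups: list of observed group labels, one per respondent.
    scores: list of dicts mapping each group label to its estimated
            probability for that respondent.
    target: label of the group whose covariate distribution the other
            groups are reweighted to resemble.
    """
    weights = []
    for group, probs in zip(groups, scores):
        if group == target:
            weights.append(1.0)  # target-group members keep weight 1
        else:
            if probs[group] <= 0:
                raise ValueError("propensity scores must be positive")
            # up-weight respondents who look like target-group members
            weights.append(probs[target] / probs[group])
    return weights
```

A respondent who resembles the target group (high p_target) but belongs to another group receives a large weight, which is what pulls the reweighted comparison group toward the target group's covariate distribution.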
The study utilised two methods to assess the balance, or equivalence, established on pre-treatment covariates of the weighted treatment and control groups [47]. One method was to use the effect size, or absolute standardised bias, and summarise across variables with the mean; the other method was to use Kolmogorov-Smirnov (KS) statistics to assess balance and summarise using the maximum across variables. 5000 iterations were taken to be the optimal number for minimising the largest of the KS statistics. Table 2 shows how well the weights succeeded in adjusting the control group to match or balance the characteristics of the two groups. In Table 2, E(Y1|t = 1) and E(Y0|t = 1) respectively represent the treatment means and the control means for each of the covariates, while KS and P are the Kolmogorov-Smirnov test statistic and its associated p-value. The p-value is derived from Monte Carlo simulations for the maximum KS statistic. Thus, a small p-value indicates the groups are clearly imbalanced and inconsistent with what would be expected had the groups been formed by random assignment. From Table 2, it is evident that balance was achieved after weighting.
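The KS balance summary can be illustrated with a short, self-contained sketch. It computes the largest gap between the weighted empirical distribution functions of a covariate in the two groups, then takes the maximum across covariates. The function names and the dict-of-lists data layout are assumptions made for this example, and no p-values are computed.

```python
def weighted_ecdf(values, weights, point):
    """Weighted share of observations with value <= point."""
    total = sum(weights)
    return sum(w for v, w in zip(values, weights) if v <= point) / total


def weighted_ks(x_treat, w_treat, x_ctrl, w_ctrl):
    """Largest absolute gap between the two weighted ECDFs."""
    points = sorted(set(x_treat) | set(x_ctrl))
    return max(
        abs(weighted_ecdf(x_treat, w_treat, p) - weighted_ecdf(x_ctrl, w_ctrl, p))
        for p in points
    )


def max_ks(cov_treat, w_treat, cov_ctrl, w_ctrl):
    """Summarise balance as the maximum KS statistic across covariates.

    cov_treat and cov_ctrl map covariate names to lists of values.
    """
    return max(
        weighted_ks(cov_treat[name], w_treat, cov_ctrl[name], w_ctrl)
        for name in cov_treat
    )
```

With all weights equal to 1 this reduces to the usual two-sample KS statistic; good weights should push the maximum toward zero.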
In order to demonstrate the importance and rationale of the PSM method in empirical research, useful diagnostic plots from propensity score objects were generated (see Fig. 1). Figure 1 is the standardised effect size plot illustrating the effect of weights on the magnitude of differences among groups on each covariate. The standardised effect size is defined as the treatment group mean minus the control group mean, divided by the treatment group standard deviation. In these plots, closed circles indicate a statistically significant difference. Figure 1 shows many differences occurred before weighting, while none occurred after weighting. Furthermore, effect sizes of most variables were reduced after weighting (referring to the blue lines in Fig. 1).
Fig. 1 The standardized effect size plots before and after weighting
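The standardised effect size defined above can be computed as in the following sketch. The use of the sample standard deviation and of optional weights on the control group only are assumptions of this illustration, as is the function name.

```python
import statistics


def standardized_effect_size(x_treat, x_ctrl, w_ctrl=None):
    """(treatment mean - weighted control mean) / treatment SD."""
    if w_ctrl is None:
        w_ctrl = [1.0] * len(x_ctrl)  # unweighted comparison
    mean_treat = statistics.mean(x_treat)
    sd_treat = statistics.stdev(x_treat)  # sample SD (an assumption)
    mean_ctrl = sum(x * w for x, w in zip(x_ctrl, w_ctrl)) / sum(w_ctrl)
    return (mean_treat - mean_ctrl) / sd_treat
```

Calling the function once without weights and once with them gives the 'before' and 'after' values for one covariate; averaging the absolute values across covariates gives the mean summary mentioned earlier.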

Estimation results
In non-randomised studies, the PSM method can substantially reduce sample selection bias and confounding bias [49]. As shown in Table 3, compared with non-smokers, the analysis estimated a decrease in self-evaluated memory of 0.0283 for smokers and an increase of 0.0054 for former smokers. For daily living activities, the estimated decreases were 0.0735 for smokers and 0.0224 for former smokers. For cognitive function, the analysis estimated a 0.0091 decrease for smokers and a 0.0104 increase for former smokers. However, except for the effect on continued smokers in daily living activities, the effects did not appear to be statistically significant.
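For intuition, an effect of this kind can be obtained as a weighted difference in mean outcomes once the weights are available. The sketch below is a simplification under stated assumptions: it uses hypothetical argument names, places weights only on the comparison group, and omits the standard errors and significance tests that the reported results rely on.

```python
def weighted_mean_difference(y_target, y_comparison, w_comparison):
    """Mean outcome in the target group minus the weighted comparison mean."""
    mean_target = sum(y_target) / len(y_target)
    mean_comparison = (
        sum(y * w for y, w in zip(y_comparison, w_comparison))
        / sum(w_comparison)
    )
    return mean_target - mean_comparison
```

With binary outcomes coded 0/1, as for the three memory indicators, the result is a difference in weighted proportions, which is the scale on which estimates such as 0.0735 are expressed.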

Discussion
Smoking is a widespread and serious issue in China. It plays a certain social role and serves a function in social communication. However, smoking is harmful to people's physical and mental health. Previous studies have indicated that positive or negative expectations of smoking can affect smokers' decisions, intentions and behaviours. Those with positive expectations of smoking believe smoking can promote social interaction, so most of them smoke and are not willing to give up the habit. Positive expectations can effectively predict the consequences of nicotine dependence [50]. Those who hold negative expectations of smoking worry about being rejected by peers if they smoke, so few of them smoke [51,52].
This paper aimed to investigate whether smoking affects memory. PSM was employed to reduce or eliminate confounding characteristics in observational data for smokers and non-smokers. PSM has gained widespread application over the past decade because of its advantages compared with traditional regression methods. PSM uses propensity score weights to control pre-treatment imbalances on observed covariates in non-randomised or observational data. After balancing or weighting, confounding characteristics are reduced or eliminated, and the distributions of observed pre-treatment characteristics are similar among the treatment and control groups. This research study used propensity scores to weight the samples for the treatment group (non-smokers) and control group (current and former smokers).
Results showed that, compared with non-smokers, current smokers scored 0.0283, 0.0735 and 0.0091 lower on self-evaluated memory, daily living activities, and cognitive function, respectively. In contrast, former smokers scored 0.0224 lower on daily living activities, while they scored 0.0054 and 0.0104 higher on self-evaluated memory and cognitive function, respectively. However, most effects did not appear to be statistically significant, except for the effect on daily living activities in current smokers. The point estimates suggest that self-evaluated memory and cognitive function may improve when people quit smoking, although these differences were not statistically significant. Research results were generally insignificant in the elderly Chinese sample. Possible causes include: (1) memory is affected by many factors, e.g. genetics, work demands, psychosomatic diseases, etc.; (2) this research used a three-level ordinal variable rather than detailed smoking history to measure smoking; (3) the coding of cognitive function depends on the Chinese context, and although the education variable was used to adjust the coding, respondent answers may still have had measurement error; and (4) the research sample consisted of elderly people aged 60 and over, so smoking may have been related not only to their present life but also to their early childhood, adolescence, and adult lives.

Conclusion
This study also had limitations: (1) Respondents were divided into three types: non-smokers, former smokers, and current smokers. In order to explore the relationships between smoking and memory more deeply, future research could include a smoking frequency variable instead of a three-level variable. The current research, however, lacked more detailed smoking indexes, e.g. the quality of cigarettes or a combined index of the number of cigarettes and the length of time smoked.
(2) The current study extracted three factors to denote memory: self-evaluated memory, daily living abilities, and cognitive function. However, these indexes were obtained through self-assessment, which may have had measurement error. In fact, accurately measuring ...
(3) PSM only removes confounding by observed variables; that is, if some unmeasured variables differ between the treatment group and the control group, the estimate may be biased. In addition, the gender distribution differs greatly between smokers and non-smokers in Table 1, and gender may be a main influencing factor for memory loss. However, PSM cannot reflect this difference. Therefore, in order to obtain more information on smoking and memory, further studies are needed.