COVID-19 death risk predictors in Brazil using survival tree analysis: a retrospective cohort from 2020 to 2022



This study analyses the survival of hospitalized patients with Severe Acute Respiratory Syndrome (SARS) due to COVID-19 and identifies the groups at risk of death due to COVID-19 by detecting potential interactions between its predictors.


This was a retrospective longitudinal study with data from 1,756,917 patients reported in the Influenza Epidemiological Surveillance Information System from 26 February 2020 to 31 December 2022. The study included all adult and older patients (≥ 20 years) hospitalized with SARS due to COVID-19, with death as the outcome. Survival tree analysis was used to identify potential interactions between the predictors. A model was built for each year of study.


Hospital lethality was 33.2%. The worst survival curve was observed among those who underwent invasive mechanical ventilation and were aged 80 years or older in the three years of the pandemic. Black and brown race/color were predictors of death in 2020 and 2021, when the larger number of cases placed greater demand on the health system.


By applying survival tree analysis, we identified several homogeneous subgroups with different risks of death from COVID-19. These findings show the effects of wide inequalities in the population's access to health services, which call for effective policies for the reduction and adequate management of the disease.


Brazil is among the countries most affected by the COVID-19 epidemic, with a high number of cases, hospitalizations, and deaths. Many cases have evolved into the most severe forms of the disease, dependent on hospitalization and intensive care, especially among the most vulnerable patients and those with risk factors associated with the severity of COVID-19. As a result, Brazil has one of the world’s highest proportions of hospital deaths and one of the lowest hospital survival rates [1].

Data from a systematic review indicate that among cases of COVID-19, the rate of admission to the Intensive Care Unit (ICU) can be 21%, and 69% require mechanical ventilation [2]. Mortality for patients who were admitted to the ICU can be 28.3% and 43% for mechanically ventilated patients [2]. Hospital mortality from COVID-19 in Brazil (38%) [3] was higher than in other countries, such as Germany (22%) [4] and the United States (35.4%) [5] in the first year of the pandemic.

Extensive literature describes several individual risk factors and those related to the network of health services that can lead to the worsening of COVID-19 and generate a higher risk of death, even in hospital care. Among the individual risk factors are black or brown color/race, advanced age, male sex, and presence of comorbidities (especially obesity, diabetes, hypertension, cancer, and chronic kidney disease) [5,6,7,8].

A survey in Brazil showed that the increase in the number of serious cases in cities in the interior of the country overloaded small hospitals, which had few qualified staff and ICU beds to manage the added demand. The consequent need to transport these patients to large capitals overloaded the health system throughout the country [9].

This overload also occurred due to the denialist stance of the Federal Government at the time, which resulted in a response to COVID-19 that was neither uniform nor integrated throughout Brazil and that even relied on treatments without robust scientific evidence of their effectiveness [10]. In contrast to the Federal Government, many municipal and state governments, media sectors, political parties, and the judiciary based their actions on scientifically grounded measures, in line with the efforts of scientific organizations that mobilized to strengthen the Unified Health System (SUS) [10]. Unfortunately, this did not completely prevent the disastrous consequences we faced, with more than 700,000 people dead from the disease [11].

However, despite the large volume of knowledge generated since the disease was discovered in December 2019, countries, regions, and cities still differ in their responses to COVID-19, with serious repercussions for the survival of these patients. Much of the research already carried out has not revealed which patient characteristics interact with each other and explain the survival of patients with COVID-19 in Brazil, since previous studies focused on risk factors in isolation.

Therefore, this study sought to analyze the survival of hospitalized patients with Severe Acute Respiratory Syndrome (SARS) due to COVID-19 and to identify the groups at risk of death due to COVID-19 based on the identification of potential interactions between their predictors.


This was a retrospective longitudinal study with data from 1,756,917 patients reported in the Influenza Epidemiological Surveillance Information System (SIVEP-Gripe) from 26 February 2020 to 31 December 2022. The SIVEP-Gripe is an information system created by the Ministry of Health to record cases and deaths from SARS and COVID-19 in Brazil. Notification of COVID-19 in the system is compulsory, and it receives information on patients in public and private hospitals, as well as on those who died without hospitalization. The data were extracted from the OpenDataSUS website on 1 May 2023.

In this study, all patients aged 20 years or older (adults and elderly) who had a final classification of SARS due to COVID-19 and at least one day of hospitalization were included. Those with missing information or typing errors on the date of hospitalization, discharge date, or information on the evolution of the case (death or discharge) were excluded.


In this study, the following variables were evaluated. Sociodemographic: sex (men, women), age (20 to 39, 40 to 59, 60 to 79, and 80 years or older), race/color (White, Black, Yellow, Brown, Indigenous), and macro-region of the country (Midwest, North, Northeast, South, Southeast). Clinical: Intensive Care Unit admission (Yes, No), mechanical ventilation (Invasive, Non-Invasive, No), and risk factors (No, One factor, Two factors, Three or more).

The risk factor variable refers to the number of comorbidities reported by the patient at the time of hospitalization, which includes the following risk factors: postpartum women, chronic cardiovascular disease, chronic hematological disease, chronic liver disease, asthma, diabetes mellitus, chronic neurological disease, chronic lung disease, immunosuppression, chronic kidney disease, and obesity.
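The categories described above can be illustrated with a short sketch. It is not the authors' code (the analysis was run in R); the function names are hypothetical, and the only inputs assumed are an age in completed years and a count of reported risk factors.

```python
def age_group(age_years: int) -> str:
    """Map age in completed years to one of the four study age bands."""
    if age_years < 20:
        # The study included only patients aged 20 years or older.
        raise ValueError("age below the study inclusion threshold (20 years)")
    if age_years <= 39:
        return "20 to 39"
    if age_years <= 59:
        return "40 to 59"
    if age_years <= 79:
        return "60 to 79"
    return "80 or older"


def risk_factor_category(n_factors: int) -> str:
    """Collapse a count of reported risk factors into the four study categories."""
    if n_factors < 0:
        raise ValueError("the number of risk factors cannot be negative")
    labels = ["No", "One factor", "Two factors"]
    return labels[n_factors] if n_factors < 3 else "Three or more"


print(age_group(67), "|", risk_factor_category(4))  # 60 to 79 | Three or more
```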

For the survival analysis, the outcome (failure) was the occurrence of hospital death within a maximum of 90 days of hospitalization, and the survival time was defined as the interval from the date of admission to the date of evolution (discharge or death). All individuals who were discharged within 90 days or remained hospitalized after this period were censored.

Data analysis

All analyses were performed using R software. To deal with the missing values of some variables, we used single imputation with the Fully Conditional Specification (FCS) method implemented in the mice package [12].

After imputation, descriptive analyses (proportion, mean, and standard deviation) of the variables considered in the study were performed.

The hospital fatality rate was calculated as follows:

$$\text{Hospital fatality rate (\%)} = \frac{\text{Number of hospital deaths due to COVID-19 SARS in the period}}{\text{Total number of reported hospitalizations due to COVID-19 SARS in the period}} \times 100$$
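As a minimal sketch of this calculation, the function below applies the ratio to a pair of counts. The function name and the example counts are hypothetical and are not taken from the study data.

```python
def hospital_fatality_rate(deaths: int, hospitalizations: int) -> float:
    """Hospital deaths per 100 reported hospitalizations in the same period."""
    if hospitalizations <= 0:
        raise ValueError("the number of hospitalizations must be positive")
    if not 0 <= deaths <= hospitalizations:
        raise ValueError("deaths must lie between 0 and the number of hospitalizations")
    return 100.0 * deaths / hospitalizations


# Hypothetical counts, for illustration only.
print(f"{hospital_fatality_rate(250, 1000):.1f}%")  # 25.0%
```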

The survival curve of the included cases was constructed using the Kaplan-Meier method. To identify different groups at risk of death from COVID-19, based on the interactions between the socioeconomic, demographic, and clinical characteristics of hospitalized patients, the survival tree (ST) method was used. The survival tree is a nonparametric technique that incorporates tree-structured regression models. From the survival tree, individuals were grouped according to their survival time and based on the independent variables. Thus, this technique allows the automatic detection of complex interactions between variables without the need to specify them a priori [13].
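To make the Kaplan-Meier step concrete, here is a small self-contained sketch of the product-limit estimator. It is an illustration rather than the authors' implementation (they used R), and the function name and toy data are assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  -- follow-up time in days for each patient
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]     # remove deaths and censored patients after time t
    return curve


# Toy data: five patients, two of them censored (event = 0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```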

For the construction of the survival tree, survival time was defined as the time (days) of hospitalization. The patient’s sociodemographic and clinical characteristics were included in the tree as independent variables, and the patient’s status as a dependent variable: 1 (one) if death occurred within 90 days and 0 (zero) if there was no death within that period (discharge or hospital stay longer than 90 days).
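The coding of time and status described above might look like the following sketch. The function name is hypothetical, and truncating the follow-up time at 90 days for longer stays is an assumption made here for illustration; the text only states that such patients receive status 0.

```python
from datetime import date

MAX_FOLLOW_UP_DAYS = 90


def survival_record(admission: date, outcome_date: date, died: bool):
    """Return (time_in_days, status) with status 1 = death within 90 days, 0 = censored."""
    days = (outcome_date - admission).days
    if days < 1:
        # The study required at least one day of hospitalization.
        raise ValueError("at least one day of hospitalization is required")
    if days > MAX_FOLLOW_UP_DAYS:
        # Stay longer than 90 days: censored (time truncated at 90 by assumption).
        return MAX_FOLLOW_UP_DAYS, 0
    return days, 1 if died else 0


print(survival_record(date(2021, 1, 1), date(2021, 1, 11), died=True))   # (10, 1)
print(survival_record(date(2021, 1, 1), date(2021, 1, 11), died=False))  # (10, 0)
print(survival_record(date(2021, 1, 1), date(2021, 5, 1), died=True))    # (90, 0)
```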

The ST groups patients according to survival time and the independent variables by dividing the sample into subgroups (nodes), with each division based on one independent variable. First, the initial node (root node) of the tree is obtained; child nodes are then created, and this procedure is repeated until the terminal nodes are reached [13]. ST was implemented in the statistical program R, using the survival, LTRCtrees, and partykit packages.
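The idea of choosing a split can be sketched with a two-group log-rank statistic: for each candidate binary variable, the sample is divided in two and the variable giving the largest statistic is retained. This is a simplified stand-in, not the splitting test implemented in the R packages named above, and the function names and toy data are assumptions.

```python
def logrank_statistic(times, events, in_left):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        n = sum(1 for u in times if u >= t)                          # at risk, both groups
        n_left = sum(1 for u, g in zip(times, in_left) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d_left = sum(1 for u, e, g in zip(times, events, in_left)
                     if u == t and e == 1 and g)
        observed_minus_expected += d_left - d * n_left / n
        if n > 1:
            variance += d * (n_left / n) * (1 - n_left / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_split(times, events, candidates):
    """candidates maps a variable name to a list of booleans (one binary split each)."""
    scores = {name: logrank_statistic(times, events, flags)
              for name, flags in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: eight patients; ventilated patients die early in this example.
times = [2, 3, 4, 5, 20, 25, 30, 40]
events = [1, 1, 1, 1, 0, 0, 1, 0]
candidates = {
    "invasive_ventilation": [True, True, True, True, False, False, False, False],
    "other_flag": [True, False, True, False, True, False, True, False],
}
name, score = best_split(times, events, candidates)
# A split is kept only if the statistic exceeds 3.841, the chi-square
# critical value for P < 0.05 with 1 degree of freedom.
print(name, score > 3.841)  # invasive_ventilation True
```

In a full tree, the same search would be repeated within each child node until no candidate split meets the significance criterion.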

The risk of death was determined at each terminal node of the tree, and Kaplan-Meier curves were constructed for each terminal node. The minimum criterion for node division was defined as P < 0.05.

After defining the tree and the number of terminal nodes it contained, a categorical variable was created to specify the terminal node in which each patient was included. Thus, considering that the tree contained k terminal nodes, the categorical variable comprised k categories (groups at risk of death). To obtain the Hazard Ratio (HR) of death at each terminal node, a univariate Cox model was fitted with this categorical variable as the only independent variable, since it was derived from the independent variables considered in the tree. The terminal node containing the patients with the lowest risk of death was taken as the reference category. A model was built for each year of study.
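In symbols, the univariate Cox model on the node variable can be written as follows. The notation is an assumption introduced here for illustration: G is the terminal node of a patient, h_0(t) the baseline hazard, and "ref" the reference node.

$$h(t \mid G = g) = h_0(t)\,\exp(\beta_g), \qquad g = 1, \dots, k, \qquad \beta_{\text{ref}} = 0$$

$$\mathrm{HR}_g = \frac{h(t \mid G = g)}{h(t \mid G = \text{ref})} = \exp(\beta_g)$$

Under this form, a hazard ratio of 16.20 for a node corresponds to a coefficient of about ln(16.20) ≈ 2.79 relative to the reference node.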

Appraisal by the Ethics Committee of research involving human beings was not necessary, as this research was prepared only with secondary public data available online at official electronic sites. These databases do not contain personal or household identification data, which guarantees respect for the secrecy and privacy of the research participants’ information [14].


Among the 1,756,917 hospitalized patients with COVID-19 assessed in this study, 585,914 (33.2%) died.

A statistically significant association was observed between every covariate under study and death due to COVID-19 (p < 0.001). In 2020, lethality was slightly higher in men (34.7%) and higher in those aged 80 years or older (60.6%), of black race/color (37.3%), and from the Northeast region (42.3%) (Table 1). In 2021, lethality was slightly higher in women (33.7%) and higher in those aged 80 years or older (59.1%), of black race/color (37.2%), and from the North region (40.7%). Similar results were found in 2022, with the difference being that the Northeast region had the highest lethality rate (35.6%) (Table 1).

Table 1 Sociodemographic characteristics of hospitalized adults and older adults with SARS due to COVID-19 in Brazil in 2020–2022

In the three years studied, we observed a higher lethality among patients with three or more risk factors, those who were admitted to the ICU, and required invasive mechanical ventilation. The average length of stay was shorter in 2020 (12.7 ± 13.7 days) among those who died (Table 2).

Table 2 Clinical characteristics of adults and older adults hospitalized with SARS due to COVID-19 in Brazil in 2020–2022

The survival tree results are shown in Fig. 1. Mechanical ventilation, age, ICU stay, number of risk factors, and race were selected by the tree to group the cases. The best cut-off points for these predictors were determined using the survival tree algorithm. Mechanical ventilation and age were considered the most important predictors of death from COVID-19.

Fig. 1 Survival tree for death events in adults and older people hospitalized for COVID-19 in Brazil, 2020–2022. Notes: Squares represent terminal nodes; the numbers (n) in the squares indicate the sample size; and the curves inside the squares show the estimated Kaplan-Meier survival of the subgroups. Circles represent the most significant variables for dividing the population into smaller groups.

The Kaplan-Meier curves for these groups are shown in Fig. 1. In 2020, ten groups were identified using the survival-tree algorithm. Nodes 12, 14, 17, 18, and 19 had the worst survival curves, representing the groups at higher risk of death due to COVID-19 that year, whereas nodes 5, 6, 7, and 10 had a higher probability of survival, that is, a lower risk of death from COVID-19. Node 19 had the highest risk of death from COVID-19 (HR: 16.20; 95%CI: 15.38–17.07). This group was composed of patients who used invasive mechanical ventilation and were aged 80 years or older. In 2021, twelve groups were identified using the survival-tree algorithm. Nodes 14, 17, 18, 21, 22, and 23 had the worst survival curves, and nodes 6, 7, 8, 9, and 12 had a higher probability of survival. The group with the worst survival curve was defined by the same variables as in the previous year (HR: 12.58; 95%CI: 12.12–13.05). In 2022, only the use of invasive mechanical ventilation and age were identified as risk predictors for death by the survival tree, and only nine groups were identified using the survival-tree algorithm. Nodes 12, 15, 16, and 17 exhibited the poorest survival curves, while nodes 5, 6, 8, 9, and 11 had a higher probability of survival. However, the poorest survival curve was observed among those who underwent invasive mechanical ventilation, were younger than 80 years old, and did not require ICU admission (HR: 13.78; 95%CI: 12.81–14.82) (Table 3).

Table 3 Cox analysis of nodes identified using survival tree


The results of this study showed that 33.2% of adult and older patients hospitalized with COVID-19 in Brazilian hospitals from 2020 to 2022 died and that there was a set of characteristics associated with the observed mortality. The probability of survival decreased as the length of hospital stay increased. Invasive mechanical ventilation, older age, ICU stay, and black or brown race/color were significant predictors of death from COVID-19.

Patients aged 60–79 years who used invasive mechanical ventilation and did not go to the ICU were the group with the highest risk among those presented in the survival tree. This indicates that the severity of lung impairment, in this case, is detected by the need for invasive support, which reduces the survival of these patients. Advanced age is a well-documented risk factor in the literature, and the older the age group, the higher the risk of death among patients with COVID-19 [5, 15].

It was also observed that older people over 80 years of age who did receive invasive ventilatory support had a higher risk than their younger peers. This result points to a possible non-institutionalization of the advanced support routine for older people with this need, increasing the mortality in this group. It is known that with the exponential increase in cases of COVID-19, there was saturation of the health system across the country, which meant that there was a shortage of ICU beds and, consequently, advanced ventilatory support in several regions [9]. This may have meant that older patients did not receive all the resources necessary for the proper management of their health conditions, which further exposes the vulnerabilities faced by the older population in the country.

Race/color is another determinant of mortality and survival in the Brazilian population, identified in the first two years of the pandemic. Thus, in situations of scarcer health resources, black and brown people become more vulnerable. A study with data from the 2013 National Health Survey that analyzed the factors associated with poor access to health services found that individuals with brown/black skin color, residing in the North or Northeast regions had a higher proportion of poor access to services [16]. Data on morbidity and mortality from COVID-19 according to race/color in Brazil and the United States indicate a greater impact of the disease on the black population. Even with the low completeness of data on race/color, which makes more robust research unfeasible, it is possible to verify that despite the greater hospitalization in the white population, the highest mortality occurred in the black and indigenous populations [17].

These data were confirmed in a previous study on mortality from COVID-19 in Brazil according to ethnic and regional variations, which observed an increase in mortality among brown and black people and those who live in the northern region [18]. Race/color is a determining factor in access to health services, especially in the ICU, which is an extremely necessary environment for the care of patients with more serious illnesses.

A literature review points out that racial inequalities in access to health services are long-standing and persist today. In brown and black populations, socioeconomic inequalities and inequalities in access to social and health services overlap, and the accumulation of these disadvantages affects living and health conditions [19].

Despite important advances in the Unified Health System (SUS), which defines better health outcomes in Brazil [20], there are still weaknesses in the quality of services offered. The complex public-private relationship in the provision of health services, associated with deep regional inequalities and the underfunding of the system, are still challenges that need to be overcome to improve the quality of the population’s health [21].

The presence of one or more risk factors had less impact on the mortality of these patients than other variables, such as age. Thus, age seems to be the most significant factor contributing to the mortality of patients with COVID-19. In addition, the difficulty of properly diagnosing severity, which is determined by clinical and radiological manifestations, and the lack of beds in several regions of the country, as mentioned earlier, made older patients much more vulnerable during the pandemic.

Immunization in Brazil began in January 2021, with older adults being priority groups during the vaccination campaign. The high risk of mortality in this group, even after vaccination, may be due to the drop in neutralizing antibodies among those immunized with CoronaVac, which was the vaccine most used in older adults at the beginning of the campaign [22] and/or the emergence of variants such as Delta, which have a higher lethality rate when compared to the original virus [23].

Although previous studies have shown the risk factors for death from COVID-19, the authors did not find any study that showed the effect of the interaction between risk factors.

This study had some limitations which must be mentioned. First, there may be biases in filling out patient information, which is common in all observational studies. Second, all the patients surveyed were hospitalized for SARS; therefore, it is necessary to consider that the mortality presented in this study is only in severe cases, thereby preventing the generalization of these results. Another point to be mentioned is related to missing data, which is also inherent to records in attendance forms and in large databases. However, an attempt was made to reduce this limitation by imputing data.


Analysis of the results indicated that the use of invasive mechanical ventilation, ICU stay, advanced age, and black and brown race/color were important risk factors for death due to COVID-19. These findings highlight the effects of broad social and racial inequalities that make a population group more vulnerable to infection by the virus. They also highlight the requirement of effective policies aimed at reducing the poor access of the population to tests necessary for the correct diagnosis and management of the disease.

Data availability

All data used in this research are publicly available on the OpenDataSUS website.



ICU: Intensive Care Unit

SARS: Severe Acute Respiratory Syndrome

SIVEP-Gripe: Influenza Epidemiological Surveillance Information System

FCS: Fully Conditional Specification

ST: Survival Tree

SUS: Unified Health System


  1. Mathieu E, Ritchie H, Rodés-Guirao L et al. Coronavirus (COVID-19) Deaths, [2022, accessed 29 October 2022].

  2. Chang R, Elhusseiny KM, Yeh YC, Sun WZ. COVID-19 ICU and mechanical ventilation patient characteristics and outcomes—A systematic review and meta-analysis. PLoS ONE. 2021;16:1–16.

  3. Ranzani OT, Bastos LSL, Gelli JGM, Marchesi JF, Baião F, Hamacher S, et al. Characterisation of the first 250 000 hospital admissions for COVID-19 in Brazil: a retrospective analysis of nationwide data. Lancet Respir Med. 2021;9:407–18.

  4. Karagiannidis C, Mostert C, Hentschker C, Voshaar T, Malzahn J, Schillinger G, et al. Case characteristics, resource use, and outcomes of 10 021 patients with COVID-19 admitted to 920 German hospitals: an observational study. Lancet Respir Med. 2020;8:853–62.

  5. Gupta S, Hayek SS, Wang W, Chan L, Mathews KS, Melamed ML, et al. Factors Associated with Death in critically ill patients with Coronavirus Disease 2019 in the US. JAMA Intern Med. 2020;180:1436.

  6. Ferreira JC, Ho Y-L, Besen BAMP, Malbouisson LM, Taniguchi LU, Mendes PV, et al. Protective ventilation and outcomes of critically ill patients with COVID-19: a cohort study. Ann Intensive Care. 2021;11:92.

  7. Lana CN, A, Santana JdaM, Souza GB, Souza LMSde. Determinantes sociais da saúde e óbitos por Covid-19 nos estados da região nordeste do Brasil. Rev Bras Saúde Func. 2020;11:18–29.

  8. Wang Z, Deng H, Ou C, Liang J, Wang Y, Jiang M, et al. Clinical symptoms, comorbidities and complications in severe and non-severe patients with COVID-19. Med (Baltim). 2020;99:e23327.

  9. Nicolelis MAL, Raimundo RLG, Peixoto PS, Andreazzi CS. The impact of super-spreader cities, highways, and intensive care availability in the early stages of the COVID-19 epidemic in Brazil. Sci Rep. 2021;11:13001.

  10. Campos GWS. O pesadelo macabro da Covid-19 no Brasil: entre negacionismos e desvarios. Trab educ saúde. 2020;18(3):e00279111.

  11. Rocha L. Com redução no ritmo, Brasil ultrapassa marca de 700 mil mortes por Covid-19. CNN Brasil, [2023, accessed 29 November 2023].

  12. van Buuren S, Groothuis-Oudshoorn K. Mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45. Epub ahead of print.

  13. Bou-Hamad I, Larocque D, Ben-Ameur H. A review of survival trees. Stat Surv; 5. Epub ahead of print 1 January 2011.

  14. Brasil. Resolução no 510, de 07 de abril de 2016. Dispõe sobre as normas aplicáveis a pesquisas em Ciências Humanas e Sociais. Diário Oficial [da] República Federativa do Brasil, Brasília, DF, 24 maio 2016, [2016, accessed 1 November 2021].

  15. Li J, Huang DQ, Zou B, Yang H, Hui WZ, Rui F, et al. Epidemiology of COVID-19: a systematic review and meta‐analysis of clinical characteristics, risk factors, and outcomes. J Med Virol. 2021;93:1449–58.

  16. Dantas MNP, de Souza DLB, de Souza AMG, Aiquoc KM, Souza TA, Barbosa IR. Fatores associados ao acesso precário aos serviços de saúde no Brasil. Rev Bras Epidemiol. 2020;24:e210004. Epub ahead of print 2021.

  17. de Araújo EM, Caldwell KL, Santos MPA, de dos, Souza IM, Santa Rosa PLF, Santos ABS et al. dos, Morbimortalidade pela Covid-19 segundo raça/cor/etnia: a experiência do Brasil e dos Estados Unidos. Saúde em Debate. 2020;44:191–205.

  18. Baqui P, Bica I, Marra V, Ercole A, van der Schaar M. Ethnic and regional variations in hospital mortality from COVID-19 in Brazil: a cross-sectional observational study. Lancet Glob Heal. 2020;8:e1018–26.

  19. da Silva NN, Favacho VBC, Boska G, de Andrade A, Merces EDC, Oliveira NPD. Access of the black population to health services: integrative review. Rev Bras Enferm. 2020;73:e20180834. Epub ahead of print 2020.

  20. Campello T, Gentili P, Rodrigues M, Hoewell GR. Faces da desigualdade no Brasil: um olhar sobre os que ficam para trás. Saúde em Debate. 2018;42:54–66.

  21. Viacava F, de Oliveira RAD, Carvalho C, de Laguardia C, Bellido J. SUS: oferta, acesso e utilização de serviços de saúde nos últimos 30 anos. Cien Saude Colet. 2018;23:1751–62.

  22. Mok CKP, Cohen CA, Cheng SMS, Chen C, Kwok KO, Yiu K, et al. Comparison of the immunogenicity of BNT162b2 and CoronaVac COVID-19 vaccines in Hong Kong. Respirology. 2022;27:301–10.

  23. Twohig KA, Nyberg T, Zaidi A, Thelwall S, Sinnathamby MA, Aliabadi S, et al. Hospital admission and emergency care attendance risk for SARS-CoV-2 delta (B.1.617.2) compared with alpha (B.1.1.7) variants of concern: a cohort study. Lancet Infect Dis. 2022;22:35–42.



This work was supported by the Foundation for Research and Scientific and Technological Development of Maranhão (Fundação de Pesquisa e Desenvolvimento Científico e Tecnológico do Maranhão, FAPEMA) and the Coordination for the Improvement of Higher Education Personnel (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, CAPES) (Finance Code 001).


There was no funding for this study.

Author information


CMN, MRFCB, AMS, and BLCAO contributed to the study conception, design, and formal analysis. The first draft of the manuscript was written by Carlos Martins and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Carlos Martins Neto.

Ethics declarations

Ethics approval and consent to participate

Appraisal by the Ethics Committee of research involving human beings was not necessary, as this research was prepared only with secondary public data available online at official electronic sites. These databases do not contain personal or household identification data, which guarantees respect for the secrecy and privacy of the research participants’ information [14].

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Martins Neto, C., Branco, M.d.R.F.C., dos Santos, A.M. et al. COVID-19 death risk predictors in Brazil using survival tree analysis: a retrospective cohort from 2020 to 2022. Int J Equity Health 23, 33 (2024).
