Setting and population
This cross-sectional study was conducted on Amantani Island in Puno, Peru. Survey respondents had to meet the following eligibility criteria: a) they were residents of Amantani (adults or children) living in the selected household, and b) more than 50% of the people in the household spoke Spanish. Those who refused to participate, lacked the capacity to give informed consent, or were unable to answer the study questionnaire were excluded.
Tool and validation process
We developed and validated a questionnaire to measure the health needs of the population of Amantani. Our instrument was based on the ENDES national survey [4], which collects information on demographic, socioeconomic and health indicators of women of reproductive age (15–49 years) and their children under five residing in Peru. Its results are representative at the regional, departmental and national levels. ENDES is a Peruvian adaptation of the Demographic and Health Survey, which has been used in over 80 developing countries for over 30 years [5]. For this study, the section on women’s reproductive health was omitted for male participants.
After these adjustments, we performed two types of qualitative validation: a) expert validation and b) field validation. The first was performed by seven Peruvian health professionals with experience working in rural settings, who specialized in internal medicine, ophthalmology, infectious diseases, dentistry, reproductive health, public health and mental health. They reviewed all questions, with special emphasis on those within their own specialties. Once all comments and suggestions had been collected, the questionnaire was modified accordingly and sent back to the experts for their approval.
For field validation, we conducted 10 health surveys in three households using a convenience sample. We made sure to interview at least one female, one male, an older person and a child. After completing the health needs questionnaire, we asked questions about the feasibility, relevance, and acceptability of the survey. We sought to establish whether the respondents felt comfortable answering the questions, whether they understood the questions adequately, and whether the duration of the questionnaire was acceptable. We also asked whether they felt that answering these questions would help the authors accomplish the study objective, and whether they had any suggestions for improving the survey.
The assessment of the prevalence of diabetes and hypertension was based on self-report; no clinical procedures or laboratory tests were requested or performed to confirm the diagnosis. For children under 5 years of age, questions were addressed to their mothers and covered pregnancy, antenatal care, delivery, the postpartum period, and breastfeeding.
The validated instrument had two levels of administration: household and individual. The dimensions of the questionnaire were as follows: a) household level: people living in the household; disability; water (availability, sources, and quality of drinking water); sanitation; and house material characteristics; and b) individual level: non-communicable diseases and risk factors; eye health; dental care; mental health; prevention of oncologic disease; obstetric care (prenatal, delivery, and postnatal); knowledge of the symptoms and transmission of tuberculosis; knowledge of the transmission of HIV/AIDS (human immunodeficiency virus infection/acquired immune deficiency syndrome); and contraception (knowledge and practice).
Sampling
We conducted two-stage cluster sampling, in which we first sampled households (clusters) and then individuals within households. Because we were interested in multiple outcomes with unknown and wide-ranging prevalence, we assumed a conservative prevalence of 50% for the sample size calculation; this yields the maximum sample size and therefore sufficient precision for all outcomes. In addition, we assumed a margin of error of 5% for the prevalence and a confidence level of 95%. Finally, we assumed a population size of 4255 [1] and an average household size of four inhabitants, in accordance with the last national census.
Following the methodology of Bennett et al. [6] for sample size calculation for clustered samples, and using the parameters described above, we obtained a sample size of approximately 150 households (clusters). Therefore, attaining the desired precision for estimating the health needs in Amantani required sampling a minimum of 150 households.
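A calculation of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not necessarily the exact computation performed for the study: it uses the standard formula for estimating a proportion with a finite population correction, and the design effect is a hypothetical parameter, because the text does not report the value that was used. The function name `households_needed` is illustrative.

```python
import math
from statistics import NormalDist


def households_needed(prevalence, margin, confidence, population,
                      household_size, design_effect):
    """Approximate number of households (clusters) to sample.

    design_effect is an assumed inflation factor for clustering;
    the source text does not state the value that was used.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for a proportion under simple random sampling.
    n_srs = z ** 2 * prevalence * (1 - prevalence) / margin ** 2
    # Finite population correction for a population of known size.
    n_fpc = n_srs / (1 + (n_srs - 1) / population)
    # Inflate for clustering, then convert individuals to households.
    return math.ceil(n_fpc * design_effect / household_size)


if __name__ == "__main__":
    # Parameters from the text; the design effects are hypothetical.
    for deff in (1.0, 1.5, 2.0):
        n = households_needed(0.5, 0.05, 0.95, 4255, 4, deff)
        print(f"design effect {deff}: {n} households")
```

Because a prevalence of 50% maximizes p(1 - p), any other assumed prevalence yields a smaller required sample for the same margin of error, which is why 50% is the conservative choice.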
Selection of households
We selected households by simple random sampling with the aid of Google Earth. Using a random number generator, we randomly selected 150 households to be included in the study. We then identified the coordinates of each of these houses and created a geographic layer with their locations. This layer was loaded through Google Maps onto all of the mobile devices to be used in the fieldwork, which gave each interviewer precise directions to the selected households. A printed map was also available for areas with limited GPS access, as we knew in advance that several areas of the island had poor internet and telephone connectivity.
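A random draw of this kind could be reproduced along the following lines. This is a sketch only: the size of the sampling frame (`n_frame`), the seed, and the function name are hypothetical, as the text does not report how many households were enumerated or which random number generator was used.

```python
import random


def draw_households(n_frame, n_sample, seed):
    """Simple random sample, without replacement, of household IDs 1..n_frame."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(range(1, n_frame + 1), n_sample))


if __name__ == "__main__":
    # n_frame=1000 and seed=2017 are placeholder values, not from the study.
    selected = draw_households(n_frame=1000, n_sample=150, seed=2017)
    print(selected[:10])
```

Recording the seed alongside the numbered list of households allows the selection to be audited or repeated later.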
When a selected household was abandoned or had no residents present at the time of the visit, we selected a replacement household from neighbouring houses according to a predefined random rule. In addition, 22 households (12.4%) refused to participate in the survey and 4 households (2.26%) were excluded because their members spoke only Quechua.
Data collection
Data collection took place in July 2017. We began recruiting volunteer fieldworkers through Facebook approximately 1 month beforehand. During the application process, we prioritized individuals with experience in community work in the health field. After selection, we recruited nine volunteers (doctors, public health specialist, biologist, dentist and one medical student). Each of them was given a mobile phone loaded with the questionnaires and maps of the selected households, and underwent two training sessions and a final test.
Informed consent was obtained from each participant. Only houses that met the study selection criteria were included. The individual-level questions were asked in person to each of the adults living in the household, while parents or guardians answered the questions about children. Adolescents between 12 and 17 years of age could answer the questions themselves, provided they had the consent of their parents. All the data collected were stored on each mobile phone and sent to a master database when Internet access was available.
Data analysis
We used Stata 14 statistical software for data analysis, which was conducted at both the household and individual levels. We summarized categorical variables using relative and absolute frequencies. For continuous variables, we used the arithmetic mean as a measure of central tendency and the standard deviation as a measure of dispersion. For the bivariate and regression analyses, all numerical variables were categorised. For the bivariate analysis, we used the chi-square test. We chose to model the prevalence ratio (PR) directly using a Poisson model, rather than using the odds ratio as an approximation, because the odds ratio overestimates the prevalence ratio when the outcome is common (> 10%) [7]. Analyses took into account the characteristics of the survey design (clustering, sampling weights) by using the survey estimation commands (svy) in Stata.
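The reason for modelling the PR directly can be seen in a small numerical example. The counts below are hypothetical and chosen only for illustration; they are not study data, and the function names are illustrative. The sketch computes crude ratios from 2x2 tables and does not reproduce the survey-weighted Poisson model.

```python
def prevalence_ratio(a, n1, c, n0):
    """PR for a cases among n1 exposed and c cases among n0 unexposed."""
    return (a / n1) / (c / n0)


def odds_ratio(a, n1, c, n0):
    """OR for the same 2x2 table."""
    return (a * (n0 - c)) / ((n1 - a) * c)


if __name__ == "__main__":
    # Hypothetical tables:
    # (exposed cases, exposed total, unexposed cases, unexposed total)
    tables = {
        "rare outcome": (4, 100, 2, 100),
        "common outcome": (40, 100, 20, 100),
    }
    for label, t in tables.items():
        pr = prevalence_ratio(*t)
        oratio = odds_ratio(*t)
        print(f"{label}: PR = {pr:.2f}, OR = {oratio:.2f}")
```

In both tables the PR is 2.0, whereas the OR is about 2.04 when the outcome is rare and about 2.67 when it is common, so interpreting the OR as a PR would exaggerate the association in the second case.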