The Pakistan Earthquake Survey: Methodological Lessons Learned
By Leah Richardson, Moazzem Hossain and Kevin Sullivan
Acknowledgements go out to the survey teams as well as to those responsible for the close partnership between UNICEF, WFP, WHO and the Ministry of Health. Additionally, warm thanks to Rafah Aziz (UNICEF Geneva), Mona Shaikh (WFP Pakistan), Shadoul Ahmed (WHO Pakistan), Rifat Anis (NIH, Pakistan), Zahid Larik (DDG, Nutrition Wing, MoH, Pakistan), Tehzeeb Ali (PHC Consultant, Pakistan), Fakhra Naheed (PO Nutrition Wing, MoH, Pakistan) and Shafat Sharif (Data Analyst, Eycon Solution, Pakistan) for all their hard work throughout the process.
Leah Richardson works as a Public Health Nutritionist in the Nutrition Service of the World Food Programme (WFP) headquarters. Her current interests are survey methods, nutrition in emergencies, and measuring mortality.
Moazzem Hossain is an Advisor at UNICEF NYHQ, Nutrition Section. He coordinated nutrition assessments and responses in the earthquake-affected areas of Pakistan from October 2005 to January 2006. He has extensive experience of conducting nutrition assessments in emergencies such as droughts, floods, conflicts and, most recently, the earthquake.
Kevin Sullivan is an Associate Professor in the Department of Epidemiology, Emory University, Atlanta. His areas of expertise include epidemiologic methods, micronutrient deficiencies, anthropometry, and survey methods.
On October 8, 2005 a strong earthquake - said to be the most powerful in the region in 500 years - hit the northeastern part of Pakistan. The result was massive destruction and catastrophic mortality, primarily in the upper North West Frontier Province (NWFP) and in Azad Jammu and Kashmir (AJK). Emergency relief was initiated within days of the earthquake to deal with the most immediate needs and, within weeks of the event, a Rapid Food Security and Nutrition Needs Assessment was conducted by WFP and UNICEF (with support from OXFAM). The results indicated that most of the affected areas were rural. Nearly 2.5 million people had lost their homes and the majority of the population was living in makeshift tents. More than half reported loss of all grain stock and 15% reported complete dependence upon charity/aid. The rice and maize harvest had been interrupted, livelihoods had been severely curtailed, and morbidity rates were high.
Prior to the earthquake, acute malnutrition had been a major public health problem (13% global acute malnutrition at national level) and, in light of the aggravating factors, the situation was expected to deteriorate. Various agencies involved in the response wanted a more specific and accurate figure for the prevalence of malnutrition, along with relevant health and vulnerability information that would assist in designing appropriate interventions in affected areas. Therefore, a nutrition and health survey was planned by UNICEF/WFP/WHO in coordination with the national Ministry of Health (MOH). A technical working group of the implementing agencies was formed, with representation from all partners, to oversee the survey implementation - from design to data analysis through to report writing. In this context, the partnership worked extremely well and was a value-added step in the process. It could serve as a model for future assessments.
A man sets up his food stall amongst the earthquake ruins
The principal objectives of the survey were to assess the nutritional status of children 6-59 months and their mothers, to estimate the crude mortality rates for the day of the earthquake as well as the pre/post earthquake rates, to determine the prevalence of morbidity, and to investigate food consumption patterns and household food security. Sample sizes were calculated for each of the survey populations using estimates of global acute malnutrition and crude mortality rates. Clusters were selected using the probability proportional to size methodology. Households were selected using systematic random sampling and household lists. Data were collected in the four surveys by six trained survey teams between 21 November and 25 December 2005.
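The two-stage selection described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure the survey teams actually ran: the function names, the village names and population figures, and the 87-household list are all hypothetical, and the PPS step assumes the common systematic variant (random start, fixed sampling interval over the cumulative population).

```python
import random


def select_clusters_pps(populations, n_clusters, rng):
    """Systematic PPS selection of clusters (illustrative sketch).

    populations: dict mapping sampling-unit name -> estimated population.
    Returns a list of n_clusters names. A unit larger than the sampling
    interval can appear more than once (it then hosts several clusters).
    """
    total = sum(populations.values())
    interval = total / n_clusters
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + i * interval for i in range(n_clusters)]

    selected = []
    cumulative = 0
    idx = 0
    for name, pop in populations.items():
        cumulative += pop
        # Each target falling inside this unit's cumulative range
        # places one cluster in the unit.
        while idx < n_clusters and targets[idx] < cumulative:
            selected.append(name)
            idx += 1
    return selected


def select_households_systematic(household_list, n_households, rng):
    """Systematic random sample of households from a complete list."""
    if n_households >= len(household_list):
        return list(household_list)
    interval = len(household_list) / n_households
    start = rng.random() * interval
    last = len(household_list) - 1
    return [household_list[min(int(start + i * interval), last)]
            for i in range(n_households)]


# Hypothetical sampling frame: 50 villages with made-up populations.
rng = random.Random(2005)
villages = {f"village_{i:02d}": 100 + 10 * i for i in range(50)}
clusters = select_clusters_pps(villages, 30, rng)

# Hypothetical household list for one cluster (households numbered 1-87).
households = select_households_systematic(list(range(1, 88)), 20, rng)
```

Because the households are fixed before the team arrives, the number of eligible children found in a cluster is an outcome of the sample, not something the team controls.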
Methodological Lessons Learned
A mother prepares food in a makeshift tent.
Assessments conducted in times of crisis have limitations and problems brought about by (among other things) a lack of ready information, time constraints, and harsh/dangerous working conditions. This survey in Pakistan was no exception. Some of the problems encountered, mistakes made, and solutions found are just as valuable as the results. In sharing these experiences and lessons learned, the goal is to improve the quality of future assessments and to provide a platform from which to grow.
Lesson 1: At what level do you want your results to be representative?
The first challenge the technical working group faced was to create a study design that would capture separately the conditions of both the stable and the moving populations affected by the earthquake. Creating a population sampling frame was extremely difficult, given the ongoing migration and the fact that those displaced (camps) were much more adversely affected than those who remained in their homes (communities). Additionally, the affected areas fell into two major political and geographical zones, Azad Jammu and Kashmir (AJK) and North West Frontier Province (NWFP) of Pakistan, which had different pre-disaster conditions and had not sustained proportional damage. With these issues in mind, the struggle was to create a sampling frame that would translate into survey results representative of the different populations involved.
Since the earthquake had affected the provinces unequally, and since the affected populations were living both in camps and in communities, four cross-sectional surveys were conducted. In the NWFP, two separate surveys were conducted, one among those living in camps and the other in communities of Mansehra District, which was one of the most affected districts. In the AJK a similar approach was used, with one survey conducted in camps and the other in communities of Muzaffarabad District.
Findings from these four surveys could then be used to provide specific information on the two population sub-groups in the two distinct areas. Furthermore, the results could be used in tandem to determine quantitatively which populations and/or areas were more in need of specific services than others, thereby illustrating the overall health, nutrition and food security situation. Although the exercise was more expensive and time consuming than doing only one survey, it provided essential information at a level of detail that would have been impossible if only one sampling frame had been used to produce one overall estimate.
Lesson 2: How much supervision is enough?
An overarching and integral factor in all surveys, including this one, is the need for consistent and meticulous supervision. Unfortunately, due to the overwhelming nature of the emergency, staff capacity was limited and the survey coordinating team was not able to designate one supervisory person for the full data collection, analysis and report writing. In the absence of oversight and supervision by one fully responsible person, especially during the data collection period, the survey teams relied on their individual team supervisors and on previous experience/knowledge. As a result, some of their initiatives deviated from the prescribed methodology and caused complications during data analysis. Thorough training followed by careful supervision of the overall process by one responsible person or team is a prerequisite for a smooth and high quality assessment.
Lesson 3: How do you place your clusters, and must you go to all of them?
Scenes of destruction post earthquake in north
When external circumstances dictate that certain geographical areas are not accessible, and the accessibility will not change over the course of the assessment, these areas and populations should be excluded from your initial sampling universe. They have a zero probability of selection and have no purpose in the sampling universe. When accessibility is fluid (such as during times of conflict or, more context specific, under the threat of landslides or snow) it is recommended to keep those areas and populations in the sampling universe in case you might be able to reach them. Automatic exclusion of these areas may introduce bias into the results. Therefore, if there is substantial reason to believe that some geographic areas may be unreachable, one reasonable solution is to estimate the number of clusters in these areas that may be unreachable, and then to increase the overall number of clusters selected so that the minimum required sample size is still achieved (for example, selecting 33 clusters when you need 30 clusters but think that you may not be able to reach three).
The caveat for this sampling methodology is that if 33 clusters are selected with the hope of reaching at least 30, all accessible clusters must be included in the final sample. For example, if 33 clusters are selected, and only one of the 33 clusters is inaccessible, it is imperative that all 32 accessible clusters are included in the sample and that data collection does not stop with the first 30 completed clusters. Since the 33 clusters are selected using PPS, intentionally excluding clusters once 30 clusters have been sampled makes it a non-probability sample, and may therefore lead to non-representative results.
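The rule can be written down as a small planning check. The sketch below is illustrative only: the function name and cluster labels are hypothetical, and it simply encodes the point that the buffer is there to absorb inaccessible clusters, not to be discarded once the minimum has been reached.

```python
def plan_fieldwork(selected_clusters, inaccessible, minimum_required):
    """Return every accessible cluster, plus whether the minimum is met.

    All accessible clusters are returned, even when there are more
    than minimum_required of them.
    """
    accessible = [c for c in selected_clusters if c not in inaccessible]
    return accessible, len(accessible) >= minimum_required


# 30 clusters required, 3 extra selected as a buffer (hypothetical labels).
selected = [f"cluster_{i:02d}" for i in range(1, 34)]
to_survey, enough = plan_fieldwork(selected, {"cluster_17"}, minimum_required=30)
print(len(to_survey), enough)  # 32 True
```

With one inaccessible cluster, the work plan contains 32 clusters, and all 32 are surveyed.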
A surveyor records details during the nutrition and health survey
In the Muzaffarabad communities and in the AJK camps it was decided that the risk of losing clusters was great enough to warrant selecting additional ones. In the context of this survey, where 30 clusters were required for the desired sample size, an additional three clusters were selected to act as a protective buffer. In effect, 33 clusters were selected from the sampling universe using PPS and the final survey design was 33 clusters of 20 households. The survey team began data collection and, in Muzaffarabad, one cluster out of 33 was inaccessible, while in the AJK camps it turned out that all clusters were accessible. The methodological problem arose when data collection stopped once 30 clusters had been completed, and the remaining 2 or 3 accessible clusters were excluded from the survey. This intentional exclusion had the potential to inject bias into the results, especially if the 2 or 3 excluded clusters were systematically different from the included clusters (harder to access, more affected by the earthquake, no access to humanitarian relief, etc).
Once data analysis began, the coordinating team realised that the data were potentially biased and had to apply some retrospective methods during the analysis to correct the problem. The most important lesson to take from this is that if an 'alternative methodology' is used in designing a survey, it is important to adhere to the accompanying methodological requirements.
Lesson 4: Do you calculate required sample size? And is your sampling unit the household or individual?
Survey team training
Cluster sampling for nutrition surveys has historically often been conducted using a standard 30x30 approach (without calculating the survey/context specific required sample size) and using the WHO/Expanded Programme on Immunisation (EPI) method for household selection. Sampling methodology has been moving away from the standard approaches of always using the 30x30 design and use of the next-nearest household quota sampling of eligible individuals. In this survey a few more recent and highly regarded sampling techniques were applied.
Firstly, sample size was calculated based on assumed prevalences, desired precision, and assumed design effect. Hence, the standard 30x30 survey was replaced by a smaller, faster and cheaper 30x20 sample. Secondly, systematic random sampling using household lists was applied in each cluster in order to select the households (and to move away from the potentially biased EPI method of proximity sampling). In applying this method, it is necessary to pre-select the exact households included in the data collection, so the sampling unit within each cluster becomes the household instead of the child. This means that the children included in the sample are only those found in these 20 pre-selected households. It is unlikely that there will be exactly one child per household, so in some clusters there may be fewer than 20 children and in others there may be more than 20. The other reason, in this survey, to have a quota of households was for mortality estimates, which should be based on households regardless of whether or not there are children in the household.
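To make the first point concrete, the calculation can be sketched with the standard formula for estimating a proportion, n = DEFF x z^2 x p(1 - p) / d^2. The article does not report the inputs actually used, so the figures below are assumptions chosen for illustration: only the 13% prevalence echoes the national figure quoted earlier, while the precision (plus or minus 4 percentage points) and the design effect (2.0) are hypothetical.

```python
import math


def required_sample_size(prevalence, precision, design_effect, z=1.96):
    """Sample size for estimating a proportion in a cluster survey.

    n = DEFF * z^2 * p * (1 - p) / d^2, rounded up.
    z = 1.96 corresponds to a 95% confidence interval.
    """
    n_srs = (z ** 2) * prevalence * (1 - prevalence) / precision ** 2
    return math.ceil(design_effect * n_srs)


# Illustrative inputs only (precision and design effect are assumptions).
n_children = required_sample_size(prevalence=0.13, precision=0.04,
                                  design_effect=2.0)
print(n_children)  # 544
```

A figure of this order is well below the 900 children implied by a 30x30 design, which is the kind of saving that makes a context-specific calculation worthwhile.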
The survey teams, with experience from other surveys, were accustomed to using the standard proximity sampling approach where children were the sampling unit and exactly 30 children were sampled per cluster. When using these 'new' sampling techniques they became nervous about clusters where 20 households did not yield a minimum of 20 children. Thinking that this would jeopardise the survey results, and not understanding the rationale behind the alternative methodology, the teams decided that a quota system for children must be applied to each cluster that yielded fewer than 20 children. Consequently, survey teams selected additional households until the quota of 20 children was met, resulting in some cluster data containing more than 20 households.
Weighing a child in a household
When the data were being cleaned and prepared for analysis, the coordinating team recognised that the survey teams had modified the methodology in the field. Under the prescribed survey methodology there could not be more than 20 households per cluster, so the change in sampling was immediately apparent. As a solution, the households beyond the initial 20 in each cluster were excluded from the analysis, leaving the original methodology intact. Here, the lesson works two ways: rigorous supervision could have avoided extra time spent on data cleaning, while rigorous data cleaning helps to contain manageable mistakes made during data collection.
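A cleaning step of this kind can be sketched as follows. The record layout is an assumption made for illustration (the article does not describe the data files): each household record is taken to carry a `cluster` number and a `household_no`, with the pre-selected households numbered 1 to 20 and any households added in the field numbered from 21 upwards.

```python
from collections import defaultdict


def trim_to_design(records, max_households=20):
    """Keep only households within the designed quota per cluster.

    records: list of dicts with (hypothetical) keys 'cluster' and
    'household_no'. Returns the kept records and a count of dropped
    households per cluster, so the deviation can be reported.
    """
    kept = []
    dropped = defaultdict(int)
    for rec in records:
        if rec["household_no"] <= max_households:
            kept.append(rec)
        else:
            dropped[rec["cluster"]] += 1
    return kept, dict(dropped)


# Made-up data: cluster 1 has 22 households, cluster 2 has the designed 20.
records = ([{"cluster": 1, "household_no": h} for h in range(1, 23)]
           + [{"cluster": 2, "household_no": h} for h in range(1, 21)])
kept, dropped = trim_to_design(records)
print(len(kept), dropped)  # 40 {1: 2}
```

Reporting the per-cluster count of dropped households alongside the results keeps the correction transparent.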
A survey team arrive in
The timely results of the survey played an important role in detailing the effects of the earthquake to the outside world, to agencies involved in the relief effort, and to donors interested in supporting the relief effort. Although problems were encountered during this survey (as there are in every survey), coordinated interagency efforts ensured that the quality of the results was maintained. While high quality results are essential, there is also high value in lessons learned - and shared.
For further information, contact: Leah Richardson, email: Leah.Richardson@wfp.org
Taken from Field Exchange Issue 28, July 2006