Intensive and Critical Care Nursing
Volume 26, Issue 2 , Pages 64-68, April 2010

Preparing research instruments for use with different cultures

  • Ruth Endacott

      Affiliations

    • Faculty of Health, University of Plymouth, UK
    • Division of Nursing and Midwifery, La Trobe University, Melbourne, Australia
    • Corresponding author at: Faculty of Health, Centre Court, Drake Circus, Plymouth PL4 8AA, UK. Tel.: +44 1752 587488; fax: +44 1752 586748.
  • Julie Benbenishty

      Affiliations

    • Hadassah Hebrew University Medical Center, General Intensive Care Unit, PO Box 12000, 91120 Jerusalem, Israel
    • Tel.: +972 2 6778060; fax: +972 2 6430349.
  • Myriam Seha

      Affiliations

    • Spital Maennedorf, Asylstr, 8708 Maennedorf, Switzerland
    • Tel.: +41 44 922 20 60; fax: +41 44 922 20 67.

Accepted 14 December 2009.

Summary 

There is a growing requirement to use standardised instruments for collecting research data and monitoring patient progress. Two sets of properties should be addressed when selecting and adapting research instruments: psychometric properties (validity, appropriateness, reliability, and responsiveness) and clinical properties (feasibility and acceptability of the instrument). This paper outlines steps necessary to fulfil these requirements when using a research instrument in different cultures.

Keywords: Research instruments, Validity, Reliability, Translation, Multicultural, International

 

Introduction 

The requirement to adapt research instruments for use in different cultures is evident in two trends:

(a) the increase in people from different cultural backgrounds in the same population (for example, relatives of ICU patients);

(b) the increase in studies designed to examine practices across different cultural contexts.

In order to compare populations that may be culturally diverse, it is important to use standardised instruments (Sumathipala and Murray, 2006); hence the preference for translating existing instruments rather than developing new ones. Examples include the adaptation by Stricker et al. (2007) of a measure of family satisfaction in ICU and the adaptation of post-traumatic stress disorder measures in ICU described by Jones et al. (2007). The level of detail provided in reports of studies using translated research instruments varies considerably, reflecting the primary purpose of the paper. For example, Larsson et al. (2007) focused on the processes used to translate the Confusion Assessment Method for ICU (CAM-ICU) for use in Swedish ICUs, whereas Jones et al. (2007) focused on the results of a study that used a tool translated into three languages. Emphasis is often placed on ensuring adequate and appropriate translation of instruments; this is not surprising given the heavy influence of culture on language. The ETHICATT study investigators (Sprung et al., 2007) found that the questionnaire used to examine attitudes to end of life in nurses, physicians, patients and families included expressions for which there was no equivalent in some languages.

There are a number of factors to be considered in addition to instrument translation. In general, two sets of properties should be addressed when selecting and adapting research instruments: firstly, psychometric properties (validity, appropriateness, reliability, and responsiveness) and, secondly, clinical properties (for example, feasibility/acceptability of the measure). These will be addressed in this paper, specifically as they relate to measures to be used in different cultures.

Validity 

Validity describes the extent to which the instrument measures what it is intended to measure. Research instruments are not inherently valid (Curtis, 2003); validity has to be established for specific contexts and populations. For example, the ICU Palliative Care Quality Measures were developed in the United States over a 2-year period in consultation with over 200 ICU clinicians from 43 ICU teams. They were then subjected to extensive and rigorous pilot testing (for a full description of these processes, see Nelson et al., 2006). However, validity and reliability established through this process do not mean that the tool can be automatically applied in other settings. In consultation with Nelson, work is underway by the authors (RE, JB, and MS) to assess validity and reliability for the measure in the UK, Israel and Switzerland.

Traditionally, validity has been conceptualised as:

content validity, which examines the extent to which a measure contains a comprehensive sample of items that are relevant to the area of interest;

criterion validity, which examines the extent to which a measure provides results that are consistent with a gold standard;

construct validity, which involves forming theories about the attribute and then assessing the extent to which the measure provides results that are consistent with the theories. The construct validity of the ICU Palliative Care Quality Measures tool (Nelson et al., 2006) will also be assessed by the authors using focus groups to examine what clinicians in the three countries consider to represent a ‘good’ dying process.

Each of these types of validity should be considered when research instruments require translation.

Translating research instruments 

The overall goal of instrument translation is to provide evidence that the meaning of items in the translated version is equivalent to items in the original language (Varrichio, 2004). Specific guidelines exist for the translation of research instruments designed to measure patient-related outcomes (Wild et al., 2005); these guidelines recommend a seven-stage translation process (see Fig. 1).

Figure 1. Seven-stage translation process (adapted from Wild et al., 2005): preparation; forward translation/reconciliation; back translation; back translation review; harmonisation; cognitive debriefing; review of cognitive debriefing results and finalisation.

A detailed account of how these processes were used to translate the Confusion Assessment Method for ICU (CAM-ICU) for use in Swedish ICUs is provided by Larsson et al. (2007). An alternative approach, using translation/back translation followed by a nominal group technique, is described in detail by Sumathipala and Murray (2006). Achieving consensus is a key goal of both of these approaches; in order to achieve this, Duffy (2006) suggests that those reviewing the translated instrument should answer three questions for each item:

1. What does this item mean to you? (to confirm item meaning);

2. How clear is this item to you? (to confirm item clarity);

3. How relevant is this item to you? (to confirm item relevance).

This last question is particularly culturally dependent and may be overlooked when selecting and adapting research instruments. Two authors of this paper (JB/MS) report that it is unusual to use all of the questions in a questionnaire developed and validated in English, because some items are not relevant to other cultures.

Construct validity (sometimes referred to as ‘conceptual equivalence’) is particularly important if there is no language/lexical equivalence (Temple, 1997). Some examples of challenges with language equivalence are presented in Table 1. Language is a crucial consideration even when the same word is used; for example, the cultural sensitivities attached to the term ‘euthanasia’ are eloquently portrayed by Michalsen and Reinhart (2006).

Table 1. Linguistic challenges encountered when translating research instruments.

| Study (authors) | Challenges |
| --- | --- |
| Translation of CAM-ICU for use in Swedish ICUs (Larsson et al., 2007) | The word ‘delirium’ in Swedish culture can be linked to alcohol abuse; ‘confusion’ might be interpreted as an insulting term |
| Translation of the Bradford Somatic Inventory (Sumathipala and Murray, 2006) | Translating the phrase ‘heart pounding’ produced two phrases in Sinhala; on back translation these came out as ‘speeding’ and ‘a feeling of forceful tapping’ |
| Exploration of spinal cord injury (Chen and Boore, 2010) | Translation of ‘suffering’ from Chinese results in the word ‘pain’ |
| ETHICUS study (Benbenishty et al., 2006) | The definition of brain death was problematic in some European countries |
| Physical Restraint in Intensive Care in Europe [PRICE] study (Benbenishty and Adam, 2006) | Whilst there were no problems translating the research instrument, the data collection instructions required considerable revision |

Birbili (2000) suggests that different factors may affect the quality of the translation, depending on who undertakes the translation:

1. If the researcher is also the translator, the quality of translation may be influenced by factors such as: the background of the researcher–translator; the researcher's knowledge of the language and culture of the people under study; and the researcher's fluency in the language of the write-up.

2. If the researcher and the translator are not the same person, the quality of translation can be influenced by the position the translator holds in relation to the researcher and the competence/background of the translator. If the translator is not a health care worker, his/her understanding of nursing and medical concepts may be inaccurate or incomplete.

The extent to which translators also make decisions about punctuation and inflection will depend on the type of material being translated (Chen and Boore, 2010). It is generally accepted that any translation is subject to flaws: “even an apparently familiar term or expression for which there is direct lexical equivalence might carry ‘emotional connotations’ in one language that will not necessarily occur in another” (Birbili, 2000).

Research instruments may be subject to copyright; hence their use or amendment requires the permission of the original authors. Regardless, it is good practice to seek the involvement of the original authors in order to assess harmonisation (or consistency) with the original version of the tool. This is particularly important if the researchers will seek in future to compare findings across a number of studies.

Avoiding bias 

Selection bias can be introduced when decisions are made about which languages to translate research instruments into; such decisions must be guided by the primary aim of the study. For example, if the aim is to explore nurses’ opinions on a good death in ICU, involving families during CPR, moral distress amongst ICU nurses or conflicts in the ICU across Europe then the instruments must be translated into as many European languages as possible. However, if the aim of the study is to compare practice in two specific countries then limiting the languages used is acceptable. This decision-making about choice of language is explicit in some papers (e.g. Latour et al., 2009), whilst other authors leave these questions unanswered (Sprung et al., 2007). Latour et al. (2009) describe a pragmatic approach to language decisions: as their respondents were all conference delegates, the languages used for their questionnaire (in addition to English) were based on location of the conference and country of origin for the majority of delegates.

Ethical acceptability 

The translation of technical terms in research instruments completed by clinicians is relatively straightforward, because it is easier to achieve consistency where there is a central lexicon of practice. Translation of the participant information sheet and consent form, by contrast, requires researchers to satisfy the ethical principles of informed consent, beneficence/non-maleficence and autonomy/right to withdraw.

Translating research findings 

Issues around translation in research may not arise until after data collection, when the researcher seeks to present the data in another language. In the case of qualitative research, Chen and Boore (2010) suggest that translation of concepts and categories is undertaken, rather than verbatim translation of all data transcripts, with involvement of an expert panel to verify translation decisions. However, this remains particularly problematic when qualitative data excerpts, often in the patient or relative's own words, form a key part of a conference presentation.

Appropriateness 

Appropriateness is a term used to describe whether the range of the construct measured within the sample is similar to the range which is covered by the instrument. In essence this reflects how relevant the instrument is to the population being examined. For example, a useful outcome measure must provide room on the scale to demonstrate improvement and deterioration. Appropriateness is assessed by looking at the scale score distributions of the instrument with regard to the range, mean, standard deviation of scores, as well as the floor and ceiling effects. If the floor and ceiling effects are high, it suggests that the spectrum of the scale is too limited to detect some of the changes which may occur. If a high percentage of people score the lowest possible score, thus producing a floor effect, then there is no room for further deterioration to be detected. Conversely if a high percentage of people score the highest possible score, producing a ceiling effect, then there is no room to detect improvements that might be occurring. It therefore follows that high floor and ceiling effects limit the potential responsiveness of an instrument, particularly when the sample is restricted to specific sub-samples or settings. Further, if there are differences between responses by country then data analysis may be restricted to subsets of the original dataset, requiring large numbers from each country or elaborate re-coding processes.
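
The distribution checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original paper: the function name `score_distribution`, the example scores and the 0 to 10 scale are all hypothetical, and what counts as a ‘high’ floor or ceiling percentage is left to the investigator.

```python
from statistics import mean, stdev


def score_distribution(scores, min_score, max_score):
    """Summarise scale scores, including floor and ceiling effects.

    min_score and max_score are the lowest and highest scores the
    instrument allows (assumed known from the instrument's manual).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    # Floor effect: percentage of respondents at the lowest possible score.
    floor_pct = 100.0 * sum(1 for s in scores if s == min_score) / n
    # Ceiling effect: percentage of respondents at the highest possible score.
    ceiling_pct = 100.0 * sum(1 for s in scores if s == max_score) / n
    return {
        "n": n,
        "observed_range": (min(scores), max(scores)),
        "mean": mean(scores),
        "sd": stdev(scores) if n > 1 else 0.0,
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
    }


# Hypothetical scores from ten respondents on a 0-10 scale.
example_scores = [0, 0, 0, 2, 3, 5, 5, 7, 10, 10]
summary = score_distribution(example_scores, min_score=0, max_score=10)
print(summary)
```

Running the same summary separately for each country or language version would show whether the instrument leaves room to detect change in each sub-sample.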

Reliability 

Regardless of the strength of previous reliability testing, it is important to establish reliability when a tool is used in a different setting. Reliability refers to the extent to which the measure is consistent and minimises random error (its repeatability). Aspects of reliability to be addressed include: instrumental reliability (reliability of measurement device), rater reliability (reliability of the person administering the measurement device) and response reliability (reliability/stability of the variable being measured). These aspects are addressed through establishing equivalence, stability and internal consistency for the instrument.

Equivalence 

Equivalence establishes whether an instrument produces consistent measurements, for a given entity, in the hands of two or more investigators (raters) or when utilised in two different forms. If data are to be collected on any variable in a study by more than one investigator, inter-rater reliability needs to be considered. Tools used in clinical practice are almost always required to have inter-rater reliability, since different practitioners are likely to use them to assess the same individual (for example, CAM-ICU).
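
The paper does not name a statistic for inter-rater reliability. One commonly used option for categorical ratings by two raters is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below assumes two raters have each assessed the same patients; the function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening results from two nurses assessing eight patients.
nurse_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg"]
nurse_2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg"]
kappa = cohens_kappa(nurse_1, nurse_2)
print(round(kappa, 3))
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance.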

Stability 

Stability denotes the extent to which an instrument performs consistently when used to measure the same entity on repeated occasions, i.e. the extent to which measurements are repeatable. It is important to distinguish between stability of the instrument (e.g. CAM-ICU) and stability of the entity that it seeks to measure (e.g. confusion). Stability is usually determined for a single rater/investigator and is referred to as intra-rater reliability or test–retest reliability. It involves serial measurements of an entity by a single rater. Attempts to obtain intra-rater reliability are still prone to errors arising from variation in the investigator's performance.
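
As a rough illustration of a test–retest check, the sketch below compares scores obtained by one rater on two occasions. It reports the Pearson correlation together with the mean difference between occasions, because a correlation alone cannot reveal a systematic shift in scores. The choice of these two statistics is an assumption made for illustration (the paper does not prescribe one), and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def retest_agreement(first, second):
    """Return (Pearson r, mean difference) for paired repeated measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    m1, m2 = mean(first), mean(second)
    cov = sum((x - m1) * (y - m2) for x, y in zip(first, second))
    ss1 = sum((x - m1) ** 2 for x in first)
    ss2 = sum((y - m2) ** 2 for y in second)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("scores on each occasion must vary")
    r = cov / sqrt(ss1 * ss2)
    # Positive mean difference: scores were higher on the second occasion.
    mean_diff = mean([y - x for x, y in zip(first, second)])
    return r, mean_diff


# Hypothetical scores for five patients measured twice by the same rater.
occasion_1 = [12, 15, 9, 20, 17]
occasion_2 = [13, 16, 10, 21, 18]
r, mean_diff = retest_agreement(occasion_1, occasion_2)
print(r, mean_diff)
```

In this example every score rises by one point, so the correlation is essentially perfect even though the two occasions never give the same value; the mean difference exposes that shift.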

Internal consistency 

Internal consistency is a measure of the homogeneity of a multi-item instrument and is usually achieved by including at least two items that measure the same aspect of the construct.
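
A widely used index of internal consistency is Cronbach's alpha; it is not named in the paper and is offered here only as one possible way to quantify homogeneity. The sketch assumes complete responses (no missing items), with one row per respondent; the function name and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one per respondent, each row holding that
    respondent's score on every item of the instrument.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores must vary across respondents")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five relatives to a four-item scale (1-5 each).
answers = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 3))
```

Calculating alpha separately for each translated version is one way to check that items still hang together after translation.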

Responsiveness 

A measure is said to be ‘responsive’ if it is sensitive to interventions. For example, quality of life measures should be responsive to interventions that change quality of life. Evaluating responsiveness requires assessing quality of life relative to an external indicator of change.

Generic instruments are most useful in discriminating and making comparisons of different disease states for determining severity of disease impact and cross-condition comparisons. Disease-specific instruments can assess limitations or restrictions associated with particular disease states and may be more responsive to minimally significant changes.
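
One way to put a number on responsiveness is the standardised response mean (mean change divided by the standard deviation of change), calculated separately for patients whom an external indicator classifies as changed or unchanged. This index is an assumption chosen for illustration, not one specified in the paper; the function name, the grouping and the scores are hypothetical.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline, follow_up):
    """Mean change from baseline divided by the SD of the change scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need at least two paired scores")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    sd = stdev(changes)
    if sd == 0:
        raise ValueError("change scores must vary")
    return mean(changes) / sd


# Hypothetical quality-of-life scores, grouped by an external indicator
# of change (for example, a clinician's global rating).
improved = {"baseline": [50, 55, 60, 45], "follow_up": [60, 62, 75, 50]}
unchanged = {"baseline": [52, 58, 61, 47], "follow_up": [53, 57, 62, 46]}

srm_improved = standardised_response_mean(
    improved["baseline"], improved["follow_up"]
)
srm_unchanged = standardised_response_mean(
    unchanged["baseline"], unchanged["follow_up"]
)
print(srm_improved, srm_unchanged)
```

A responsive measure would be expected to show a clearly larger value in the group the external indicator marks as changed than in the unchanged group.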

Feasibility and acceptability 

Although a research instrument may be widely accepted internationally (for example, the generic SF-36 for measuring health and well-being) it may not be specific enough for the culture in which the research is being conducted or the illness under study. In this case, the investigator would use the generic tool along with a specific tool relevant to the disease/context. The inclusion of a generic measure allows comparison of the study sample (for example family members of ICU patients) with other populations.

One aspect of the aforementioned review of Palliative Quality Measures for ICU (Nelson et al., 2006) by the authors of this paper is the feasibility and acceptability of all measures in the three countries (UK, Israel and Switzerland). Some dimensions of the tool have to be removed because the practices they describe are not carried out in a particular country.

Conclusions 

Clinicians are increasingly encouraged to use outcome measures to assess the effectiveness of interventions, for example the Short-Form-36 (SF-36), the Hospital Anxiety and Depression Scale (HADS) or the Impact of Events Scale (IES). It is important to appreciate the steps necessary to adapt research instruments for use in different cultures, both for collecting research data and for monitoring patient progress. The increasingly multicultural population of ICU patients and families also brings a responsibility to ensure that adequate understanding is achieved when explaining the results of assessments to patients and families with a different linguistic background. This is particularly important in countries where the educational background of the patient/family may create inequity in the level of understanding achieved.

References 

  1. Benbenishty J, Adam S. Physical restraint in European ICUs. Intensive Care Med. 2006;32(S1):S107
  2. Benbenishty J, Ganz FD, Lippert A, Bulow H-H, Wennberg E, Henderson B, et al. Nurse involvement in end-of-life decision making: the ETHICUS study. Intensive Care Med. 2006;32:129–132
  3. Birbili M. Translating from one language to another. Social Research Update 2000; 31. Available from: http://www.soc.surrey.ac.uk/sru/SRU31.html [accessed 11.11.2009].
  4. Chen HY, Boore JRP. Translation and back-translation in qualitative nursing research: methodological review. J Clin Nurs. 2010;19:234–239
  5. Curtis JR. Measuring health status after critical illness: where are we and where do we go from here? In: Angus DC, Carlet J, editors. Surviving intensive care. Berlin: Springer; 2003.
  6. Duffy ME. Translating instruments into other languages: basic considerations. Clin Nurse Spec. 2006;20(5):225–226
  7. Jones C, Backman C, Capuzzo M, Flaaten H, Rylander C, Griffiths RD. Precipitants of post-traumatic stress disorder following intensive care. Intensive Care Med. 2007;33:978–985
  8. Larsson C, Granberg Axell A, Ersson A. A confusion assessment method for the intensive care unit (CAM-ICU): translation, retranslation and validation into Swedish intensive care settings. Acta Anaesthesiol Scand. 2007;51:888–892
  9. Latour JM, Fulbrook P, Albarran JW. EfCCNa survey: European intensive care nurses’ attitudes and beliefs towards end-of-life care. Nurs Crit Care. 2009;14:110–121
  10. Michalsen A, Reinhart K. “Euthanasia”: a confusing term, abused under the Nazi regime and misused in present end of life debate. Intensive Care Med. 2006;32:1304–1310
  11. Nelson JE, Mulkerin CM, Adams LL, Pronovost PJ. Improving comfort and communication in the ICU: a practical new tool for palliative care performance, measurement and feedback. Qual Saf Health Care. 2006;15:264–271
  12. Sprung C, Carmel S, Sjokvist P, Baras M, Cohen SL, Maia P, et al. Attitudes of European physicians, nurses, patients, and families regarding decisions: the ETHICATT study. Intensive Care Med. 2007;33:104–110
  13. Stricker K, Niemann S, Bugnon S, Wurz J, Rohrer O, Rothen H. Family satisfaction in the intensive care unit: cross-cultural adaptation of a questionnaire. J Crit Care. 2007;22:204–211
  14. Sumathipala A, Murray J. New approach to translating instruments for cross-cultural research: a combined qualitative and quantitative approach for translation and consensus generation. Int J Methods Psychiatr Res. 2006;9(2):87–95
  15. Temple B. Watch your tongue: issues in translation and cross-cultural research. Sociology. 1997;31:607–618
  16. Varrichio C. Measurement issues concerning linguistic translations. In: Frank-Stromberg M, Olsen S, editors. Instruments for clinical health care research. 3rd ed. Boston: Jones and Bartlett; 2004.
  17. Wild D, Grove A, Martin M, Eremenco S, McElroy S, Verjee-Lorenz A, et al. Principles of good practice for the translation and cultural adaptation process for patient reported outcome (PRO) measures: report of the ISPOR task force for translation and cultural adaptation. Value Health. 2005;8:94–104

PII: S0964-3397(09)00115-3

doi:10.1016/j.iccn.2009.12.005
