
The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings



Racism and associated discrimination are pervasive and persistent challenges with multiple cumulative deleterious effects contributing to inequities in various health outcomes. Globally, research over the past decade has shown consistent associations between racism and negative health concerns. Such research confirms that race endures as one of the strongest predictors of poor health. Due to the lack of validated Australian measures of racist attitudes, RACES (Racism, Acceptance, and Cultural-Ethnocentrism Scale) was developed.


Here, we examine RACES’ psychometric properties, including its latent structure, utilising Item Response Theory (IRT). Unidimensional and multidimensional Rating Scale Model (RSM) Rasch analyses were conducted with 296 Victorian primary school students and with 182 adolescents and 220 adults from the Australian community.


RACES was demonstrated to be a robust 24-item three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RSM Rasch analyses provide strong support for the instrument as a robust measure of racist attitudes in the Australian context, and for the overall factorial and construct validity of RACES across primary school children, adolescents, and adults.


RACES provides a reliable and valid measure that can be utilised across the lifespan to evaluate attitudes towards all racial, ethnic, cultural, and religious groups. A core function of RACES is to assess the effectiveness of interventions to reduce community levels of racism and in turn inequities in health outcomes within Australia.


Racism and associated discrimination are pervasive and persistent challenges that permeate contemporary society, with multiple cumulative deleterious effects on the health of all people. Research consistently confirms that race is one of the strongest predictors of health outcomes, with racism a fundamental cause of such inequalities [1, 2]. Positive social contact is essential for social, psychological, and physiological health and development throughout the lifespan; individuals who experience social isolation or rejection, including as a result of inter- and intra-racial racism, are susceptible to various behavioural, emotional, and physical problems, and negative educational, economic, and social outcomes [3, 4]. Racist attitudes result in poor physiological outcomes, negative mental health outcomes, and general psychopathology in various minority racial, ethnic, cultural, and religious groups in numerous societies with immigrant and Indigenous populations [5–7]. Racism is also a key influence on common psychiatric conditions such as mood, anxiety, and eating and substance use disorders. Moreover, when groups are relentlessly depicted as problematic and undesirable, these stereotypes are internalised, with negative consequences for both dominant and non-dominant groups (cf. [8–11]).

Although a change in one’s beliefs or attitudes toward a stereotyped group may or may not lead to changes in behaviour toward members of that group [12], attitude change is an essential component of reducing community levels of racism. Measurement is therefore fundamental in discussions of improving racial attitudes [13]. Quantifying racism is challenging, however, requiring differentiation of its multiple dimensions and the range of potential reactions and responses to exposure to racism.

Measuring racism

Racism research has historically concentrated on two alternate and distinct methods of measurement. The majority of investigations have examined the effects of racism by concentrating on victims of perceived racism, evaluating the frequency and intensity of the racist events individuals experience (for reviews see [5, 6, 14–17]). Less attention has been paid to the racist attitudes held by individuals. Even so, over 100 instruments exist that assess explicit racist attitudes and 24 are available to evaluate perceived racism [14, 18]. Most of these have not been appropriately validated, the tools often fail to meet minimum standards required for scientific attitude scales (fewer than 5 % of studies address a sufficient range of reliability and validity indices for an instrument to be considered valid), and they are often used indiscriminately. In addition, most measures of racist attitudes relate to anti-African American attitudes and are validated only for US populations. These scales may not necessarily be relevant, generalisable, valid, or useful in alternate settings. Further, direct extrapolation of US experiences and research is inappropriate for the Australian context [19], given the distinctive histories and experiences of Aboriginal Australians and African Americans; the nature of colonial relations; the extensiveness of genocidal pasts; the relative size of populations; levels of visibility; and the extent of reduced social, economic, and health status [20]. Dissimilar patterns of cultural diversity across the two countries also render problematic the direct transfer of US measures to Australia.

Despite these problems, Australian researchers have often uncritically imported and utilised US concepts and tools [19, 20]. Several Australian scales have been developed, but these either concentrate on a specific group (e.g., Indigenous Australians; [19]) or lack a robust research base and peer evaluation of their empirical development and validation (e.g., [21]). This gap is especially apparent for youth: here the available instruments are limited to measures of social distance and stereotyping (e.g., [22, 23]); those adapted from non-Australian measures used without further validation (e.g., [24]); and instruments requiring extrapolation from participant responses, raising questions of reliability and validity (e.g., [25]).

Moreover, Australian studies of racism have predominantly been conducted as if racism existed only between White non-Indigenous and Indigenous Australians [26], with the first systematic investigation of racist attitudes in a minority group conducted only recently [27]. This is problematic because of community diversity in Australia, the varying characterisations of non-Australians versus Australians [9], and evidence that distinct racial, ethnic, cultural, and religious groups experience and conceptualise racism in different ways [28–30, 31].

While early characterisations of Indigenous people provided the foundations for contemporary racist practices [26, 32], the contemporary context is important, given the changing nature of racism [33]. Pedersen, Clarke, Dudgeon, and Griffiths [34] describe the historical progression of racism in Australia as moving from targeting Yugoslavs, Italians, Asians, and Arabs, to Afghans. The past decade would most appropriately also include people from the Indian sub-continent and from Africa, both populations widely reported in the media as key out-groups in contemporary Australian society. The historical, contemporary, and regional factors that shape the differing attitudes to these groups need to be understood and reflected in assessment instruments to ensure appropriate evaluation of interventions aiming to improve intergroup relations. Current racism research is therefore limited in terms of generalisability, validity, and utility for the Australian context [35].

Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES)

Despite the extensive work of Australian researchers and community and government organisations working against racism, there are no empirically validated tools available to measure racism in the Australian context. As a result, anti-racism programs are rarely well evaluated. To redress this, an explicit measure of racial, ethnic, cultural, and religious acceptance – the Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES; [36]) – was developed with children, adolescents, and adults from various racial, ethnic, cultural, and religious backgrounds.

From December 2011 to March 2012, a qualitative study was conducted among young Australians on their conceptualisations of and experiences with racism, to generate sufficient data to form the basis of a scale (detailed elsewhere; [31]). This study demonstrated a consistent explanatory model for understanding racism across groups [36]. The qualitative data, which provided insight into Australian lay understandings of racism [31], were supplemented by a comprehensive review of the conceptual racism literature and existing instruments to create the preliminary measure. Since RACES was designed to evaluate and inform anti-racism and pro-diversity initiatives, items were designed to measure acceptance of difference and racism viewed along a continuum. Efforts were made to ensure that item development was atheoretical, driven primarily by the qualitative data rather than conforming to a chosen theory of racism. Consequently, the items developed can be thought of as representing the multidimensional nature of contemporary racism in Australia, spanning a number of theoretical positions.

The items underwent expert review for appropriateness, comprehensiveness, redundancy, and clarity, and were subsequently pilot tested utilising cognitive interviewing techniques with children to ensure comprehensibility regardless of age. The instrument was evaluated longitudinally and cross-sectionally with school children, adolescents, and adults drawing upon Classical Test Theory (CTT; [36]). As we illustrate below, estimates of internal consistency reliability, in addition to factorial, construct, convergent, and discriminant validity, support the measure.

Aim and hypotheses

In this article we examine the underlying factor structure of RACES using Item Response Theory (IRT) to further refine and finalise the measure developed using CTT. This provides additional support for its use as a robust tool to assess and evaluate racism reduction interventions. We hypothesised that the underlying factor structure of the measure would be consistent for CTT and IRT, and that the final measure would function comparably across children, adolescents, and adults.


Research setting

The childhood component of the research was based in a small town, Greenfields (pseudonym), located in Cardinia Shire, approximately 55 km southeast of central Melbourne. The Shire, and the adjacent City of Casey, are among the most rapidly growing residential areas of Melbourne, with population estimates well exceeding the projected growth forecasts of both the state of Victoria and the Australian nation [37–39]. The vast majority of inhabitants of Cardinia Shire, and their parents, are Australian-born, at rates much higher than the general state and national populations. However, this cultural uniformity will be substantially affected by the projected increase in population, with increasing numbers of culturally and linguistically diverse migrants predicted [38]. The adolescent and adult components of the research were conducted throughout the Australian nation.


The research reported here involved 296 students from the core Victorian study area. These students were enrolled in years five or six at six different primary schools. Two of the schools were government funded and secular, two were non-denominational Christian, one was Islamic, and one was Catholic. In addition, 402 community individuals aged 15 years or older also participated. Adolescents and adults from six of the seven Australian states and territories participated in the research (for details see: [36]). It was considered important to examine the children, adolescents, and adults separately due to differences in their general developmental stage [40–42] and the level of crystallisation of their racial attitudes [43]. Descriptive statistics for each sample are displayed in Tables 1 and 2 below.

Table 1 Descriptive statistics split by data set
Table 2 Participant self-labelled racial/ethnic background descriptives split by data set

Item response theory

The Rasch models originally proposed in the 1960s can be used to analyse categorical data from assessments designed to measure latent underlying variables such as abilities, attitudes, or personality traits [44]. Rasch models and the related Item Response Theory emphasise that the qualities of both the individual and the item influence item responses [45]. The core underlying theory is that there is a differential effect of item ‘difficulty’ on individuals at different trait levels [45]. For example, on a hypothetical measure of racist attitudes, of the two items “I hate people from other backgrounds” and “I have some minor racist tendencies,” the former is considerably more ‘difficult’ to endorse and would be expected to be endorsed only by individuals high on the trait of racism. Conversely, the latter item may be endorsed by individuals who are much lower, as well as those moderate or high, on the trait of racism. Endorsement of each item thus provides distinct information about individuals with differing levels of the underlying trait of racism. In contrast, CTT tends to treat each item as having the same ‘difficulty’ and ignores differing response patterns. This limits CTT in its ability to deal with an ordered continuum of items representing an underlying unidimensional construct and with the summation of rating scale data [46]. Consequently, Rasch models and IRT can be utilised to perform advanced analytical techniques, which evaluate the differential effects of item ‘difficulty’ and individual trait level, not otherwise available within a CTT framework.
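The differential effect of item ‘difficulty’ can be illustrated with a minimal numerical sketch of the dichotomous Rasch model (an illustration only, not the analyses reported here; the difficulty values are hypothetical):

```python
import math

def rasch_endorse_prob(theta, b):
    """Dichotomous Rasch model: probability that a person with trait
    level theta endorses an item with endorsement 'difficulty' b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical difficulties: a 'hard'-to-endorse item (b = 2.0, akin to
# "I hate people from other backgrounds") and an 'easy' one (b = -1.0,
# akin to "I have some minor racist tendencies").
HARD, EASY = 2.0, -1.0
for theta in (-2.0, 0.0, 2.0):
    p_hard = rasch_endorse_prob(theta, HARD)
    p_easy = rasch_endorse_prob(theta, EASY)
    # At every trait level the 'easy' item is the more probable
    # endorsement; only persons high on the trait approach p = .5
    # on the 'hard' item.
    assert p_easy > p_hard
```

For any trait level θ the ‘easy’ item yields the higher endorsement probability, which is exactly the distinct-information property described above.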

In some instances, Rasch models and IRT have been considered psychometrically superior to CTT methods such as Principal Components Analysis, Exploratory Factor Analysis, Confirmatory Factor Analysis (CFA), and related statistical analyses, and appear to improve the precision and validity of psychological measurement [45, 47]. Both IRT and CTT methods have advantages and limitations, however, with certain statistical approaches more advantageous than others depending on the research purpose [48]. Moreover, there are underlying mathematical similarities between the two methods [49]. Since neither has an overarching distinct advantage, IRT and CTT were used interdependently to evaluate the psychometric properties of RACES [50].


Ethics, consent, and permissions

Ethics approval was received from the Monash University Human Research Ethics Committee. Prior to participation, all participants were provided with an explanatory statement and given the option to decline involvement in the research.

Testing procedure

Initial instructions to participants outlined the purpose of the survey as inquiring about their thoughts and feelings towards people from the many different racial, ethnic, cultural, and religious backgrounds in Australia, with a number of examples of backgrounds provided (e.g., “Australian”, “Jewish”, “African”, etc.). Once the survey was completed, participants were thanked for their involvement in the research, but no post-testing feedback was provided.

Primary school data set procedure

The authors became involved with five participating schools when we were invited to evaluate the activities of a Victorian anti-racism program, known as “Building Harmony in the Cardinia Growth Corridor”. The principal of an additional school was approached directly by the authors for student participation to enable the inclusion and evaluation of attitudes of children not currently participating in an anti-racism and pro-diversity initiative. All schools obtained permission for students to participate from parents, with no parent declining their child’s participation.

All surveys were completed in September 2012 under the supervision of teachers during class. In five schools the survey was completed online (completion time 15–30 min); in the remaining school surveys were completed in hard copy (completion time 45–60 min). All responses were completed within 10 days of initiation of the survey, which included a demographic questionnaire, RACES [36], and the Strengths and Difficulties Questionnaire [51] (not analysed here). Data are referred to below as the ‘Primary School data set’.

Community data set procedure

Adolescent and adult community participants were recruited nationally via newspaper, radio, and online advertising. Participants aged 15 years or older were considered capable of providing informed consent for the purposes of the current research. Participants were able to access a link to the online survey or contact the authors directly to be provided with a web link or a hard copy survey via mail; all but four responses were completed online, between March 2012 and April 2013. The surveys took approximately 15 min to complete and included a demographic questionnaire, RACES [36], the Dunn and Geeraert [21] Racism Survey, and the Minnesota Temperament Inventory [52]; the latter two measures are not analysed here. Data from this group are labelled below as the ‘Community data set’. Data were intended to be examined in their entirety (‘Community data set’) and split by adolescents aged 15–20 years (‘15–20 years data set’) and adults aged 21 years and over (‘21+ years data set’) to explore the consistency of the measure across age groups. However, the 21+ years data set failed to meet minimum IRT assumptions and was omitted from independent analysis.

Data treatment

Data for each data set – Primary School, Community, and 15–20 years – were initially collated in SPSS 20.0 and a missing data analysis was performed with all cases with 5 % or more data missing removed. Data were subsequently collated in ACER ConQuest 3.0. Analysis using a Rasch Rating Scale Model (RSM) was undertaken for each data set and each subscale separately. According to the model, the probability of a person n responding in category x to item i is given by

$$P(X_{ni} = x) = \frac{\exp \sum_{k=0}^{x} (\beta_n - \delta_i - \tau_k)}{\sum_{j=0}^{m} \exp \sum_{k=0}^{j} (\beta_n - \delta_i - \tau_k)}, \qquad \tau_0 = 0,$$

where $\beta_n$ is the person’s position on the variable, $\delta_i$ is the scale value (‘difficulty’ to endorse) estimated for each item $i$, and $\tau_1, \tau_2, \ldots, \tau_m$ are the $m$ response thresholds estimated for the $m + 1$ rating categories.
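As a worked illustration of these definitions (a sketch only, not the ConQuest estimation code; the parameter values are hypothetical), the RSM category probabilities for one person-item pair can be computed as:

```python
import math

def rsm_probabilities(beta, delta, taus):
    """Rating Scale Model category probabilities for one person-item pair.

    beta  : person position on the latent variable (beta_n)
    delta : item scale value / 'difficulty' to endorse (delta_i)
    taus  : thresholds tau_1..tau_m (tau_0 = 0 is prepended), giving
            m + 1 rating categories (e.g. m = 3 for a four-point scale).
    Returns a list of probabilities over categories 0..m summing to 1.
    """
    cumulative, logits = 0.0, []
    for tau in [0.0] + list(taus):   # tau_0 = 0
        cumulative += beta - delta - tau
        logits.append(cumulative)    # sum_{k=0}^{x} (beta - delta - tau_k)
    denominator = sum(math.exp(l) for l in logits)
    return [math.exp(l) / denominator for l in logits]
```

A person located well above the item’s scale value concentrates probability in the top rating categories; one located well below it, in the bottom categories.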

Model and item fit were assessed, and items were removed, according to criteria recommended by Linacre [53]. Infit (inlier-sensitive or information-weighted fit) and Outfit (outlier-sensitive or non-weighted fit) were evaluated using 0.5–1.5 as a guideline for productive measurement, with values above 2.0 considered degrading of the measurement system. Standardised values, which assess whether the model fits the data perfectly, were subsequently inspected, allowing −2.0 to 3.0 as an acceptable fit. Ill-fitting items on this index are not considered to be degrading of the overall model, but rather to be either unpredictable (i.e., > 3.0) or overly predictable (i.e., < −2.0). Moreover, if Infit and Outfit values are acceptable, Standardised values can be ignored [54]. Once misfitting items are identified, the researcher must decide whether to keep or disregard these data. The confirmation of item fit provides evidence of item quality and content validity.
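The Infit and Outfit mean-square statistics follow standard Rasch residual definitions: Outfit is the unweighted mean of squared standardised residuals, while Infit weights squared residuals by their model variance. A minimal sketch (illustrative only; ConQuest computes these internally):

```python
def fit_mnsq(observed, expected, variances):
    """Infit and Outfit mean-square statistics for one item across N
    persons. observed: responses; expected: model-expected scores;
    variances: model variances of each response. The 0.5-1.5 range is
    the productive-measurement guideline used in this analysis, with
    values above 2.0 considered degrading of measurement."""
    # Outfit: unweighted mean of squared standardised residuals.
    z2 = [(x - e) ** 2 / w for x, e, w in zip(observed, expected, variances)]
    outfit = sum(z2) / len(z2)
    # Infit: squared residuals weighted by (divided by summed) variance.
    infit = sum((x - e) ** 2 for x, e in zip(observed, expected)) / sum(variances)
    return infit, outfit
```

When every response carries the same model variance the two statistics coincide; Outfit diverges from Infit as off-target (outlying) persons contribute surprising responses.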


Both Primary School and Community participants completed the 25-item RACES, which consists of three subscales, each capturing a distinct component of racism: the Racist Attitudes Scale (RAS), an 8-item scale of attitudes reflecting out-group denigration and derogation; the Accepting Attitudes Scale (AAS), a 13-item scale of attitudes reflecting out-group endorsement and acceptance; and the Ethnocentric Attitudes Scale (EAS), a 4-item scale of attitudes reflecting in-group favouritism and loyalty [36]. Items are responded to on a four-point Likert-type scale ranging from “Strongly Disagree” to “Strongly Agree”; half are reverse scored so that higher scores indicate higher levels of acceptance or lower levels of racist attitudes. A neutral option was omitted to ensure ambivalent participants offered a meaningful response and to encourage them to consider their opinions when responding to the survey [55]. The subscales are appropriately interrelated with moderate to near perfect effect [36], and the relationships between RACES and an existing Australian measure of racism (very large to near perfect effect; [36]) and social, emotional, and behavioural strengths and difficulties (small to large effect; [56]) have been established. RACES has also been shown to be internally consistent (total scale and subscale alpha coefficients range from .79 to .91); to possess factorial, construct, discriminant, and convergent validity in children, adolescents, and adults; and to be test-retest reliable in children [36, 57].

Model selection

A core assumption of Rasch and IRT analyses is the selection of an appropriate model for the data [58]. A range of Rasch models can be utilised for rating scale type data; two competing models include the RSM and the Partial Credit Model (PCM). RSM specifies that a set of items share the same rating scale structure or response format (e.g., all items have the possible responses “Strongly Disagree”, “Disagree”, “Agree”, and “Strongly Agree”) [59, 60]. In contrast, PCM specifies that each item has its own unique rating scale structure, derived from assessments where responses that are incorrect can be indicative of some knowledge and are consequently given partial credit [59, 60]. For our purposes, a Rasch model known as a polytomous one parameter RSM for unidimensional traits was considered most appropriate [61]. The RSM was developed to analyse ratings from a unidimensional item set with two or more ordered and fixed response categories [62], and was expanded for use in multidimensional models in IRT software, such as ACER ConQuest 3.0. Both unidimensional and multidimensional RSM were utilised to examine the underlying latent structure as unidimensional (i.e., three unidimensional subscales examined independently) and multidimensional (i.e., three subscales examined interdependently as a single multidimensional scale), providing information that may have been overlooked had only one method been utilised. The purpose of evaluating the fit of a unidimensional model to each of the three subscales also enabled the assessment of whether each, or any, of the subscales could potentially be utilised as an independent scale. The use of a multidimensional model additionally enables the calibration of each subscale simultaneously, increasing measurement precision by including an assessment of the correlations between subscales. 
This advantage of multidimensional models is especially prominent when subscale length is limited or correlations between subscales are high [63], as is the case with RACES.

Response category variability

A further assumption of polytomous Rasch models is that the data set to be analysed has acceptable response category variability; without it, measures may be unstable, model fit indices inaccurate, and inferences incorrect [64]. Adequate variability ensures the robustness of the estimates, that is, that similar estimates could be obtained with another sample from an equivalent population. A guideline for RSM is a minimum of 10 observations in each category accumulated across all relevant items (M. Linacre and R. Adams, personal communication, September 16, 2014). A smaller number of observations at the item level alone can impact upon the capacity to accurately assess fit.
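Checking this guideline amounts to tallying responses per category across a subscale’s items; a minimal sketch (illustrative only; category codes 0–3 stand for the four-point response format, and the minimum of 10 is the guideline cited above):

```python
from collections import Counter

def category_counts(subscale_responses):
    """Accumulate observation counts per rating category across all
    items of a subscale. subscale_responses is a list of per-person
    response lists of category codes (e.g. 0-3 for a four-point scale)."""
    counts = Counter()
    for person in subscale_responses:
        counts.update(person)
    return counts

def has_sufficient_variability(counts, categories=(0, 1, 2, 3), minimum=10):
    """RSM guideline: at least `minimum` observations in each category
    accumulated across all relevant items."""
    return all(counts.get(c, 0) >= minimum for c in categories)
```

On this criterion a data set with even one category observed fewer than 10 times across the subscale would be flagged, as occurred for the 21+ years data set.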

To assess the assumption of response category variability, we examined the number of responses in each category for each item. All data sets met the minimum criterion; however, the 21+ years data set had a total of seven items (i.e., 29 % of the scale) without a response in each category and was therefore not considered to have sufficient response variability to enable accurate analysis, precluding Rasch analysis of this data set. The Primary School, overall Community, and 15–20 years data sets were examined, strengthening our results by allowing exploration of the latent trait structure of the three RACES subscales using Rasch analysis across age groups.


A final underlying assumption of unidimensional Rasch models is that the data have a unidimensional structure [65]. The underlying multidimensionality of RACES [36] precluded examining the scale as a single unidimensional measure. Although multidimensional Rasch models exist, they are complex and limited software is available to facilitate flexible analysis [66, 67]. Hence, examination utilising a multidimensional model provided supplementary information, rather than acting as a central analysis. Each subscale was examined separately utilising the unidimensional RSM, as is appropriate when multiple subscales are assumed to tap a unidimensional construct [66].

Although CFA has disadvantages for evaluating underlying unidimensionality prior to undertaking Rasch analysis, it is common in psychological research [68]. Moreover, even when more advanced methods such as the TETRAD method, the Rasch model, or Parallel analysis are utilised to confirm unidimensionality, subjective judgment is required to determine underlying dimensionality [68]. CFA utilising a congeneric (one factor) measurement model was therefore considered sufficient to examine the underlying unidimensionality of each of the subscales prior to undertaking further Rasch analyses. Each subscale was assessed separately, with an evaluation of the fit of all items within each subscale performed.



The unidimensionality of each subscale (AAS, RAS, and EAS) was examined utilising a separate congeneric (one factor) measurement model CFA for all data sets (Primary School, Community, and 15–20 years). The χ² statistic indicated poor fit for a number of analyses. However, this statistic is sensitive to sample size, and a number of alternative, less conservative fit indices are available [69]. To avoid model misspecification, multiple indices of fit were examined using widely accepted cut-off criteria [70]. CMIN/df is considered poor fit above 3.00 [71]; RMSEA poor fit above .10 [69] and good fit below .08 [72]; IFI good fit above .90 [73]; and SRMR good fit below .10 [74]. Each hypothesised factor for all data sets was considered to be of sufficient unidimensionality to undertake Rasch analysis (see Tables 3 and 4).
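Screening a solution against these cut-offs can be expressed as a simple check (an illustrative helper only; the index values themselves come from SEM software):

```python
def cfa_fit_acceptable(cmin_df, rmsea, ifi, srmr):
    """Screen a congeneric CFA solution against the cut-off criteria
    cited above: CMIN/df <= 3.00, RMSEA <= .10, IFI >= .90,
    SRMR <= .10. Returns True when all indices are acceptable."""
    return (cmin_df <= 3.00 and rmsea <= 0.10
            and ifi >= 0.90 and srmr <= 0.10)
```

A single index outside its cut-off is enough to question unidimensionality, which is why multiple indices are examined jointly rather than relying on χ² alone.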

Table 3 RACES subscales CFA unidimensionality results
Table 4 CFA congeneric (one factor) measurement model factor loadings for RACES subscales

Unidimensional model fit

Primary school data set

All items on each subscale had acceptable Infit and Outfit. When Standardised values were examined, EAS had acceptable fit, but AAS and RAS had several items of less than ideal fit. However, no items were removed, due to the sensitivity of this index to sample size and the acceptable Infit and Outfit values across each item. Each of the reliability indices (separation reliability and EAP/PV reliability) indicated that all RACES subscales had acceptable reliability (i.e., > .70; [75]). EAP/PV reliability is the explained variance according to the estimated model divided by the total variance across individuals [76]. As explained previously, Rasch models permit separation of the individual and item parameters. Separation reliability is a summary of ‘true’ separation as a ratio to separation including measurement error (the ratio of sample deviation, corrected for error, to the average estimation error), indicating how well a test can separate individuals by performance; it is comparable to the Kuder-Richardson Formula 20 measure of internal consistency [77].
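Separation reliability as described above can be sketched as the ratio of ‘true’ (error-corrected) variance to observed variance of the person estimates (illustrative only; the Rasch software reports this directly):

```python
def separation_reliability(estimates, std_errors):
    """Person (or item) separation reliability: the ratio of 'true'
    variance (observed variance of the estimates minus the mean squared
    estimation error) to observed variance, analogous to KR-20."""
    n = len(estimates)
    mean = sum(estimates) / n
    observed_var = sum((e - mean) ** 2 for e in estimates) / n
    mean_sq_error = sum(se ** 2 for se in std_errors) / n
    true_var = max(observed_var - mean_sq_error, 0.0)  # clamp at 0
    return true_var / observed_var if observed_var > 0 else 0.0
```

Reliability approaches 1 when estimation error is negligible relative to the spread of person measures, and falls toward 0 (here clamped) when error variance swamps the observed spread.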

15–20 years data set

Several items across the subscales were of less than ideal fit when Standardised values were examined. However, one item (“I don’t tease people because of their background”) on AAS had unacceptable Infit and Outfit. Each of the reliability indices indicated that RAS and AAS had acceptable reliability. EAS had poor separation reliability, but acceptable EAP/PV reliability. The misfitting item from AAS was removed from further analysis with this data set following recommendations of initially removing underfitting items (i.e., > 1.5; [78]), and the RSM analysis was re-conducted.

All items on the subscale were of acceptable Infit and Outfit, although several fell outside the recommended Standardised values range. All items were retained, however, due to the sensitivity of the index, and the balance achieved with the current total RACES of 12 positive items and 12 negative items. This balance avoids response bias due to (1) the sensitivity of the attitudes under evaluation [79, 80] and (2) the tendency for participants to acquiesce, especially those with lower levels of general knowledge and cognitive sophistication (e.g., younger individuals and those with less formal education) [81]. It allows exploration of both positive (acceptance) and negative (racism) attitudes which are functionally independent (i.e., positive attitudes are stronger predictors of positive behaviours and negative attitudes are stronger predictors of negative behaviours) (cf. [19, 82]) and conceptually distinct [83].

Community data set

All items on EAS were of acceptable Infit and Outfit. AAS had one item (“I don’t tease people because of their background”) with undesirable Infit and Outfit and one item (“I get upset if I hear racist comments about any background”) with less than ideal Outfit. RAS had one item (“People from some backgrounds get more than they deserve”) with undesirable Infit. Several items across the subscales were of less than ideal fit when Standardised values were examined. However, due to the sensitivity of this index and acceptable Infit and Outfit values across most items, only one item (“I don’t tease people because of their background”) of poor fit across all indices was removed from further analysis with this data set, and the RSM analysis was re-conducted.

Two items had Outfit outside of the recommended range (“I get upset if I hear racist comments about any background” and “We should be taught about all backgrounds in school”). All other items were of acceptable Infit and Outfit. Several items were outside the recommended Standardised values range, but were retained due to (1) the Infit-Outfit discrepancies, with no items considered degrading of the measurement system, (2) the sensitivity of the Standardised values index, and (3) the balance achieved with the current total RACES scale of 12 positive items and 12 negative items if no further items are removed.

Primary school data set re-analysis

Due to the potential value of a single scale containing precisely the same items to assess racism across age groups, the Primary School data set was re-assessed. One item problematic in both the 15–20 years and overall Community data sets (“I don’t tease people because of their background”), was removed from AAS and the RSM analysis was re-conducted. All items on the subscale were of acceptable Infit and Outfit. Although several items were outside the recommended Standardised values range, all items were retained due to reasons reported above.

The final model fit statistics for each data set and subscale are shown in Table 5 below.

Table 5 Unidimensional model fit indices for RACES subscales

Unidimensional scale information

Rasch analysis enables graphical representations of item and total scale characteristics of the data. The Item Characteristic Curve (ICC) or Item Response Function (IRF) and the Expected Score Curve (ESC) are key graphical representations of the performance of items within a Rasch analysis. The Test Information Curve (TIC) or Test Information Function (TIF) is a core graphical representation of the performance of the overall test or scale within a Rasch analysis. Due to space constraints, only the TIFs for the Community data set subscales are depicted graphically in the main text of this article (additional figures displaying the alternate data sets’ TIFs are presented in Additional file 1, available online). However, the performance of the RACES overall scale and subscales is described in the context of each graphical representation below.

The ICC/IRF shows the probability of a correct response as a function of the trait level of an individual and provides a nuanced analysis of item categories. These graphs represent probability as a function of ability plotted along an S‐shaped curve, with low trait levels having a probability of close to zero and high trait levels having a probability of close to one. The leftmost ICCs are the items ‘easiest’ to endorse (i.e., individuals low to high on the latent trait would endorse) and the rightmost items are the most ‘difficult’ to endorse (i.e., only individuals high on the latent trait would endorse). For our purposes an ‘easy’ item would capture individuals with low to high levels of accepting attitudes, while a ‘difficult’ item would be endorsed only by individuals with high levels of attitudes of acceptance (or low levels of racist and ethnocentric attitudes).
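The S-shaped curve described above follows directly from the logistic form of the Rasch model. A minimal illustration, using the dichotomous Rasch model for brevity (the RSM used in this study generalises this to rating-scale categories):

```python
import numpy as np

def icc(theta, b):
    """Item Characteristic Curve for a dichotomous Rasch item:
    probability of endorsement as a function of trait level theta,
    given item 'difficulty' b (illustrative; RACES uses the polytomous RSM)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

theta = np.linspace(-4, 4, 9)
easy, hard = icc(theta, b=-1.5), icc(theta, b=1.5)
# The 'easy' item's curve sits to the left: at any trait level it is
# more likely to be endorsed than the 'difficult' item.
```

Shifting the difficulty parameter `b` moves the whole curve left or right along the trait axis, which is exactly the "easiest to endorse" versus "most difficult to endorse" ordering discussed above.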

Depending on the purpose of the test, it may be important to have most items with high (e.g., measures of psychopathology) or low (e.g., measures of intellectual impairment) ‘difficulty’ levels. Within any test or scale intended for an average population, items need to be of varying ‘difficulty’. These figures illustrate that each RACES subscale contains items ranging from ‘easy’ to endorse to ‘difficult’ to endorse. If utilised as an entire multidimensional scale, RACES contains items that provide information about and can discriminate between individuals from low to high on the latent trait. As RACES was designed for use with a normal (i.e., average) population (versus highly racist or highly accepting), the ICCs of each of the subscales would be appropriate if utilised in combination. Items from each of the subscales performed similarly across each of primary school children, adolescents, and adults.

The ESC shows the expected score given the trait level of an individual and enables an analysis of general fit. The leftmost ESCs are the ‘easiest’ items and the rightmost the most ‘difficult’ items. These figures illustrate that many of the RACES items across each subscale performed as predicted by the underlying model. Importantly, items from the subscales performed similarly across each of primary school children, adolescents, and adults.
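Under the RSM, the expected score for an item is the probability-weighted average of the rating-scale categories. A sketch with hypothetical threshold values (a five-point response format gives categories 0–4 and four thresholds; the numbers below are illustrative only):

```python
import numpy as np

def rsm_expected_score(theta, delta, taus):
    """Expected item score under the Rating Scale Model (RSM).
    theta: person trait level; delta: item location;
    taus: shared category thresholds (hypothetical values below).
    Category k's log-probability is the cumulative sum of (theta - delta - tau_j)."""
    logits = np.concatenate(([0.0], np.cumsum(theta - delta - np.asarray(taus))))
    probs = np.exp(logits - logits.max())   # softmax over categories, numerically stable
    probs /= probs.sum()
    return float(np.dot(np.arange(len(probs)), probs))

# Five-point response format -> categories 0..4 and four thresholds.
taus = [-1.5, -0.5, 0.5, 1.5]               # illustrative, symmetric thresholds
scores = [rsm_expected_score(t, 0.0, taus) for t in (-2.0, 0.0, 2.0)]
```

The expected score rises monotonically with the trait level, which is what the ESC plots display; observed average scores that deviate systematically from this curve flag item misfit.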

The Item Information Curve (IIC), or Item Information Function (IIF), shows the range in which an item best discriminates among individuals of a certain trait level. However, the TIC/TIF better represents the data, as it provides an illustrative summary of the combined information for all items on each subscale. Like the IIC/IIF, the TIC/TIF shows the range in which an overall test best discriminates among individuals of a certain trait level. Higher information denotes more precision (or reliability) for measuring a person’s trait level. The TIC/TIF for each Community data set subscale is shown in Fig. 3 below.

The uppermost line represents the AAS, the middle line the RAS, and the lowest line the EAS. As illustrated, each RACES subscale on its own generally contains items that provide information about, and can discriminate between, individuals at only a low, moderate, or high level of the latent trait. Nonetheless, if utilised as an entire multidimensional scale, RACES contains items that provide information enabling discrimination between individuals from low to high on the latent trait. As RACES was designed for use with a normal population, the TIC/TIFs of each of the subscales are appropriate when utilised in combination. Importantly, the subscales performed similarly across each of primary school children, adolescents, and adults (Fig. 3).

Fig. 3

Community data set subscale TIFs. The uppermost line represents the AAS, the middle line the RAS, and the lowermost line the EAS. The TIF shows the range in which each subscale provides the most information, that is, the trait level at which the subscale best discriminates among individuals. The left of the latent trait axis represents individuals low on the latent trait and the right represents individuals high on the latent trait

Multidimensional model fit

The underlying structure of RACES as a multidimensional scale was examined using multidimensional RSM analysis, to assess the between-item multidimensionality of RACES with the aforementioned three-subscale structure (i.e., 12-item AAS, 8-item RAS, and 4-item EAS). Data for the Primary School, Community, and 15–20 years data sets were collated in ACER ConQuest 3.0. An RSM analysis was undertaken for each data set for the overall 24-item scale. Model fit was assessed utilising the recommended criteria, as previously described. For each data set the χ2 statistic indicated a poor fit for the total RACES (Primary School: χ2(21) = 314.79, p < .001; 15–20 years: χ2(21) = 155.43, p < .001; Community: χ2(21) = 323.94, p < .001). Moreover, several items across data sets were of less than ideal fit when Standardised values were examined. However, due to the sensitivity of these indices to sample size, other model fit indices were examined.

One item (“I don’t ignore people because of their background”) was of less than ideal Infit and Outfit for the Primary School data set. One item (“People from some backgrounds get more than they deserve”) had undesirable Infit and Outfit for the 15–20 years data set; two further items (“I get upset if I hear racist comments about any background” and “I accept people from all backgrounds”) had less than ideal Outfit. For the Community data set, one item (“People from some backgrounds get more than they deserve”) had undesirable Infit and Outfit, and one further item (“I get upset if I hear racist comments about any background”) had less than ideal Outfit. All other items across the data sets were of acceptable Infit and Outfit and no items were considered to degrade the measurement system (as per [53]). The multidimensional model fit statistics are displayed in Table 6 below.

Table 6 Multidimensional model fit indices for RACES subscales

Multidimensional scale information

Graphical representations of the data illustrate item and combined total scale characteristics (additional figures displaying the Community data set are presented in Additional file 1, which is available online; all other figures are available upon request from the lead author). The ICCs illustrate that the multidimensional RACES contains items ranging from ‘easy’ to endorse to ‘difficult’ to endorse. These items performed similarly across each of primary school children, adolescents, and adults. The ESCs illustrate that many of the RACES items performed as predicted by the underlying multidimensional model. These items performed similarly for each of primary school children, adolescents, and adults.


The aim of the project reported in this article was to refine and validate an attitudinal measure of racial, ethnic, cultural, and religious acceptance, for use as a proxy to quantify racist attitudes (see [36]). The end goal was to produce an instrument for use in community-wide anti-racism and pro-diversity initiatives, to assist in evaluating, refining, and improving their effectiveness, so as to contribute to programs that reduce racism and increase acceptance of difference throughout Australia. It was hoped that, in turn, inequities in health outcomes across Australia’s diverse racial, ethnic, cultural, and religious groups could be redressed.

Poorly designed or insufficient attempts to reduce racism can intensify racist attitudes [84, 85]. Because of this, it is crucial for racism-reduction interventions to be based on a sound theoretical framework, as demonstrated by decades of research [9, 84, 86–89]. Yet a recent review of 50 years of diversity training demonstrated that in most cases programs are judged effective according to the number of people trained, rather than by any accurate evaluation of their efficacy [90]. Without appropriate evaluation and demonstration of their efficacy, anti-racism and pro-diversity programs cannot be widely disseminated and are therefore neither meaningful nor useful to the community at large.

A principal concern in developing and validating RACES was the lack of confidence in the capability of existing instruments to capture the varied forms of racism experienced by individuals of diverse groups in Australia. This is essential, as distinct groups often report diverse aspects and dissimilar experiences of racism and discrimination [91]. By adopting a comprehensive process to develop and validate RACES, the measure can be used with multiple groups across the lifespan.

The present research demonstrated the robust reliability and validity of RACES, confirming the utility of the measure. Overall, RACES has a number of key advantages as a measure of racist attitudes in Australia. RACES was developed for, and validated in, the contemporary Australian social context, with previous development phases ensuring that the items were based on real experiences, understandings, and conceptualisations, utilising a mixed-methods approach. This contrasts with many measures that draw on secondary data or uncritically re-word or adapt existing scales and rely solely upon quantitative methods. Unlike any existing measure of racist attitudes, RACES was assessed and refined utilising both CTT and IRT, giving greater confidence in its factorial validity. The Rasch analyses support the overall factorial and construct validity of the 24-item RACES across primary school children, adolescents, and adults, and indicate that RACES is a reliable three-dimensional scale of Accepting Attitudes (12 items), Racist Attitudes (8 items), and Ethnocentric Attitudes (4 items). RACES also provides information about, and discriminates between, individuals across the range of the latent traits of racism, acceptance, and cultural-ethnocentrism. Finally, in contrast to previous measures of racism in Australia, RACES was designed for assessing attitudes towards all racial, ethnic, cultural, and religious groups and has been shown to be reliable and valid across children, adolescents, and adults.


Although participants were sought from around Australia and across the range of adolescent and adult ages for the Community data set, the sample was predominantly from Victoria and the average age was quite young, limiting the generalisability of the results. Minimum sample sizes for factor analysis and other analyses were met, but replication with additional data from larger samples would enhance confidence in the results. Invalid responses may also have biased the results, although inspection of removed cases revealed that most missing data were from the latter parts of the survey, suggesting that technical difficulties led to participant non-completion rather than reflecting characteristics of the participants. Some scale characteristics were less than ideal (e.g., fit indices) and therefore require confirmation with alternate populations. We did not remove items based on stringent cut-points due to the limited sample available, so there is the potential that the findings are an artefact of the particular participants, reinforcing the need for replication. Finally, although strong consistency was found across age groups, results were based on an unbalanced overall scale (i.e., 12, 8, and 4 items), which may bias findings utilising the total scale score. Moreover, the failure of the 21+ years data set to meet minimum requirements for independent analysis casts some doubt on the uniformity found across age groups and hence requires further exploration. The brief length of the EAS also raises some concern, given the potential for short tests to yield less accurate estimation in Rasch models [92, 93], although other research has demonstrated the accuracy of Rasch estimation for tests as short as five items [92, 94].

Implications for practice

Prior to its wide dissemination to evaluate anti-racism and pro-diversity initiatives, future research is needed to confirm the psychometric properties of the new measure in alternate contexts and populations. Regardless, RACES has significant advantages over existing tools. RACES can be used to: a) evaluate the relationship between racism and other variables, b) track changes in racist attitudes over time, c) compare racist attitudes across groups, and d) evaluate the effect of anti-racism or pro-diversity initiatives. If the robust validity of the measure is confirmed in prospective research, potential gender, SES, and other demographic differences might be explored, thus enhancing our understanding of racism in Australia. The most important use of RACES is its potential to assess the effectiveness of racism-reduction programs, by evaluating the attitudes of participants before and after intervention. Such evaluation would provide a strong evidence base for initiatives to be developed, refined, and extended to reduce community levels of racism. Because its development stages predominantly involved youth, RACES has particular potential for effective use with school- and other youth-based initiatives.


Racism is a significant challenge in contemporary Australian society because of its potential for significant negative impact on a range of health, social, psychological, and economic outcomes for the diverse racial, ethnic, cultural, and religious groups within Australia. Various interventions have attempted to reduce racism, increase acceptance of diversity, and address health inequities. However, confident conclusions about the effectiveness of such initiatives have not been possible, owing to the absence of validated and standardised measures of racism appropriate for the diverse Australian population. The present project aimed to redress this issue and answer the appeals of previous researchers by working to inform developmentally targeted racism-reduction programs. RACES was designed to evaluate such initiatives, and early validity findings offer solid foundations for, and confidence in, the instrument. Although follow-up work is needed, RACES can be employed in a meaningful and useful manner to assist with the evaluation, and consequent targeted improvement, of innovative intervention programs for populations across the lifespan. Such appraisals would provide a strong evidence base for initiatives to reduce community levels of racism and, in turn, inequities in health outcomes across all racial, ethnic, cultural, and religious groups within Australia.


  1. Internal consistency reliability demonstrates that each item relates to each other item in the scale.

  2. Factorial validity demonstrates that the identified factor structure is valid in respect to the underlying theoretical model.

  3. Construct validity is an overall measure of validity that encompasses all other forms of reliability and validity. Construct validity demonstrates that the instrument measures what it purports to measure; in other words, the measure performs as expected according to the overarching theory on which it is based.

  4. Convergent validity demonstrates that the measure is related to concepts it would be expected to be related to, or alternatively that results from two groups which would be expected to have similar results are related.

  5. Discriminant validity demonstrates that the measure is unrelated to concepts it would be expected to be unrelated to, or alternatively that results from two groups which would be expected to have different results are different.


  1. Williams DR, Collins C. US socioeconomic and racial differences in health: patterns and explanations. Annu Rev Sociol. 1995;21(1):349–86. doi:10.1146/

  2. Phelan JC, Link BG. Is racism a fundamental cause of inequalities in health? Annu Rev Sociol. 2015;41(1). doi:10.1146/annurev-soc-073014-112305.

  3. Kurzban R, Leary MR. Evolutionary origins of stigmatization: the functions of social exclusion. Psychol Bull. 2001;127:187–208.

  4. Clark R, Anderson NB, Clark VR, Williams DR. Racism as a stressor for African Americans: a biopsychosocial model. Am Psychol. 1999;54:805–16.

  5. Paradies YC. A systematic review of empirical research on self-reported racism and health. Int J Epidemiol. 2006;35:888–901. doi:10.1093/ije/dyl056.

  6. Williams DR, Neighbors HW, Jackson JS. Racial/ethnic discrimination and health: findings from community studies. Am J Public Health. 2008;98(2):29–37.

  7. Cunningham J, Paradies YC. Patterns and correlates of self-reported racial discrimination among Australian Aboriginal and Torres Strait Islander adults, 2008–09: analysis of national survey data. Int J Equity Health. 2013;12(1):47. doi:10.1186/1475-9276-12-47.

  8. American Psychological Association. Final report of the APA delegation to the “UN world conference against racism, racial discrimination, xenophobia, and related intolerance (WCAR)”. Washington, DC: Author; 2004.

  9. Sanson A, Augoustinos M, Gridley H, Kyrios M, Reser J, Turner C. Racism and prejudice: an Australian Psychological Society position paper. Aust Psychol. 1998;33(3):161–82. doi:10.1080/00050069808257401.

  10. Bastian B, Jetten J, Chen H, Radke HRM, Harding JF, Fasoli F. Losing our humanity: the self-dehumanizing consequences of social ostracism. Pers Soc Psychol Bull. 2013;39(2):156–69. doi:10.1177/0146167212471205.

  11. Branscombe N, Schmitt M, Harvey R. Perceiving pervasive discrimination among African Americans: implications for group identification and well-being. J Pers Soc Psychol. 1999;77(1):135–49. doi:10.1037/0022-3514.77.1.135.

  12. Devine PG. Stereotypes and prejudice: their automatic and controlled components. J Pers Soc Psychol. 1989;56(1):5–18. doi:10.1037/0022-3514.56.1.5.

  13. Quillian L. New approaches to understanding racial prejudice and discrimination. Annu Rev Sociol. 2006;32(1):299–328. doi:10.1146/annurev.soc.32.061604.123132.

  14. Bastos J. Racial discrimination and health: a systematic review of scales with a focus on their psychometric properties. Soc Sci Med. 2011;1(2):4–13. doi:10.1016/j.socscimed.2009.12.020.

  15. Harrell J, Hall S, Taliaferro J. Physiological responses to racism and discrimination: an assessment of the evidence. Am J Public Health. 2003;93(2):243–8.

  16. Pascoe EA, Richman LS. Perceived discrimination and health: a meta-analytic review. Psychol Bull. 2009;135:531–54. doi:10.1037/a0016059.

  17. Quillian L, Cook KS, Massey DS. New approaches to understanding racial prejudice and discrimination. Annu Rev Sociol. 2006;32:299–328. doi:10.1146/annurev.soc.32.061604.123132.

  18. Grigg K. Development and validation of the Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): measuring racism in Australia (Doctoral dissertation). 2014.

  19. Pedersen A, Beven J, Walker I, Griffiths B. Attitudes toward Indigenous Australians: the role of empathy and guilt. J Community Appl Soc Psychol. 2004;14(4):233–49. doi:10.1002/casp.771.

  20. Walker I. The changing nature of racism: from old to new? In: Augoustinos M, Reynolds KJ, editors. Understanding the psychology of prejudice and racism. London: Sage; 2001. p. 24–41.

  21. Dunn KM, Geeraert P. The geography of “race” and racisms. GeoDate. 2003;16(3):1–6.

  22. Walker I, Crogan M. Academic performance, prejudice, and the jigsaw classroom: new pieces to the puzzle. J Community Appl Soc Psychol. 1998;8(6):381–93. doi:10.1002/(sici)1099-1298(199811/12)8:6<381::aid-casp457>3.0.co;2-6.

  23. Doyle AB, Aboud FE. A longitudinal study of White children’s racial prejudice as a social-cognitive development. Merrill-Palmer Q. 1995;41(2):209–28.

  24. White FA, Gleitzman M. An examination of family socialisation processes as moderators of racial prejudice transmission between adolescents and their parents. J Fam Stud. 2006;12(2):247–60. doi:10.5172/jfs.327.12.2.247.

  25. White FA, Abu-Raya HM. A Dual Identity-Electronic Contact (DIEC) experiment promoting short- and long-term intergroup harmony. J Exp Soc Psychol. 2012;48:597–608. doi:10.1016/j.jesp.2012.01.007.

  26. Tuffin K. Racist discourse in New Zealand and Australia: reviewing the last 20 years. Soc Personal Psychol Compass. 2008;2:591–607. doi:10.1111/j.1751-9004.2007.00071.x.

  27. McGrane JA, White FA. Differences in Anglo and Asian Australians’ explicit and implicit prejudice and the attenuation of their implicit in-group bias. Asian J Soc Psychol. 2007;10(3):204–10. doi:10.1111/j.1467-839X.2007.00228.x.

  28. McKown C. Age and ethnic variation in children’s thinking about the nature of racism. J Appl Dev Psychol. 2004;25:597–617. doi:10.1016/j.appdev.2004.08.001.

  29. Verkuyten M, Kinket B, van der Wielen C. Preadolescents’ understanding of ethnic discrimination. J Genet Psychol. 1997;158(1):97–112.

  30. Pedersen A, Dunn K, Forrest J, McGarty C. Prejudice and discrimination from two sides: how do Middle-Eastern Australians experience it and how do other Australians explain it? Journal of Pacific Rim Psychology. 2012;6(1):18–26. doi:10.1017/prp.2012.3.

  31. Grigg K, Manderson L. “Just a joke”: young Australian understandings of racism. International Journal of Intercultural Relations. 2015;17:195–208. doi:10.1016/j.ijintrel.2015.06.006.

  32. McCreanor T. Pakeha ideology of Maori performance: a discourse analytic approach to the construction of educational failure in Aotearoa/New Zealand. Folia Linguistica. 1993;27(3–4):293–314. doi:10.1515/flin.1993.27.3-4.293.

  33. Duckitt J. Psychology and prejudice: an historical analysis and integrative framework. Am Psychol. 1992;47:1182–93.

  34. Pedersen A, Clarke S, Dudgeon P, Griffiths B. Attitudes toward Indigenous Australians and asylum seekers: the role of false beliefs and other social-psychological variables. Aust Psychol. 2005;40(3):170–8. doi:10.1080/00050060500243483.

  35. Dunn KM, Burnley I, McDonald A. Constructing racism in Australia. Australian Journal of Social Issues. 2004;39:409–30.

  36. Grigg K, Manderson L. Developing the Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES). The Australian Developmental and Educational Psychologist. 2015;32(1):71–87. doi:10.1017/edp.2015.7.

  37. City of Casey Council. City of Casey population forecasts. 2012.

  38. Cardinia Shire Council. Cardinia Shire population forecasts. 2012.

  39. Australian Bureau of Statistics. 3222.0 – Population projections, Australia, 2006 to 2101. 2008.

  40. Eccles JS, Midgley C, Wigfield A, Buchanan CM, Reuman D, Flanagan C, et al. Development during adolescence: the impact of stage-environment fit on young adolescents’ experiences in schools and in families. Am Psychol. 1993;48(2):90–101. doi:10.1037/0003-066X.48.2.90.

  41. Erikson E. Childhood and society. New York: W. W. Norton & Company; 1963.

  42. Arnett JJ. Emerging adulthood: a theory of development from the late teens through the twenties. Am Psychol. 2000;55:469–80. doi:10.1037/0003-066X.55.5.469.

  43. Nesdale D. The development of ethnic prejudice in early childhood: theories and research. In: Saracho O, Spodek B, editors. Contemporary perspectives on socialization and social development in early childhood education. Charlotte: Information Age Publishing; 2007. p. 213–40.

  44. Rasch G. Probabilistic models for some intelligence and attainment tests. Expanded ed. Chicago: The University of Chicago Press; 1980.

  45. Furr RM, Bacharach VR. Psychometrics: an introduction. Thousand Oaks: Sage; 2008.

  46. Prieto L, Alonso J, Lamarca R. Classical test theory versus Rasch analysis for quality of life questionnaire reduction. Health Qual Life Outcomes. 2003;1(1):27–40. doi:10.1186/1477-7525-1-27.

  47. Reise SP, Ainsworth AT, Haviland MG. Item response theory: fundamentals, applications, and promise in psychological research. Curr Dir Psychol Sci. 2005;14(2):95–101. doi:10.1111/j.0963-7214.2005.00342.x.

  48. DeVellis R. Scale development: theory and applications. 3rd ed. Thousand Oaks: Sage; 2012.

  49. Takane Y, de Leeuw J. On the relationship between item response theory and factor analysis of discretized variables. Psychometrika. 1987;52(3):393–408. doi:10.1007/bf02294363.

  50. Embretson SE, Hershberger SL. The new rules of measurement. Hillsdale: Lawrence Erlbaum Associates; 1999.

  51. Goodman R. The strengths and difficulties questionnaire: a research note. J Child Psychol Psychiatry. 1997;38:581–6. doi:10.1111/j.1469-7610.1997.tb01545.x.

  52. Loney BR, Taylor J, Butler MA, Iacono WG. Adolescent psychopathy features: 6-year temporal stability and the prediction of externalizing symptoms during the transition to adulthood. Aggress Behav. 2007;33(3):242–52. doi:10.1002/ab.20184.

  53. Linacre JM. What do Infit and Outfit, mean-square and standardized mean? Rasch Measurement Transactions. 2002;16:878.

  54. Linacre JM. Winsteps® Rasch measurement computer program user’s guide. Beaverton; 2006.

  55. Nowlis SM, Kahn BE, Dhar R. Coping with ambivalence: the effect of removing a neutral option on consumer attitude and preference judgments. J Consum Res. 2002;29:319–34. doi:10.1086/344431.

  56. Grigg K, Manderson L. Building Harmony: measuring and reducing racism in Australian schools. The Australian Community Psychologist. 2014;26(2):68–89.

  57. Grigg K, Manderson L. Is there a relationship between psychopathic traits and racism? Current Psychology. Advance online publication. 2014. doi:10.1007/s12144-014-9283-9.

  58. Edelen MO, Reeve BB. Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Qual Life Res. 2007;16:5–18. doi:10.1007/s11136-007-9198-0.

  59. Wright BD. Model selection: Rating Scale Model (RSM) or Partial Credit Model (PCM)? Rasch Measurement Transactions. 1998;12:641–2.

  60. Linacre JM. Comparing “Partial Credit Models” (PCM) and “Rating Scale Models” (RSM). Rasch Measurement Transactions. 2000;14:768.

  61. Andrich D. A rating formulation for ordered response categories. Psychometrika. 1978;43:561–73. doi:10.1007/BF02293814.

  62. Engelhard G. Item Response Theory (IRT) models for rating scale data. In: Everitt BS, Howell D, editors. Encyclopedia of statistics in behavioral science. New York: Wiley; 2005.

  63. Wang WC, Yao G, Tsai YJ, Wang JD, Hsieh CL. Validating, improving reliability, and estimating correlation of the four subscales in the WHOQOL-BREF using multidimensional Rasch analysis. Qual Life Res. 2006;15:607–20. doi:10.1007/s11136-005-4365-7.

  64. Linacre JM. Understanding Rasch measurement: optimizing rating scale category effectiveness. J Appl Meas. 2002;3(1):85–106.

  65. Childs RA, Oppler SH. Implications of test dimensionality for unidimensional IRT scoring: an investigation of a high-stakes testing program. Educ Psychol Meas. 2000;60:939–55. doi:10.1177/00131640021971005.

  66. Maydeu-Olivares A. Further empirical results on parametric versus non-parametric IRT modeling of Likert-type personality data. Multivar Behav Res. 2005;40(2):261–79.

  67. Reeve BB, Fayers P. Applying item response theory modelling for evaluating questionnaire item and scale properties. In: Assessing quality of life in clinical trials: methods and practice. London: Oxford University Press; 2005.

  68. Yu CH, Popp SO, DiGangi S, Jannasch-Pennell A. Assessing unidimensionality: a comparison of Rasch Modeling, Parallel Analysis, and TETRAD. Practical Assessment, Research & Evaluation. 2007;12(14).

  69. Tabachnick BG, Fidell LS. Using multivariate statistics. 5th ed. Boston: Allyn & Bacon; 2007.

  70. Hu L-T, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal. 1999;6(1):1–55. doi:10.1080/10705519909540118.

  71. Hu L-T, Bentler PM. Evaluating model fit. In: Hoyle RH, editor. Structural equation modeling: concepts, issues, and applications. Thousand Oaks: Sage; 1995.

  72. Browne MW, Cudeck R. Alternative ways of assessing model fit. In: Bollen KA, Long JS, editors. Testing structural equation models. Newbury Park: Sage; 1993. p. 136–62.

  73. Marsh HW, Hau KT. Assessing goodness of fit: is parsimony always desirable? J Exp Educ. 1996;64:364–90.

  74. Kline RB. Principles and practices of structural equation modeling. 2nd ed. New York: Guilford Press; 2004.

  75. Nunnally JC. Psychometric theory. 2nd ed. New York: McGraw Hill; 1978.

  76. Rauch DP, Hartig J. Multiple-choice versus open-ended response formats of reading test items: a two-dimensional IRT analysis. Psychological Test and Assessment Modeling. 2010;52(4):354–79.

  77. Wright BD, Stone M. Best test design: Rasch measurement. Chicago: MESA Press; 1979.

  78. Wright BD, Linacre JM. Reasonable mean-square fit values. Rasch Measurement Transactions. 1994;8:370.

  79. Schriesheim CA, Hill KD. Controlling acquiescence response bias by item reversals: The effect on questionnaire validity. Educ Psychol Meas. 1981;41:1101–14. doi:10.1177/001316448104100420.

    Article  Google Scholar 

  80. Schweizer K, Schreiner M. Avoiding the effect of item wording by means of bipolar instead of unipolar items: An application to social optimism. Eur J Personal. 2010;24(2):137–50. doi:10.1002/per.748.

    Google Scholar 

  81. Jackman MR. Education and prejudice or education and response-set? Am Sociol Rev. 1973;38:327–39. Retrieved from

    CAS  Article  PubMed  Google Scholar 

  82. Pittinsky TL, Rosenthal SA, Montoya RM. Liking is not the opposite of disliking: The functional separability of positive and negative attitudes toward minority groups. Cult Divers Ethn Minor Psychol. 2011;17(2):134–43. doi:10.1037/a0023806.


  83. Phillips ST, Ziller RC. Toward a theory and measure of the nature of nonprejudice. J Pers Soc Psychol. 1997;72:420–34. doi:10.1037/0022-3514.72.2.420.


  84. Hewstone M, Swart H. Fifty-odd years of inter-group contact: From hypothesis to integrated theory. Br J Soc Psychol. 2011;50:374–86. doi:10.1111/j.2044-8309.2011.02047.x.


  85. Ray JJ. Racial attitudes and the contact hypothesis. J Soc Psychol. 1983;119(1):3–10. doi:10.1080/00224545.1983.9924435.


  86. Pedersen A, Walker I, Paradies Y, Guerin B. How to cook rice: a review of ingredients for teaching anti-prejudice. Aust Psychol. 2011;46(1):55–63. doi:10.1111/j.1742-9544.2010.00015.x.


  87. Greco T, Priest N, Paradies Y. Review of strategies and resources to address race-based discrimination and support diversity in schools. Carlton: Victorian Heath Promotion Foundation; 2010.


  88. Allport G. The nature of prejudice. New York: Basic Books; 1954.


  89. Cotton K. Fostering intercultural harmony in schools: research findings. School improvement research series. Portland: Office of Educational Research and Improvement; 1993.


  90. Anand R, Winters M. A retrospective view of corporate diversity training from 1964 to the present. Academy of Management Learning & Education. 2008;7(3):356–72.


  91. Shariff-Marco S, Breen N, Landrine H, Reeve BB, Krieger N, Gee GC, et al. Measuring everyday racial/ethnic discrimination in health surveys. Du Bois Review. 2011;8(1):159–77. doi:10.1017/s1742058x11000129.


  92. Toland M. Determining the accuracy of item parameter standard error of estimates in BILOG-MG 3 (Doctoral dissertation). 2008.


  93. Kirisci L, Hsu T, Yu L. Robustness of item parameter estimation programs to assumptions of unidimensionality and normality. Appl Psychol Meas. 2001;25(2):146–62. doi:10.1177/01466210122031975.


  94. Drasgow F. An evaluation of marginal maximum likelihood estimation for the two-parameter logistic model. Appl Psychol Meas. 1989;13(1):77–90. doi:10.1177/014662168901300108.




Acknowledgements

The authors would like to acknowledge the support of Windermere Child and Family Services and Professor James Ogloff throughout the research, in addition to the statistical advice of Professor Grahame Coleman.

Author information



Corresponding author

Correspondence to Kaine Grigg.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

KG conceived the research, the design of the study, coordinated and carried out the studies, performed the statistical analyses and interpreted same, and drafted and revised the manuscript. LM participated in the conception of the research, the design and coordination of the study, the interpretation of the results, and the drafting and revision of the manuscript. Both authors read and approved the final manuscript.

Authors’ information

KG has completed a Doctor of Psychology in Clinical Psychology, specialising in Forensic Psychology, at Monash University. He also holds a Bachelor of Social Science (Psychology) with Distinction and a Bachelor of Applied Science (Psychology) (Honours 1) from RMIT University. As a clinician, KG has worked with a broad range of age groups and populations across diverse settings, focussing on delivering clinical and forensic psychological services. KG has worked in residential alcohol and other drug withdrawal units, community health services, forensic mental health hospitals, prisons and juvenile detention facilities, and in court assessment clinics. As an academic, KG has worked in tutoring, teaching associate, and research officer roles, attaining a grounded understanding of research-based practice. His research interests involve cross-cultural interactions: he has previously studied the negative mental health effects of perceived racism on Australians of diverse racial backgrounds, and his Doctoral research developed an Australian attitudinal measure of racial, ethnic, cultural, and religious acceptance (Racism, Acceptance, and Cultural-Ethnocentrism Scale; RACES), which can be utilised as a proxy measure of racist attitudes.

LM is Professor of Public Health and Medical Anthropology in the School of Public Health, The University of the Witwatersrand, Johannesburg, South Africa. She joined Wits as a member of faculty in 2014; previously, from 2004–2013, she was an honorary professor and, in 2008, Hillel Friedland Senior Fellow at the university. She also holds appointment as Visiting Distinguished Professor, Institute at Brown for Environment and Society, and Visiting Professor of Anthropology, Brown University, Providence, RI, USA. In addition, she is an honorary professor at Khon Kaen University, Thailand, and adjunct professor in both the School of Social Sciences and the School of Psychological Sciences at Monash University.

Additional file

Additional file 1:

The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale Appendix. Appendix of additional figures not included in main text. (DOCX 906 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Grigg, K., Manderson, L. The Australian Racism, Acceptance, and Cultural-Ethnocentrism Scale (RACES): item response theory findings. Int J Equity Health 15, 49 (2016).



  • Australia
  • Racism
  • Scale
  • Item Response Theory
  • Rasch analysis