PRINTED FROM OXFORD HANDBOOKS ONLINE (www.oxfordhandbooks.com). © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

# Measuring Group Consciousness: Actions Speak Louder Than Words

## Abstract and Keywords

Although group consciousness is an important concept in explaining political behavior, both theoretical guidance on how to measure group consciousness and empirical consensus regarding its operationalization are lacking. This has the potential to lead to both diverging results and inaccurate empirical conclusions, which greatly limits the ability to understand the role that group consciousness plays in politics. Using data from Pew’s 2013 “Survey of LGBT Americans,” this analysis provides a foundation for measuring group consciousness using item response theory (IRT). Through an examination of dimensionality, monotonicity, model fit, and differential item functioning, the results demonstrate that many assumptions about measuring group consciousness have been incorrect. Further, the findings suggest that previous conclusions about subgroup differences may be the result of survey bias, rather than actual between-group differences. Moving forward, scholars of political behavior should use IRT to measure latent constructs.

# Introduction

Group consciousness is an important concept in explaining a variety of political factors, ranging from conceptions of group identity (Smith 2004), to adherence to group norms (Huddy 2001), to political participation (Gurin, Miller, and Gurin 1980; Miller, Gurin, Gurin, and Malanchuk 1981; Shingles 1981; Stokes 2003; Sanchez 2006a, 2006b), to partisanship (Highton and Kam 2011; Wallace et al. 2009; Kidd et al. 2007; Welch and Foster 1992; Abramowitz and Saunders 2006), to public opinion (Gurin 1985; Sanchez 2006a; Conover 1984, 1988; Conover and Feldman 1984; Conover and Sapiro 1993). Given the large body of evidence demonstrating the power of group consciousness in explaining political outcomes, one would expect a multitude of well-tested and statistically valid measures of group consciousness to be available to researchers. This is not the case, however, as we lack both theoretical guidance on how to measure group consciousness and empirical consensus surrounding its operationalization. In short, political scientists spend a great deal of time discussing group consciousness and how it should be defined, but almost no time examining how it should be measured. This chapter attempts to bridge this gap between conceptualization and measurement by using item response theory (IRT) to demonstrate how group consciousness should be quantified for analytical purposes. Using IRT to measure group consciousness is a major advancement for political science, as it has stronger theoretical measurement principles and a greater capacity to solve measurement problems than conventional measurement methods do (Lord 1980; Hambleton, Swaminathan, and Rogers 1991; Embretson and Reise 2000, 2013; Baker and Kim 2004; van der Linden and Hambleton 1997).

Through IRT, this analysis also speaks to a larger issue in political science, which involves the proliferation of measurement strategies that are not empirically based. Although I focus specifically on group consciousness, this methodology could, and should, extend to most concepts relating to political behavior, such as political knowledge (Carpini and Keeter 1993; Mondak 2001; Jerit, Barabas, and Bolsen 2006; Abrajano 2015), political participation (Gillion 2009; Harris and Gillion 2012), legislative significance and accomplishment (Clinton and Lapinski 2006), and tolerance of ethnic minorities (Weldon 2006), which all have the potential to capture dozens of different, yet related, ideas. Similar to group consciousness, although these constructs may appear relatively conceptually straightforward, empirical evidence suggests that they are potentially quite difficult to accurately measure. This is especially problematic because our current measurement strategies for quantifying these concepts are murky at best and nonexistent at worst. This not only leads to diverging results and conclusions, but also inhibits scholars of political behavior from forming consensus measures that could validate theoretical results. Consequently, without methodologically validated measures of our constructs, it is impossible to determine if our empirical results are accurate or are simply the result of inappropriate measurement strategies; differential item functioning (DIF), which occurs when a survey contains items that are biased for various subpopulations; or a combination of both factors.

To examine the measurement of group consciousness, I rely on the Pew Research Center’s “Survey of LGBT Americans” (2013). This survey provides data on the increasingly important, yet consistently understudied, lesbian, gay, bisexual, and transgender (LGBT) community. The diversity of this sample is particularly important, as it contains a wide variety of sexual orientations, racial and ethnic minorities, age groups, income groups, and education categories, which allows this analysis to test for the impact of subgroup membership on measuring group consciousness. Further, it provides the first examination of group consciousness outside the racial and ethnic context by including the politically important and undertheorized LGBT community.

# What Is Group Consciousness?

The concept of group consciousness combines in-group politicized identity with a set of ideas about a group’s relative status and strategies for improving it (Jackman and Jackman 1973; Gurin, Miller, and Gurin 1980; Miller, Gurin, and Gurin 1981; Chong and Rogers 2005; McClain et al. 2009). It is thought to structure the value and meaning of group identity for minority communities (Smith 2004) and is often conceived of as multidimensional, including components such as self-identification, a sense of dissatisfaction with the status of the group, identity importance, and identity attachment (Gurin, Miller, and Gurin 1980; Miller, Gurin, and Gurin 1981; Ashmore, Deaux, and McLaughlin-Volpe 2004; Chong and Rogers 2005). Scholars argue that political consciousness is a driving force in the political behavior of minorities by providing group members with both a “need to act” and a “will to act” (Gamson 1968, 48). To summarize, group consciousness is generally defined as a multidimensional and complex concept relating to a person’s political awareness of his or her group label (Stryker 1980; Tajfel 1981, 1982; Turner et al. 1987; Ashmore, Deaux, and McLaughlin-Volpe 2004). Because operationalizations shift across fields and range from interpersonal processes to aggregate-level products of political action (Brubaker and Cooper 2000), this analysis focuses on the four distinct conceptual factors that are most relevant: (1) self-categorization, (2) evaluation, (3) importance, and (4) attachment (Ashmore, Deaux, and McLaughlin-Volpe 2004).

## Self-Categorization

Self-categorization refers to the first step in developing group consciousness, as it represents identification as a member of a particular social group (Deaux 1996; Ashmore, Deaux, and McLaughlin-Volpe 2004). It is the precondition for all other dimensions of group consciousness, because one cannot express pride or importance in an identity that one does not self-identify with (Phinney 1991). Research consistently demonstrates the power of self-categorization, with even arbitrary group labels eliciting powerful in-group favoritism among group members (Brewer 1979; Diehl 1989; Tajfel 1982). In this analysis, self-categorization captures the degree to which LGBT persons think of themselves as gay and the extent to which they locate their identities within the gay community. Outwardly labeling oneself as gay is a fundamental part of this process, often referred to as “coming out.” When an LGBT person comes out, he or she explicitly signals to the outside world that he or she categorizes his or her identity in terms of his or her gayness and that public recognition of this identity is important. Consequently, as persons increasingly outwardly label themselves as LGBT, they indicate a heightened level of self-categorization, signaling higher levels of group consciousness.

All participants in Pew’s 2013 “Survey of LGBT Americans” self-identify as LGBT, because this was a prerequisite for participation in the survey.1 However, the survey also contains a question related to “being out,” or the extent to which a respondent publicly self-identifies with the LGBT label. Table 17.1 summarizes the self-categorization item, including a description of the question and response rates for each category. It demonstrates that the LGBT community reports varying levels of self-categorization: a majority (57%) of respondents report that they are out to all or most of the important people in their lives, while smaller shares report that they are out to only some of them (21%) or only a few of them (16%). A minority of respondents (6%) reported that none of the most important people in their lives are aware of their LGBT identity.

Table 17.1 Self-Categorization in “A Survey of LGBT Americans”

| All in all, thinking about the important people in your life, how many are aware that you are [lesbian, gay, or bisexual]? | N | % | Mean | SD |
|---|---|---|---|---|
| None of them | 64 | 5.6 | 3.3 | 0.9 |
| Only a few of them | 185 | 16.1 | | |
| Some of them | 246 | 21.4 | | |
| All or most of them | 654 | 56.9 | | |
| Total | 1,149 | | | |

## Evaluation

Following self-categorization as a group member, one of the first processes an LGBT person undergoes is evaluation of the group. Evaluation refers to the positive or negative attachments that a person has toward his or her group identity (Eagly and Chaiken 1993; Ashmore, Deaux, and McLaughlin-Volpe 2004). It has two distinct subcomponents, public evaluation and private evaluation. Public evaluation captures how favorably the broader population regards the individual’s social group, while private evaluation captures how favorably the individual regards his or her social group (Crocker et al. 1994; Luhtanen and Crocker 1992; Sellers et al. 1997; Heere and James 2007). In many cases, there may be a difference between public and private evaluation. For example, an individual may report pride in having an LGBT identity, yet recognize the discrimination and societal disapproval that accompany that label.

Public evaluation and private evaluation are theorized to operate along two distinct dimensions in relation to group consciousness (Crocker et al. 1994). Negative public evaluation, which signals that respondents perceive a large amount of discrimination and societal disapproval, is consistently found to indicate heightened levels of group consciousness (Miller, Gurin, and Gurin 1981; Stokes 2003; Masuoka 2006). This implies that as perceptions of society’s attitudes toward the group grow more negative, the group is indicating higher levels of political consciousness. Private evaluation displays the inverse of this relationship, with positive personal evaluations signaling higher levels of group consciousness (Abrams and Brown 1989; Trapnell and Campbell 1999). Group members should evaluate their group more positively as their levels of consciousness rise.

Table 17.2 displays the items that measure public and private evaluation. Regarding public evaluation, table 17.2 indicates that the majority of respondents (55%) reported that gays and lesbians face a lot of discrimination in American society, although many respondents reported that there was only some discrimination (38%). The data for private evaluation demonstrate an even higher degree of variance, with respondents largely divided between reporting neutral attitudes (57%) and positive attitudes (37%). Therefore, similar to the self-categorization item, the evaluation items display a great deal of variance regarding self-reported group consciousness.

Table 17.2 Public and Private Evaluation in “A Survey of LGBT Americans”

| How much discrimination is there against gays and lesbians in our society today? | N | % | Mean | SD |
|---|---|---|---|---|
| None at all | 18 | 1.6 | 3.5 | 0.7 |
| Only a little | 66 | 5.7 | | |
| Some | 434 | 37.7 | | |
| A lot | 632 | 55.0 | | |
| Total | 1,150 | | | |

| Thinking about your sexual orientation, do you think of it as mainly something positive in your life today, mainly something negative in your life today, or it doesn’t make much of a difference either way? | N | % | Mean | SD |
|---|---|---|---|---|
| Mainly something negative | 67 | 5.8 | 2.3 | 0.6 |
| Doesn’t make much of a difference either way | 659 | 57.4 | | |
| Mainly something positive | 422 | 36.8 | | |
| Total | 1,148 | | | |

## Importance

In addition to self-identifying with a group label and making value judgments regarding the favorability of that label, the importance of the identity to an individual also captures his or her level of group consciousness. Importance represents the degree of significance an individual attaches to his or her group label and overall self-concept of his or her group membership as meaningful (Ashmore, Deaux, and McLaughlin-Volpe 2004). A fundamental component of identity importance is the concept of psychological centrality (Stryker and Serpe 1994), which captures the extent to which a social category is essential to an individual’s sense of self (Stryker and Serpe 1994; McCall and Simmons 1978; Rosenberg 1979). When persons report that their group label is important to their overall sense of identity, they acknowledge the importance and centrality of that label, indicating that it is a fundamental component of their identity. As the identity becomes more central to respondents, it indicates higher levels of group consciousness. Table 17.3 demonstrates the centrality of gay identity in the lives of LGBT Americans, with the community displaying a large degree of variability. Many respondents report that the identity is very or extremely important (37%), signaling high levels of group consciousness, while many others report that it is not too or not at all important (35%), signaling low levels of group consciousness.

Table 17.3 Importance in “A Survey of LGBT Americans”

| How important, if at all, is being [lesbian, gay, or bisexual] to your overall identity? Would you say it is . . . | N | % | Mean | SD |
|---|---|---|---|---|
| Not at all important | 142 | 12.4 | 3.0 | 1.2 |
| Not too important | 263 | 22.9 | | |
| Somewhat important | 323 | 28.1 | | |
| Very important | 284 | 24.7 | | |
| Extremely important | 138 | 12.0 | | |
| Total | 1,150 | | | |

## Attachment

In addition to the centrality of a group identity, attachment, or the sense of closeness a person feels toward the larger group based on that identity, is also a distinct and important component of group consciousness (Ashmore, Deaux, and McLaughlin-Volpe 2004). Attachment reflects an individual’s affective involvement while also capturing the close relationships group members form with other members of the group (Heere and James 2007). An important component of attachment is interdependence, or the interconnection of the individual to the broader social group, indicating a merging of the self and the larger community (Mael and Tetrick 1992; Tyler and Blader 2001). Therefore, when persons report higher levels of interdependence, or a heightened sense of shared identity with other group members, they are indicating higher levels of group consciousness. Table 17.4 displays the items related to interdependence, which capture the attitudes of LGBT subgroups toward other community members. Participants reported their sense of shared identity for all outgroups, meaning that a lesbian respondent described her feelings of shared identity only with regard to gay men and bisexuals. The average score across all outgroups was rounded to create a single measure of attachment for each respondent. The results demonstrate that one-quarter of respondents (25%) feel that they share a lot of common concerns with other LGBT persons, and a majority (52%) report that they share some concerns. A considerably smaller portion of respondents reported sharing only a little (18%) or nothing at all (4%).

Table 17.4 Attachment in “A Survey of LGBT Americans”

| As a [lesbian, gay man, bisexual], how much do you feel you share common concerns and identity with [lesbians, gay men, bisexuals]? | N | % | Mean | SD |
|---|---|---|---|---|
| Not at all | 50 | 4.4 | 3.0 | 0.8 |
| Only a little | 206 | 17.9 | | |
| Some | 601 | 52.4 | | |
| A lot | 291 | 25.4 | | |
| Total | 1,148 | | | |

# How Should We Measure Group Consciousness?

Although it is not empirically established, scholars often assume that group consciousness is multidimensional, with each subcomponent representing a distinct dimension. Therefore, the number of variables used ranges widely across studies. Some reports “use multiple measures to capture the full range of the multidimensional concept of group consciousness” (Sanchez 2006b, 428; 2008) and treat these concepts as distinct and independent variables. Other studies use the subcomponents of group consciousness to create indices, which are predominantly constructed by adding values across group consciousness variables (Masuoka 2006; Henderson-King and Stewart 1994; Jamal 2005; Duncan 1999). Both approaches are particularly problematic, because constructs should not be mapped to a specific number of dimensions without examining the underlying structure of the data (Gerbing and Anderson 1988). Essentially, scholars should not assume multidimensionality (i.e., multiple independent measures) or unidimensionality (i.e., one additive index); dimensionality must be assessed and empirically validated before measuring group consciousness.

To date, none of the published articles examining group consciousness measure the concept based on strong measurement models. For example, only classical test theory has been used to examine the measurement of group consciousness (Sanchez and Vargas 2016), and this technique has only been used sparingly. This is problematic, as classical test theory models assume that measurement precision is constant across the entire trait range (Fraley, Waller, and Brennan 2000), implying that each measure will equally capture high, moderate, and low levels of group consciousness. This is incorrect, however, as most scales tend to accurately capture only one end of a scale. To demonstrate, many scales of group consciousness may adequately capture persons with high levels of group consciousness, but may mischaracterize levels of group consciousness across the rest of the distribution. When these scales are utilized, they will only accurately explain outcomes for the group they capture and will have poor explanatory value for other groups. Without examining measurement precision, it is impossible to determine if researchers are forming correct or incorrect conclusions, because there is a high probability that the results will only apply to certain levels of the latent trait. Classical test theory is also strongly dependent on the number of scale items and the sample in use (Embretson 1996; Yen 1986; Fraley, Waller, and Brennan 2000; Hambleton, Swaminathan, and Rogers 1991).

Classical test theory also fails to account for DIF, which allows us to determine if subgroup differences are reliable and valid, meaning that they reflect actual differences between groups, or if they are a function of the survey items (Zumbo 1999). Because classical test theory assumes that all group differences are the result of “real” variation, this method fails to account for the fact that many items often “work differently” or are biased for or against particular subgroups (Embretson and Reise 2000, 249; Swaminathan and Rogers 1990; Zumbo 1999; Osterlind and Eveson 2009; Holland and Wainer 2012). Therefore, the differences we observe may not be actual differences at all, but rather a function of the survey’s measurement bias (Abrajano 2015). This is particularly problematic for group consciousness, because subgroup differences have been an important component of the literature for decades. For example, important subgroup differences have been identified relating to socioeconomic status (Masuoka 2006; Jamal 2005; Duncan 1999; Sanchez 2006b), panethnic identity (Jamal 2005; Masuoka 2006; Sanchez 2006a, 2006b, 2008), sex (Jamal 2005), and age (Jamal 2005; Sanchez 2006b, 2008), among other factors.

Item response theory offers several methodological advantages that allow us to address these limitations. It refers to models intended to characterize the relationship between an individual’s responses and the underlying latent trait of interest (van der Linden and Hambleton 1997; Fraley, Waller, and Brennan 2000; Baker 2001; Embretson 1996; Embretson and Reise 2000). In IRT, theta (θ) represents a latent trait, such as group consciousness. A significant difference between IRT and classical test theory is that, unlike classical test theory, IRT uses a search process to determine the latent trait, rather than a simple computation, such as an additive index (Embretson and Reise 2000). Accordingly, IRT scores group consciousness by finding the level of θ that gives the maximum likelihood. This trait is quantitative in nature, typically has a mean of zero and a standard deviation of one, and characterizes θ in terms of the probability of item endorsement (Fraley, Waller, and Brennan 2000).
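The "search process" described above can be illustrated with a minimal sketch. It assumes that item parameters are already known and that responses follow a two-parameter logistic function (the model this chapter adopts later); the parameter values are borrowed from table 17.9 purely for illustration, and the function names, grid bounds, and step size are hypothetical choices rather than the procedure used in the analysis.

```python
import math

# Illustrative item parameters (discrimination, difficulty), borrowed from
# table 17.9; any calibrated binary items could be substituted.
ITEMS = [(1.29, -0.28), (1.84, 0.46), (1.95, 0.46)]

def p_endorse(theta, a, b):
    """Probability of endorsing an item at trait level theta (2PL form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items=ITEMS):
    """Log-likelihood of a 0/1 response pattern at a given theta."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_endorse(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

def estimate_theta(responses, lo=-4.0, hi=4.0, step=0.01):
    """Grid search (an assumed, simple search strategy) for the theta
    that maximizes the likelihood of the observed responses."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return max(grid, key=lambda t: log_likelihood(t, responses))

# A respondent who endorses only the first (easiest) item.
theta_hat = estimate_theta([1, 0, 0])
print(round(theta_hat, 2))
```

Response patterns that endorse every item, or none, have no finite maximum; in this sketch they simply return the edge of the grid, which is one reason applied software typically adds a prior or other constraint.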

The IRT models have two primary assumptions: (1) the item characteristic curve (ICC) must be monotonically increasing, and (2) the data are locally independent (Lord 1980; Reise, Widaman, and Pugh 1993; Embretson and Reise 2000). The ICC is a nonlinear regression line that shows the probability of reporting a response category relative to θ (Fraley, Waller, and Brennan 2000). The ICCs must be monotonically increasing, meaning that the probability of endorsing an item must increase as levels of θ increase (Fraley, Waller, and Brennan 2000). Although many different monotonically increasing functions can be utilized, logistic functions and normal ogive functions are the most prevalent (Embretson and Reise 2000). The shape of the ICC will vary across items based on difficulty and discrimination. Difficulty refers to the probability of successfully endorsing an item; items that many people endorse are less difficult, while items that fewer people endorse are more difficult. An ideal instrument contains items that span a wide range of item difficulties. Discrimination relates to the slope of the ICC and demonstrates how well an item discriminates between categories of θ. Items with high levels of discrimination will more accurately distinguish between persons with similar levels of θ around the difficulty value. Local independence relates to the relationship between the IRT model and the data (Embretson and Reise 2000). This assumption requires that, after we condition on θ, a respondent’s probability of endorsing an item is independent of the probability of endorsing other items. This assumption is also related to unidimensionality, which requires that all of the concepts map onto a single underlying trait.

Given the empirical properties and advantages of IRT, I argue that analyses focusing on latent constructs, such as group consciousness, should rely on IRT models to measure θ. Using IRT, I establish each respondent’s level of group consciousness along a quantitative, methodologically based scale.

# Data

Pew Research Center’s “Survey of LGBT Americans” (2013) is based on a survey of the LGBT population conducted April 11–29, 2013. It includes a nationally representative sample of 1,197 self-identified lesbian, gay, bisexual, and transgender adults eighteen years of age or older. Given the limited sample size of the transgender population, with only 43 respondents, this subgroup is not included in this methodological analysis, because this sample is inadequate for hypothesis testing due to its limited power (Green 1991; Wilson Van Voorhis and Morgan 2007). The final sample contained 1,154 LGB persons.

The GfK Group administered the survey using KnowledgePanel, a nationally representative online research panel, as considerable research on sensitive issues, such as sexual orientation and gender identity, demonstrates that online survey administration is the most likely mode for eliciting honest answers from respondents (Pew Research Center 2013; Kreuter, Presser, and Tourangeau 2008). KnowledgePanel recruits participants using probability-sampling methods and includes persons both with and without Internet access, those with landlines and cell phones, those with only cell phones, and persons without a phone. From a sample of 3,645 self-identified LGBT panelists, one person per household was recruited into the study, constituting a sample of 1,924 panelists. From this eligible sample, 62% completed the survey. They were offered a $10 incentive to complete the process, which increased to $20 toward the end of the field period to reduce the nonresponse rate. Table 17.5 demonstrates the distribution of lesbians, gay males, and bisexuals in the sample. Gay males represent the largest group (35%), followed by bisexual females (30%), lesbians (24%), and bisexual males (11%).

Table 17.5 Sexual Orientation in “A Survey of LGBT Americans”

| | N | % |
|---|---|---|
| Lesbian | 277 | 24.0 |
| Gay | 398 | 34.5 |
| Bisexual Female | 349 | 30.2 |
| Bisexual Male | 129 | 11.2 |
| Total | 1,153 | |

# Methods

There are four steps in executing an IRT model: (1) testing model assumptions, (2) estimating the parameters, (3) assessing model fit, and (4) examining differential item functioning. The principal aspects of testing model assumptions are to establish both unidimensionality and monotonicity (Galecki, Sherman, and Prenoveau 2016). Exploratory factor analysis with principal components analysis was used to examine the dimensionality of the data. Table 17.6 shows the results, which indicate that, rather than the multidimensional construct group consciousness is hypothesized to be and regularly operationalized as, the construct is unidimensional within this data set.

Unidimensionality is established using eigenvalues and the proportion of variance explained. The Kaiser criterion (Kaiser 1970) recommends retaining only those factors with eigenvalues greater than 1. In this analysis, only one factor demonstrated an eigenvalue greater than 1, indicating a unidimensional model. Further, if a group of items is unidimensional, one factor should explain 20% or more of the total variance for all items (Reckase 1979; Reeve et al. 2007; Slocum-Gori and Zumbo 2011). For this model, the first factor exceeded this criterion by explaining 40.44% of the total variance, with no other factors exceeding the 20% threshold. Based on these results, the data satisfy the unidimensionality requirement.
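The two decision rules described above can be checked directly against the eigenvalues reported in table 17.6. The short sketch below is only an arithmetic illustration: it uses the rounded eigenvalues from the table, so the variance shares it prints match the published percentages only approximately, and the variable names are hypothetical.

```python
# Rounded eigenvalues as reported in table 17.6 (five items, five factors).
eigenvalues = [2.02, 0.95, 0.77, 0.69, 0.57]

total = sum(eigenvalues)
share = [ev / total for ev in eigenvalues]  # proportion of variance per factor

# Kaiser criterion: retain only factors with an eigenvalue greater than 1.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

# Variance rule of thumb: the first factor should explain 20% or more.
first_factor_ok = share[0] >= 0.20

print("Retained factors:", retained)
print("Variance explained (%):", [round(100 * s, 1) for s in share])
print("First factor explains >= 20%:", first_factor_ok)
```

Both rules point to a single retained factor, which is the pattern the text interprets as support for unidimensionality.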

Mokken scale analysis (MSA; Mokken 1971, 1997) was used to test the monotonicity assumption. It examines patterns of responses and validates whether these patterns are monotonically increasing, which is required for developing an IRT model. For items to meet the monotonicity assumption, Loevinger’s H coefficient, which measures scalability, should exceed 0.30 (Loevinger et al. 1953; van Schuur 2003; Hardouin 2013; Hemker, Sijtsma, and Molenaar 1995). This MSA indicated that two items, public evaluation and attachment, violated the monotonicity assumption, demonstrating that neither variable should be retained in the IRT model.2 Table 17.7 shows that self-categorization, private evaluation, and importance all exceeded the required threshold of 0.30, therefore satisfying the monotonicity assumption and signifying that these three items are appropriate for measuring group consciousness using IRT.
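To make the scalability coefficient more concrete, the sketch below computes Loevinger’s H for binary (0/1) items using the standard covariance form: the observed covariance between two items divided by the maximum covariance their endorsement rates allow. This is a simplified illustration under stated assumptions: the analysis reported here applied MSA to the original polytomous items with dedicated software, the eight response rows are invented, and the function name `loevinger_h` is hypothetical.

```python
from itertools import combinations

def loevinger_h(data):
    """Return (item_H, scale_H) for a list of rows of 0/1 responses.

    Assumes every item is endorsed by some, but not all, respondents;
    otherwise the maximum covariance is zero and H is undefined.
    """
    n = len(data)
    k = len(data[0])
    p = [sum(row[i] for row in data) / n for i in range(k)]
    cov, cov_max = {}, {}
    for i, j in combinations(range(k), 2):
        p_both = sum(row[i] * row[j] for row in data) / n
        cov[i, j] = p_both - p[i] * p[j]              # observed covariance
        cov_max[i, j] = min(p[i], p[j]) - p[i] * p[j]  # maximum given marginals
    item_h = []
    for i in range(k):
        pairs = [pair for pair in cov if i in pair]
        item_h.append(sum(cov[q] for q in pairs) / sum(cov_max[q] for q in pairs))
    scale_h = sum(cov.values()) / sum(cov_max.values())
    return item_h, scale_h

# Invented responses; the seventh row is a Guttman error (it endorses the
# harder second item without endorsing the easier first item).
rows = [
    [1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0],
    [1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 0],
]
item_h, scale_h = loevinger_h(rows)
print([round(h, 2) for h in item_h], round(scale_h, 2))
print("All items exceed 0.30:", all(h > 0.30 for h in item_h))
```

A perfectly ordered (Guttman) response pattern yields H = 1, and each inconsistent response pulls the coefficient toward zero, which is why a lower bound such as 0.30 is used as a screening threshold.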

Table 17.6 Unidimensionality and Group Consciousness

| | Eigenvalue | Difference | Variance Explained (%) |
|---|---|---|---|
| Factor 1 | 2.02 | 1.07 | 40.44 |
| Factor 2 | 0.95 | 0.18 | 19.00 |
| Factor 3 | 0.77 | 0.08 | 15.31 |
| Factor 4 | 0.69 | 0.12 | 13.79 |
| Factor 5 | 0.57 | . | 11.45 |
| N | 1,134 | | |

χ²(10) = 615.00, Prob > χ² = 0.000

Although the variables demonstrated unidimensionality and monotonicity, visual inspection of the ICCs indicated potential problems with IRT estimation (Koster et al. 2009; Murray et al. 2014; Stochl, Jones, and Croudace 2012). Following an iterative process of examining unidimensionality, monotonicity, and model data fit, the variables were recoded to develop the optimal model, defined as the one with the strongest support for unidimensionality and monotonicity and the best model fit as measured by the test information function (TIF), residual analysis, global model fit, and the Akaike information criterion (AIC) and Bayesian information criterion (BIC) statistics (Zampetakis et al. 2015). To recode the data, I combined categories within items with the poorest model fit, while leaving categories with adequate model fit intact, until the optimal fit was achieved. After numerous iterations and subsequent analysis of model fit, each item was recoded into a dichotomous measure that captured whether or not a respondent endorsed an item by reporting that he or she had LGBT group consciousness in that area.3 Table 17.8 summarizes the recoded measures:

With these three items, I used a two-parameter logistic model (2PL; Thissen and Steinberg 1986; van der Linden and Hambleton 1997; Embretson and Reise 2000) to estimate the IRT parameters; 2PL models are IRT models for binary dependent variables, which is appropriate because each of the three recoded group consciousness items is binary. The 2PL model allows discrimination to vary across items, indicating that the model does not assume that each item is equally indicative of a respondent’s standing on θ. Equation 1 (the 2PL model) shows the probability that a respondent with a given level of group consciousness (θ) will endorse item i (Embretson and Reise 2000, 70):

Table 17.7 Monotonicity and Group Consciousness

| | N | Loevinger’s H Coefficient |
|---|---|---|
| Self-Categorization | 1,134 | 0.46 |
| Private Evaluation | 1,134 | 0.41 |
| Importance | 1,134 | 0.46 |
| Scale | 1,134 | 0.44 |

Table 17.8 Recoded Group Consciousness Variables

| | Not Endorsed (N) | Not Endorsed (%) | Endorsed (N) | Endorsed (%) | Total (N) |
|---|---|---|---|---|---|
| Self-Categorization | 495 | 43.1 | 654 | 56.9 | 1,149 |
| Private Evaluation | 726 | 63.2 | 422 | 36.8 | 1,148 |
| Importance | 728 | 63.3 | 422 | 36.7 | 1,150 |

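$$
P(X_{is} = 1 \mid \theta_s, \beta_i, \alpha_i) = \frac{\exp[\alpha_i(\theta_s - \beta_i)]}{1 + \exp[\alpha_i(\theta_s - \beta_i)]} \tag{1}
$$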

In equation 1, $\theta_s - \beta_i$ is the difference between respondent s’s trait level and the difficulty of item i, and $\alpha_i$ is the item discrimination parameter, which scales that difference. The discrimination parameter, which is also referred to as the slope, indicates how well an item differentiates between response categories. Items with higher discrimination are generally superior measures, because they discriminate between response categories more accurately. The slope parameter is calculated at the location of item difficulty. Item difficulty is represented by the β parameters and indicates the trait level at which there is a 50% probability of endorsing an item. Higher difficulty values represent items that are more difficult, indicating that fewer people are likely to endorse that item (Embretson and Reise 2000; Koch 1983; Reise, Widaman, and Pugh 1993). Using this information about the 2PL, table 17.9 displays the model results.
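As a worked illustration of how the discrimination and difficulty parameters combine, the sketch below evaluates the standard 2PL response function, in which the discrimination parameter multiplies the difference between trait level and difficulty, using the estimates reported in table 17.9. The function and dictionary names are hypothetical; only the parameter values come from the chapter.

```python
import math

def p_2pl(theta, alpha, beta):
    """2PL probability of endorsing an item with discrimination alpha
    and difficulty beta at trait level theta."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

# (discrimination, difficulty) estimates from table 17.9.
items = {
    "self_categorization": (1.29, -0.28),
    "private_evaluation": (1.84, 0.46),
    "importance": (1.95, 0.46),
}

for name, (alpha, beta) in items.items():
    at_difficulty = p_2pl(beta, alpha, beta)  # equals 0.5 by construction
    at_mean = p_2pl(0.0, alpha, beta)         # an average respondent
    print(f"{name}: P(theta=beta)={at_difficulty:.2f}, P(theta=0)={at_mean:.2f}")
```

At a trait level equal to an item’s difficulty the endorsement probability is exactly one-half, so an average respondent (θ = 0) is more likely than not to endorse the self-categorization item and less likely than not to endorse the other two.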

The IRT model demonstrates that all three items have similar levels of discrimination, indicating that they fairly evenly differentiate between response categories. The importance item is the most discriminating, with an α of 1.95, while the self-categorization item is the least discriminating, with an α of 1.29. Overall, all three items performed relatively well at discriminating between response categories. The difficulty of the items has a somewhat greater range, which is preferred, as well-developed survey instruments contain a number of items that range in difficulty. For this set of items, identity importance and private evaluation were the most difficult items to endorse, each with a β of 0.46. Conversely, self-categorization was an easier item for respondents to endorse, with a substantially lower β (−0.28). In general, these items tended to skew toward being moderate to easy for respondents to endorse.

Table 17.9 IRT Model of Group Consciousness

| Item | Parameter | β | SE |
| --- | --- | --- | --- |
| Self-Categorization | Discrimination | 1.29** | 0.15 |
| | Difficulty | −0.28* | 0.06 |
| Private Evaluation | Discrimination | 1.84** | 0.27 |
| | Difficulty | 0.46** | 0.06 |
| Importance | Discrimination | 1.95** | 0.30 |
| | Difficulty | 0.46** | 0.06 |
| N | | 1,153 | |

(**) p<0.05,

(***) p<0.001.

Another advantage of IRT over classical test theory is that the method is able to demonstrate measurement precision across levels of group consciousness. Figure 17.1 displays this information, referred to as the test information function (TIF). Precision is highest where the chart covers the most area (Zampetakis et al. 2015), which is particularly valuable because it shows where the scale is most accurate. For this group consciousness scale, the results are most precise at moderate levels of group consciousness and least precise for the lowest and highest levels of group consciousness. This means that when modeling group consciousness using these data, one can expect the greatest explanatory power for those with a moderate amount of group consciousness. This offers a significant advantage over classical test theory, which, as stated above, cannot quantify precision across scales.

Two methods are used to assess the model fit for an IRT model. The first method examines the relationship between the observed and expected data by examining the model residuals (Hambleton and Murray 1983; Ludlow 1986; Stark 2001). To demonstrate adequate model fit, the expected data should fall within the 95% confidence interval of the observed data. Large residuals, or discrepancies between the observed and expected, indicate potential problems with the model (Embretson and Reise 2000). Figure 17.2 displays the relationship between the observed and expected data and indicates that the model fits the data well. In these figures, the black line with the error bars represents the observed data, while the gray line represents the expected data. For all categories of each of the three items, the majority of the observed data’s 95% confidence interval overlapped the expected results.

Figure 17.1 Test information function for group consciousness.

Figure 17.2 Model fit for group consciousness.
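The residual-based check described for figure 17.2 can be sketched as follows. This is a simplified illustration on synthetic data, not a reproduction of the MODFIT output: responses are simulated from a 2PL item using the self-categorization estimates in table 17.9, respondents are grouped into trait-level bins using their simulated (true) θ rather than an estimate, and a normal-approximation 95% interval is used. The helper `fit_table`, the number of bins, and the seed are all assumptions.

```python
import math
import random

def p_endorse(theta, alpha, beta):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

def fit_table(thetas, responses, alpha, beta, n_bins=5):
    """Compare observed endorsement rates with model expectations by theta bin."""
    order = sorted(range(len(thetas)), key=lambda i: thetas[i])
    size = len(order) // n_bins
    rows = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any leftover respondents.
        stop = len(order) if b == n_bins - 1 else start + size
        idx = order[start:stop]
        n = len(idx)
        observed = sum(responses[i] for i in idx) / n
        expected = sum(p_endorse(thetas[i], alpha, beta) for i in idx) / n
        # Normal-approximation 95% interval around the observed proportion.
        half_width = 1.96 * math.sqrt(observed * (1.0 - observed) / n)
        inside = abs(observed - expected) <= half_width
        rows.append((n, observed, expected, half_width, inside))
    return rows

# Synthetic data: 1,153 simulated respondents answering one 2PL item.
rng = random.Random(2013)
alpha, beta = 1.29, -0.28
thetas = [rng.gauss(0.0, 1.0) for _ in range(1153)]
responses = [1 if rng.random() < p_endorse(t, alpha, beta) else 0 for t in thetas]

rows = fit_table(thetas, responses, alpha, beta)
for n, observed, expected, half_width, inside in rows:
    print(f"n={n}  observed={observed:.3f} ± {half_width:.3f}  "
          f"expected={expected:.3f}  within_interval={inside}")
```

Bins where the expected proportion falls outside the interval around the observed proportion would correspond to the large residuals that signal potential misfit.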

The second method for evaluating model fit involves examining the χ2/df statistic, which formalizes the analysis of residuals (Embretson and Reise 2000). This statistic examines the global fit of the model and assumes an asymptotic χ2 distribution (Orlando and Thissen 2000; Zampetakis et al. 2015). Table 17.10 displays the chi-square results for the two-parameter logistic model. This table shows information on singlets, which are residuals for single items; doublets, which are residuals for pairs of items; and triplets, which are residuals for three items in a cross-validation sample (Liu et al. 2011).

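Table 17.10 Frequencies of the Adjusted Chi-Square to df Ratios for GRM Model Data Fit

| | < 1 | 1 < 2 | 2 < 3 | 3 < 4 | 4 < 5 | 5 < 7 | > 7 | Mean | SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Singlets | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 5.19 | 1.88 |
| Doublets | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 61.04 | 6.08 |
| Triplets | 0 | 0 | 0 | 0 | 0 | 0 | 1 | | |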

The results in table 17.10 suggest that the model has moderate to poor fit, as the majority of chi-square statistics are significant for singlets, doublets, and triplets. These results should be interpreted with caution, however, as the chi-square statistic is particularly sensitive to sample size and tends to imply model misfit even in moderately sized samples (Zampetakis et al. 2015). Evidence indicates that nearly any departure from the model will result in a significant detection of misfit (Bentler and Bonnet 1980), especially if the data are not normally distributed (McIntosh 2007). Consequently, this model likely fits the data better than the chi-square statistic implies. For example, Sinharay and Haberman (2014) analyzed a series of chi-square fit statistics in relation to IRT models and failed to find any models that fit the data, with severe misfit in nearly all large samples. Therefore, given the visual fit displayed in figure 17.2, I argue that the model adequately captures the data and that the resulting group consciousness scale is robust even in the event of violations of the IRT model.

The final step in capturing group consciousness is examining DIF. As detailed above, DIF occurs when there is an interaction between levels of group consciousness and group membership. When DIF is not present, respondents with the same level of group consciousness will have the same score on the latent trait; when DIF is present, a respondent’s level of group consciousness will be conditioned by his or her group membership and inaccurately distort the results. Therefore, two respondents may have the same level of group consciousness, but score differently on the scale based on their subgroup, rather than their level of θ. Two forms of DIF may be present in the sample, uniform DIF and nonuniform DIF (Zumbo 1999; Holland and Wainer 2012; Swaminathan and Rogers 1990). Uniform DIF occurs when group membership and group consciousness interact, but that interaction is consistent across all levels of the latent trait. Nonuniform DIF occurs when that interaction varies across levels of the latent trait, with different effects at low, moderate, or high levels of group consciousness.
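The distinction between uniform and nonuniform DIF can be made concrete with a small sketch. It compares the endorsement logits of a reference group and a focal group at the same θ under a 2PL item. All parameter values here are hypothetical and chosen only for illustration; expressing the group gap on the logit scale is also an assumption of this sketch rather than the procedure used in the analysis.

```python
def logit_gap(theta, ref, focal):
    """Focal-minus-reference difference in 2PL endorsement logits at theta."""
    (alpha_ref, beta_ref), (alpha_focal, beta_focal) = ref, focal
    return alpha_focal * (theta - beta_focal) - alpha_ref * (theta - beta_ref)

# Hypothetical (alpha, beta) values, not estimates from the survey.
reference = (1.5, 0.0)
uniform_focal = (1.5, 0.4)      # same slope, harder item for the focal group
nonuniform_focal = (0.9, 0.0)   # different slope for the focal group

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
uniform_gaps = [logit_gap(t, reference, uniform_focal) for t in grid]
nonuniform_gaps = [logit_gap(t, reference, nonuniform_focal) for t in grid]

# Uniform DIF: the gap is the same at every trait level.
print("uniform:   ", [round(g, 2) for g in uniform_gaps])
# Nonuniform DIF: the gap changes size (and here sign) across trait levels.
print("nonuniform:", [round(g, 2) for g in nonuniform_gaps])
```

In the uniform case the focal group is disadvantaged by the same amount everywhere on the scale; in the nonuniform case the direction of the bias depends on where the respondent sits on the latent trait.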

I used DIFdetect to identify and adjust for DIF-affected items (Crane et al. 2006). This method utilizes an ordinal logistic regression model for DIF detection and extends previous DIF detection analyses (Mantel and Haenszel 1959; Swaminathan and Rogers 1990; Zumbo 1999). DIFdetect is an iterative process for estimating group consciousness that begins with detecting which items demonstrate DIF. When items do not demonstrate DIF, IRT parameters are estimated for the entire sample. When items demonstrate DIF, IRT parameters are estimated separately for the separate groups. This produces a DIF-adjusted estimate that can be used in subsequent analyses without bias. For the iterative process, the DIF-adjusted estimate of the latent trait is used to test additional grouping categories for DIF. This process of adjusting for DIF is repeated until all relevant items have been analyzed and adjusted for, as necessary (Zampetakis et al. 2015).

Table 17.11 Differential Item Functioning in “A Survey of LGBT Americans”

Type of Significant DIF at p < 0.05

| Group | Self-Categorization | Private Evaluation | Importance |
| --- | --- | --- | --- |
| Lesbians | Uniform | Uniform | None |
| Female Bisexuals | Uniform | Uniform | None |
| Male Bisexuals | Uniform | Nonuniform | Uniform |
| Racial and Ethnic Minorities | Nonuniform | Uniform | Uniform |
| Bachelor’s Degree | Uniform | None | None |
| Over 45 Years of Age | None | None | None |

Table 17.11 shows that for nearly every demographic category, uniform or nonuniform DIF was present in at least one item, as the probability of DIF was consistently significant. Sexual orientation, race and ethnicity, and education all contributed to differential item functioning within this sample, while age did not. Each subgroup was compared to a reference population. For example, lesbians and bisexuals were compared to gay men, racial and ethnic minorities were compared to whites, those with bachelor’s degrees were compared to those without degrees, and the over age forty-five population was compared to the under age forty-five population. Across each DIF analysis except age, group membership was significant for at least one item within the scale, indicating that a DIF-adjusted measure of group consciousness must be used.

This is a particularly important finding, because it casts doubt on previous analyses of subgroup differences in levels of group consciousness. To date, we have attributed group differences to actual differences that exist between demographic groups. If these items are the result of survey bias, however, we may be drawing the wrong conclusions about levels of group consciousness. Using DIF-adjusted results, it is possible that differences among demographic groups may disappear in subsequent tests. Therefore, to verify that we form accurate conclusions about group consciousness, it is essential to use DIF analysis in constructing our measures of latent traits.

# Results

Using DIF estimates that were adjusted for lesbian sexual orientation, bisexual female sexual orientation, and education, I produced an unbiased and empirically grounded measure of group consciousness. Adjustments for racial and ethnic minority status and bisexual male orientation did not contribute to an improvement in the estimation of θ. Therefore, although DIF was present, I did not adjust group consciousness for these measures, because they did not improve the model. This likely indicates that, while significant, the DIF results for these groups were not substantively important and are unlikely to impact subsequent modeling. For all other groups, however, DIF fundamentally structured the results, demonstrating that these differences are likely to impact future tests. In addition, it is possible that the inability to improve the estimation of θ for bisexual males and racial and ethnic minorities is a function of their relatively small sample size, and that meaningful DIF could be found in future analyses that rely on larger and more diverse samples.

Table 17.12 Summary Statistics of Group Consciousness

| Mean | SD | Min | Max | N |
| --- | --- | --- | --- | --- |
| 0.000 | 0.75 | −0.94 | 1.20 | 1,153 |

Table 17.12 displays the summary statistics of the group consciousness measure, showing that IRT generated an interval measure of group consciousness with a mean of 0 and a standard deviation of 0.75. The latent trait was predicted using an empirical Bayes estimator that combines prior information about θ with the likelihood of the observed responses to obtain the conditional posterior distribution of θ (Skrondal and Rabe-Hesketh 2004, 2009). The resulting measure of group consciousness ranges from −0.94 to 1.20, with lower values representing lower levels of group consciousness and higher values representing higher levels of group consciousness. Overall, the summary statistics demonstrate that this measure of group consciousness has favorable statistical properties for subsequent testing.
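The logic of an empirical Bayes prediction can be sketched as a posterior mean computed over a grid of θ values. The example below is an illustration under stated assumptions: it uses a standard normal prior, the pooled item parameters from table 17.9 rather than the DIF-adjusted estimates, and simple grid quadrature. It is not expected to reproduce the values in table 17.12, and the function names are hypothetical.

```python
import math
from itertools import product

# (alpha, beta) for self-categorization, private evaluation, importance (table 17.9).
ITEMS = [(1.29, -0.28), (1.84, 0.46), (1.95, 0.46)]

def p_endorse(theta, alpha, beta):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

def likelihood(theta, pattern, items=ITEMS):
    """Probability of a 0/1 response pattern given theta (local independence)."""
    value = 1.0
    for x, (alpha, beta) in zip(pattern, items):
        p = p_endorse(theta, alpha, beta)
        value *= p if x == 1 else 1.0 - p
    return value

def posterior_mean(pattern, items=ITEMS, lo=-6.0, hi=6.0, steps=1201):
    """Posterior mean of theta: prior density times likelihood, normalized on a grid."""
    width = (hi - lo) / (steps - 1)
    numerator = denominator = 0.0
    for k in range(steps):
        theta = lo + k * width
        # Unnormalized standard normal prior times the response likelihood.
        weight = math.exp(-0.5 * theta ** 2) * likelihood(theta, pattern, items)
        numerator += theta * weight
        denominator += weight
    return numerator / denominator

# With three binary items there are only eight possible response patterns.
for pattern in product((0, 1), repeat=3):
    print(pattern, round(posterior_mean(pattern), 2))
```

Because the prior pulls every prediction toward its mean of 0, respondents who endorse none or all of the items still receive finite scores, which is one reason a scale built from a few binary items has a bounded range.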

# Discussion

The results presented in this analysis cast doubt on group consciousness research that fails to use strong measurement models. To date, dozens of research articles examine group consciousness, yet contain little to no discussion of the most appropriate measurement strategies for capturing the concept. This is a serious shortcoming in the current body of group consciousness research, as it creates three primary problems that the methodology proposed in this analysis addresses: (1) our measures of group consciousness may have face validity, but lack construct validity; (2) many measures of group consciousness probably contain survey bias that distorts our interpretation of subgroup differences; and (3) we are measuring group consciousness incorrectly when we use a series of distinct, independent variables or additive measures.

Beginning with an examination of validity, the most commonly used group consciousness measures have not been examined from a measurement standpoint. This means that although they theoretically align with our understanding of group consciousness, this relationship has not been empirically established. In this analysis, at least two of the measures that were expected to map to group consciousness, public evaluation and attachment, failed to demonstrate a relationship with the latent trait. If detailed examination of these items had not been performed, they could have erroneously been included in the final group consciousness measure. This would have likely led to model distortions and the incorrect presentation of results. Essentially, any conclusions we drew from a measure of group consciousness that included these items would have been wrong, as they fundamentally mismeasured the construct. Therefore, because most preceding articles have not used methodologically valid measures of group consciousness, we cannot be certain that our conclusions about the nature of group consciousness are reliable or valid.

Item bias further distorts these results and has a high probability of misdirecting our conclusions. Currently, many research articles point to significant and meaningful subgroup differences regarding levels of group consciousness (Masuoka 2006; Jamal 2005; Duncan 1999; Sanchez 2006a, 2006b, 2008). However, none of these articles examine whether the survey itself is driving these differences through differential item functioning. Given that five subgroups within this examination demonstrated DIF—lesbians, bisexual females, bisexual males, racial and ethnic minorities, and the college-educated population—it is very likely that our current understanding of subgroup differences may be the result of survey bias. Moving forward, analyses that seek to explain the formation of group consciousness and control for subgroups must include an analysis of DIF. Without doing so, the field may be making false deductions about the relationship between demographic categories and group consciousness.

Finally, this research calls into question the many measures of group consciousness that are currently employed. Most scholars analyzing group consciousness utilize either additive measures that simply add together a series of dependent variables, or treat all the subcomponents of group consciousness as distinct and operationalize each variable as a separate independent variable. Both approaches are incorrect. The first creates measures that are directly contingent on the number of items on the scale, which may or may not be related. The second treats variables as multidimensional when they are probably unidimensional. As this analysis demonstrates, the method that most accurately estimates group consciousness must rely on IRT. This is particularly important given that IRT produces results with favorable properties for statistical testing. Given that examining group differences can be misleading if the incorrect level of measurement is used (Maxwell and Delaney 1985), many of our current results regarding group consciousness may be misspecified.

Together, these results have broad implications for scholars of political behavior, because they provide strong support for the argument that IRT must be more thoroughly incorporated into our empirical analyses. Although we dedicate a great deal of time to discussing theoretical factors and implications, we rarely devote the same amount of attention to measurement strategies. Consequently, we use measures that are theoretically grounded, yet rarely empirically grounded. As this analysis demonstrates, that limitation is highly likely to lead us to false conclusions based on inappropriate measurement. This is particularly probable because our concepts tend to be relatively abstract, amorphous, and difficult to define.

Moving forward, scholars should incorporate IRT as a solution to these measurement problems. It allows us to develop empirically based measures for capturing latent constructs with favorable statistical properties for subsequent analysis. It builds on our theoretical knowledge by relying on theoretical justifications for initial item selection, while subsequently empirically testing the validity of those assumptions. Through a process of examining dimensionality, monotonicity, DIF, and model data fit, IRT allows us to produce empirically valid and reliable operationalizations. A general guideline would encourage scholars of political behavior to always begin with IRT, even when analyzing concepts that seem relatively straightforward, such as political knowledge or political participation, as evidence demonstrates that these latent variables are rarely as uncomplicated as they seem. Consequently, all analyses that utilize latent constructs should consider incorporating IRT as their measurement strategy.

# Conclusion

Using IRT, this analysis makes a series of important contributions that challenge the conventional measurement strategies of scholars analyzing group consciousness.

It begins by demonstrating that group consciousness is not multidimensional from a measurement standpoint, as all theoretical subcomponents mapped onto a single construct in this sample. Although we may discuss the construct as multidimensional, it is best operationalized using a single construct. In addition, many concepts that are traditionally grouped into group consciousness measures, such as public evaluation and attachment, failed to meet model assumptions and did not properly align with group consciousness. Therefore, some of the subcomponents we use to clarify the definition of group consciousness may not be particularly meaningful and should potentially be excluded from usage in future analyses. Further, even when the correct number of dimensions is used and the items are correctly specified, group consciousness measures are highly likely to suffer from differential item functioning. As this analysis shows, nearly all major subgroups demonstrated a degree of survey bias, implying that the conclusions formed about the relationship between these subgroups and group consciousness will be biased unless we use DIF-adjusted results. In total, these results call into question our current understanding of group consciousness, as almost all articles examining group consciousness lack appropriate measurement methodologies. Using IRT, we can overcome these limitations by establishing statistically valid measures of group consciousness that allow us to reexamine our prior conclusions.

## References

Abrajano, M. 2015. “Reexamining the ‘Racial gap’ in Political Knowledge.” Journal of Politics 77 (1): 44–54.

Abramowitz, A. I., and K. L. Saunders. 2006. “Exploring the Bases of Partisanship in the American Electorate: Social Identity vs. Ideology.” Political Research Quarterly 59 (2): 175–187.

Abrams, D., and R. Brown. 1989. “Self-Consciousness and Social Identity: Self-Regulation as a Group Member.” Social Psychology Quarterly 52 (4): 311–318.

Ashmore, R. D., K. Deaux, and T. McLaughlin-Volpe. 2004. “An Organizing Framework for Collective Identity: Articulation and Significance of Multidimensionality.” Psychological Bulletin 130 (1): 80–113.

Baker, F. B. 2001. The Basics of Item Response Theory. New York: ERIC Clearinghouse on Assessment and Evaluation.

Baker, F. B., and S. Kim. 2004. Item Response Theory: Parameter Estimation Techniques. 2nd ed. New York: CRC Press.

Bentler, P. M., and D. G. Bonnet. 1980. “Significance Tests and Goodness of Fit in the Analysis of Covariance Structures.” Psychological Bulletin 88 (3): 588–606.

Brewer, M. B. 1979. “In-Group Bias in the Minimal Intergroup Situation: A Cognitive-Motivational Analysis.” Psychological Bulletin 86 (2): 307–324.

Brubaker, R., and F. Cooper. 2000. “Beyond ‘Identity’.” Theory and Society 29 (1): 1–47.

Carpini, M. X. D., and S. Keeter. 1993. “Measuring Political Knowledge: Putting First Things First.” American Journal of Political Science 37 (4): 1179–1206.

Chong, D., and R. Rogers. 2005. “Racial Solidarity and Political Participation.” Political Behavior 27 (4): 347–374.

Clinton, J. D., and J. S. Lapinski. 2006. “Measuring Legislative Accomplishment, 1877–1994.” American Journal of Political Science 50 (1): 232–249.

Conover, P. J. 1984. “The Influence of Group Identifications on Political Perception and Evaluation.” Journal of Politics 46 (3): 760–785.

Conover, P. J. 1988. “The Role of Social Groups in Political Thinking.” British Journal of Political Science 18 (1): 51–76.

Conover, P. J., and S. Feldman. 1984. “How People Organize the Political World: A Schematic Model.” American Journal of Political Science 28 (1): 95–126.

Conover, P. J., and V. Sapiro. 1993. “Gender, Feminist Consciousness, and War.” American Journal of Political Science 37 (4): 1079–1099.

Crane, P. K., L. E. Gibbons, L. Jolley, and G. van Belle. 2006. “Differential Item Functioning Analysis with Ordinal Logistic Regression Techniques: DIFdetect and difwithpar.” Medical Care 44 (11, supp. 3): S115–S123.

Crocker, J., R. Luhtanen, B. Blaine, and S. Broadnax. 1994. “Collective Self-Esteem and Psychological Well-Being among White, Black, and Asian College Students.” Personality and Social Psychology Bulletin 20 (5): 503–513.

Deaux, K. 1996. “Social Identification.” In Psychology: Handbook of Basic Principles, edited by E. T. Higgins and A. W. Kruglanski, 227–238. New York: Guilford Press.

Diehl, M. 1989. “Justice and Discrimination between Minimal Groups: The Limits of Equity.” British Journal of Social Psychology 28 (3): 227–238.

Duncan, L. E. 1999. “Motivation for Collective Action: Group Consciousness as Mediator of Personality, Life Experiences, and Women’s Rights Activism.” Political Psychology 20 (3): 611–635.

Eagly, A. H., and S. Chaiken. 1993. The Psychology of Attitudes. Fort Worth, TX: Harcourt Brace Jovanovich College Publishers.

Embretson, S. E. 1996. “The New Rules of Measurement.” Psychological Assessment 8 (4): 341–349.

Embretson, S. E., and S. P. Reise. 2000. Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

Embretson, S. E., and S. P. Reise. 2013. Item Response Theory. Psychology Press.

Fraley, R. C., N. G. Waller, and K. A. Brennan. 2000. “An Item Response Theory Analysis of Self-Report Measures of Adult Attachment.” Journal of Personality and Social Psychology 78 (2): 350–365.

Galecki, J. M., M. F. Sherman, and J. M. Prenoveau. 2016. “Item Analysis of the Leeds Dependence Questionnaire in Community Treatment Centers.” Psychological Assessment 28 (9): 1061–1073.

Gamson, W. A. 1968. Power and Discontent. Homewood, IL: Dorsey Press.

Gerbing, D. W., and J. C. Anderson. 1988. “An Updated Paradigm for Scale Development Incorporating Unidimensionality and Its Assessment.” Journal of Marketing Research 25 (2): 186–192.

Gillion, D. Q. 2009. “Re-defining Political Participation through Item Response Theory.” Paper presented at APSA 2009 Meeting, Toronto.

Green, S. B. 1991. “How Many Subjects Does It Take to Do a Regression Analysis?” Multivariate Behavioral Research 26 (3): 499–510.

Gurin, P., A. H. Miller, and G. Gurin. 1980. “Stratum Identification and Consciousness.” Social Psychology Quarterly 43 (1): 30–47.

Gurin, P. 1985. “Women’s Gender Consciousness.” Public Opinion Quarterly 49 (2): 143–163.

Hambleton, R., H. Swaminathan, and H. J. Rogers. 1991. Fundamentals of Item Response Theory. Newbury Park, CA: Sage.

Hambleton, R. K., and L. Murray. 1983. “Some Goodness of Fit Investigations for Item Response Models.” In Applications of Item Response Theory, edited by R. K. Hambleton. Vancouver, BC: Educational Research Institute of British Columbia.

Hardouin, J. 2013. MSP: Stata Module to Perform the Mokken Scale Procedure. https://ideas.repec.org/c/boc/bocode/s439402.html

Harris, F., and D. Q. Gillion. 2012. “Expanding the Possibilities: Reconceptualizing Political Participation as a Toolbox.” In The Oxford Handbook of American Elections and Political Behavior, edited by J. E. Leighley, 144–161. New York: Oxford University Press.

Heere, B., and J. D. James. 2007. “Stepping Outside the Lines: Developing a Multi-dimensional Team Identity Scale Based on Social Identity Theory.” Sport Management Review 10 (1): 65–91.

Hemker, B. T., K. Sijtsma, and I. W. Molenaar. 1995. “Selection of Unidimensional Scales from a Multidimensional Item Bank in the Polytomous Mokken IRT Model.” Applied Psychological Measurement 19 (4): 337–352.

Henderson-King, D. H., and A. J. Stewart. 1994. “Women or Feminists? Assessing Women’s Group Consciousness.” Sex Roles 31 (9): 505–516.

Highton, B., and C. D. Kam. 2011. “The Long-Term Dynamics of Partisanship and Issue Orientations.” Journal of Politics 73 (1): 202–215.

Holland, P. W., and H. Wainer, eds. 2012. Differential Item Functioning. New York: Routledge.

Huddy, L. 2001. “From Social to Political Identity: A Critical Examination of Social Identity Theory.” Political Psychology 22 (1): 127–156.

Jackman, M. R., and R. W. Jackman. 1973. “An Interpretation of the Relation between Objective and Subjective Social Status.” American Sociological Review 38 (5): 569–582.

Jamal, A. 2005. “The Political Participation and Engagement of Muslim Americans: Mosque Involvement and Group Consciousness.” American Politics Research 33 (4): 521–544.

Jerit, J., J. Barabas, and T. Bolsen. 2006. “Citizens, Knowledge, and the Information Environment.” American Journal of Political Science 50 (2): 266–282.

Kaiser, H. F. 1970. “A Second Generation Little Jiffy.” Psychometrika 35 (4): 401–415.

Kidd, Q., H. Diggs, M. Farooq, and M. Murray. 2007. “Black Voters, Black Candidates, and Social Issues: Does Party Identification Matter?” Social Science Quarterly 88 (1): 165–176.

Koch, W. R. 1983. “Likert Scaling Using the Graded Response Latent Trait Model.” Applied Psychological Measurement 7 (1): 15–32.

Koster, M., M. E. Timmerman, H. Nakken, S. J. Pijl, and E. J. van Houten. 2009. “Evaluating Social Participation of Pupils with Special Needs in Regular Primary Schools: Examination of a Teacher Questionnaire.” European Journal of Psychological Assessment 25 (4): 213–222.

Kreuter, F., S. Presser, and R. Tourangeau. 2008. “Social Desirability Bias in CATI, IVR, and Web Surveys: The Effects of Mode and Question Sensitivity.” Public Opinion Quarterly 72 (5): 847–865.

Liu, L., F. Drasgow, R. Reshetar, and Y. R. Kim. 2011. “Item Response Theory (IRT) Analysis of Item Sets.” Paper presented at the Northeastern Educational Research Association (NERA) Annual Conference, Rocky Hill, CT.

Loevinger, J., G. C. Gleser, and P. H. DuBois. 1953. “Maximizing the Discriminating Power of a Multiple-Score Test.” Psychometrika 18 (4): 309–317.

Lord, F. M. 1980. Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Erlbaum.

Ludlow, L. H. 1986. “Graphical Analysis of Item Response Theory Residuals.” Applied Psychological Measurement 10 (3): 217–229.

Luhtanen, R., and J. Crocker. 1992. “A Collective Self-Esteem Scale: Self-Evaluation of One’s Social Identity.” Personality and Social Psychology Bulletin 18 (3): 735–754.

Mael, F. A., and L. E. Tetrick. 1992. “Identifying Organizational Identification.” Educational and Psychological Measurement 52 (4): 813–824.

Mantel, N., and W. Haenszel. 1959. “Statistical Aspects of the Analysis of Data from Retrospective Studies.” Journal of the National Cancer Institute 22 (4): 719–748.

Masuoka, N. 2006. “Together They Become One: Examining the Predictors of Panethnic Group Consciousness among Asian Americans and Latinos.” Social Science Quarterly 87 (5): 993–1011.

Maxwell, S. E., and H. D. Delaney. 1985. “Measurement and Statistics: An Examination of Construct Validity.” Psychological Bulletin 97 (1): 85–93.

McCall, G. J., and J. L. Simmons. 1978. Identities and Interactions: An Examination of Human Associations in Everyday Life. New York: Free Press.

McClain, P. D., J. D. Johnson Carew, E. Walton Jr., and C. S. Watts. 2009. “Group Membership, Group Identity, and Group Consciousness: Measures of Racial Identity in American Politics?” Annual Review of Political Science 12: 471–485.

McIntosh, C. N. 2007. “Rethinking Fit Assessment in Structural Equation Modelling: A Commentary and Elaboration on Barrett.” Personality and Individual Differences 42 (5): 859–867.

Miller, A. H., P. Gurin, G. Gurin, and O. Malanchuk. 1981. “Group Consciousness and Political Participation.” American Journal of Political Science 25 (3): 494–511.

Mokken, R. J. 1971. A Theory and Procedure of Scale Analysis. Berlin: De Gruyter.

Mokken, R. J. 1997. “Nonparametric Models for Dichotomous Responses.” In Handbook of Modern Item Response Theory, edited by W. J. van der Linden and R. K. Hambleton, 351–367. New York: Springer.

Mondak, J. J. 2001. “Developing Valid Knowledge Scales.” American Journal of Political Science 45 (1): 224–238.

Murray, A. L., K. McKenzie, K. R. Murray, and M. Richelieu. 2014. “Mokken Scales for Testing Both Pre- and Postintervention: An Analysis of the Clinical Outcomes in Routine Evaluation—Outcome Measure (CORE–OM) Before and After Counseling.” Psychological Assessment 26 (4): 1196.

Orlando, M., and D. Thissen. 2000. “Likelihood-Based Item-Fit Indices for Dichotomous Item Response Theory Models.” Applied Psychological Measurement 24 (1): 50–64.

Osterlind, S. J., and H. T. Eveson. 2009. Differential Item Functioning. 2nd ed. New York: Sage.

Pew Research Center. 2013. “A Survey of LGBT Americans: Attitudes, Experiences, and Values in Changing Times.” Pew Research Center. http://www.pewsocialtrends.org/2013/06/13/a-survey-of-lgbt-americans/

Phinney, J. S. 1991. “Ethnic Identity and Self-Esteem: A Review and Integration.” Hispanic Journal of Behavioral Sciences 13: 193–208.

Reckase, M. D. 1979. “Unifactor Latent Trait Models Applied to Multifactor Tests: Results and Implications.” Journal of Educational and Behavioral Statistics 4 (3): 207–230.

Reeve, B. B., R. D. Hays, J. B. Bjorner, K. F. Cook, P. K. Crane, J. A. Teresi, et al. 2007. “Psychometric Evaluation and Calibration of Health-Related Quality of Life Item Banks: Plans for the Patient-Reported Outcomes Measurement Information System (PROMIS).” Medical Care 45 (5): S22–S31.

Reise, S. P., K. F. Widaman, and R. H. Pugh. 1993. “Confirmatory Factor Analysis and Item Response Theory: Two Approaches for Exploring Measurement Invariance.” Psychological Bulletin 114 (3): 552–566.

Rosenberg, M. 1979. Conceiving the Self. New York: Basic Books.

(p. 386) Sanchez, G. R. 2006a. “The Role of Group Consciousness in Latino Public Opinion.” Political Research Quarterly 59 (3): 435–446.Find this resource:

Sanchez, G. R. 2006b. “The Role of Group Consciousness in Political Participation among Latinos in the United States.” American Politics Research 34 (4): 427–450.Find this resource:

Sanchez, G. R. 2008. “Latino Group Consciousness and Perceptions of Commonality with African Americans.” Social Science Quarterly 89 (2): 428–444.Find this resource:

Sanchez, G. R., and E. D. Vargas. 2016. “Taking a Closer Look at Group Identity: The Link between Theory and Measurement of Group Consciousness and Linked Fate.” Political Research Quarterly 69 (1): 160–174.Find this resource:

Sellers, R. M., S. A. J. Rowley, T. M. Chavous, J. N. Shelton, and M. A. Smith. 1997. “Multidimensional Inventory of Black Identity: A Preliminary Investigation of Reliability and Construct Validity.” Journal of Personality and Social Psychology 73 (4): 805–815.Find this resource:

Shingles, R. 1981. “Black Consciousness and Political Participation: The Missing Link.” American Political Science Review 75 (1): 76–91.

Sinharay, S., and S. J. Haberman. 2014. “How Often Is the Misfit of Item Response Theory Models Practically Significant?” Educational Measurement: Issues and Practice 33 (1): 23–35.

Skrondal, A., and S. Rabe-Hesketh. 2004. Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton, FL: CRC Press.

Skrondal, A., and S. Rabe-Hesketh. 2009. “Prediction in Multilevel Generalized Linear Models.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 172 (3): 659–687.

Slocum-Gori, S. L., and B. D. Zumbo. 2011. “Assessing the Unidimensionality of Psychological Scales: Using Multiple Criteria from Factor Analysis.” Social Indicators Research 102 (3): 443–461.

Smith, R. M. 2004. “Identities, Interests, and the Future of Political Science.” Perspectives on Politics 2 (2): 301–312.

Stark, S. 2001. MODFIT: A Computer Program for Model-Data Fit. Urbana-Champaign: University of Illinois.

Stochl, J., P. B. Jones, and T. J. Croudace. 2012. “Mokken Scale Analysis of Mental Health and Well-Being Questionnaire Item Responses: A Non-parametric IRT Method in Empirical Research for Applied Health Researchers.” BMC Medical Research Methodology 12 (1): 1.

Stokes, A. K. 2003. “Latino Group Consciousness and Political Participation.” American Politics Research 31 (4): 361–378.

Stryker, S. 1980. Symbolic Interactionism: A Social Structural Version. Menlo Park, CA: Benjamin Cummings.

Stryker, S., and R. T. Serpe. 1994. “Identity Salience and Psychological Centrality: Equivalent, Overlapping, or Complementary Concepts?” Social Psychology Quarterly 57 (1): 16–35.

Swaminathan, H., and H. J. Rogers. 1990. “Detecting Differential Item Functioning Using Logistic Regression Procedures.” Journal of Educational Measurement 27 (4): 361–370.

Tajfel, H. 1981. Human Groups and Social Categories: Studies in Social Psychology. Cambridge: Cambridge University Press.

Tajfel, H. 1982. “Social Psychology of Intergroup Relations.” Annual Review of Psychology 33: 1–39.

Thissen, D., and L. Steinberg. 1986. “A Taxonomy of Item Response Models.” Psychometrika 51 (4): 567–577.

Trapnell, P. D., and J. D. Campbell. 1999. “Private Self-Consciousness and the Five-Factor Model of Personality: Distinguishing Rumination from Reflection.” Journal of Personality and Social Psychology 76 (2): 284–304.

Turner, J. C., M. A. Hogg, P. J. Oakes, S. D. Reicher, and M. S. Wetherell. 1987. Rediscovering the Social Group: A Theory of Self-Categorization. New York: Basil Blackwell.

Tyler, T. R., and S. L. Blader. 2001. “Identity and Cooperative Behavior in Groups.” Group Processes and Intergroup Relations 4 (3): 207–226.

van der Linden, W. J., and R. K. Hambleton, eds. 1997. Handbook of Modern Item Response Theory. New York: Springer.

van Schuur, W. H. 2003. “Mokken Scale Analysis: Between the Guttman Scale and Parametric Item Response Theory.” Political Analysis 11 (2): 139–163.

Wallace, D. S., A. Abduk-Khaliq, M. Czuchry, and T. L. Sia. 2009. “African Americans’ Political Attitudes, Party Affiliation, and Voting Behavior.” Journal of African American Studies 13 (2): 139–146.

Welch, S., and L. S. Foster. 1992. “The Impact of Economic Conditions on the Voting Behavior of Blacks.” The Western Political Quarterly 45 (1): 221–236.

Weldon, S. A. 2006. “The Institutional Context of Tolerance for Ethnic Minorities: A Comparative, Multilevel Analysis of Western Europe.” American Journal of Political Science 50 (2): 331–349.

Wilson Van Voorhis, C. R., and B. L. Morgan. 2007. “Understanding Power and Rules of Thumb for Determining Sample Sizes.” Tutorials in Quantitative Methods for Psychology 3 (2): 43–50.

Yen, W. M. 1986. “The Choice of Scale for Educational Measurement: An IRT Perspective.” Journal of Educational Measurement 23 (4): 299–325.

Zampetakis, L. A., M. Lerakis, K. Kafetsios, and V. Moustakis. 2015. “Using Item Response Theory to Investigate the Structure of Anticipated Affect: Do Self-Reports about Future Affective Reactions Conform to Typical or Maximal Models?” Frontiers in Psychology September (6): 1–8.

Zumbo, B. D. 1999. A Handbook on the Theory and Methods of Differential Item Functioning (DIF): Logistic Regression Modeling as a Unitary Framework for Binary and Likert-type (Ordinal) Item Scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.

## Notes:

(1.) Survey weights were not used in this analysis.

(2.) Public evaluation and attachment were recoded using a variety of methods and retested to determine whether a different measurement strategy would satisfy the monotonicity requirement. No recoding of the items achieved a Loevinger’s H coefficient sufficient to establish monotonicity. Further, visual inspection of the item characteristic curves (ICCs) corroborated the Mokken scale analysis (MSA), with both ICCs demonstrating significant violations of the monotonicity assumption (Koster et al. 2009; Murray et al. 2014; Stochl et al. 2012).

(3.) Unidimensionality was re-established for the three-item scale after analyzing monotonicity. The remaining items satisfied the unidimensionality requirement: only one factor had an eigenvalue greater than 1, and the first factor explained 56.65% of the variance. Monotonicity was also re-established for the three-item scale after recoding the variables following the logic described below; the remaining items again satisfied the monotonicity requirement, indicating that item recoding did not violate model assumptions.
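
For readers unfamiliar with the Loevinger’s H coefficient mentioned in note 2, the following is a minimal sketch of the scale-level coefficient for dichotomous (0/1) items: the sum of observed inter-item covariances divided by the sum of the largest covariances possible given each item’s marginal proportion. It is an illustration under stated assumptions, not the procedure used in this analysis; the function name `loevinger_h` and the toy response matrices are hypothetical, and the survey items analyzed here may be polytomous, which requires a more general formula.

```python
from itertools import combinations


def loevinger_h(responses):
    """Scale-level Loevinger's H for dichotomous (0/1) items.

    responses: list of rows, one per respondent; each row is a list
    of 0/1 item scores. Illustrative only (hypothetical helper).
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of positive responses for each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]

    observed = 0.0  # sum of observed inter-item covariances
    maximum = 0.0   # sum of maximum covariances given the marginals
    for i, j in combinations(range(k), 2):
        p_ij = sum(row[i] * row[j] for row in responses) / n
        observed += p_ij - p[i] * p[j]
        maximum += min(p[i], p[j]) - p[i] * p[j]

    if maximum == 0:
        raise ValueError("H is undefined when no item pair can covary")
    return observed / maximum


# Hypothetical toy data, not drawn from the Pew survey.
guttman_pattern = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
independent_items = [[0, 0], [0, 1], [1, 0], [1, 1]]

print(loevinger_h(guttman_pattern))
print(loevinger_h(independent_items))
```

In this sketch a perfect Guttman pattern yields the maximum value of H, while statistically independent items yield an H of zero; items that fail to reach an acceptable H, as with public evaluation and attachment, fall between these extremes.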