To cite this page using APA-style:

Hammer, J. H. (n.d.). Integrated Behavioral Model of Mental Health Help Seeking: Mixed-Method Protocol. Retrieved [Month] [Date], [Year], from https://helpseekingresearch.com/theory/ibm-hs/applications/mixed-method-protocol/

This page will provide an overview of the sequential mixed-method protocol to systematically identify the primary help-seeking beliefs (i.e., the subset of salient beliefs that most distinguish those who intend to seek help from those who do not) of a population and relevant sociodemographic groups therein. Here is a summary of the steps:

  • Step 1: Identify population of interest and sociodemographic groups therein
  • Step 2: Define the help-seeking behavior
  • Step 3: Conduct mixed-method pilot study
  • Step 4: Develop help-seeking belief measures
  • Step 5 (recommended): Refine help-seeking belief measures via cognitive interviews
  • Step 6: Administer baseline survey containing help-seeking belief measures and direct measures
  • Step 7 (optional): Administer follow-up survey to determine the help-seeking moderators of the relationship between intention and prospective help-seeking behavior

The data from the baseline and follow-up surveys can be used to (a) identify primary help-seeking beliefs and (b) identify the help-seeking determinants that stop people who intend to seek help from successfully accessing care. This information can be used to identify suitable targets for future intervention. Detailed advice on each step is provided below. To read an example of portions of these steps in action, see Hammer and colleagues (2024), where we documented our mixed-method approach to developing and providing validity evidence for the Undergraduate Engineering Mental Health Help-Seeking Instrument (UE-MH-HSI).

Step 1: Identify population of interest and sociodemographic groups therein

The core goal of help-seeking research is to identify the most important constructs that help or stop people from seeking and obtaining mental health care, so that those constructs can be targeted by future interventions designed to close the mental health treatment gap. Which constructs matter most will depend on the population in question. Thus, results found for one population cannot be assumed to generalize to other populations. Similarly, results for an overall sample cannot be assumed to generalize to a given sociodemographic segment within that sample, especially when that segment is in the numerical minority.

We are all human, but our experience as individual humans is powerfully shaped by our environment. Each of us lives within nested systems (e.g., family, neighborhood, region, nation, culture; Bronfenbrenner, 1979) that influence our beliefs about mental health and the structural resources available for restoring mental health (e.g., talk therapy). Thus, the mental health help seeking journey of the average college student studying in the United States will look different than the journey of a person incarcerated by the criminal legal system, a migrant laborer working abroad, or a stay-at-home parent living in a wealthy metropolitan suburb. In each case, there are a host of structural forces (e.g., racism, ableism), cultural influences (e.g., emotional stoicism, loss of face), environmental constraints (e.g., freedom of movement, local availability of mental health professionals), and other help-seeking determinants that have differentially shaped the help-seeking beliefs and opportunities of these individuals.

Thus, what helps or stops people from seeking mental health care can vary drastically depending on which group of people we are talking about. It follows that the broader one’s population of focus (e.g., Black heterosexual women attending four-year colleges in the United States vs. Black heterosexual women living in the United States vs. Black heterosexual women vs. Black women vs. Black people vs. people), the more heterogeneous their help-seeking beliefs and opportunities will be.

This has two important implications. First, the broader the population, the harder it becomes to draw robust conclusions from an overall sample of the population that accurately generalize to most members of that population. Second, the broader the population, the more essential it becomes to conduct subgroup analysis on important sociodemographic groups (on the basis of gender, race, etc.) within that population. Subgroup analysis makes it possible to determine, for each specific sociodemographic group, (a) which primary help-seeking beliefs distinguish those who intend to seek help from those who do not and (b) which help-seeking determinants moderate the degree to which intention translates into successful help-seeking behavior. These findings may be the same, or may differ, across subgroup samples, and in comparison to the overall sample. “Some constructs may be important to address across the entire population, whereas others may be important for only certain sociodemographic groups, and thus optimal targets for group-specific intervention” (Hammer et al., 2024, p. 9).

Therefore, investigators must be intentional about identifying their population, and its sociodemographic segments, of interest. This will help them avoid the ethical misstep of making unwarranted generalizability claims about certain populations or segments. This is particularly important in situations where the numerical majority of a sample are people from a given privileged group (e.g., white people, people with a college education) and the investigators wish to offer conclusions about the population (e.g., adults living in the U.S.A.) as a whole. In such cases, the importance of segment analysis (repeating analyses with a specific segment of the sample) is heightened, to help determine which overall sample effects do or do not generalize to key segments of the sample. When there is insufficient representation of a given segment in a sample, which can lead to issues such as low statistical power, that must be accounted for when discussing generalizability and limitations. Common examples of insufficient attention to this step in extant help-seeking scholarship are the overreliance on (a) college student samples, (b) predominantly white samples, and (c) samples collected from WEIRD (Western, Educated, Industrialized, Rich, and Democratic; Arnett, 2008) cultures, from which overbroad conclusions about “humans in general” are sometimes made.

In summary, care must be taken when determining the boundaries for one’s population of interest for a given project. The population should not be too broad, nor too narrow. The broader the population, the more important it becomes to collect sufficient samples of key sociodemographic groups therein, at each phase of the mixed-method protocol, to ensure that intrapopulation variation can be properly documented.

Lastly, a comment on the interaction of facets of identity, and on intersectionality, is needed. People cannot be reduced down to one facet of their identity (e.g., their race, their gender). People are more than just the additive sum of their parts; they are their whole, complex self. This multiplicity of identity is important to acknowledge (Clemens, 2019; Yampolsky et al., 2013; Yampolsky et al., 2016). Likewise, “examination of structural forces necessitates an intersectional approach (Cole, 2009) that attends to ways multiple systems combine to create unique forms of stress and discrimination, as well as distinct registers of privilege and oppression within any social or cultural group that affect mental health care dynamics” (Grzanka & Miles, 2016) (as cited in Hammer et al., 2024, p. 3). Therefore, when it comes to cultural identity and to systems of power/oppression, users of the IBM-HS are called to account for this complexity, where the whole is greater than the sum of its parts. However, this call can be challenging to answer in practice, especially when it comes to quantitative analysis approaches. Therefore, it is important to acknowledge that, while this Step 1 section emphasizes the importance of sociodemographic subgroup analysis, it is easy to fall into a trap where we conduct and interpret these analyses in such a manner that we only consider one aspect of sociodemographic identity (and its accompanying system of power that privileges and marginalizes people according to their social location related to that identity facet) at a time and ignore multiplicity and intersectionality. There are scholars who have provided guidance on using qualitative and quantitative methods in a manner that honors multiplicity and intersectionality (e.g., Denzin et al., 2017; Garcia et al., 2018; Gillborn et al., 2017; Guan et al., 2021; Levitt et al., 2021; McCormick-Huhn et al., 2019) and, ideally, there will be future scholarship published on how to do this in the specific context of help-seeking research.

Once you have identified the population of interest for a given project and the sociodemographic groups of importance therein, it is time to move to Step 2.

Step 2: Define the help-seeking behavior

Help-seeking behavior and the decision-making that underlies that behavior are dynamic, iterative, and reciprocal (Doblyte & Jiménez-Mejías, 2016; Edge & MacKian, 2010; Pescosolido et al., 1998). For example, it is not uncommon for the help-seeking care pathway to be filled with starts and stops, delays, thwarted attempts, and reliance on a variety of non-professional supports as alternatives or prerequisites to formal help seeking (Chang, 2008; Gulliver et al., 2010; Lindsey et al., 2006).

Here is one mental health care journey, which illustrates this complexity. A person notices that their functioning is impaired and conceptualizes their symptoms and functional impact as indicative of a mental health issue. They consider different strategies for addressing the issue and may or may not engage in a variety of coping and self-help strategies, including seeking the informal support and consultation of trusted others. At some point they may decide that seeking professional help could be an appropriate next step, at which point they must identify what forms and sources of professional help may be possible, which often includes additional information gathering from a variety of sources. Upon identifying a good enough option, they may then attempt to take steps toward initiating contact with a given source, which may be their primary care provider or directly with a mental health provider. This attempt at initiating contact can result in myriad outcomes, including abandoning the attempt to obtain help, seeking out alternative sources of professional help, or scheduling an initial phone/teleconference/in-person screening conversation or intake appointment, and then possibly attending that first contact appointment, which may or may not be the start of an official treatment relationship. In cases where that initial screening indicates to both parties that formalizing and starting a treatment relationship is warranted, a first official appointment is scheduled, and the person may or may not successfully attend that first official appointment. At each of these points, forward momentum toward attending that first appointment with a mental health provider can stall. In sum, there are many steps to the help-seeking process, and the presence, nature, and order of these steps varies across individuals, populations, and contexts.

It is not possible for one theoretical model, such as the IBM-HS, to capture all this complexity while also serving as good theory that is empirically operationalizable. Therefore, good help seeking theory simplifies aspects of the complex help-seeking decision process, a necessary sacrifice that creates utility for users of the theory.

One key aspect of this simplification is that the IBM-HS is designed to facilitate identification of the beliefs that are associated with the intention to engage in a specific and narrowly defined form of mental health help-seeking behavior. Seeking help is narrowly defined in a binary fashion such as attending an initial appointment with a professional (versus not) within a specified span of time. Using a narrow definition allows investigators to obtain precise and compatible measurement of help-seeking constructs in a given study, which is necessary for accurate prediction of a specific prospective help-seeking behavior. However, this narrow focus naturally obfuscates the iterative complexity of the help-seeking process. In other words, the IBM-HS is suitable for predicting intention and prospective behavior when defined in a narrow binary manner but is not suitable for studying the iterative dynamics involved in most people’s mental health care paths. Users interested in a theoretical framework focusing on those iterative dynamics are encouraged to explore alternative models such as the Network-Episode Model (Pescosolido, 2010).

Regarding conceptualization of prospective help-seeking behavior, as noted by Hammer and colleagues (2024, p. 7): “help-seeking behavior has previously been conceptualized as “a problem focused, planned behavior, involving interpersonal interaction with a selected health-care professional” (Cornally & McCarthy, 2011, p. 286). In the context of the IBM-HS, help seekers are seeking assistance for a mental health problem (e.g., anxiety, depression, stress, difficulties related to substance use) or an adjacent issue treated by mental health professionals (e.g., relationship difficulties). Although the mental health care pathway is often complex and involves informal help-seeking (e.g., friends, family; Pescosolido, 2010), the term prospective help-seeking behavior is understood within the IBM-HS to refer to voluntarily seeking assistance from a formal (i.e., professional) source, unless otherwise specified. Whereas some individuals experience involuntary treatment (e.g., psychiatric hospitalization, minor brought to treatment by parent), the IBM-HS is most applicable to consensual, planful, conscious pursuit of help. Unlike help-seeking perceptions, prospective behavior is externally observable by a third party. Prospective behavior can be conceptualized in terms of dichotomies (sought help or not), frequencies (number of sessions attended), or magnitudes (e.g., 10-minute consult, 50-minute session), though a dichotomous conceptualization is most common (Adams et al., 2022), as it most lends itself to being predicted by upstream constructs (Fishbein & Ajzen, 2010). Because the literature often conflates past and prospective help-seeking behavior, we wish to emphasize that past help-seeking behavior refers to help sought prior to baseline data collection, current help-seeking behavior refers to help currently being sought (e.g., I am currently seeing a mental health professional), and prospective (i.e., impending, future) help-seeking behavior refers to help that is sought after baseline data collection.”

As further noted by Hammer and colleagues (2024, p. 7) regarding measurement of prospective behavior: “In the context of mental health, the IBM-HS’s default operational definition of prospective mental health help-seeking behavior is attending a future session with a healthcare professional to acquire the professional’s assistance with addressing a mental health problem. We say “future” to reinforce the fact that prospective behavior is different than the determinants of past experiences with mental health help seeking. We say “default” because users of the IBM-HS are encouraged to define prospective help-seeking behavior in the manner that suits their professional purpose, provided that definition is characterized by the five elements described by Ajzen & Fishbein (1980) of target, action, context, time, and condition. For example, the behavior could be attending an initial session (action) with a mental health professional (target) at the campus counseling center (context) in the next three months (time) if one was experiencing a mental health concern (condition). Regardless of how behavior is defined, the principle of compatibility (Ajzen & Fishbein, 1980) must be observed to ensure that all help-seeking items reference the same exact help-seeking action, target, context, time, and condition. The five elements of action, target, context, time, and condition can be abbreviated as “TACT-C”, with the “C” being an addition formalized by the IBM-HS beyond the standard TACT discussed in prior reasoned action texts.

Regarding action, some users of the IBM-HS may wish to study a particular step of the mental health help-seeking process (e.g., asking their primary care physician for a referral to a mental health specialist, using an internet search engine to look up information about potential mental health providers in their area, filling out a mental health agency’s online intake form, attending a phone screening with a mental health professional, attending an initial working session with a professional).

Regarding target, some users will wish to define behavior as seeking mental health help from a professional (of any kind), a primary care physician, a mental health professional (however defined), a psychologist (specifically), or some other source.

For context, some users may care about a specific context (e.g., counseling center, emergency room) and some may wish to be inclusive across contexts.

For time, some users may be interested in shorter (e.g., in the next two weeks) or longer (e.g., in the next three months) periods of time.

For condition, some may wish to study people who are currently diagnosed as depressed, people who self-identify as having a mental health concern (however defined by the user), or people who screen above a clinical cutoff on a screening measure such as the Kessler-6 (Kessler et al., 2002). Yet others may want to sample from a mixed-distress population (some people currently distressed and some not), in which case the behavior and help-seeking measures would be best framed using conditional language (e.g., “If I had a mental health concern, I would intend to seek help…”; Hammer et al., 2018) instead of unconditional language (e.g., “I intend to seek help…”). For an in-depth guide on defining behavior using the five elements and complying with the principle of compatibility, readers may refer to Chapter 2 of Fishbein and Ajzen (2010).”

Regardless of how help-seeking behavior is defined, it is important that the investigators and the population of interest achieve a shared understanding of the meaning of each part of the definition. For example, the term “mental health professional” is understood differently across people. Some may consider religious leaders (e.g., pastors), school guidance counselors, physicians, and/or hairdressers to be mental health professionals, while others may not. Therefore, when there may be differential interpretation of a term, it is important to provide an accessible definition of the term to participants prior to collecting their responses. This is particularly important when defining the action element. Because help seeking involves a sequence of actions, the phrase “seeking help from a mental health professional” is shorthand for the total sequence involved, with the ultimate step typically conceptualized as attending an initial appointment with a mental health professional. If investigators define seeking help as attending an initial appointment, then investigators must also ensure that the people they collect data from have a shared interpretation of this phrasing. Because the steps involved in seeking help from a mental health professional are often a mystery to people, this increases the risk that a respondent interprets “sought help” using a different step threshold. For example, some respondents may think that seeking help would include calling a mental health practice to inquire about availability, or participating in a brief phone screening with a prospective therapist. Adding to the complexity here is that different mental health providers have different systems for screening potential clients. Some may ask potential clients to first complete an online screening. Others may offer a brief phone consultation as a first step. For some providers, the first session is purely a data-gathering intake session. For others, the first session is a working session. Given this heterogeneity, it is important for investigators to decide what type of “initial contact” constitutes “seeking help” and then ensure that respondents will share this understanding, whether implicitly or through explicit instructions provided at the start of data collection.

Another important consideration when settling on an operationalized definition of help-seeking behavior is related to screening, sampling, and conditionality. In contrast to unconditional intention (e.g., “I intend to seek help in the next 3 months”), conditional intention specifies the condition under which individuals would develop such intention (e.g., “If I had a mental health concern, I would seek help in the next 3 months”). Unlike a universally-applicable behavior like exercising or eating healthy food, seeking mental health care is not a universally relevant behavior for all humans at all times. It is only relevant in certain conditions: when a person has a reason and need to seek help due to having a mental health issue. Pragmatically, if an investigator sampled from a group of high-functioning individuals who are not experiencing symptoms of mental illness or other life challenges that might warrant seeking professional consultation, and asked these respondents about whether they unconditionally intend to seek help from a mental health professional in the near future, a floor effect would occur such that everyone would indicate zero intention to seek help. This lack of variance in intention would preclude the investigator’s ability to study what factors influence this group’s intention to seek help. Therefore, investigators have at least three strategies for avoiding floor effects.

The first strategy for avoiding floor effects is to ensure that only data from people from the population who meet some predetermined mental health distress threshold are analyzed.

  • One version of this strategy uses a psychometrically sound mental health distress screening measure such as the Kessler-6, PHQ-9, GAD-7, or CORE-OM, each of which has published guidelines for categorizing people into distress categories. Because help-seeking can be relevant for people who are both moderately distressed and severely distressed, we encourage investigators to be intentional about the distress cutoff they use. In practice, our team often uses a moderate-distress-or-greater cutoff because we also want to know the factors that influence people to seek help when in moderate distress, as there is value in people seeking help proactively, before their distress potentially becomes severe. However, a limitation of this approach is that it excludes people whose symptom profile would not indicate significant distress but who, if asked whether they feel like they are in distress and might benefit from help, would say yes.
  • Another version of this strategy is to ask respondents to indicate whether they self-identify as having a current mental health concern, and only analyze data for those who say yes. When doing this, the term must be defined for respondents. Hammer and Spiker (2018) used the following item to assess this: “Are you currently experiencing a mental health concern (e.g., difficulties related to depression, anxiety, family or relationship issues, academic or career problems, adjustment issues, alcohol, drug, or addiction problems, eating disorder or body image, grief or loss, abuse or trauma, etc.)? (Yes/No)”. However, the limitation of this approach is that it will exclude people whom a clinician would diagnose as having a mental health concern but who do not themselves interpret their current functioning as indicative of a mental health concern, which is an important group of people to study.
  • Both versions of this first strategy, however, require investigators to oversample the population to ensure collection of a sufficient number of respondents who meet the inclusion criteria. For example, if only 20% of a given population meets distress criteria, then a large number of people will need to be sampled in order to achieve recruitment targets.
  • If using this first strategy, we recommend assessing mental health status up front, so that people who do not meet thresholds can be exited from the remainder of the study before they invest further time and energy, as it is inconsiderate (and can be expensive vis-à-vis participation incentive costs) to collect data that will not be analyzed. A minimal screening-gate sketch follows this list.
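
To make the screening gate concrete, here is a minimal Python sketch. The item names, the self-identification flag, and the cutoff value are hypothetical placeholders; the K6 is typically scored by summing six items rated 0–4, but investigators should substitute the published scoring guidance and threshold appropriate to their chosen measure and distress level.

```python
# Illustrative screening gate for the first strategy (hypothetical names and cutoff).
K6_ITEMS = ["k6_nervous", "k6_hopeless", "k6_restless",
            "k6_depressed", "k6_effort", "k6_worthless"]
DISTRESS_CUTOFF = 5  # placeholder; verify against published scoring guidance


def meets_inclusion_criteria(response: dict) -> bool:
    """Return True if the respondent should continue past the screener."""
    k6_total = sum(response[item] for item in K6_ITEMS)                 # version 1: symptom screener
    self_identified = response.get("self_identified_concern", False)   # version 2: self-identification
    return k6_total >= DISTRESS_CUTOFF or self_identified


# Example: screen respondents up front and exit those who do not meet criteria.
respondents = [
    {"id": 1, "k6_nervous": 3, "k6_hopeless": 2, "k6_restless": 1,
     "k6_depressed": 2, "k6_effort": 3, "k6_worthless": 1,
     "self_identified_concern": False},
    {"id": 2, "k6_nervous": 0, "k6_hopeless": 0, "k6_restless": 1,
     "k6_depressed": 0, "k6_effort": 0, "k6_worthless": 0,
     "self_identified_concern": False},
]
eligible = [r["id"] for r in respondents if meets_inclusion_criteria(r)]
print(eligible)  # -> [1]
```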

The second strategy for avoiding floor effects is to utilize pilot testing or third-party data sources to verify that the population that will be sampled has a large percentage of people who have a mental health concern that would warrant seeking help, thereby ensuring that the collected sample would demonstrate sufficient variability (rather than a floor effect) on help-seeking perception variables, including intention. It is permissible for the collected sample to contain a small number of people with no current mental health concern, but it is a problem if this group is larger than, say, 20% of the sample.

The third strategy, one popular among help-seeking researchers, is to use a conditional frame. Asking people what their help-seeking perceptions would be if, hypothetically, they currently had a mental health concern allows respondents to articulate conditional perceptions that will not be subject to floor effects.

  • The conditional frame can be explained to study participants at the start of each data collection event (e.g., interview, survey). For example, investigators can ask participants to imagine that they have recently been experiencing a set of mental illness symptoms that were negatively impacting their functioning, and to then answer the following questions about their help seeking perceptions as if they were experiencing that hypothetical mental health concern right now in real life. Our team has publications forthcoming that used this strategy for data collection, so check back here in Summer 2024 for an update.
  • Alternatively, the conditionality can be specified within the wording of the questions, scale instructions, and/or item stems. The Mental Help Seeking Intention Scale (MHSIS) provides an example of this approach, with each item beginning “If I had a mental health concern, I would…”. The term “mental health concern” was also defined in the instructions for the scale, given the aforementioned importance of shared meaning of terms between investigators and respondents. Regardless of how conditionality is specified, it is important that this conditionality is consistent across all study measures. If the goal is to measure conditional intention, then attitude, perceived norm, and personal agency should all be conceptualized and measured using the same conditional frame.
  • The benefit of using conditional framing is that people who are not currently in significant mental health distress become eligible to provide data on their help-seeking perceptions, which can be used to inform investigators’ understanding of help-seeking decision making in the population. This is especially useful in higher-risk populations where, even though a given person may not be distressed in this moment, there is a reasonable chance they may be distressed at some later point, in which case understanding their thought processes is useful. To put this in a larger context, whereas only X% of the population meets criteria for a mental illness at any given point in time, XX% of the population will experience a mental illness at some point in their lives. Pragmatically, conditional framing allows all members of a population to contribute data to a study, which makes achieving necessary sample sizes more feasible. Most help-seeking studies sample from mixed-distress populations and use conditional framing (see Adams et al., 2022 for a helpful review of help-seeking studies).
  • The limitations of using conditional framing include measurement imprecision, cognitive load, and complications with predicting prospective help-seeking behavior.
    • Regarding measurement imprecision, research indicates that people are not always good at estimating how they will think, feel, and act in hypothetical situations (CITATIONS TO BE ADDED), so we assume that conditional help-seeking perceptions are imperfect proxies of the unconditional perceptions people would hold if they truly experienced the condition.
    • Regarding cognitive load, it is more cognitively taxing to imagine what one’s perceptions would be in a hypothetical scenario than to report one’s actual perceptions (CITATION TO BE ADDED), and this cognitive load can increase measurement error (CITATION TO BE ADDED).
    • Regarding complications with prediction of behavior, investigators should consider whether they intend to use Time 1 help-seeking perceptions to predict prospective help-seeking behavior as measured at Time 2 (see later Steps in this guide for more information on this). If so, then investigators should consider that conditional measures of help-seeking perceptions will be less accurate predictors of actual future behavior than unconditional measures, as people’s actual behavior is based on their unconditional perceptions. However, unpublished pilot data from the project that led to the publication of the MHSIS indicated that a measure of conditional intention was only slightly less accurate than an unconditional measure that did not specify “If I had a mental health concern…” at the start of each item. The unconditional and conditional measures were strongly correlated, which was only possible because the analyzed sample consisted of people whose situation matched the conditional situation articulated in the items (i.e., all respondents in this sample identified as currently having a mental health concern). The MHSIS, when worded as a conditional measure of intention, was able to predict prospective help-seeking behavior with a reasonable degree of accuracy in this currently distressed sample. Therefore, in practice, conditional measures, while potentially less accurate, appear (at least in this one sample) sufficiently accurate to predict future help-seeking behavior. However, a prerequisite for conditional help-seeking measures’ ability to predict prospective help-seeking behavior is that there are enough Time 1 respondents who meet criteria for the condition to provide adequate statistical power for such prediction analyses. For example, if the condition is “If I had a mental health concern”, investigators would need to verify—through the use of mental health distress screening measures or self-identification measures described above—that enough Time 1 respondents do currently have a mental health concern. Investigators would only collect Time 2 data on prospective help-seeking behavior from those who met the condition at Time 1, as only this portion of the Time 1 participants were providing conditional help-seeking perception responses that could be fairly argued to approximate their unconditional help-seeking perceptions. Therefore, investigators who want to predict prospective help-seeking behavior and also want to use conditional help-seeking perception measures must ensure that enough Time 1 respondents are actually experiencing that hypothetical condition in real life at the time of initial data collection (see the sketch after this list).
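
To illustrate the Time 1 condition filter and the downstream prediction analysis, here is a minimal pandas sketch. All variable names and values are fabricated toy placeholders; a real analysis would require a far larger sample and an appropriate model (e.g., logistic regression) rather than a simple correlation.

```python
import pandas as pd

# Fabricated Time 1 data purely for illustration: a conditional intention score plus a
# flag recording whether the respondent actually met the hypothetical condition
# (e.g., via a distress screener or self-identification item).
t1 = pd.DataFrame({
    "pid": [1, 2, 3, 4, 5, 6],
    "intention": [6.3, 2.0, 5.7, 3.3, 6.7, 1.7],        # conditional intention (hypothetical values)
    "meets_condition": [True, False, True, True, True, False],
})

# Only respondents who met the condition at Time 1 are recontacted at Time 2.
followup = t1[t1["meets_condition"]]
print("Recontact at Time 2:", followup["pid"].tolist())  # -> [1, 3, 4, 5]

# Fabricated Time 2 data: dichotomous prospective behavior
# (1 = attended an initial appointment during the follow-up window, 0 = did not).
t2 = pd.DataFrame({"pid": [1, 3, 4, 5], "sought_help": [1, 1, 0, 1]})

analysis = followup.merge(t2, on="pid")
# Quick point-biserial estimate of the intention -> behavior association.
print(analysis["intention"].corr(analysis["sought_help"]))
```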

Our final advice for Step 2 is that there are different levels of generality vs. specificity that can be considered for each of the TACT-C elements. This table provides examples of different levels of generality:

Level of generality | Action | Target | Context | Time | Conditionality
--- | --- | --- | --- | --- | ---
Very specific | My attending the first post-intake treatment session | With Dr. Jane Doe, LP | At Healing Arts Counseling Center | Next Thursday | If I was experiencing racing thoughts and feelings of overwhelm five times in the last two weeks
Somewhat specific | My attending an initial session | With a psychologist | In your local area | In the next two weeks | If I was experiencing anxiety
Somewhat general | My seeking help | From a mental health professional | (none specified) | In the next three months | If I had a mental health concern
Very general | My seeking help | From a professional | (none specified) | (none specified) | (unconditional)

Regardless of which level of generality investigators choose, they should observe the principle of compatibility (Ajzen & Fishbein, 1980) to ensure that all help-seeking construct measures utilize the same exact level of generality across the five elements. For example, the principle of compatibility would be violated twice over if intention was measured with the item “I will seek help from a psychologist in the next week” but prospective behavior was measured with the item “I sought help from a mental health professional in the past 3 months”, as both target and time are inconsistently defined. To ensure accurate assessment of the strength of associations among help-seeking constructs, all measures of these constructs should be fully compatible.
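
One lightweight way to enforce this during survey construction is to record the TACT-C definition once and audit every measure against it. The sketch below is a hypothetical illustration (the class name, field values, and measure labels are ours, not part of the IBM-HS), assuming each measure in the survey declares the behavior definition its items reference.

```python
from dataclasses import dataclass

# Hypothetical representation of a TACT-C behavior definition.
@dataclass(frozen=True)
class TACTC:
    target: str
    action: str
    context: str
    time: str
    condition: str

behavior_definition = TACTC(
    target="a mental health professional",
    action="attending an initial session",
    context="(none specified)",
    time="in the next three months",
    condition="if I had a mental health concern",
)

# Each measure declares the TACT-C definition its items reference.
measures = {
    "intention": behavior_definition,
    "attitude": behavior_definition,
    "prospective_behavior": TACTC(
        target="a psychologist",              # incompatible target
        action="attending an initial session",
        context="(none specified)",
        time="in the next week",              # incompatible time
        condition="if I had a mental health concern",
    ),
}

incompatible = [name for name, spec in measures.items() if spec != behavior_definition]
if incompatible:
    print("Principle of compatibility violated by:", incompatible)  # -> ['prospective_behavior']
```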

Once you have carefully defined help-seeking behavior in a manner that is conceptually, practically, and operationally sound, it is time to move to Step 3.

Step 3: Conduct mixed-method pilot study

Step 3 involves conducting a mixed-method (qualitative and quantitative methods working in synergy) pilot study with a representative sample of the population of interest. First, we’ll talk about the qualitative portion (elicitation interviews or surveys), then we’ll talk about the quantitative portion (piloting direct measures).

Representativeness can be defined sociodemographically, as well as by individual differences that prior research has indicated can shape help-seeking perceptions. In this mental health help-seeking context, we recommend ideally sampling 15-20 people from each sociodemographic subgroup (i.e., segment) of interest, such as segments defined by gender, race/ethnicity, presence of significant psychological distress (as measured by a mental health symptom screening measure such as the K6, GAD-7, or PHQ-9), and prior (in)experience with obtaining mental health care. Adequate representation in this pilot sample helps ensure that beliefs salient to different segments of the population of interest are identified, which will influence the cross-group validity of the future measurement instruments and the ability to account for the beliefs that differentially matter across segments for the purposes of future intervention planning or testing.

The qualitative portion of this pilot involves elicitation interviews or surveys.

If doing surveys, care must be taken to ensure the beliefs elicited are rich and descriptive enough to allow for the creation of high quality belief items in Step 4. In most cases, we do not recommend the use of focus groups to collect these beliefs, because the interpersonal dynamics of a focus group can bias which beliefs get shared and which do not, reducing the heterogeneity of beliefs raised by participants.

If doing interviews, these semi-structured interviews should ask participants to describe their outcome beliefs, experiential beliefs, beliefs about others’ expectations, beliefs about others’ behavior, and logistical beliefs.

The following table provides sample questions to elicit these personal salient beliefs—the readily accessible beliefs individual interviewees have about seeking help. The prompt should be reworded to fit the definition of help-seeking behavior established in Step 2 to ensure compliance with the principle of compatibility.

Prompt: Let’s imagine that you have been experiencing a serious mental health concern for the past month. You have been feeling overwhelmed, isolated from others, and are having trouble sleeping and doing your work. I am going to ask you some questions about how you might feel about seeking help if you were experiencing this hypothetical mental health concern right now. When it comes to the possibility of your seeking help from a mental health professional in the next three months…

Belief type | Sample questions
--- | ---
Outcome beliefs | What would be the advantages/positive effects and disadvantages/negative effects of seeking help? What would you like/dislike about seeking help?
Experiential beliefs | Please complete the following sentence with a feeling word: “When I think about the idea of my seeking help from a mental health professional, I feel ___.” Are there any other feelings that would come up? (Optional: present the participant with a list of feeling/emotion words to choose from, if they have trouble generating feeling words on their own)
Beliefs about others’ expectations | What important people/groups in your life would approve/support or disapprove of your seeking help?
Beliefs about others’ behavior | What important people/groups in your life would seek help from a mental health professional if they were experiencing this hypothetical mental health concern?
Logistical beliefs | What things would make it easy/hard for you to seek help? What things would help/stop you from seeking help?

Montaño and Kasprzyk (2015) note on page 108 that “ideally, interviews should be continued until ‘saturation,’ when no new responses are elicited. The process has been described in detail by Middlestadt and colleagues (Middlestadt, 2012; Middlestadt, Bhattacharyya, Rosenbaum, Fishbein, & Shepherd, 1996).”

Deductive content analysis of interview responses is then used to identify modal salient beliefs, which are used in Step 4 to construct self-report measures of help-seeking beliefs. Modal salient beliefs are those personal salient beliefs held with the greatest frequency in the population of interest. Researchers group together personal salient beliefs mentioned across the interviews that refer to similar themes and count the frequency with which each theme was mentioned. This is akin to identifying semantic codes in thematic analysis (Braun & Clarke, 2012).

When trying to decide whether two personal salient beliefs refer to the same or different themes, the researcher should ask whether the two things could reasonably be stated by the same person; if several respondents mentioned both beliefs, then there are grounds to treat them as separate beliefs (Fishbein & Ajzen, 2010, p. 102). For example, the beliefs that seeking help would “help me find a solution to my problem” and “help me fix my issues” express the same idea in slightly different grammatical ways and would typically be combined into a single theme.

Researchers may also combine two different but related lower-level themes into a higher-level theme that accurately captures the core idea of these two themes; this higher-order theme may then appear frequently enough to qualify as a modal salient belief. For example, the beliefs that seeking help would “increase my peace of mind” and “decrease my stress” may be considered similar enough to justify combining these into a higher-order theme of “make me feel better”.
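
As a toy illustration of the tallying that supports this grouping, the sketch below counts how often each (hypothetical) belief theme was mentioned once interview statements have been assigned to themes. Whether to count every mention or at most one mention per interviewee is a judgment call; the sketch counts each theme once per interviewee.

```python
from collections import Counter

# Toy illustration: each elicited statement has already been assigned (deductively)
# to a belief theme. Participant IDs and themes are hypothetical.
coded_responses = [
    ("P01", "make me feel better"), ("P01", "cost too much money"),
    ("P02", "make me feel better"), ("P02", "bring shame on my family"),
    ("P03", "make me feel better"), ("P03", "cost too much money"),
    ("P04", "take too much time"),
]

# Count each theme at most once per interviewee (one possible counting choice),
# so a single talkative participant cannot inflate a theme's frequency.
theme_counts = Counter(theme for _, theme in set(coded_responses))
print(theme_counts.most_common())
# e.g., [('make me feel better', 3), ('cost too much money', 2), ...]
```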

Having talked about the qualitative portion of Step 3, we’ll now talk about the quantitative portion of Step 3. Please note that researchers are not required to collect pilot quantitative data during Step 3, but this is highly recommended as it can detect potentially problematic issues and save researchers from severe headaches later in the process.

The quantitative survey portion of this mixed-method pilot study involves administering direct measures of help-seeking intention, the three help-seeking mechanisms, and determinants of interest (at minimum, it is important to include a measure of past experience with mental health help seeking; see “Issue A” bullet point below for explanation). See the “Measuring Help-Seeking Beliefs” section of Hammer and colleagues (2024) for a description of indirect versus direct measures. Template measures for some of these constructs are available on this website for download, adaptation, and translation.

As always, measures should be aligned with your project’s definition of help-seeking behavior. For example, the intention template measure is the Mental Help Seeking Intention Scale (MHSIS; Hammer & Spiker, 2018), which is a 3-item measure (sample item: “If I had a mental health concern, I would intend to seek help from a mental health professional in the next 3 months.”) that defines “mental health professionals” to include psychologists, psychiatrists, clinical social workers, and counselors and defines “mental health concerns” to include issues ranging from personal difficulties (e.g., related to loss of a loved one) to mental illness (e.g., anxiety, depression). However, if a research team wanted to study intention to seek help from a campus counseling center for help with depression in the next month, then the instructions and items would need to be adjusted accordingly (e.g., “If I had depression, I would intend to seek help from the campus counseling center in the next month.”).
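
One simple way to keep such adaptations compatible across an entire instrument is to generate item stems from the TACT-C elements rather than hand-editing each item. The helper below is a hypothetical sketch (the wording template is ours, not the published MHSIS wording), assuming a conditional frame.

```python
# Hypothetical helper for regenerating compatible item stems when the TACT-C
# definition changes. Template wording is illustrative only.

def intention_item(condition: str, target: str, time: str) -> str:
    return f"If I had {condition}, I would intend to seek help from {target} {time}."

# Default-style definition vs. a project-specific adaptation:
print(intention_item("a mental health concern", "a mental health professional", "in the next 3 months"))
print(intention_item("depression", "the campus counseling center", "in the next month"))
```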

If you intend to have the same participants provide both the qualitative data and quantitative data in Step 3, we recommend having people first complete the quantitative portion prior to completing the qualitative portion, because the qualitative elicitation process is likely to bring a wider variety of help-seeking beliefs to their awareness than are usually salient for them in a given day, and such priming might influence how they would otherwise respond to the quantitative survey items. There is no empirical data of which we are aware that has directly tested this priming hunch, though.

This quantitative survey data is used to provide information about five relevant issues. Several of these issues have been adapted from Fishbein & Ajzen (2010) and customized to the mental health help-seeking context.

  • Issue A is whether the measure of past experience with mental health help seeking supports the notion that the prevalence of help seeking behavior in the population is indeed low. If most people in a population are already seeking help, there is no significant treatment gap, and therefore it will be hard to justify investing limited resources into attempting to close a treatment gap that does not exist.
  • Issue B is whether the mean score on the intention measure indicates that a priority for future intervention is to increase intention to seek help. As noted by Hammer and colleagues (2024, p. 6), it is common for a population to contain a first group of people who need intervention to increase help-seeking intention as well as a second group of people who need intervention to remove barriers that attenuate their ability to successfully actualize their help-seeking intention as prospective help-seeking behavior. The pilot study can offer an imperfect glimpse at the second group by comparing the frequency of past experience with mental health help seeking versus people’s current intention: if there is little past experience yet strong intention among a significant portion of the pilot sample, that suggests (but does not prove) that there is a second group of people who need intervention to remove barriers that reduce the correlation between intention and prospective behavior. Please note that personal agency, as defined by the IBM-HS, is about people’s subjective perception of their ability to seek help. Low personal agency among a significant portion of the population indicates a need to increase personal agency, but high personal agency among a significant portion of the population does not guarantee that they will be able to easily turn their intention into prospective behavior, because people who self-report high personal agency (especially those who have never sought formal help before) are sometimes unaware of all the hidden barriers they are about to encounter in their quest to seek help. Another piece of data that is useful for understanding the potential gap between help-seeking intention and behavior is the degree to which intention is associated with past help seeking behavior and the degree to which this relation is moderated by personal agency, as these relations can imperfectly foreshadow our ability to predict prospective help seeking behavior. A lack of relationship between intention and past help seeking behavior suggests (but does not prove) that intention is unlikely to accurately predict future help seeking behavior and that interventions designed to increase intention are less likely to be fruitful.
  • Issue C is the degree to which attitude, perceived norm, and personal agency are associated with intention in the context of bivariate correlations and multiple regression (though pilot sample size may be too small to adequately power a multiple linear regression analysis). This foreshadows the degree to which each help-seeking mechanism predicts intention and which of the mechanisms are the strongest predictors, respectively. In the case that all three mechanisms fail to account for substantial variance in intention, this indicates either a psychometric problem (which can be a sign that further measure refinement needs to occur) or that a future intervention designed to influence these mechanisms is unlikely to be effective at closing the portion of the treatment gap that is due to insufficient intention formation. When modeling/scoring the mechanisms, users should be mindful of the dimensionality of the measures used, per Hammer and colleagues (2024, p. 8): “The three help-seeking mechanisms are each conceptualized to have two elements (e.g., perceived norm has injunctive and descriptive elements). However, these elements may be best operationalized as either (a) two separate latent factors or (b) two inseparable facets of the same latent factor, depending on the measure, population, and sample in question. Thus, users are encouraged to use factor analysis to verify the dimensionality of each mechanism’s direct measure in their sample before committing to a given modeling and scoring strategy.” There is past empirical support, in the wider reasoned action literature, for both the separate factors and inseparable elements approaches to measuring these mechanisms (see p. 184-185 of Fishbein & Ajzen, 2010).
  • Issue D is relevance testing for help-seeking determinants. There may be determinants of interest to the research team that are thought to have an indirect effect (via the help-seeking mechanisms) on intention to seek help. Pilot data can be used to determine whether or not there is preliminary evidence that a given determinant is associated with intention and thus potentially relevant in indirectly shaping help seeking decision making.
  • Issue E is psychometric problems. Problems with the wording and structure of items, response scales, and instructions may lead to issues with variability, dimensionality, or internal consistency. Because the pilot sample is small and affords low power, psychometric testing is simple and preliminary. For example, we advise verifying that the mechanism and intention measures (a) do not demonstrate ceiling/floor effects or strong skewness/kurtosis and (b) evidence initial internal consistency and unidimensionality per review of a bivariate correlation table, which should indicate stronger correlations among items from the same measure and weaker correlations between items from different measures (the small sample size of the pilot test may not provide enough power to generate reliable Cronbach’s alpha estimates; see Bujang et al., 2018 for a guide). A minimal analysis sketch follows this list.
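
Here is a minimal Python sketch of the kinds of checks described for Issues C and E. The dataset is random noise generated purely for illustration, and the variable names are hypothetical; in a real pilot, these checks would be run on the collected responses and interpreted cautiously given the small sample size.

```python
import numpy as np
import pandas as pd

# Fabricated pilot dataset: three attitude items, mean scores for the other two
# mechanisms, and an intention score. Random noise, for illustration only.
rng = np.random.default_rng(0)
n = 40
pilot = pd.DataFrame({
    "att_1": rng.integers(1, 8, n), "att_2": rng.integers(1, 8, n), "att_3": rng.integers(1, 8, n),
    "perceived_norm": rng.uniform(1, 7, n),
    "personal_agency": rng.uniform(1, 7, n),
    "intention": rng.uniform(1, 7, n),
})
pilot["attitude"] = pilot[["att_1", "att_2", "att_3"]].mean(axis=1)

# Issue E: distribution checks (floor/ceiling effects, skewness/kurtosis) and a
# quick look at the inter-item correlation matrix for the attitude items.
print(pilot[["attitude", "perceived_norm", "personal_agency", "intention"]].describe())
print(pilot["intention"].skew(), pilot["intention"].kurt())
print(pilot[["att_1", "att_2", "att_3"]].corr().round(2))

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Standard Cronbach's alpha; interpret cautiously in small pilot samples."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(round(cronbach_alpha(pilot[["att_1", "att_2", "att_3"]]), 2))

# Issue C: bivariate correlations of the three mechanisms with intention (a
# multiple regression could follow, but pilot n is usually too small to power it).
print(pilot[["attitude", "perceived_norm", "personal_agency"]].corrwith(pilot["intention"]).round(2))
```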

Once you have conducted the pilot study, it is time to move to Step 4.

Step 4: Develop help-seeking belief measures

In Step 3, content analysis of interview responses was used to identify modal salient beliefs, which are used here in Step 4 to construct self-report indirect measures of help-seeking beliefs. See the “Measuring Help-Seeking Beliefs” section of Hammer and colleagues (2024) for a description of indirect versus direct measures.

The construction of help-seeking belief measures requires balancing competing considerations.

On one hand, the more modal salient beliefs that make the cut for inclusion in each belief measure, the more comprehensive and sensitive that measure can be for identifying primary beliefs (i.e., the subset of salient beliefs that most distinguish those who intend to seek help from those who do not) in the overall population and sociodemographic segments of interest. This is particularly true in the latter case, in that it is common for certain beliefs to be predictive of intention to seek help for some groups but not others (e.g., “my seeking help would bring shame on my family”, “my seeking help would reinforce stereotypes about people from my cultural group”). Because cross-cultural validity is a desirable attribute in measurement, including a robust set of beliefs that are particularly important for segments that may not be in the numerical majority in the selected population can help ensure that the primary beliefs for important segments are captured in the instrument.

On the other hand, longer survey length can result in lower participant recruitment and retention, higher incentive costs, and lower data quality (Lee et al., 2004; Gibson & Bowling, 2019). Thus, there is an incentive to limit the number of items composing each help-seeking belief measure to ensure the feasibility of the instrument. This incentive is greater in situations where the intended users of the instrument have limited funding to incentivize completion of the survey and where participants are more likely to experience difficulty or irritation in completing a lengthy survey (e.g., people with cognitive impairments). Therefore, developers of help-seeking belief measures must consider these factors simultaneously and generate measures that offer the best tradeoff of these competing factors.

In terms of proposed quantitative thresholds for determining how many modal salient beliefs are incorporated as items, Fishbein and Ajzen (2010) review several common thresholds on page 103 and ultimately recommend that developers “choose beliefs by their frequency of emission until we have accounted for a certain percentage, perhaps 75%, of all responses listed. For example, if the total number of responses provided by all participants in the elicitation sample was 600, a 75% decision rule would require that we select as many of the most frequently mentioned outcomes as needed to account for 450 responses.” However, we want to reinforce that, while such thresholds are a useful heuristic, ultimately a careful comparative analysis should be done to balance comprehensiveness with feasibility.
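
The decision rule is easy to operationalize once theme frequencies are in hand. The sketch below is a hypothetical illustration (the theme labels and counts are invented, and the counts happen to sum to 600 to mirror Fishbein and Ajzen's example): themes are retained in descending order of frequency until the retained themes account for at least 75% of all elicited responses.

```python
# Sketch of the ~75% decision rule: keep the most frequently mentioned belief
# themes until the retained themes cover the chosen share of all responses.
def select_modal_beliefs(theme_counts: dict, coverage: float = 0.75) -> list:
    total = sum(theme_counts.values())
    selected, covered = [], 0
    for theme, count in sorted(theme_counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= coverage * total:
            break
        selected.append(theme)
        covered += count
    return selected

# Hypothetical counts (sum to 600, so the rule targets 450 responses).
outcome_theme_counts = {
    "make me feel better": 140, "cost too much money": 120, "take too much time": 90,
    "bring shame on my family": 80, "mean I am weak": 60, "lead to a diagnosis": 50,
    "help me understand myself": 40, "require retelling my story": 20,
}
print(select_modal_beliefs(outcome_theme_counts))
# -> the five most frequent themes, which together account for 490 (>= 450) responses
```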

In terms of typical numbers of belief items mentioned in the wider reasoned action literature, our non-systematic review of this literature found the number of beliefs has ranged from 9 to 52 for outcome beliefs, 19 for experiential beliefs, 2 to 27 for referents (i.e., relevant individuals or groups about whom respondents in this population may have beliefs about others’ expectations/behaviors), and 4 to 29 for logistical beliefs, as follows:

  • Montaño & Kasprzyk (2015) page 112 – 13 outcome beliefs, 4 referents, 11 logistical beliefs
  • Montaño & Kasprzyk (2015) page 113 – 38 outcome beliefs, 21 referents, 29 logistical beliefs
  • Montaño and colleagues (2018) page 4 – 52 outcome beliefs, 27 referents, 29 logistical beliefs
  • Kasprzyk & Montaño (2007) page 159 – 9 to 14 outcome beliefs, 4 to 6 referents, 6 to 11 logistical beliefs
  • Hammer and colleagues (in press) page TBD – 37 outcome beliefs, 19 experiential beliefs, 11 referents, 20 logistical beliefs
  • Daigle and colleagues (2002) page 168 – 12 outcome beliefs, 2 referents, 4 logistical beliefs

In line with best practices in scale development, belief items should “(a) incorporate interviewees’ lay language, (b) avoid conceptual redundancy among items, (c) use clear and accessible terminology and syntax, (d) avoid use of double-barreled items, and (e) cohere with the response anchors (DeVellis, 2016)” as cited by Hammer et al. (in press, p. X).

It is also permissible to add belief items from the wider help-seeking literature if the researchers expect these beliefs could be strongly associated with intention despite not being modal salient beliefs identified from the elicitation interviews.

Because the influence of a given belief on one’s help-seeking attitude, perceived norm, personal agency, intention, and/or prospective behavior is dependent on one’s evaluation of the value or implication of that belief, it is important to account for this evaluation when designing help-seeking belief measures. Some beliefs are likely to be evaluated similarly across members of the population of interest. For example, the outcome belief that “my seeking help would help me feel better” is likely to be evaluated as a positive outcome by almost everyone. As another example, the logistical belief that “I will not have the money necessary to seek help from a mental health professional” is likely to be evaluated as a barrier to, rather than a facilitator of, seeking help. In contrast, while some respondents might view “receiving a mental illness diagnosis from a mental health professional” as a positive outcome to seeking help for their mental health, others might see this as a negative outcome. Therefore, when “people vary in their evaluation of the value of a given belief, it is important to include a matching evaluation measure item for that belief in the IBM instrument” (Hammer et al., in press, p. X).

The structure of the evaluation items varies depending on the type of help-seeking belief (Fishbein & Ajzen, 2010).

  • To create evaluation items for outcome beliefs (i.e., “outcome evaluation”), we recommend asking respondents to indicate how bad versus good it would be if their seeking help resulted in each outcome (e.g., “How good or bad would it be if your seeking help [resulted in you receiving a mental health diagnosis]?”).
  • To create evaluation items for beliefs about others’ expectations, we recommend asking respondents to indicate how much they would care about the opinion of each person/group (i.e., “motivation to comply”) when it comes to their decisions about mental healthcare (e.g., “How much would you care about the opinion of your [friends] when it comes to your decisions about mental healthcare?”).
  • To create evaluation items for logistical beliefs, we recommend asking respondents to indicate how much easier versus harder it would be (i.e., “perceived power”) to seek help if a given barrier or facilitator was present (e.g., “How much easier would it be to seek help if you had immediate walk-in access to a mental health professional?”).

To properly account for how evaluation influences the degree to which beliefs shape downstream help-seeking constructs, the standard practice is to multiply the score of the belief item by the score of its corresponding evaluation item, thereby creating a product term that weights the scores of the belief according to its evaluation by that respondent. For example, in developing the UE-MH-HSI for engineering students, we created weighted outcome belief items (with scores ranging from -12 to +12) by multiplying the outcome belief measure item score (ranging from 1 to 7) by the corresponding outcome evaluation item (ranging from -2 to +2).
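
As a minimal numeric sketch (assuming, purely for illustration, a 1–7 belief-strength scale and a −3 to +3 evaluation scale; projects may use different scalings, as in the UE-MH-HSI example above), the weighting is just an item-by-item product:

```python
# Hypothetical respondent scores on two outcome beliefs and their evaluations.
belief_strength = {"receive a diagnosis": 6, "feel better": 7}        # rated 1..7 (illustrative scale)
outcome_evaluation = {"receive a diagnosis": -2, "feel better": 3}    # rated -3..+3 (illustrative scale)

# Weight each belief by its evaluation to form the product term.
weighted = {outcome: belief_strength[outcome] * outcome_evaluation[outcome]
            for outcome in belief_strength}
print(weighted)  # -> {'receive a diagnosis': -12, 'feel better': 21}
```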

For some beliefs, it will be obvious how the belief is likely to be evaluated, and evaluation items paralleling those belief items do not need to be included in the survey. For other beliefs, developers will expect that evaluations will vary (or will be unsure whether there will be variation or not), in which case one should include evaluation items for those beliefs. Ideally, developers will look for evidence of whether evaluations vary for a given belief by including evaluation items in the Step 3 survey, as this can allow the developer to drop evaluation items that prove irrelevant, thereby reducing survey length and the corresponding participant burden.

Up until this point, we have been discussing evaluation in terms of direction (e.g., positive versus negative). However, evaluation can also vary in terms of degree (e.g., weakly positive versus strongly positive). Therefore, some IBM users have opted to create and use evaluation items for all beliefs. However, in our experience to date, we have found that the evaluation items fail to offer sufficient utility when only the degree, but not the direction, of the evaluation varies. For example, when developing the UE-MH-HSI, we found that evaluation-weighted versions of our logistical belief items, on average, correlated with the personal agency mechanism score and intention score only slightly more strongly than the unweighted versions of our logistical belief items. This meager improvement in correlational strength was deemed insufficient to justify doubling the length of the logistical beliefs portion of the survey. Therefore, we encourage developers of IBM-HS-based help-seeking belief measures to prioritize including evaluation items for beliefs that vary in direction of evaluation and, if possible, utilize pilot testing to determine whether inclusion of evaluation items for beliefs that vary only in degree is necessary to accomplish their research aims. Fishbein and Ajzen (2010) Chapters 3, 4, and 5 offer further guidance on constructing evaluation measures and discuss other potential types of evaluation items (e.g., “identification with referent”) that, in our experience, do not offer enough consistent utility to warrant incorporation into the IBM-HS.

Once you have developed help-seeking belief measures (and any necessary accompanying evaluation items), it is time to move to Step 5.

Step 5 (recommended): Refine help-seeking belief measures via cognitive interviews

Step 5 is recommended but not required.

When you develop items for any measure, even on the basis of quotes from qualitative interviews with the population of focus, there is no guarantee that respondents from that population will interpret and respond to those measure items in the manner anticipated by the measure developer. This is especially true because each population contains a variety of sociodemographic segments, and people in a given segment may interpret the language in the measure in significantly different ways. Therefore, because a good measure of a construct measures what it is intended to measure, it is prudent to collect data from the population of interest (including members of key sociodemographic segments therein) regarding the degree to which respondents are interpreting and responding to the measure instructions, items, and scaling in the manner intended.

One of the most robust ways to collect this psychometric data is by conducting cognitive interviews (Blair & Presser, 1993; Meadows, 2021; Buschle et al., 2021; Willis, 2015; Petersen et al., 2017; Willis & Boeije, 2013; Douglas & Purzer, 2015).

“Cognitive interviewing (CI), also known as cognitive testing (CT) or cognitive debriefing (CD), is a qualitative survey development method used in questionnaire design and should not be confused with cognitive interviewing to assess mental status. The objective of cognitive interviewing is to gain insights into respondents’ understanding of survey items as intended by the instrument developer.” (Meadows, 2021, p. 375).

We recommend that cognitive interviews integrate elements of the “think-aloud” and “direct probe” paradigms to ensure that the help-seeking belief measures are easy for participants to understand and that participants interpret the instructions, rating scales, and items in the manner intended by the developer. Cognitive interview data reveal unforeseen problems with the language in a measure, allowing the measure developer to further tweak the measure (e.g., adding items, removing items, rewording items, adjusting the rating scale, adjusting the instructions, adjusting how the question is presented on the page) to eliminate or reduce these problems.

For example, when conducting cognitive interviews with engineering students, we discovered that a number of items were being interpreted in ways we did not intend, that some terms were confusing to international students who spoke English as a second language, and that some instructions needed clarifying. Cognitive interviewing is especially useful when measuring beliefs in a conditional context (i.e., when researchers are interested in intention to seek help in a hypothetical situation in which the respondent is experiencing psychological distress), because the conditional framing adds grammatical complexity to the measure instruction stems and increases the cognitive load of answering the questions.

Cognitive interviews are best conducted in iterative waves: a round of interviews is conducted, the resulting insights are used to refine the measures, and the refined measures are then subjected to the next round of interviews. This cycle repeats until few significant issues are identified, indicating that the most serious limitations have likely been identified and addressed.

As noted above, maximizing consistency of understanding across respondents from varying sociodemographic backgrounds within the population of interest is important for the cultural responsiveness/relevance/sensitivity and, therefore, the validity, of the measure. Thus, as was done in Step 3 when conducting elicitation data collection, it is important that cognitive interviews are done with a diverse array of respondents from the population, potentially with an oversampling of respondents from groups (e.g., those who are not yet fluent in the language the measure is being administered in, neurodiverse folx) that may be more likely to interpret the language of the measure in ways different than the measure developer anticipated.

Not only does cognitive interviewing help improve the measure, but it can also provide content/substantive evidence of validity (data capturing respondents’ cognitive response processes), which is essential to demonstrating the psychometric soundness of a new measure. To learn more about validity and reliability testing, we recommend DeVellis (2021), the Standards for Educational and Psychological Testing (2014), and the NCME Statement about the Standards.

Once you have refined your help-seeking belief measures, you are well positioned to move to Step 6.

Step 6: Administer baseline survey containing help-seeking belief measures and direct measures

The Time 1 baseline survey includes measures for help-seeking intention, mechanisms, beliefs (plus relevant belief evaluation items), demographics, and determinants of interest. Many such measures can be found in the Measures portion of this website. The UE-MH-HSI (Hammer et al., in press) and IBM-HS-Q (Hammer et al., under review) scale development papers provide examples of some of these measures in action.

Because the help-seeking beliefs measures can often be quite long, this can increase the risk of participant survey fatigue. Depending on the nature of the sample and the survey incentive structure, it may be necessary to reduce overall survey length by using a planned missingness design (for guidance, see Zhang & Sackett, 2023; Zhang & Yu, 2021; Graham et al., 2006; Noble & Nakagawa, 2021; Little & Rhemtulla, 2013).

For example, when developing the UE-MH-HSI, “participants were presented with the direct measures but about half of the participants were randomly presented with either the (a; n = 274) outcome beliefs measure, outcome evaluation measure, and experiential belief measure or (b; n = 297) beliefs about others’ expectations measure, beliefs about others’ behavior measure, and logistical beliefs measure” (p. X, Hammer et al., under review). This is permissible because it is not necessary to include items from different help-seeking belief measures in the same simultaneous analysis in order to obtain the findings necessary to proceed with the systematic identification of primary/key help-seeking factors. However, care must be taken to balance the efficiency of a planned missingness design against the reduced power to examine bivariate effects within important segments of the sample that are already in the numerical minority. Another example of planned missingness is our ongoing data collection using the EMMHI, in which we use Qualtrics’ advanced randomization function to randomly present each participant with only a portion of the items on each measure (both direct and indirect) and use FIML estimation in Mplus to run analyses with the full power of the entire sample, despite each participant not having completed all items for all measures.
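
As a concrete illustration, here is a minimal Python sketch of how respondents could be randomly assigned to indirect-measure blocks under a planned missingness design. The respondent IDs, block labels, and block contents are illustrative assumptions rather than a prescribed implementation, and the randomization itself would typically be handled by the survey platform.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2024)

# Hypothetical respondent IDs for a baseline sample of 571 people
# (mirroring the 274 + 297 split described above).
respondents = pd.DataFrame({"pid": range(1, 572)})

# Randomly assign each respondent to one of two indirect-measure blocks:
# block A (outcome beliefs, outcome evaluations, experiential beliefs) vs.
# block B (beliefs about others' expectations, beliefs about others'
# behavior, logistical beliefs). Everyone receives the direct measures.
respondents["belief_block"] = rng.choice(["A", "B"], size=len(respondents))

# The unassigned belief block is missing by design and can later be handled
# with FIML estimation or multiple imputation.
print(respondents["belief_block"].value_counts())
```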

We have several recommendations regarding the structure and format of the baseline survey.

  • First, if using a hypothetical scenario vignette (e.g., “As you answer the questions in this survey, we would like you to imagine something. Imagine that you have been experiencing a serious mental health concern for the last month. You feel significantly more overwhelmed, feel isolated from others, are having trouble sleeping, and are earning lower grades on your coursework. In the following pages of this survey, we’re going to ask you some questions about how you—given your personal views and experiences to date—might feel about seeking help from a mental health professional if you were dealing with this hypothetical mental health concern right now”; Hammer et al., in press, p. X), make sure the vignette is presented before any of the items, in the form of start-of-survey instructions.
  • Second, start-of-survey instructions should also define key terms such as “mental health professionals” and “mental health concern” used in the subsequent survey items (see Hammer et al., in press, for example definitions).
  • Third, we have learned from cognitive interviews that some respondents find the repetitiveness of the help-seeking mechanism items annoying, and thus we also recommend that the start-of-survey instructions provide context for this up front (e.g., “The questions on these next three pages may seem repetitive. This is an intentional, evidence-based, and necessary part of the survey design. Please bear with us, and we apologize for the inconvenience!”).
  • Fourth, if using a hypothetical scenario vignette, we recommend including a reminder image at the top of each page where respondents are supposed to be answering items as if they were living in that hypothetical scenario (e.g., “Reminder: imagine that you have been experiencing a serious mental health concern for the last month. You are having trouble sleeping and are earning lower grades on your coursework. You feel significantly more overwhelmed and isolated from others”). We did this when administering the UE-MH-HSI (Hammer et al., in press). We recommend that the reminder image be designed in accordance with Universal Design principles. We also recommend that the wording of the reminder image be adjusted on the page where the beliefs about others’ behavior help-seeking belief items are displayed (e.g., “New Update: Imagine that the following people in your life were experiencing a serious mental health concern for the last month. They are feeling significantly more overwhelmed, isolated from others, and are having trouble sleeping and accomplishing tasks.”) such that respondents answer as if the people in their life were experiencing a mental health concern that could make help seeking a relevant behavioral option.
  • Fifth, we recommend administering the intention and help-seeking mechanism items in a single survey block such that items are presented in a random, interspersed order that varies from respondent to respondent. This reduces the influence of systematic item order effects that can otherwise bias results.
  • Sixth, we recommend administering the intention and help-seeking mechanisms measures first, followed by the help-seeking belief measures, then determinants, and demographic items at the end. The one exception to this protocol is, if using the “first strategy” described in Step 2, administering mental health status measures first.
  • Seventh, because talking about mental health and help seeking can lead people to have difficult thoughts and feelings, we recommend including links to relevant resources, such as the 988 national crisis hotline.
  • Eighth, we recommend including a “not applicable” response option for relevant help-seeking belief measures (e.g., outcome beliefs, beliefs about others’ expectations) so that people who feel an item does not apply to them can indicate “not applicable” rather than provide inaccurate, noisy data for those items.
  • Ninth, we recommend including determinant measures of mental health status. As discussed in Step 2, this allows researchers to determine which Time 1 survey respondents meet distress cutoffs that indicate the potential to benefit from seeking professional mental health care in the near future. This information can be used for several potential purposes: to characterize the mental health status of the sample, to compare results across distressed and non-distressed respondents, to ensure full data collection from only those who are currently distressed (if using the “first strategy” described in Step 2), and/or to guide selection of baseline respondents to invite to provide data regarding prospective help-seeking behavior via a Time 2 follow-up survey (when applicable).

Once Time 1 Survey data has been collected, we recommend verifying the psychometric integrity of the measures.

  • Our discussion of Issue E in Step 3 already provided some information about psychometric issues to test for, to which we would add the following information.
  • First, confirmatory factor analysis is the typical procedure for verifying the dimensionality of the help-seeking intention and mechanism measures. In a later revision of this webpage, we’ll point readers toward sample Mplus Syntax for testing the dimensionality of each help-seeking intention and mechanism measure, including guidance on how to make a judgment call on treating the perceived norm (and personal agency) measure as containing (a) two related yet independent factors with their respective mean scores versus (b) one strong general factor that should be scored with a single global mean score. Here are video tutorials for running EFA using SPSS and CFA using Mplus.
  • Second, calculating the Cronbach alpha score for each help-seeking intention and mechanism measure allows researchers to verify the measures are sufficiently internally consistent (see Table 3 in Ponterotto & Ruckdeschel [2007] regarding what alphas indicate good reliability depending on number of items in the measure and sample size). In a later revision of this webpage, we’ll provide readers with sample SPSS and Mplus Syntax for testing the reliability of each measure. Here are video tutorials for calculating Cronbach alpha using SPSS and Mplus.
  • Third, bivariate zero-order correlation analyses can be used to verify that the help-seeking belief measures assess beliefs relevant to the formation of this population’s help-seeking attitude, perceived norm, and personal agency (see the sketch after this list for a minimal illustration of these verification steps). This verification is a subjective process, but in the past we (Hammer et al., in press) have used a “majority” threshold such that a help-seeking belief measure was deemed to demonstrate sufficient convergent evidence of validity when at least 50% of the items from that measure were significantly correlated with the corresponding help-seeking mechanism mean score. Another verification option is to create an “index score” across the belief items for a given help-seeking belief measure. Before calculating the index mean score, it is necessary to reverse-score items known to be negatively valenced (such that a higher score indicates a more favorable attitude, more supportive perceived norm, or greater personal agency) and to calculate weighted belief scores (i.e., product terms) for those beliefs that require weighting by their corresponding evaluation items (see the discussion of evaluation items in prior steps). When weighted belief scores are incorporated into the calculation of a belief measure mean score, the product terms must first be re-scaled back to the original scaling of the unweighted belief measure so that weighted items do not have a different minimum and maximum score than the unweighted items, which would bias the calculation of the mean score. The index score for each help-seeking belief measure can then be correlated with its respective help-seeking mechanism mean score, with the expectation that convergent evidence of validity would be demonstrated if the degree of correlation indicates a large effect size (r = .50; Cohen, 1988). However, use of this criterion is predicated on the notion that all belief measure items used to create the index score are strongly modal salient to the overall population. Belief measures that are constructed to be more comprehensive and/or to capture beliefs salient to important sociodemographic segments of the population are more likely to contain items that demonstrate weaker correlations in the overall sample, leading the index score to correlate more weakly with the corresponding mechanism score and thereby misleadingly portraying the help-seeking belief measure as psychometrically lacking. Thus, this index score verification method should be used with caution. If the chosen verification method indicates that a given help-seeking belief measure demonstrates insufficient convergent evidence of validity, remedial action should be taken before using the measure. Perhaps items irrelevant in both the overall sample and any segments of interest should be dropped from further use. More drastically, perhaps the beliefs most salient to this population were inadequately detected during the initial elicitation in Step 3, and additional elicitation must be conducted to identify additional modal salient beliefs that need to be added to a revised version of the measure. In closing, remember that it is not uncommon for some beliefs to arise during elicitation yet not correlate strongly with their corresponding help-seeking mechanism score; this only becomes a problem if a significant portion of the belief items fail to correlate with their corresponding mechanism score.
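
As a minimal illustration of the reliability and convergent-validity checks described above, here is a Python sketch using toy data. The item names, scale ranges, significance threshold, and the min-max rescaling of product terms are hypothetical assumptions; the same checks can, of course, be run in SPSS, R, or Mplus.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of item scores (rows = respondents)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: three logistical belief items (1-7), one negatively valenced,
# plus a personal agency mechanism mean score.
df = pd.DataFrame({
    "lb_walk_in_access": [6, 2, 5, 7, 3, 4],
    "lb_long_waitlist":  [2, 6, 3, 1, 5, 4],   # negatively valenced item
    "lb_know_where":     [7, 1, 4, 6, 2, 5],
    "personal_agency":   [6.0, 2.5, 4.0, 6.5, 3.0, 4.5],
})

# Reverse-score negatively valenced items so higher = greater personal agency.
df["lb_long_waitlist_r"] = 8 - df["lb_long_waitlist"]

# If a weighted (belief x evaluation) product term must enter an index score,
# rescale it back to the unweighted 1-7 metric first; one simple option is a
# linear min-max transform (here, a -14..+14 product term mapped to 1..7):
# df["ob_item_rescaled"] = 1 + (df["ob_item_weighted"] + 14) / 28 * 6

belief_items = ["lb_walk_in_access", "lb_long_waitlist_r", "lb_know_where"]

# Internal consistency of the (hypothetical) belief measure.
print("alpha =", round(cronbach_alpha(df[belief_items]), 2))

# Item-level convergent check: share of items significantly correlated
# with the corresponding mechanism mean score ("majority" threshold).
sig = []
for item in belief_items:
    r, p = pearsonr(df[item], df["personal_agency"])
    sig.append(p < .05)
print("share of items significantly correlated:", np.mean(sig))

# Index-score check: mean of the (reverse-scored) items vs. mechanism score.
df["lb_index"] = df[belief_items].mean(axis=1)
r, p = pearsonr(df["lb_index"], df["personal_agency"])
print(f"index-score correlation r = {r:.2f} (p = {p:.3f})")
```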

Once measures have been psychometrically verified, it’s time to identify which help-seeking mechanisms may be appropriate targets for intervention. Because we will not have data on prospective help-seeking behavior until we obtain Time 2 follow-up survey data, our focus with Time 1 baseline survey data analysis is to determine what mechanisms are strongly associated with intention to seek help.

  • First, researchers should examine the mean and standard deviation for each mechanism score. If a given mechanism score demonstrates a ceiling effect, this indicates that there is little opportunity to further increase that mechanism (e.g., if attitude is already highly positive, there is no need to focus on further enhancing attitude). However, researchers should mindfully attend not only to the overall sample but also specific sociodemographic segments of interest, as certain segments may have lower mean scores that indicate potential promise as an intervention target despite the overall sample having a high mean score.
  • Second, researchers should examine the bivariate zero-order correlations between the mechanisms and intention to determine which are strongly associated with intention (a brief sketch of these descriptive and correlational checks follows this list). Mechanisms that are considerably less associated with intention are lower priority targets for intervention, all other things being equal. Please note that linear regression is not an appropriate analysis for picking mediators to focus on, as it forces variance in intention accounted for by multiple mechanisms to be assigned to only one mechanism, which can mislead researchers into an overly narrow focus on one mediator. However, linear regression is useful when a person is interested in knowing what small set of mechanisms can parsimoniously account for substantial variance in intention, which may be useful when a professional, due to limited resources, must choose a very limited set of mechanisms to focus future intervention on.
  • As discussed in Issue C of Step 3, if no mechanisms are strongly associated with intention, this indicates the need for re-evaluation by the investigators.
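
The following Python sketch illustrates the descriptive (ceiling-effect) and correlational checks from the first two points on toy data; the mechanism scores, segment labels, and values are hypothetical.

```python
import pandas as pd

# Hypothetical mechanism and intention mean scores (1-7) with a segment flag.
df = pd.DataFrame({
    "attitude":        [6.8, 6.9, 6.5, 7.0, 6.7, 6.9, 6.6, 7.0],
    "perceived_norm":  [4.0, 2.5, 5.5, 3.0, 6.0, 4.5, 3.5, 5.0],
    "personal_agency": [3.5, 2.0, 5.0, 4.0, 6.0, 3.0, 4.5, 2.5],
    "intention":       [4.0, 2.0, 6.0, 3.5, 6.5, 3.0, 4.5, 2.5],
    "segment":         ["A", "A", "B", "B", "A", "B", "A", "B"],
})

mechanisms = ["attitude", "perceived_norm", "personal_agency"]

# Means/SDs: a very high mean with little spread (as with "attitude" here)
# suggests a ceiling effect and little room for intervention to move it.
print(df[mechanisms].agg(["mean", "std"]).round(2))

# Zero-order correlations with intention, overall and within each segment.
print(df[mechanisms].corrwith(df["intention"]).round(2))
for seg, sub in df.groupby("segment"):
    print(seg, sub[mechanisms].corrwith(sub["intention"]).round(2).to_dict())
```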

Next, once certain mechanisms have been identified as relevant targets for future interventions designed to increase the population’s intention to seek help, researchers need to identify the primary beliefs that are most associated with one’s intention to seek help in the overall population and sociodemographic segments of interest.

  • Step A is for researchers to identify beliefs subject to ceiling (for pro-help-seeking beliefs) or floor (for anti-help-seeking beliefs) effects. If there is little room to further shift the belief, there is little point in targeting it with interventions.
  • Step B is for researchers to examine bivariate zero-order correlations between the help-seeking beliefs (particularly those underlying the most relevant mechanisms identified in the prior part of Step 6) and intention to seek help. In general, beliefs most strongly correlated with intention are the most promising targets for intervention, all other things being equal. Please note that it is permissible to examine beliefs underlying mechanisms not strongly associated with intention, but this examination is less likely to result in the identification of primary beliefs.
  • Step C, which combines elements of the first two steps, involves researchers assigning all respondents to either an intender (those who score a 5-7 on the 7-point intention scale), ambivalent (score a 3.1 to 4.9), or non-intender (score a 1-3) group. The intender and non-intender groups are then compared on how strongly they endorse the most strongly correlated beliefs identified in Step B. Pro-help-seeking beliefs (e.g., “my seeking help would help me feel better”) that have a small percentage of agreement among non-intenders and a high percentage of agreement among intenders hold promise as targets for future interventions. Likewise, anti-help-seeking beliefs (e.g., “my seeking help would be a sign of weakness”) that have a large percentage of agreement among non-intenders and a small percentage of agreement among intenders hold promise. In contrast, beliefs where the difference in percentage endorsement between non-intenders and intenders is small (or where floor/ceiling effects are impacting both groups) are less likely to be fruitful targets for intervention. For an example of Step B and Step C in action, see our ASEE 2024 paper (forthcoming); a minimal sketch of this grouping and comparison also appears after this list.
  • Step D is to repeat these first three steps within each segment of interest to discern which beliefs are key in the overall sample versus within specific segments. Segmentation analysis is essential because past IBM projects have revealed substantial differences in key beliefs by demographic/contextual subgroup, and this knowledge enables audience segmentation in the design of health interventions. While some beliefs may be appropriate population-wide targets, other beliefs might be worthwhile for targeted interventions within key segments.
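
Here is a minimal Python sketch of the Step C grouping and percentage-endorsement comparison on toy data. The belief items and values are hypothetical, the intention cut points follow the description above, and the rule that a rating of 5 or higher counts as "agreement" is an illustrative assumption rather than a requirement of the protocol.

```python
import pandas as pd

# Hypothetical baseline data: intention (1-7) and two belief items (1-7).
df = pd.DataFrame({
    "intention":      [6.5, 2.0, 5.5, 3.5, 1.5, 6.0, 4.0, 2.5],
    "ob_feel_better": [6, 2, 7, 4, 3, 6, 5, 2],   # pro-help-seeking belief
    "ob_sign_weak":   [2, 6, 1, 4, 7, 2, 3, 6],   # anti-help-seeking belief
})

def intention_group(score: float) -> str:
    """Assign intender / ambivalent / non-intender groups (Step C above)."""
    if score >= 5:
        return "intender"
    if score <= 3:
        return "non-intender"
    return "ambivalent"

df["group"] = df["intention"].apply(intention_group)

# Percentage of each group endorsing each belief (endorsement treated here
# as a rating of 5 or higher on the 7-point scale).
endorsed = df[["ob_feel_better", "ob_sign_weak"]].ge(5)
print(endorsed.groupby(df["group"]).mean().mul(100).round(1))
```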

In summary, at the conclusion of analysis of Time 1 baseline survey data, researchers will have identified which help-seeking mechanisms are most strongly associated with intention to seek help, and which help-seeking beliefs are primary in terms of influencing intention to seek help and distinguishing intenders from non-intenders.

Step 7 (recommended): Administer follow-up survey to determine the help-seeking moderators of the relationship between intention and prospective help-seeking behavior

Step 7 is recommended but not required. Developers who are only interested in which variables predict intention to seek help, and do not need to know which variables predict prospective help-seeking behavior, or what variables may moderate the intention-behavior relationship, do not need to administer a Time 2 follow-up survey.

The primary purpose of the follow-up survey is to measure prospective help seeking behavior since Time 1 baseline among those respondents who met distress thresholds on the Time 1 baseline survey. Baseline respondents who did not meet distress thresholds should not be sent the follow-up survey because this group of participants lacks a clear need for professional mental health care and their help-seeking behavior data would be subject to a strong floor effect (see Step 2 for details).

As noted by Hammer and colleagues (2024), prior experience with mental health help seeking (i.e., past help-seeking behavior) is a determinant construct and should not be considered an appropriate substitute, conceptually or empirically, for prospective help-seeking behavior.

To accurately determine whether help-seeking perceptions predict help-seeking behavior, it is necessary to measure help-seeking behavior prospectively through a multiple time point design. A multiple time point design is likewise necessary to determine whether certain help-seeking determinants (e.g., environmental constraints; mental health perceptions, knowledge, and skills; evaluated need) moderate the relationship between intention and prospective behavior in a population (or important sociodemographic segments of that population). By linking Time 2 follow-up survey data to the Time 1 baseline survey data, these questions can be addressed.
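
As one possible way to operationalize such a moderation test, the Python sketch below fits a logistic regression of prospective behavior on baseline intention, a hypothesized moderating determinant, and their interaction term, using simulated data. The variable names, the choice of "environmental constraints" as the moderator, and the simulation parameters are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 400

# Simulated merged Time 1 / Time 2 data: baseline intention (1-7), a baseline
# determinant thought to moderate the intention-behavior link (1-7), and
# prospective help seeking reported at follow-up (0 = no, 1 = yes).
intention = rng.integers(1, 8, size=n)
constraints = rng.integers(1, 8, size=n)
true_logit = -3 + 0.8 * intention - 0.1 * constraints - 0.07 * intention * constraints
sought_help = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))
df = pd.DataFrame({"intention": intention, "constraints": constraints,
                   "sought_help": sought_help})

# A logistic regression with an intention x constraints interaction term is
# one simple way to test whether the determinant moderates the relationship
# between baseline intention and prospective behavior.
model = smf.logit("sought_help ~ intention * constraints", data=df).fit(disp=False)
print(model.summary())
```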

Because the primary purpose of the follow-up survey is to assess self-reported help-seeking behavior performed since the respondents completed the baseline survey, the follow-up survey can be brief. A single question (e.g., “Did you seek help from a mental health professional since completing the baseline survey in January?”) can suffice. While longitudinal surveys are often challenging and expensive to conduct, in practice we have found it easy to obtain follow-up survey responses even without providing a monetary incentive for participation, because we are able to tell our baseline survey participants that completing the follow-up survey will take less than 10 seconds.

That being said, when feasible, there can be added value in asking additional questions on the follow-up survey. For example, additional questions can be asked of those who did report seeking help, including the timing, frequency, nature, care pathway, and professional source of their help seeking. These data can be used to refine how the prospective help-seeking variable is coded, answer secondary research questions of interest, and inform future directions for research and intervention. In addition, questions can be asked of those who did not report seeking help, such as retrospective-report measures of help-seeking moderators like salience (e.g., “In the last 3 months, how often did you think about the possibility of your seeking help?”).

Once baseline and follow-up data are combined:

  • Part 1 is to use point-biserial correlations to identify which baseline help-seeking mechanisms are the best predictors of prospective help-seeking behavior among respondents reporting distress at baseline. These results can be compared to the results examining how the baseline help-seeking mechanisms correlated with baseline intention, to determine whether there is consistency or difference in what mechanisms predict intention versus prospective behavior. However, any formal comparison of beliefs should utilize the same exact set of participants, to avoid confounding baseline distress level with the variables of interest.
  • Part 2 is to identify the individual baseline help-seeking beliefs that are most strongly predictive of prospective behavior, and compare that to the parallel intention results from the baseline analyses.
  • Part 3 is to assign all respondents to one of three groups: inclined seekers (those who scored a 5-7 on the 7-point intention scale at baseline and who indicated on the follow-up survey that they sought help since baseline), inclined abstainers (those who scored a 5-7 and indicated they did not end up seeking help after the baseline survey), and disinclined abstainers (those who scored a 1-3 and indicated they did not end up seeking help after the baseline survey). A minimal sketch of the Part 1 point-biserial correlations and this three-group assignment appears after this list. Sheeran (2002), who introduced these three terms, also mentions a “disinclined actors” group, but this group is less relevant in the present help-seeking context. The ability to compare the three groups offers significant value beyond the comparison of the earlier baseline groups (intenders vs. non-intenders) based on intention alone, as it allows us to further divide intenders into those who did, and did not, successfully access care. We want to note that the term “abstainer” implies that the person ultimately chose not to seek help, when in fact the person may not be accessing care because they are being thwarted from seeking help. Therefore, we suggest alternative terminology (e.g., thwarted intenders) to avoid unintentional pathologizing.
  • The baseline comparison of intenders and non-intenders was useful for identifying beliefs that are primary targets for interventions designed to help non-intenders become intenders. Comparing percentage endorsement of primary beliefs identified in Part 2 across these three groups allows for the identification of beliefs that distinguish between those intenders who obtained help and those who did not—the beliefs that are primary targets for intervention designed to help inclined abstainers become inclined seekers. Often, these beliefs are logistical beliefs, as people who intend to seek help may have trouble acting on that intention due to difficulties in taking the necessary steps toward seeking help. This “intention-behavior gap” is well studied outside the help-seeking context (e.g., Sheeran, 2002; Sheeran & Webb, 2016; Conner & Norman, 2022; Webb & Sheeran, 2006; Sheeran & Conner, 2019; Sheeran & Conner, 2017; Rhodes & Yao, 2015; Sheeran et al., 2017), but has received minimal study in the help-seeking context (exceptions include Sheeran et al., 2007).
  • While there may be some anticipated logistical factors reported at baseline that shape people’s personal agency, intention, and subsequent behavior, often the intention-behavior gap is explained by unanticipated logistical barriers to performing the behavior (Eigenhuis et al., 2021). This is where baseline measures (and possibly retrospective-report follow-up measures) of the help-seeking determinants that moderate the intention-behavior relationship are critical. A person can intend to seek professional help but be unaware of all the challenges and opportunities associated with seeking help, particularly given the complexity, expense, and bureaucracy of navigating the mental health care system in most countries. This is why there is value in assessing moderating constructs beyond the logistical barriers that are accurately foreseen by members of the population of interest. Assessment of these moderating constructs can allow for the identification of oft-unanticipated logistical barriers to successful follow-through on help-seeking intention, which can then inform targets for intervention beyond those identified through examination of primary help-seeking beliefs.
  • By combining insights from baseline and follow-up survey data, developers can identify a set of constructs that can be addressed through intervention to increase intention to seek help and/or increase the likelihood that help-seeking intention translates into prospective help-seeking behavior. Changing these variables may sometimes require individual or group-level persuasion efforts, but other variables will require policy and structural change to address effectively (Hammer et al., 2024). Though the IBM-HS focuses primarily on subjective help-seeking perceptions and identifying primary beliefs that influence help-seeking intention, this focus should not be taken as an encouragement to conceptualize the mental health treatment gap as a product of “wrong beliefs” on the part of the individual, but rather as a combination of individual misunderstanding and structural inequities. It is common for people to hold beliefs about the help-seeking process that discourage the formation of intention or prospective help-seeking behavior, but oftentimes these beliefs are an accurate reflection of very real stigma and structural barriers that permeate their living context. Thus, it is important for developers to be mindful of how results from these analyses are used to make meaning of what helps or stops people from seeking mental health care, and on whom the burden of change is placed. We do not want to pathologize members of a population when they are merely reacting in an understandable manner to a convoluted, expensive, inequitable mental health care system.
  • A final note about “actual control” and “perceived control” is worth mentioning. Fishbein and Ajzen (2010, p. 335) discuss the 2×2 matrix of perceived control (what the IBM-HS operationalizes as the “autonomy” facet of the help-seeking mechanism of “personal agency”) and actual control (the actual autonomy people have to make and enact their own help-seeking decisions). Actual control is shaped by a number of help-seeking determinants (including environmental constraints and mental health perceptions, knowledge, and skills), whether people are aware of them or not. Whether perceived control and/or actual control is low dictates whether the most effective intervention solution is to (1) enhance perceived control/autonomy to bring it in line with the considerable level of actual control that people in the population tend to have over their help seeking and/or (2) enhance knowledge and skills and remove environmental constraints. More often than not, interventions focused on #2 are more likely to be needed than interventions focused on #1.
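
To make the analytic steps in this list concrete, here is a minimal Python sketch of the point-biserial correlations (Part 1) and the three-group assignment (Part 3) on toy data. The variable names and scores are hypothetical; the intention cut points mirror the description above.

```python
import pandas as pd
from scipy.stats import pointbiserialr

# Hypothetical merged data for distressed baseline respondents: baseline
# mechanism and intention scores (1-7) plus prospective help seeking
# reported at Time 2 (0 = did not seek help, 1 = sought help).
df = pd.DataFrame({
    "attitude":        [6.0, 3.0, 5.5, 2.0, 6.5, 4.0, 5.0, 2.5],
    "personal_agency": [5.5, 2.5, 6.0, 3.0, 6.0, 3.5, 4.5, 2.0],
    "intention":       [6.0, 2.0, 5.5, 1.5, 6.5, 3.0, 5.0, 2.5],
    "sought_help":     [1, 0, 0, 0, 1, 0, 1, 0],
})

# Part 1: point-biserial correlations between baseline mechanisms and
# prospective (dichotomous) help-seeking behavior.
for mech in ["attitude", "personal_agency"]:
    r, p = pointbiserialr(df["sought_help"], df[mech])
    print(f"{mech}: r_pb = {r:.2f} (p = {p:.3f})")

# Part 3: assign respondents to inclined seekers, inclined abstainers
# (arguably better framed as thwarted intenders), and disinclined abstainers.
def follow_up_group(row) -> str:
    if row.intention >= 5:
        return "inclined seeker" if row.sought_help == 1 else "inclined abstainer"
    if row.intention <= 3 and row.sought_help == 0:
        return "disinclined abstainer"
    return "other"  # ambivalent respondents and any disinclined actors

df["group"] = df.apply(follow_up_group, axis=1)
print(df["group"].value_counts())
```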

Conclusion

This overview of using the IBM-HS to guide a systematic mixed-method process for identifying important help-seeking determinants and primary help-seeking beliefs is meant to help users think through the essential decisions at each step, so that they arrive at a thorough understanding of what stops or helps people in forming the intention to seek help and successfully actualizing that intention as prospective help-seeking behavior.

However, once several possible intervention targets are identified, there is still critical thinking to be done to determine which of these things are the most promising targets for intervention, as choosing promising targets involves a host of considerations beyond simply what beliefs are the most predictive of intention.

For example, the resources/power/influence available to the users who want to enact interventions can often dictate which targets are (not) possible to intervene around. Some professionals are only in a position to conduct individual- and group-level persuasion interventions. Others are in a position to advocate for or enact structural, policy, and top-down cultural changes that can then propagate into changed help-seeking beliefs and an enhanced ability of potential help seekers to actualize intention into action. Some interventions require resources, time, power, access, or influence that the professionals may or may not possess, so the choice of intervention is driven by practical considerations as much as empirical considerations.

Lastly, we want to reiterate that, in our experience, there are at least two groups of people within a population of interest who underutilize mental health care and who need differing interventions to help them increase access.

The first group are people who do not intend to seek help. Addressing the variables that underlie the help-seeking beliefs that dictate their intention is the goal of intervention with this first group.

The second group are people who intend to seek help, yet ultimately do not access treatment. Addressing the variables that underlie the help-seeking determinants that moderate their intention’s ability to result in successful help-seeking behavior is the goal of intervention with this second group. Depending on the population, one of these groups may be larger and thus the priority for intervention. Or, both groups may be of equal importance, in which case professionals may need to start by intervening with one group while longer-term plans are made to intervene with the other. We encourage professionals to be intentional in identifying and prioritizing intervention with these groups.

How to Receive Assistance with this Mixed-Method Protocol

We hope that this detailed guide to using the IBM-HS to explore the treatment gap for your population of interest has been helpful. If you are interested in receiving personal consultation from Dr. Hammer on your help-seeking project, contact Dr. Hammer to explore potential consultation options.
