
XIII. HWSM Validation, Strengths and Limitations, and Improvement

Published 2023

Annual updates of the Health Workforce Simulation Model (HWSM) incorporate the most recent available supply and demand determinants. During each annual update, we review the latest literature on the occupations being modeled, trends in health care use and delivery, and health workforce modeling. We analyze the latest available data and explore whether new data sources could improve the theoretical underpinnings of HWSM and the projections.

Outreach efforts to the associations representing the modeled health occupations and training institutions give stakeholders the opportunity to offer feedback on the data, methods, and assumptions used to develop workforce projections. Many of these stakeholders give data to the research team, particularly data on starting year supply of health workers and the number and characteristics of graduates from training programs. States are invited to provide data from their licensure files, and nurse supply projections for some states use licensure data rather than relying on survey data.

This module summarizes activities undertaken to validate the model and projections, discusses HWSM strengths and limitations, and explores efforts to improve the model.

HWSM validation

A model, by definition, is a simplified version of reality. Validation activities are vital to ensuring that the model accurately reflects reality. Validation of HWSM is a continual process, and validation activities will continue as the model is updated with new data for different health workers.

Following International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidelines on best practices, validation activities in HWSM included the following [1]:

  1. Review by subject matter experts (face validity). The model framework should conform to observations about how the system works and be consistent with theory. Expert review helps ensure that the model uses the best available inputs and parameters. Model outputs should be consistent with expectations of subject matter experts.

    A technical evaluation panel of experts in the health care workforce at HRSA, HRSA-sponsored workforce centers, and others in the field approved the model framework. The modeling approach effectively analyzes complex systems, such as the health care system, which feature decentralized and autonomous decision-making. For supply modeling, each individual makes career and labor force participation decisions based on their unique characteristics. They also take into account external factors such as earnings potential and unemployment risks. For demand modeling, individuals decide to use health care services based on their health risks and financial constraints. The potential to capture the complex dynamic interactive processes that characterize the demand for and supply of health care providers is an ongoing exploration.

  2. Internal validation (verification). This set of activities includes:
    1. Review computer code for accuracy
    2. Validate parameters in the model against their source
    3. “Stress test” HWSM by modeling extreme input values to test whether the model produces expected results
    4. Assess and compare regression coefficients for the health care use patterns to prior year coefficients to determine consistency
  3. External and predictive validation. This form of validation is used to identify external data sources (not used in model development) for comparison to model outputs.

    As an example, the health-related characteristics of the constructed population database are calibrated by comparing the prevalence estimates to published sources. State-level projections of hospital inpatient days are compared to levels reported by the American Hospital Association.

  4. Between-model validation (cross validation). This type of validation compares model outputs with results of other models.

There are few models for comparison, but some states and associations produce workforce projections. Demand estimates are typically simple extrapolations of provider-to-population ratios showing how local supply compares to the national average. Demand projections extrapolate current provider-to-population ratios to the future population. Some associations produce supply projections, often using a cohort model (or “stock and flow” model) with estimates of new entrants and exits from the workforce.
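
As a rough illustration of the simpler approaches described above, the sketch below extrapolates a provider-to-population ratio to a future population and runs a basic stock-and-flow supply projection. The function names, the input values, and the assumption of constant annual entrants and a constant exit rate are hypothetical and are not HWSM inputs or methods.

```python
def ratio_based_demand(current_providers, current_population, future_population):
    """Extrapolate the current provider-to-population ratio to a future population."""
    ratio = current_providers / current_population
    return ratio * future_population


def stock_and_flow_supply(starting_supply, annual_entrants, annual_exit_rate, years):
    """Project supply year by year: remove exits from the stock, then add new entrants."""
    supply = starting_supply
    path = [supply]
    for _ in range(years):
        supply = supply * (1 - annual_exit_rate) + annual_entrants
        path.append(supply)
    return path


# Hypothetical inputs for illustration only.
demand_future = ratio_based_demand(
    current_providers=10_000,
    current_population=5_000_000,
    future_population=5_500_000,
)
supply_path = stock_and_flow_supply(
    starting_supply=10_000,
    annual_entrants=600,
    annual_exit_rate=0.05,
    years=10,
)
gap = demand_future - supply_path[-1]
print(f"Projected demand: {demand_future:,.0f}")
print(f"Projected supply: {supply_path[-1]:,.0f}")
print(f"Projected gap (demand minus supply): {gap:,.0f}")
```

Neither calculation uses individual-level characteristics, which is the main contrast with the microsimulation approach used in HWSM.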

HWSM strengths and limitations

The main strengths of the HWSM are the use of recent data sources and a sophisticated microsimulation model for projecting health workforce supply and demand.

Compared to population-based approaches to modeling, this approach has several advantages:

  1. More predictive variables can be used in modeling: This both enhances the accuracy of results and allows for scenario modeling such as the Reduced Barriers demand scenario described in other modules.
  2. Lower levels of geography can be modeled: HWSM demand projections take into consideration geographic variation in demographics, prevalence of disease and health risk factors, socioeconomic factors, such as household income and prevalence of medical insurance coverage, and level of rurality. This supports HRSA’s goal of building more accurate state level projections.
  3. Projection models can be consolidated across occupations: Profession-specific equations can be integrated into a single platform. Annual updates of HWSM allow for updates of all occupations modeled. Still, modeling the integration of different occupations in care delivery is an ongoing process as care delivery patterns continue to evolve and data becomes available on the degree to which roles and scopes of practice overlap or complement the roles of other occupations.
  4. HWSM uses individuals as the unit of analysis: Modeling at the individual level allows for improvement in the theoretical underpinnings of HWSM. For supply modeling, individual health care workers make their own labor force decisions about whether to work, how many hours to work, whether to move to another state, or whether to change profession. For demand modeling, health care use is determined by the needs and circumstances of individual patients. Modeling at the individual level provides added flexibility for modeling the workforce implications of changes in policy. A past example from earlier HRSA reports is expanded health insurance coverage under the Affordable Care Act. This flexibility facilitates modeling the Reduced Barriers demand scenario, where a person can be simulated as having the health care use patterns of an insured, non-Hispanic, white person living in a metropolitan area (i.e., a population perceived to have lower barriers to receiving care).
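
The individual-level flexibility described in the last item can be sketched as a counterfactual prediction: the person keeps their health characteristics, but selected characteristics are overridden before the prediction equation is applied. The log-linear form, the variable names, and every coefficient below are hypothetical and are not HWSM estimates.

```python
import math

# Hypothetical coefficients of a log-linear model of annual visits.
# These values are invented for illustration and are not HWSM estimates.
COEFFICIENTS = {
    "intercept": 0.50,
    "insured": 0.40,
    "metro": 0.10,
    "non_hispanic_white": 0.15,
    "has_diabetes": 0.60,
}

# Characteristics overridden in the counterfactual "reduced barriers" scenario.
REDUCED_BARRIERS_OVERRIDES = {"insured": 1, "metro": 1, "non_hispanic_white": 1}


def expected_visits(person):
    """Expected annual visits: exp(intercept + sum of coefficient * characteristic)."""
    linear = COEFFICIENTS["intercept"]
    for name, value in person.items():
        linear += COEFFICIENTS[name] * value
    return math.exp(linear)


def reduced_barriers_visits(person):
    """Keep health characteristics, but apply the use patterns of the low-barrier group."""
    counterfactual = {**person, **REDUCED_BARRIERS_OVERRIDES}
    return expected_visits(counterfactual)


# Hypothetical person: uninsured, non-metro, with diabetes.
person = {"insured": 0, "metro": 0, "non_hispanic_white": 0, "has_diabetes": 1}
status_quo = expected_visits(person)
scenario = reduced_barriers_visits(person)
print(f"Status quo: {status_quo:.2f} visits; Reduced Barriers: {scenario:.2f} visits")
```

Summing the scenario predictions over every simulated person would give scenario demand for services, which a population-ratio approach cannot produce because it has no person-level characteristics to override.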

Many of the limitations of HWSM stem from data limitations. The following are key limitations:

  1. Modeling long-term supply and demand workforce implications of COVID-19: The pandemic caused substantial disruption to the health care system and the U.S. population. The long-term implications on provider supply and demand are not fully documented and thus are only partially incorporated into the HWSM projections. Prediction equations for health care use are based on pre-COVID-19 data, though we calculate additional outpatient visits and hospitalizations associated with COVID-19 becoming endemic. HRSA continues to monitor the literature on COVID-19 implications for the health workforce, with annual updates to HWSM using post-COVID-19 onset data where available.
  2. Setting demand equal to supply in the starting year: Historically, HWSM has started with the assumption that national demand equals national supply in the starting year. The exception is primary care physicians and psychiatrists, for whom we use the number of additional providers required to remove health profession shortage designations as a proxy for starting year shortfall. (Alternative scenarios, such as the Reduced Barriers scenario and the Unmet Needs scenario for behavioral health, have national demand starting higher than supply.) In recent years, and in part due to the COVID-19 pandemic, there is growing evidence of national shortfalls across many health professions. For a small number of physician specialties and for dental hygienists and dental assistants, we use published estimates of shortfall. For nursing, we use recent vacancy data for hospitals and academia to quantify starting year shortfall. For physicians, advanced practice providers, and various allied health and other specialties, we treat the estimated increase in demand for health care services as COVID-19 becomes endemic as a shift in demand for providers and as an estimate of starting year shortfall. Physicians in family medicine and general internal medicine spend a substantial portion of their time providing behavioral health services. Starting in 2023, this additional demand for behavioral health care is added to demand, as there is growing evidence that primary care providers do not have sufficient time during visits to provide recommended preventive services and hence that growing demand for behavioral health care is crowding out time for other preventive care services. These shortfall estimates for the starting year are likely conservative because, for many professions, there is anecdotal evidence and a perception of current shortfall but no quantified estimate of its magnitude.
Because demand projections for individual states reflect national averages for health care use and delivery applied to each state’s population, starting year supply will exceed demand in some states and demand will exceed supply in others. Even in states where supply exceeds a national average level of demand, there could be a provider-perceived shortfall (or perceived balance between supply and demand) for two reasons: (a) approximately half the states are expected to be above the national average simply because demand is based on a national average, and (b) some states use a different mix of providers than the national average. For example, some states rely more heavily on licensed practical nurses (LPNs) and less on registered nurses (RNs), relative to the national average. In these states, HWSM might show an RN shortfall and an LPN surplus, whereas the state might perceive no imbalances.
  3. Omission of market forces and economic concepts in HWSM: HWSM currently lacks a market mechanism where labor costs respond to imbalances between supply and demand. A growing shortfall of providers, for example, presumably would cause wages to increase, and subsequently increase the desirability to work in the profession (thus increasing supply), while raising labor expenses (thus decreasing demand). Market forces, therefore, will tend to alleviate severe imbalances between supply and demand. Nevertheless, time lags such as the years required to adjust the training pipeline can lead to inefficiencies if relying on market forces alone. HWSM projections help work as market “signals” of growing imbalances between supply and demand, such that imbalances can be identified sufficiently far in advance to inform career and policy decisions that might help mitigate the severity of imbalances.
  4. Use of survey data in lieu of population data for supply modeling: HWSM uses the American Community Survey (ACS) or Occupational Employment and Wage Statistics (OEWS) data to estimate the starting year supply of many health occupations. Many states, however, have access to more complete supply data collected through the licensure/certification processes. Without comprehensive state-level data, HWSM continues to use the ACS data. However, even state licensure files have data limitations. One limitation is that state licensure boards vary in the types and completeness of information they collect. While licensure files indicate whether the license is active, many licensure boards do not collect information on whether the licensed person is active in their profession and whether the person is active in that particular state. (This is especially true for the registered nurse workforce, where many states belong to compacts that allow the nurse to work in other states).
  5. Omitted data elements from the constructed population file: The population file starts with people in households who responded to the ACS. However, the ACS lacks health-related information such as whether the person has various chronic diseases, health risk factors such as smoking and obesity, and information on the person’s mental health status or use of addictive substances. Information on chronic disease and health risk factors is obtained by statistically merging ACS with other sources of data—the Behavioral Risk Factor Surveillance System (BRFSS) for people who live in the community, the Medicare Beneficiary Survey subset of people living in residential communities, and the Nursing Home Minimum Dataset for people residing in nursing homes. These data sources with health-related information still lack a standardized metric for mental health status, addiction status, and dental insurance. Such information could improve projections for mental health workers, addiction counselors, and oral health providers, respectively. Dental insurance is unavailable for inclusion in the constructed population file. However, as discussed later, the use of medical insurance as a proxy for dental insurance performed equally well for modeling demand for oral health services.
  6. Demand modeled where people reside: There is little information on consumer care migration patterns. Many large metropolitan areas cross state boundaries, and where people live is not necessarily where they receive care. Comparison of HWSM projections of hospital inpatient days to inpatient days reported by the American Hospital Association shows deviation across states between actual and projected inpatient days. It is unclear to what extent such deviation results from uncaptured demand determinants that affect how each state’s population uses health care services, and to what extent it occurs because care is unavailable in the resident state, so the patient seeks care in another state.
  7. Supply projections unavailable for some health occupations: For occupations such as aides and assistants, where there is easy entry and exit from the workforce, insufficient data exist to develop accurate supply projections. There are multiple paths to entry for many of these occupations, and some states do not require licensure or graduation from a formal training program. These occupations typically have low pay, so attrition is high and generally occurs well before the traditional retirement age.
  8. Uncertainty about changes in health care use and delivery over time: Demand modeling extrapolates current health care use and delivery patterns to the future population. Changing technology, medical innovations, and economic factors all can contribute to evolving care use and delivery patterns.
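
The state-level effect described in the second limitation can be illustrated with a small calculation in which national-average demand rates are applied to a state that uses a different mix of nurses. The demand rates, population, and supply counts below are hypothetical and chosen only to show the pattern.

```python
# Hypothetical national-average demand: nurses demanded per 1,000 population.
NATIONAL_DEMAND_PER_1000 = {"RN": 9.0, "LPN": 2.0}


def state_balance(population, supply_by_occupation):
    """Return supply minus national-average demand per occupation (negative = shortfall)."""
    balance = {}
    for occupation, rate in NATIONAL_DEMAND_PER_1000.items():
        demand = rate * population / 1000
        balance[occupation] = supply_by_occupation[occupation] - demand
    return balance


# Hypothetical state that relies more on LPNs and less on RNs than the national mix.
balance = state_balance(
    population=2_000_000,
    supply_by_occupation={"RN": 16_500, "LPN": 5_500},
)
for occupation, value in balance.items():
    label = "surplus" if value >= 0 else "shortfall"
    print(f"{occupation}: {abs(value):,.0f} {label}")
print(f"Net across both occupations: {sum(balance.values()):,.0f}")
```

In this invented example the RN shortfall and the LPN surplus offset each other, which is consistent with a state perceiving no imbalance even though each occupation appears out of balance against the national average.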

Additional data limitations are occupation specific and include imprecision in supply and demand determinants.

HWSM improvement

HRSA continues to explore improvements to HWSM, model inputs, and projections. To provide the highest quality projections, HRSA has thoroughly investigated questions regarding technical accuracy and suggestions for improving the model. The following are examples of improvements to HWSM:

In 2023, a career change component was added to HWSM. Analysis of the Current Population Survey (CPS) Annual Social and Economic Supplement data produced estimates of the probability that people under age 50 in each healthcare occupation would change careers and leave the occupation. This lowered projected growth in supply—particularly for occupations with lower education and training requirements and occupations with lower pay.
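
A minimal sketch of how an age-limited career change probability lowers projected supply is shown below. The 3% annual probability, the head counts, and the function name are hypothetical. The sketch ignores retirement, mortality, and new entrants, and it computes expected values rather than simulating individuals as a microsimulation would.

```python
# Hypothetical annual probability of leaving the occupation through a career change.
# Per the text, career change applies only to workers under age 50.
CAREER_CHANGE_PROB_UNDER_50 = 0.03


def expected_remaining(workers_by_age, years, career_change_prob=CAREER_CHANGE_PROB_UNDER_50):
    """Expected workers remaining after `years`, considering career change only.

    Each worker ages one year per step; the exit probability applies only in
    years when the worker is under 50. Retirement, mortality, and new entrants
    are deliberately ignored in this sketch.
    """
    remaining = 0.0
    for age, count in workers_by_age.items():
        years_at_risk = max(0, min(years, 50 - age))
        remaining += count * (1 - career_change_prob) ** years_at_risk
    return remaining


workforce = {30: 1_000, 45: 1_000, 55: 1_000}  # hypothetical head counts by current age
with_change = expected_remaining(workforce, years=10)
without_change = expected_remaining(workforce, years=10, career_change_prob=0.0)
print(f"Without career change: {without_change:,.0f}")
print(f"With career change:    {with_change:,.0f}")
```

Younger workers accumulate more years at risk, so in this sketch occupations with a younger workforce or a higher career change probability show the larger reduction in projected supply.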

Also in 2023, projections of the health workforce demand implications of COVID-19 becoming endemic were added to HWSM. As new data becomes available on the prevalence of acute COVID-19 and long-COVID symptoms, model inputs will be updated.

For many occupations, there is growing evidence of a current shortage. The 2023 projections, covering the period 2021 through 2036, include a starting year shortfall for professions where there is sufficient information to quantify a shortfall. As noted above, shortfall information comes from published profession-specific studies, vacancy data from surveys, and other calculations based on shifts in demand associated with COVID-19 becoming endemic and increased demand for behavioral health services.

In 2019, in response to questions, we investigated the issue of possible overdispersion in the Poisson models used to develop predictive equations of annual visits to various types of providers and care delivery settings. We examined potential alternatives to Poisson models. If data are distributed according to a Poisson distribution, their mean will equal their variance. However, the Medical Expenditure Panel Survey (MEPS) data with number of annual visits to various providers tend to contain more zeroes than would be expected in a Poisson distribution. Fitting a Poisson model to data exhibiting overdispersion will tend to produce understated standard errors.

Potentially better fitting models for count data that contain large numbers of zeroes include negative binomial, zero-inflated, and zero-altered models. After exploring these alternative regression specifications, HWSM switched from using Poisson to negative binomial models for estimating office visits, outpatient visits, and home health visits. This change had a negligible impact on the projections but is a conceptual improvement in the prediction equations for health care use.
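
The two symptoms of overdispersion described above, variance exceeding the mean and more zeroes than a Poisson distribution would predict, can be checked descriptively before choosing a model. The sketch below uses a small set of hypothetical visit counts, not MEPS data, and it is only a descriptive check rather than the formal model comparison used for HWSM.

```python
import math
import statistics

# Hypothetical annual visit counts for 20 people; not MEPS data.
visits = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 3, 4, 5, 7, 10, 15]

mean_visits = statistics.mean(visits)
variance_visits = statistics.variance(visits)  # sample variance
dispersion_ratio = variance_visits / mean_visits  # close to 1 under a Poisson distribution

observed_zero_share = visits.count(0) / len(visits)
poisson_zero_share = math.exp(-mean_visits)  # P(Y = 0) for a Poisson with this mean

print(f"Mean: {mean_visits:.2f}, variance: {variance_visits:.2f}")
print(f"Variance-to-mean ratio: {dispersion_ratio:.2f}")
print(f"Share of zeroes: observed {observed_zero_share:.2f}, Poisson {poisson_zero_share:.2f}")
```

A variance-to-mean ratio well above 1, together with an observed share of zeroes above the Poisson share, points toward specifications such as the negative binomial, which adds a dispersion parameter so that the variance can exceed the mean.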

Also, in 2019, we evaluated the suggestion to use dental insurance rather than medical insurance as a predictor for the number of annual visits to oral health care providers. As discussed previously, dental insurance is unavailable in the constructed population file, so predictive equations of oral health use that include a dental insurance variable cannot be applied to the population file. Because dental insurance is available in the MEPS file, we tested how dental insurance compared to medical insurance as a predictive variable for oral health modeling.

Our analysis looked at the root mean square error (RMSE), a measure of accuracy of the resulting predictions, when using dental insurance versus medical insurance as a predictor of annual visits to oral health providers:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2}$$

where $y_j$ = observed visits and $\hat{y}_j$ = visits predicted by the model for each observation $j$, and $n$ = the number of observations. RMSE was equivalent to two decimal places for the two formulations in regressions of visits to both dentists and hygienists.
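
A minimal sketch of the RMSE comparison follows, using the standard definition: the square root of the mean squared difference between observed and predicted visits. The observed values and the two sets of predictions are hypothetical and do not come from MEPS or from the HWSM regressions.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and the same length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed visits and predictions from two model formulations.
observed_visits = [0, 1, 2, 2, 4]
predicted_dental_insurance = [0.5, 1.0, 1.5, 2.5, 3.0]
predicted_medical_insurance = [0.4, 1.1, 1.6, 2.4, 3.1]

rmse_dental = rmse(observed_visits, predicted_dental_insurance)
rmse_medical = rmse(observed_visits, predicted_medical_insurance)
print(f"RMSE, dental insurance model:  {rmse_dental:.2f}")
print(f"RMSE, medical insurance model: {rmse_medical:.2f}")
```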

We then performed additional comparisons. For each oral health worker designation, we split the data 10 times into training sets (75%, picked randomly) and testing sets (the other 25%). In each split (and for each profession), we compared the percentage of total prediction error using the medical insurance coverage variable to the percentage of total prediction error using the dental insurance coverage variable. The two total prediction errors were always within 0.5% of each other. The error was sometimes higher for the model with dental insurance and sometimes higher for the model with medical insurance. Thus, we found no evidence that dental insurance performed differently from medical insurance, and we retained medical insurance as a predictor variable for annual visits to oral health care providers.
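
The repeated-split comparison can be sketched as follows. This is an illustration under stated assumptions: the data are randomly generated rather than drawn from MEPS, a simple group-mean predictor stands in for the regression models, and "percentage of total prediction error" is interpreted here as the absolute difference between predicted and observed total visits in the testing set, expressed as a percentage of the observed total. All function and variable names are hypothetical.

```python
import random


def fit_group_means(training_rows, predictor):
    """Fit a very simple model: mean visits for each value (0 or 1) of the predictor."""
    means = {}
    for value in (0, 1):
        group = [row["visits"] for row in training_rows if row[predictor] == value]
        means[value] = sum(group) / len(group) if group else 0.0
    return means


def percent_total_error(training_rows, testing_rows, predictor):
    """Absolute gap between predicted and observed total visits, as a percent of observed."""
    means = fit_group_means(training_rows, predictor)
    predicted_total = sum(means[row[predictor]] for row in testing_rows)
    observed_total = sum(row["visits"] for row in testing_rows)
    if observed_total == 0:
        raise ValueError("testing set has no observed visits")
    return 100 * abs(predicted_total - observed_total) / observed_total


def compare_predictors(rows, n_splits=10, train_share=0.75, seed=0):
    """Repeat random 75/25 splits and return both predictors' errors for each split."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_splits):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_share)
        train, test = shuffled[:cut], shuffled[cut:]
        results.append(
            (
                percent_total_error(train, test, "medical_insurance"),
                percent_total_error(train, test, "dental_insurance"),
            )
        )
    return results


# Hypothetical data: 200 people with correlated insurance indicators.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    medical = 1 if data_rng.random() < 0.85 else 0
    dental = 1 if medical and data_rng.random() < 0.7 else 0
    visits = data_rng.choice([0, 1, 1, 2, 2, 3]) if medical else data_rng.choice([0, 0, 1])
    rows.append({"medical_insurance": medical, "dental_insurance": dental, "visits": visits})

for split, (medical_error, dental_error) in enumerate(compare_predictors(rows), start=1):
    print(f"Split {split}: medical {medical_error:.1f}%, dental {dental_error:.1f}%")
```

Repeating the split guards against a conclusion that depends on one particular random partition of the data.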

Recent analyses have explored potential improvements to HWSM. Some analyses have already been implemented in HWSM—such as improvements to attrition rates for RNs and LPNs under age 50, and improvements to state-level estimates of the number of new nurses entering the nurse workforce.

Other analyses continue to be explored including:

  1. Incorporating economic factors: Health workforce supply and demand respond to economic forces. Labor economic theory and empirical research find that economic factors such as compensation and wealth can affect individual and household employment decisions. Likewise, economic factors affect the decisions made by individuals, households, payers, health care provider organizations, and other entities (e.g., medical device companies, pharmaceutical companies) on health care products and services to consume and provide. Economic factors affect resource allocation decisions to meet the demand for health care services, and often signal any growing gap between supply and demand. Research explored the impact of nurse wages on labor force participation decisions, weekly hours worked, and cross-state migration. This research added cost-of-living-adjusted state data on mean wages for RNs and LPNs as explanatory variables in the regression equations predicting labor force decisions. While most findings were as expected, many were either not statistically significant or had minimal impact. Additional research on other health occupations is required before incorporating such economic factors into HWSM. A review of the literature explored how economic factors might affect demand for health care services and providers. Little recent research has been published on this topic.
  2. Using dynamic staffing patterns: HWSM uses national staffing ratios when modeling demand for health care workers based on projected demand for health care services. Our research explored how nurse staffing in nursing homes (resident-to-nurse ratio) and hospitals (nurse-to-inpatient days ratio) varies across states as a function of nurse wages. The analysis used a regression approach with individual hospitals and individual nursing homes as units of analysis. Explanatory variables included the ratio of state mean RN wages to state mean LPN wages. The hypothesis is that as RN wages rise relative to LPN wages, some employers will substitute LPN labor for RN labor at the margin. (Higher wages also could be indicative of a provider shortfall.) Findings were consistent with expected results. However, the overall impact on demand for nurses is small. Future research will continue to explore this topic, seeking to improve data sources available for analysis and extending to other health occupations.
  3. Modeling more detailed hospital care delivery settings: HWSM models demand for hospital inpatient care based on total inpatient days, without accounting for care delivery setting. Research used State Inpatient Databases from five states where a revenue code allowed for tracking the number of inpatient days across different units of the hospital. For analysis, units were grouped under the categories critical care, medical/surgical, obstetrics and newborn (excluding intensive care units), psychiatric, and other. Nurse staffing levels differ by unit type. Also, demand for some occupations—such as critical care physicians and respiratory therapists—is concentrated in critical care units. Modeling of these more detailed settings could improve demand projections for some occupations. This is an area of ongoing research as new data (2021 and later) becomes available.
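
To illustrate why unit-level detail could matter, the sketch below compares nurse demand computed from a single overall staffing ratio with demand computed from unit-specific ratios when the mix of inpatient days shifts toward critical care. The unit categories follow the text, but the staffing ratios, the inpatient day counts, and the function names are hypothetical.

```python
# Hypothetical nurse full-time equivalents (FTEs) required per 1,000 inpatient days.
# Unit categories follow the text; the ratios are invented for illustration.
NURSE_FTE_PER_1000_DAYS = {
    "critical care": 12.0,
    "medical/surgical": 6.0,
    "obstetrics and newborn": 7.0,
    "psychiatric": 5.0,
    "other": 5.5,
}


def nurse_demand_by_unit(inpatient_days_by_unit):
    """Apply unit-specific staffing ratios to projected inpatient days."""
    return {
        unit: days / 1000 * NURSE_FTE_PER_1000_DAYS[unit]
        for unit, days in inpatient_days_by_unit.items()
    }


def nurse_demand_single_ratio(inpatient_days_by_unit, fte_per_1000_days):
    """Apply one overall staffing ratio to total inpatient days."""
    return sum(inpatient_days_by_unit.values()) / 1000 * fte_per_1000_days


# Hypothetical base-year inpatient days by unit type.
base_days = {
    "critical care": 10_000,
    "medical/surgical": 60_000,
    "obstetrics and newborn": 10_000,
    "psychiatric": 10_000,
    "other": 10_000,
}
# Overall ratio implied by the base-year mix of inpatient days.
base_total_fte = sum(nurse_demand_by_unit(base_days).values())
overall_ratio = base_total_fte / (sum(base_days.values()) / 1000)

# Hypothetical future year: same total days, but a shift toward critical care.
future_days = {**base_days, "critical care": 20_000, "medical/surgical": 50_000}
by_unit = sum(nurse_demand_by_unit(future_days).values())
single = nurse_demand_single_ratio(future_days, overall_ratio)
print(f"Unit-specific demand: {by_unit:,.0f} FTEs; single-ratio demand: {single:,.0f} FTEs")
```

With total inpatient days unchanged, the single-ratio calculation shows no change in nurse demand, while the unit-specific calculation rises because critical care days require more nursing time in this invented example.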

HWSM continues to evolve as newer and better data becomes available, and as published research helps inform model parameters and scenarios.

[1] Eddy DM, Hollingworth W, Caro JJ, Tsevat J, McDonald KM, Wong JB. Model Transparency and Validation: A Report of the ISPOR-SMDM Modeling Good Research Practices Task Force-7. Value in Health. 2012;15(6):843-850.