Using item‐response theory to improve interpretation of the Trans Woman Voice Questionnaire

NW Zhao, JM Mason, AM Blum, EK Kim, VVN Young, CA Rosen, SL Schneider
The Laryngoscope, 2023 (Wiley Online Library)
Objective
The Trans Woman Voice Questionnaire (TWVQ) is commonly used to quantify self‐perceptions of voice for trans women seeking gender‐affirming voice care, but the interpretation of TWVQ scores remains challenging. The objective of this study was to use item‐response theory (IRT) to evaluate the relationship between TWVQ items and persons on a common scale and identify improvements to increase the meaningfulness of TWVQ scores.
Methods
A retrospective review of TWVQ scores from trans women patients between 2018 and 2020 was performed. Rasch‐family models were used to generate item‐person maps positioning respondent location and item difficulty estimates on a logit scale; the logit values were then converted into scaled scores using linear transformations.
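As a rough illustration of the two steps described in the Methods, the sketch below computes response-category probabilities for one item under a partial credit model and then maps a logit value onto a scaled score. It rests on assumptions: the abstract says only "Rasch‐family models" and "linear transformations", so the choice of the partial credit model, the function names, and the slope and intercept values are hypothetical placeholders and not the study's actual specification.

```python
import math


def pcm_category_probs(theta, thresholds):
    """Category probabilities for one item under a partial credit model.

    theta: person location in logits.
    thresholds: step difficulties in logits; k thresholds give k + 1 categories.
    The partial credit model is an assumed example of a Rasch-family model.
    """
    # Cumulative sums of (theta - threshold); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


def logit_to_scaled(logit, slope=10.0, intercept=100.0):
    """Linear transformation from a logit value to a scaled score.

    slope and intercept are placeholder values, NOT those used in the study.
    """
    return slope * logit + intercept
```

With a single threshold the model reduces to the dichotomous Rasch model, so a person located exactly at the threshold has equal probability of either category.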
Results
TWVQ responses from 86 patients were analyzed. Initial item‐person maps demonstrated that the middle response categories (“sometimes” and “often”) performed inconsistently across items (poor threshold banding); interpretability improved when these ratings were scored as one category. The models were rerun using revised scoring, which retained high reliability (0.93) and supported a unidimensional construct. Updated item‐person maps revealed four scaled score zones (≤54, >54 to ≤101, >101 to ≤140, and >140) that each corresponded to an increasing pattern of item thresholds (probability of selecting one response category vs. others). These ranges can be interpreted as minimal, low, moderate, and high, respectively.
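The revised scoring and the score zones reported above can be expressed as a short lookup. This is a minimal sketch: the cut points and zone labels come from the abstract, while the function names, the merged label "sometimes/often", and the use of text labels for responses are illustrative assumptions.

```python
# Middle response categories that the abstract reports scoring as one category.
MIDDLE_CATEGORIES = {"sometimes", "often"}


def collapse_middle(label):
    """Merge "sometimes" and "often" into one category; pass other labels through.

    The merged label "sometimes/often" is a hypothetical name for illustration.
    """
    if label.strip().lower() in MIDDLE_CATEGORIES:
        return "sometimes/often"
    return label


def twvq_zone(scaled_score):
    """Map a scaled score to the zone reported in the abstract.

    Zones: <=54 minimal, >54 to <=101 low, >101 to <=140 moderate, >140 high.
    """
    if scaled_score <= 54:
        return "minimal"
    if scaled_score <= 101:
        return "low"
    if scaled_score <= 140:
        return "moderate"
    return "high"
```

The upper bound of each zone is inclusive, matching the "≤" in the reported ranges, so a scaled score of exactly 101 falls in the low zone.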
Conclusions
Empiric data from Rasch analysis support new interval scoring for the TWVQ that advances the clinical and research utility of the instrument and lays the foundation for future improvements in clinical care and outcomes assessment.
Level of Evidence
NA
Laryngoscope, 133:1197–1204, 2023