Patwardhan, Gakidou and Co-authors Explore Health Differences Between Females and Males Across Major Causes of Disease Burden

CSDE Affiliates Vedavati Patwardhan (Center on Gender Equity & Health, UC San Diego) and Emmanuela Gakidou (Health Metrics Sciences), along with Luisa Flor, Gabriela Gil, and other co-authors from the Institute for Health Metrics and Evaluation (IHME), released an article in The Lancet Public Health entitled “Differences across the lifespan between females and males in the top 20 causes of disease burden globally: a systematic analysis of the Global Burden of Disease Study 2021.” Using data from the 2021 Global Burden of Disease Study, the study systematically explores health differences between females and males across major causes of disease burden.
Their analysis examines 20 major causes of disease burden (health loss) globally and by world region, and covers females and males from adolescence to older ages. They find that, overall, males face higher health loss. In 2021, health loss measured in disability-adjusted life years (DALYs) was higher in males than in females for 13 of the top 20 causes of disease burden. These conditions included COVID-19, road injuries, and a range of cardiovascular, respiratory, and liver diseases. Importantly, the study highlights that females and males experience health and disease differently throughout the lifespan. Females bear a disproportionate toll from morbidity-driven conditions, which mainly cause disability throughout life rather than death at younger ages. These include low back pain, depressive disorders, headache disorders, anxiety, other musculoskeletal disorders, Alzheimer’s disease and other dementias, and HIV/AIDS. Males, on the other hand, bear higher health loss from mortality-driven conditions such as COVID-19, road injuries, and heart disease. Providing comparable estimates across conditions, regions, and time enables researchers and policy makers to clearly identify key health differences and to inform priority areas for interventions targeting differences in female–male health outcomes.

CSDE Computational Demography Working Group (CDWG) Hosts Jiahui Xu on New Natural Language Processing Models for Automated Coding (5/15/2024)

On 5/15 from 9:00 AM – 10:00 AM, CDWG will host Jiahui Xu to present her research. Jiahui Xu is a Ph.D. candidate in Sociology and Demography at Pennsylvania State University. Her research interests lie in social inequality, quantitative methodology, and computational sociology. Her ongoing projects include (1) adapting generalized random forests for causal decomposition to investigate college returns; (2) combining machine learning and causal inference methods to decompose health disparities; and (3) applying natural language processing models to autocode occupational text data. The event will take place in 223 Raitt (the Demography Lab) and on Zoom (register here). Learn more about the talk in the full story.

Title: From Job Descriptions to Occupations: New Natural Language Processing Models for Automated Coding

Abstract: Occupation is a fundamental concept in social and policy research, but classifying job descriptions into occupational categories can be challenging and susceptible to errors. Traditionally, this involved expert manual coding, translating detailed, often ambiguous job descriptions to standardized categories, a process both laborious and costly. However, recent advances in computational techniques offer efficient automated coding alternatives. Existing autocoding tools, including the O*NET-SOC AutoCoder, the NIOCCS AutoCoder, and the SOCcer AutoCoder, rely on supervised machine learning methods and string-matching algorithms. Yet these autocoders are not designed to understand semantic meanings in occupational write-in text. We develop a new autocoder based on Google’s Text-to-Text Transfer Transformer (T5) model. Like GPT and other large language models, T5 is pretrained on vast amounts of text data. We develop a T5-based occupational classifier (T5-OCC) model with fine-tuned model parameters and training data from occupation write-ins from the 2019 American Community Survey. By comparing our T5-OCC with existing methods, we show that the autocoding accuracy rate increases from 61.8% to 71.1%. Considering the rapid change in neural language models, we conclude by offering suggestions on how to adapt our method for the development of occupational autocoding models in future research.
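To make the abstract's contrast concrete, here is a minimal Python sketch of a string-matching autocoder and the accuracy rate used to compare autocoders. It is not the T5-OCC model or any of the existing tools named above. The title index, category labels, similarity threshold, and function names are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical title index with placeholder category labels (not real SOC codes).
TITLE_INDEX = {
    "registered nurse": "NURSE",
    "truck driver": "DRIVER",
    "software developer": "DEVELOPER",
}


def string_match_code(write_in, index=TITLE_INDEX, threshold=0.8):
    """Return the label of the most similar indexed title, or None if no title is close enough."""
    text = write_in.lower().strip()
    best_label, best_score = None, 0.0
    for title, label in index.items():
        # Character-level similarity in [0, 1]; it knows nothing about meaning.
        score = SequenceMatcher(None, text, title).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


def accuracy(predictions, gold):
    """Share of write-ins whose predicted label equals the expert-assigned label."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)
```

In this toy setup, a near-verbatim write-in such as "truck drivers" is coded, while a paraphrase such as "i write code for mobile apps" shares few characters with any indexed title and is left uncoded. That is the kind of semantic gap the abstract says a pretrained language model is meant to close.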