The Use of Multiple Correspondence Analysis to Explore Associations between Categories of Qualitative Variables and Cancer Incidence
Publication date: 2021
Florensa Cazorla, Dídac
Background: Previous work has shown that risk factors for some kinds of cancer depend on people's lifestyle (e.g. rural or urban residence). This article examines that dependence by seeking relationships between cancer type, age group, gender and population (rural or urban residence) in the region of Lleida (Catalonia, Spain) using Multiple Correspondence Analysis (MCA).

Methods: The dataset comprised 3,408 cancer episodes recorded between 2012 and 2014, extracted from the Population-based Cancer Registry (PCR) for Lleida province. The cancers studied were colon and rectal (1,059 cases), lung (551 cases), urinary bladder (446 cases), prostate (609 cases) and breast (743 cases). MCA was applied to search for relationships among the main qualitative features. The basic statistics were the percentage of explained variance, the inertia and the contribution of each qualitative variable.

Results: General outcomes showed a low and moderate contribution of living in rural areas to colorectal and male prostate cancer. Males in urban areas were slightly and heavily affected by lung and urinary bladder cancer respectively. The analysis of each cancer provided additional information. Colorectal cancer greatly affected males aged <60, urban residents aged 70-79, and rural females aged 80. The impact of lung cancer was high among urban females aged <60, moderate among males aged 70-79 and high among rural females aged 80. The results for urinary bladder cancer were similar to those for lung cancer. Prostate cancer affected both the <60 and 80 age groups significantly in rural areas. Breast cancer hit the 70-79 group significantly and, somewhat less so, rural females aged 80.

Conclusions: MCA was a significant help in detecting the contributions of qualitative variables and the associations between them, and proved to be an effective technique for analyzing the incidence of cancer. The outcomes obtained help to corroborate suspected trends and to detect and stimulate new hypotheses about the risk factors associated with a specific area and cancer. These findings will be helpful for encouraging new studies and prevention campaigns to highlight observed singularities.
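
To make the three statistics named in the Methods (inertia, percentage of explained variance, and contribution of each category) more concrete, here is a minimal sketch of MCA computed as a correspondence analysis of the indicator matrix. It is not the authors' implementation, and the abstract does not say which software they used. The records are invented toy data, not registry data, and the helper names `one_hot` and `mca` are hypothetical.

```python
import numpy as np

def one_hot(records):
    """Build the 0/1 indicator matrix from equal-length tuples of categories."""
    n_vars = len(records[0])
    labels = []
    for q in range(n_vars):
        labels += [(q, level) for level in sorted({rec[q] for rec in records})]
    index = {lab: j for j, lab in enumerate(labels)}
    Z = np.zeros((len(records), len(labels)))
    for i, rec in enumerate(records):
        for q, level in enumerate(rec):
            Z[i, index[(q, level)]] = 1.0
    return Z, labels

def mca(Z):
    """Correspondence analysis of the indicator matrix Z (one common MCA variant)."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column (category) masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    _, sing, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sing > 1e-10                  # drop numerically null axes
    sing, Vt = sing[keep], Vt[keep]
    inertia = sing ** 2                  # principal inertia of each axis
    explained = inertia / inertia.sum()  # share of total inertia per axis
    coords = (Vt.T * sing) / np.sqrt(c)[:, None]  # category principal coordinates
    contrib = Vt.T ** 2                  # contribution of each category to each axis
    return inertia, explained, coords, contrib

# Invented toy records: (cancer type, gender, age group, residence).
records = [
    ("colorectal", "male", "<60", "urban"),
    ("colorectal", "female", "70-79", "rural"),
    ("lung", "male", "70-79", "urban"),
    ("lung", "female", "<60", "urban"),
    ("bladder", "male", "60-69", "urban"),
    ("bladder", "male", "70-79", "rural"),
    ("prostate", "male", "<60", "rural"),
    ("prostate", "male", "60-69", "rural"),
    ("breast", "female", "70-79", "urban"),
    ("breast", "female", "60-69", "rural"),
]

Z, labels = one_hot(records)
inertia, explained, coords, contrib = mca(Z)

# Print each category with its contribution to the first axis.
for (q, level), share in zip(labels, contrib[:, 0]):
    print(f"variable {q}, category {level}: {share:.3f}")
```

In this variant the total inertia depends only on the coding: with J categories spread over Q variables it equals J/Q - 1, which is why raw percentages of explained inertia in MCA tend to look small. The contributions of all categories to a given axis sum to 1, so categories with large shares are the ones that define that axis.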
Is part of: IEEE Journal of Biomedical and Health Informatics, 2021, vol. 25, no. 9, p. 3659-3667