Yoo, C., Saxena, A., Krup, K., Kulkarni, V., Kulkarni, S., Klausner, J., Devieux, J., & Madhivanan, P. (December 2013). Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India.
ABSTRACT
Discernment analyses of survey data are being developed to help researchers better understand the intentions of surveyed subjects. These models can aid decision-making by allowing calculation of the likelihood of a particular outcome based on a subject’s known characteristics. The most frequently used discernment analysis in epidemiological datasets with binary outcomes is logistic regression. However, modern Bayesian discernment methods (i.e., the Naïve Bayes classifier and Bayesian networks) have shown promising results, especially with datasets that have a large number of independent variables (>30). A study was conducted to review and compare these models, elucidate the advantages and disadvantages of each, and provide criteria for model selection. Both approaches were used to estimate acceptance of medical male circumcision among a sample of 457 males in Pune, India, on the basis of their answers to a survey that included questions on sociodemographics, HIV prevention knowledge, high-risk behaviors, and other characteristics. Although the models demonstrated similar performance, the Bayesian methods performed better in cross-validation evaluations, especially in predicting negative cases, i.e., subjects who did not want to undergo medical male circumcision. Since there were fewer negative cases in the dataset, this suggests that Bayesian methods perform better than logistic regression when sample sizes are smaller. Identifying the models’ unique characteristics, strengths as well as limitations, may help improve decision-making.
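As an illustration of the kind of calculation the abstract describes, the following minimal sketch shows how a Naïve Bayes classifier can turn a subject's known characteristics into an outcome probability, and how performance on negative cases (specificity) can be measured separately from performance on positive cases (sensitivity). It is not the study's implementation: the function names, the two binary features, and the eight toy records are all hypothetical, and the sketch scores the model on its own training data rather than using cross-validation.

```python
def train_naive_bayes(rows, labels):
    """Estimate P(class) and P(feature = 1 | class) with Laplace smoothing."""
    n_features = len(rows[0])
    priors, cond = {}, {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        priors[c] = len(members) / len(rows)
        cond[c] = [
            (sum(r[j] for r in members) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
    return priors, cond


def predict_proba(model, row):
    """Return the posterior probability of each class for one subject."""
    priors, cond = model
    scores = {}
    for c, prior in priors.items():
        p = prior
        for j, x in enumerate(row):
            p *= cond[c][j] if x == 1 else 1 - cond[c][j]
        scores[c] = p
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity: recall on positive cases. Specificity: recall on negative cases."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, NOT from the study: two binary survey answers per
# subject; label 1 = would accept the procedure, 0 = would not.
rows = [(1, 1), (1, 0), (1, 1), (0, 1), (1, 1), (0, 0), (0, 0), (0, 1)]
labels = [1, 1, 1, 1, 1, 0, 0, 0]

model = train_naive_bayes(rows, labels)
probs = [predict_proba(model, r) for r in rows]
preds = [max(p, key=p.get) for p in probs]
sens, spec = sensitivity_specificity(labels, preds)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting sensitivity and specificity separately matters when one class is rare, as with the negative cases here, because overall accuracy can look high even when the minority class is predicted poorly.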