A Complete Worked Example of Propensity Score Matching (R Implementation)
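Propensity score matching (PSM) is used mainly in observational studies, where treatment is not randomized, to make the treat and control groups comparable on their other characteristics (such as age, weight, height, and race), approximating the covariate balance that a randomized controlled trial (RCT) obtains by design. Take a study of a drug's effect on a disease: apart from receiving the drug or a placebo, the treat and control groups should be broadly similar in their other clinical characteristics (age, weight, and so on). Only then are the two groups comparable, and only then can the drug's efficacy be assessed.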
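As the figure below shows, the treatment is actually ineffective, but an age imbalance between the groups leads to the wrong conclusion.
To measure the between-group difference (or distance) for a single characteristic, we usually use the standardized mean difference (SMD; see PMC3144483).
For continuous variables, the formula is:
For discrete (categorical) variables, the formula is: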
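For reference, the commonly cited form of the SMD for a continuous covariate is written out below. Here $\bar{x}$ and $s^2$ denote the sample mean and sample variance of the covariate within each group; the subscript names are chosen for illustration and are an assumption, not a transcription of the original figure.

```latex
\mathrm{SMD} = \frac{\bar{x}_{\mathrm{treat}} - \bar{x}_{\mathrm{control}}}
                    {\sqrt{\dfrac{s^{2}_{\mathrm{treat}} + s^{2}_{\mathrm{control}}}{2}}}
```

The denominator is a pooled standard deviation, so the SMD is unit-free and can be compared across covariates measured on different scales.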
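For a two-level (dichotomous) covariate, the usual form replaces the group means by the group proportions $\hat{p}$ and the variances by $\hat{p}(1-\hat{p})$. The notation is again illustrative; covariates with more than two levels need a multivariate generalisation that is not shown here.

```latex
\mathrm{SMD} = \frac{\hat{p}_{\mathrm{treat}} - \hat{p}_{\mathrm{control}}}
                    {\sqrt{\dfrac{\hat{p}_{\mathrm{treat}}\,(1 - \hat{p}_{\mathrm{treat}})
                                 + \hat{p}_{\mathrm{control}}\,(1 - \hat{p}_{\mathrm{control}})}{2}}}
```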
Below we use the tableone package to compute the SMD of each variable. The data come from the right heart catheterization (RHC) dataset; the treat group is "RHC" and the control group is "No RHC" (variable swang1, recoded below as 1 and 0 because the later steps compare it against 1 and 0).
library(tableone)
## PS matching
library(Matching)
## Weighted analysis
library(survey)
library(reshape2)
library(ggplot2)

## Right heart cath dataset
rhc <- read.csv("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/rhc.csv")
## Recode treatment: 1 = RHC (treat), 0 = No RHC (control)
rhc$swang1 <- as.integer(rhc$swang1 == "RHC")

## Covariates to summarize
vars <- c("age","sex","race","edu","income","ninsclas","cat1","das2d3pc","dnr1",
          "ca","surv2md1","aps1","scoma1","wtkilo1","temp1","meanbp1","resp1",
          "hrt1","pafi1","paco21","ph1","wblc1","hema1","sod1","pot1","crea1",
          "bili1","alb1","resp","card","neuro","gastr","renal","meta","hema",
          "seps","trauma","ortho","cardiohx","chfhx","dementhx","psychhx",
          "chrpulhx","renalhx","liverhx","gibledhx","malighx","immunhx",
          "transhx","amihx")

## Construct a table
tabUnmatched <- CreateTableOne(vars = vars, strata = "swang1", data = rhc, test = FALSE)
## Show table with SMD
print(tabUnmatched, smd = TRUE)
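The figure below shows the result. The first column lists the covariates, the following columns give the summary statistics (mean and SD) of each variable by RHC status (treat and control groups), and the last column is the SMD. The differences in age and sex between the treat and control groups are small (SMD < 0.1, i.e. 10%), while the difference in income is larger (SMD > 0.1).
Next, fit a logistic regression to compute each sample's propensity score (PS), that is, its probability of being assigned to RHC.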
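To make the SMD column concrete, the sketch below computes an SMD by hand for one continuous and one binary covariate. The helper names `smd_continuous` and `smd_binary` and all input values are hypothetical; they are not part of tableone and are not taken from the RHC data.

```r
# Hypothetical toy values, for illustration only (not from the RHC dataset)
age_treat   <- c(60, 65, 70, 75, 80)
age_control <- c(50, 55, 60, 65, 70)

# SMD for a continuous covariate: mean difference over the pooled SD
smd_continuous <- function(x1, x0) {
  (mean(x1) - mean(x0)) / sqrt((var(x1) + var(x0)) / 2)
}

# SMD for a binary covariate, given the proportion in each group
smd_binary <- function(p1, p0) {
  (p1 - p0) / sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
}

smd_age <- smd_continuous(age_treat, age_control)  # large imbalance
smd_sex <- smd_binary(0.52, 0.50)                  # small imbalance

# Flag covariates whose absolute SMD exceeds the 0.1 threshold
imbalanced <- abs(c(age = smd_age, sex = smd_sex)) > 0.1
print(imbalanced)
```

With these toy numbers the age covariate is flagged as imbalanced and the sex covariate is not, mirroring how the 10% threshold is read off the table.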
## Fit model
psModel <- glm(formula = swang1 ~ age + sex + race + edu + income + ninsclas +
                   cat1 + das2d3pc + dnr1 + ca + surv2md1 + aps1 + scoma1 +
                   wtkilo1 + temp1 + meanbp1 + resp1 + hrt1 + pafi1 +
                   paco21 + ph1 + wblc1 + hema1 + sod1 + pot1 + crea1 +
                   bili1 + alb1 + resp + card + neuro + gastr + renal +
                   meta + hema + seps + trauma + ortho + cardiohx + chfhx +
                   dementhx + psychhx + chrpulhx + renalhx + liverhx + gibledhx +
                   malighx + immunhx + transhx + amihx,
               family = binomial(link = "logit"),
               data = rhc)

## Predicted probability of being assigned to RHC
rhc$pRhc <- predict(psModel, type = "response")
## Predicted probability of being assigned to no RHC
rhc$pNoRhc <- 1 - rhc$pRhc

## Predicted probability of being assigned to the
## treatment actually assigned (either RHC or no RHC)
rhc$pAssign <- NA
rhc$pAssign[rhc$swang1 == 1] <- rhc$pRhc[rhc$swang1 == 1]
rhc$pAssign[rhc$swang1 == 0] <- rhc$pNoRhc[rhc$swang1 == 0]
## Smaller of pRhc vs pNoRhc for matching weight
rhc$pMin <- pmin(rhc$pRhc, rhc$pNoRhc)
Then use the Matching package to match comparable samples (1:1 matching).
listMatch <- Match(Tr = (rhc$swang1 == 1),  # Need to be in 0,1
                   ## logit of PS, i.e., log(PS/(1-PS)) as matching scale
                   X = log(rhc$pRhc / rhc$pNoRhc),
                   ## 1:1 matching
                   M = 1,
                   ## caliper = 0.2 * SD(logit(PS))
                   caliper = 0.2,
                   replace = FALSE,
                   ties = TRUE,
                   version = "fast")
## Check covariate balance in the unmatched and in the matched data
mb <- MatchBalance(psModel$formula, data = rhc, match.out = listMatch, nboots = 50)
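The idea behind 1:1 matching within a caliper, without replacement, can be illustrated with a deliberately simplified greedy sketch in base R. This is not how the Matching package implements `Match`; the function `greedy_match`, its arguments, and the toy scores are all hypothetical, and the caliper is taken here as a multiple of the SD of the whole score vector.

```r
# Hypothetical logit propensity scores: two treated units, three controls
logit_ps   <- c(0.10, 1.50, 0.12, 0.40, 3.00)
is_treated <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

# Greedy nearest-neighbour 1:1 matching without replacement.
# A treated unit is left unmatched if its closest remaining control
# is further away than caliper_sd standard deviations of the score.
greedy_match <- function(score, treated, caliper_sd = 0.2) {
  max_dist    <- caliper_sd * sd(score)
  control_idx <- which(!treated)
  matched_t   <- integer(0)
  matched_c   <- integer(0)
  for (i in which(treated)) {
    if (length(control_idx) == 0) break
    d <- abs(score[control_idx] - score[i])
    j <- which.min(d)
    if (d[j] <= max_dist) {
      matched_t   <- c(matched_t, i)
      matched_c   <- c(matched_c, control_idx[j])
      control_idx <- control_idx[-j]  # each control is used at most once
    }
  }
  data.frame(treated = matched_t, control = matched_c)
}

pairs <- greedy_match(logit_ps, is_treated)
print(pairs)
```

In this toy input the first treated unit finds a control within the caliper, while the second has no control close enough and is dropped, which is why matched datasets are usually smaller than the original.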
Extract the matched samples:
rhcMatched <- rhc[unlist(listMatch[c("index.treated","index.control")]), ]
Now look at the SMDs again after matching: every variable's SMD is now below 10%.
## Construct a table
tabMatched <- CreateTableOne(vars = vars, strata = "swang1", data = rhcMatched, test = FALSE)
## Show table with SMD
print(tabMatched, smd = TRUE)
Next, weight the samples so that the propensity score distributions of the groups are roughly the same, which removes confounding by the measured covariates; the weighted data serve as a balanced reference. Two weighting schemes are common: inverse probability of treatment weighting (IPTW) and standardized mortality ratio weighting (SMRW). Here we use a refinement of IPTW, the matching weight (PMID: 26238958):
## Matching weight
rhc$mw <- rhc$pMin / rhc$pAssign
## IPTW, computed for comparison (not used below)
rhc$mw1 <- ifelse(rhc$swang1 == 1, 1 / rhc$pRhc, 1 / (1 - rhc$pRhc))
## Weighted data
rhcSvy <- svydesign(ids = ~ 1, data = rhc, weights = ~ mw)
## Construct a table (This is a bit slow.)
tabWeighted <- svyCreateTableOne(vars = vars, strata = "swang1", data = rhcSvy, test = FALSE)
## Show table with SMD
print(tabWeighted, smd = TRUE)
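A small numeric sketch may help to see how the two weights differ. The propensity scores and treatment indicators below are made-up values, and the variable names are illustrative only.

```r
# Hypothetical propensity scores and treatment indicators (1 = treated)
ps      <- c(0.2, 0.5, 0.8)
treated <- c(1, 0, 1)

# Probability of the treatment each unit actually received
p_assign <- ifelse(treated == 1, ps, 1 - ps)

# IPTW: inverse of the probability of the received treatment
iptw <- 1 / p_assign

# Matching weight: smaller of PS and 1 - PS, divided by p_assign
mw <- pmin(ps, 1 - ps) / p_assign

print(data.frame(ps, treated, p_assign, iptw, mw))
```

The matching weight never exceeds 1, whereas IPTW grows without bound as the probability of the received treatment approaches 0, which is one reason the matching weight tends to be more stable.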
Between-group differences after weighting (very small):
Plot the Unmatched, Matched, and Weighted results for comparison:
library(data.table)
## Construct a data frame containing variable name and SMD from all methods
dataPlot <- data.table(variable = rownames(ExtractSmd(tabUnmatched)),
                       Unmatched = ExtractSmd(tabUnmatched),
                       Matched = ExtractSmd(tabMatched),
                       Weighted = ExtractSmd(tabWeighted))
colnames(dataPlot) <- c("variable","Unmatched","Matched","Weighted")

## Create long-format data for ggplot2
dataPlotMelt <- melt(data = dataPlot,
                     id.vars = c("variable"),
                     variable.name = "Method",
                     value.name = "SMD")

## Order variable names by magnitude of SMD
varNames <- as.character(dataPlot$variable)[order(dataPlot$Unmatched)]
## Order factor levels in the same order
dataPlotMelt$variable <- factor(dataPlotMelt$variable, levels = varNames)

## Plot using ggplot2
ggplot(data = dataPlotMelt,
       mapping = aes(x = variable, y = SMD, group = Method, color = Method)) +
    geom_line() +
    geom_point() +
    geom_hline(yintercept = 0.1, color = "black", size = 0.1) +
    coord_flip() +
    theme_bw() +
    theme(legend.key = element_blank())
The plot shows that the weighted "reference" data and our PSM result are essentially in agreement. Finally, check whether right heart catheterization affects survival. Because the models below are logistic regressions, the exponentiated coefficient reported by the ShowRegTable function is an odds ratio (OR) rather than a hazard ratio, shown with its 95% CI and p-value.
## Unmatched model (unadjusted)
glmUnmatched <- glm(formula = (death == "Yes") ~ swang1,
                    family = binomial(link = "logit"),
                    data = rhc)
## Matched model
glmMatched <- glm(formula = (death == "Yes") ~ swang1,
                  family = binomial(link = "logit"),
                  data = rhcMatched)
## Weighted model
glmWeighted <- svyglm(formula = (death == "Yes") ~ swang1,
                      family = binomial(link = "logit"),
                      design = rhcSvy)

## Show results together
resTogether <- list(Unmatched = ShowRegTable(glmUnmatched, printToggle = FALSE),
                    Matched = ShowRegTable(glmMatched, printToggle = FALSE),
                    Weighted = ShowRegTable(glmWeighted, printToggle = FALSE))
print(resTogether, quote = FALSE)
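Since these outcome models are logistic regressions, the exponentiated treatment coefficient is an odds ratio. The sketch below checks this on a made-up 2x2 table, where the odds ratio can also be computed directly as a cross-product ratio; the counts and the names `toy`, `fit`, and `odds_ratio` are hypothetical.

```r
# Hypothetical counts: 30 of 50 treated die, 20 of 50 controls die
toy <- data.frame(
  treated = rep(c(1, 0), each = 50),
  death   = c(rep(c(1, 0), times = c(30, 20)),
              rep(c(1, 0), times = c(20, 30)))
)

# Logistic regression of death on treatment
fit <- glm(death ~ treated, family = binomial(link = "logit"), data = toy)

# Exponentiated coefficient = odds ratio for treated vs control
odds_ratio <- unname(exp(coef(fit)["treated"]))

# Same quantity from the 2x2 table: (odds in treated) / (odds in control)
cross_product <- (30 / 20) / (20 / 30)

print(c(glm = odds_ratio, table = cross_product))
```

The two numbers agree up to the convergence tolerance of the fit, which is why the effect measure from these models should be read as an odds ratio.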
References:
https://cran.r-project.org/web/packages/tableone/vignettes/smd.html
https://www.mediecogroup.com/method_topic_article_detail/131/