The usethnicity data set contains variables on race and ethnic identification from the 2017 Youth Risk Behaviour Survey (YRBS), together with two variables on smoking behaviour. The YRBS is a multistage cluster-sampled survey, so valid inference about associations requires using survey design information. This subset of variables, without weights, is useful only for demonstration purposes.
library(rimu)
data(usethnicity)
head(usethnicity)
##   Q4 Q5 QN30 QN31
## 1  2  E    2    2
## 2  1       2    2
## 3  1  A    1    2
## 4  1       2    2
## 5  2  E    2    2
## 6  2  E    1    1
Question 4 asks "Are you Hispanic or Latino?", and Question 5 asks respondents to select any of five race categories, lettered A to E, that apply. In the data set, the selected letters are pasted together into a single variable.
We need to split Q5 into its component letters. The as.mr method for character strings does this:
race<-as.mr(usethnicity$Q5,"")
mtable(race)
##         A    B    C    D    E    F    G    H
##  847  863 1014 3643  455 8306    5    1    7
There's a spurious " " category from the string splitting, and the values F, G, and H are also invalid, so we need to remove them:
race<-mr_drop(race,c(" ","F","G","H"))
mtable(race)
## A B C D E
## 863 1014 3643 455 8306
We might want easier-to-recognise names for the categories
race <- mr_recode(race, AmIndian="A",Asian="B", Black="C", Pacific="D", White="E")
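To make the split, drop, and recode steps concrete, here is a minimal base-R sketch of the same idea on made-up data: split each string into letters, keep one logical column per letter, remove unwanted columns, and rename the rest. The vector q5 and the lookup new_names are hypothetical, and this illustrates the underlying data structure only; it is not how rimu implements as.mr, mr_drop, or mr_recode.

```r
# Hypothetical toy responses (not YRBS data); "G" stands in for an invalid code
q5 <- c("E", "AC", "", "BE", "ACG")

# Split each string into single characters
picked <- strsplit(q5, "")
codes <- sort(unique(unlist(picked)))

# One logical column per code: TRUE where the respondent chose it
m <- t(vapply(picked, function(x) codes %in% x, logical(length(codes))))
colnames(m) <- codes

# Drop the invalid code, then give the remaining columns readable names
m <- m[, setdiff(colnames(m), "G"), drop = FALSE]
new_names <- c(A = "AmIndian", B = "Asian", C = "Black", E = "White")
colnames(m) <- unname(new_names[colnames(m)])

colSums(m)  # number of respondents choosing each category
```

Each row is one respondent, so a respondent who chose several letters contributes to several column totals.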
Now, Hispanic/Latino ethnicity is asked in a separate question. We convert it via the as.mr method for logical vectors, and then combine it with race:
hispanic<-as.mr(usethnicity$Q4==1, "Hispanic")
ethnicity<-mr_union(race, hispanic)
ethnicity[101:120]
## [1] "Black" "Black" "Black"
## [4] "Black" "AmIndian+Black" "Black"
## [7] "Black" "Black" "Black"
## [10] "Black" "Black" "Black"
## [13] "Black+?Hispanic" "Black" "Black"
## [16] "Black" "Black" "Black"
## [19] "AmIndian+Black+White" "Black"
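In an indicator-matrix view, combining two multiple-response variables amounts to placing their columns side by side. The base-R sketch below shows this on made-up data; the objects race_m and q4 are hypothetical, and the code is an illustration of the idea rather than the mr_union implementation. Note that a missing answer to the separate question stays missing instead of silently becoming FALSE.

```r
# Hypothetical race indicators for three respondents
race_m <- matrix(c(TRUE,  FALSE,
                   FALSE, TRUE,
                   TRUE,  TRUE),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(NULL, c("Black", "White")))

# Hypothetical answers to the separate yes/no question (1 = yes, 2 = no)
q4 <- c(1, 2, NA)
hispanic <- q4 == 1          # TRUE, FALSE, NA

# Union of the two variables: all categories in one matrix
combined <- cbind(race_m, Hispanic = hispanic)
combined
```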
The plot method shows co-occurrence of the various race/ethnicity terms:
plot(ethnicity,nsets=6)
## Warning: Removed 1 rows containing missing values (geom_bar).
Tabulations against other factor or multiple-response variables are possible with mtable. Note that mtable shows frequencies for each category; use as.character to get frequencies for combinations. Do not use as.factor, which is not generic and so cannot have an mr method.
mtable(ethnicity, usethnicity$QN30)
## 1 2 <NA>
## AmIndian 242 466 0
## Asian 154 679 0
## Black 612 2000 0
## Pacific 120 256 0
## White 2112 4759 0
## Hispanic 889 2301 0
table(ethnicity %has% "Black", usethnicity$QN30)
##
## 1 2
## FALSE 2704 6596
## TRUE 612 2000
table(ethnicity %hasonly% "Black", usethnicity$QN30)
##
## 1 2
## FALSE 2878 7015
## TRUE 438 1581
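The two operators answer different questions: in the output above, the TRUE row for %has% "Black" (612 and 2000) matches the Black row of the mtable output, while the TRUE row for %hasonly% "Black" (438 and 1581) counts only respondents with no other category. A minimal base-R sketch of that distinction on a made-up indicator matrix follows; the matrix m and the variable names are hypothetical, and this is not rimu's code.

```r
# Hypothetical indicator matrix: one row per respondent
m <- matrix(c(TRUE,  FALSE, FALSE,   # Black only
              TRUE,  TRUE,  FALSE,   # Black and White
              FALSE, TRUE,  FALSE,   # White only
              TRUE,  FALSE, TRUE),   # Black and Hispanic
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("Black", "White", "Hispanic")))

# "has": the category is among the respondent's choices
has_black <- m[, "Black"]

# "has only": the category is chosen and it is the single choice
only_black <- m[, "Black"] & rowSums(m) == 1

table(has_black)
table(only_black)
```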
table(as.character(ethnicity), usethnicity$QN30)
##
## 1 2
## 27 106
## AmIndian 40 65
## AmIndian+Asian 0 1
## AmIndian+Asian+Black 1 2
## AmIndian+Asian+Black+Pacific 0 0
## AmIndian+Asian+Black+Pacific+Hispanic 0 1
## AmIndian+Asian+Black+Pacific+White 3 3
## AmIndian+Asian+Black+Pacific+White+Hispanic 2 6
## AmIndian+Asian+Black+White 2 2
## AmIndian+Asian+Black+White+Hispanic 1 1
## AmIndian+Asian+Hispanic 0 1
## AmIndian+Asian+Pacific+Hispanic 1 0
## AmIndian+Asian+Pacific+White 0 1
## AmIndian+Asian+White 1 5
## AmIndian+Asian+White+Hispanic 0 1
## AmIndian+Black 11 48
## AmIndian+Black+Hispanic 3 5
## AmIndian+Black+Pacific 0 2
## AmIndian+Black+Pacific+Hispanic 0 1
## AmIndian+Black+Pacific+White 1 0
## AmIndian+Black+Pacific+White+Hispanic 1 3
## AmIndian+Black+White 13 20
## AmIndian+Black+White+Hispanic 5 7
## AmIndian+Hispanic 84 188
## AmIndian+Pacific 2 3
## AmIndian+Pacific+Hispanic 0 3
## AmIndian+Pacific+White 1 3
## AmIndian+Pacific+White+Hispanic 0 2
## AmIndian+White 50 71
## AmIndian+White+Hispanic 20 21
## Asian 80 489
## Asian+Black 2 26
## Asian+Black+Hispanic 1 0
## Asian+Black+Pacific 1 2
## Asian+Black+Pacific+Hispanic 0 1
## Asian+Black+Pacific+White 0 0
## Asian+Black+White 3 4
## Asian+Black+White+Hispanic 2 1
## Asian+Hispanic 14 25
## Asian+Pacific 8 15
## Asian+Pacific+Hispanic 3 2
## Asian+Pacific+White 5 9
## Asian+Pacific+White+Hispanic 3 2
## Asian+White 19 71
## Asian+White+Hispanic 2 8
## Black 438 1581
## Black+Hispanic 40 100
## Black+Pacific 2 14
## Black+Pacific+Hispanic 6 6
## Black+Pacific+White 1 4
## Black+Pacific+White+Hispanic 3 1
## Black+White 59 140
## Black+White+Hispanic 11 19
## Hispanic 375 1010
## Pacific 31 72
## Pacific+Hispanic 34 68
## Pacific+White 8 26
## Pacific+White+Hispanic 4 6
## White 1618 3510
## White+Hispanic 274 812
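The difference between per-category and per-combination frequencies can be checked on a small made-up example. In the base-R sketch below, the matrix ind and the pasted labels are hypothetical stand-ins for a multiple-response variable and its as.character form. Per-category counts can sum to more than the number of respondents, whereas per-combination counts always sum to exactly that number.

```r
# Hypothetical indicator matrix for four respondents
ind <- matrix(c(TRUE,  FALSE,
                TRUE,  TRUE,
                FALSE, TRUE,
                TRUE,  TRUE),
              nrow = 4, byrow = TRUE,
              dimnames = list(NULL, c("Black", "White")))

# Per-category frequencies: a respondent can be counted in several columns
per_category <- colSums(ind)

# Per-combination frequencies: paste each respondent's categories into one label
combo <- apply(ind, 1, function(r) paste(colnames(ind)[r], collapse = "+"))
per_combination <- table(combo)

sum(per_category)     # exceeds nrow(ind) when anyone chose more than one category
sum(per_combination)  # equals nrow(ind)
```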