This vignette showcases two key features that capitalize on the network structure inherent in pedigrees:
Finding extended families with any connecting relationships between members. This feature strictly uses a person’s ID, mother’s ID, and father’s ID to find out which people in a dataset are remotely related by any path, effectively finding all separable extended families in a dataset.
Using path tracing rules to quantify the amount of relatedness between all pairs of individuals in a dataset. The amount of relatedness can be characterized by additive nuclear DNA, shared mitochondrial DNA, sharing both parents, or being part of the same extended pedigree.
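To make the idea of quantifying additive relatedness concrete, here is a minimal base R sketch on a toy pedigree. It uses the tabular (recursive kinship) method, which yields the same additive coefficients as path tracing; it is not the package's implementation. The data and the helper name additive_relatedness are hypothetical, and the sketch assumes rows are ordered so that parents appear before their children.

```r
# Toy pedigree (hypothetical data): parents 1 and 2, children 3 and 4,
# and an unrelated person 5. Parents are listed before their children.
ped <- data.frame(
  personID = 1:5,
  momID    = c(NA, NA, 2, 2, NA),
  dadID    = c(NA, NA, 1, 1, NA)
)

# Hypothetical helper: additive relatedness via the tabular method.
additive_relatedness <- function(ped) {
  n   <- nrow(ped)
  ids <- as.character(ped$personID)
  mom <- match(ped$momID, ped$personID)  # row index of mother, NA if unknown
  dad <- match(ped$dadID, ped$personID)  # row index of father, NA if unknown
  kin <- matrix(0, n, n, dimnames = list(ids, ids))

  # Kinship between parent row p and person j; unknown parents contribute 0.
  k <- function(p, j) if (is.na(p)) 0 else kin[p, j]

  for (i in seq_len(n)) {
    for (j in seq_len(i - 1)) {
      # Kinship with an earlier person is the average over i's two parents.
      kin[i, j] <- kin[j, i] <- 0.5 * (k(mom[i], j) + k(dad[i], j))
    }
    # Self-kinship depends on how related the two parents are to each other.
    parents_kin <- if (is.na(mom[i]) || is.na(dad[i])) 0 else kin[mom[i], dad[i]]
    kin[i, i] <- 0.5 * (1 + parents_kin)
  }
  2 * kin  # additive relatedness is twice the kinship coefficient
}

r <- additive_relatedness(ped)
r["3", "4"]  # full siblings
r["3", "1"]  # child and parent
r["3", "5"]  # no connecting path
```

In this toy example, full siblings and parent-child pairs each share half of their additive nuclear DNA on average, while person 5 has no connecting path to anyone else and so has a coefficient of zero.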
Many pedigree datasets only contain information on the person, their mother, and their father, often without nuclear or extended family IDs. Recognizing which sets of people are unrelated simplifies many pedigree-related tasks. The ped2fam function facilitates those tasks by finding all the extended families. People within the same extended family are connected by at least one path of relationships, however distant, while people in different extended families share no recorded relationship.
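The idea of grouping people who are connected by any path can be sketched in base R as a connected-components search over parent-child links. This is only an illustration of the concept under stated assumptions, not how ped2fam is implemented; the toy data and the helper name find_families are hypothetical.

```r
# Toy pedigree (hypothetical data): two unrelated nuclear families.
ped <- data.frame(
  personID = c(1, 2, 3, 4, 5, 6, 7),
  momID    = c(NA, NA, 2, 2, NA, NA, 6),
  dadID    = c(NA, NA, 1, 1, NA, NA, 5)
)

# Hypothetical helper: label connected components using only
# personID, momID, and dadID (a simple union-find).
find_families <- function(ped) {
  ids    <- as.character(ped$personID)
  parent <- setNames(ids, ids)  # each person starts as their own group root

  find_root <- function(x) {
    while (parent[[x]] != x) x <- parent[[x]]
    x
  }

  for (i in seq_len(nrow(ped))) {
    for (p in c(ped$momID[i], ped$dadID[i])) {
      # Skip unknown parents and parents who are not rows in the data.
      if (is.na(p) || !(as.character(p) %in% ids)) next
      a <- find_root(ids[i])
      b <- find_root(as.character(p))
      if (a != b) parent[[b]] <- a  # merge the two groups
    }
  }

  roots <- vapply(ids, find_root, character(1))
  ped$famID <- as.integer(factor(roots, levels = unique(roots)))
  ped
}

fam <- find_families(ped)
table(fam$famID)
```

Persons 1 through 4 end up in one family and persons 5 through 7 in another, because no parent-child link joins the two groups.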
We will use the potter pedigree data as an example. For convenience, we’ve renamed the family ID variable to oldfam to avoid confusion with the new family ID variable we will create.
df_potter <- potter
names(df_potter)[names(df_potter) == "famID"] <- "oldfam"
ds <- ped2fam(df_potter, famID = "famID", personID = "personID")
table(ds$famID, ds$oldfam)
#>    
#>      1
#>   1 36
Because the potter data already had a family ID variable, we compare our newly created variable to the pre-existing one. They match!
Subsetting a pedigree allows researchers to focus on specific family lines or individuals within a larger dataset. This can be particularly useful for data validation as well as for simplifying complex pedigrees for visualization. However, subsetting a pedigree can result in the underestimation of relatedness between individuals, because the subsetted pedigree may not contain all the individuals who connect two people. For example, if we were to remove Arthur Weasley (person 9) and Molly Prewett (person 10) from the potter dataset, we would lose the connections among their children.
In the plot above, we have removed Arthur Weasley (person 9) and Molly Prewett (person 10) from the potter dataset. As a result, the connections between their children are lost.
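The loss of connections can be illustrated with a small base R sketch. It assumes, as described here, that a link is only usable when the parent is still a row in the data; the toy pedigree and the helper name links_present are hypothetical and stand in for the potter data.

```r
# Toy pedigree (hypothetical data): parents 1 and 2 with children 3 and 4.
ped <- data.frame(
  personID = 1:4,
  momID    = c(NA, NA, 2, 2),
  dadID    = c(NA, NA, 1, 1)
)

# Hypothetical helper: parent-child links whose parent is still in the data.
links_present <- function(ped) {
  long <- rbind(
    data.frame(child = ped$personID, parent = ped$momID),
    data.frame(child = ped$personID, parent = ped$dadID)
  )
  long[!is.na(long$parent) & long$parent %in% ped$personID, ]
}

n_full <- nrow(links_present(ped))      # links in the full pedigree

# Remove both parents, as in the Weasley example.
ped_sub <- ped[!(ped$personID %in% c(1, 2)), ]
n_sub <- nrow(links_present(ped_sub))   # links left after subsetting

c(full = n_full, subset = n_sub)
```

The two children still carry their momID and dadID values after subsetting, but those IDs no longer point to anyone in the data, so nothing ties the siblings together.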
Similarly, if we remove the children of Vernon Dursley (1) and Petunia Evans (3) from the potter dataset, we would lose the connection between Vernon and Petunia themselves.
However, this subset does not plot the relationship between spouses (such as the marriage between Vernon Dursley and Petunia Evans), as there are no children left in the data to connect the two individuals.