---
title: "Proportions"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Proportions}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
# knitr::opts_chunk$set(
# collapse = TRUE,
# comment = "#>"
# )
```
```{r , echo=FALSE}
library(ILSAstats)
```
We can estimate the proportions of the categories of any variable using the function `repprop()`. As with any other "rep" function of `ILSAstats`, we need to specify the data (`df`), the total weights (`wt`), the replicate weights (`repwt`), and the method (`method`).
Besides these basic options, other arguments can be used:
- `x`: a string with the name of the variable (or variables) to be used in the analysis. If multiple variables are specified, they will be treated as plausible values.
- `group`: a string containing the name of the variable that contains the groups of countries. If used, all statistics will be estimated separately for each group.
- `exclude`: a string with the names of the groups that should be excluded from aggregate estimations.
- `aggregates`: a string containing the aggregate statistics that should be estimated. Options include `"pooled"`, which also estimates all groups (leaving out any excluded groups) as a single group, and `"composite"`, which averages the estimates of the individual groups (again leaving out any excluded groups).
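To make the difference between the two aggregates concrete, here is a minimal base R sketch. It is not the internal code of `ILSAstats`: the data frame `toy` and the helper `wprop()` are hypothetical, and the composite is assumed to be a simple unweighted average of the group estimates.

```{r}
# Hypothetical toy data: three groups of different sizes (not part of ILSAstats)
toy <- data.frame(
  grp = c("GR1", "GR1", "GR2", "GR2", "GR2", "GR2", "GR3", "GR3"),
  x   = c(1, 0, 1, 1, 1, 0, 0, 0),  # 1 = category of interest
  w   = c(1, 1, 2, 2, 2, 2, 1, 1)   # total weights
)

# Weighted proportion of x == 1 in a data frame (hypothetical helper)
wprop <- function(d) sum(d$w * (d$x == 1)) / sum(d$w)

# One estimate per group
by_group <- sapply(split(toy, toy$grp), wprop)

# "pooled": all groups treated as one single group
pooled <- wprop(toy)

# "composite": average of the group estimates
composite <- mean(by_group)

by_group
c(pooled = pooled, composite = composite)
```

Because the groups differ in their total weight, the pooled and the composite values do not coincide: the pooled estimate is dominated by the heaviest group, while the composite gives every group the same importance.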
## Weights and setup
Before using `repprop()`, we first need to create the replicate weights. Here we use the included `repdata` data and the `"LANA"` method:
```{r}
RW <- repcreate(df = repdata,
                wt = "wt",
                jkzone = "jkzones",
                jkrep = "jkrep",
                method = "LANA")
```
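For intuition about what replicate weights are, the following base R sketch builds them with a generic paired-jackknife (JK2-style) rule. It is only an illustration: the toy data are hypothetical, and the `"LANA"` method used by `repcreate()` may differ in its details.

```{r}
# Hypothetical toy data: two jackknife zones, each with two sampled units
toy_jk <- data.frame(
  wt      = c(10, 12, 8, 9),
  jkzones = c(1, 1, 2, 2),
  jkrep   = c(1, 0, 1, 0)
)

# One replicate per zone: inside that zone, units with jkrep == 1 get
# double weight and units with jkrep == 0 get zero; other zones are unchanged
toy_zones <- sort(unique(toy_jk$jkzones))
toy_rw <- sapply(toy_zones, function(z) {
  ifelse(toy_jk$jkzones == z, toy_jk$wt * 2 * toy_jk$jkrep, toy_jk$wt)
})
colnames(toy_rw) <- paste0("RW", toy_zones)
toy_rw
```

Each column is a full set of weights in which one zone has been perturbed, so re-estimating a statistic with every column shows how much it varies across replicates.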
To make it easier to specify some arguments, it is advisable to also create a `"repsetup"` object. We will create three setups for this example: one without groups, one with groups and without exclusions, and one with groups and exclusions (excluding group 2):
```{r}
# No groups
STNG <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA")
# With groups
STGR <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA",
                 group = "GROUP")
# With groups and exclusions
STGE <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA",
                 group = "GROUP", exclude = "GR2")
```
## Single variable
For example, if we want to estimate the proportions of each category of the variable `"GENDER"`, we can use any of the three setups to get the overall or the group results:
```{r}
# No groups
repprop(x = "GENDER", setup = STNG)
# With groups
repprop(x = "GENDER", setup = STGR)
# With groups and exclusions
repprop(x = "GENDER", setup = STGE)
```
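Conceptually, each proportion is a weighted share that is computed once with the total weights and once with every set of replicate weights; the spread of the replicate estimates then gives the standard error. The base R sketch below uses hypothetical data and a hypothetical helper `wprops()`, and assumes a JK2-style variance (the sum of squared deviations with no extra factor), which may not match every `method`:

```{r}
# Hypothetical data: a binary variable, total weights, and two replicate weights
toy_x    <- c("F", "M", "F", "M")
toy_wt   <- c(10, 12, 8, 9)
toy_reps <- cbind(RW1 = c(20, 0, 8, 9),
                  RW2 = c(10, 12, 16, 0))

# Weighted proportion of each category for a given weight vector
wprops <- function(x, w) sapply(split(w, x), sum) / sum(w)

# Point estimates with the total weights
toy_est <- wprops(toy_x, toy_wt)

# Same estimates with each replicate weight (one column per replicate)
toy_rep_est <- apply(toy_reps, 2, function(w) wprops(toy_x, w))

# JK2-style standard error: root of the summed squared deviations
toy_se <- sqrt(rowSums((toy_rep_est - toy_est)^2))
cbind(prop = toy_est, se = toy_se)
```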
Notice that the estimates obtained without groups are the same as the pooled estimates obtained with groups and no exclusions. However, when we exclude group 2, both the pooled and the composite estimates change.
## Proportion table
By default, `repprop()` will provide a list where each element corresponds to the statistics of each category. Nevertheless, we can summarize these results into a single data frame using `repprop.table()`:
```{r}
p1 <- repprop(x = "GENDER", setup = STGR)
repprop.table(x = p1)
```
By default, `repprop.table()` will provide a long table, but we can also get wide tables where groups are separated by columns or by rows, and where the proportion estimates are either combined with the standard errors or kept separate from them:
```{r}
# Groups by rows, separate SE
repprop.table(x = p1, type = "wide1")
# Groups by rows, non-separate SE
repprop.table(x = p1, type = "wide1", separateSE = FALSE)
# Groups by columns, separate SE
repprop.table(x = p1, type = "wide2")
# Groups by columns, non-separate SE
repprop.table(x = p1, type = "wide2", separateSE = FALSE)
```
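The difference between the long and the wide layouts can be illustrated with base R alone. The data frame `toy_long` and its column names are hypothetical and do not reproduce the actual output of `repprop.table()`:

```{r}
# Hypothetical long table: one row per group and category
toy_long <- data.frame(
  group    = c("GR1", "GR1", "GR2", "GR2"),
  category = c("F", "M", "F", "M"),
  prop     = c(0.48, 0.52, 0.51, 0.49)
)

# Groups by rows, categories by columns
toy_wide_rows <- xtabs(prop ~ group + category, data = toy_long)

# Groups by columns: transpose the previous table
toy_wide_cols <- t(toy_wide_rows)

toy_wide_rows
toy_wide_cols
```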
## Plausible values
When working with plausible values, we need to specify the names of all plausible values of a construct, so that all estimates are combined. For example, for estimating the proportions of each proficiency level in math we would use:
```{r}
# No groups
repprop(x = paste0("CatMath", 1:5), setup = STNG) |>
  repprop.table(type = "wide2")
# With groups
repprop(x = paste0("CatMath", 1:5), setup = STGR) |>
  repprop.table(type = "wide2")
```
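A common way to combine plausible values in large-scale assessments is Rubin's rules: average the estimates, and add the average sampling variance to the between-PV variance inflated by `(1 + 1/M)`. The sketch below uses made-up numbers and reflects this general approach under that assumption; it is not a transcription of the `ILSAstats` internals, which may, for instance, treat the sampling variance differently.

```{r}
# Hypothetical proportion of one proficiency level, estimated with each of 5 PVs
pv_est <- c(0.21, 0.23, 0.22, 0.20, 0.24)
# Hypothetical sampling standard errors (e.g., obtained from replicate weights)
pv_se  <- c(0.015, 0.016, 0.015, 0.014, 0.016)

n_pv <- length(pv_est)

pv_combined_est <- mean(pv_est)   # final estimate: average over PVs
pv_sampling_var <- mean(pv_se^2)  # average sampling variance
pv_between_var  <- var(pv_est)    # variance between PVs

# Total standard error under Rubin's rules
pv_combined_se <- sqrt(pv_sampling_var + (1 + 1 / n_pv) * pv_between_var)

c(estimate = pv_combined_est, se = pv_combined_se)
```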
## Aggregates
When using groups, both the pooled and the composite estimates are calculated by default, but we can omit either or both of them through the `aggregates` argument:
```{r}
# Default
repprop(x = paste0("CatMath", 1:5), setup = STGR) |>
  repprop.table(type = "wide2", separateSE = FALSE)
# Only pooled
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = "pooled") |>
  repprop.table(type = "wide2", separateSE = FALSE)
# Only composite
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = "composite") |>
  repprop.table(type = "wide2", separateSE = FALSE)
# No aggregates
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = NULL) |>
  repprop.table(type = "wide2", separateSE = FALSE)
```