---
title: "Proportions"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Proportions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
# knitr::opts_chunk$set(
#   collapse = TRUE,
#   comment = "#>"
# )
```

```{r , echo=FALSE}
library(ILSAstats)
```

We can estimate the proportions of the categories of any variable using the function `repprop()`. As with any other "rep" function of `ILSAstats`, we need to specify the data (`df`), the total weights (`wt`), the replicate weights (`repwt`), and the method (`method`). Besides these basic options, other arguments can be used:

- `x`: a string with the name of the variable (or variables) to be used in the analysis. If multiple variables are specified, they will be treated as plausible values.
- `group`: a string containing the name of the variable that contains the groups of countries. If used, all statistics will be estimated separately for each group.
- `exclude`: a string containing which groups should be excluded from aggregate estimations.
- `aggregates`: a string containing the aggregate statistics that should be estimated. Options include: `"pooled"` for also estimating all groups (without exclusions) as a single group; and `"composite"` for averaging all the estimations for each single group (without exclusions).

## Weights and setup

For `repprop()`, we first need to create the replicate weights. Using the included `repdata` data and the `"LANA"` method:

```{r}
RW <- repcreate(df = repdata,
                wt = "wt",
                jkzone = "jkzones",
                jkrep = "jkrep",
                method = "LANA")
```

To make it easier to specify some arguments, it is advisable to also create a `"repsetup"` object.
We will create three setups for this example: one without groups, one with groups and without exclusions, and one with groups and exclusions (excluding group 2):

```{r}
# No groups
STNG <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA")

# With groups
STGR <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA",
                 group = "GROUP")

# With groups and exclusions
STGE <- repsetup(repwt = RW, wt = "wt", df = repdata, method = "LANA",
                 group = "GROUP", exclude = "GR2")
```

## Single variable

For example, if we want to estimate the proportions of each category of variable `"GENDER"`, we can use any of the setups to get the overall or group results:

```{r}
# No groups
repprop(x = "GENDER", setup = STNG)

# With groups
repprop(x = "GENDER", setup = STGR)

# With groups and exclusions
repprop(x = "GENDER", setup = STGE)
```

Notice that the estimates obtained without groups match the pooled estimates obtained with groups and no exclusions. When we exclude group 2, however, both the pooled and the composite estimates change.

## Proportion table

By default, `repprop()` will provide a list where each element corresponds to the statistics of each category.
Nevertheless, we can summarize these results into a single data frame using `repprop.table()`:

```{r}
p1 <- repprop(x = "GENDER", setup = STGR)

repprop.table(x = p1)
```

By default, `repprop.table()` will provide a long table, but we can also get wide tables where groups are separated by rows or by columns, with the proportion estimates either separated from or combined with the standard errors:

```{r}
# Groups by rows, separate SE
repprop.table(x = p1, type = "wide1")

# Groups by rows, non-separate SE
repprop.table(x = p1, type = "wide1", separateSE = FALSE)

# Groups by columns, separate SE
repprop.table(x = p1, type = "wide2")

# Groups by columns, non-separate SE
repprop.table(x = p1, type = "wide2", separateSE = FALSE)
```

## Plausible values

When working with plausible values, we need to specify the names of all plausible values of a construct, so that all estimates will be combined. For example, for estimating the proportions of each proficiency level in math we would use:

```{r}
# No groups
repprop(x = paste0("CatMath", 1:5), setup = STNG) |>
  repprop.table(type = "wide2")

# With groups
repprop(x = paste0("CatMath", 1:5), setup = STGR) |>
  repprop.table(type = "wide2")
```

## Aggregates

When using groups, we can omit the pooled and composite calculations if needed. By default, both estimates are calculated.

```{r}
# Default
repprop(x = paste0("CatMath", 1:5), setup = STGR) |>
  repprop.table(type = "wide2", separateSE = FALSE)

# Only pooled
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = "pooled") |>
  repprop.table(type = "wide2", separateSE = FALSE)

# Only composite
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = "composite") |>
  repprop.table(type = "wide2", separateSE = FALSE)

# No aggregates
repprop(x = paste0("CatMath", 1:5), setup = STGR, aggregates = NULL) |>
  repprop.table(type = "wide2", separateSE = FALSE)
```
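
To make the difference between the two aggregates more concrete, the following is a minimal base R sketch of the idea only. It does not call `ILSAstats`, and it is not meant to reproduce how the package computes its estimates or standard errors. The `toy` data frame and the `wprop()` helper are hypothetical and exist only for illustration: a pooled estimate treats all groups as one single group, whereas a composite estimate averages the separate group estimates.

```{r}
# Hypothetical toy data (not part of ILSAstats): two groups of unequal weight
toy <- data.frame(
  GROUP  = c("GR1", "GR1", "GR1", "GR1", "GR2", "GR2"),
  GENDER = c(1, 1, 1, 2, 2, 2),
  wt     = c(2, 2, 2, 2, 1, 1)
)

# Hypothetical helper: weighted proportion of category 1 in a data frame
wprop <- function(d) sum(d$wt[d$GENDER == 1]) / sum(d$wt)

# Pooled: all groups treated as one single group
pooled <- wprop(toy)

# Composite: simple average of the separate group estimates
by_group  <- sapply(split(toy, toy$GROUP), wprop)
composite <- mean(by_group)

pooled
composite
```

In this toy example the two aggregates differ because the groups carry different total weights: the heavier group dominates the pooled estimate, while each group counts equally in the composite estimate.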