Skip to contents

Score and Order Data

Usage

do_ordering(
  n_trials,
  id_list,
  df_list,
  n_replicates,
  verbose = FALSE,
  sort = TRUE
)

Arguments

n_trials

The number of rows an individual sample will have.

id_list

The list of unique individual or sample names

df_list

The list of data frames per unique individual

n_replicates

The number of replicates in the study.

verbose

A boolean parameter the defaults to FALSE. Determines whether messages are printed.

sort

A boolean parameter that defaults to TRUE. If TRUE, sorts the returned data frame by score. If FALSE, returns the data in the individual order it was provided in

Value

Returns a data frame of the results, in the following form:

     - Column 1: "individual" - the unique identifier of an individual or sample
     - Column 2: "n_crossings" - the calculated number of crossings.
     - Column 3: "max_variance" - the maximum of the variances of the replicate measurements at a single time for the individual or sample.
     - Column 4: "ave_variance" - the average of the variances of the replicate measurements at a single time for the individual or sample.
     - Column 5: "base_score" - the original, unnormalized profile repeatability score. Smaller numbers rank higher.
     - Column 6: "final_score" - the base score, normalized by the sigmoid function. Constrained to be between 0 and 1. Scores closer to 1 rank higher.
     - Column 7: "rank" - the calculated ranking of the individual or sample, against all other individuals or samples in the data set.

Details

Performs the ordering of input data by scoring each individual data frame.

The main function of the package, this will send each individuals data out for scoring. Then, when all scores are computed, it will order the result data frame by score and assign a rank.

Ranks are assigned with ties allowed - if N individuals have a tie, their rank is averaged. For example, if the max score is 1, and two individuals have that score, their rank is 1.5

Examples

df <- data.frame(
    col_a = c('A', 'A', 'B', 'B'),
    col_b = c(5, 15, 5, 15),
    col_c = c(5, 10, 1, 2),
    col_d = c(10, 15, 3, 4)
  )
id_list <- unique(df[, 1])
individuals <- list()
for (i in 1:length(id_list)) {
  individuals[[i]] <- df[df[, 1] == id_list[i], ]
}
ret_df <- do_ordering(n_trials=2, id_list=id_list, df_list=individuals, n_replicates=2)
print(ret_df)
#>   individual n_crossings max_variance ave_variance base_score final_score rank
#> 1          B           0          2.0          2.0       4.00      0.9930    1
#> 2          A           0         12.5         12.5      25.02      0.9914    2