Perform Profile Repeatability

Usage

profrep(df, n_timepoints, sort = TRUE, verbose = FALSE)

Arguments

df

The input data frame, with a minimum shape of 3 rows by 4 columns. It can be read in from a CSV file or be a data frame already stored in memory. The data frame is assumed to have the following structure:

  • Column 1: the unique identifier of an individual animal or sample.

  • Column 2: the time of the sample.

  • Columns 3 to N: the replicate data.

Row 1 is assumed to contain the header strings for each column.
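As a rough illustration of the layout described above, the sketch below reads a small CSV (inline text standing in for a file) into a data frame and checks the assumed shape. The column names id, time, A and B and all values are made up for illustration, and the checks are not part of profrep itself.

```r
# Inline CSV text stands in for a file on disk; all values are illustrative.
csv_text <- "id,time,A,B
1,15,1,2
1,30,3,4
2,15,5,6
2,30,7,8"

# The first line of the text is the header row (Row 1 in the description).
df <- read.csv(text = csv_text, header = TRUE)

# Assumed reading of "3 rows by 4 columns": a header plus at least 2 data rows,
# and id, time, and at least 2 replicate columns.
has_min_shape <- nrow(df) >= 2 && ncol(df) >= 4
print(has_min_shape)
```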

n_timepoints

The number of rows (time points) that each individual or sample has in the data frame. For example, if replicates A and B were collected for individual 1 at times 15 and 30, n_timepoints would be 2 and the data frame would look like:

            | id | time | A | B |
            |:--:|:----:|:-:|:-:|
            | 1  | 15   | 1 | 2 |
            | 1  | 30   | 3 | 4 |
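Before calling the function, it may help to confirm that every individual really has n_timepoints rows. The following is a minimal sketch in base R, assuming the column layout described for df; the data values are invented and the check is not something profrep is documented to perform.

```r
# Two individuals, two time points each, two replicate columns (illustrative data).
df <- data.frame(
  id   = c(1, 1, 2, 2),
  time = c(15, 30, 15, 30),
  A    = c(1, 3, 5, 7),
  B    = c(2, 4, 6, 8)
)
n_timepoints <- 2

# Count the rows belonging to each identifier in column 1.
rows_per_individual <- table(df[[1]])

# TRUE only if every individual has exactly n_timepoints rows.
consistent <- all(rows_per_individual == n_timepoints)
print(consistent)
#> [1] TRUE
```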

sort

A boolean parameter that defaults to TRUE. If TRUE, the returned data frame is sorted by score. If FALSE, the individuals are returned in the order in which they were provided.

verbose

A boolean parameter that defaults to FALSE. If TRUE, messages are printed; if FALSE, they are suppressed.

Value

Returns a data frame of the results, in the following form:

  • Column 1: "individual" - the unique identifier of an individual or sample.

  • Column 2: "n_crossings" - the calculated number of crossings.

  • Column 3: "max_variance" - the maximum, across time points, of the variance of the replicate measurements at each time point for the individual or sample.

  • Column 4: "ave_variance" - the average, across time points, of the variance of the replicate measurements at each time point for the individual or sample.

  • Column 5: "base_score" - the original, unnormalized profile repeatability score. Smaller numbers rank higher.

  • Column 6: "final_score" - the base score, normalized by the sigmoid function. Constrained to be between 0 and 1. Scores closer to 1 rank higher.

  • Column 7: "rank" - the calculated ranking of the individual or sample, against all other individuals or samples in the data set.
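To make the max_variance and ave_variance definitions concrete, the sketch below computes both for a single made-up individual in base R. It assumes the sample variance (R's var(), with an n - 1 denominator); whether the package uses the same estimator is not stated here, so treat this as an illustration of the definitions rather than the package's internals.

```r
# One individual, two time points, two replicate columns (illustrative values).
one_individual <- data.frame(
  id   = c(1, 1),
  time = c(15, 30),
  A    = c(1, 3),
  B    = c(2, 7)
)

# Keep only the replicate columns (columns 3 to N).
replicates <- as.matrix(one_individual[, -(1:2)])

# Variance of the replicates at each time point (one value per row).
# ASSUMPTION: sample variance via var().
var_by_time <- apply(replicates, 1, var)

max_variance <- max(var_by_time)   # largest per-time variance
ave_variance <- mean(var_by_time)  # mean of the per-time variances
print(c(max_variance = max_variance, ave_variance = ave_variance))
```

Here the variances at times 15 and 30 are 0.5 and 8, so max_variance is 8 and ave_variance is 4.25.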

Details

Calculates the profile repeatability measure of the input data according to the method in Reed et al., 2019, J. Gen. Comp. Endocrinol. (270).

See also

do_ordering for the main data processing function.

calculate_crossovers for how the number of crossings is calculated.

score_individual_df for how the score is calculated for an individual or sample.

clean_data for how missing replicate values are handled.

Examples

test_data <- profrep::example_two_point_data
results <- profrep::profrep(df=test_data, n_timepoints=2)
print(results)
#>   individual n_crossings max_variance ave_variance base_score final_score rank
#> 1          6         117        83.64        53.26     187.09      0.9581    1
#> 2          9         150        97.23        57.33     232.34      0.9356    2
#> 3          7         160       151.26        96.69     368.95      0.7876    3
#> 4          1         149       182.41       100.01     428.34      0.6719    4
#> 5          4         180       197.31       113.02     507.64      0.4809    5
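The final_score values printed above appear consistent with a decreasing logistic (sigmoid) curve applied to base_score. The sketch below is fitted to the five printed rows only: the midpoint of 500 and the scale of 100 are values inferred from that output, not constants taken from the package source, and the helper name sigmoid_score is hypothetical.

```r
# Hypothetical helper: a decreasing logistic curve, so that smaller base scores
# map closer to 1. midpoint and scale are ASSUMPTIONS inferred from the
# example output, not documented package constants.
sigmoid_score <- function(base_score, midpoint = 500, scale = 100) {
  1 / (1 + exp((base_score - midpoint) / scale))
}

# base_score values copied from the printed example output.
base_score <- c(187.09, 232.34, 368.95, 428.34, 507.64)
final_score <- round(sigmoid_score(base_score), 4)
print(final_score)
#> [1] 0.9581 0.9356 0.7876 0.6719 0.4809
```

Under these assumed constants, a base_score of 500 would map to a final_score of 0.5, with lower base scores approaching 1.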