                     Changes in version 2020-09-28

New Features

  - tidytable variants of functions, i.e. reshape_wide_tt(),
    renumber_time_id_tt(), pat_status_tt(), vital_status_tt(),
    calc_futime_tt() ⇒ the _tt variants usually have smaller memory use
    than tidyverse and data.table variants. Execution time is usually
    much faster than tidyverse and comparable to or a little slower than
    the data.table variant.
  - sir_byfutime():
      - is much faster using tidytable package
      - gained the option race_var to optionally stratify SIR
        calculations by race.
  - summarize_sir_results():
      - new function that increases functionality in summarizing
        results from sir_byfutime() function
      - new option to define custom site_var_name
  - new package website https://marianschmidt.github.io/msSPChelpR
  - new sample datasets included in the package to demonstrate examples
    (#36)

Breaking Changes

  - sir_byfutime():
      - options add_total_row and add_total_fu are replaced by
        calc_total_row and calc_total_fu. These are logical parameters
        now. The positioning of total rows and columns is now handled
        entirely by the summarize_sir_results() function. There, total
        rows can be placed at the top or bottom and total columns on
        the left or right.
      - option expcount_src and the related parameters stdpop_df,
        refpop_df, std_pop, truncate_std_pop and pyar_var have been
        removed. sir_byfutime() now only calculates expected counts
        based on reference rates, not within the cohort of the dataset.
        To calculate expected counts based on the cohort, a new
        function create_refrates will be added in the future. (#41)
      - option collapse_ci has been removed and added to
        summarize_sir_results() instead.
      - option name for tumor site variable changed from icdcat_var to
        site_var
      - option name for age/age group variable changed from agegroup_var
        to age_var
      - in total the parameters expcount_src, futime_src, stdpop_df,
        refpop_df, std_pop, truncate_std_pop, pyar_var, icdcat_var,
        collapse_ci have been removed to simplify the function ⇒ make
        sure you remove these arguments from your sir_byfutime()
        function calls.
  - sir():
      - is superseded by sir_byfutime(). To migrate your former sir()
        calls, you can simply use sir_byfutime(..., futime_breaks =
        "none"), which will yield the same results.
  - summarize_sir_results():
      - option name for tumor site variable changed from
        summarize_icdcat to summarize_site
  - reshape_long_tidyr():
      - option var_selection is deprecated. Please select variables
        before running the reshape_long_* functions.
  - asir():
      - option name for age/age group variable changed from agegroup_var
        to age_var
      - option name for tumor site variable changed from icdcat_var to
        site_var
  - pat_status(), pat_status_tt(), vital_status(), and
    vital_status_tt():
      - Capitalized default variable labelling.
      - This might break code that relied on using the labels coming out
        of these functions in later filter or mutate functions.
  - ir_crosstab_byfutime():
      - option futime_breaks now uses breaks in years instead of months
        as previously.
      - default futime_var is now follow-up time in years
  - now requires dplyr version 1.0.0
  - now requires tidytable package
  - the default option name for tumor site variable changed from
    icdcat_var to site_var. This requires a manual update of calls to
    sir_byfutime() and asir() if the option is specified.
  - the default variable name for tumor site in all functions has been
    changed from t_icdcat to t_site. So the reference data frames used
    will need to have a t_site column.
  - the data.table variants of functions (renumber_time_id_dt(),
    pat_status_dt(), reshape_long_dt(), reshape_wide_dt(),
    vital_status_dt()) have been removed for simplicity. Please use the
    tidytable variants, i.e. reshape_wide_tt(), renumber_time_id_tt(),
    pat_status_tt(), vital_status_tt(), calc_futime_tt(), instead. They
    will give the same data.table output and same performance.
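
For the ir_crosstab_byfutime() change listed above, break points that were previously given in months can be converted to years by dividing by 12. The sketch below uses base R only. The vector names and the month values are hypothetical, and it assumes that futime_breaks is supplied as a numeric vector of break points.

```r
# Hypothetical break points that an older call expressed in months
futime_breaks_months <- c(0, 6, 12, 60, 120, Inf)

# The same break points expressed in years, the unit now expected
# (values: 0, 0.5, 1, 5, 10, Inf)
futime_breaks_years <- futime_breaks_months / 12

print(futime_breaks_years)
```

The converted vector can then be passed as futime_breaks, provided the follow-up time variable used in the call is also measured in years.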

Bug Fixes

  - implement new reliable routine to split df when reshape_wide() with
    option chunks is used. Closes #1.
  - Sorting of columns in wide datasets by reshape_wide_tidyr() and
    reshape_wide_tt() is now preserved. Closes #31.
  - ensure sorting in renumber_time_id() and make sure that
    new_time_id_var is returned as integer.
  - fix bug in pat_status_*(., check = TRUE) option
  - improve internal tests in sir_byfutime() so that PYARs do not get
    lost before running summary function
  - sir_byfutime() now also gives correct results if range of
    futime_breaks is not 0-Inf but smaller

                     Changes in version 2020-05-21                      

New Features

  - add timevar_max option to renumber_time_id() function; use sorting
    by date of diagnosis instead of old time_id_var
  - various improvements to reshape_wide_tidyr() function
  - various improvements to reshape_wide_dt() function which is much
    faster now and uses data.table::dcast instead of stats::reshape now
  - various improvements to pat_status() and pat_status_dt() functions
  - option summarize_icdcat in summarize_sir_results() is now functional
  - update vignette vignette("introduction")

Bug Fixes

  - fix incomplete check for required variables in pat_status() and
    pat_status_dt() functions
  - fix error in check for required variables in renumber_time_id() that
    broke functions
  - fix bug in check for end of FU time in pat_status() and
    calc_futime()
  - implement new tidyselect routine using tidyselect::all_of in
    summarize_sir_results()

                     Changes in version 0.9.1.9000                      

                 Changes in version 0.9.1 (2024-01-23)                  

New Features

  - new function histgroup_iarc() to create variable for groups of
    malignant neoplasms considered to be histologically 'different' for
    the purpose of defining multiple tumors, ICD-O-3 (see #100)
  - some functions gain new quiet argument to suppress rlang::warn() and
    rlang::inform() messages. You can use this when you have checked
    your results for correctness and want to reduce message output, but
    keep the progress bars.
  - asir(): add World Standard Population 2000-2025, available with
    option std_pop = "WHO2000", as described here:
    https://seer.cancer.gov/stdpopulations/world.who.html
  - sir_byfutime() gains new argument expect_missing_refstrata_df. You
    can define another dataframe that contains strata expected to be
    missing from refrates_df (because they are not explicitly coded with
    incidence = 0). This can be helpful, if refrates_df has a lot of
    strata and 0 incidence strata have been removed to save storage
    space. Internally, the rows of expect_missing_refstrata_df will be
    appended to refrates_df. This reduces the number of lines reported
    in attribute problems_missing_ref_strata. Default setting is
    expect_missing_refstrata_df = NULL.
  - sample data set for data("us_second_cancer") gains new variable
    t_hist on histology, i.e. ICD-O-3-Code on tumor morphology (4
    digits)

Breaking Changes

  - no breaking changes in this version

Bug Fixes

  - make calc_refrates() more robust for missing race_var (Closes #89)
  - fix bug in calc_refrates() using calc_totals == TRUE (Closes #90)
  - fix bug in calc_refrates() using numeric versions of fill_sites
    (Closes #92)
  - fix bug in asir() that throws error for variable not needed (Closes
    #95)

Internal

  - replace progress bars by cli
  - deprecate verb.() syntax from tidytable (Closes #94)

                 Changes in version 0.9.0 (2022-06-10)                  

New Features

  - new function calc_refrates() to calculate age-, sex-, region-,
    year-specific reference rates from a long format dataframe with
    cancer cases that are counted for incident cases and then matched
    with a reference population. The resulting reference rates dataframe
    can directly be used with sir_byfutime() function.
  - functions gain new default dattype = NULL and thus are more flexible
    to take other source data types (Closes #73)

Breaking Changes

  - functions asir, calc_futime*, calc_refrates, ir_crosstab_byfutime,
    pat_status*, renumber_time_id*, and sir_byfutime are now set to
    dattype = NULL by default. If you relied on the automatic variable
    naming feature, you need to add dattype = "seer" or dattype = "zfkd"
    to your function call.
  - fix typo in attribute names: attributes are now correctly named
    problems_missing_count_strata and problems_missing_fu_strata (Closes
    #80)

Bug Fixes

  - sir_byfutime():
      - attributes with notes and problems are now correctly saved to
        results_df

Internal

  - deprecated functions from tidytable package have been replaced
    (Closes #71 and #74)

                 Changes in version 0.8.7 (2021-07-01)                  

New Features

  - new function sir_ratio() and related sir_ratio_lci() and
    sir_ratio_uci() to calculate ratio of two SIRs/SMRs to get relative
    risk and confidence limits for this ratio.
  - tidytable variant of reshape_long function, i.e. reshape_long_tt() ⇒
    the _tt variants usually have smaller memory use than tidyverse and
    data.table variants. Execution time is usually much faster than
    tidyverse and comparable to or a little slower than the data.table
    variant.
  - summarize_sir_results():
      - add ability to summarize by different site_var than the one used
        in sir_byfutime()

Bug Fixes

  - summarize_sir_results():
      - PYARs are now correctly calculated when using summarize_site ==
        TRUE. Previously the results incorrectly counted each site
        multiple times. (Closes #62)
  - pat_status():
      - update default values for dattype = "zfkd"

Internal

  - add R-CMD-Check to github actions

                 Changes in version 0.8.6 (2020-11-04)                  

New Features

  - new sample data set for standard populations ⇒
    data("standard_population")
  - new sample data set for us population ⇒ data("population_us")
    (Closes #58)

Bug Fixes

  - sir_byfutime(): change output of integer columns to numeric to fix
    bug in summarize_sir_results() (Closes #59)

Other changes

  - add examples to function documentation (Closes #56)
  - remove "R" from package title (Closes #57)
  - update package description (Closes #54)
  - update introduction vignette vignette("introduction")

                        Changes in version 0.8.3                        

New Features

  - new faster version of reshape_long based on data.table
  - start new vignette on workflow from filtered long dataset to
    follow-up times vignette("patstatus_futime")

Bug Fixes

  - implement new tidyselect routine using tidyselect::all_of for
    vector-based variable selection
  - implement correct referencing in vital_status_dt and pat_status_dt
  - add exports from data.table
  - update documentation for sir and sir_byfutime functions
  - make reshape_long function work

                        Changes in version 0.8.2                        

                        Changes in version 0.8.1                        

New Features

  - new faster version of vital_status function using data.table
  - new faster version of pat_status function using data.table

                        Changes in version 0.8.0                        

New Features

  - new faster version of reshape_wide_dt function based on data.table
    and without problematic slices done by reshape_wide
  - new faster version of renumber_time_id function based on data.table

                        Changes in version 0.7.4                        

New Features

  - new function renumber_time_id

                        Changes in version 0.7.3                        

Bug Fixes

  - add check to revert status_var to numeric in case it was created
    with option as_labelled_factor
  - fix label bug in life_var_new

                        Changes in version 0.7.2                        

  - add option as_labelled_factor to vital_status function
  - fix newly introduced error in vital_status function

                        Changes in version 0.7.1                        

  - fix error in vital_status function by replacing
    sjlabelled::get_label function

                        Changes in version 0.7.0                        

  - fix error in pat_status and vital_status functions due to change in
    sjlabelled package

                       Changes in version 0.6.10                        

  - rebuild description file and manual

                        Changes in version 0.6.9                        

  - remove nest_legacy functions and use new tidyr syntax, close #19

                        Changes in version 0.6.8                        

  - make summarize_sir_results function work without break variables

                        Changes in version 0.6.7                        

  - for function sir_byfutime ⇒ make option add_total_row work, even if
    option ybreak_vars = "none"

                        Changes in version 0.6.6                        

  - Make the use of time_id_var and case_id_var consistent across
    reshape functions

                        Changes in version 0.6.5                        

  - Fixed issue in Namespace

                        Changes in version 0.6.4                        

  - Added a NEWS.md file to track changes to the package.

                        Changes in version 0.6.3                        

  - add option futime_breaks = "none" to sir_byfutime function

                        Changes in version 0.6.0                        

  - includes a new function to calculate crude (absolute) incidence
    rates and tabulate them by any number of grouping variables; it can
    be used as a Table 1 for publications ⇒ The function is called
    msSPChelpR::ir_crosstab
  - includes a new function to calculate SIRs (standardized incidence
    ratios) by whatever strata you desire (unlimited ybreak_vars; one
    xbreak_var) and additionally customized breaks for follow-up times
    (default is: up to 6 months, 0.5-1 year, 1-5 years, 5-10 years, >10
    years) ⇒ attention, it only makes sense to stratify results
    (ybreak_vars or xbreak_var) by variables measured at baseline and
    not by variables that are dependent on the occurrence of an SPC ⇒
    function msSPChelpR::sir_byfutime ⇒ depending on the number of
    stratification variables you are using, this function may result in
    a very long results data.frame, so please use it together with the
    new function msSPChelpR::summarize_sir_results
  - includes a new function to summarize results dataframes from SIR
    calculations
  - new reshape functions that are faster and use less memory