smallsets
Visual Documentation for Data Preprocessing
Data practitioners regularly use the 'R' and 'Python' programming languages to prepare data for analyses. Thus, they encode important data preprocessing decisions in 'R' and 'Python' code. The 'smallsets' package subsequently decodes these decisions into a Smallset Timeline, a static, compact visualisation of data preprocessing decisions (Lucchesi et al. (2022) doi:10.1145/3531146.3533175). The visualisation consists of small data snapshots of different preprocessing steps. The 'smallsets' package builds this visualisation from a user's dataset and preprocessing code located in an 'R', 'R Markdown', 'Python', or 'Jupyter Notebook' file. Users simply add structured comments with snapshot instructions to the preprocessing code. One optional feature in 'smallsets' requires installation of the 'Gurobi' optimisation software and 'gurobi' 'R' package, available from https://www.gurobi.com. More information regarding the optional feature and 'gurobi' installation can be found in the 'smallsets' vignette.
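The idea of marking snapshot points with structured comments can be illustrated with a small Python sketch. This is not the package's own parser, and the comment grammar used here (`# smallsets start|snap|end <data name> caption[...]caption`) is an assumption made for illustration; the 'smallsets' vignette is the authority on the actual syntax. The script contents, the `df` name, and the `find_snapshots` helper are all hypothetical.

```python
import re

# Hypothetical preprocessing script. The structured-comment grammar is an
# assumption for illustration, not a statement of the package's syntax.
SCRIPT = """\
# smallsets start df caption[Load the raw data.]caption
df = df[df["C2"] == True]
# smallsets snap df caption[Drop rows where C2 is false.]caption
df["C9"] = df["C3"] + df["C4"]
# smallsets end df caption[Add column C9.]caption
"""

# One structured comment: keyword, snapshot kind, data object name, caption.
PATTERN = re.compile(
    r"^#\s*smallsets\s+(start|snap|end)\s+(\w+)\s+caption\[(.*?)\]caption\s*$"
)


def find_snapshots(script):
    """Return (line_number, kind, data_name, caption) for each structured comment."""
    points = []
    for number, line in enumerate(script.splitlines(), start=1):
        match = PATTERN.match(line.strip())
        if match:
            kind, name, caption = match.groups()
            points.append((number, kind, name, caption))
    return points


if __name__ == "__main__":
    for point in find_snapshots(SCRIPT):
        print(point)
```

Each tuple identifies a point in the script where a data snapshot would be taken, together with the caption that would accompany it in the timeline.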
- Version: 2.0.0
- R version: ≥ 3.5.0
- License: GPL (≥ 3)
- Needs compilation? No
- Lucchesi et al. (2022)
- Last release: 12/05/2023
Team
Lydia R. Lucchesi
Petra M. Kuhnert (thesis advisor)
Jenny L. Davis (thesis advisor)
Lexing Xie (thesis advisor)
Dependencies
- Imports: 10 packages
- Suggests: 1 package