clusterMI
Cluster Analysis with Missing Values by Multiple Imputation
Allows clustering of incomplete observations by addressing missing values using multiple imputation. The methodology consists of three steps, following Audigier and Niang (2022) doi:10.1007/s11634-022-00519-1. I) Missing data imputation using dedicated models. Four multiple imputation methods are proposed: two are based on joint modelling and two are fully sequential methods, as discussed in Audigier et al. (2021) doi:10.48550/arXiv.2106.04424. II) Cluster analysis of the imputed data sets. Six clustering methods are available (distance-based or model-based), and custom methods can also be used. III) Partition pooling. The set of partitions is aggregated using a method based on non-negative matrix factorization. An associated instability measure is computed by bootstrap (see Fang, Y. and Wang, J., 2012, doi:10.1016/j.csda.2011.09.003). Among other applications, this instability measure can be used to choose the number of clusters in the presence of missing values. The package also provides several diagnostic tools, for example to tune the number of imputed data sets, to tune the number of iterations in fully sequential imputation, and to check the fit of the imputation models.
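The three steps described above (impute several times, cluster each imputed data set, pool the partitions) can be illustrated with a small, language-neutral sketch. This is not the package's R interface, and every name in it (`impute_once`, `kmeans2`, `pool`) is hypothetical. It also swaps in much simpler techniques than the package uses: a random hot-deck draw stands in for the joint-modelling and fully sequential imputation models, a two-cluster k-means stands in for the six clustering methods, and a co-association majority rule stands in for the pooling based on non-negative matrix factorization.

```python
import random

def impute_once(data, rng):
    """Step I (stand-in): fill each missing cell (None) with a value drawn
    at random from the observed values of the same column."""
    observed = [[v for v in col if v is not None] for col in zip(*data)]
    return [[v if v is not None else rng.choice(observed[j])
             for j, v in enumerate(row)] for row in data]

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans2(points, n_iter=20):
    """Step II (stand-in): two-cluster Lloyd's algorithm, started from the
    first point and the point farthest from it (deterministic)."""
    centers = [points[0], max(points, key=lambda p: sqdist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [0 if sqdist(p, centers[0]) <= sqdist(p, centers[1]) else 1
                  for p in points]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def pool(partitions):
    """Step III (stand-in): link two observations when they share a cluster
    in more than half of the partitions, then label connected components."""
    n = len(partitions[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            together = sum(p[i] == p[j] for p in partitions)
            if together > len(partitions) / 2:
                parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# Toy data: two well-separated groups, with one missing value in some rows.
data = [
    [0.0, 0.2, 0.1], [0.3, None, 0.0], [0.1, 0.1, None], [None, 0.0, 0.2],
    [10.0, 9.8, 10.1], [9.9, None, 10.0], [10.2, 10.0, None], [None, 10.1, 9.9],
]
rng = random.Random(0)
imputed = [impute_once(data, rng) for _ in range(5)]   # m = 5 imputed data sets
partitions = [kmeans2(d) for d in imputed]             # one partition per data set
consensus = pool(partitions)                           # single pooled partition
print(consensus)  # prints [0, 0, 0, 0, 1, 1, 1, 1]
```

On this toy data the groups are far enough apart that every imputed data set yields the same partition, so pooling is trivial. With noisier data the partitions would disagree across imputations, which is the situation the pooling step and the instability measure are meant to handle.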
- Version: 1.2.2
- R version: ≥ 3.5.0
- License: GPL-2, GPL-3
- Needs compilation? Yes
- clusterMI citation info
- Last release: 10/23/2024
Documentation
Team
Vincent Audigier
Hang Joon Kim
Roles: Contributor
Insights
Last 30 days
This package has been downloaded 411 times in the last 30 days. Yesterday, it was downloaded 18 times.
Last 365 days
This package has been downloaded 6,342 times in the last 365 days. The day with the most downloads was May 19, 2024, with 70 downloads.
Data provided by CRAN
Binaries
Dependencies
- Imports: 19 packages
- Suggests: 8 packages
- Linking To: 2 packages