Persistent Identifier
|
doi:10.18710/DQZKMX |
Publication Date
|
2024-06-24 |
Title
| Replication Data for: Regularized Feature Selection Landscapes: An Empirical Study of Multimodality |
Author
| Sánchez Diaz, Xavier Fernando Cuauhtémoc (NTNU – Norwegian University of Science and Technology) - ORCID: 0000-0003-2271-439X |
Point of Contact
|
Sánchez Diaz, Xavier Fernando Cuauhtémoc (NTNU – Norwegian University of Science and Technology) |
Description
| This dataset contains replication data for the paper "Regularized Feature Selection Landscapes: An Empirical Study of Multimodality". It consists of accuracy tables for well-known classification datasets from the UCI Machine Learning Repository. Each table reports the accuracy obtained by a decision tree classifier for every feature subset (i.e., every combination of columns) at different levels of regularization. The decision tree classifier is provided by the BetaML.jl package for the Julia programming language and uses the default parameters: max_depth=0 (no limit), min_gain=0.0, min_records=2, max_features=0 (consider all features), splitting_criterion=BetaML.Utils.gini (Gini impurity index), and rng=Random.GLOBAL_RNG (no seed set for random number generation). (2024-06-19) |
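As a rough illustration of what "all feature subsets" means in the description above, the following Python sketch enumerates every non-empty combination of column indices for a small number of features. It is not the code used to produce the dataset (that was written in Julia); the function names `all_feature_subsets` and `subset_to_bitstring` are hypothetical, and whether the published tables also include the empty subset or encode subsets as bitstrings is an assumption not confirmed by this record.

```python
from itertools import combinations


def all_feature_subsets(n_features):
    """Yield every non-empty subset of column indices 0..n_features-1.

    Assumption: the empty subset is skipped; the dataset record does not
    say whether the published tables include it.
    """
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            yield subset


def subset_to_bitstring(subset, n_features):
    """Encode a subset as a 0/1 string, one character per column.

    Hypothetical encoding chosen for illustration only.
    """
    chosen = set(subset)
    return "".join("1" if i in chosen else "0" for i in range(n_features))


if __name__ == "__main__":
    n = 3  # toy example; real datasets have more columns
    subsets = list(all_feature_subsets(n))
    print(len(subsets))  # 2**n - 1 non-empty subsets
    for s in subsets:
        print(subset_to_bitstring(s, n), s)
```

The number of subsets grows as 2^n - 1, which is why exhaustive tables of this kind are practical only for datasets with a modest number of features.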
Subject
| Computer and Information Science; Mathematical Sciences |
Keyword
| machine learning
landscape analysis
evolutionary computation
decision tree |
Related Publication
| X. F. C. Sánchez-Díaz, C. Masson, O. J. Mengshoel. (2024). Regularized Feature Selection Landscapes: An Empirical Study of Multimodality. To appear in: Parallel Problem Solving from Nature – PPSN XVIII. PPSN 2024. Lecture Notes in Computer Science, vol XXXXX. Springer, Cham. ISSN: 1611-3349 |
Language
| English |
Producer
| NTNU – Norwegian University of Science and Technology (NTNU) https://www.ntnu.edu/ |
Production Date
| 2024-03-01 |
Production Location
| Trondheim, Norway |
Contributor
| Researcher : Sánchez Díaz, Xavier Fernando Cuauhtémoc |
Distributor
| NTNU – Norwegian University of Science and Technology (NTNU) https://dataverse.no/dataverse/ntnu |
Depositor
| Sánchez Diaz, Xavier Fernando Cuauhtémoc |
Deposit Date
| 2024-06-19 |
Date of Collection
| Start Date: 2024-01-22 ; End Date: 2024-03-01 |
Data Type
| machine-readable text |
Software
| Julia, Version: 1.9.3 |
Data Source
| This dataset contains statistical data on machine learning model training carried out on data from the following sources:
The file 1-seeds.csv contains accuracy tables for the following dataset: Charytanowicz, M.; Niewczas, J.; Kulczycki, P.; Kowalski, P. and Lukasik, S. (2012). Seeds. UCI Machine Learning Repository. https://doi.org/10.24432/C5H30K. Licensed under CC BY 4.0.
The file 2-ecoli.csv contains accuracy tables for the following dataset: Nakai, K. (1996). Ecoli. UCI Machine Learning Repository. https://doi.org/10.24432/C5388M. Licensed under CC BY 4.0.
The file 3-breast-w.csv contains accuracy tables for the following dataset: Wolberg, W. (1992). Breast Cancer Wisconsin (Original). UCI Machine Learning Repository. https://doi.org/10.24432/C5HP4Z. Licensed under CC BY 4.0.
The file 4-glass.csv contains accuracy tables for the following dataset: German, B. (1987). Glass Identification. UCI Machine Learning Repository. https://doi.org/10.24432/C5WW2P. Licensed under CC BY 4.0.
The file 5-heart-c.csv contains accuracy tables for the following dataset: Janosi, A.; Steinbrunn, W.; Pfisterer, M. and Detrano, R. (1988). Heart Disease. UCI Machine Learning Repository. https://doi.org/10.24432/C52P4X. Licensed under CC BY 4.0.
The file 6-wine.csv contains accuracy tables for the following dataset: Aeberhard, S. and Forina, M. (1991). Wine. UCI Machine Learning Repository. https://doi.org/10.24432/C5PC7J. Licensed under CC BY 4.0.
The file 7-credit-a.csv contains accuracy tables for the following dataset: Quinlan, J.R. Credit Approval. UCI Machine Learning Repository. https://doi.org/10.24432/C5FS30. Licensed under CC BY 4.0.
The file 8-zoo.csv contains accuracy tables for the following dataset: Forsyth, R. (1990). Zoo. UCI Machine Learning Repository. https://doi.org/10.24432/C5R59V. Licensed under CC BY 4.0.
The file 9-letter-r.csv contains accuracy tables for the following dataset: Slate, D. (1991). Letter Recognition. UCI Machine Learning Repository. https://doi.org/10.24432/C5ZP40. Licensed under CC BY 4.0.
The file 10-hepatitis.csv contains accuracy tables for the following dataset: Gong, G. (1988). Hepatitis. UCI Machine Learning Repository. https://doi.org/10.24432/C5Q59J. Licensed under CC BY 4.0. |