The paper argues that data normalization matters in medical diagnostics and introduces a two-step normalization approach for medical data mining tasks. The method accounts for both the interdependencies between features and their absolute values in order to improve classification accuracy. The authors tested it on six real-world medical datasets with several machine learning models, including Decision Tree, Extra Trees Classifier, and Bagging. The reported gain in classification accuracy is 1–6%, depending on the task (binary or multiclass classification). The authors conclude that the method can be applied in practice to medical diagnostics, particularly in small-data scenarios.
Takeaways:
- The paper introduces a two-step data normalization method that improves classification accuracy in medical diagnostic tasks.
- The method accounts for both the interdependencies between features and their absolute values, which the authors present as more comprehensive than traditional normalization approaches.
- In the experiments, the method raises the accuracy of machine learning classifiers by 1–6%, depending on the task.
- The approach is particularly useful for medical data mining tasks, where datasets are often small and may need additional preprocessing for good results.
- The paper emphasizes the practical application of this normalization technique in real-world medical datasets, particularly in binary and multiclass classification problems.
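
The summary does not spell out what the two normalization steps are, so the following is only a minimal sketch under a stated assumption: step one rescales each sample (row) to unit Euclidean norm, which reflects the relationships between features within a record, and step two divides each feature (column) by its maximum absolute value learned from the training data. The function names (`normalize_rows`, `fit_max_abs`, `two_step_fit_transform`, `two_step_transform`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def normalize_rows(X):
    """Step 1 (assumed): scale each sample (row) to unit Euclidean norm."""
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        # An all-zero row has no direction; leave it as zeros.
        out.append([v / norm if norm else 0.0 for v in row])
    return out


def fit_max_abs(X):
    """Learn the maximum absolute value of each feature (column).

    A column of zeros gets a scale of 1.0 to avoid division by zero.
    """
    return [max(abs(v) for v in col) or 1.0 for col in zip(*X)]


def scale_columns(X, scales):
    """Step 2 (assumed): divide each feature by its learned scale."""
    return [[v / s for v, s in zip(row, scales)] for row in X]


def two_step_fit_transform(X_train):
    """Apply both steps to the training data and return the learned scales."""
    rows = normalize_rows(X_train)
    scales = fit_max_abs(rows)
    return scale_columns(rows, scales), scales


def two_step_transform(X, scales):
    """Apply both steps to new data, reusing the training-set scales."""
    return scale_columns(normalize_rows(X), scales)
```

In this sketch the column scales are learned from the training split only and then reused for test data, so no information from the test set leaks into preprocessing. The normalized output could then be passed to any of the classifiers mentioned above; whether it reproduces the reported 1–6% gain depends on the datasets and on how closely these assumed steps match the paper's actual procedure.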