
AUTHOR(S):

S. Selvakani, K. Vasumathi, E. Pavankumar

 

TITLE

Predicting Sleep Disorders Using Machine Learning


ABSTRACT

This script provides a structured approach to analyzing a dataset related to sleep disorders, covering exploration, cleaning, visualization, modeling, and evaluation. It begins by loading the dataset with pandas and exploring its structure and summary statistics, including the unique values of key categorical columns. The next step corrects inconsistencies in the data, such as standardizing the "BMI Category" column and splitting the "Blood Pressure" column into separate systolic and diastolic values, so that the dataset is clean and ready for further analysis. Visualizations are created with Matplotlib and Seaborn: pairwise plots and correlation matrices explore relationships between numerical variables, and box plots identify outliers. Categorical variables are analyzed with count plots, which reveal the distribution of features such as gender, BMI category, and sleep disorder. A stacked bar chart highlights the relationship between occupation and sleep disorders, and box plots show how sleep duration varies across occupations. The data is then preprocessed for machine learning by label encoding the categorical variables and scaling the numerical features with StandardScaler. The target variable, "Sleep Disorder," is also encoded for modeling. Several classification models, including Logistic Regression, Ridge Classifier, SVM, and Random Forest, are then trained on the data. Cross-validation is used to assess model performance, and confusion matrices are plotted to visualize the classification results. The models are then tuned with grid search over their hyperparameters, and the best configurations are selected. The optimized models are evaluated again with confusion matrices, and their performance metrics are reviewed.
Lastly, feature importance is extracted from the Random Forest model to determine the most influential features for predicting sleep disorders, with the results displayed in a bar plot. This comprehensive process enables a thorough understanding of the factors contributing to sleep disorders and the effectiveness of different machine learning models in predicting them.
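
The cleaning step described above can be sketched in pandas as follows. This is a minimal illustration, not the authors' code: the column names "BMI Category", "Blood Pressure", and "Sleep Disorder" come from the abstract, while the sample values, the assumed duplicate label "Normal Weight", the new column names `Systolic` and `Diastolic`, and the helper `clean_sleep_data` are hypothetical.

```python
import pandas as pd

# Tiny made-up sample; only the column names are taken from the abstract.
df = pd.DataFrame({
    "BMI Category": ["Normal", "Normal Weight", "Overweight", "Obese"],
    "Blood Pressure": ["126/83", "125/80", "140/90", "135/88"],
    "Sleep Disorder": ["None", "None", "Sleep Apnea", "Insomnia"],
})


def clean_sleep_data(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: standardize BMI labels and split blood pressure."""
    out = frame.copy()
    # Assumption: "Normal Weight" and "Normal" denote the same category.
    out["BMI Category"] = out["BMI Category"].replace({"Normal Weight": "Normal"})
    # Split "systolic/diastolic" strings into two integer columns.
    bp = out["Blood Pressure"].str.split("/", expand=True).astype(int)
    out["Systolic"] = bp[0]
    out["Diastolic"] = bp[1]
    # The original string column is no longer needed once it has been split.
    return out.drop(columns=["Blood Pressure"])


cleaned = clean_sleep_data(df)
print(cleaned)
```

Splitting the blood pressure string into two numeric columns lets both values take part in scaling and correlation analysis, which a single text column would not allow.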

KEYWORDS

Sleep Disorders, Visualization, Pandas, Machine Learning, Classification Models, Feature Importance.

 

Cite this paper

S. Selvakani, K. Vasumathi, E. Pavankumar. (2025) Predicting Sleep Disorders Using Machine Learning. International Journal of Computers, 10, 63-73

 

Copyright © 2024 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0