International Journal of Data Science

Visualizing Type 2 Diabetes Prevalence: Localizing Model Feature Impacts

Youssef Sultan — 2024-12-31

SHAP values are a common approach to understanding machine learning model predictions: they average the marginal contribution of each feature across every possible permutation of the feature set. Our research provides a localized view of the SHAP values contributing to Type 2 Diabetes (T2D) prevalence in the United States from 2012 to 2021, treating each year independently. Instead of visualizing SHAP feature importance across an entire geographical dataset with a beeswarm plot, we take a more granular approach and visualize the individual SHAP values of Social Determinants of Health (SDOH) features by county on a choropleth map. Additionally, we found that replacing geographic identifiers such as ZIP codes with precise latitude and longitude coordinates before applying KNN imputation reduced the MSE by 10%. Our visualization reveals how specific factors influence T2D prevalence at the county level using a non-linear machine learning model. By re-appending the initially preserved geographic identifiers to each record by index, we traced each SHAP value back to its locality. Our approach opens up a new geographical vantage point on the mechanisms of model predictions, thereby identifying localized key factors influencing T2D. This study extends the possibilities for tailored interventions and public health policies by showing how some factors have varying predictive impact on an outcome at the geographic level.
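The two ideas in this abstract, averaging marginal contributions over feature orderings and tracing each value back to a county by index, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the toy model, the feature names `poverty` and `obesity`, the feature values, and the single baseline record are hypothetical and are not taken from the study, and the FIPS codes are used only as sample identifiers. Practical SHAP tooling typically relies on approximations and a background dataset rather than this exhaustive enumeration.

```python
from itertools import permutations

def shapley_values(predict, record, baseline):
    """Exact Shapley values for one record: average each feature's marginal
    contribution over every ordering in which features can be switched on."""
    names = list(record)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)              # start from the reference record
        previous = predict(current)
        for name in order:
            current[name] = record[name]      # switch this feature to its real value
            value = predict(current)
            totals[name] += value - previous  # marginal contribution in this ordering
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(r):
    # Hypothetical non-linear model with an interaction term (not the study's model)
    return 2.0 + 0.10 * r["poverty"] + 0.05 * r["obesity"] + 0.002 * r["poverty"] * r["obesity"]

# Identifiers are set aside before modelling and kept in the same row order
county_fips = ["01001", "01003"]
features = [
    {"poverty": 20.0, "obesity": 35.0},   # made-up values
    {"poverty": 10.0, "obesity": 25.0},
]
baseline = {"poverty": 15.0, "obesity": 30.0}

# Re-attach each row's identifier by index so every value maps to a county
shap_by_county = {
    county_fips[i]: shapley_values(toy_model, row, baseline)
    for i, row in enumerate(features)
}

for fips, values in shap_by_county.items():
    print(fips, values)
```

A dictionary keyed by county identifier is the shape a choropleth layer would need: one value per feature per county. The per-county values sum to the difference between that county's prediction and the baseline prediction, which is a convenient sanity check.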

The Accuracy Analysis of Loan Interest Rate Forecasting Using Double Exponential Smoothing Methods

Ansari Saleh Ahmar — 2024-12-27

This study aims to forecast rupiah loan interest rates at commercial banks in Indonesia using exponential smoothing. The data are credit interest rates from December 2015 to September 2016. The exponential smoothing method applied is double exponential smoothing. The results show that double exponential smoothing provides accurate predictions, with the smallest Root Mean Square Error (RMSE) being 0.06629. The optimal parameters for double exponential smoothing are an alpha of 0.3 and a beta of 0.3. These findings indicate that double exponential smoothing can effectively capture trends and patterns in credit interest rate data, making it a reliable tool for future loan interest rate forecasting. The results of this study are expected to make a significant contribution to strategic decision-making in the banking sector, particularly in risk management and loan interest rate strategy determination.
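As a rough illustration of the method, the sketch below implements Holt's two-parameter form of double exponential smoothing, which is assumed here because the abstract reports both an alpha and a beta. The ten monthly rates are made-up values, not the study's data, and the initialisation of level and trend is one simple choice among several, so the resulting RMSE is not expected to match the reported figure.

```python
from math import sqrt

def holt_forecasts(series, alpha=0.3, beta=0.3):
    """One-step-ahead forecasts from Holt's linear (double) exponential smoothing.
    Returns the forecasts for series[1:], plus the final level and trend."""
    level = series[0]
    trend = series[1] - series[0]          # simple initialisation (an assumption)
    forecasts = []
    for actual in series[1:]:
        forecasts.append(level + trend)    # forecast made before seeing `actual`
        new_level = alpha * actual + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts, level, trend

def rmse(actuals, forecasts):
    """Root mean square error between paired actual and forecast values."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(forecasts))

# Ten hypothetical monthly rates (percent), standing in for Dec 2015 to Sep 2016
rates = [12.8, 12.7, 12.7, 12.6, 12.5, 12.5, 12.4, 12.3, 12.2, 12.2]

forecasts, level, trend = holt_forecasts(rates)
error = rmse(rates[1:], forecasts)
next_month = level + trend                 # forecast one step beyond the data

print(f"RMSE: {error:.5f}, next-period forecast: {next_month:.3f}")
```

Selecting alpha and beta would typically mean repeating this calculation over a grid of candidate values and keeping the pair with the smallest RMSE.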

User-Friendly Interface Attendance System Based on Python Libraries and Deep Learning

Amna Kadhim — 2024-12-31

Traditional methods of managing attendance for students, employees, or teaching staff rely on paper attendance records, which are tiring and time-consuming. They are also prone to errors, repeated entries, and forgotten registrations. Manual fingerprint scanning is available, but because of the diseases the world has experienced in previous years, some consider it undesirable as a means of transmitting infection. In this research, we propose a method for recording attendance that relies on face recognition with real-time video processing, using a multi-layer perceptron algorithm with two Python libraries. The camera device is accessed, a picture of the person is taken, and the image is processed and framed; captured faces are then compared with images in the stored database to perform face recognition, and the system handles file operations and time-related tasks. Once the desired person is found, attendance is recorded with the actual time in an Excel file, and the file is saved under the date on which attendance was recorded. The designed system works efficiently in real-time counting and detection and is shown to combine high face-detection accuracy with performance.
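The final step described above, writing a recognised person's name and time into a file named after the day, might look roughly like the following sketch. It is not the authors' implementation: it uses only the standard library and writes CSV instead of Excel to stay self-contained, and the function name `record_attendance`, the file naming pattern, and the duplicate check are assumptions made for illustration. The face recognition step itself is outside the scope of the sketch.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

def record_attendance(name, folder, now=None):
    """Append `name` and the current time to a file named after the date.
    Returns False if the person is already recorded for that day."""
    now = now or datetime.now()
    path = Path(folder) / f"attendance_{now:%Y-%m-%d}.csv"

    seen = set()
    if path.exists():
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            next(reader, None)                    # skip the header row
            seen = {row[0] for row in reader if row}
    if name in seen:
        return False                              # avoid a repeated entry

    write_header = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if write_header:
            writer.writerow(["name", "time"])
        writer.writerow([name, now.strftime("%H:%M:%S")])
    return True

# Usage with a fixed, hypothetical timestamp and a temporary folder
with tempfile.TemporaryDirectory() as folder:
    moment = datetime(2024, 12, 31, 9, 0, 0)
    first = record_attendance("Amal", folder, moment)
    second = record_attendance("Amal", folder, moment)   # same person, same day
    print(first, second)
```

Checking for an existing entry before writing addresses the repeated-registration problem the abstract attributes to paper records, since a face seen in many consecutive video frames would otherwise be logged many times.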

A Proposed System of Smart Diagnosis based on AI for Early Disease Detection Aligned with Islamic Healthcare Values

Nur Ilyana Ismarau Tajuddin — 2024-12-25

The rapid advancement of machine learning and web technologies is transforming the healthcare sector, offering innovative solutions for disease diagnosis and management. This conceptual paper explores the development of a web-based disease detection platform that utilizes machine learning algorithms to predict potential diseases based on user-reported symptoms. The primary objective of this platform is to provide users with accurate diagnostic results, enhancing the accessibility and efficiency of healthcare services. A distinctive feature of this platform is its integration of Islamic principles, specifically the inclusion of INAQ (Islamic Network for Artificial Intelligence) elements, such as the practice of Ruqyah (spiritual healing), within the technological framework. This approach seeks to align the proposed platform with the Islamic understanding of Tawhid (the Oneness of God) and its relationship to knowledge and healing. The proposed platform will be designed with a user-friendly interface to ensure accessibility for individuals with varying levels of technological literacy. It aims to bridge the gap between modern medical technologies and traditional Islamic perspectives on health and healing, offering a culturally sensitive solution to healthcare challenges. By embedding Islamic ethical considerations, the platform provides a holistic approach to disease detection, which acknowledges both the scientific and spiritual dimensions of health. This work contributes to the emerging field of culturally inclusive healthcare solutions, laying the groundwork for future research and development in medical technologies that respect and incorporate diverse cultural and religious values. The proposed platform highlights the potential for AI-driven healthcare innovations that are both technically advanced and socially sensitive, thus setting the stage for inclusive, ethically grounded solutions in healthcare technology.

Performance Analysis of 1D Linear Kalman Filter in Modern Scientific Computing Environments

Siti N. Kaban — 2024-12-11

We present a comprehensive performance analysis of 1D linear Kalman filter implementations across three modern scientific computing environments: Python, Julia, and R. Using a position tracking problem with hypothetical noisy sonar measurements, we evaluated both numerical accuracy and computational efficiency. All implementations produced numerically identical results within machine precision (maximum differences of 1.2 × 10^-14 m for position estimates), demonstrating the filter's ability to reduce measurement noise by 73.1% while accurately estimating unmeasured velocity states. Performance benchmarking revealed significant efficiency differences, with Python achieving a median execution time of 1.87 seconds, compared to 4.38 seconds for R and 34.82 seconds for Julia. Statistical analysis confirmed these differences were highly significant (Kruskal-Wallis H = 1332.445, p < 0.001) with extremely large effect sizes (Cohen's d > 13 for all comparisons). Memory profiling revealed significant differences in resource utilization, with Python maintaining the most efficient footprint (97.67 MB), followed by Julia (132.45 MB) and R (168.21 MB), all with minimal variation. The unexpected underperformance of Julia relative to Python contradicts theoretical expectations and highlights the importance of empirical benchmarking for scientific computing applications. Our results provide practical guidance for implementing Kalman filters in time-critical or resource-constrained applications.
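For readers unfamiliar with the setup, the following is a minimal sketch of a linear Kalman filter that tracks position along one dimension and estimates the unmeasured velocity from noisy position readings. It is not the paper's benchmark code: the constant-velocity motion model, the noise variances, the time step, the diffuse initial covariance, and the simulated data are all assumptions chosen for illustration, so the noise reduction it achieves will differ from the reported 73.1%.

```python
import random

def kalman_1d(measurements, dt=1.0, meas_var=4.0, accel_var=0.01):
    """Constant-velocity Kalman filter. State is (position, velocity);
    only position is measured. Returns a list of (position, velocity) estimates."""
    x, v = 0.0, 0.0                       # initial state guess
    p00, p01, p11 = 1000.0, 0.0, 1000.0   # covariance [[p00, p01], [p01, p11]], deliberately large
    # Process noise for a random-acceleration model
    q00 = accel_var * dt**4 / 4
    q01 = accel_var * dt**3 / 2
    q11 = accel_var * dt**2
    estimates = []
    for z in measurements:
        # Predict: x <- F x, P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
        x = x + dt * v
        p00 = p00 + 2 * dt * p01 + dt * dt * p11 + q00
        p01 = p01 + dt * p11 + q01
        p11 = p11 + q11
        # Update with measurement matrix H = [1, 0]
        s = p00 + meas_var                # innovation variance
        k0, k1 = p00 / s, p01 / s         # Kalman gain for position and velocity
        residual = z - x
        x += k0 * residual
        v += k1 * residual
        p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01
        estimates.append((x, v))
    return estimates

# Simulated target moving at 0.5 m per step, measured with noise of std 2 m
rng = random.Random(0)
truth = [0.5 * t for t in range(200)]
noisy = [p + rng.gauss(0.0, 2.0) for p in truth]
estimates = kalman_1d(noisy)

def rmse(values):
    """Root mean square error of `values` against the true positions."""
    return (sum((a - b) ** 2 for a, b in zip(values, truth)) / len(truth)) ** 0.5

raw_rmse = rmse(noisy)
filtered_rmse = rmse([position for position, _ in estimates])
print(f"raw RMSE {raw_rmse:.2f} m, filtered RMSE {filtered_rmse:.2f} m")
```

The 2×2 covariance is stored as three scalars because it is symmetric, which keeps the sketch free of array libraries. A benchmark like the one described would time many repetitions of such a loop in each language.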