GLSTM: A novel approach for prediction of real & synthetic PID diabetes data using GANs and LSTM classification model
Journal Title: International Journal of Experimental Research and Review - Year 2023, Vol 30, Issue 1
Abstract
Generative Adversarial Networks (GANs) are a revolution in modern artificial systems. Deep learning-based GANs can generate realistic synthetic tabular data. Synthetic data are used to enlarge a relatively small training dataset while preserving the confidentiality of the original data. In this context, we implemented a GAN framework for generating diabetes data to support health care professionals in clinical applications. The GAN is validated on the Pima Indian Diabetes (PID) dataset. Various preprocessing techniques, such as handling missing values, outliers and data imbalance, are applied to enhance data quality. Exploratory data analyses, such as heat maps, bar graphs and histograms, are used for data visualisation. We employed hypothesis testing to examine the resemblance between real data and GAN-generated synthetic data. In this study, we propose a GAN-Long Short-Term Memory (GLSTM) system, in which a GAN is used for data augmentation and an LSTM is used for diabetes classification. Additionally, various GAN models such as CTGAN, Vanilla GAN, Copula GAN, Gaussian Copula GAN, and TVAE GAN are used to generate the synthetic dataset. Experiments were conducted on real data, on synthetic data, and on a combination of real and synthetic data. The model trained on both real and synthetic data obtained a substantially better accuracy of 97%, compared to 92% when only real data was used. We also observed that synthetic data could be used in place of real data, as the mean correlation between synthetic and real data is 0.93. Our results outperformed state-of-the-art methodologies.
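The abstract does not say which hypothesis test was used to compare real and synthetic data. As an illustration only, the sketch below assumes a two-sided permutation test on the difference in means of a single feature. The function name `permutation_test_mean_diff` and all sample values are hypothetical, and the numbers are randomly generated rather than taken from the PID dataset.

```python
import random
import statistics


def permutation_test_mean_diff(real, synthetic, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed_difference, p_value). A large p-value means the test
    finds no evidence that the two samples have different means.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(real) - statistics.fmean(synthetic))
    pooled = list(real) + list(synthetic)
    n_real = len(real)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to the two groups.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:n_real]) - statistics.fmean(pooled[n_real:])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical values standing in for one numeric column; they are made up
# for illustration and are NOT PID data or GAN output.
rng = random.Random(42)
real_column = [rng.gauss(120, 30) for _ in range(200)]
synthetic_similar = [rng.gauss(120, 30) for _ in range(200)]
synthetic_shifted = [rng.gauss(150, 30) for _ in range(200)]

_, p_similar = permutation_test_mean_diff(real_column, synthetic_similar)
_, p_shifted = permutation_test_mean_diff(real_column, synthetic_shifted)
print(f"similar distribution: p = {p_similar:.3f}")
print(f"shifted distribution: p = {p_shifted:.3f}")
```

Under this assumption, a small p-value for a feature would suggest that the synthetic column differs from the real one in its mean, while a large p-value would give no evidence of such a difference. A test on means alone does not compare full distributions, so in practice it would likely be combined with other checks, such as the correlation comparison mentioned in the abstract.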
Authors and Affiliations
Sushma Jaiswal, Priyanka Gupta