AP09259587 – Developing methods and algorithms of intelligent GIS for multi-criteria analysis of healthcare data
Implementation period: 2021-2023
Project objective:
To develop models, algorithms, and methods for an intelligent geoinformation system for multi-criteria decision support in health care, based on explainable machine learning, NLP, and GIS models and using social, medical, and economic information.
Relevance:
The success of artificial intelligence in health care spans three major areas: diagnostics and treatment support, clinical decision support systems, and public health. The project aims to overcome the shortcomings of modern medical information systems and GIS in public health by creating an intelligent geoinformation system for multi-criteria decision support based on explainable machine learning, NLP, and GIS models. The methods developed will make it possible to produce recommendations for improving the work of healthcare organizations using medical, economic, and social information.
Abstract
The quality of a healthcare system can be assessed along three “dimensions”: medical, economic, and social. In many cases, however, each is evaluated separately. Pursuing maximum cost efficiency can undermine the delivery of medical care and cause social dissatisfaction, up to social instability, as was particularly evident during the COVID-19 pandemic. At the same time, society cannot always afford high costs in this area. A comprehensive analysis and a comprehensive approach to improving the performance of healthcare organizations are therefore needed. Traditionally, such tasks, with their vaguely defined constraints, fall within the scope of multi-criteria analysis and decision support (MCDA) and are often carried out using various methods for ranking and aggregating expert judgments. However, the vast amount of data accumulated in the health sector, the mass media, and social networks motivates the use of machine learning (ML) and natural language processing (NLP) techniques in decision support tasks. Methods for such applications have not yet been fully developed. In particular, algorithms and methods of explainable machine learning (EML) are only partially developed and are used only occasionally in practice. Spatial effects (place of residence, environment, the region where the healthcare organization is located, etc.) are considered in existing healthcare geographic information systems (GIS), but these systems analyze the patterns behind such effects only statistically. The social impact of health care is still assessed only qualitatively.
This project aims to overcome these limitations through a combination of NLP, EML, MCDA, and GIS methods.
Within the project, information analysis, modeling, and computer experiments are used to develop machine learning models and methods for explaining their behavior in the subject area, and to propose methods of data showcasing and pre-processing as well as models or indicators for error assessment and for expert evaluation of data and results. Methods are developed for collecting data and for visualizing data and analysis results in their spatial and temporal dynamics. During prototyping, working prototypes of system components are built to test the validity of the project hypotheses. Social impact is assessed through automatic analysis of the public information space on health care.
The scientific contribution of the project is that it offers specific ways to overcome the limitations of machine learning models within decision support systems by using EML, taking into account the spatial and social influence of health organizations. The applied contribution consists of MCDA algorithms for the numerical evaluation of health organizations based on medical, economic, and social data, together with methods for collecting media information and displaying the results in a GIS environment.
In terms of the methods used, the proposed solution is a form of geospatial artificial intelligence (GeoAI), a paradigm that is actively discussed in the scientific literature.
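The idea of a numerical, multi-criteria evaluation of health organizations can be illustrated with a minimal sketch. It uses the weighted sum model, one of the simplest MCDA techniques, chosen here only for illustration; the project's actual algorithms are not specified in this summary. The clinic names, criteria, weights, and figures are all hypothetical.

```python
# Hypothetical sketch: weighted-sum MCDA scoring of healthcare organizations.
# Criteria, weights, and data are illustrative assumptions, not project data.

def min_max_normalize(values, higher_is_better=True):
    """Rescale values to [0, 1]; invert when lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All alternatives are equal on this criterion: give them the same score.
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def weighted_sum_scores(alternatives, criteria):
    """Score each alternative as a weighted sum of normalized criteria.

    alternatives: {name: {criterion: raw value}}
    criteria:     {criterion: (weight, higher_is_better)}
    """
    names = list(alternatives)
    total_weight = sum(weight for weight, _ in criteria.values())
    scores = {name: 0.0 for name in names}
    for criterion, (weight, higher_is_better) in criteria.items():
        column = [alternatives[name][criterion] for name in names]
        normalized = min_max_normalize(column, higher_is_better)
        for name, value in zip(names, normalized):
            scores[name] += (weight / total_weight) * value
    return scores


# One medical, one economic, and one social criterion per organization.
clinics = {
    "Clinic A": {"recovery_rate": 0.90, "cost_per_patient": 120.0, "media_sentiment": 0.2},
    "Clinic B": {"recovery_rate": 0.80, "cost_per_patient": 80.0, "media_sentiment": 0.6},
    "Clinic C": {"recovery_rate": 0.70, "cost_per_patient": 100.0, "media_sentiment": -0.2},
}
criteria = {
    "recovery_rate": (0.5, True),      # medical: higher is better
    "cost_per_patient": (0.3, False),  # economic: lower is better
    "media_sentiment": (0.2, True),    # social: higher is better
}

scores = weighted_sum_scores(clinics, criteria)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data the weights come from nowhere in particular; in a real MCDA setting they would typically be elicited from experts, which is the step the abstract describes as expert judgment ranking and aggregation.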
Results achieved
- A system for collecting information from open sources has been developed. With its help, a corpus of 700 thousand documents and messages has been collected, which supports analysis of how health issues are covered in the electronic media of the Republic of Kazakhstan.
- Methods and software have been developed to identify health-related messages in the information flow. An original approach based on hierarchical topic modeling is used.
- Automatic identification of messages related to healthcare in the information flow was carried out.
- A method has been developed for obtaining numerical estimates of information trends in healthcare over time, based on mass media data. The method was applied to the most important of these trends: the dynamics of several of the most significant trends (related to epidemiological situations) were analyzed and evaluated, and a correlation analysis was performed between changes in thematic clusters of topics and some WHO indicators related to the pandemic.
- A numerical measure of mass media coverage of health policy has been developed.
- A method for numerically evaluating the sentiment (tonality) of mass media coverage of health issues has been developed. Using this method, the sentiment of the mass media of Kazakhstan and of some major Russian sources on the topic of healthcare was assessed.
- A prototype of a machine explanation model based on the XGBoost regressor and the SHAP library has been developed. The model evaluates the significance of a medical institution's indicators with respect to the chosen target parameter (medical indicators, indicators of the official assessment of institutions).
- A prototype of a decision support system as part of an intelligent geoinformation system (IGIS) has been developed, providing interpretation of the results of explainable machine learning models.
- A subsystem for assessing the indicators of the social component of the healthcare system has been developed (based on the corpus of texts). IGIS provides a graphical representation of indicators with reference to geographical coordinates.
- A prototype of an intelligent geoinformation system (IGIS) has been developed, which visualizes the results of explainable machine learning models and the indicators of the social component (based on the corpus of texts). IGIS provides a graphical representation of indicators with reference to geographical coordinates and time dynamics.
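The explanation prototype mentioned above relies on SHAP, which attributes a model's prediction to its input features using Shapley values. The following is a minimal sketch of that underlying idea only: it computes exact Shapley values by enumerating feature coalitions for a small hand-written model. It does not use the XGBoost or SHAP libraries, and the toy model, feature meanings, and baseline are hypothetical.

```python
# Hypothetical sketch: exact Shapley values for one prediction of a toy model.
# This illustrates the attribution idea behind SHAP; it is not the SHAP library.
from itertools import combinations
from math import factorial


def exact_shapley(predict, instance, baseline):
    """Return one Shapley value per feature for a single instance.

    Features absent from a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_rating(x):
    """Toy stand-in for a trained regressor scoring a medical institution.

    x[0], x[1], x[2] are three normalized indicators (hypothetical);
    the last term is an interaction between the first and third.
    """
    return 3.0 * x[0] + 1.0 * x[1] + 2.0 * x[0] * x[2]


instance = [1.0, 1.0, 1.0]  # institution being explained
baseline = [0.0, 0.0, 0.0]  # reference point, e.g. an "average" institution

contributions = exact_shapley(toy_rating, instance, baseline)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features involved. This additivity is what lets per-indicator contributions be displayed next to an institution's overall score.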
Publications:
- Ravil I. Mukhamediev, Marina Yelis, Kirill Yakunin, Yelena Popova, Yan Kuchin, Adilkhan Symagulov, Nadiya Yunicheva, Elena Zaitseva, Vitaly Levashenko, Elena Muhamedijeva, Viktors Gopejenko and Rustam Mussabayev. Exploring the Health Care System’s Representation in the Media through Hierarchical Topic Modeling // Cogent Engineering (under review)
- Yakunin K., Mukhamediev R.I., Yelis M., Kuchin Y., Symagulov A., Yunicheva N., Muhamedijeva E. Analysis of thematic clusters of publications in the mass media of the Republic of Kazakhstan on the COVID-19 pandemic (in Russian) // News of the National Academy of Sciences of the Republic of Kazakhstan. Physico-Mathematical Series. – 2022. – No. 3(343). – P. 260–274.
- Yakunin K. et al. Analysis of the Correlation between Mass-Media Publication Activity and COVID-19 Epidemiological Situation in Early 2022 // Information. – 2022. – Vol. 13. – No. 9. – P. 434. https://www.mdpi.com/2078-2489/13/9/434/htm (Scopus Q2, 72%)
- Mukhamediev R. I. et al. Review of Artificial Intelligence and Machine Learning Technologies: Classification, Restrictions, Opportunities and Challenges // Mathematics. – 2022. – Vol. 10. – No. 15. – P. 2552. https://www.mdpi.com/2227-7390/10/15/2552 (Scopus: Q1, 87%, WoS: Q1, IF:2.8)
- Yunicheva N., Yelis M., Yakunin K., Mukhamediev R., Symagulov A., Kuchin Y., Muhamedijeva E. Analytical review of media information from open sources on health care during the COVID-19 pandemic (in Russian) // Proceedings of the XVIII International Asian School-Seminar "Optimization Problems of Complex Systems" (OPCS’22). – Kyrgyzstan, 2022. – P. 17–41.
- M. Yelis, K. Yakunin, R. Mukhamediev, A. Symagulov, Y. Kuchin, E. Mukhamedieva, N. Yunicheva, F. Abdoldina. How to predict the interest of the scientific community in subsections of artificial intelligence? // The 20th Int. scientific. conf. Information technologies and management 2022. – Riga, 2022. – P. 16-17. https://www.ismaitm.lv/theses-2022
- M. Yelis, K. Yakunin, R. Mukhamediev, A. Symagulov, Y. Kuchin, E. Muhamedijeva, N. Yunicheva, F. Abdoldina. How to estimate mass media? // The 20th Int. conf. Information technologies and management. – Riga, 2022. – P. 18-19. https://www.ismaitm.lv/theses-2022
- Kirill Yakunin, Ravil I. Mukhamediev, Elena Zaitseva, Vitaly Levashenko, Marina Yelis, Adilkhan Symagulov, Yan Kuchin, Elena Muhamedijeva, Margulan Aubakirov and Viktors Gopejenko. Mass media as a mirror of the COVID-19 pandemic // Computation. 2021, 9(12), 140; https://doi.org/10.3390/computation9120140 (Scopus Q2, 71%)
- O. Yakunin, S. B. Murzakhmetov, R. R. Musabayev, R. I. Mukhamediyev. News Popularity Prediction Using Topic Modelling // 2021 IEEE Int. Conf. on Smart Information Systems and Technologies. – Nur-Sultan, 2021. – P. 1-4. doi: 10.1109/SIST50301.2021.9465884.
- O. Yakunin et al. Reflection of the COVID-19 pandemic in mass media //2021 IEEE Int. Conf. on Information and Digital Technologies (IDT). – Slovakia, 2021. – P. 260-263. https://ieeexplore.ieee.org/abstract/document/9497572
- M. Yelis, Y. Kuchin, A. Symagulov, E. Muhamedieva. Explainable machine learning for healthcare decision-making tasks // The 19th int. scientific. conf. Information technologies and management 2021. – Riga, 2021. – P. 56-58. https://www.ismaitm.lv/images/Files/Theses/2021/01_NC/23_ITM2021_Yelis_Kuchin_Symagulov_Muhamedieva.pdf
- Symagulov A., Kuchin Y., Yelis M., Zhumabayev A., Abdurazakov A. Methods for interpreting machine learning black boxes and their application to building decision support systems (in Russian) // News of the National Academy of Sciences of the Republic of Kazakhstan. Physico-Mathematical Series. – 2021. – No. 5(339). – P. 91–99.
- Certificate of entry into the State Register of Rights to Objects Protected by Copyright No. 27917, dated July 21, 2022: "Prototype of an information system for storing, visualizing, and analyzing data on the activities of medical institutions in Kazakhstan". Kirill Yakunin, Ravil Ilgizovich Mukhamediev, Yan Igorevich Kuchin, Adilkhan Symagulov, Marina Sergeevna Yelis.