A qualitative assessment of machine learning support for detecting data completeness and accuracy issues to improve data analytics in big data for the healthcare industry

Conference paper


Juddoo, S. and George, C. 2020. A qualitative assessment of machine learning support for detecting data completeness and accuracy issues to improve data analytics in big data for the healthcare industry. ELECOM 2020 - 3rd International Conference on Emerging Trends in Electrical, Electronic and Communications Engineering (ELECOM). Mauritius, Mauritius 25 - 27 Nov 2020 IEEE. pp. 58-66 https://doi.org/10.1109/ELECOM49001.2020.9297009
Type: Conference paper
Title: A qualitative assessment of machine learning support for detecting data completeness and accuracy issues to improve data analytics in big data for the healthcare industry
Authors: Juddoo, S. and George, C.
Abstract

Tackling data quality issues in Big Data can be challenging. For data cleansing, manual methods are inefficient because of the potentially very large volume of data. This paper qualitatively assesses the possibilities for using machine learning to detect data incompleteness and inaccuracy, the two data quality dimensions found to be the most significant in a previous study by the authors. A review of the existing literature concludes that no single machine learning algorithm is best suited to handling both incompleteness and inaccuracy of data.
Various algorithms are selected from existing studies and applied to a representative big healthcare dataset. The experiments also showed that implementing machine learning algorithms for Big Data quality activities encounters several challenges. These relate to the volume of data that particular machine learning algorithms can scale to and to data type restrictions imposed by some algorithms. The study concludes that 1) data imputation works better with linear regression models, and 2) clustering models are more efficient at detecting outliers, but fully automated systems may not be realistic in this context. Therefore, a certain level of human judgement is still needed.
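
The two conclusions above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the record fields, the single-predictor regression, the one-dimensional k-means and the `min_share` threshold are all assumptions chosen to keep the example self-contained. It fills missing values from a fitted line and marks members of unusually small clusters as candidates for human review rather than correcting them automatically.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance; cannot fit a line")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def impute_with_regression(rows, x_key, y_key):
    """Return copies of rows with missing (None) y values predicted from x.

    The line is fitted only on rows where both fields are present.
    Imputed rows are tagged so a reviewer can tell them apart.
    """
    complete = [r for r in rows if r[x_key] is not None and r[y_key] is not None]
    a, b = fit_line([r[x_key] for r in complete], [r[y_key] for r in complete])
    filled = []
    for r in rows:
        r = dict(r)  # copy, so the input records are left untouched
        if r[y_key] is None and r[x_key] is not None:
            r[y_key] = a + b * r[x_key]
            r["imputed"] = True
        filled.append(r)
    return filled


def kmeans_1d(values, k=2, iters=20):
    """Tiny deterministic 1-D k-means (k >= 2); returns a label per value."""
    lo, hi = min(values), max(values)
    # Hypothetical initialisation: centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


def flag_small_clusters(values, k=2, min_share=0.2):
    """Flag values whose cluster holds less than min_share of all values.

    A flag marks a candidate for human review, not a confirmed error.
    """
    labels = kmeans_1d(values, k)
    sizes = [labels.count(c) for c in range(k)]
    return [sizes[lab] / len(values) < min_share for lab in labels]


if __name__ == "__main__":
    # Hypothetical records; the field names are illustrative only.
    records = [
        {"age": 30, "systolic_bp": 118},
        {"age": 45, "systolic_bp": 127},
        {"age": 52, "systolic_bp": None},
        {"age": 60, "systolic_bp": 136},
    ]
    print(impute_with_regression(records, "age", "systolic_bp"))
    print(flag_small_clusters([72, 75, 70, 68, 74, 71, 300]))
```

Returning flags instead of deleting or overwriting values keeps a person in the loop, in line with the abstract's point that full automation may not be realistic.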

Research Group: Aspects of Law and Ethics Related to Technology group
Conference: ELECOM 2020 - 3rd International Conference on Emerging Trends in Electrical, Electronic and Communications Engineering (ELECOM)
Page range: 58-66
ISBN
Electronic: 9781728157078
Electronic: 9781728157061
Paperback: 9781728157085
Publisher: IEEE
Publication dates
Print: 25 Nov 2020
Online: 25 Dec 2020
Publication process dates
Deposited: 27 Nov 2020
Accepted: 07 Oct 2020
Output status: Published
Accepted author manuscript
Copyright Statement

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Digital Object Identifier (DOI): https://doi.org/10.1109/ELECOM49001.2020.9297009
Language: English
Book title: 2020 3rd International Conference on Emerging Trends in Electrical, Electronic and Communications Engineering (ELECOM)
Permalink: https://repository.mdx.ac.uk/item/892z2
