FNG-IE: an improved graph-based method for keyword extraction from scholarly big-data

Tahir, N., Asif, M., Ahmad, S., Malik, M.S.A., Aljuaid, H., Butt, M.A. and Rehman, M. 2021. FNG-IE: an improved graph-based method for keyword extraction from scholarly big-data. PeerJ Computer Science. 7. https://doi.org/10.7717/peerj-cs.389
Type: Article
Title: FNG-IE: an improved graph-based method for keyword extraction from scholarly big-data
Authors: Tahir, N., Asif, M., Ahmad, S., Malik, M.S.A., Aljuaid, H., Butt, M.A. and Rehman, M.
Abstract

Keyword extraction is essential for identifying influential keywords in large documents, as research repositories grow in volume day by day. The research community is drowning in data and starving for information. Keywords are the few words that precisely describe the theme of a whole document. Many state-of-the-art approaches are available for keyword extraction from large document collections; they fall into three types: statistical approaches, machine learning approaches, and graph-based methods. Machine learning approaches require a large training dataset that must be developed manually by domain experts, which is sometimes difficult to produce. This research therefore focused on enhancing state-of-the-art graph-based methods to extract keywords when a training dataset is unavailable. It first converted a handcrafted dataset, collected from impact-factor journals, into n-gram combinations ranging from unigram to pentagram, and also enhanced traditional graph-based approaches. The experiment was conducted on the handcrafted dataset, and all methods were applied to it. Domain experts performed a user study to evaluate the results. The results of every method were evaluated against the user study using precision, recall, and F-measure as evaluation metrics. The results showed that the proposed method (FNG-IE) performed well, scoring close to the machine learning approaches.
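
The pipeline the abstract outlines (n-gram generation from unigram to pentagram, graph-based ranking, and scoring against expert judgments) can be illustrated with a minimal Python sketch. This is not the FNG-IE algorithm, which the abstract does not specify: the function names, the window size, and the use of plain node degree in a word co-occurrence graph as the ranking score are all assumptions made purely for illustration.

```python
def ngrams(tokens, max_n=5):
    """Return all contiguous n-grams for n = 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def degree_rank(tokens, window=2):
    """Rank words by degree in an undirected co-occurrence graph.

    Assumption: two words are linked when they appear within `window`
    positions of each other. Degree is a stand-in for whatever centrality
    measure a real graph-based method would use.
    """
    neighbours = {}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                neighbours.setdefault(word, set()).add(other)
                neighbours.setdefault(other, set()).add(word)
    # Highest degree first; ties broken alphabetically for determinism.
    return sorted(neighbours, key=lambda w: (-len(neighbours[w]), w))


def precision_recall_f(extracted, gold):
    """Compare extracted keywords with expert-chosen (gold) keywords."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    tokens = "graph based keyword extraction from scholarly data".split()
    print(ngrams(tokens, max_n=2))
    print(degree_rank(tokens)[:3])
    print(precision_recall_f(["graph", "keyword", "data"],
                             ["graph", "keyword", "extraction", "ranking"]))
```

In this sketch the gold set plays the role of the keywords chosen by domain experts in the user study; the F-measure is the harmonic mean of precision and recall.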

Keywords: Programming; Keyword extraction; Graph-based keyword extraction
Sustainable Development Goals: 9 Industry, innovation and infrastructure
Middlesex University Theme: Creativity, Culture & Enterprise
Publisher: PeerJ
Journal: PeerJ Computer Science
ISSN (Electronic): 2376-5992
Publication dates
Online: 11 Mar 2021
Print: 11 Mar 2021
Publication process dates
Submitted: 09 Dec 2020
Accepted: 20 Jan 2021
Deposited: 06 Nov 2025
Output status: Published
Publisher's version
License: CC BY 4.0
File Access Level: Open
Copyright Statement

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

Digital Object Identifier (DOI): https://doi.org/10.7717/peerj-cs.389
Scopus EID: 2-s2.0-85103094022
Web of Science identifier: WOS:000627827600001
Permalink: https://repository.mdx.ac.uk/item/2y77w9

Download files


Publisher's version
peerj-cs-389.pdf
License: CC BY 4.0
File access level: Open

