An auto-scaling framework for analyzing big data in the cloud environment
Article
Jannapureddy, R., Vien, Q., Shah, P. and Trestian, R. 2019. An auto-scaling framework for analyzing big data in the cloud environment. Applied Sciences. 9 (7), pp. 1-16. https://doi.org/10.3390/app9071417
Type | Article |
---|---|
Title | An auto-scaling framework for analyzing big data in the cloud environment |
Authors | Jannapureddy, R., Vien, Q., Shah, P. and Trestian, R. |
Abstract | Processing big data on traditional computing infrastructure is challenging because the large volume of data leads to high computational complexity. Recently, Apache Hadoop has emerged as a distributed computing infrastructure for dealing with big data. Adapting Hadoop to adjust its computing resources dynamically based on the real-time workload is itself a demanding task, so conventionally a cluster is pre-configured with enough resources to handle the peak data load. However, this may waste considerable computing resources when usage levels are much lower than the preset load. In consideration of this, this paper investigates an auto-scaling framework in a cloud environment that aims to minimise the cost of resource use by automatically adjusting the virtual nodes depending on the real-time data load. A cost-effective auto-scaling (CEAS) framework is first proposed for an Amazon Web Services (AWS) Cloud environment. The proposed CEAS framework allows us to scale the computing resources of a Hadoop cluster so as to either reduce resource use when the workload is low or scale up the computing resources to speed up data processing and analysis within an adequate time. To validate the effectiveness of the proposed framework, a case study with real-time sentiment analysis on the universities’ tweets is provided to analyse the reviews/tweets that people post on social media. Such a dynamic scaling method offers a reference for improving Twitter data analysis in a more cost-effective and flexible way. |
Keywords | big data; cloud computing; Apache Hadoop; Amazon web service; Twitter |
Publisher | MDPI AG |
Journal | Applied Sciences |
ISSN | |
Electronic | 2076-3417 |
Publication dates | |
Online | 04 Apr 2019 |
01 Apr 2019 | |
Publication process dates | |
Deposited | 04 Apr 2019 |
Accepted | 29 Mar 2019 |
Output status | Published |
Publisher's version | License |
Copyright Statement | © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. |
Additional information | Article number = 1417 |
Digital Object Identifier (DOI) | https://doi.org/10.3390/app9071417 |
Web of Science identifier | WOS:000466547500152 |
Language | English |
https://repository.mdx.ac.uk/item/88353