Detection of unsolicited web browsing with clustering and statistical analysis
PhD thesis
Chwalinski, P. 2014. Detection of unsolicited web browsing with clustering and statistical analysis. PhD thesis. Middlesex University, School of Science and Technology.
Type | PhD thesis |
---|---|
Title | Detection of unsolicited web browsing with clustering and statistical analysis |
Authors | Chwalinski, P. |
Abstract | Unsolicited web browsing denotes the illegitimate accessing or processing of web content. The harmful activity ranges from extracting e-mail information to downloading an entire website for duplication. In addition, computer criminals prevent legitimate users from gaining access to websites by implementing a denial of service attack with high-volume legitimate traffic. These offences are accomplished by preprogrammed machines that avoid rate-dependent intrusion detection systems. Therefore, it is assumed in this thesis that the only difference between a legitimate and a malicious web session is in the intention rather than in physical characteristics or network-layer information. As a result, the main aim of this research has been to provide a method of malicious intention detection. This has been accomplished by a two-fold process. First, to discover the most recent and popular transitions of lawful users, a clustering method based on entropy minimisation has been introduced. In principle, by following popular transitions among the web objects, the legitimate users are placed in low-entropy clusters, as opposed to the undesired hosts, whose transitions are uncommon and lead to placement in high-entropy clusters. In addition, by comparing distributions of sequences of requests generated by the actual and malicious users across the clusters, it is possible to discover whether or not a website is under attack. Second, a set of statistical measurements has been tested to detect the actual intention of browsing hosts. The intention classification based on Bayes factors and likelihood analysis has provided the best results. The combined approach has been validated against actual web traces (i.e. datasets) and has generated promising results. |
Department name | School of Science and Technology |
Institution name | Middlesex University |
Publication dates | |
20 Feb 2015 | |
Publication process dates | |
Deposited | 20 Feb 2015 |
Completed | 2014 |
Output status | Published |
Accepted author manuscript | |
Language | English |
https://repository.mdx.ac.uk/item/84z70