    • Arabic text classification methods: Systematic literature review of primary studies

      Alabbas, Waleed; al-Khateeb, Haider M.; Mansour, Ali (IEEE, 2017-01-05)
      Recent research on Big Data has proposed and evaluated a number of advanced techniques to gain meaningful information from the complex and large volume of data available on the World Wide Web. To achieve accurate text analysis, a process is usually initiated with a Text Classification (TC) method. Reviewing the very recent literature in this area shows that most studies focus on English (and other scripts), while attempts at classifying Arabic texts remain relatively limited. Hence, we intend to contribute the first Systematic Literature Review (SLR) that strictly follows a search protocol to summarize key characteristics of the different TC techniques and methods used to classify Arabic text. This work also aims to identify and share scientific evidence of the gap in the current literature to help suggest areas for further research. Our SLR explicitly investigates empirical evidence as a decision factor for including studies, then concludes which classifier produced more accurate results. Further, our findings identify the lack of standardized corpora for Arabic text; authors compile their own, and most of the work is focused on Modern Arabic, with very little done on Colloquial Arabic despite its wide use in Social Media Networks such as Twitter. In total, 1464 papers were surveyed, from which 48 primary studies were included and analyzed.
    • Comprehensive review of cybercrime detection techniques

      Al-Khater, Wadha Abdullah; Al-Ma’adeed, Somaya; Ahmed, Abdulghani Ali; Al-Shakarchi, Ali; Khan, Muhammad Khurram (IEEE, 2020-07-22)
      Cybercrimes describe cases of indictable offences and misdemeanours in which computers or other communication tools are involved as targets, as instruments of commission, or incidentally, or which are associated with the prevalence of computer technology. Common forms of cybercrime include child pornography, cyberstalking, identity theft, cyber laundering, credit card theft, cyber terrorism, drug sale, data leakage, sexually explicit content, phishing and other cyber hacking. These kinds of cybercrimes mostly lead to breaches of users’ privacy, security violations, business loss, financial fraud, or damage to public and government property. Hence, this paper intensively reviews cybercrime detection and prevention methods. It first explores the different types of cybercrimes, then discusses their threats to privacy and security in computer systems. It also describes the strategies that cybercriminals might utilize in committing these crimes against individuals, organizations, and societies. The paper then reviews the existing techniques of cybercrime detection and prevention. It objectively discusses the strengths and critically analyses the vulnerabilities of each technique. As a future study, the paper provides recommendations for the development of a cybercrime detection model capable of detecting cybercrime more effectively than the existing techniques.
    • A conditional opposition-based particle swarm optimization for feature selection

      Too, Jingwei; Sadiq, Ali Safaa; Mirjalili, Seyed Mohammad (Taylor & Francis, 2021-11-22)
      Because of the existence of irrelevant, redundant, and noisy attributes in large datasets, the accuracy of a classification model can degrade. Hence, feature selection is a necessary pre-processing stage to select the important features that may considerably increase the efficiency of the underlying classification algorithms. As a popular metaheuristic algorithm, particle swarm optimization has been successfully applied in various feature selection approaches. Nevertheless, particle swarm optimization tends to suffer from premature convergence and a low convergence rate. Besides, the imbalance between exploration and exploitation is another key issue that can significantly affect the performance of particle swarm optimization. In this paper, a conditional opposition-based particle swarm optimization is proposed and used to develop a wrapper feature selection method. Two schemes, namely opposition-based learning and a conditional strategy, are introduced to enhance the performance of particle swarm optimization. Twenty-four benchmark datasets are used to validate the performance of the proposed approach. Furthermore, nine metaheuristics are chosen for performance verification. The findings show the superiority of the proposed approach not only in obtaining high prediction accuracy but also in selecting small feature subsets.
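As a rough illustration of the opposition-based learning idea named in this abstract, the sketch below evaluates a candidate position together with its "opposite" point inside the search bounds and keeps the better of the two. It is only a minimal sketch under stated assumptions: the abstract does not describe the conditional strategy or the wrapper classifier, so neither is reproduced here, and `toy_fitness`, `INFORMATIVE`, and the 0.5 selection threshold are hypothetical stand-ins for a real wrapper objective.

```python
# Hypothetical: pretend features 0 and 2 are the only informative ones.
INFORMATIVE = {0, 2}


def opposite(position, lower, upper):
    """Opposite point in the box [lower, upper]: x_opp[d] = lower[d] + upper[d] - x[d]."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def selected(position, threshold=0.5):
    """Map a continuous position to a feature subset (assumed thresholding rule)."""
    return {d for d, x in enumerate(position) if x > threshold}


def toy_fitness(position):
    """Toy objective to minimise: missed informative features plus a small size penalty."""
    chosen = selected(position)
    return len(INFORMATIVE - chosen) + 0.1 * len(chosen)


def keep_better(position, lower, upper, fitness):
    """Evaluate a candidate and its opposite; return whichever has the lower fitness."""
    return min(position, opposite(position, lower, upper), key=fitness)


particle = [0.1, 0.9, 0.2, 0.8]          # selects features {1, 3}
lower, upper = [0.0] * 4, [1.0] * 4
best = keep_better(particle, lower, upper, toy_fitness)
print(sorted(selected(best)))
```

In a full optimizer this comparison would typically be applied to some or all particles during initialization or between iterations; when and to which particles it is applied is a design choice the abstract leaves open.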
    • Energy efficient resource allocation strategy in massive IoT for industrial 6G applications

      Mukherjee, Amrit; Goswami, Pratik; Khan, Mohammad Ayoub; Manman, Li; Yang, Lixia; Pillai, Prashant (Institute of Electrical and Electronics Engineers (IEEE), 2020-11-03)
      The birth of beyond 5G (B5G) and the emergence of 6G have made personal and industrial operations more reliable, efficient, and profitable, accelerating the development of the next-generation Internet of Things (IoT). One of the most important key performance indicators in 6G is smart network architecture, and in massive IoT applications, energy-efficient ubiquitous networks rely mainly on intelligence and automation for industrial applications. This paper addresses the energy consumption problem with a massive IoT system model with dynamic network architecture or clustering using a multi-agent system (MAS) for industrial 6G applications. The work uses distributed artificial intelligence (DAI) to cluster the sensor nodes in the system to find the main node and predict its location. The work initially uses the back-propagation neural network (BPNN) and convolutional neural network (CNN), which are respectively introduced for optimization. Furthermore, the work analyzes the correlation of mutual clusters to allocate resources to individual nodes in each cluster efficiently. The simulation results show that the proposed method reduces the waste of resources caused by redundant data and improves the energy efficiency of the whole network while preserving information.
    • Incremental algorithm for association rule mining under dynamic threshold

      Aqra, Iyad; Abdul Ghani, Norjihan; Maple, Carsten; Machado, José; Sohrabi Safa, Nader (MDPI AG, 2019-12-10)
      Data mining is essentially applied to discover new knowledge from a database through an iterative process. The mining process may be time consuming for massive datasets. A widely used method in the knowledge discovery domain is association rule mining (ARM), despite its shortcomings in mining large databases. As such, several approaches have been prescribed to unravel knowledge. Most of the proposed algorithms address data incremental issues, especially when a large amount of data is added to the database after the latest mining process. The three basic manipulation operations performed in a database are add, delete, and update. Any method devised in light of data incremental issues is bound to embed these three operations. The changing threshold is a long-standing problem within the data mining field. Since decision making is an active process, the threshold is indeed changeable. Accordingly, the present study proposes an algorithm that resolves the issue of rescanning a database that had been mined previously and allows retrieval of knowledge that satisfies several thresholds without the need to restart the process from scratch. The proposed approach displayed high accuracy in experimentation, as well as a reduction in processing time of almost two-thirds of the original mining execution time.
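The general idea of answering several support thresholds without rescanning the database can be sketched as follows. This is not the authors' algorithm, whose details the abstract does not give; it is a minimal illustration that caches itemset counts from a single scan, filters the cache for any threshold, and updates it from newly added transactions only. The function names, the brute-force enumeration up to `max_size` items, and the sample transactions are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Scan the transactions once and count every itemset of up to max_size items."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(counts, n_transactions, min_support):
    """Filter the cached counts by a support threshold; the database is not rescanned."""
    return {
        itemset: count / n_transactions
        for itemset, count in counts.items()
        if count / n_transactions >= min_support
    }


def add_transactions(counts, new_transactions, max_size=2):
    """Incremental update: count only the newly added transactions."""
    counts.update(count_itemsets(new_transactions, max_size))
    return counts


database = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
cache = count_itemsets(database)

# Two different thresholds are answered from the same cached counts.
strict = frequent_itemsets(cache, len(database), min_support=0.75)
relaxed = frequent_itemsets(cache, len(database), min_support=0.5)

# New data arrives: only the new transaction is scanned.
cache = add_transactions(cache, [["bread", "milk"]])
```

Counting every itemset is exponential in `max_size`, so a practical system would prune candidates, for example by mining once at a low support floor; deletions and updates would need a matching decrement step, which is omitted here.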
    • A LogitBoost-based algorithm for detecting known and unknown web attacks

      Kamarudin, Muhammad Hilmi; Maple, Carsten; Watson, Tim; Safa, Nader Sohrabi (Institute of Electrical and Electronics Engineers (IEEE), 2017-11-03)
      The rapid growth in the volume and importance of web communication throughout the Internet has heightened the need for better security protection. Security experts, when protecting systems, maintain a database featuring signatures of a large number of attacks to assist with attack detection. However, used in isolation, this can limit the capability of the system, as it is only able to recognize known attacks. To overcome the problem, we propose an anomaly-based intrusion detection system using an ensemble classification approach to detect unknown attacks on web servers. The process involves removing irrelevant and redundant features utilising a filter and wrapper selection procedure. LogitBoost is then employed together with random forests as a weak classifier. The proposed ensemble technique was evaluated with some artificial data sets, namely NSL-KDD, an improved version of the old KDD Cup from 1999, and the recently published UNSW-NB15 data set. The experimental results show that our approach demonstrates superiority in terms of accuracy and detection rate over the traditional approaches, whilst preserving low false rejection rates.