Browsing by Author "Oleji, Chukwuemeka Philips"
Development of an improved model for big data analytics using dynamic multi-swarm optimization and unsupervised learning algorithms (Federal University of Technology, Owerri, 2021-07; Open Access)
Oleji, Chukwuemeka Philips
An improved model for big data analytics was developed in this work using dynamic multi-swarm optimization and unsupervised machine learning algorithms. Traditional data mining models converge prematurely because of heterogeneous data types and the sheer volume of big data; this problem was solved with the developed Dynamic-K-reference clustering algorithm. The model was implemented in Java, and Python (Jupyter Notebook) and Apache Spark were used to visualize the clustered output. The developed model was used to analyze a big dataset of Boko Haram insurgency attacks in Nigeria, scraped from social media. The dataset attributes used for the analysis were the area of attacks, period of attacks, death tolls, and attack strategies, covering 2008 to May 2019. The clustered results for area of attack were Borno 64%, Abuja 1.3%, Adamawa 1.3%, Gombe 3.8%, Kano 2.5%, Katsina 2.5%, Maiduguri 20%, and Yobe 5%. The clustered results for death tolls by year were 4.1% in 2011, 15.6% in 2012, 3.4% in 2013, 6.0% in 2014, 42.6% in 2015, 0.0% in 2016, 2.8% in 2017, 6.0% in 2018, and 19.5% in 2019. The results show sustained Boko Haram attacks in the study area, which have led to millions of people being displaced or killed. On the Boko Haram attacks dataset, the Dynamic-K-reference clustering algorithm achieved a clustering accuracy of 0.9820 and a clustering sum of squared error of 0.0018.
To validate the Dynamic-K-reference clustering algorithm, its performance was compared with existing algorithms on six datasets from the machine learning repository: Hepatitis, Australian Credit Approval, German Credit Data, Statlog Heart, Soybean, and Yeast. On the first four datasets, the Dynamic-K-reference clustering algorithm improved on the PSO-based K-prototype algorithm by 22%, 17%, 34%, and 12%, respectively. On the Soybean and Yeast datasets, it improved on the existing MixK meansXFon algorithm by 13.8% and 13.7%, respectively. The analysis found the Dynamic-K-reference algorithm to be robust and very efficient at expelling outliers from clusters to which they do not belong. Future work should develop big data analytics services based on the improved Dynamic-K-reference clustering algorithm, and on other improved models of its kind, using a service-oriented architecture for real-time analysis and prediction.
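
For readers unfamiliar with the clustering sum of squared error quoted in the abstract, the sketch below shows the standard definition: the sum, over all points, of the squared Euclidean distance from each point to the centroid of its assigned cluster. It is written in Java because the abstract names Java as the implementation language, but it is not the author's code. The class name `ClusteringSse`, its method names, and the toy data are all assumptions made for illustration, and the abstract does not say whether the reported 0.0018 was normalized (for example, by dataset size or after feature scaling).

```java
// Hypothetical illustration only: not the thesis implementation.
final class ClusteringSse {

    /** Squared Euclidean distance between two equal-length numeric vectors. */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Sum of squared errors: for each point, add the squared distance
     * to the centroid of the cluster that labels[i] assigns it to.
     */
    static double sse(double[][] points, int[] labels, double[][] centroids) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("One label is required per point");
        }
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            total += squaredDistance(points[i], centroids[labels[i]]);
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy data (assumed): two tight clusters of two points each.
        double[][] points = {{0, 0}, {0, 2}, {10, 10}, {10, 12}};
        int[] labels = {0, 0, 1, 1};
        double[][] centroids = {{0, 1}, {10, 11}};
        System.out.println(sse(points, labels, centroids)); // prints 4.0
    }
}
```

A lower value means points sit closer to their assigned centroids, so the measure is only comparable between algorithms when the data, scaling, and number of clusters are the same.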