Harnessing supremacy of big data using Hadoop for healthy human survival making use of bioinformatics
Supreet Kaur and Seema Baghla
Abstract
In this paper, an analysis of big data is performed to achieve critical objectives for revolutionizing healthcare and to mine the bioinformatics facet of a particular age group affected by a particular disease. The major challenge is to provide the right care, right living, and right value to the general public by mining and providing available remedies for common and deadly diseases, which can be accomplished by appropriately mining the collected data. In the experimental work, the data mining process was performed on a self-created primary database using the Apache Hadoop framework and the Hadoop-based Hortonworks-Sandbox 2.2.0 data platform with the MapReduce programming model. The results show that the scripts and queries return sorted attributes from the created database, and these attributes provide norms that justify the stated objectives.
Keywords
Apache Hadoop, Hortonworks-sandbox 2.2.0, VMware player, File browser tool, HCatalog tool, Beeswax tool.
Cite this article
Kaur S, Baghla S. Harnessing supremacy of big data using Hadoop for healthy human survival making use of bioinformatics. International Journal of Advanced Technology and Engineering Exploration. 2018; 5(48):460-468. DOI:10.19101/IJATEE.2018.547010