CDPSM: A New Optimized Progressive Big Data Analytics for Partial Cancer Data using Amazon EMR

International Journal of Business Analytics and Intelligence

Volume 5 Issue 1

Published: 2017
Author(s) Name: Shyam Mohan J. S.
Author(s) Affiliation: Assistant Professor, Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya, Kanchipuram, Tamil Nadu, India

Abstract

Identifying symptoms and treating cancer requires thorough investigation and research, which in turn requires analysis at multiple levels of the available (partial or full) cancer data. Cancer data is spread across multiple decentralized data sources and data warehouses in different locations. Therefore, only partial data is available. Progressive analytics provides an efficient way to query data from various data clusters, where each cluster contains only a piece of the examined data. We propose an effective framework for performing analytics over the available cancer data, the Cancer Data Progressive Sampling Model (CDPSM), built for partially available cancer data and deployed on Amazon EMR. Through a large number of experiments, we reveal the advantages of the proposed model and give numerical results comparing it with a deterministic model. These results indicate that the proposed model can efficiently reduce the time needed to perform progressive data analytics over partial cancer data while maintaining the quality of the result at a high level.

Keywords: Big Data, Progressive Sampling
