A Discrete Formulation of a Unified Data Mining Model
Abstract
A race is under way to process large and complex volumes of data, and researchers in data analytics are proposing different models and methods to that end. In parallel, substantial research has gone into simplifying the mathematical models used to validate extracted knowledge and to establish its level of acceptance. In this paper, we propose a discrete formulation of a unified data mining model. The model treats knowledge extraction as a multi-step process in which data mining processes such as clustering, classification and visualization are unified in a cascade: the output of one process is the input to the next, which helps to achieve scalability and flexibility on a larger scale. To establish whether the proposed model is valid, we evaluate it using discrete structures. We construct mathematical formulations of the model and examine each one in detail using truth tables and their truth values. The truth tables show that the evaluated formulations are valid and correct.
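To illustrate the kind of truth-table check the abstract describes, the following minimal sketch tests whether a propositional formula holds under every assignment of truth values. The abstract does not state the paper's own formulations, so the propositions here are hypothetical: `p` stands for "clustering produced output", `q` for "classification produced output", and `r` for "visualization produced output". The formula encodes the cascade idea as a hypothetical syllogism and is only an assumed example.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b."""
    return (not a) or b


def truth_table(formula, n_vars):
    """Return a list of (assignment, result) rows for every truth assignment."""
    return [
        (values, formula(*values))
        for values in product([True, False], repeat=n_vars)
    ]


def is_tautology(formula, n_vars):
    """A formula is valid if it evaluates to True in every row."""
    return all(result for _, result in truth_table(formula, n_vars))


def cascade(p: bool, q: bool, r: bool) -> bool:
    """Hypothetical cascade formula: ((p -> q) and (q -> r)) -> (p -> r)."""
    return implies(implies(p, q) and implies(q, r), implies(p, r))


if __name__ == "__main__":
    # Print the full truth table, one row per assignment of (p, q, r).
    for values, result in truth_table(cascade, 3):
        print(values, result)
    print("valid:", is_tautology(cascade, 3))
```

A formula that is false in at least one row, such as a bare implication `p -> q`, would be reported as not valid by the same check.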