Ihsan Ali Ali Kareem (Assistant Professor)
College of Urban Planning - Environmental Planning
[email protected]
Modified Decision Tree Classification Algorithm for Large Data Sets
Type: Research paper
General specialization: Sciences
Publishing author: Ihsan A. Kareem
Co-authors: Mehdi G. Duaimi
Publishing body: Iraqi Journal of Science
2014, Vol. 55, No. 4A, pp. 1638-1645
Year of publication: 2014


The decision tree is an important classification technique in data mining. Decision trees have proved to be valuable tools for the classification, description, and generalization of data. Work on building decision trees for data sets exists in multiple disciplines, such as signal processing, pattern recognition, decision theory, statistics, machine learning, and artificial neural networks. This research deals with the problem of finding parameter settings for a decision tree algorithm that build accurate, small trees and reduce execution time for a given domain. The proposed approach (mC4.5) is a supervised learning model that constructs a decision tree based on the C4.5 algorithm. The modification to C4.5 has two phases. The first phase discretizes all continuous attributes instead of dealing with numerical values directly. The second phase uses the average gain measure instead of the gain ratio measure to choose the best attribute. The approach was tested on three data sets, all taken from the popular University of California at Irvine (UCI) data repository. The experimental results show that mC4.5 produces trees with fewer total nodes than C4.5 without reducing accuracy, and that it increases the accuracy ratio.
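The two modifications described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the formulas, so two assumptions are made here: continuous attributes are discretized by equal-width binning, and average gain is taken to be the information gain divided by the number of distinct values of the attribute (one definition found in the literature). The function names `equal_width_bins`, `info_gain`, `gain_ratio`, and `average_gain` are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def equal_width_bins(values, k):
    """Phase 1 (assumed method): map continuous values to k equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def info_gain(column, labels):
    """Information gain obtained by splitting labels on a discrete column."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(column, labels):
    """Standard C4.5 criterion: information gain divided by split information."""
    split_info = entropy(column)
    return info_gain(column, labels) / split_info if split_info > 0 else 0.0


def average_gain(column, labels):
    """Phase 2 (assumed definition): gain divided by the number of distinct values."""
    return info_gain(column, labels) / len(set(column))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    temperature = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    play = ["a", "a", "a", "b", "b", "b"]
    bins = equal_width_bins(temperature, 2)
    print("bins:", bins)
    print("gain ratio:", gain_ratio(bins, play))
    print("average gain:", average_gain(bins, play))
```

Under these assumptions, the attribute with the highest average gain would be chosen at each node. Dividing by the number of distinct values penalizes many-valued attributes, as gain ratio does, while avoiding a near-zero split information term in the denominator.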