Random Forest (RF) is regarded as one of the best classifiers in many fields. RF ensembles are parallelizable, fast to train and to predict, robust to outliers, able to handle unbalanced data, and have low bias and moderate variance. Despite these advantages, there is still room to improve RF efficiency. Because there is no recommendation on how many trees an RF ensemble should contain, the number of trees can become very large, which increases the computational complexity of RF. The common recommendation not to prune the decision trees aggravates the problem further. This research attempts to build an efficient RF ensemble while maintaining its accuracy, specifically for the activity classification problem. Data are collected using the accelerometer sensor of a smartphone. The data come from five people who each perform 11 different activities, and each activity is carried out five times to enrich the data. This study uses two steps to improve the efficiency of activity classification: 1) selecting the optimal splitting criterion for activity classification, and 2) measured pruning to limit the tree depth in the RF ensemble. The first step determines the splitting criterion most suitable for classifying activities with Random Forest; in this case, the decision model built using the Gini Index produces the highest accuracy. The second step successfully builds less complex pruned trees without reducing classification accuracy. The results show that the method applied to Random Forest in this study produces a decision model that is simple yet accurate for classifying activities.
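The splitting criteria compared in the first step can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`gini`, `entropy`, `impurity_decrease`) and the toy activity labels are assumptions made for illustration only. The sketch shows how the Gini Index and entropy each score the same candidate split.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def impurity_decrease(parent, left, right, criterion):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * criterion(left) + (len(right) / n) * criterion(right)
    return criterion(parent) - weighted


# Hypothetical activity labels at one tree node and one candidate split.
parent = ["walk", "walk", "run", "run", "sit", "sit"]
left = ["walk", "walk", "run"]
right = ["run", "sit", "sit"]

gini_gain = impurity_decrease(parent, left, right, gini)
entropy_gain = impurity_decrease(parent, left, right, entropy)
print(f"Gini decrease: {gini_gain:.4f}, entropy decrease: {entropy_gain:.4f}")
```

In a library such as scikit-learn, the two steps would roughly correspond to the `criterion` and `max_depth` parameters of `RandomForestClassifier`; whether the study used that library is not stated in the abstract.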
Tag: Random Forest
by Endang Anggiratih and Agfianto Eko Putra
Abstract
Ship identification in satellite imagery can be used for fisheries management, monitoring of smuggling activities, ship traffic services, and naval warfare. However, in high-resolution satellite imagery it is difficult to segment ships from the background, so reliable features are needed to distinguish adequately between large vessels, small vessels, and non-ships. The Convolutional Neural Network (CNN) method has the advantage of extracting features automatically and producing reliable features that facilitate ship identification. This study combines the CNN ZFNet architecture with the Random Forest method. Training was conducted to determine which ZFNet layers produce the best features, as indicated by high accuracy when combined with the Random Forest method. The combination was tested with two parameters, namely batch size and number of trees. In testing, large vessels were identified with an accuracy of 87.5% and small vessels with an accuracy below 50%.
(for more information please click https://doi.org/10.22146/ijccs.37461)
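Because the abstract reports accuracy separately for large and small vessels, a per-class accuracy computation may help in reading those figures. The following is a minimal sketch under stated assumptions: the function name `per_class_accuracy`, the class names, and the toy predictions are hypothetical and do not reproduce the study's data or results.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Hypothetical labels for illustration only (not the study's data).
y_true = ["large", "large", "small", "small", "not_ship", "not_ship"]
y_pred = ["large", "large", "small", "not_ship", "not_ship", "not_ship"]

scores = per_class_accuracy(y_true, y_pred)
for cls, acc in scores.items():
    print(f"{cls}: {acc:.1%}")
```

Reporting accuracy per class in this way makes a weak class visible even when the overall accuracy looks acceptable.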