Jasa bantu mengerjakan tugas kuliah teknik informatika, hubungi Meruya Statistik 0812 1944 9060
Jasa konsultasi tesis, disertasi S3 bidang informatika, teknik elektro, teknik sipil, dan berbagai macam riset pascasarjana bidang ilmu teknik.
Informasi lebih lanjut hubungi Meruya Statistik 0812 1944 9060
Resume untuk buku berjudul : “Data Mining: Theory,
Methodology, Techniques, and Applications”
Graham J. Williams Simeon J. Simoff (Eds.)
Resume untuk artikel berjudul : Generality Is Predictive of
Prediction Accuracy
Geoffrey I. Webb1 and Damien Brain2
1 Faculty of Information Technology,
Monash University, Clayton, Vic 3800, Australia
webb@infotech.monash.edu.au
2 UTelco Systems,
Level 50/120 Collins St Melbourne, Vic 3001, Australia
In knowledge acquisition,
the classification rule
can achieve high accuracy on the
training data. However, there is another
trade-off that will also be inherent
since the more specific rule will make fewer predictions on unseen
cases. It is also known that a
classifier has an option of not making predictions to create a system that makes fewer
decisions of higher expected quality.
When the accuracy of the rules on the training data is high, specializing
the rules can be expected to raise their accuracy on unseen data towards that
obtained on the training data.
Where a classifier must always make decisions and
maximization of prediction accuracy is desired, the rules for the class that
occurs most frequently should be generalized at the expense of rules for
alternative classes. This is because as each rule is generalized it will trend
towards the accuracy of a default rule for that class, which will be highest
for rules of the most frequently occurring class.
It is also the alternative sources of information that might
be brought to bear upon such decisions. We have emphasized that our hypotheses
relate only to contexts to distinguish
between the expected accuracy of two rules other than their relative
generality to derive such evidence from
training data.
During knowledge acquisition, multiple alternative potential rules all
appear equally credible. In comparison to the general rule, the accuracy of the unseen cases will tend to be closer to the
accuracy obtained on training data.
the accuracy of classification rules is
likely to be closer to the accuracy on unseen data of a default rule for
the class than will the accuracy on unseen data of the more specific rule. By
using classification rules formed by
C4.5rules and random classification rules may develop learning biases based on rule generality
that do not rely upon prior domain knowledge, and may be sensitive to
alternative knowledge acquisition objectives, such as trading-off accuracy for
cover. The frequent existence of rule variants between
which traditional rule quality metrics, such as an information measures, could
not be distinguished.
In knowledge
acquisition , the classification
rules sometimes uses the training
data. If we are selecting rules to use
for some decision making task, we must select between such rules with identical
performance on the training data. To do so, it needs learning algorithms that learn rule sets for the purpose of
prediction by making arbitrary choices
between rules with equivalent performance on the training data. This
masking in machine learning provides support for identification and
selection between such rule variants.
References
1. Mitchell,
T.M.: Version spaces: A candidate elimination approach to rule learning. In:
Proceedings of the Fifth International Joint Conference on Artificial Intelligence.(1977)
305–310
2. Mitchell,
T.M.: The need for biases in learning generalizations. Technical Report
CBM-TR-117, Rutgers University, Department of Computer Science, New Brunswick,
NJ (1980)
3. Webb,
G.I.: Integrating machine learning with knowledge acquisition through direct
interaction with domain experts. Knowledge-Based Systems 9 (1996) 253–266
4. Webb,
G.I., Wells, J., Zheng, Z.: An experimental evaluation of integrating machine
learning with knowledge acquisition. Machine Learning 35 (1999) 5–24
5. Wolpert,
D.H.: On the connection between in-sample testing and generalization error.
Complex Systems 6 (1992) 47–94
6. Schaffer,
C.: A conservation law for generalization performance. In: Proceedings of the
1994 International Conference on Machine Learning, Morgan Kaufmann (1994)
7. Rendell,
L., Seshu, R.: Learning hard concepts through constructive induction: Framework
and rationale. Computational Intelligence 6 (1990) 247–270
8. Webb,
G.I.: Further experimental evidence against the utility of Occam’s razor.
Journal of Artificial Intelligence Research 4 (1996) 397–417
9. Quinlan,
J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA
(1993)
10. Webb,
G.I.: OPUS: An efficient admissible algorithm for unordered search. Journal of
Artificial Intelligence Research 3 (1995) 431–465
11. Blake, C.,
Merz, C.J.: UCI repository of machine learning databases. [Machine-readable
data repository]. University of California, Department of Information and
Computer Science, Irvine, CA. (2004)
12. Pazzani,
M.J., Murphy, P., Ali, K., Schulenburg, D.: Trading off coverage for accuracy
in forecasts: Applications to clinical data analysis. In: Proceedings of the
AAAI Symposium on Artificial Intelligence in Medicine. (1994) 106–110
13. Compton,
P., Edwards, G., Srinivasan, A., Malor, R., Preston, P., Kang, B., Lazarus, L.:
Ripple down rules: Turning knowledge acquisition into knowledge maintenance.
Artificial Intelligence in Medicine 4 (1992) 47–59
14. Blumer,
A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Occam’s Razor. Information
Processing Letters 24 (1987) 377–380
15. Domingos,
P.: The role of Occam’s razor in knowledge discovery. Data Mining and Knowledge
Discovery 3 (1999) 409–425
Tidak ada komentar:
Posting Komentar