% Retrieved from http://repository.seasr.org/Datasets/UCI/arff/soybean.arff
%
% # Notes: The large soybean database (soybean-large-data.arff) and it's
% corresponding test database (soybean-large-test.arff) combined
% into a single file.
%
% 1. Title: Large Soybean Database
%
% 2. Sources:
% (a) R.S. Michalski and R.L. Chilausky "Learning by Being Told and
% Learning from Examples: An Experimental Comparison of the Two
% Methods of Knowledge Acquisition in the Context of Developing
% an Expert System for Soybean Disease Diagnosis", International
% Journal of Policy Analysis and Information Systems, Vol. 4,
% No. 2, 1980.
% (b) Donor: Ming Tan & Jeff Schlimmer (Jeff.Schlimmer%cs.cmu.edu)
% (c) Date: 11 July 1988
%
% 3. Past Usage:
% 1. See above.
% 2. Tan, M., & Eshelman, L. (1988). Using weighted networks to represent
% classification knowledge in noisy domains. Proceedings of the Fifth
% International Conference on Machine Learning (pp. 121-134). Ann Arbor,
% Michigan: Morgan Kaufmann.
% -- IWN recorded a 97.1% classification accuracy
% -- 290 training and 340 test instances
% 3. Fisher,D.H. & Schlimmer,J.C. (1988). Concept Simplification and
% Predictive Accuracy. Proceedings of the Fifth
% International Conference on Machine Learning (pp. 22-28). Ann Arbor,
% Michigan: Morgan Kaufmann.
% -- Notes why this database is highly predictable
%
% 4. Relevant Information Paragraph:
% There are 19 classes, only the first 15 of which have been used in prior
% work. The folklore seems to be that the last four classes are
% unjustified by the data since they have so few examples.
% There are 35 categorical attributes, some nominal and some ordered. The
% value ``dna'' means does not apply. The values for attributes are
% encoded numerically, with the first value encoded as ``0,'' the second as
% ``1,'' and so forth. An unknown values is encoded as ``?''.
%
% 5. Number of Instances: 683 (one moved to test set)
%
% 6. Number of Attributes: 35 (all have been nominalized)
%
% 7. Attribute Information:
% -- 19 Classes
% diaporthe-stem-canker, charcoal-rot, rhizoctonia-root-rot,
% phytophthora-rot, brown-stem-rot, powdery-mildew,
% downy-mildew, brown-spot, bacterial-blight,
% bacterial-pustule, purple-seed-stain, anthracnose,
% phyllosticta-leaf-spot, alternarialeaf-spot,
% frog-eye-leaf-spot, diaporthe-pod-&-stem-blight,
% cyst-nematode, 2-4-d-injury, herbicide-injury.
%
% 1. date: april,may,june,july,august,september,october,?.
% 2. plant-stand: normal,lt-normal,?.
% 3. precip: lt-norm,norm,gt-norm,?.
% 4. temp: lt-norm,norm,gt-norm,?.
% 5. hail: yes,no,?.
% 6. crop-hist: diff-lst-year,same-lst-yr,same-lst-two-yrs,
% same-lst-sev-yrs,?.
% 7. area-damaged: scattered,low-areas,upper-areas,whole-field,?.
% 8. severity: minor,pot-severe,severe,?.
% 9. seed-tmt: none,fungicide,other,?.
% 10. germination: '90-100%','80-89%','lt-80%',?.
% 11. plant-growth: norm,abnorm,?.
% 12. leaves: norm,abnorm.
% 13. leafspots-halo: absent,yellow-halos,no-yellow-halos,?.
% 14. leafspots-marg: w-s-marg,no-w-s-marg,dna,?.
% 15. leafspot-size: lt-1/8,gt-1/8,dna,?.
% 16. leaf-shread: absent,present,?.
% 17. leaf-malf: absent,present,?.
% 18. leaf-mild: absent,upper-surf,lower-surf,?.
% 19. stem: norm,abnorm,?.
% 20. lodging: yes,no,?.
% 21. stem-cankers: absent,below-soil,above-soil,above-sec-nde,?.
% 22. canker-lesion: dna,brown,dk-brown-blk,tan,?.
% 23. fruiting-bodies: absent,present,?.
% 24. external decay: absent,firm-and-dry,watery,?.
% 25. mycelium: absent,present,?.
% 26. int-discolor: none,brown,black,?.
% 27. sclerotia: absent,present,?.
% 28. fruit-pods: norm,diseased,few-present,dna,?.
% 29. fruit spots: absent,colored,brown-w/blk-specks,distort,dna,?.
% 30. seed: norm,abnorm,?.
% 31. mold-growth: absent,present,?.
% 32. seed-discolor: absent,present,?.
% 33. seed-size: norm,lt-norm,?.
% 34. shriveling: absent,present,?.
% 35. roots: norm,rotted,galls-cysts,?.