# Extreme Data Mining: Inference from Small Datasets

## Abstract

Neural networks have been applied successfully in many fields. However, satisfactory results can usually be obtained only when large training samples are available. With small training sets, performance may be poor, or the learning task may not be accomplished at all. This deficiency severely limits the applications of neural networks. The main reason small datasets cannot provide enough information is that there are gaps between samples, and even the domain of the samples cannot be ensured. Several computational intelligence techniques have been proposed to overcome the limits of learning from small datasets.

We have the following goals: (i) to discuss the meaning of "small" in the context of inference from small datasets; (ii) to survey computational intelligence solutions to this problem; (iii) to illustrate the introduced concepts with a real-life application.
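To make the idea of "gaps between samples" concrete, the following minimal sketch shows one generic way of filling such gaps: enlarging a small training set with noisy copies of the real samples. It is an illustration under stated assumptions, not the method presented in this article. The function name `expand_with_noise`, its parameters (`copies`, `scale`, `seed`), and the toy data are all hypothetical, and the choice of Gaussian noise scaled to each feature's observed range is an assumption made only for the example.

```python
import numpy as np

def expand_with_noise(X, y, copies=20, scale=0.05, seed=0):
    """Return the original samples plus `copies` jittered duplicates of each one.

    Hypothetical helper: the noise model (Gaussian, proportional to the
    per-feature range) is an assumption, not taken from the article.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Noise level per feature, proportional to that feature's observed range.
    span = X.max(axis=0) - X.min(axis=0)
    sigma = scale * np.where(span > 0, span, 1.0)
    # Repeat every sample `copies` times and perturb each copy.
    X_rep = np.repeat(X, copies, axis=0)
    y_rep = np.repeat(y, copies, axis=0)
    X_virtual = X_rep + rng.normal(0.0, sigma, size=X_rep.shape)
    # Real samples come first, virtual samples keep the label of their source.
    return np.vstack([X, X_virtual]), np.concatenate([y, y_rep])

# Toy data (made up for illustration): three samples, two features.
X_small = np.array([[0.0, 1.0], [0.5, 0.2], [1.0, 0.8]])
y_small = np.array([0, 1, 0])
X_big, y_big = expand_with_noise(X_small, y_small)
print(X_big.shape, y_big.shape)
```

The virtual samples only populate the neighbourhood of the observed points. They do not add information about regions of the domain that the small sample never covered, which is why the domain of the samples remains uncertain.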

## License

**ONLINE OPEN ACCESS:** Access to the full text of each article and each issue is allowed for free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

**You are free to:**

- Share: copy and redistribute the material in any medium or format;
- Adapt: remix, transform, and build upon the material.

The licensor cannot revoke these freedoms as long as you follow the license terms.

**DISCLAIMER:** The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.