Asymptotically Unbiased Estimation of A Nonsymmetric Dependence Measure Applied to Sensor Data Analytics and Financial Time Series

Authors

  • Angel Cațaron, Department of Electronics and Computers, Transilvania University of Brasov, Romania
  • Razvan Andonie, Central Washington University
  • Yvonne Chueh, Department of Mathematics, Central Washington University, USA

Keywords:

machine learning, sensor data analytics, financial time series, statistical inference, information energy, nonsymmetric dependence measure, big data analytics

Abstract

A fundamental task in statistical machine learning is detecting dependencies between random variables whose distributions are unknown and must be inferred from data samples. In previous work, we introduced a nonparametric unilateral dependence measure based on Onicescu’s information energy, together with a kNN method for estimating this measure from an available sample set of discrete or continuous variables. This paper provides formal proofs that the estimator is asymptotically unbiased and that its variance tends to zero as the sample size increases. Together, these two properties imply that the estimator is consistent. We investigate the performance of the estimator in sensor data analysis and financial time series applications.
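As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch, not the authors' implementation. It uses the standard definition of Onicescu's information energy for a continuous variable, IE(X) = ∫ f(x)² dx = E[f(X)], and approximates it by averaging a kNN density estimate over the sample. The function name `knn_information_energy`, the one-dimensional setting, the default `k=10`, and the bias-corrected density form (k-1)/((n-1)·2·R) are assumptions made for this example; the exact estimator and its proofs are in the paper itself.

```python
import math
import random


def knn_information_energy(sample, k=10):
    """Sketch: estimate IE(X) = E[f(X)] for a 1-D continuous sample.

    Assumption (not taken from the paper): the density at each point is
    approximated by (k - 1) / ((n - 1) * 2 * R), where R is the distance
    to the k-th nearest neighbour and 2 * R is the length of the 1-D ball.
    """
    n = len(sample)
    if k < 2 or n <= k:
        raise ValueError("need 2 <= k < len(sample)")
    xs = sorted(sample)
    total = 0.0
    for i, x in enumerate(xs):
        # In sorted order, the k nearest neighbours of xs[i] are always
        # among the k positions to its left and the k positions to its right.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        dists = sorted(abs(xs[j] - x) for j in range(lo, hi) if j != i)
        r = dists[k - 1]  # distance to the k-th nearest neighbour
        if r == 0.0:
            raise ValueError("tied values: this sketch assumes continuous data")
        total += (k - 1) / ((n - 1) * 2.0 * r)
    # Averaging the density estimates approximates E[f(X)].
    return total / n
```

For a standard normal sample the true value is 1/(2√π), so calling `knn_information_energy` on a few thousand draws from `random.gauss(0.0, 1.0)` and comparing the result against `1 / (2 * math.sqrt(math.pi))` gives a quick sanity check of how the estimate behaves as the sample size grows.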

Author Biography

Razvan Andonie, Central Washington University

Executive Editor

References

Andonie R., Cațaron A. (2004), An informational energy LVQ approach for feature ranking, European Symposium on Artificial Neural Networks 2004, d-side publications, 471-476, 2004.

Andonie R. (1986), Interacting systems and informational energy, Foundation of Control Engineering, 11, 53-59, 1986.

Bonachela J.A., Hinrichsen H., Munoz M.A. (2008), Entropy estimates of small data sets, J. Phys. A: Math. Theor., 41(20), 1-20, 2008.

Cațaron A., Andonie R., Chueh Y. (2013), Asymptotically unbiased estimator of the informational energy with kNN, International Journal of Computers Communications & Control, 8(5), 689-698, 2013. https://doi.org/10.15837/ijccc.2013.5.643

Cațaron A., Andonie R. (2012), How to infer the informational energy from small datasets, Optimization of Electrical and Electronic Equipment (OPTIM), 2012 13th International Conference on, 1065-1070, 2012.

Cațaron A., Andonie R., Chueh Y. (2014), kNN estimation of the unilateral dependency measure between random variables, 2014 IEEE Symposium on Computational Intelligence and Data Mining, (CIDM 2014), Orlando, FL, USA, 471-478, 2014.

Cațaron A., Andonie R., Chueh Y. (2015), Financial data analysis using the informational energy unilateral dependency measure, Proceedings of the International Joint Conference on Neural Networks, (IJCNN 2015), Killarney, Ireland, 1-8, 2015. https://doi.org/10.1109/ijcnn.2015.7280734

Chueh Y., Cațaron A., Andonie R. (2016), Mortality rate modeling of joint lives and survivor insurance contracts tested by a novel unilateral dependence measure, 2016 IEEE Symposium Series on Computational Intelligence, SSCI 2016, Athens, Greece, December 6-9, 2016, 1-8, 2016. https://doi.org/10.1109/SSCI.2016.7850023

Faivishevsky L., Goldberger J. (2008), ICA based on a smooth estimation of the differential entropy, NIPS, 1-8, 2008.

Gamez J.E., Modave F., Kosheleva O. (2008), Selecting the most representative sample is NP-hard: Need for expert (fuzzy) knowledge, Fuzzy Systems, 2008. FUZZ-IEEE 2008. (IEEE World Congress on Computational Intelligence). IEEE International Conference on, 1069-1074, 2008.

Guiasu S. (1977), Information theory with applications, McGraw Hill, New York, 1977.

Hogg R.V., McKean J., Craig A.T. (2006), Introduction To Mathematical Statistics, 6/E, Pearson Education, 2006.

Kozachenko L. F., Leonenko N. N. (1987), Sample estimate of the entropy of a random vector, Probl. Peredachi Inf., 23(2), 9-16, 1987.

Kraskov A., Stögbauer H., Grassberger P. (2004), Estimating mutual information, Phys. Rev. E, 69, 1-16, 2004. https://doi.org/10.1103/PhysRevE.69.066138

Li H. (2015), On nonsymmetric nonparametric measures of dependence, arXiv:1502.03850, 2015.

Lohr S.L. (1999), Sampling: Design and Analysis, Duxbury Press, 1999.

Miller M., Miller M. (2003), John E. Freund's mathematical statistics with applications, Pearson/Prentice Hall, Upper Saddle River, New Jersey, 7th edition, 2003.

Onicescu O. (1966), Theorie de l'information. Energie informationelle, C. R. Acad. Sci. Paris, Ser. A-B, 263, 841-842, 1966.

Paninski L. (2003), Estimation of entropy and mutual information, Neural Comput., 15, 1191-1253, 2003. https://doi.org/10.1162/089976603321780272

Schweizer B., Wolff E. F. (1981), On nonparametric measures of dependence for random variables, Ann. Statist., 9:879-885, 1981. https://doi.org/10.1214/aos/1176345528

Silverman B.W. (1986), Density Estimation for Statistics and Data Analysis (Chapman & Hall/CRC Monographs on Statistics & Applied Probability), Chapman and Hall/CRC, 1986.

Singh H., Misra N., Hnizdo V., Fedorowicz A., Demchuk E. (2003), Nearest neighbor estimates of entropy, American Journal of Mathematical and Management Sciences, 23, 301-321, 2003. https://doi.org/10.1080/01966324.2003.10737616

Walters-Williams J., Li Y. (2009), Estimation of mutual information: A survey, Proceedings of the 4th International Conference on Rough Sets and Knowledge Technology, Springer-Verlag, Berlin, Heidelberg, 389-396, 2009. https://doi.org/10.1007/978-3-642-02962-2_49

Wang Q., Kulkarni S. R., Verdu S. (2006), A nearest-neighbor approach to estimating divergence between continuous random vectors, Proc. of the IEEE International Symposium on Information Theory, Seattle, WA, 242-246, 2006. https://doi.org/10.1109/isit.2006.261842

Published

2017-06-29
