Asymptotically Unbiased Estimation of A Nonsymmetric Dependence Measure Applied to Sensor Data Analytics and Financial Time Series

  • Angel Cațaron, Department of Electronics and Computers, Transilvania University of Brasov, Romania
  • Razvan Andonie, Central Washington University
  • Yvonne Chueh, Department of Mathematics, Central Washington University, USA

Abstract

Detecting dependencies between random variables whose distributions are unknown and must be inferred from data samples is a fundamental task in statistical machine learning. In previous work, we introduced a nonparametric unilateral dependence measure based on Onicescu’s information energy, together with a kNN method for estimating this measure from a sample of discrete or continuous variables. This paper provides formal proofs that the estimator is asymptotically unbiased and that its variance tends to zero as the sample size increases. These two properties imply that the estimator has good statistical qualities. We investigate the performance of the estimator in two data analysis applications: sensor data analytics and financial time series.
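
For orientation, Onicescu's information energy of a continuous variable with density f is the integral of f(x)^2, which equals the expected value of f(X). The sketch below illustrates the general idea of a nearest-neighbor estimate of that quantity in one dimension: each sample's density is approximated from the distance to its k-th nearest neighbor, and the approximations are averaged. It is a generic plug-in illustration, not the estimator defined in the paper; the function name, the choice k = 5, the (k - 1)/(n - 1) scaling, and the uniform test data are all assumptions made for this example.

```python
import random


def knn_information_energy_1d(samples, k=5):
    """Rough kNN estimate of the integral of f(x)^2 for 1-D continuous data.

    Hypothetical helper for illustration only; not the paper's estimator.
    """
    n = len(samples)
    if k < 2 or n <= k:
        raise ValueError("need 2 <= k < number of samples")
    xs = sorted(samples)
    total = 0.0
    for i, x in enumerate(xs):
        # In sorted 1-D data, the k nearest neighbours of xs[i]
        # always lie within k positions on either side of i.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        dists = sorted(abs(xs[j] - x) for j in range(lo, hi) if j != i)
        r_k = dists[k - 1]  # distance to the k-th nearest neighbour
        if r_k == 0.0:
            raise ValueError("duplicate samples: this sketch assumes continuous data")
        # Density approximation at x: about k - 1 of the other n - 1 points
        # fall inside an interval of length 2 * r_k centred at x.
        total += (k - 1) / ((n - 1) * 2.0 * r_k)
    # Averaging the density approximations over the sample
    # approximates E[f(X)], i.e. the integral of f(x)^2.
    return total / n


if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(2000)]  # uniform on [0, 1]
    print(knn_information_energy_1d(data, k=5))
```

For data drawn uniformly from [0, 1] the true value is 1, so the printed estimate should land near 1 and tighten as the sample grows, which is the behavior that asymptotic unbiasedness and vanishing variance describe.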

Author Biography

Razvan Andonie, Central Washington University
Executive Editor

Published

2017-06-29

How to Cite

CAȚARON, Angel; ANDONIE, Razvan; CHUEH, Yvonne. Asymptotically Unbiased Estimation of A Nonsymmetric Dependence Measure Applied to Sensor Data Analytics and Financial Time Series. INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, [S.l.], v. 12, n. 4, p. 475-491, june 2017. ISSN 1841-9844. Available at: <http://univagora.ro/jour/index.php/ijccc/article/view/2928>. Date accessed: 05 july 2020. doi: https://doi.org/10.15837/ijccc.2017.4.2928.

Keywords

machine learning, sensor data analytics, financial time series, statistical inference, information energy, nonsymmetric dependence measure, big data analytics