Title: | Associated Kernel Estimations |
---|---|
Description: | Continuous and discrete (count or categorical) estimation of density, probability mass function (p.m.f.) and regression functions are performed using associated kernels. The cross-validation technique and the local Bayesian procedure are also implemented for bandwidth selection. |
Authors: | W. E. Wansouwé, S. M. Somé and C. C. Kokonendji |
Maintainer: | W. E. Wansouwé <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2025-03-08 03:14:34 UTC |
Source: | https://github.com/cran/Ake |
Continuous and discrete estimation of density (dke.fun), probability mass function (p.m.f.) (kpmfe.fun) and regression (reg.fun) functions are performed using continuous and discrete associated kernels. The cross-validation technique (hcvc.fun, hcvreg.fun) and the Bayesian procedure (hbay.fun) are also implemented for bandwidth selection.
The associated kernel estimator $\widehat{f}_n$ of $f$ is defined as
$$\widehat{f}_n(x) = \frac{1}{n}\sum_{i=1}^{n} K_{x,h}(X_i),$$
where $K_{x,h}$ is one of the kernels kef defined below.
In practice, we first calculate the global normalizing constant
$$C_n = \int_{x\in T} \widehat{f}_n(x)\,\nu(dx),$$
where $T$ is the support of the density or p.m.f. function and $\nu$ is the Lebesgue or count measure on $T$. For both continuous and discrete associated kernels, this normalizing constant is not generally equal to 1 and it will be computed. The represented density or p.m.f. estimate is then $\tilde{f}_n = \widehat{f}_n / C_n$.
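As a rough illustration of the estimator and its normalizing constant, the base R sketch below averages a gamma kernel over the sample, approximates the normalizing constant by the trapezoidal rule on a truncated grid, and rescales the estimate. The gamma kernel is written with dgamma (shape 1 + x/h, scale h, as in Chen (2000)). The helper names gamma_kernel, f_hat and trapz, the grid, and the truncation point are assumptions made for this example and are not part of the package.

```r
# Gamma associated kernel K_{x,h}(y): gamma density with shape 1 + x/h, scale h.
# Hypothetical helper, not the package's kef().
gamma_kernel <- function(x, y, h) dgamma(y, shape = 1 + x / h, scale = h)

# Unnormalized estimate: f_hat(x) = (1/n) * sum_i K_{x,h}(X_i)
f_hat <- function(x, data, h) {
  sapply(x, function(x0) mean(gamma_kernel(x0, data, h)))
}

# Trapezoidal rule on a grid, used here to approximate C_n
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

set.seed(1)
V <- rgamma(100, 1.5, 2.6)
h <- 0.052
grid <- seq(0, max(V) + 1, length.out = 400)  # truncated support (assumption)

fn       <- f_hat(grid, V, h)   # unnormalized estimate on the grid
C_n      <- trapz(grid, fn)     # approximate global normalizing constant
fn_tilde <- fn / C_n            # normalized estimate
```

By construction, fn_tilde integrates to 1 on the grid under the same trapezoidal rule.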
For discrete data, the integrated squared error (ISE) defined by
$$ISE_0 = \sum_{x\in\mathbb{N}} \{\tilde{f}_n(x) - f_0(x)\}^2$$
is the criterion used to measure how closely the normalized associated kernel estimator $\tilde{f}_n$ follows the empirical p.m.f. $f_0$; see Kokonendji and Senga Kiessé (2011).
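The ISE computation can be sketched in base R for a small count sample. The binomial kernel is written with dbinom (x + 1 trials, success probability (x + h)/(x + 1)), the support is truncated to 0, ..., max(V), and the sample itself is made up; the helper name binom_kernel is hypothetical and this is not the package's own code.

```r
# Binomial kernel B_{x,h}(y): binomial p.m.f. with x + 1 trials and
# success probability (x + h) / (x + 1). Hypothetical helper.
binom_kernel <- function(x, y, h) dbinom(y, size = x + 1, prob = (x + h) / (x + 1))

V <- c(0, 1, 1, 2, 2, 2, 3, 4, 4, 6)  # toy count sample, made up for illustration
h <- 0.1
support <- 0:max(V)                    # truncated support (assumption)

# Unnormalized estimate, normalizing constant (count measure), normalized estimate
fn       <- sapply(support, function(x0) mean(binom_kernel(x0, V, h)))
C_n      <- sum(fn)
fn_tilde <- fn / C_n

# Empirical p.m.f. f_0 on the same support
f0 <- as.numeric(table(factor(V, levels = support))) / length(V)

# Integrated squared error between the normalized estimate and f_0
ISE_0 <- sum((fn_tilde - f0)^2)
```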
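In both the continuous and discrete cases, consider the relation between a response variable $Y$ and an explanatory variable $x$ given by
$$Y = m(x) + \epsilon,$$
where $m$ is an unknown regression function on $T$ and $\epsilon$ is the disturbance term with null mean and finite variance. Let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be a sequence of independent and identically distributed (iid) random vectors on $T\times\mathbb{R}$ with $m(x) = E(Y|X=x)$.
The well-known Nadaraya-Watson estimator using associated kernels is $\widehat{m}_n$, defined as
$$\widehat{m}_n(x) = \sum_{i=1}^{n} \omega_x(X_i)\,Y_i,$$
where $\omega_x(X_i) = K_{x,h}(X_i)/\sum_{j=1}^{n} K_{x,h}(X_j)$ and $K_{x,h}$ is one of the associated kernels defined below.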
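A minimal base R sketch of the Nadaraya-Watson estimator with an associated kernel follows: each fitted value is a kernel-weighted average of the responses, with weights normalized to sum to one. The binomial kernel (via dbinom), the helper names binom_kernel and nw_reg, and the toy data are assumptions for illustration only.

```r
# Binomial kernel: binomial p.m.f. with x + 1 trials, success prob (x + h)/(x + 1).
# Hypothetical helper, not the package's kef().
binom_kernel <- function(x, y, h) dbinom(y, size = x + 1, prob = (x + h) / (x + 1))

# Nadaraya-Watson estimate: m_hat(x) = sum_i w_x(X_i) * Y_i,
# with weights w_x(X_i) = K_{x,h}(X_i) / sum_j K_{x,h}(X_j)
nw_reg <- function(x, X, Y, h, kernel = binom_kernel) {
  sapply(x, function(x0) {
    k <- kernel(x0, X, h)
    sum(k * Y) / sum(k)
  })
}

X <- 1:10                                                   # count explanatory variable
Y <- c(0.3, 0.5, 0.8, 1.0, 1.1, 1.0, 0.9, 0.7, 0.6, 0.4)    # made-up responses
m_hat <- nw_reg(X, X, Y, h = 0.1)                           # fitted values at the X_i
```

Because the weights are non-negative and sum to one, each fitted value lies between the smallest and largest observed response.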
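Besides the criterion of kernel support, we retain the root mean squared error (RMSE) and also the practical coefficient of determination, defined respectively by
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\{Y_i - \widehat{m}_n(X_i)\}^2}$$
and
$$R^2 = \frac{\sum_{i=1}^{n}\{\widehat{m}_n(X_i) - \bar{Y}\}^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2},$$
where $\bar{Y} = n^{-1}(Y_1+\ldots+Y_n)$; see Kokonendji et al. (2009).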
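Both criteria are simple to compute once fitted values are available. The sketch below takes the RMSE to be the square root of the mean squared residual and the practical coefficient of determination to be the ratio of the fitted-value variation around the mean response to the total variation; the function names and the numbers are hypothetical.

```r
# Root mean squared error between observed responses and fitted values
rmse <- function(y, fitted) sqrt(mean((y - fitted)^2))

# Practical coefficient of determination:
# sum (fitted - mean(y))^2 / sum (y - mean(y))^2
r2_practical <- function(y, fitted) {
  sum((fitted - mean(y))^2) / sum((y - mean(y))^2)
}

y      <- c(1, 2, 3, 4, 5)
fitted <- c(1.5, 2, 3, 4, 4.5)   # hypothetical fitted values

rmse(y, fitted)          # sqrt(0.1), about 0.316
r2_practical(y, fitted)  # 0.65
```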
Given a data sample, the package computes the density or p.m.f. and regression functions using one of the seven associated kernels: extended beta, lognormal, gamma and reciprocal inverse Gaussian for continuous data, DiracDU for categorical data, and binomial and discrete triangular for count data. The bandwidth parameter is computed using the cross-validation technique. When the associated kernel function is binomial, the bandwidth parameter can also be computed using the local Bayesian procedure. The associated kernel functions are defined below. The first four kernels are for continuous data and the last three kernels are for the discrete case.
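The extended beta kernel is defined on $S_{x,h,a,b} = [a,b] = T$ with $a < b < \infty$, $x \in T$ and $h > 0$:
$$BE_{x,h,a,b}(y) = \frac{(y-a)^{(x-a)/\{(b-a)h\}}\,(b-y)^{(b-x)/\{(b-a)h\}}}{(b-a)^{1+h^{-1}}\,B\left(1+(x-a)/\{(b-a)h\},\,1+(b-x)/\{(b-a)h\}\right)}\,1_{S_{x,h,a,b}}(y),$$
where $B(r,s) = \int_0^1 t^{r-1}(1-t)^{s-1}\,dt$ is the usual beta function with $r > 0$, $s > 0$ and $1_A$ denotes the indicator function of A. For $a = 0$ and $b = 1$, it corresponds to the beta kernel which is the probability density function of the beta distribution with shape parameters $1 + x/h$ and $1 + (1-x)/h$; see Libengué (2013).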
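One way to evaluate an extended beta kernel in base R is to map [a, b] onto [0, 1] and reuse dbeta. The sketch below assumes the parametrisation with shape parameters 1 + (x - a)/((b - a)h) and 1 + (b - x)/((b - a)h); the function name ext_beta_kernel is hypothetical and the code is not taken from the package.

```r
# Extended beta kernel on [a, b], written via dbeta after rescaling to [0, 1].
# Hypothetical helper; the shape parameters are an assumed parametrisation.
ext_beta_kernel <- function(x, y, h, a = 0, b = 1) {
  u  <- (y - a) / (b - a)                 # rescaled evaluation point
  s1 <- 1 + (x - a) / ((b - a) * h)       # first shape parameter
  s2 <- 1 + (b - x) / ((b - a) * h)       # second shape parameter
  dbeta(u, s1, s2) / (b - a)              # Jacobian of the rescaling
}

# The kernel should integrate to 1 over its support [a, b]
mass <- integrate(function(y) ext_beta_kernel(3, y, h = 0.1, a = 2, b = 5), 2, 5)$value
```

With a = 0 and b = 1 the function reduces to the ordinary beta density with shapes 1 + x/h and 1 + (1 - x)/h.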
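The gamma kernel is defined on $S_{x,h} = [0,\infty) = T$ with $x \in T$ and $h > 0$ by
$$GA_{x,h}(y) = \frac{y^{x/h}}{\Gamma(1+x/h)\,h^{1+x/h}}\exp\left(-\frac{y}{h}\right)1_{S_{x,h}}(y),$$
where $\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}\,dt$ is the classical gamma function. The probability density function $GA_{x,h}$ is that of the gamma distribution with shape parameter $1 + x/h$ and scale parameter $h$; see Chen (2000).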
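As a quick numerical check, the closed form of the gamma kernel of Chen (2000) can be compared in base R with dgamma using shape 1 + x/h and scale h. The function name ga_formula and the chosen values of x and h are hypothetical.

```r
# Closed-form gamma kernel: y^(x/h) * exp(-y/h) / (Gamma(1 + x/h) * h^(1 + x/h))
ga_formula <- function(x, y, h) {
  y^(x / h) * exp(-y / h) / (gamma(1 + x / h) * h^(1 + x / h))
}

x <- 1.2
h <- 0.3
y <- seq(0.1, 5, by = 0.1)

# Same values through the built-in gamma density (shape 1 + x/h, scale h)
via_dgamma <- dgamma(y, shape = 1 + x / h, scale = h)
agree <- all.equal(ga_formula(x, y, h), via_dgamma)
```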
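The lognormal kernel is defined on $S_{x,h} = [0,\infty) = T$ with $x \in T$ and $h > 0$ by
$$LN_{x,h}(y) = \frac{1}{y h\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{1}{h}\log\left(\frac{y}{x}\right) - h\right)^2\right\}1_{S_{x,h}}(y).$$
It is the probability density function of the classical lognormal distribution with parameters $\log(x) + h^2$ and $h$; see Libengué (2013).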
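A lognormal kernel of this type can be checked numerically against dlnorm. The sketch assumes the parametrisation with meanlog log(x) + h^2 and sdlog h; ln_formula is a hypothetical name and the values of x and h are arbitrary.

```r
# Closed-form lognormal kernel under the assumed parametrisation
ln_formula <- function(x, y, h) {
  exp(-0.5 * (log(y / x) / h - h)^2) / (y * h * sqrt(2 * pi))
}

x <- 1.2
h <- 0.3
y <- seq(0.1, 5, by = 0.1)

# Same values through the built-in lognormal density
via_dlnorm <- dlnorm(y, meanlog = log(x) + h^2, sdlog = h)
agree <- all.equal(ln_formula(x, y, h), via_dlnorm)
```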
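Let $x \in \mathbb{N} = \{0,1,\ldots\}$ and $S_x = \{0,1,\ldots,x+1\}$. The binomial kernel is defined on the support $S_x$ by
$$B_{x,h}(y) = \frac{(x+1)!}{y!\,(x+1-y)!}\left(\frac{x+h}{x+1}\right)^{y}\left(\frac{1-h}{x+1}\right)^{x+1-y}1_{S_x}(y),$$
where $h \in (0,1]$. Note that $B_{x,h}$ is the p.m.f. of the binomial distribution with number of trials $x+1$ and success probability $(x+h)/(x+1)$; see Kokonendji and Senga Kiessé (2011).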
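Since the binomial kernel is a binomial p.m.f., it can be sketched with dbinom. The example below assumes x + 1 trials and success probability (x + h)/(x + 1), and checks the total mass and the kernel mean on the support {0, ..., x + 1}; binom_kernel is a hypothetical helper name.

```r
# Binomial kernel as a binomial p.m.f. (hypothetical helper)
binom_kernel <- function(x, y, h) dbinom(y, size = x + 1, prob = (x + h) / (x + 1))

x   <- 5
h   <- 0.2
S_x <- 0:(x + 1)                  # support of the kernel at target x

p <- binom_kernel(x, S_x, h)
total_mass  <- sum(p)             # equals 1 up to rounding
kernel_mean <- sum(S_x * p)       # equals x + h = 5.2 up to rounding
```

The mean x + h shows how the bandwidth shifts the kernel slightly away from the target; as h tends to 0 the kernel concentrates around x.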
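For fixed arm $a \in \mathbb{N}$, we define $S_{x,a} = \{x-a,\ldots,x,\ldots,x+a\}$. The discrete triangular kernel is defined on $S_{x,a}$ by
$$DT_{x,h;a}(y) = \frac{(a+1)^h - |y-x|^h}{P(a,h)}\,1_{S_{x,a}}(y),$$
where $x \in \mathbb{N}$, $h > 0$ and $P(a,h) = (2a+1)(a+1)^h - 2\sum_{k=0}^{a} k^h$ is the normalizing constant. For $a = 0$, the discrete triangular kernel $DT_{x,h;0}$ corresponds to the Dirac kernel on $x$; see Kokonendji et al. (2007), and also Kokonendji and Zocchi (2010) for an asymmetric version of discrete triangular.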
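A symmetric discrete triangular kernel can be sketched in a few lines of base R. The form used here, ((a + 1)^h - |y - x|^h)/P(a, h) on {x - a, ..., x + a} with P(a, h) = (2a + 1)(a + 1)^h - 2 * sum_{k=0}^{a} k^h, follows Kokonendji et al. (2007); the function name dt_kernel and the chosen values are hypothetical.

```r
# Symmetric discrete triangular kernel with arm a (hypothetical helper)
dt_kernel <- function(x, y, h, a = 1) {
  P <- (2 * a + 1) * (a + 1)^h - 2 * sum((0:a)^h)   # normalizing constant P(a, h)
  ifelse(abs(y - x) <= a, ((a + 1)^h - abs(y - x)^h) / P, 0)
}

x <- 5
h <- 0.2
a <- 2
y <- 0:10

k <- dt_kernel(x, y, h, a)
total_mass <- sum(k)   # equals 1 up to rounding; mass sits on {3, ..., 7}
```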
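For a fixed number of categories $c \in \{2,3,\ldots\}$, we define $S_c = \{0,1,\ldots,c-1\}$. The DiracDU kernel is defined on $S_c$ by
$$DU_{x,h;c}(y) = (1-h)\,1_{\{x\}}(y) + \frac{h}{c-1}\,1_{S_c\setminus\{x\}}(y),$$
where $x \in S_c$ and $h \in (0,1]$. See Kokonendji and Senga Kiessé (2011), and also Aitchison and Aitken (1976) for multivariate case.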
Note that the global normalizing constant is 1 for DiracDU.
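The DiracDU kernel puts mass 1 - h on the target category and spreads h uniformly over the remaining c - 1 categories, in the spirit of Aitchison and Aitken (1976). The base R sketch below encodes the categories as 0, ..., c - 1, which is an assumption of this example; dirdu_kernel is a hypothetical name.

```r
# DiracDU kernel on categories 0, ..., c - 1 (hypothetical helper)
dirdu_kernel <- function(x, y, h, c = 2) {
  S_c <- 0:(c - 1)
  ifelse(y %in% S_c, ifelse(y == x, 1 - h, h / (c - 1)), 0)
}

n_cat <- 4
S_c   <- 0:(n_cat - 1)

k <- dirdu_kernel(2, S_c, h = 0.3, c = n_cat)
total_mass <- sum(k)   # equals 1 up to rounding
```

Here k is approximately (0.1, 0.1, 0.7, 0.1): the target category 2 keeps mass 1 - h and the other three share h equally.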
Two procedures are implemented to select the bandwidth: cross-validation and the local Bayesian procedure. The cross-validation technique is used for all the associated kernels, both in density and regression estimation; see Kokonendji and Senga Kiessé (2011). The local Bayesian procedure is implemented to select the bandwidth in the estimation of the p.m.f. when using the binomial kernel; see Zougab et al. (2014).
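To make the cross-validation idea concrete, here is a base R sketch for p.m.f. estimation with a binomial kernel. It assumes the usual least-squares form of the criterion, CV(h) = sum_x f_hat(x)^2 - (2/n) sum_i f_hat_{-i}(X_i), where f_hat_{-i} leaves out observation i; this may differ in detail from what the package computes. The helper names, the toy sample and the bandwidth grid are hypothetical.

```r
# Binomial kernel (hypothetical helper, not the package's kef())
binom_kernel <- function(x, y, h) dbinom(y, size = x + 1, prob = (x + h) / (x + 1))

# Least-squares cross-validation criterion for a p.m.f. estimate (assumed form)
cv_pmf <- function(h, V, support = 0:max(V)) {
  n  <- length(V)
  fn <- sapply(support, function(x0) mean(binom_kernel(x0, V, h)))
  # Leave-one-out estimate at each observation X_i
  loo <- sapply(seq_len(n), function(i) mean(binom_kernel(V[i], V[-i], h)))
  sum(fn^2) - 2 * mean(loo)
}

V     <- c(0, 1, 1, 2, 2, 2, 3, 4, 4, 6)   # toy count sample, made up
seq_h <- seq(0.05, 1, by = 0.05)           # candidate bandwidths in (0, 1]

CV  <- sapply(seq_h, cv_pmf, V = V)        # criterion on the grid
hcv <- seq_h[which.min(CV)]                # bandwidth minimizing the criterion
```

A grid search is used for simplicity; plotting CV against seq_h is a quick way to see whether the minimum is well defined.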
In coming versions of the package, an adaptive Bayesian procedure will be included for bandwidth selection in density estimation when using the gamma kernel. A global Bayesian procedure will also be implemented for bandwidth selection in regression when using the binomial kernel.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Maintainer: W. E. Wansouwé <[email protected]>
Aitchison, J. and Aitken, C.G.G. (1976). Multivariate binary discrimination by the kernel method, Biometrika 63, 413 - 420.
Chen, S. X. (1999). Beta kernels estimators for density functions, Computational Statistics and Data Analysis 31, 131 - 145.
Chen, S. X. (2000). Probability density function estimation using gamma kernels, Annals of the Institute of Statistical Mathematics 52, 471 - 480.
Igarashi, G. and Kakizawa, Y. (2015). Bias correction for some asymmetric kernel estimators, Journal of Statistical Planning and Inference 159, 37 - 63.
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications, Advances and Applications in Statistics 12, 99 - 125.
Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2014). Bayesian approach in nonparametric count regression with binomial kernel, Communications in Statistics - Simulation and Computation 43, 1052 - 1063.
The (S3) generic function dke.fun computes the density estimate. Its default method does so with the given kernel and bandwidth h.

    dke.fun(Vec, ...)

    ## Default S3 method:
    dke.fun(Vec, h, type_data = c("discrete", "continuous"),
            ker = c("BE", "GA", "LN", "RIG"), x = NULL, a0 = 0, a1 = 1, ...)
Vec | The data sample from which the estimate is to be computed. |
---|---|
h | The bandwidth or smoothing parameter. |
type_data | The data sample type. Data can be continuous or discrete (categorical or count). In this function, we deal with continuous data. |
ker | A character string giving the associated kernel to be used for smoothing: "BE" extended beta, "GA" gamma, "LN" lognormal and "RIG" reciprocal inverse Gaussian. |
x | The points of the grid at which the density is to be estimated. |
a0 | The left bound of the support used for extended beta kernel. Default value is 0 for beta kernel. |
a1 | The right bound of the support used for extended beta kernel. Default value is 1 for beta kernel. |
... | Further arguments. |
The associated kernel estimator $\widehat{f}_n$ of $f$ is defined in the above sections. We recall that in general, the sum of the estimated values on the support is not equal to 1. In practice, we compute the global normalizing constant $C_n$ before computing the estimated density $\tilde{f}_n$; see e.g. Libengué (2013).
Returns a list containing:
data | The data, same as input Vec. |
---|---|
n | The sample size. |
kernel | The associated kernel used to compute the density estimate. |
h | The bandwidth used to compute the density estimate. |
eval.points | The coordinates of the points where the density is estimated. |
est.fn | The estimated density values. |
C_n | The global normalizing constant. |
hist | The histogram corresponding to the observations. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.
    ## A sample data with n=100.
    V <- rgamma(100, 1.5, 2.6)
    ## The bandwidth can be the one obtained by cross validation.
    h <- 0.052
    ## We choose Gamma kernel.
    est <- dke.fun(V, h, "continuous", "GA")
The (S3) generic function hbay.fun selects the bandwidth using the local Bayesian procedure.

    hbay.fun(Vec, ...)

    ## Default S3 method:
    hbay.fun(Vec, x = NULL, ...)
Vec | The data sample from which the estimate is to be computed. |
---|---|
x | The points of the grid where the density is to be estimated. |
... | Further arguments for (non-default) methods. |
hbay.fun implements the choice of the bandwidth of a kernel density estimator using the local Bayesian approach.
Returns the bandwidth selected using the local Bayesian procedure.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Chen, S. X. (1999). Beta kernels estimators for density functions, Computational Statistics and Data Analysis 31, 131 - 145.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2014). Bayesian approach in nonparametric count regression with binomial kernel, Communications in Statistics - Simulation and Computation 43, 1052 - 1063.
The (S3) generic function hcvc.fun computes the cross-validation bandwidth selector.

    hcvc.fun(Vec, ...)

    ## Default S3 method:
    hcvc.fun(Vec, bw = NULL, type_data, ker, a0 = 0, a1 = 1, ...)
Vec | The data sample from which the estimate is to be computed. |
---|---|
bw | The sequence of bandwidths over which the cross-validation function is computed. Default value is NULL. |
type_data | The sample data type. |
ker | The associated kernel. |
a0 | The left bound of the support of the extended beta kernel. Default value is 0. |
a1 | The right bound of the support of the extended beta kernel. Default value is 1. |
... | Further arguments. |
hcvc.fun implements the choice of the bandwidth of a kernel density estimator using the cross-validation approach.
Returns a list containing:
hcv | The selected value of the bandwidth parameter. |
---|---|
CV | The values of the cross-validation function. |
seq_h | The sequence of bandwidths where the cross-validation function is computed. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Chen, S. X. (1999). Beta kernels estimators for density functions, Computational Statistics and Data Analysis 31, 131 - 145.
Chen, S. X. (2000). Probability density function estimation using gamma kernels, Annals of the Institute of Statistical Mathematics 52, 471 - 480.
Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.
Igarashi, G. and Kakizawa, Y. (2015). Bias correction for some asymmetric kernel estimators, Journal of Statistical Planning and Inference 159, 37 - 63.
    V <- rgamma(100, 1.5, 2.6)
    ## Not run:
    hcvc.fun(V, NULL, "continuous", "GA")
    ## End(Not run)
The (S3) generic function hcvd.fun computes the cross-validation bandwidth selector in p.m.f. estimation.

    hcvd.fun(Vec, ...)

    ## Default S3 method:
    hcvd.fun(Vec, seq_bws = NULL, ker = c("bino", "triang", "dirDU"),
             a = 1, c = 2, ...)
Vec | The data sample from which the estimate is to be computed. |
---|---|
seq_bws | The sequence of bandwidths over which the cross-validation function is computed. Default value is NULL. |
ker | The associated kernel. |
a | The arm of the discrete triangular kernel. Default value is 1. |
c | The number of categories in DiracDU kernel. Default value is 2. |
... | Further arguments. |
The hcvd.fun function implements the choice of the bandwidth using the cross-validation approach in p.m.f. estimation.
Returns a list containing:
hcv | The optimal bandwidth parameter. |
---|---|
CV | The cross-validation function values. |
seq_h | The sequence of bandwidths where the cross-validation is computed. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Chen, S. X. (1999). Beta kernels estimators for density functions, Computational Statistics and Data Analysis 31, 131 - 145.
Chen, S. X. (2000). Probability density function estimation using gamma kernels, Annals of the Institute of Statistical Mathematics 52, 471 - 480.
Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.
Igarashi, G. and Kakizawa, Y. (2015). Bias correction for some asymmetric kernel estimators, Journal of Statistical Planning and Inference 159, 37 - 63.
    ## Data can be simulated data or real data.
    ## We use real data and then compute the cross validation.
    Vec <- c(10,0,1,0,4,0,6,0,0,0,1,1,1,2,4,4,5,6,6,6,6,7,1,7,0,7,7,
             7,8,0,8,12,8,8,9,9,0,9,9,10,10,10,10,0,10,10,11,12,12,10,12,12,
             13,14,15,16,16,17,0,12)
    ## Not run:
    CV <- hcvd.fun(Vec, NULL, "bino")
    CV$hcv
    ## End(Not run)
    ## The cross validation function can also be plotted
    ## (seq_h is the bandwidth sequence listed among the returned values).
    ## Not run:
    plot(CV$seq_h, CV$CV, type = "l")
    ## End(Not run)
The (S3) generic function hcvreg.fun computes the bandwidth by cross-validation for the regression. Its default method computes the optimal bandwidth using the cross-validation method. The associated kernels available are: "BE" extended beta, "GA" gamma, "LN" lognormal, "RIG" reciprocal inverse Gaussian, "dirDU" DiracDU, "bino" binomial and "triang" discrete triangular; see Kokonendji and Senga Kiessé (2011), and also Kokonendji et al. (2009).

    hcvreg.fun(Vec, ...)

    ## Default S3 method:
    hcvreg.fun(Vec, y, type_data = c("discrete", "continuous"),
               ker = c("bino", "triang", "dirDU", "BE", "GA", "LN", "RIG"),
               h = NULL, a0 = 0, a1 = 1, a = 1, c = 2, ...)
Vec | The explanatory variable. |
---|---|
y | The response variable. |
type_data | The data sample type. Data can be continuous or discrete. |
ker | A character string giving the associated kernel to be used for smoothing: "BE" extended beta, "GA" gamma, "LN" lognormal, "RIG" reciprocal inverse Gaussian, "dirDU" DiracDU, "bino" binomial and "triang" discrete triangular. |
h | The bandwidth or smoothing parameter to be used; can also be a character string giving a rule to choose the bandwidth. |
a0 | The left bound of the support used for extended beta kernel. Default value is 0 for beta kernel. |
a1 | The right bound of the support used for extended beta kernel. Default value is 1 for beta kernel. |
a | The arm of the discrete triangular kernel. |
c | The number of categories. |
... | Further arguments. |
The selection of the bandwidth parameter is always crucial. If the bandwidth is small, we will obtain an undersmoothed estimator, with high variability. On the contrary, if the value is big, the resulting estimator will be very smooth and farther from the function that we are trying to estimate. The cross-validation function defined in the above sections is used to compute the optimal bandwidth for the associated kernels.
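The trade-off described above can be explored with a leave-one-out criterion. The base R sketch below assumes the common form CV(h) = (1/n) sum_i {Y_i - m_hat_{-i}(X_i)}^2, where m_hat_{-i} is the Nadaraya-Watson fit computed without observation i, and uses a binomial kernel; it is an illustration under these assumptions, not the code behind hcvreg.fun. The helper names, the toy data and the bandwidth grid are hypothetical.

```r
# Binomial kernel (hypothetical helper, not the package's kef())
binom_kernel <- function(x, y, h) dbinom(y, size = x + 1, prob = (x + h) / (x + 1))

# Leave-one-out cross-validation criterion for kernel regression (assumed form)
cv_reg <- function(h, X, Y) {
  n <- length(X)
  loo_fit <- sapply(seq_len(n), function(i) {
    k <- binom_kernel(X[i], X[-i], h)   # kernel weights without observation i
    sum(k * Y[-i]) / sum(k)
  })
  mean((Y - loo_fit)^2)
}

X <- 1:10
Y <- c(0.3, 0.5, 0.8, 1.0, 1.1, 1.0, 0.9, 0.7, 0.6, 0.4)   # made-up responses
seq_h <- seq(0.05, 0.95, by = 0.05)   # h = 1 is excluded: the kernel degenerates

CV  <- sapply(seq_h, cv_reg, X = X, Y = Y)
hcv <- seq_h[which.min(CV)]           # bandwidth minimizing the criterion
```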
Returns a list containing:
kernel | The associated kernel used to compute the optimal bandwidth. |
---|---|
hcv | The optimal bandwidth parameter obtained by cross-validation. |
CV | The values of the cross-validation. |
seq_bws | A sequence of bandwidths where the cross-validation is computed. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications, Advances and Applications in Statistics 12, 99 - 125.
    ## Data can be simulated data or real data.
    ## We use real data and then compute the cross validation.
    data(milk)
    x <- milk$week
    y <- milk$yield
    hcvreg.fun(x, y, "discrete", ker = "triang", a = 1)
This function computes the associated kernel function.
    kef(x, t, h, type_data = c("discrete", "continuous"),
        ker = c("bino", "triang", "dirDU", "BE", "GA", "LN", "RIG"),
        a0 = 0, a1 = 1, a = 1, c = 2)
x | The target. |
---|---|
t | A single value or the grid where the associated kernel function is computed. |
h | The bandwidth or smoothing parameter. |
type_data | The sample data type. |
ker | The associated kernel: "bino" binomial, "triang" discrete triangular, "dirDU" DiracDU, "BE" extended beta, "GA" gamma, "LN" lognormal and "RIG" reciprocal inverse Gaussian. |
a0 | The left bound of the support used for extended beta kernel. Default value is 0 for beta kernel. |
a1 | The right bound of the support used for extended beta kernel. Default value is 1 for beta kernel. |
a | The arm in discrete triangular kernel. The default value is 1. |
c | The number of categories in DiracDU kernel. The default value is 2. |
The associated kernel is one of those defined in the sections above: extended beta, gamma, lognormal, reciprocal inverse Gaussian, DiracDU, binomial and discrete triangular; see Kokonendji and Senga Kiessé (2011), and also Kokonendji et al. (2007).
Returns the value of the associated kernel function at t according to the target and the bandwidth.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Chen, S. X. (1999). Beta kernels estimators for density functions, Computational Statistics and Data Analysis 31, 131 - 145.
Chen, S. X. (2000). Probability density function estimation using gamma kernels, Annals of the Institute of Statistical Mathematics 52, 471 - 480.
Igarashi, G. and Kakizawa, Y. (2015). Bias correction for some asymmetric kernel estimators, Journal of Statistical Planning and Inference 159, 37 - 63.
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function, Journal of Nonparametric Statistics 19, 241 - 254.
Libengué, F.G. (2013). Méthode Non-Paramétrique par Noyaux Associés Mixtes et Applications, Ph.D. Thesis Manuscript (in French) to Université de Franche-Comté, Besançon, France and Université de Ouagadougou, Burkina Faso, June 2013, LMB no. 14334, Besançon.
    x <- 5
    h <- 0.2
    t <- 0:10
    kef(x, t, h, "discrete", "bino")
The (S3) generic function kern.fun computes the value of the associated kernel function. Its default method does so with a given kernel and bandwidth h.

    kern.fun(x, ...)

    ## Default S3 method:
    kern.fun(x, t, h, type_data = c("discrete", "continuous"),
             ker = c("bino", "triang", "dirDU", "BE", "GA", "LN", "RIG"),
             a0 = 0, a1 = 1, a = 1, c = 2, ...)
x | The target. |
---|---|
t | A single value or the grid where the associated kernel function is computed. |
h | The bandwidth or smoothing parameter. |
type_data | The sample data type. |
ker | The associated kernel: "dirDU" DiracDU, "bino" binomial, "triang" discrete triangular, "BE" extended beta, "GA" gamma, "LN" lognormal and "RIG" reciprocal inverse Gaussian. |
a0 | The left bound of the support used for extended beta kernel. Default value is 0 for beta kernel. |
a1 | The right bound of the support used for extended beta kernel. Default value is 1 for beta kernel. |
a | The arm in discrete triangular kernel. The default value is 1. |
c | The number of categories in DiracDU kernel. The default value is 2. |
... | Further arguments. |
The associated kernel is one of those defined in the sections above: extended beta, gamma, lognormal, reciprocal inverse Gaussian, DiracDU, binomial and discrete triangular; see Kokonendji and Senga Kiessé (2011), and also Kokonendji et al. (2007).
Returns the value of the associated kernel function at t according to the target and the bandwidth.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function, Journal of Nonparametric Statistics 19, 241 - 254.
    x <- 5
    h <- 0.2
    t <- 0:10
    kern.fun(x, t, h, "discrete", "bino")
The function estimates the p.m.f. in a single value or in a grid using discrete associated kernels. Three different associated kernels are available: DiracDU (for categorical data), binomial and discrete triangular (for count data).
    kpmfe.fun(Vec, ...)

    ## Default S3 method:
    kpmfe.fun(Vec, h, type_data = c("discrete", "continuous"),
              ker = c("bino", "triang", "dirDU"), x = NULL, a = 1, c = 2, ...)
Vec | The data sample from which the estimate is to be computed. |
---|---|
h | The bandwidth or smoothing parameter to be used; can also be a character string giving a rule to choose the bandwidth. |
type_data | The data sample type. Data type is "discrete" (categorical or count). |
ker | The associated kernel: "dirDU" DiracDU, "bino" binomial, "triang" discrete triangular. |
x | The points of the grid at which the p.m.f. is to be estimated. |
a | The arm in discrete triangular kernel. The default value is 1. |
c | The number of categories in DiracDU. The default value is 2. |
... | Further arguments. |
The associated kernel estimator $\widehat{f}_n$ of $f$ is defined in the above sections. We recall that in general, the sum of the estimated values on the support is not equal to 1. In practice, we compute the global normalizing constant $C_n$ before computing the estimated p.m.f. $\tilde{f}_n$; see Kokonendji and Senga Kiessé (2011).
The bandwidth parameter in the function is obtained using the cross-validation technique for the three associated kernels. For the binomial kernel, the local Bayesian approach is also implemented and is recommended to select the bandwidth; see Zougab et al. (2012).
Returns a list containing:
data | The data sample, same as input Vec. |
---|---|
n | The number of observations. |
eval.points | The support of the estimated p.m.f. |
h | The bandwidth. |
C_n | The global normalizing constant. |
ISE_0 | The integrated squared error. |
f_0 | A vector of (x,f0(x)). |
f_n | A vector of (x,fn(x)). |
f0 | The empirical p.m.f. |
est.fn | The estimated p.m.f. containing estimated values after normalization. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function. Journal of Nonparametric Statistics 19, 241 - 254.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2012). Binomial kernel and Bayes local bandwidth in discrete functions estimation. Journal of Nonparametric Statistics 24, 783 - 795.
    ## A sample data with n=60.
    V <- c(10,0,1,0,4,0,6,0,0,0,1,1,1,2,4,4,5,6,6,6,6,7,1,7,0,7,7,
           7,8,0,8,12,8,8,9,9,0,9,9,10,10,10,10,0,10,10,11,12,12,10,12,12,
           13,14,15,16,16,17,0,12)
    ## The bandwidth can be the one obtained by cross validation.
    h <- 0.081
    ## We choose Binomial kernel.
    est <- kpmfe.fun(Vec = V, h, "discrete", "bino")
    ## To obtain the normalizing constant:
    est$C_n
This data is the average daily fat yields (kg/day) of milk from a single cow for each of 35 weeks; see Kokonendji et al. (2009).
data(milk)
A data frame with 35 observations on the following 2 variables.
week | Number of the week. |
---|---|
yield | The yield quantity. |
McCulloch, C.E. (2001). An Introduction to Generalized Linear Mixed Models, 46a Reuniao Anual da RBRAS - 9o SEAGRO, University of Sao Paulo - ESALQ, Piracicaba.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications, Advances and Applications in Statistics 12, 99 - 125.
data(milk)
The plot.dke.fun method plots the associated kernel density estimate.

    ## S3 method for class 'dke.fun'
    plot(x, main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
         type = "l", las = 1, lwd = 1, col = "blue", lty = 1, ...)
x | An object of class dke.fun. |
---|---|
main | The main title of the plot. |
sub | The subtitle. |
xlab, ylab | The axis labels. |
type | The type of plot; the default "l" draws lines. |
las | Numeric in 0,1,2,3; the style of axis labels. |
lwd | The line width, a positive number, defaulting to 1. |
col | A specification for the default plotting color. |
lty | The line type. |
... | Further arguments. |
A plot of the associated kernel density estimate is sent to the graphics window.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function. Journal of Nonparametric Statistics 19, 241 - 254.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2012). Binomial kernel and Bayes local bandwidth in discrete functions estimation. Journal of Nonparametric Statistics 24, 783 - 795.
These functions plot the cross-validation function in both the discrete (plot.hcvd.fun) and continuous (plot.hcvc.fun) cases.

    ## S3 method for class 'hcvc.fun'
    plot(x, ...)

    ## S3 method for class 'hcvd.fun'
    plot(x, ...)
x | An object of class hcvc.fun or hcvd.fun. |
---|---|
... | Further arguments. |
A plot of the cross-validation function is sent to the graphics window.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function. Journal of Nonparametric Statistics 19, 241 - 254.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2012). Binomial kernel and Bayes local bandwidth in discrete functions estimation. Journal of Nonparametric Statistics 24, 783 - 795.
The plot.kern.fun function loops through calls to the kern.fun function.

    ## S3 method for class 'kern.fun'
    plot(x, ...)
x | An object of class kern.fun. |
---|---|
... | Other graphics parameters. |
A plot of the associated kernel function is sent to the graphics window.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications, Advances and Applications in Statistics 12, 99 - 125.
The function plots the p.m.f. estimation in a single value or in a grid using discrete associated kernels. Three different associated kernels are available: DiracDU (for categorical data), binomial and discrete triangular (for count data).
    ## S3 method for class 'kpmfe.fun'
    plot(x, ...)
x | An object of class kpmfe.fun. |
---|---|
... | Further arguments. |
A plot of the estimated p.m.f. is sent to the graphics window.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function. Journal of Nonparametric Statistics 19, 241 - 254.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2012). Binomial kernel and Bayes local bandwidth in discrete functions estimation. Journal of Nonparametric Statistics 24, 783 - 795.
Plot for associated kernel regression for univariate data. The plot.reg.fun function loops through calls to the reg.fun function.

    ## S3 method for class 'reg.fun'
    plot(x, ...)
x | An object of class reg.fun. |
---|---|
... | Other graphics parameters. |
The function plots the estimated regression. The plot is sent to the graphics window.
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions, Statistical Methodology 8, 497 - 516.
Kokonendji, C.C., Senga Kiessé, T. and Zocchi, S.S. (2007). Discrete triangular distributions and non-parametric estimation for probability mass function. Journal of Nonparametric Statistics 19, 241 - 254.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2012). Binomial kernel and Bayes local bandwidth in discrete functions estimation. Journal of Nonparametric Statistics 24, 783 - 795.
The function prints the result of the regression computation as a data frame.

    ## S3 method for class 'reg.fun'
    print(x, digits = NULL, ...)
x | An object of class reg.fun. |
---|---|
digits | The number of digits. |
... | Further arguments. |
The associated kernel estimator $\widehat{m}_n$ of $m$ is defined in the above sections; see Kokonendji and Senga Kiessé (2011). The bandwidth parameter in the function is obtained using the cross-validation technique for the associated kernels.
Returns a list containing:
data | The explanatory variable, printed as a data frame. |
---|---|
y | The response variable, printed as a data frame. |
n | The size of the sample. |
kernel | The associated kernel. |
h | The smoothing parameter. |
eval.points | The grid where the regression is computed, printed as a data frame. |
m_n | The estimated values, printed as a data frame. |
Coef_det | The coefficient of determination. |
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions. Statistical Methodology 8, 497-516.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications. Advances and Applications in Statistics 12, 99-125.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2014). Bayesian approach in nonparametric count regression with binomial kernel. Communications in Statistics - Simulation and Computation 43, 1052-1063.
data(milk)
x <- milk$week
y <- milk$yield
## The bandwidth is the one obtained by cross-validation.
h <- 0.10
## We choose the binomial kernel.
m_n <- reg.fun(x, y, "discrete", ker = "bino", h)
print.reg.fun(m_n)
The function estimates the discrete or continuous regression at a single value or on a grid using associated kernels. The available associated kernels are: extended beta, gamma, lognormal and reciprocal inverse Gaussian (for continuous data), DiracDU (for categorical data), and binomial and discrete triangular (for count data).
reg.fun(Vec, ...)

## Default S3 method:
reg.fun(Vec, y, type_data = c("discrete", "continuous"),
        ker = c("bino", "triang", "dirDU", "BE", "GA", "LN", "RIG"),
        h, x = NULL, a0 = 0, a1 = 1, a = 1, c = 2, ...)
Vec | The explanatory variable. |
y | The response variable. |
type_data | The sample data type: "discrete" or "continuous". |
ker | The associated kernel: "dirDU" DiracDU, "bino" binomial, "triang" discrete triangular, "BE" extended beta, "GA" gamma, "LN" lognormal, "RIG" reciprocal inverse Gaussian. |
h | The bandwidth or smoothing parameter. |
x | The single value or the grid where the regression is computed. |
a0 | The left bound of the support used for the extended beta kernel. The default value is 0, as for the beta kernel. |
a1 | The right bound of the support used for the extended beta kernel. The default value is 1, as for the beta kernel. |
a | The arm of the discrete triangular kernel. The default value is 1. |
c | The number of categories in DiracDU. The default value is 2. |
... | Further arguments. |
The associated kernel estimator m_n of the regression function is defined in the sections above; see also Kokonendji and Senga Kiessé (2011). The bandwidth parameter of the function is obtained by cross-validation for the seven associated kernels. For the binomial kernel, the local Bayesian approach is also implemented; see Zougab et al. (2014).
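To make the estimator concrete, here is a minimal base R sketch of the Nadaraya-Watson estimator with the binomial kernel, evaluated on a grid. It is not the code of reg.fun: bino_kernel, nw_estimate and the small data set are hypothetical, and only the binomial kernel is covered.

```r
# Binomial associated kernel B_{x,h} evaluated at points z: the p.m.f. of a
# Binomial(x + 1, (x + h) / (x + 1)) distribution, with 0 < h < 1.
bino_kernel <- function(x, z, h) {
  dbinom(z, size = x + 1, prob = (x + h) / (x + 1))
}

# Nadaraya-Watson estimate at each target point of `grid`:
# a kernel-weighted average of the observed responses.
nw_estimate <- function(grid, obs_x, obs_y, h) {
  vapply(grid, function(x) {
    w <- bino_kernel(x, obs_x, h)   # weight of each observation for target x
    sum(w * obs_y) / sum(w)
  }, numeric(1))
}

# Hypothetical count explanatory variable and response
obs_x <- c(0, 1, 1, 2, 3, 3, 4, 5)
obs_y <- c(2.1, 2.9, 3.2, 4.8, 6.1, 5.7, 7.2, 8.0)

fit <- nw_estimate(0:5, obs_x, obs_y, h = 0.1)
```

Because each fitted value is a weighted average with non-negative weights, it always lies between the smallest and largest observed response. A small h concentrates the weight on observations equal to the target point, so the fit approaches the local mean of the responses there.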
Returns a list containing:
data | The data sample, explanatory variable. |
y | The data sample, response variable. |
n | The size of the sample. |
kernel | The associated kernel. |
h | The bandwidth. |
eval.points | The grid where the regression is computed. |
m_n | The estimated values. |
Coef_det | The coefficient of determination. |
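The coefficient of determination can be illustrated with a short base R sketch. It assumes the common definition as the ratio of the explained to the total sum of squares; this section does not state the exact formula used by reg.fun, so coef_det and the vectors below are hypothetical.

```r
# Coefficient of determination, assuming the ratio
#   sum((fitted - mean(y))^2) / sum((y - mean(y))^2).
coef_det <- function(fitted, obs_y) {
  y_bar <- mean(obs_y)
  sum((fitted - y_bar)^2) / sum((obs_y - y_bar)^2)
}

# Hypothetical responses and fitted values at the same design points
obs_y  <- c(2.1, 2.9, 3.2, 4.8, 6.1, 5.7, 7.2, 8.0)
fitted <- c(2.2, 3.0, 3.0, 4.7, 5.9, 5.9, 7.1, 7.9)

r2 <- coef_det(fitted, obs_y)
```

Under this definition, fitted values equal to the observations give a value of 1, and a constant fit at the sample mean gives 0.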
W. E. Wansouwé, S. M. Somé and C. C. Kokonendji
Kokonendji, C.C. and Senga Kiessé, T. (2011). Discrete associated kernel method and extensions. Statistical Methodology 8, 497-516.
Kokonendji, C.C., Senga Kiessé, T. and Demétrio, C.G.B. (2009). Appropriate kernel regression on a count explanatory variable and applications. Advances and Applications in Statistics 12, 99-125.
Zougab, N., Adjabi, S. and Kokonendji, C.C. (2014). Bayesian approach in nonparametric count regression with binomial kernel. Communications in Statistics - Simulation and Computation 43, 1052-1063.
data(milk)
x <- milk$week
y <- milk$yield
## The bandwidth is the one obtained by cross-validation.
h <- 0.10
## We choose the binomial kernel.
## Not run:
m_n <- reg.fun(x, y, "discrete", ker = "bino", h)
## End(Not run)