```
set.seed(1234)
<- rnorm(n = 1000, mean = 2, sd = 2)
MCresults <- mean(MCresults)
pointestim <- sd(MCresults)/sqrt(length(MCresults)) se_pointestim
```

# Introduction

Consider a Monte Carlo simulation study where you run \(B=1000\) replications of some data generating mechanism and estimate the parameters of some models. You can treat the replications as independent and identically distributed observations and obtain an idea of the sampling distribution of your statistic: this is useful for power studies, estimation of bias, variance and mean square errors, etc.

Consider a simple random sample \(\boldsymbol{Y} = (Y_1, \ldots, Y_B)^\top\) of size \(B\) (where \(Y_i\) is potentially a random vector) and a functional \(\widehat{\theta}_i = g(Y_i)\). We are interested in estimating the expectation \(\theta = \mathsf{E}\{g(\boldsymbol{Y})\}\), which is approximated by simple Monte Carlo using the sample average \[\begin{align*} \widehat{\theta} = \frac{1}{B} \sum_{b=1}^B \widehat{\theta}_b. \end{align*}\]

The estimator \(\widehat{\theta}\) is consistent and asymptotically Gaussian, by virtue of the strong law of large number and the central limit theorem. If the second moment of the functional is finite, meaning \(\mathsf{E}\{g^2(Y_i)\}<\infty\), we can get a measure of uncertainty by computing the standard error of the statistic. Thus, if you want to estimate the expectation of a point estimator \(\widehat{\theta}\), you simply would compute \(\widehat{\theta}_i\) for each replication and compute the sample mean. The standard error of the mean is \(\mathsf{sd}(\widehat{\theta})/\sqrt{B}\) if the \(B\) replications are independent.

The point estimate reported by the software is, to six decimal place, \(1.946806\), which naively suggests a large precision. However, this ignores the sampling variability: the estimated standard error is \(0.06\). If we build a Wald symmetric 95% confidence intervals of the form [\(\pm 1.96 \widehat{\mathsf{se}}(\widehat{\theta})\)], we find \([1.82, 2.07]\) suggesting that all digits but the first decimal place are spurious. One should thus report \(1.9\) for the point estimate.

The formula can also be used to obtain the number of replications \(B = 3.84\widehat{\mathsf{Var}}(\widehat{\theta})/\mathrm{prec}\), where for \(B\) large enough the variance of the parameter estimate is obtained via Monte Carlo from the sample variance. Suppose that I want a precision of \(\mathrm{prec}=0.005\) for my point estimator so as to report the estimates to two decimal place. Given the estimated standard deviation, \(B = \mathsf{var}(\widehat{\theta}) / 0.000025\), I would need roughly \(6.11133\times 10^{5}\) Monte Carlo replications.

Outside of simulation studies, we may not have access to a data generating mechanism to evaluate the standard error of an estimator. We can resort to bootstrap methods for more general cases; see Davison and Hinkley (1997) for more details.

## References

*Bootstrap Methods and Their Application*. New York: Cambridge University Press.

*Handbook of Markov Chain Monte Carlo*, edited by S. Brooks, A. Gelman, G. Jones, and X. L. Meng, 3–48. Boca Raton: CRC Press. https://doi.org/10.1201/b10905.

## Citation

```
@online{belzile2023,
author = {Belzile, Léo},
title = {How Many Digits?},
date = {2023-08-26},
url = {https://lbelzile.bitbucket.io/posts/numberofdigits/},
langid = {en}
}
```