
So I have a homework problem that I'm struggling a bit to come to a solid answer on. The problem goes like this:

Suppose $X\sim\text{Beta}(\theta,\theta)$, $\theta>0$, and let $\{X_1, X_2, \ldots, X_n\}$ be a sample. Show that $T=\prod_i X_i(1-X_i)$ is a sufficient statistic for $\theta$.

I started out with my Beta density and formed the joint density of the sample:

$f(x;\alpha,\beta)=\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$

$f(x_1,\ldots,x_n;\theta)=\frac{\Gamma(\theta + \theta)}{\Gamma(\theta)\Gamma(\theta)}x_1^{\theta-1}(1-x_1)^{\theta-1} \cdots \frac{\Gamma(\theta + \theta)}{\Gamma(\theta)\Gamma(\theta)}x_n^{\theta-1}(1-x_n)^{\theta-1}$

$=\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}x_1^{\theta-1}(1-x_1)^{\theta-1} \cdots \frac{\Gamma(2\theta)}{\Gamma(\theta)^2}x_n^{\theta-1}(1-x_n)^{\theta-1}$

$=\left(\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}\right)^{n} \prod_i \left[x_i(1-x_i)\right]^{\theta-1}$
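In terms of the statistic from the problem statement, $T=\prod_i x_i(1-x_i)$, that last line is

$$f(x_1,\ldots,x_n;\theta)=\left(\frac{\Gamma(2\theta)}{\Gamma(\theta)^2}\right)^{n} T^{\theta-1},$$

so the joint density depends on the data only through $T$.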

I know that in order for my statistic to be sufficient by the factorization theorem, I need to write the joint density as $g(T,\theta)\,h(x_1,x_2,\ldots,x_n)$.

What I have above is my $g(T,\theta)$, but I am not so sure about my $h(x_1,x_2,\ldots,x_n)$. I have seen the suggestion elsewhere to just use $1$ for $h(x_1,x_2,\ldots,x_n)$. Could I do that here with this problem? It just seems a little too easy, but I will be happy if it is that easy.

If anyone could let me know, that would be greatly appreciated.

  • What if $h(x_1, x_2, \dots, x_n) = 1$? I think that suggestion should work. – Enzo Mar 06 '13 at 03:27
  • So I can do that? I didn't know if I could or not. – Perdue Mar 06 '13 at 04:21
  • Yes, you can! That's justified. – Enzo Mar 06 '13 at 14:11
  • What if I took $\prod_i [x_i(1-x_i)]^{\theta-1}$ and broke that up into $\prod_i [x_i(1-x_i)]^{\theta} \cdot \frac{1}{\prod_i x_i(1-x_i)}$? Then I would have my statistic right there instead of having to multiply everything by 1. – Perdue Mar 06 '13 at 18:06

2 Answers


If $X\sim \text{Beta}(\alpha,\beta)$, then $f(x;\alpha,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$, and for a sample $f(\underline x;\alpha,\beta) = \left(\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\right)^{n}\left(\prod x_{i}\right)^{\alpha-1}\left(\prod (1-x_{i})\right)^{\beta-1}$.

Thus $\left(\prod x_{i},\,\prod (1-x_{i})\right)$ is a sufficient statistic by the Fisher–Neyman factorization theorem.

Here $ h(\underline x) = 1$.

If $\alpha = \beta$, then a sufficient statistic is $\prod x_i(1-x_i)$. Again $h(\underline x) = 1$.
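As a quick numerical sanity check (an illustration, not part of the proof), here is a short Python sketch assuming NumPy and SciPy are available: with $\alpha=\beta$, replacing each $x_i$ by $1-x_i$ leaves every factor $x_i(1-x_i)$, and hence $T$, unchanged, so since $h(\underline x)=1$ the two samples must have identical likelihoods at every $\theta$.

```python
import numpy as np
from scipy.stats import beta

def loglik(x, theta):
    # Log-likelihood of an i.i.d. Beta(theta, theta) sample.
    return beta.logpdf(x, theta, theta).sum()

rng = np.random.default_rng(0)
x1 = rng.beta(2.0, 2.0, size=5)  # one sample
x2 = 1.0 - x1                    # a different sample with the same T

for theta in (0.5, 1.0, 2.0, 5.0):
    # Equal T (with h = 1) implies equal likelihoods, so each difference is ~0.
    print(theta, loglik(x1, theta) - loglik(x2, theta))
```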


(The answer above does not address minimal sufficiency, so I show it here.)

Note that the beta distribution, with p.d.f. $$\delta_{\alpha, \beta}(x) = c\,x^{\alpha-1}(1-x)^{\beta-1},$$ belongs to an exponential family, because it can be rewritten as

$$\delta_{\alpha, \beta}(x)\,dx = c\exp\left[\alpha\log(x)+\beta\log(1-x)\right]\mu(dx),$$

where the base measure $\mu(dx) = dx/[x(1-x)]$ absorbs the $-1$'s in the exponents.

We know (for example from equation 2.5 of this reference) that a minimal sufficient statistic for a distribution belonging to a full-rank exponential family, whose p.d.f. has the form $c(\eta(\theta))\exp\left[T(x)'\eta(\theta)\right]\mu(dx)$, is $T(X) = \sum_{i=1}^n T(x_i)$.
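Matching the beta density to this form gives $T(x) = (\log(x), \log(1-x))$ and $\eta(\alpha,\beta) = (\alpha, \beta)$, so for a sample

$$\prod_{i=1}^n \delta_{\alpha,\beta}(x_i)\,dx_i = c^n \exp\left[\alpha\sum_{i=1}^n\log(x_i) + \beta\sum_{i=1}^n\log(1-x_i)\right]\prod_{i=1}^n \mu(dx_i).$$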

Thus we conclude that $$\left(\sum_{i=1}^n\log(x_i),\sum_{i=1}^n\log(1-x_i)\right)$$ is a minimal sufficient statistic for $(\alpha, \beta)$.

When $\alpha=\beta$, the distribution still comes from a full-rank (now one-parameter) exponential family, because the p.d.f. can be written as $$\delta_{\alpha}(x)\,dx = c\exp\left\{\alpha[\log(x)+\log(1-x)]\right\}\mu(dx).$$ This means $$\sum_{i=1}^n\left[\log (x_i) + \log(1-x_i)\right]$$ is a minimal sufficient statistic when $\alpha=\beta$.
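Note that $$\sum_{i=1}^n\log\left[x_i(1-x_i)\right] = \log\prod_{i=1}^n x_i(1-x_i) = \log T,$$ a one-to-one function of the $T$ in the question, so $T$ itself is minimal sufficient as well.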
