Does a sequence of moments determine the function?

Consider a real-valued integrable function $f(x)$ on the interval $a \le x \le b$; $x,a,b \in \mathbb{R}$; $b > a$.
The moments of such a function are defined by: $$ M_0 = \int_a^b f(x)\,dx \\ M_1 = \int_a^b x \, f(x)\,dx \\ M_2 = \int_a^b x^2 \, f(x)\,dx \\ \cdots \\ M_k = \int_a^b x^k \, f(x)\,dx \\ \cdots $$ Note that the function $f$ is not said to be positive. Also note that the moments $M$ form an infinite sequence. Now the gist of the question is: can this problem be reversed? That is, given the infinite sequence $M_k$ , can $f(x)$ be solved - and how? - from: $$ \int_a^b f(x)\,dx = M_0\\ \int_a^b x \, f(x)\,dx = M_1\\ \int_a^b x^2 \, f(x)\,dx = M_2\\ \cdots \\ \int_a^b x^k \, f(x)\,dx = M_k\\ \cdots $$ Theorem. Without loss of generality, the domain of the function $f(x)$ may be restricted to $0 \le x \le 1$ and the range of the moments may be restricted to $0 \le M_k \le 1$ .
The primary purpose of this Theorem is to make numerical experiments easier to accomplish.
Proof. Let $t=(x-a)/(b-a)$ or $x=(b-a)t+a$ with $0 \le t \le 1$ and $g(t) = f((b-a)t+a)$ or $f(x) = g((x-a)/(b-a))$. The moments $m_k$ of $g(t)$ are calculated by: $$ m_k = \int_0^1 t^k \, g(t) \, dt $$ It follows (with the binomial theorem) that: $$ M_k = \int_a^b x^k \, f(x)\,dx = \int_0^1 \left[(b-a)t+a\right]^k g(t) (b-a) \, dt = \sum_{i=0}^k \binom{k}{i} (b-a)^{i+1} a^{k-i} m_i $$ Or, the other way around: $$ m_k = \int_0^1 t^k \, g(t) \, dt = \int_a^b \left[\frac{x-a}{b-a}\right]^k f(x)/(b-a) \, dx = \frac{1}{(b-a)^{k+1}}\sum_{i=0}^k \binom{k}{i}(-a)^{k-i} M_i $$ With this, the $f(x)$-moments $M_k$ can be converted into $g(t)$-moments $m_k$ and vice versa.
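For numerical experiments, these conversion formulas are straightforward to implement. Here is a minimal sketch in Python (the function names are mine; only the standard library is used):

```python
from math import comb

def M_from_m(m, a, b):
    """[0,1]-moments m_k of g(t)  ->  [a,b]-moments M_k of f(x) = g((x-a)/(b-a))."""
    return [sum(comb(k, i) * (b - a) ** (i + 1) * a ** (k - i) * m[i]
                for i in range(k + 1))
            for k in range(len(m))]

def m_from_M(M, a, b):
    """[a,b]-moments M_k of f(x)  ->  [0,1]-moments m_k of g(t)."""
    return [sum(comb(k, i) * (-a) ** (k - i) * M[i]
                for i in range(k + 1)) / (b - a) ** (k + 1)
            for k in range(len(M))]

# Round-trip check with f = g = 1: m_k = 1/(k+1), M_k = (b^(k+1) - a^(k+1))/(k+1)
m = [1 / (k + 1) for k in range(5)]
print(M_from_m(m, 1.0, 3.0))                      # expect 2, 4, 26/3, 20, 242/5
print(m_from_M(M_from_m(m, 1.0, 3.0), 1.0, 3.0))  # expect m again
```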
Then maybe we can solve for a function $g(t)$; in any case, the backward transformation $f(x) = g((x-a)/(b-a))$ brings us back to $f(x)$. This proves the first part of the Theorem.
The moments $\mu_k$ of a function $h(t) = \lambda\, g(t) + C$ are calculated by: $$ \mu_k = \int_0^1 t^k \, h(t) \, dt = \lambda\, m_k + C/(k+1) $$ Suppose the smallest moment $\mu_k$ is found for $k=p$ and the largest moment is found for $k=q$ , then we can adapt the constants $\lambda$ and $C$ in such a way that all moments $\mu_k$ are between $0$ and $1$ : $$ \mu_p = \lambda\, m_p + C/(p+1) = 0 \quad ; \quad \mu_q = \lambda\, m_q + C/(q+1) = 1 \quad \Longleftrightarrow \\ \lambda = \frac{q+1}{m_q(q+1) - m_p(p+1)} \quad ; \quad C = - \frac{m_p(p+1)(q+1)}{m_q(q+1) - m_p(p+1)} $$ This proves the second part of the Theorem.
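A small sketch of this rescaling, assuming that suitable indices $p$ and $q$ are already known and that the denominator does not vanish (the function name is invented for illustration):

```python
from fractions import Fraction

def rescale_moments(m, p, q):
    """Return (lam, C, mu) with mu_k = lam*m_k + C/(k+1), the moments of
    h(t) = lam*g(t) + C, scaled so that mu_p = 0 and mu_q = 1."""
    m = [Fraction(x) for x in m]
    denom = m[q] * (q + 1) - m[p] * (p + 1)
    lam = Fraction(q + 1) / denom
    C = -m[p] * (p + 1) * (q + 1) / denom
    mu = [lam * m[k] + C / (k + 1) for k in range(len(m))]
    return lam, C, mu
```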
But how should one proceed with the remaining, main part of the question:
given $0 \le M_k \le 1$ , then - if possible - solve $f$ from $$ \int_0^1 f(x)\,dx = M_0\\ \int_0^1 x \, f(x)\,dx = M_1\\ \int_0^1 x^2 \, f(x)\,dx = M_2\\ \cdots \\ \int_0^1 x^k \, f(x)\,dx = M_k\\ \cdots $$ Note again that nothing has been said about the range of $f$ .
Simplification with the help of the above Theorem is optional.
Posted here by this author: the question, a general answer, and a special answer. You are reading the general answer now.
Suppose that we have a (real valued) function $f(x)$ defined, WLOG, on the interval $0 \le x \le 1$ and with moments $M_k$, as described in the question. We have said nothing about $f(x)$ being positive, or smooth; the only obvious requirement is that it is integrable. A restriction $0 < M_k < 1$ on the moments, as proposed, admittedly does not simplify the problem much.

Polynomial

Define a (real valued) polynomial $p(x) = a_0+a_1 x+ a_2 x^2 + \cdots + a_{n-1} x^{n-1} $ , at the same interval $0 \le x \le 1$ . The moments of the polynomial can be calculated easily: $$ m_k = \int_0^1 p(x) x^k \, dx = \\ = a_0 \int_0^1 x^k dx + a_1 \int_0^1 x^{k+1} dx + a_2 \int_0^1 x^{k+2} dx + \cdots + a_{n-1} \int_0^1 x^{k+n-1} dx =\\ = \sum_{j=0}^{n-1} \frac{1}{j+k+1} a_j $$ In matrix form: $$ \left[\begin{array}{c} m_0 \\ m_1 \\ m_2 \\ \cdots \\ m_{n-1} \end{array}\right] = \left[\begin{array}{ccccc} 1 & 1/2 & 1/3 & \cdots & 1/n \\ 1/2 & 1/3 & 1/4 & \cdots & 1/(n+1) \\ 1/3 & 1/4 & 1/5 & \cdots & 1/(n+2) \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 1/n & 1/(n+1) & 1/(n+2) & \cdots & 1/(2n-1) \end{array}\right] \left[\begin{array}{c} a_0 \\ a_1 \\ a_2 \\ \cdots \\ a_{n-1} \end{array}\right] $$ Now suppose that the first $n$ moments of our function $f(x)$ are equal to the first $n$ moments of the polynomial $p(x)$ , that is: $m_k = M_k$ for $k=0,1,2,\cdots,n-1$ . Then we have $n$ equations with $n$ unknowns, which can be solved (theoretically) by matrix inversion: $$ \left[\begin{array}{c} a_0 \\ a_1 \\ a_2 \\ \cdots \\ a_{n-1} \end{array}\right] = \left[\begin{array}{ccccc} 1 & 1/2 & 1/3 & \cdots & 1/n \\ 1/2 & 1/3 & 1/4 & \cdots & 1/(n+1) \\ 1/3 & 1/4 & 1/5 & \cdots & 1/(n+2) \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 1/n & 1/(n+1) & 1/(n+2) & \cdots & 1/(2n-1) \end{array}\right]^{-1} \left[\begin{array}{c} M_0 \\ M_1 \\ M_2 \\ \cdots \\ M_{n-1} \end{array}\right] $$ And we can actually find the (coefficients of the) whole polynomial from this limited subset of moments. I said: theoretically. Because the matrix to be inverted is recognized as the infamous Hilbert matrix. The Hilbert matrix is notorious for its bad conditioning, which means in practice that its inverse can only be obtained with pain, namely at the cost of losing many significant digits.
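To see the conditioning problem in practice, here is a minimal sketch with plain NumPy that builds the Hilbert matrix, reports its condition number and solves for the coefficients; as a test case the moments $M_k = 1/(k+3)$ of $f(x)=x^2$ are used:

```python
import numpy as np

def poly_from_moments(M):
    """Coefficients a_0..a_{n-1} of the polynomial whose first n moments on [0,1]
    equal M_0..M_{n-1}; the Hilbert matrix makes this badly conditioned."""
    n = len(M)
    H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
    print("condition number of H:", np.linalg.cond(H))
    return np.linalg.solve(H, np.asarray(M, dtype=float))

# Example: f(x) = x^2 has moments M_k = 1/(k+3); expect coefficients close to [0, 0, 1, 0]
print(poly_from_moments([1.0 / (k + 3) for k in range(4)]))
```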

Now forget about the moments for a moment :-)
With a Least Squares Method, on the other hand, the following integral is minimized: $$ \int_0^1 \left[ f(x)-p(x) \right]^2 dx = \int_0^1 \left[ f(x)-\sum_{k=0}^{n-1} a_k x^k \right]^2 dx = \mbox{minimum}(a_0,a_1,a_2,\cdots,a_{n-1}) $$ Take partial derivatives $\partial/\partial a_k \; ; \; k=0,1,2,\cdots,n-1$ to find the minimum: $$ \int_0^1 x^k \left[ f(x) - \sum_{j=0}^{n-1} a_j x^j \right] dx = 0 \quad \Longleftrightarrow \quad \sum_{j=0}^{n-1} \frac{1}{k+j+1} a_j = M_k $$ This turns out to be exactly the same system of linear (and ill-conditioned) equations as above. Thus we conclude that finding a polynomial with the same first $n$ moments as a given function is equivalent to finding the least squares best fit polynomial of degree $n-1$ to that function, WLOG on the interval $\left[0,1\right]$. Now take the limit for $n\rightarrow\infty$ and we're done, it seems. So, any real valued function defined on a finite interval can be resolved (in a least squares sense) from its sequence of moments and there are no restrictions on these moments. It is noted that the functions found are, in general, not probability distributions and can be quite "wild".
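The equivalence can also be checked numerically. The sketch below takes $f(x)=e^x$ as a test function (its moments obey the recursion $M_0=e-1$, $M_k=e-k\,M_{k-1}$) and compares the moment-matched coefficients with a discrete least squares fit on a fine grid; for small $n$ the two should nearly agree:

```python
import numpy as np

n = 4
M = [np.e - 1.0]                         # moments of f(x) = exp(x) on [0,1]
for k in range(1, n):
    M.append(np.e - k * M[-1])

# Moment matching: solve the Hilbert system H a = M
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
a_moments = np.linalg.solve(H, M)

# Discrete least squares fit of a degree-(n-1) polynomial to exp(x)
x = np.linspace(0.0, 1.0, 20001)
V = np.vander(x, n, increasing=True)     # columns 1, x, x^2, ...
a_lsq, *_ = np.linalg.lstsq(V, np.exp(x), rcond=None)

print(np.round(a_moments, 3))
print(np.round(a_lsq, 3))                # should (nearly) agree with the line above
```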

Finally, there is a residual (i.e. an error estimate) to take into account: $$ \int_0^1 \left[ f(x)-p(x) \right]^2 dx = \int_0^1 f(x)^2 dx - 2 \sum_{k=0}^{n-1} a_k \int_0^1 f(x) x^k dx + \sum_{k=0}^{n-1} a_k \sum_{j=0}^{n-1} a_j \int_0^1 x^k x^j dx = \\ = \int_0^1 f(x)^2 dx - \sum_{k=0}^{n-1} a_k M_k $$ It is worrisome that, sometimes, and all too often, the residual turns out to be negative (numerically). How can that happen? I think it means that our numerical calculations just aren't very reliable, due to the poor conditioning of the Hilbert matrix.
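In fact the residual is a squared $L^2$ distance, so it can never be genuinely negative; a negative value must be a floating point artifact. This can be checked by redoing the computation in exact rational arithmetic, here for the test case $f(x)=x^5$, so that $M_k=1/(k+6)$ and $\int_0^1 f^2\,dx = 1/11$:

```python
from fractions import Fraction

n = 4
M = [Fraction(1, k + 6) for k in range(n)]        # moments of f(x) = x^5
f_squared = Fraction(1, 11)                       # integral of f(x)^2 over [0,1]

# Solve the Hilbert system exactly by Gauss-Jordan elimination over the rationals
A = [[Fraction(1, i + j + 1) for j in range(n)] + [M[i]] for i in range(n)]
for col in range(n):
    A[col] = [v / A[col][col] for v in A[col]]    # normalize the pivot row
    for row in range(n):
        if row != col:
            factor = A[row][col]
            A[row] = [vr - factor * vc for vr, vc in zip(A[row], A[col])]
a = [A[i][n] for i in range(n)]                   # polynomial coefficients

residual = f_squared - sum(ak * Mk for ak, Mk in zip(a, M))
print(residual, residual > 0)                     # exact arithmetic: strictly positive
```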

Histogram

Define a (real valued) histogram $h(x)$ on the same interval $0 \le x \le 1$. Take equal subintervals to begin with: $$ h(x) = f_j \quad \mbox{for} \quad j/n < x < (j+1)/n \quad ; \quad j=0,1,2,\cdots,n-1 $$ The moments of the histogram can be calculated easily: $$ m_k = \int_0^1 h(x) x^k \, dx = \sum_{j=0}^{n-1} f_j \int_{j/n}^{(j+1)/n} x^k \, dx = \frac{1}{k+1}\sum_{j=0}^{n-1} \left[ \left(\frac{j+1}{n}\right)^{k+1}-\left(\frac{j}{n}\right)^{k+1}\right] f_j $$ Now suppose that the first $n$ moments of our function $f(x)$ are equal to the first $n$ moments of the histogram $h(x)$, that is: $m_k = M_k$ for $k=0,1,2,\cdots,n-1$. Then we have $n$ linear equations with $n$ unknowns, which can be solved (theoretically). So we can actually find the whole histogram from this limited subset of moments. I said: theoretically, because the linear equations to be solved are again ill conditioned; the corresponding matrix may be even worse than the Hilbert matrix.
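A minimal sketch (plain NumPy, names invented here) that sets up and solves this system and prints the condition number; as a test case the moments $M_k = 1/(k+2)$ of $f(x)=x$ are used:

```python
import numpy as np

def histogram_from_moments(M):
    """Heights f_0..f_{n-1} of an equal-width histogram on [0,1] whose
    first n moments equal M_0..M_{n-1}."""
    n = len(M)
    B = np.array([[((j + 1) ** (k + 1) - j ** (k + 1)) / ((k + 1) * n ** (k + 1))
                   for j in range(n)] for k in range(n)])
    print("condition number of B:", np.linalg.cond(B))
    return np.linalg.solve(B, np.asarray(M, dtype=float))

# Example: the moments of f(x) = x are M_k = 1/(k+2)
print(histogram_from_moments([1.0 / (k + 2) for k in range(4)]))
```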

With histogram functions, there is no equivalence between the Moment problem and Least Squares: $$ \int_0^1 \left[ f(x) - h(x) \right]^2 dx = \sum_{k=0}^{n-1} \int_{k/n}^{(k+1)/n} \left[ f(x) - f_k \right]^2 dx = \mbox{minimum}(f_0,f_1,f_2,\cdots,f_{n-1}) $$ Take partial derivatives $\partial/\partial f_k \; ; \; k=0,1,2,\cdots,n-1$ to find the minimum: $$ \int_{k/n}^{(k+1)/n} \left[ f(x) - f_k \right] dx = 0 \quad \Longleftrightarrow \quad \frac{\int_{k/n}^{(k+1)/n} f(x)\, dx}{(k+1)/n-k/n} = f_k $$ This means that each $f_k$ is the mean of the function $f(x)$ over the interval $\left[k/n,(k+1)/n\right]$.
Again, there is a residual (i.e. an error estimate) to take into account: $$ \int_0^1 \left[ f(x)-h(x) \right]^2 dx = \int_0^1 f(x)^2 dx - \sum_{k=0}^{n-1} \left[\frac{k+1}{n}-\frac{k}{n}\right] f_k^2 $$ Numerical calculations with Histogram Moments reveal that the results seem to be approximately what one would expect with Least Squares (see: Histogram fit to parabolas). But it is not at all obvious (to me at least) that the two methods converge to the same thing.
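That observation can be reproduced with a small experiment. The sketch below again uses $f(x)=e^x$ as a test function and compares the moment-matched histogram with the bin means of the least squares histogram:

```python
import numpy as np

n, f = 4, np.exp                       # number of bins and test function on [0,1]
edges = np.linspace(0.0, 1.0, n + 1)

# Least squares histogram: each height is (approximately) the mean of f over its bin
lsq = [f(np.linspace(a, b, 10001)).mean() for a, b in zip(edges[:-1], edges[1:])]

# Moment-matched histogram: moments of exp(x) via M_0 = e-1, M_k = e - k*M_{k-1}
M = [np.e - 1.0]
for k in range(1, n):
    M.append(np.e - k * M[-1])
B = np.array([[((j + 1) ** (k + 1) - j ** (k + 1)) / ((k + 1) * n ** (k + 1))
               for j in range(n)] for k in range(n)])
mom = np.linalg.solve(B, M)

print(np.round(lsq, 4))                # bin means (least squares heights)
print(np.round(mom, 4))                # moment-matched heights, for comparison
```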

It is also possible to have histograms as $\color{red}{upper}$ and $\color{green}{lower}$ bounds to a function. Then the (raw) moments of the function lie between the (raw) moments of the histograms. Given an infinite-precision computer that does not suffer from ill-conditioned equations, this fact would, for $n\to\infty$, prove that an infinite sequence of moments determines the function. The picture shows, though, that solving the histogram equations gives $\color{red}{upper}$ and $\color{green}{lower}$ histograms that differ visibly from the exact ones, confirming once more the dreadful conditioning even for a moderate size (< $10$) of the equation system.


Posted here by this author: the question, a general answer, and a special answer. You are reading the special answer now; the reason for it is that, in my opinion, the general answer does not reveal enough details.

Histograms

The most trivial histogram function $y = f(x)$ on the interval $\left[0,1\right]$ may be defined as follows: $$ y = f_0 \quad \mbox{for} \quad 0 < x < 1 $$ The zeroth moment of this function is $m_0 = f_0$, and the function as a whole is trivially determined by its zeroth moment: $f_0 = m_0$.
A bit less trivial histogram function $y = f(x)$ on the same interval may be defined as follows: $$ y = f_0 \quad \mbox{for} \quad 0 < x < \frac{1}{2} \quad ; \quad y = f_1 \quad \mbox{for} \quad \frac{1}{2} < x < 1 $$ The first two moments of this function are: $$ \left[ \begin{array}{c} m_0 \\ m_1 \end{array} \right] = \left[ \begin{array}{cc} 1/2 & 1/2 \\ 1/8 & 3/8 \end{array} \right] \left[ \begin{array}{c} f_0 \\ f_1 \end{array} \right] $$ Inverting the matrix yields: $$ \left[ \begin{array}{c} f_0 \\ f_1 \end{array} \right] = \left[ \begin{array}{cc} 3 & -4 \\ -1 & 4 \end{array} \right] \left[ \begin{array}{c} m_0 \\ m_1 \end{array} \right] $$ Thus the function as a whole is determined by $m_0$ and $m_1$. If there are no restrictions upon the function values, then there are no restrictions upon the moments. If we demand, though, that the function values are positive, then further restrictions are imposed upon the moments. Assume that $m_0 > 0$ and define the mean $\mu = m_1/m_0$: $$ \left. \begin{array}{l} \frac{3}{4} - \frac{m_1}{m_0} > 0 \\ -\frac{1}{4} + \frac{m_1}{m_0} > 0 \end{array}\right\} \quad \Longrightarrow \quad \frac{1}{4} < \mu < \frac{3}{4} $$ Another sample histogram function $y = f(x)$ on the same interval may be defined as follows: $$ y = f_k \quad \mbox{for} \quad \frac{k}{3} < x < \frac{k+1}{3} \quad ; \quad k = 0,1,2 $$ The first three moments are: $$ \left[ \begin{array}{c} m_0 \\ m_1 \\ m_2 \end{array} \right] = \left[ \begin{array}{ccc} 1/3 & 1/3 & 1/3 \\ 1/18 & 1/6 & 5/18 \\ 1/81 & 7/81 & 19/81 \end{array} \right] \left[ \begin{array}{c} f_0 \\ f_1 \\ f_2 \end{array} \right] $$ Calculating the inverse matrix yields: $$ \left[ \begin{array}{c} f_0 \\ f_1 \\ f_2 \end{array} \right] = \left[ \begin{array}{ccc} 11/2 & -18 & 27/2 \\ -7/2 & 27 & -27 \\ 1 & -9 & 27/2 \end{array} \right] \left[ \begin{array}{c} m_0 \\ m_1 \\ m_2 \end{array} \right] $$ Thus the function as a whole is determined by $m_0,m_1,m_2$. If there are no restrictions upon the function values, then there are no restrictions upon the moments. If we demand, though, that the function values are positive, then further restrictions are imposed upon the moments. Assume again that $m_0 > 0$ and define $x = m_1/m_0$ and $y = m_2/m_0$ for the sake of displaying results in a diagram (at the bottom of this answer). Then: $$ \left. \begin{array}{l} f_0 > 0 \\ f_1 > 0 \\ f_2 > 0 \end{array} \right\} \quad \Longleftrightarrow \quad \left\{ \begin{array}{l} y - 4/3 x + 11/27 > 0 \\ y - x + 7/54 < 0 \\ y - 2/3 x + 2/27 > 0 \end{array} \right. $$ Three straight lines delineate the $\color{blue}{blue}$ area of an obtuse triangle with vertices $(1/6,1/27) , (5/6,19/27) , (1/2,7/27)$, as shown in the diagram at the bottom.
The moments must be such that $(x,y)$ values are within this triangle;
it has an area of $1/27$ .
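The small matrices above are easy to verify in exact arithmetic. The sketch below checks that the claimed $3\times3$ inverse is correct and spot-checks the triangle with the moments of $f\equiv 1$ (which must give $f_0=f_1=f_2=1>0$):

```python
from fractions import Fraction as F

A = [[F(1, 3),  F(1, 3),  F(1, 3)],
     [F(1, 18), F(1, 6),  F(5, 18)],
     [F(1, 81), F(7, 81), F(19, 81)]]
B = [[F(11, 2), F(-18), F(27, 2)],
     [F(-7, 2), F(27),  F(-27)],
     [F(1),     F(-9),  F(27, 2)]]

# A times B should be the identity matrix
print([[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)])

# The moments of f = 1 are m = (1, 1/2, 1/3); the recovered heights must all be 1
m = [F(1), F(1, 2), F(1, 3)]
print([sum(B[i][j] * m[j] for j in range(3)) for i in range(3)])
```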

A further restriction upon the histogram might be that it is, in some sense, a concave function. To be more precise, this would mean that the second order derivative is negative, which is of course impossible with histograms. A reasonable substitute is that the well-known discretization of the second order derivative should be negative: $$ f_0 - 2 f_1 + f_2 < 0 \quad \Longleftrightarrow \quad y - x + 1/6 < 0 $$ The inequalities for $f_0 > 0$ and $f_2 > 0$ remain the same.
Three straight lines delineate the $\color{green}{green}$ area of an obtuse triangle with vertices $(5/18,1/9) , (13/18,5/9) , (1/2,7/27)$ , as shown in the diagram at the bottom.
The moments must be such that $(x,y)$ values are within this triangle;
it has an area of $4/243 < 1/27$ .

Polynomials

The most trivial polynomial function $y = p(x)$ on the interval $\left[0,1\right]$ is the constant $y = a_0$, with zeroth moment $m_0 = a_0$. This case is identical to the trivial histogram above.
A somewhat less trivial polynomial $y = p(x)$ on the same interval is the linear one: $y = a_0 + a_1 x$. Its first two moments are: $$ \left[ \begin{array}{c} m_0 \\ m_1 \end{array} \right] = \left[ \begin{array}{cc} 1 & 1/2 \\ 1/2 & 1/3 \end{array} \right] \left[ \begin{array}{c} a_0 \\ a_1 \end{array} \right] $$ Inverting the (symmetric, integer) matrix yields: $$ \left[ \begin{array}{c} a_0 \\ a_1 \end{array} \right] = \left[ \begin{array}{cc} 4 & -6 \\ -6 & 12 \end{array} \right] \left[ \begin{array}{c} m_0 \\ m_1 \end{array} \right] $$ Thus the function as a whole is determined by $m_0$ and $m_1$. If there are no restrictions upon the function values, then there are no restrictions upon the moments. If we demand, though, that the function values are positive, then further restrictions are imposed upon the moments. Assume that $m_0 > 0$ and define the mean $\mu = m_1/m_0$: $$ \left. \begin{array}{l} \frac{2}{3} - \frac{m_1}{m_0} > 0 \\ -\frac{1}{3} + \frac{m_1}{m_0} > 0 \end{array}\right\} \quad \Longrightarrow \quad \frac{1}{3} < \mu < \frac{2}{3} $$ Another sample polynomial $y = p(x)$ on the same interval may be defined as follows: $$ y = a_0 + a_1 x + a_2 x^2 $$ It's a parabola, of course. The first three moments are: $$ \left[ \begin{array}{c} m_0 \\ m_1 \\ m_2 \end{array} \right] = \left[ \begin{array}{ccc} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{array} \right] \left[ \begin{array}{c} a_0 \\ a_1 \\ a_2 \end{array} \right] $$ Calculating the (symmetric, integer) inverse matrix yields: $$ \left[ \begin{array}{c} a_0 \\ a_1 \\ a_2 \end{array} \right] = \left[ \begin{array}{ccc} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{array} \right] \left[ \begin{array}{c} m_0 \\ m_1 \\ m_2 \end{array} \right] $$ Thus the function as a whole is determined by $m_0,m_1,m_2$. If there are no restrictions upon the function values, then there are no restrictions upon the moments. If we demand, though, that the function values are positive, then further restrictions are imposed upon the moments. Assume again that $m_0 > 0$ and define $x = m_1/m_0$ and $y = m_2/m_0$ for the sake of being able to display results in a diagram. Then assume that $p(0) > 0$, $p(1) > 0$ and that the parabola is concave (i.e. has a maximum). So what we have again is a function that is positive and concave: $$ \left. \begin{array}{l} a_0 > 0 \\ a_0+a_1+a_2 > 0 \\ a_2 < 0 \end{array} \right\} \quad \Longleftrightarrow \quad \left\{ \begin{array}{l} y - 6/5 x + 3/10 > 0 \\ y - 4/5 x + 1/10 > 0 \\ y - x + 1/6 < 0 \end{array} \right. $$ Three straight lines delineate the $\color{red}{red}$ area of an obtuse triangle with vertices $(1/3,1/6) , (2/3,1/2) , (1/2,3/10)$, as shown in the diagram below.
The moments must be such that $(x,y)$ values are within this triangle;
the triangle has an area of $1/180 < 4/243 < 1/27$.
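The $3\times3$ Hilbert matrix and its integer inverse can be verified exactly in the same way; a short sketch:

```python
from fractions import Fraction as F

H = [[F(1),    F(1, 2), F(1, 3)],
     [F(1, 2), F(1, 3), F(1, 4)],
     [F(1, 3), F(1, 4), F(1, 5)]]
Hinv = [[9,   -36,   30],
        [-36, 192, -180],
        [30, -180,  180]]

# H times Hinv should be the identity matrix
print([[sum(H[i][k] * Hinv[k][j] for k in range(3)) for j in range(3)] for i in range(3)])
```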

The grey area comes from restrictions on the moments, as explained in a Wikipedia reference (on the Hausdorff moment problem).
The moment sequence $\; m_0,m_1,m_2, \cdots , m_n , \cdots \;$ is completely monotonic iff: $$ (-1)^k(\Delta^k m)_n \ge 0 \quad \mbox{for all} \quad n,k \ge 0 \qquad \mbox{with} \qquad (\Delta m)_n = m_{n+1} - m_n $$ In order to avoid trivialities, strict inequalities will be assumed. Let's do it, but only for $m_0,m_1,m_2$: $$ n=0 \; , \; k=0 \quad : \qquad m_0 > 0 \\ n=1 \; , \; k=0 \quad : \qquad m_1 > 0 \\ n=2 \; , \; k=0 \quad : \qquad m_2 > 0 \\ n=0 \; , \; k=1 \quad : \qquad - (m_1 - m_0) > 0 \\ n=1 \; , \; k=1 \quad : \qquad - (m_2 - m_1) > 0 \\ n=0 \; , \; k=2 \quad : \qquad (\Delta m)_1 - (\Delta m)_0 = m_2 - 2 m_1 + m_0 > 0 $$ Summarizing and putting $\;x = m_1/m_0\;$ and $\;y = m_2/m_0\;$ for visualization purposes: $$ y > 0 \quad ; \quad y < x \quad ; \quad y > 2 x - 1 $$ Three straight lines delineate the $\color{gray}{grey}$ area of an obtuse triangle with vertices $(0,0) , (1/2,0) , (1,1)$ as shown in the above diagram. The triangle has an area of $1/4$.
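All four triangle areas can be checked with the shoelace formula; a small sketch in exact arithmetic:

```python
from fractions import Fraction as F

def area(p1, p2, p3):
    """Shoelace formula for the area of the triangle with vertices p1, p2, p3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

blue  = area((F(1, 6), F(1, 27)),  (F(5, 6), F(19, 27)), (F(1, 2), F(7, 27)))
green = area((F(5, 18), F(1, 9)),  (F(13, 18), F(5, 9)), (F(1, 2), F(7, 27)))
red   = area((F(1, 3), F(1, 6)),   (F(2, 3), F(1, 2)),   (F(1, 2), F(3, 10)))
grey  = area((F(0), F(0)),         (F(1, 2), F(0)),      (F(1), F(1)))

print(blue, green, red, grey)      # 1/27, 4/243, 1/180, 1/4
```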
By coincidence, the $\color{red}{red}$ polynomial triangle is a proper subset of the $\color{green}{green}$ histogram triangle, which is a proper subset of the $\color{blue}{blue}$ histogram triangle, which in turn is a proper subset of the $\color{gray}{grey}$ "Hausdorff" triangle.
But the question is: does there exist something like "coincidence" in mathematics?