Connecting Modern and Elementary Probability (2)

From the previous blog, we see that once the probability measure is absolutely continuous with a properly defined base product measure, then the resulted “joint density function”, the induced “marginal density function” and “conditional density” can be used freely in most elementary probability formulas, such as marginalization, conditioning, the Bayes formula, etc. However, the single non-absolutely-continuous variable case in Example 3 does arise sometimes. For example, consider the following Dirichlet process modeling

(1)   \begin{equation*} \begin{array}{l} \theta \sim G \\ G\sim DP\left( {\alpha ,G_{0} } \right) \\ \end{array} \end{equation*}

When we want to find the posterior Dirichlet process, it involves such a distribution,

(2)   \begin{equation*} G\vert \theta \sim DP\left( {\alpha +1,\frac{\alpha G_{0} +\delta \left( {\theta ,\cdot } \right)}{\alpha +1}} \right) \end{equation*}

Note that \theta may not be in \mathbb{R}, instead it can be multi-dimensional. Even in this case, the non-absolute-continuous part is still a point mass. Similarly, the resulted posterior p\left( {\theta_{i} \vert \theta_{\backslash i} } \right) in Gibbs sampling is also a non-absolute-continuous distribution with its non-absolute-continuous part having multiple atoms.

Similar situation happens when we want to associate a certain “risk functional” for learning with graphical models. I will postpone this issue until I learn more about stochastic process. For now, when dealing with usual probabilistic graphical models, I consider absolute continuous cases only.

In the following, there is a summarized claim which we can make before using the term “density” freely.

Let \Omega =\prod\limits_{i=1}^k {\Omega_{i} } be the sample space. Assume for each i, either \Omega_{i} =\mathbb{R} or \left| {\Omega _{i} } \right|\leqslant \aleph_{0}. Let \mu =\otimes_{i=1}^{k} \mu_{i} be the base measure over \Omega, where \mu_{i} is the counting measure over \Omega_{i} if \left| {\Omega_{i} } \right|\leqslant \aleph_{0}, and the Lebesgue measure over \mathbb{R} otherwise.

Let Z=\left\{ {Z_{1} ,...,Z_{k} } \right\} be a set of random variables. For any z =\left( {z_{1} ,...,z_{k} } \right) in \Omega, define Z_{i} \left( z \right)=z_{i} for each i=1,...,k

Let P be a probability measure over \left( {\Omega ,\mathcal{B}\left(\Omega \right)} \right), and assume that P<<\mu. Let p:\Omega \to \left[ {0,\infty } \right) be the “density” function defined as p=dP/d\mu. To simplify the notation, for any X\subset Z, denote \Omega _{X} =\prod\nolimits_{Z_{i} \in X} {\Omega_{i} }, and \mu_{X} =\otimes _{Z_{i} \in X} \mu_{i}

Two extra definitions: (i) For any X,Y as a partition of Z, define p\left( x \right)=\int_{\Omega_{Y} } {p\left( z \right)d\mu_{Y} }; (ii) For any X,Y\subset Z,X\cap Y=\emptyset, define p\left( {x\vert y} \right)=p\left( {x,y} \right)/p\left( y \right)

With these we have that p satisfies the following elementary probability formulas:

  • If X,Y\subset Z,X\cap Y=\emptyset, then p\left( x \right)=\int_{\Omega_{Y} } {p\left( {x,y} \right)d\mu_{Y} }
  • If X,Y\subset Z,X\cap Y=\emptyset, then p\left( {x\vert y} \right)p\left( y \right)=p\left( {y\vert x} \right)p\left( x \right)
  • If X\subset Z,A\in \sigma \left( X \right), then P\left( {\left[ {X\in A} \right]} \right)=\int_A {p\left( x \right)d\mu_{X} }
  • If X,Y\subset Z,X\cap Y=\emptyset, and A\in \sigma \left( X \right),B\in \sigma \left( Y \right), then
    \small P\left( {\left[ {X\in A} \right]\vert \left[ {Y\in B} \right]} \right)=P\left( {\left[ {\left( {X,Y} \right)\in A\times B} \right]} \right)/P\left( {\left[ {Y\in B} \right]} \right)
  • If X,Y,V\subset Z, X\bot Y\vert V, then p\left( {x,y\vert v} \right)=p\left( {x\vert v} \right)\cdot p\left( {y\vert v} \right)
  • If X,Y\subset Z,X\cap Y=\emptyset, E\left( {f\left( X \right)\vert Y=y} \right)=\int_{\Omega_{X} } {f\left( x \right)p\left( {x\vert y} \right)d\mu_{X} }

In this case, if we talk about a parametrized family p_\theta\left(x\right), it is really the same thing as the conditional density p\left(x \vert \theta\right) with certain conditions satisfied. Let’s talk about this in detail next time. (to be continued)

 

Leave a Comment January 1, 2012

Connecting Modern and Elementary Probability (1)

In the past quarter, I took two classes “Probability Theory” and “Real Analysis” together. It’s very interesting looking at the same content in two different perspectives. Elementary probability formulas like marginalization, conditioning, and the Bayes formula are widely used in the area of probabilistic graphical models, with the notion of “probability density” and “probability mass” used interchangeably very often. Yet it is not so obvious how they are consistent with rigorous modern probability theory. In addition, the distinguishing between “parameters” and “random variables” is not very clear, e.g., when can we view p_{\theta } \left( X \right) and p\left( {X\vert \theta } \right) in an identical way? Also, how to generalize the notion of “entropy” from discrete variables to continuous variables (even with non-absolute-continuity) is not clear. This passage is trying to justify these questions.

Elementary Probability in the Modern Viewpoint

Let \Omega =\prod\limits_{i=1}^k {\Omega_{i} } be the sample space, let \mu =\otimes_{i=1}^{k} \mu_{i} be a certain “base measure” over \Omega. Let Z=\left\{ {Z_{1} ,...,Z_{k} } \right\} be a set of random variables over \Omega. For each i=1,...,k, any z=\left( {z_{1} ,...,z_{k} } \right) in \Omega, define Z_{i} \left( z \right)=z_{i}. Let P be a probability measure over \left( {\Omega ,\mathcal{B}\left( \Omega \right)} \right), let X,Y,V\subset Z be sets of random variables, with the corresponding elements of base measure \mu_{X} ,\mu_{Y} ,\mu_{V}, and \Omega_{X} ,\Omega_{Y},\Omega_{V} the corresponding subspace of \Omega. We want to see when does there exist a notion of “joint density” or “joint mass” p\left( z \right) for which the following results hold, with proper definitions available:

  1. If X,Y is a partition of Z, then p\left( x \right)=\int_{\Omega_{Y} } {p\left( z \right)d\mu_{Y} }
  2. If X,Y\subset Z,X\cap Y=\emptyset, then p\left( x \right)=\int_{\Omega_{Y} } {p\left( {x,y} \right)d\mu_{Y} }
  3. If X,Y\subset Z,X\cap Y=\emptyset, then p\left( {x\vert y} \right)=p\left( {x,y} \right)/p\left( y \right)
  4. If X,Y\subset Z,X\cap Y=\emptyset, then p\left( {x\vert y} \right)p\left( y \right)=p\left( {y\vert x} \right)p\left( x \right)
  5. If X\subset Z,A\in \sigma \left( X \right), then P\left( {\left[ {X\in A} \right]} \right)=\int_A {p\left( x \right)d\mu_{X} }
  6. If X,Y\subset Z,X\cap Y=\emptyset, and A\in \sigma \left( X \right),B\in \sigma \left( Y \right), then
    \tiny P\left( {\left[ {X\in A} \right]\vert \left[ {Y\in B} \right]} \right)=P\left( {\left[ {\left( {X,Y} \right)\in A\times B} \right]} \right)/P\left( {\left[ {Y\in B} \right]} \right)
  7. If X,Y,V\subset Z, X\bot Y\vert V, then p\left( {x,y\vert v} \right)=p\left( {x\vert v} \right)\cdot p\left( {y\vert v} \right)
  8. If X,Y\subset Z,X\cap Y=\emptyset, E\left( {f\left( X \right)\vert Y=y} \right)=\int_{\Omega_{X} } {f\left( x \right)p\left( {x\vert y} \right)d\mu_{X} }

Example 1: Continuous Variables

Let \left( {\Omega =\mathbb{R}^{2},\mathcal{F}=\mathcal{B}\left({\mathbb{R}^{2}} \right),P} \right) be a probability space. Define random variables X,Y:\left( {\Omega ,\mathcal{F}} \right)\mapsto \left( {\mathbb{R},\mathcal{B}\left( \mathbb{R} \right)} \right), such that for each z=\left( {x,y} \right) in \Omega,  X\left( z \right)=x, Y\left( z \right)=y. In this case, the \sigma-field generated by Y is \sigma \left( Y \right)=\left\{{\mathbb{R}\times B\vert B\in \mathcal{B}\left( \mathbb{R} \right)} \right\}. Let \mu_{2} be the Lebesgue measure on \mathbb{R}^{2}, \mu_{1} be the Lebesgue measure on \mathbb{R}. Assume P<<\mu_{2}, and P is \sigma-finite.

Define p\left( {x,y} \right)=dP/d\mu_{2}, then \forall F\in \mathcal{F}, P\left( F \right)=\int_F {p\left( {x,y} \right)d\mu_{2} }. Define p\left( x \right)=\int_\mathbb{R} {p\left( {x,y} \right)d\mu_{1} \left( y \right)} and p\left( y \right)=\int_\mathbb{R} {p\left( {x,y} \right)d\mu_{1} \left( x \right)}. \forall A\in \mathcal{B}\left( \mathbb{R} \right), we have [1]

(1)   \begin{equation*} P\left( {\left[ {X\in A} \right]\vert Y} \right)=\frac{\int_A {p\left( {x,y} \right)d\mu_{1} \left( x \right)} }{p\left( y \right)} \end{equation*}

(1) is a function of z. \forall \left( {x_{1} ,y} \right),\left( {x_{2} ,y} \right)\in \Omega, \forall F\in \mathcal{F}, we have \left( {x_{1},y} \right)\in F\Leftrightarrow \left( {x_{2} ,y} \right)\in F, so \forall E\in \mathcal{B}\left( {\left[ {0,1} \right]} \right), we have P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{1} ,y} \right)} \in E\Leftrightarrow P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{2} ,y} \right)} \in E, this means that P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{1} ,y} \right)} =P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{2} ,y} \right)}. So P\left( {\left[ {X\in A} \right]\vert Y} \right) is constant over x

Let \nu_{y} \left( A \right)=P\left( {\left[ {X\in A} \right]\vert Y} \right), then \nu_{y} <<\mu_{1} \left( \mathbb{R} \right), we can define p\left( {x\vert y} \right)=d\nu_{y} /d\mu_{1}, then

(2)   \begin{equation*} P\left( {\left[ {X\in A} \right]\vert Y} \right)=\int_A {p\left( {x\vert y} \right)d\mu_{1} \left( x \right)} \end{equation*}

Thus, by the integration comparison lemma [1], we have

    \begin{equation} \[p\left( {x|y} \right) = p\left( {x,y} \right)/p\left( y \right),{\text{ a}}{\text{.e}}{\text{. }}{\mu _1}\left( x \right)\] \end{equation}

Similarly, we can define p\left( {y\vert x} \right).

By these definitions, the elementary formulas are true.

Example 2: Combining with Discrete Variables

Let S\subset \mathbb{Z} be a discrete set. Let \left( {\Omega =\mathbb{R}\times S,\mathcal{F}=\mathcal{B}\left( \Omega \right),P} \right) be a probability space, \omega =\left( {x,y} \right) is an arbitrary element in \Omega. Define measureable functions X,Y:\left( {\Omega ,\mathcal{F}} \right)\mapsto \left( {\mathbb{R},\mathcal{B}\left( \mathbb{R} \right)} \right) as X\left( \omega \right)=x,Y\left( \omega \right)=y. So \sigma \left( Y \right)=\left\{ {\mathbb{R}\times B\vert B\in \mathcal{P} \left( S \right)} \right\} and \sigma \left( X \right)=\left\{ {A\cap S\vert A\in \mathcal{B}\left( \mathbb{R} \right)} \right\}. Let \mu be the Lebesgue measure on \mathbb{R}, let \lambda be the counting measure on S Assume P<<\mu \times \lambda, and P is \sigmafinite.

Let p\left( {x,y} \right)=dP/d\mu \times \lambda, then \forall F\in \mathcal{F}, P\left( F \right)=\int_F {p\left( {x,y} \right)d\mu \times \lambda }. Define p\left( x \right)=\int_S {p\left( {x,y} \right)d\lambda \left( y \right)} and p\left( y \right)=\int_\mathbb{R} {p\left( {x,y} \right)d\mu \left( x \right)}. \forall A\in \mathcal{B}\left( \mathbb{R} \right), we have

(3)   \begin{equation*} P\left( {\left[ {X\in A} \right]\vert Y} \right)=\frac{\int_A {p\left( {x,y} \right)d\mu \left( x \right)} }{p\left( y \right)} \end{equation*}

(1) is a function of \omega. \forall \left( {x_{1} ,y} \right),\left( {x_{2} ,y} \right)\in \Omega, \forall F\in \mathcal{F}, we have \left( {x_{1} ,y} \right)\in F\Leftrightarrow \left( {x_{2} ,y} \right)\in F so \forall E\in \mathcal{B}\left( {\left[ {0,1} \right]} \right), we have
P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{1} ,y} \right)} \in E\Leftrightarrow P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{2} ,y} \right)} \in E, this means that P\left( {\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{1} ,y} \right)} =P\left({\left[ {X\in A} \right]\vert Y} \right)_{\left( {x_{2} ,y} \right)}. So P\left( {\left[ {X\in A} \right]\vert Y} \right) is constant over x

Let \nu_{y} \left( A \right)=P\left( {\left[ {X\in A} \right]\vert Y} \right), then \nu_{y} <<\mu \left( \mathbb{R} \right), we can define p\left( {x\vert y} \right)=d\nu_{y} /d\mu, then

(4)   \begin{equation*} P\left( {\left[ {X\in A} \right]\vert Y} \right)=\int_A {p\left( {x\vert y} \right)d\mu \left( x \right)} \end{equation*}

Similarly, we can define p\left( {y\vert x} \right).

By these definitions, the elementary formulas are true.

Example 3: Single Non-Absolute Continuous Variable

Let \left( {\Omega =\mathbb{R},\mathcal{F}=\mathcal{B}\left( \mathbb{R} \right),P} \right) be a probability space. Let X\left( x \right)=x. If P is not absolutely continuous w.r.t. \mu, then we have the Lebesgue decomposition P=\Bar{{P}}+\Hat{{P}}, such that \Bar{{P}}<<\mu ,\Hat{{P}}\bot \mu. It’s easy to show that \Hat{{P}} assign probability mass to at most countably many points. Denote the set of these points as S=\left\{ {s_{n} } \right\}_{n=1}^{\infty } \subset \mathbb{R}. Let \lambda_{S} be the counting measure over S, and let \Bar{{p}}=d\Bar{{P}}/d\mu, let \Hat{{p}}=d\Hat{{P}}/d\lambda_{S}. Then dP=\Bar{{p}}d\mu +\Hat{{p}}d\lambda_{S}. For any A\in \mathcal{F}, we have

(5)   \begin{equation*} \begin{gathered} P\left( {\left[ {X \in A} \right]} \right) = \int_A {dP} \hfill \\ = \int_A {\bar pd\mu } + \int_A {\hat pd{\lambda _S}} \hfill \\ = \int_A {\bar pd\mu } + \sum\limits_{n = 1}^\infty {\hat p\left( {{s_n}} \right)} \hfill \\ \end{gathered} \end{equation*}

So the notion of “density” consists of two parts, the continuous part \Bar{{p}} and the discrete part \Hat{{p}}, but we also see that P < < \mu + {\lambda _S}, so if we let \nu = \mu + {\lambda _S}, we have P < < \nu. Let p = dP/d\nu, then this “hybrid density” works fine for calculating probability of a certain event through integration w.r.t. \nu. In addition, p = \bar p + \hat p almost everywhere \mu. However, other elementary probability formulas may not hold as we are only working with the 1d case here.

Conclusion

Example 1 and 2 can be generalized to the case with more than 2 variables. As long as integration is through the right corresponding base measure, all elementary formulas hold. Example 3 seems difficult to generalize to more than 1 variable, since in 2d space, the non-absolute-continuous part is not so simple. How to combine such variables with example 1 and 2 is also not clear. I will keep working on this. (to be continued…)

References

[1] S. Resnick, A Probability Path, Birkhäuser Boston, 1999.
[Bibtex]
@book{AProbabilityPath,
    author = {Resnick, Sidney},
    day = {16},
    howpublished = {Hardcover},
    isbn = {081764055X},
    keywords = {book},
    month = oct,
    publisher = {Birkh\"{a}user Boston},
    title = {A Probability Path},
    url = {http://www.worldcat.org/isbn/081764055X},
    year = {1999}
}

Leave a Comment December 27, 2011

Boolean Fourier Transform and the Laplacian

Boolean functions have a natural form of Fourier Transform [1], corresponding to eigenfunctions of the discrete Laplacian operator. Here is a short justification.

Boolean Function Space

Let F=\left\{ {f\vert f:\left\{ {0,1} \right\}^{n}\to \left\{ {-1,+1} \right\}} \right\} be the space of functions over n Boolean inputs. Each f\in F is a 2^{n}-dimensional vector. Attach to the 2^{n} vector space an inner product as follows,

(1)   \begin{equation*} \left\langle {f,g} \right\rangle =\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n}} {f\left( x \right)g\left( x \right)} ,\mbox{\thinspace \thinspace }\forall f,g\in F \end{equation*}

The induced norm is \left\| f \right\|=\sqrt {\left\langle {f,f} \right\rangle }. For Boolean functions f\in F, we have

(2)   \begin{equation*} \left\| f \right\|^{2}=\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n}} {f\left( x \right)^{2}} =1 \end{equation*}

Discrete Laplacian Operator

Define a graph of the domain of f\in Fas G=\left\langle {V,E}\right\rangle, where

(3)   \begin{equation*} \begin{array}{l} V=\left\{ {0,1} \right\}^{n} \\ E=\left\{ {\left\langle {i,j} \right\rangle \vert x_{i} ,x_{j} \in V,H\left( {x_{i} ,x_{j} } \right)=1} \right\} \\ \end{array} \end{equation*}

here H\left( {\cdot ,\cdot } \right) is the Hamming distance over binary vectors. Let the weight of each e\in E to be set as 1. Let d\left( {v_{i} ,v_{j} } \right) be defined as the Geodesic distance over G. Then the discrete Laplacian operator is defined as

(4)   \begin{equation*} \left( {\Delta f} \right)\left( v \right)=\sum\limits_{w:d\left( {w,v} \right)=1} {\left( {f\left( w \right)-f\left( v \right)} \right)} \end{equation*}

Fourier Basis as Eigenfunctions

For each s\in \left\{ {0,1} \right\}^{n}, define a function \phi_{s} :\left\{ {0,1} \right\}^{n}\to \left\{ {-1,1} \right\} as

(5)   \begin{equation*} \phi_{s} \left( x \right)=\left( {-1} \right)^{\sum\limits_{i=1}^n {x_{i} s_{i} } } \end{equation*}

For \forall a,b\in \left\{ {0,1} \right\}^{n},a\ne b, \exists j\in\left\{ {1,...,n} \right\}, such that a_{j} \ne b_{j}, we have

(6)   \begin{equation*} \small \begin{array}{l} \left\langle {\phi_{a} ,\phi_{b} } \right\rangle =\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n}} {\phi_{a} \left( x \right)\phi_{b} \left( x \right)} \\ =\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n}} {\left[ {\left( {-1} \right)^{\sum\limits_{i=1}^n {x_{i} a_{i} } }} \right]\left[ {\left( {-1} \right)^{\sum\limits_{i=1}^n {x_{i} b_{i} } }} \right]} \\ =\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n}} {\left( {-1} \right)^{x_{j} a_{j} +x_{j} b_{j} }\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} a_{i} } }} \right]\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} b_{i} } }} \right]} \\ =\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n},x_{j} =0} {\left( {+1} \right)\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} a_{i} } }} \right]\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} b_{i} } }} \right]} \\ +\frac{1}{2^{n}}\sum\limits_{x\in \left\{ {0,1} \right\}^{n},x_{j} =1} {\left( {-1} \right)\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} a_{i} } }} \right]\left[ {\left( {-1} \right)^{\sum\limits_{i\ne j} {x_{i} b_{i} } }} \right]} \\ =0 \\ \end{array} \end{equation*}

By (2), we have \left\langle {\phi_{a} ,\phi_{a} } \right\rangle =1. So the set C=\left\{ {\phi_{s} \vert s\in \left\{ {0,1} \right\}^{n}} \right\} is orthonormal, with cardinality 2^{n}, which is finite and the same as the dimensionality of F. So the set C forms an orthonormal basis of F

(7)   \begin{equation*} F=span\left( C \right) \end{equation*}

In addition, plug (5) into (4) we have

(8)   \begin{equation*} \begin{array}{l} \left( {\Delta \phi_{s} } \right)\left( v \right) \\ =\sum\limits_{w:d\left( {w,v} \right)=1} {\left( {\left( {-1} \right)^{\sum\limits_{i=1}^n {w_{i} s_{i} } }-\left( {-1} \right)^{\sum\limits_{i=1}^n {v_{i} s_{i} } }} \right)} \\ =\sum\limits_{j=1}^n {\left( {\left( {-1} \right)^{\left( {1+v_{j} } \right)s_{j} +\sum\limits_{i\ne j} {v_{i} s_{i} } }-\left( {-1} \right)^{v_{j} s_{j} +\sum\limits_{i\ne j} {v_{i} s_{i} } }} \right)} \\ =\sum\limits_{j=1}^n {\left( {\left( {-1} \right)^{\left( {1+v_{j} } \right)s_{j} }-\left( {-1} \right)^{v_{j} s_{j} }} \right)\left( {-1} \right)^{\sum\limits_{i\ne j} {v_{i} s_{i} } }} \\ =\sum\limits_{j=1}^n {\left( {\left( {-1} \right)^{s_{j} }+1} \right)\left( {-1} \right)^{\sum\limits_{i=1}^n {v_{i} s_{i} } }} \\ =\left( {\sum\limits_{j=1}^n {\left( {\left( {-1} \right)^{s_{j} }+1} \right)} } \right)\phi_{s} \left( v \right) \\ \end{array} \end{equation*}

So \phi_{s} is an eigenfunction of \Delta, with eigenvalue \lambda_{s} =\left( {\sum\limits_{j=1}^n {\left( {\left( {-1} \right)^{s_{j} }+1}\right)} } \right), which is two times the number of 0′s in s. So there are in total n+1 different eigenvalues of \Delta, each has C_{n}^{\lambda_{s} /2} corresponding eigenvectors.

Since the space \left\{ {0,1} \right\}^{n} is finite, there exists amatrix notation L\left( {v',v} \right) of the Laplacian operator such that

(9)   \begin{equation*} \left( {\Delta f} \right)\left( v \right)=\sum\limits_{v'\in \left\{ {0,1} \right\}^{n}} {L\left( {v',v} \right)f\left( v \right)} \end{equation*}

by representation theory, we have

(10)   \begin{equation*} L\left( {v',v} \right)=\sum\limits_{s\in \left\{ {0,1} \right\}^{n}} {\lambda_{s} \phi_{s} \left( v \right)\phi_{s} \left( {v'} \right)} \end{equation*}

Questions

  • What is the most fundamental mathematical structure to define Fourier Transform, a topological space, a metrical space or even a group?
  • Do they all correspond to certain kinds of eigenfunctions?

Reference

[1] [doi] N. Linial, Y. Mansour, and N. Nisan, “Constant depth circuits, Fourier transform, and learnability,” J. ACM, vol. 40, pp. 607-620, 1993.
[Bibtex]
@article{BooleanFourier,
 author = {Linial, Nathan and Mansour, Yishay and Nisan, Noam},
 title = {Constant depth circuits, Fourier transform, and learnability},
 journal = {J. ACM},
 volume = {40},
 issue = {3},
 month = {July},
 year = {1993},
 issn = {0004-5411},
 pages = {607--620},
 numpages = {14},
 url = {http://doi.acm.org/10.1145/174130.174138},
 doi = {http://doi.acm.org/10.1145/174130.174138},
 acmid = {174138},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {AC0 circuits, Boolean functions, approximation, complexity, harmonic analysis learning},
}

Leave a Comment December 26, 2011

The Chamber of Secrets has been Opened

But where is the entrance?

It’s time to take a break.

Science fiction movies prefer to depict aliens as insect-like, which are commonly ugly, cold blooded, uncivilized, without fear of death etc. However, at the same time, they usually have technologies we can’t even imagine of (e.g. district 9, cowboy and aliens, etc.). So does it mean that those species, somehow, at some point, evolved into a certain state, that unconsciously they know how to let P = NP (or approximately, but with guarantee), such that there is no need to have a highly civilized society to support researches which are unpredictable, and inventions can just be done on demand?

Leave a Comment December 11, 2011

I Eat When I’m Upset

“Down here, I’m the King … I’m the King~~~ … … Sorry what? I can’t hear you cuz I’m sitting too high … …”

So in the summer, I drive people to airport, I help people moving their furniture, I store people’s belongings, I take care of people’s cat, and many others. I believe I’m doing the good things, I’m bringing happiness to the world. But why my researches don’t work out? I need you my God…

2 Comments August 5, 2011

重看《黑客帝国》

一大早看了《加勒比海盗4》,虽说是一场视听盛宴,但总觉得还是缺乏内涵。回家又看了一遍《黑客帝国》。这部十多年来我看了无数遍的电影,每次看还是能有不少新的领悟。制衡是最终级的艺术,两个老头在构建和谐社会,探究终级文明的存在形式上孜孜不倦的追求,让人感动。很多东西和系统动力学,优化这些工程学的东西之间都有千丝万缕的联系,可以相互借鉴。最近对这些东西很有兴趣,学起来却总是心有余而力不足。继续干活。。

当然视听盛宴型的电影还是要看的,期待7月1号的变形金刚和7月15号的哈利波特。

4 Comments May 21, 2011

Is There a Way We Can Be Immortal?

It seems rather too far away for the biological science area to discover the mechanism of natural death and find out a way to make existing people immortal. It will probably take more than 50 years. So I’m not hoping to have that applicable to me. Another possible approach is to first develop true AI and hopefully by using large scale AI, this science discovery process can be accelerated. However, based on my current knowledge of AI, the true AI, when developed, has limitation on its scale, which intrinsically, will not exceed the level of human intelligence to a non-trivial extent. What only we can do to improve the capability of AI is to improve the connectivity of AI systems to the external world, and the communications among individual AI systems. But after all, that is no more than a group of well-collaborated scientists.

However, there is still hope. We can do something like this: When we know truly what AI is and how intelligence works in our human brain, which is not that difficult, then we can develop a artificial brain(AB) that is functionally the same as the biological brain(BB), except that it is implemented by man made techniques, and will never die. A further step we need to go through is to develop a method to connect AB to BB, let the artificial neurons and biological neurons work together, exchange information freely. We make sure that at first, the portion of AB is 0%. Then the portion of AB goes up gradually by substituting neurons in BB one by one, using neurons in AB. Finally, we get a AB that never dies, and it is you. At any time when the substituting is happening, you are still alive and have your complete, unique personality with the hybrid brain. Once you are 100% in AB, then you can base your body on whatever you want, thus you never die. This is not that difficult compared to the biological approach, and is hopefully possible in the near future. I’m looking forward to that.

If you watched the Tron movie, this makes a reasonable explanation to the fact that people can change their physical forms to the digital correspondences, and still maintains their complete personalities continuously.

Leave a Comment April 26, 2011

The Secret

I believe this is the spirit of America. The law of attraction.

YouTube Preview Image

 

Leave a Comment April 17, 2011

倚楼听风雨,淡看江湖路

远离尘嚣的,不是住所,而是心境。
找一个遗世独立的地方,品读经典,追求卓越。
Inside U-haul. Pouring outside.

Leave a Comment April 10, 2011

Teaching Practice 3: Probabilistic Graphical Models

It’s so difficult to introduce Probabilistic Graphical Model in 8 minutes.

/video/SHAO_YuanglongLouis_Unit_III.flv

Leave a Comment February 19, 2011

Previous page


Categories

Archives

Recent Comments

Meta