Studies in Computer Sciences and Practices in Software Engineering

Confession of a would-be Bayesianian, Bayesian, I mean

My interest in Bayesian decision theory stems from my need to apply Bayesian analysis to a pattern analysis task I have been working on lately. I came across a chapter titled "Bayesian Decision Theory" while reading the book "Pattern Classification" by R. Duda et al.

I went back and forth through the chapter a few times, yet I still could not grasp the essence of "Bayesian decision" theory. I searched the Internet for the term and, apart from a few good references, most of the results offered no new insight.

Finally I discovered the sample pages of James O. Berger's book "Statistical Decision Theory and Bayesian Analysis" on Google Book Search. Two points stood out:

  • There's no such thing as the so-called null hypothesis in the overly simplified sense, i.e. nothing is perfect: coins are always unfair.
  • Making inferences without a context of decision making is a waste of effort, since the inference always contains less information than the data itself.

04:32:50 PM on 05/22/2008 by x mar - Computer Science - comments

Bayesian Decision Theory

theory
• noun (pl. theories) 1 a supposition or a system of ideas intended to explain something, especially one based on general principles independent of the thing to be explained. 2 an idea accounting for or justifying something. 3 a set of principles on which an activity is based. (Compact Oxford English Dictionary)
Decision theory is a formal theory of decision making under uncertainty. (Kathryn Blackmond Laskey, Department of Systems Engineering and Operations Research, George Mason University)
The approach to statistics which formally seeks to utilize prior information is called Bayesian analysis. Bayesian analysis and decision theory go rather naturally together, partly because of their common goal of utilizing nonexperimental sources of information, and partly because of some deep theoretical ties; thus, we will emphasize Bayesian decision theory in the book. (James O. Berger, Statistical Decision Theory and Bayesian Analysis)

03:32:14 PM on 04/29/2008 by x mar - Computer Science - comments

Cubic Spline Interpolation

Cubic spline interpolation of the following four points:

(0, 0)
(1, 1)
(2, 1)
(3, 0)
is

y = x - (x^3 - x)/5              for 0 ≤ x < 1
y = 1 - (3/5)((x-1)^2 - (x-1))   for 1 ≤ x < 2

05:32:42 PM on 12/11/2007 by x mar - Computer Science - comments

My First ATLAS Installation

BIG_MM N=1600, mf=3692.20,3566.40!

The times labeled Reference are for ATLAS as installed by the authors.
NAMING ABBREVIATIONS:
   kSelMM : selected matmul kernel (may be hand-tuned)
   kGenMM : generated matmul kernel
   kMM_NT : worst no-copy kernel
   kMM_TN : best no-copy kernel
   BIG_MM : large GEMM timing (usually N=1600); estimate of asymptotic peak
   kMV_N  : NoTranspose matvec kernel
   kMV_T  : Transpose matvec kernel
   kGER   : GER (rank-1 update) kernel
Kernel routines are not called by the user directly, and their
performance is often somewhat different than the total
algorithm (eg, dGER perf may differ from dkGER)


Reference clock rate=2593Mhz, new rate=2400Mhz
   Refrenc : % of clock rate achieved by reference install
   Present : % of clock rate achieved by present ATLAS install

                    single precision                  double precision
            ********************************   *******************************
                  real           complex           real           complex
            ---------------  ---------------  ---------------  ---------------
Benchmark   Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
=========   ======= =======  ======= =======  ======= =======  ======= =======
  kSelMM      325.9   310.2    327.4   310.2    167.7   130.9    169.9   175.4
  kGenMM       94.4    73.8     94.4    81.5     88.7    68.3     86.8    78.7
  kMM_NT       64.1    55.5     73.7    77.6     62.5    52.6     61.1    56.2
  kMM_TN       92.3    70.6     85.4    74.1     85.1    68.4     72.5    71.7
  BIG_MM      262.2   254.0    269.2   258.5    153.1   149.7    160.6   153.8
   kMV_N       44.0    39.9     60.7    45.3     28.6    20.0     45.3    36.3
   kMV_T       48.2    38.5     57.1    56.3     32.3    24.8     43.7    44.5
    kGER       22.4    20.8     44.1    44.4     11.6    12.0     26.5    26.8
make[1]: Leaving directory `/cygdrive/d/atlas3.8.0/Win_P4'

11:08:46 AM on 11/16/2007 by x mar - Computer Science - comments

Real Quick Start with VTK and TCL

  1. Download and install the VTK binaries.
  2. Open a command window and navigate to C:\Program Files\VTK 5.0\bin.
  3. Type vtk filename, where filename is the name of the Tcl script.

11:51:45 AM on 01/02/2007 by x mar - Computer Science - comments

Comparison of a Bézier Curve and a B-Spline Curve

A Bézier curve of degree 2 (order 3) is:

    C(u) = (1-u)^2 P0 + 2u(1-u) P1 + u^2 P2

One example of a rational B-spline curve of degree 2 is:

    C(u) = 1/(1+u^2) [(1-u)^2 P0 + 2u(1-u) P1 + 2u^2 P2]

Consider the case when the control points are:

    P0 = (1, 0), P1 = (1, 1), P2 = (0, 1)

The Bézier curve can then be written as:

    x(u) = (1-u)^2 (1) + 2u(1-u)(1) + u^2 (0) = 1 - u^2,
    y(u) = (1-u)^2 (0) + 2u(1-u)(1) + u^2 (1) = 2u - u^2,

and the B-spline curve takes the following form:

    x(u) = (1-u^2)/(1+u^2)
    y(u) = 2u/(1+u^2)


12:38:27 PM on 12/13/2006 by x mar - Computer Science - comments

Study Guide to TAOCP

1.4.2. Coroutines

First-time readers may want to study the coroutine example before trying to understand equation (1).

05:45:16 PM on 09/18/2006 by x mar - Computer Science - comments

Productivity of C# over C++

I've been spending a larger and larger portion of my daily programming time coding in C# and/or .NET Framework-flavored C++. Yes, programming under the .NET Framework is more productive, at least for the small projects I've been working on lately. No, it's not language features such as garbage collection that make programming in the .NET Framework more productive. It is the vast and well-organized .NET Framework Class Library (FCL) that makes the programmer much more productive.

In fact, C# removed some language features of C++ and, as a result, is a less strongly typed language than C++. Many errors that should be caught at compile time slip through the compiler into run time, and consequently it takes more time to debug a program. Using the latest decent-sized project I developed in C++ as an example, I'll look at two significant language differences and show what the impact of doing the project in C# would have been.

Garbage Collection: It's been a long time since I last had to track down a memory leak. With a well-thought-out design, coupled with the use of the Boost smart pointers, I have virtually eliminated memory leaks. In addition, the application requires near-real-time performance, and relying on an automatic garbage collector is probably not good enough.

Class Template: I think class templates receive far less appreciation than they deserve. Their usefulness goes far beyond the automatic instantiation of classes. In this application, there is an array of parameters (gain, frequency, filters, etc.) of the instrument that the software is designed to control, and the user adjusts them through the program's user interface. When the user sets the values of the parameters from the controls in the user interface, they pass through several software modules before arriving at the software's boundary: the module that writes the parameters to the external interface. When data comes in, the module that handles its reception needs to grab the parameters and insert them into the data. From there, this set of parameters is used when the data is being displayed. The parameters also need to pass through a couple of interfaces to reach the data storage.

Since the members of this set of parameters are treated the same way in the many interfaces they have to pass through, it is very hard for a programmer, after finishing the interface code for one parameter, to resist the temptation to copy the code, paste it several times in a row, and then CUSTOMIZE the "mass-produced" code to satisfy the needs of each individual parameter. Unfortunately, this practice easily produces errors that may go undetected until long after the code is written, and when an error is detected, each interface the parameter passes through needs to be inspected in order to find the missing or incorrect CUSTOMIZATION.

The class template comes to the rescue: create a class template for the parameters and an instantiation for each parameter:


enum ParameterType {
    eGAIN,
    eFREQ,
    eFILTER,

    ePLACEHOLDER = 0xFF
};

// The template argument tags each parameter with its own distinct type.
template <ParameterType T> class HardwareParameter
{
public:
    // The copy constructor allows the value to be
    // passed across the software interfaces
    HardwareParameter(const HardwareParameter& v) : m_value(v.m_value) {}

    // The constructor that creates class object to wrap
    // the native type, to be used at the software boundary.
    explicit HardwareParameter(unsigned int v) : m_value(v) {}

    // To release the value wrapped by an object.
    // Should only be used at the software boundary.
    unsigned int Value() const {
        return m_value;
    }

private:
    // the native value
    unsigned int m_value;
};

typedef HardwareParameter<eGAIN> GAIN;
typedef HardwareParameter<eFREQ> FREQ;
typedef HardwareParameter<eFILTER> FILTER;

With this, an interface that the parameters have to pass through can be defined as:


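class CRelay
{
public:
    template <ParameterType T> void SetParam(HardwareParameter<T> pv) {
        m_hwInterface.SetParam(pv);
    }

private:
    // the next module on the way to the hardware
    HardwareInterface m_hwInterface;
};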

Here's how to release the parameters at the software boundary:


class HardwareInterface
{
public:
    // declare the member function template
    template <ParameterType T> void SetParam(HardwareParameter<T> pv);

private:
    // the physical port
    PORT m_port;
};

// specialize for each parameter (at namespace scope, for portability)
template <> inline void HardwareInterface::SetParam<eGAIN>(GAIN pv) {
    m_port.SetPinHigh(1);
    m_port.SetValue(pv.Value());
}
template <> inline void HardwareInterface::SetParam<eFREQ>(FREQ pv) {
    m_port.SetPinHigh(2);
    m_port.SetValue(pv.Value());
}

Clearly, we cannot eliminate the need to handle the underlying value completely. But by limiting such activity to the software boundary, we greatly reduce the chance of mistakes and make any error much easier to detect.

Those are two differences between C++ and C#. As an experienced programmer, I do not consider the memory management and garbage collection provided by C# a big plus. On the other hand, the lack of class templates in C# is a huge drawback.

05:38:48 PM on 05/05/2006 by x mar - Computer Science - comments

TAOCP 4.1-1

To express a number a in radix -2, use the following algorithm:

  1. [Initialize] k ← 0;
  2. [Obtain the current least significant digit] b_k ← a mod 2;
  3. [Reduce] a ← (a - b_k)/(-2);
  4. [Are there more digits?] If a is zero, end; otherwise k ← k+1 and go to 2.

For example, for a = -10:

From the least significant digit to the most significant digit:
The least significant digit is: (-10) mod 2 = 0 and (-10 - 0)/(-2) = 5;
The second digit is: 5 mod 2 = 1 and (5 - 1)/(-2) = -2;
Followed by: (-2) mod 2 = 0 and (-2 - 0)/(-2) = 1;
Finally, 1 mod 2 = 1 and (1 - 1)/(-2) = 0;

Thus the result is: (-10)_10 = (1010)_{-2}

In fact, equation (1) neglects to define a notation for negative numbers. Consequently, nothing says that a notation of the form a_0 a_1 ..., (a_k > 0), is preferable to -a_0 a_1 ..., (a_k > 0). Thus we may also express -10 in radix -2 as: (-10)_10 = (-11110)_{-2}

10:55:49 PM on 05/04/2006 by x mar - Computer Science - comments

TAoCP Notes

  • Volume 1, p404 Huffman's method can be generalized to t-ary trees as well as binary trees.
  • Volume 2, p47 code for 1-5 percent, 95-99 percent.
  • Volume 2, p7, line 8-9 geiger counter.

12:37:54 PM on 04/29/2006 by x mar - Computer Science - comments