
Tuesday, April 21, 2015

Generic Programming in Java



Java first appeared in 1995 and was designed to let developers "write once, run anywhere". C++ was near its peak when Java was born, so OOP and portability have been supported since the very beginning. Every programming language has design principles; here I copy
the five primary goals in the creation of the Java language:
   1. Simple, Object-Oriented and Familiar
   2. Robust and Secure
   3. Architecture-neutral and portable
   4. High performance
   5. Interpreted, Threaded, and Dynamic 






Since then, Java has changed a lot. One of those changes is Generics (generic programming), introduced in Java 5. Generics are now widely used in practice, and the functional programming features of Java 8 make them even more indispensable.

It's not uncommon to see a method in the Java Stream API (class Collectors) like this:


public static <T> Collector<T,?,Long> counting();


or,

public static <T,K,D,A,M extends Map<K,D>> Collector<T,?,M>
       groupingBy(Function<? super T,? extends K> classifier,
                  Supplier<M> mapFactory,
                  Collector<? super T,A,D> downstream);
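To make these signatures less abstract, here is a small sketch of how the two collectors might be called. The class name CollectorsDemo, its two helper methods and the sample words are made up for illustration; only Collectors.counting() and the three-argument Collectors.groupingBy() come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the JDK.
class CollectorsDemo {

    // T = String; counting() reduces the stream to a Long.
    static long countAll(List<String> words) {
        return words.stream().collect(Collectors.counting());
    }

    // T = String, K = Integer (word length), D = Long (group size),
    // M = the map type produced by the TreeMap::new factory.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream().collect(
                Collectors.groupingBy(String::length, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "set", "list", "deque", "tree");
        System.out.println(countAll(words));      // prints 5
        System.out.println(countByLength(words)); // prints {3=2, 4=2, 5=1}
    }
}
```

The wildcards in the signature (? super T, ? extends K) are what allow a method reference such as String::length, which returns an int, to serve as the classifier for Integer keys.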



It's fun to write and play with code. Here is a snippet on the basics of Java generics:
/**
   Code snippet to initialize the generic type
   There are 4 compilation errors:
     1. Cannot make a static reference to the non-static type T
     2. Cannot make a static reference to the non-static type T
     3. Cannot instantiate the type T
     4. Cannot create a generic array of T
**/
public class Generic<T> {

    // returns an object of the generic type
    public static T returnGenericType_static() {
        return new T();
    }

    // returns an object of the generic type
    public T returnGenericType() {
        return new T();
    }

    // returns an array of the generic type
    public T[] returnGenericArray() {
        return new T[2];
    }
}
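For contrast, here is one possible way to get the same three results without compilation errors. It is only a sketch under a few assumptions: the class name GenericFactory and its method names are invented for this example, and the caller is expected to pass in a Supplier and a Class token, because type erasure leaves no T to instantiate at run time. Supplier and Array.newInstance are standard JDK APIs.

```java
import java.lang.reflect.Array;
import java.util.function.Supplier;

// Hypothetical class; a compilable counterpart to Generic<T>.
class GenericFactory<T> {

    private final Class<T> type;           // runtime token standing in for T
    private final Supplier<T> constructor; // knows how to build a T

    public GenericFactory(Class<T> type, Supplier<T> constructor) {
        this.type = type;
        this.constructor = constructor;
    }

    // A static method cannot see the class-level T, so it declares its own <U>.
    public static <U> U createStatic(Supplier<U> constructor) {
        return constructor.get();
    }

    // Instead of `new T()`, delegate to the supplier.
    public T create() {
        return constructor.get();
    }

    // Instead of `new T[n]`, build the array reflectively from the class token.
    // The cast is unchecked but safe here because `type` is a Class<T>
    // (assuming T is a reference type, not a primitive class such as int.class).
    @SuppressWarnings("unchecked")
    public T[] createArray(int length) {
        return (T[]) Array.newInstance(type, length);
    }

    public static void main(String[] args) {
        GenericFactory<StringBuilder> factory =
                new GenericFactory<>(StringBuilder.class, StringBuilder::new);
        StringBuilder one = factory.create();
        StringBuilder[] two = factory.createArray(2);
        System.out.println(one.length() + " " + two.length); // prints 0 2
    }
}
```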




Monday, March 16, 2015

What if I get "Run-Time Error 429: ActiveX Component Can't Create Object"?

It's not uncommon to call MATLAB from Excel. But I once got an error message: "Run-Time Error 429: ActiveX Component Can't Create Object" when I tried to start MATLAB within Excel 2010.

I used the following procedure to fix it:

1) In Windows 7, click "Start", type in "cmd", and then press "Ctrl+Shift+Enter" (i.e., run as an admin).
2) Run "MATLAB /regserver"
3) In Excel, click MATLAB add-in's Preferences and check "Use MATLAB desktop".

Now MATLAB and Excel are linked!


Friday, February 20, 2015

A video on the consumption-based model - CCAPM

CCAPM is different from the traditional CAPM. Here is the 'root' formula of the consumption-based model of asset pricing:
                                                              
                               p = E(m x) = E(x* x) = <x*, x>

where
                p   - the asset price
                m  - stochastic discount factor
                x   - payoff
                x* - the unique discount factor in the payoff space X.

This video explains it in detail.
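For readers who want the formula spelled out, here is the usual textbook form of the consumption-based model. This is a sketch, not taken from the post: the time subscripts, the subjective discount factor \beta, the utility function u and consumption c_t are assumptions added here.

```latex
% Pricing equation with explicit timing (assumed notation)
p_t = E_t\left[ m_{t+1}\, x_{t+1} \right],
\qquad
m_{t+1} = \beta \, \frac{u'(c_{t+1})}{u'(c_t)}

% x^* is the projection of m onto the payoff space X, so for every x in X
E(m\,x) = E\big[\operatorname{proj}(m \mid X)\, x\big] = E(x^{*} x) = \langle x^{*}, x \rangle
```

The second line is why the price can be read as an inner product: the part of m that is orthogonal to X does not affect the price of any payoff in X.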

Friday, January 30, 2015

Financial Data for Statistical Learning

There is plenty of financial and economic data that can be used in Machine Learning and Data Mining exercises. Here is my list:

FRED - Federal Reserve Economic Data.
CRSP - Center for Research in Security Prices
Ken-French - Fama-French 3-factor model data
John Cochrane - Data and Programs, Liquidity factor, Grumpy Economist.
Robert Shiller - Online data, and other research data.


Web scraping is the new way to get free "real-time" data. :)

Here is an introduction given by Christopher Reeves.

Python (urllib, re, scrapy) and R (quantmod) are my favorite tools for FIN/ECON data scraping.

Saturday, January 24, 2015

ffn


Introduction to ffn - Financial Functions for Python

ffn is a Python library for quantitative finance. It stands on the shoulders of giants (Pandas, Numpy, Scipy, etc.) and provides a vast array of utilities, from performance measurement and evaluation to graphing and common data transformations.


Its APIs support data retrieval, data manipulation, performance measurement, numerical routines and financial functions.



Statistical Significance vs. Economic Significance


Statistical Significance                 Economic Significance
Is it fitted well?                       Is it an important factor?
[large] t-stat or [small] p-value        [large] bj values
Low t-stat => need more sample data?     Small bj values => multicollinearity?

This is a sample regression output from fitlm() in MATLAB:

Linear regression model:
    y ~ 1 + x1

Estimated Coefficients:
                   Estimate     SE           tStat     pValue    
    (Intercept)    0.0028283    0.0023652    1.1958    0.23514
    x1             0.91903      0.045009     20.419    2.5487e-34


Number of observations: 86, Error degrees of freedom: 84
Root Mean Squared Error: 0.0143
R-squared: 0.832,  Adjusted R-Squared 0.83
F-statistic vs. constant model: 417, p-value = 2.55e-34
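The numbers in this output can be cross-checked with a few standard OLS identities. The sketch below assumes the usual definitions (tStat = Estimate / SE, the textbook adjusted R-squared formula, and F = t^2 for a model with a single regressor); the class and method names are made up for illustration and are not MATLAB functions.

```java
// Hypothetical helper class for re-deriving figures in the fitlm() output.
class RegressionChecks {

    // t-statistic of a coefficient: the estimate divided by its standard error.
    static double tStat(double estimate, double standardError) {
        return estimate / standardError;
    }

    // Adjusted R^2 for n observations and k regressors (intercept not counted).
    static double adjustedRSquared(double rSquared, int n, int k) {
        return 1.0 - (1.0 - rSquared) * (n - 1) / (double) (n - k - 1);
    }

    public static void main(String[] args) {
        double tIntercept = tStat(0.0028283, 0.0023652); // close to the reported 1.1958
        double tSlope = tStat(0.91903, 0.045009);        // close to the reported 20.419
        double adjR2 = adjustedRSquared(0.832, 86, 1);   // close to the reported 0.83
        double fStat = tSlope * tSlope;                  // close to the reported 417

        System.out.printf("t(intercept)=%.4f t(x1)=%.3f adjR2=%.3f F=%.0f%n",
                tIntercept, tSlope, adjR2, fStat);
    }
}
```

In the terms of the table above, x1 is statistically significant (large t-stat, tiny p-value) and its coefficient of about 0.92 is also large, while the intercept is neither.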


Saturday, November 29, 2014

Machine Learning vs Statistics (Statistical Learning)

In a sense, the difference between Machine Learning and Statistics (Statistical Learning) is the difference between a computer scientist and a statistician. In my humble opinion, the two are becoming more similar as time goes by. Still, Machine Learning is probably more empirical and attacks problems with various optimization algorithms, while Statistics (Statistical Learning) puts more emphasis on assumptions and model validation: it is more rigorous mathematically and more often talks about VC dimension, KL divergence, conjugate priors, and so on. In practice, ML tends to use MATLAB or Python while SL favors R.

Rob Tibshirani, a co-author of "The Elements of Statistical Learning", gave a comparison of machine learning and statistics; here is a screenshot for your convenience:
In useR! 2004, Brian D. Ripley, another statistician, said: 'machine learning is statistics minus any checking of models and assumptions'.
A lecturer at Udacity gives a good comparison in one sentence: Statistics focuses on analyzing the existing data and drawing valid conclusions, while Machine Learning focuses on making predictions and worries less about the assumptions as long as the predictions are good.
Since 2001, when William Cleveland introduced "Data Science" as an independent discipline, things have changed and the comparison has mattered less. As it has been put, "Machine learning and statistics may be the stars, but data science orchestrates the whole show."

In David Smith's blog, you can see CMU machine learning students "protest" at the G20 summit in Pittsburgh, September 25, 2009.