Tuesday, March 17, 2009

Al go rhythms

Larry Freeman (see link), a software engineer from Fremont, has taken the time to explain the "global effects removal" technique sometimes used by contenders in the Netflix Prize, and he reflects on a number of mathematical issues along the way. It's quite interesting for non-academic readers, since it comes in bite-sized pieces.

I'm already grateful to him for explaining and demonstrating the global effects removal technique. I've tried putting it into my implementation, but sadly it did not significantly improve my results. As you may guess, I'm using RSVD (regularized SVD) together with bias determination à la Funk and Paterek. I'm not using any blending whatsoever.
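To make the idea concrete, here is a minimal sketch of global effects removal as I understand it: subtract the overall mean, then a shrunk per-movie effect, then a shrunk per-user effect, and keep the residuals for the next stage. It is a simplification that covers only the first two effects. The function names and the shrinkage constants `alpha_movie` and `alpha_user` are placeholders of mine, not values taken from Larry's posts.

```python
from collections import defaultdict

def shrunk_means(pairs, alpha):
    """Map key -> sum(values) / (count + alpha): a mean shrunk toward 0."""
    total = defaultdict(float)
    count = defaultdict(int)
    for key, value in pairs:
        total[key] += value
        count[key] += 1
    return {k: total[k] / (count[k] + alpha) for k in total}

def remove_global_effects(ratings, alpha_movie=25.0, alpha_user=10.0):
    """ratings: list of (user, movie, rating) tuples.

    Returns (mu, movie_fx, user_fx, residuals). The alpha values are
    assumed placeholders and would need tuning on a probe set.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)
    # Movie effect: shrunk mean of (rating - global mean) per movie.
    movie_fx = shrunk_means(((m, r - mu) for _, m, r in ratings), alpha_movie)
    # User effect: estimated on what is left after the movie effect.
    user_fx = shrunk_means(
        ((u, r - mu - movie_fx[m]) for u, m, r in ratings), alpha_user
    )
    residuals = [(u, m, r - mu - movie_fx[m] - user_fx[u]) for u, m, r in ratings]
    return mu, movie_fx, user_fx, residuals
```

The residuals are what a later model (SVD, kNN, and so on) would then be trained on. Each effect is estimated on the residual of the previous one, so the order matters.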

One of the reasons it doesn't work so well is probably that my implementation already covers biases by training them along with the SVD factors in the first phase, which greatly reduces the benefit of removing global effects beforehand. I do notice, however, that using the technique requires my parameters to be changed severely. I had lambda=0.05f, as suggested by Paterek, for calculating the biases, but now that these have been factored out, I reckon the regularization has to be a lot higher to prevent over-fitting.
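For reference, training biases together with the factors looks roughly like the sketch below. This is not my actual code, just an illustration: the prediction is the global mean plus a user bias, a movie bias and the dot product of the factor vectors, all updated by stochastic gradient descent. Only `lam_bias=0.05` comes from the text above; the learning rate, the factor regularization `lam`, the number of factors and the initialization are assumptions.

```python
import numpy as np

def train_biased_svd(ratings, n_users, n_items, n_factors=8, lr=0.01,
                     lam=0.02, lam_bias=0.05, n_epochs=30, seed=0):
    """ratings: list of (user_index, item_index, rating) tuples."""
    rng = np.random.default_rng(seed)
    mu = float(np.mean([r for _, _, r in ratings]))
    bu = np.zeros(n_users)                            # user biases
    bi = np.zeros(n_items)                            # movie biases
    P = rng.normal(0.0, 0.1, (n_users, n_factors))    # user factors
    Q = rng.normal(0.0, 0.1, (n_items, n_factors))    # movie factors
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            # Biases and factors are regularized separately.
            bu[u] += lr * (err - lam_bias * bu[u])
            bi[i] += lr * (err - lam_bias * bi[i])
            pu = P[u].copy()  # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - lam * P[u])
            Q[i] += lr * (err * pu - lam * Q[i])
    return mu, bu, bi, P, Q

def rmse(ratings, model):
    """Root mean squared error of the model on the given ratings."""
    mu, bu, bi, P, Q = model
    errs = [r - (mu + bu[u] + bi[i] + P[u] @ Q[i]) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))
```

Because `bu` and `bi` soak up much of what the movie and user global effects would otherwise remove, feeding already-cleaned residuals into this loop leaves the biases with little to learn, which is consistent with the small gain I saw.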

From the first 3 iterations of the SVD algorithm, I can tell how things will turn out towards the end, unless my parameters are really wrong. I've seen a very good early RMSE of 0.951 (lower than the Netflix start average), which nonetheless ended in a disappointing final RMSE of 0.9120 (higher than my ultimate post of 0.9114).

One thing that would be interesting to research is how to factor global effects into the training phase. I imagine that I'm getting better results with biases than with global effects pre-processing. That's probably because all global effects are global averages, whereas some kind of gradient descent training may be able to discern the real features and the way they influence (or seem to influence) specific people. Those features may be different from the ones identified as global effects, but one thing I'd definitely like to do is use the date more: consider the ratings made on a certain day, find out how much they differed, and then make predictions based on the change of date.
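One possible reading of the date idea, as a rough sketch: take the residuals left after the other effects, group them by rating date, and use the shrunk mean residual of each day as a correction. I haven't built this yet; the function names and the shrinkage constant `alpha` are made up for illustration, and the same code would work keyed by (user, date) instead of date alone.

```python
from collections import defaultdict
from datetime import date

def daily_offsets(residuals, alpha=5.0):
    """residuals: iterable of (rating_date, residual) pairs.

    Returns a dict mapping each date to its mean residual, shrunk toward 0
    so that days with few ratings get a smaller correction.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for day, res in residuals:
        total[day] += res
        count[day] += 1
    return {d: total[d] / (count[d] + alpha) for d in total}

def adjust(prediction, day, offsets):
    """Add the day's offset if one was learned; unseen dates get no correction."""
    return prediction + offsets.get(day, 0.0)
```

Whether such a per-day offset generalizes to dates outside the training set is exactly the open question; a smoother function of time might be needed.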

This is not about the production date of the movie versus the date of the rating (I think that's very difficult, and there's probably no causal relationship between them). One of the tricks in this Netflix competition is to find causal relationships, as they make huge differences in the outcome. (Put another way: if a business rule can be developed from an observation that's mostly true, that's a great discovery!)
