What is interesting?
I was chatting this afternoon on one of those great channels with a small membership that seems to effortlessly mix computing, philosophy, politics, physics and just about anything else that comes along. I was thanking a speaker at a recent event for a great talk on SVMs - but I complained that I didn't want to have to concern myself with the kernels used in an SVM.
In some ways I wanted a system that I could just throw data at and that would work out the relationships by itself -- I cited my recent failed attempt to do so with the Perl Survey data. My major problem with what that attempt spat out was that it just wasn't interesting. And he asked me, what is "interesting"?
The problem was that my attempt had identified that the people who contributed to Perl 6 were pretty much the same people who had contributed to Perl 5. Not a very interesting fact.
So I suggested a way for a computer to figure out whether something is interesting, and I wondered if anyone reading this could come up with something better.
Assume you have a set of conditionals of the form X->Y ("if X then Y"). Then for each permutation of X and Y (apart from where X = Y) you can evaluate the following...
gh(X AND Y)/(min(gh(X),gh(Y)))
You look for the combinations with the lowest score and compare that score to the percentage of a set of data for which X->Y holds. Here gh() is a function that measures the number of Google hits for a term, and gh(X AND Y) is the number of Google hits that mention both X and Y. This should find the strongest correlations in the data that people on the internet are not already talking about.
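To make that a bit more concrete, here is a rough Python sketch of the idea. It is only one way of reading the description above: the hit counts are passed in as plain dictionaries instead of coming from a real search API, the numbers are invented for illustration, and combining the two measures as confidence * (1 - score) is my own assumption, since all the idea really says is that the two get compared. The names co_mention_score, confidence and rank_rules are made up for this example.

```python
from itertools import permutations


def co_mention_score(hits_x, hits_y, hits_both):
    """gh(X AND Y) / min(gh(X), gh(Y)); 0.0 if either term has no hits."""
    smaller = min(hits_x, hits_y)
    return hits_both / smaller if smaller else 0.0


def confidence(records, x, y):
    """Fraction of the records containing x that also contain y (the % for X->Y)."""
    with_x = [r for r in records if x in r]
    if not with_x:
        return 0.0
    return sum(1 for r in with_x if y in r) / len(with_x)


def rank_rules(records, hits, pair_hits):
    """Rank every ordered pair X->Y (X != Y), most 'interesting' first.

    Assumption: interestingness = confidence * (1 - co-mention score),
    so rules that hold in the data but are rarely co-mentioned score highest.
    """
    items = sorted(set().union(*records))
    ranked = []
    for x, y in permutations(items, 2):  # ordered pairs, never x == y
        score = co_mention_score(hits[x], hits[y], pair_hits[frozenset((x, y))])
        ranked.append((confidence(records, x, y) * (1.0 - score), x, y))
    return sorted(ranked, reverse=True)


# Invented example data: shopping baskets and made-up hit counts.
records = [{"nappies", "beer"}, {"nappies", "beer"}, {"beer", "crisps"}, {"crisps"}]
hits = {"nappies": 1000, "beer": 5000, "crisps": 2000}
pair_hits = {
    frozenset(("nappies", "beer")): 10,
    frozenset(("beer", "crisps")): 1500,
    frozenset(("nappies", "crisps")): 100,
}

ranked = rank_rules(records, hits, pair_hits)
best_value, best_x, best_y = ranked[0]
print(f"{best_x} -> {best_y}")
```

With these made-up numbers the top rule comes out as nappies -> beer, because it always holds in the toy data while the two terms are almost never mentioned together. A real version would need an actual source of hit counts, which is exactly the part I don't have.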
Now I just need a set of data that's wider than the Perl Survey, and for Google not to cut me off for hammering them -- but you guys can take it, right? ;-) And I might well have my computer find out that people who buy nappies buy beer.
