a journal of a researcher

Wednesday, July 07, 2004

The Complexity of Machine Learning Algorithms

Machine learning algorithms are complicated due to different implementations in different contexts. I just did a quick literature search and consulted my colleagues. Here are some results:
• Decision tree induction: O(mnlogn). n is the number of instances, m is the number of the features.—Witten and Frank, Data Mining.
• K nearest neighbour: O(n) (maybe O(mn) for m features)
• Tune the parameters for Neutral Network: NP hard.

Well, I have no energy to investigate more. People should have a clear answer to it. What I can know is that linear or quadratic algorithms for statistic learning are not rare.

Daniel gives a nice estimation for the complexity: if it takes a minute to process 1 MB, and you have 1 GB of data, it will take more than a year if it is O(n^2). Probably it is too prefect to find linear algorithm, then Daniel claims that O(nlogn) is a target.

Thanks for Daniel Lemire and Peter Turney for the inputs.

2 Comments:

  • Beautiful Yuhong. I think your blog is very interesting and full of useful info. More so than mine which is much less science-driven...

    By Blogger Daniel Lemire, at 5:22 PM  

  • I also post your opinion. Daniel Lemire is a frequent name here.

    By Blogger flydragon, at 9:24 AM  

Post a Comment

<< Home