a journal of a researcher: The Complexity of Machine Learning Algorithms

Machine learning algorithms are complicated due to different implementations in different contexts. I just did a quick literature search and consulted my colleagues. Here are some results:
• Decision tree induction: O(mnlogn). n is the number of instances, m is the number of the features.—Witten and Frank, Data Mining.
• K nearest neighbour: O(n) (maybe O(mn) for m features)
• Tune the parameters for Neutral Network: NP hard.

Well, I have no energy to investigate more. People should have a clear answer to it. What I can know is that linear or quadratic algorithms for statistic learning are not rare.

Daniel gives a nice estimation for the complexity: if it takes a minute to process 1 MB, and you have 1 GB of data, it will take more than a year if it is O(n^2). Probably it is too prefect to find linear algorithm, then Daniel claims that O(nlogn) is a target.

Thanks for Daniel Lemire and Peter Turney for the inputs.

a journal of a researcher

Wednesday, July 07, 2004

The Complexity of Machine Learning Algorithms

1 Comments:

About

Previous

NRC Bloggers

My Links

Feeds