### The Complexity of Machine Learning Algorithms

Machine learning algorithms are complicated due to different implementations in different contexts. I just did a quick literature search and consulted my colleagues. Here are some results:

• Decision tree induction: O(mnlogn). n is the number of instances, m is the number of the features.—Witten and Frank, Data Mining.

• K nearest neighbour: O(n) (maybe O(mn) for m features)

• Tune the parameters for Neutral Network: NP hard.

Well, I have no energy to investigate more. People should have a clear answer to it. What I can know is that linear or quadratic algorithms for statistic learning are not rare.

Daniel gives a nice estimation for the complexity: if it takes a minute to process 1 MB, and you have 1 GB of data, it will take more than a year if it is O(n^2). Probably it is too prefect to find linear algorithm, then Daniel claims that O(nlogn) is a target.

Thanks for Daniel Lemire and Peter Turney for the inputs.

• Decision tree induction: O(mnlogn). n is the number of instances, m is the number of the features.—Witten and Frank, Data Mining.

• K nearest neighbour: O(n) (maybe O(mn) for m features)

• Tune the parameters for Neutral Network: NP hard.

Well, I have no energy to investigate more. People should have a clear answer to it. What I can know is that linear or quadratic algorithms for statistic learning are not rare.

Daniel gives a nice estimation for the complexity: if it takes a minute to process 1 MB, and you have 1 GB of data, it will take more than a year if it is O(n^2). Probably it is too prefect to find linear algorithm, then Daniel claims that O(nlogn) is a target.

Thanks for Daniel Lemire and Peter Turney for the inputs.

## 2 Comments:

Beautiful Yuhong. I think your blog is very interesting and full of useful info. More so than mine which is much less science-driven...

By Daniel Lemire, at 5:22 PM

I also post your opinion. Daniel Lemire is a frequent name here.

By flydragon, at 9:24 AM

Post a Comment

<< Home