This article was originally written by AI Frontier


Compiled by | Natalie

Recently, a question posted on Zhihu caught our editors' attention. "There have been many public reviews comparing TensorFlow with other deep learning frameworks (mainly MXNet and Caffe), and TensorFlow performed poorly in a variety of scenarios," the asker wrote. "So how slow is TensorFlow, really? And where would you start if you wanted to improve its performance?"

So far the question has been viewed by more than 20,000 people and has 369 followers. Most of the answers poke fun at "ignorant amateur reviewers."

Yuxin Wu, who now works at Facebook AI Research (FAIR), gave a straightforward answer that earned praise from several AI luminaries, including Jia Yangqing, research scientist at Facebook and author of Caffe; Tian Yuandong, researcher at FAIR; and Zhou Yuefeng, an engineer on the Google Brain team.

Follow the WeChat public account "AI Front" (ID: AI-front)



Yuxin Wu's reply is as follows:

For common models that are not too small, running on a single machine, and when used correctly, TensorFlow is not significantly slower than any other framework. "Not significantly": within about 5%. There is no such thing as being "left in the dust." "Common models": CNNs without exotic layers, plus standard RNN/LSTM. Because Google is not stupid… the DGX-1 numbers posted by tensorflow/benchmarks matched Caffe2's when Caffe2 was first released.

Uncommon models: each framework uses different kernels, and how fast one runs comes down to luck — specifically, to whether some developer happened to optimize the non-mainstream kernel you are using. Models that are too small: framework overhead shows up once hundreds or thousands of iterations can run in one second. Multiple machines: TensorFlow scales poorly, though it seems to be working on efficient AllReduce recently. Uber's Horovod likewise bolts a scalable allreduce op onto TensorFlow…
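The "models that are too small" point can be made concrete with a back-of-the-envelope sketch (a hypothetical illustration, not a real benchmark — the millisecond figures below are assumed, not measured): when each training step finishes in well under a millisecond of real compute, a fixed per-step framework cost dominates whatever you measure.

```python
# Hypothetical illustration of why tiny models make bad benchmarks:
# a fixed per-iteration framework cost (op dispatch, scheduling,
# Python <-> runtime crossings) dominates when the compute per step
# is small. All numbers here are assumptions for illustration.

def overhead_fraction(framework_overhead_ms: float, compute_ms: float) -> float:
    """Fraction of each iteration spent on framework overhead."""
    return framework_overhead_ms / (framework_overhead_ms + compute_ms)

# A tiny model: 0.2 ms of real compute per step, 1.0 ms of overhead.
tiny = overhead_fraction(framework_overhead_ms=1.0, compute_ms=0.2)

# A realistically sized model: 50 ms of compute, same 1.0 ms overhead.
large = overhead_fraction(framework_overhead_ms=1.0, compute_ms=50.0)

print(f"tiny model:  {tiny:.0%} of each step is overhead")   # ~83%
print(f"large model: {large:.0%} of each step is overhead")  # ~2%
```

With the tiny model, a "benchmark" is mostly comparing the frameworks' dispatch overhead, not their kernels — which says little about real workloads.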

As for the so-called "public benchmark articles" in which TensorFlow supposedly gets trounced, the cause is mostly incorrect use of the various frameworks.

There are plenty of ways to get it wrong: using the wrong kernel, making extra copies without realizing it, building the graph inside the loop, stopping the timer before the computation has finished, using Keras. Of course, this is not entirely the users' fault; after all, the efficiency of many official TensorFlow examples is itself questionable.
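One of those mistakes — stopping the timer before the computation has finished — can be sketched in plain Python. The sketch below stands in for a framework whose calls merely *enqueue* work (as GPU execution typically does); `run_async` and `slow_kernel` are made-up stand-ins, not any real framework API.

```python
# Illustrative sketch (assumed names, not a real framework API):
# timing a call that only enqueues work vs. waiting for the result,
# the analogue of forgetting a device synchronize before stopping
# the clock.
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)

def slow_kernel():
    time.sleep(0.05)  # pretend this is 50 ms of GPU work
    return 42

def run_async():
    # Returns immediately with a handle, like an async enqueue.
    return pool.submit(slow_kernel)

# Wrong: stop the clock as soon as the call returns.
t0 = time.perf_counter()
future = run_async()
wrong_ms = (time.perf_counter() - t0) * 1000
future.result()  # drain the pending work before the next measurement

# Right: wait for the result before stopping the clock.
t0 = time.perf_counter()
result = run_async().result()
right_ms = (time.perf_counter() - t0) * 1000

print(f"naive timing: {wrong_ms:.2f} ms, correct timing: {right_ms:.2f} ms")
```

The naive measurement reports nearly zero time regardless of how slow the kernel actually is — exactly the kind of number that makes an amateur benchmark look dramatic and mean nothing.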

I wanted to answer this question because of a Reddit post I saw today:

www.reddit.com/r/MachineLe…

To quote Soumith (author of Torch/PyTorch) from that post:

Amateur benchmarks are hard, and more often than not they are quite wrong. In recent times I don't think I've seen a single amateur benchmark that didn't screw up in its first couple of iterations.

In short: he has never seen an amateur benchmark get it right.

Another Zhihu user named “Lidang” was equally scathing: “Language and framework are not the bottleneck, IQ is.”

Feel free to browse the original thread:

www.zhihu.com/question/26…

For more content, follow AI Front (ID: AI-front) and reply "AI", "TF", or "big data" to receive the AI Front series of PDF mini-books and skill maps.