Author | Jameson Toole
Translator | Hong-xing Hu
Editor | Natalie
AI Front Line introduction: Python is popular in data analysis and machine learning because it is easy to use, but its performance has always been a problem. Things seem to have changed somewhat since Google launched Swift for TensorFlow. Will Swift become a better programming language for data analysis and machine learning than Python?








A week into my first-year physics course at the University of Michigan, the professor assigned a set of problems that required simulating a multibody system. They were due that Friday. That was the week I learned my first programming language, MATLAB.

That is how I have picked up a dozen languages, piecemeal, over the past decade. Apart from C++ in my entry-level CS courses and a Java-based database course in graduate school, I have never formally studied software engineering. For me, coding is about completing assignments, analyzing data to answer questions, or implementing my ideas. Sometimes that means getting familiar with the details of algorithms or data structures, but I have never found myself coding for coding’s sake. I have no problem with generics (don’t @ me). I think that describes the vast majority of data scientists and machine learning engineers I know. When choosing tools for the problem at hand, we often prioritize usability and efficiency over software fundamentals.

Fast forward to 2018, and Python has been embraced by the machine learning and data science communities. Python’s syntax is easy to get started with, and it works well as a scripting layer over underlying C libraries when you need to optimize performance. For me, however, the most attractive part of Python is that it can be used to build entire systems end to end. Scientific computing tools such as NumPy, pandas, Matplotlib, and Jupyter Notebook have great community support. And when it comes to building applications around your work, frameworks like Flask and Django are powerful enough to scale to hundreds of millions of users. I can build the entire system in a single programming language.

I’ve enjoyed working with Python for the past 10 years. But I don’t think I will be using it in the future. Next, I will be using Swift.

At the TensorFlow Dev Summit 2018, Chris Lattner from Google announced that TensorFlow will soon support Swift. Don’t think of Swift for TensorFlow as simply a wrapper around TensorFlow for iOS devices. It’s much more than that. The project seeks to change the default tools used by the entire machine learning and data science ecosystem.

As I became familiar with the Python stack, two other technology trends were creeping up behind it: the resurgence of artificial intelligence through neural networks and deep learning, and the spread of AI applications onto billions of smartphones and IoT devices. Both trends require high-performance computing, and Python doesn’t seem like a good fit for either.

Deep learning passes massive amounts of data through long chains of network layers, which takes a large amount of computation. To perform these calculations quickly, the software must be compiled for specialized processors with thousands of threads and cores. These problems are even more pronounced on mobile devices, where power consumption and heat are big concerns. Optimizing for a slower processor with less memory is extremely challenging. So far, Python hasn’t been much help.

This is a big problem for data scientists and machine learning researchers. We ended up hacking our way into using GPUs for computation, and many of us struggle with mobile application development. Learning a new language is not impossible, but the switching costs are high. Look at cross-platform projects like Node.js and React Native to see how costly the switch can be. I had planned to stick with my NumPy arrays forever, but now it’s impossible to get the job done without giving up Python. It’s no longer a good enough solution.

In a world dominated by machine learning and edge computing, Python’s inability to serve as an end-to-end language is the driving force behind Swift for TensorFlow. Chris Lattner argues that Python’s dynamic typing and interpreter prevent us from going further. In his words, engineers need a language that treats machine learning as a “first-class citizen.” He gives an insightful account of the technical reasons why a new approach to compiler analysis is needed to change how programs that use TensorFlow are built and executed, but the most compelling part of his argument is about the programmer’s experience.
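To make the dynamic-typing point concrete, here is a minimal Python sketch. It is not taken from Lattner’s talk, and the `matmul` and `try_matmul` helpers are hypothetical names written only for illustration. It assumes matrices stored as plain nested lists and shows that a wrong shape or a wrong element type is only discovered when the offending call actually runs, even though the function carries type hints.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices stored as nested lists (hypothetical helper)."""
    if len(a[0]) != len(b):
        # The interpreter does not enforce the type hints above, so a shape
        # mismatch can only be detected here, while the program is running.
        raise ValueError(
            f"shape mismatch: {len(a)}x{len(a[0])} @ {len(b)}x{len(b[0])}"
        )
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def try_matmul(a, b) -> str:
    """Return 'ok', or a short description of the run-time failure."""
    try:
        matmul(a, b)
        return "ok"
    except (ValueError, TypeError) as exc:
        return f"{type(exc).__name__}: {exc}"


weights = [[1.0, 2.0], [3.0, 4.0]]   # 2x2
inputs = [[1.0], [1.0]]              # 2x1, compatible with weights
bad_inputs = [[1.0, 1.0, 1.0]]       # 1x3, wrong shape
labels = [["cat"], ["dog"]]          # right shape, wrong element type

print(try_matmul(weights, inputs))      # succeeds
print(try_matmul(weights, bad_inputs))  # fails only when executed (ValueError)
print(try_matmul(weights, labels))      # fails only when executed (TypeError)
```

In a statically typed, compiled language, the element-type mistake in the last call could be rejected before the program ever runs, which is one reading of the argument above.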


A programming language that makes machine learning easier should offer:

  • Readable, efficient syntax

  • Ability to script

  • Notebook-like interface

  • Third-party libraries built by active communities

  • Automatic compilation for specific hardware, from TPUs to mobile chips

  • Ability to run on mobile devices

  • Performance close to C

Lattner and his team are adding all these features with Swift for TensorFlow. Swift’s syntax is almost as concise as Python’s, and it comes with an interpreter for scripting. Best of all, it can run existing Python code, which eases porting, and because Swift is now the default language for iOS application development, it’s easy to deploy to mobile devices. Swift’s open source compiler and static typing make it possible to build for a specific AI chipset. As one of the creators of Swift, Lattner may be biased, but I am thoroughly convinced by his view of machine learning.

You can see Chris Lattner’s entire talk here.

Read the original article:

https://heartbeat.fritz.ai/why-data-scientists-should-start-learning-swift-66c3643e0d0d