Just as the noise around Google quiets down, Apple is back in the fight. This is the battle for AI, fought from mobile applications.

With the recent release of Apple’s mobile-optimized Core ML, how strong is the demand from mobile developers? What does Prisma, last year’s hugely successful AI app, tell us? How does Apple’s new weapon, Core ML, work? What are Apple’s ambitions for mobile machine learning? Can Apple really dent the dominance of Google and Facebook? Will the future of AI change? This article answers these questions.

Author | hu



At WWDC, Core ML was Apple’s big gift to mobile developers. With it, developers can import trained machine learning models into iOS apps and have in-app AI computations accelerated at the system level. Tasks supported through the Core ML API include image recognition, text processing, face detection and tracking, and more.

Apple’s move is in line with what Gartner predicted in October: the tech industry will soon become a battle between intelligent systems that can learn, adapt, and act autonomously. By 2018, the vast majority of the world’s 200 largest companies will have launched smart applications and will be using the full toolkit of big data analytics to redefine and enhance the user experience.

Gartner’s claim, in turn, rests on the market performance of mobile applications that already have machine learning capabilities: SnapChat’s popularity among teenagers, Prisma’s breakout success last year, and the many beauty-camera apps widely used in China. Given the right scenario, AI-enabled mobile apps consistently stand out.

Prisma as the representative of AI-powered mobile apps

Looking back at Prisma’s story, the clues are all there.

In early 2016, Alexey Moiseenkov, a tech enthusiast born in the 1990s, came across two machine learning papers on images: A Neural Algorithm of Artistic Style and Texture Synthesis Using Convolutional Neural Networks.

A Neural Algorithm of Artistic Style: extracting the artistic style of a painting

Based on this research, Gatys et al. launched DeepArt, a paid website, in 2015, which automatically repaints a user’s photo in the style of the old masters. The steps are as follows:

1. Identify the photo uploaded by the user;

2. Learn the artistic style information in the image;

3. Output a redrawn artwork.

Texture Synthesis Using Convolutional Neural Networks: redrawing and rendering photo textures

This is similar to how humans learn to paint:

1. Look at a work and form a preliminary idea of the painting;

2. Study the style and brushwork of the work;

3. Reproduce that style and brushwork in a new painting.

By extracting the styles of different famous paintings, a photo can be rendered with different effects.
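For context, the core idea of the Gatys et al. paper can be compressed into a single optimization objective (stated here from the original paper, not from this article):

$$\mathcal{L}_{total}(\vec{p}, \vec{a}, \vec{x}) = \alpha\,\mathcal{L}_{content}(\vec{p}, \vec{x}) + \beta\,\mathcal{L}_{style}(\vec{a}, \vec{x})$$

Here p is the user’s photo, a is the style image (the master painting), x is the generated image, and the weights α and β trade off content fidelity against style. The generated image is obtained by iteratively minimizing this loss, which is also why the original optimization-based method is so slow per photo.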

But DeepArt, available only on the web, was slow (at least 20 minutes to render a new photo) and expensive. Gatys and his colleagues failed to see the opportunity in the mobile market, which inspired the young Alexey and convinced him that "AI + image processing" on mobile had to be promising.

Alexey knew that the key to success on mobile was to drastically cut image processing time, which meant greatly improving the efficiency of the neural network. To this end, he spent two months studying the Gatys et al. algorithm, simplifying the network as much as possible without hurting image quality on mobile devices. This optimization lets Prisma process a photo in just a few seconds, roughly 1,000 times faster than DeepArt.

The subsequent development took only a month and a half. Prisma launched on the Apple App Store on June 11. Within two weeks it had been downloaded 1.6 million times; three months later it had 70 million downloads worldwide; and by the end of the year Prisma had won App of the Year awards from both Google and Apple.

Prisma’s iterative upgrades have all revolved around speeding things up, and the key has been continuously improving the image processing algorithms.

Initially, Prisma’s three neural networks were deployed in the cloud. After the user selects an effect, the Prisma app uploads the photo to the server; the convolutional neural network in the cloud interprets the photo, renders a new one in the chosen style, and sends it back down to the user’s phone.

As Prisma expanded overseas, the latency between overseas users and Prisma’s servers in Moscow became the main cause of slow loading. Alexey had to find a way to run the processing on the phone itself. In August 2016, Prisma became the first mobile app to run a style-transfer neural network offline, directly on the iPhone’s processor. A 1080×1080 photo could be transformed into a new style in half a second.

    As you can see, the need to deploy machine learning algorithms on mobile phones is extremely strong.

After all, Prisma isn’t the only one using AI to edit or create photos. In March 2016, SnapChat’s dynamic camera effects, Lenses, built on technology from Looksery, a Ukrainian company it had acquired in September 2015, became a hit. Facebook followed suit in August, rolling out filters on Instagram, Messenger, and WhatsApp to compete with SnapChat; its technology came from Masquerade (MSQRD), a Belarusian company it bought in March 2016.

So, starting this year, Facebook and Google have been shifting their machine learning frameworks to mobile, and the trend of running AI algorithms directly inside mobile apps looks all but irreversible. Which brings us to Core ML, our star of the day.

Core ML: machine learning models for mobile

Unlike Google’s TensorFlow and Facebook’s Caffe2, Apple’s Core ML is optimized specifically for on-device machine learning on iOS, minimizing memory footprint and power consumption. It also ensures that an application keeps working and responding even when the network connection is lost.

Core ML underpins image processing in the iOS Vision API, natural language processing in the Foundation API, and evaluation of learned decision trees in GameplayKit. Apple has four image recognition models ready for developers, provided in Core ML’s .mlmodel format: Places205-GoogLeNet, ResNet50, Inception v3, and VGG16.
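As a rough illustration (a minimal sketch, not Apple’s sample code), this is what classifying an image with one of these bundled models can look like through the Vision API. It assumes the ResNet50 model file has been added to the Xcode project so that a wrapper class named Resnet50 is generated (the generated class name follows the .mlmodel file name):

import UIKit
import CoreML
import Vision

// Sketch: classify a UIImage with a bundled Core ML model via the Vision API.
// Assumes Resnet50.mlmodel has been added to the project, so Xcode generates a Resnet50 class.
func classify(_ image: UIImage) {
    guard let cgImage = image.cgImage,
          let visionModel = try? VNCoreMLModel(for: Resnet50().model) else { return }

    // The request's completion handler receives classification observations sorted by confidence.
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        if let best = (request.results as? [VNClassificationObservation])?.first {
            print("\(best.identifier): \(best.confidence)")
        }
    }

    // Vision takes care of scaling and cropping the image to the model's expected input size.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}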

How do I create a Core ML model?

With Core ML, developers can also import their own trained machine learning models into applications and use them directly. Supported model types include neural networks, tree ensembles, support vector machines, and generalized linear models.

Apple’s Core ML Tools Python package converts trained models created with third-party machine learning tools into the Core ML model format. Supported frameworks include Keras, Caffe, scikit-learn, XGBoost, and LIBSVM.

Take a Caffe model (.caffemodel) as an example: you can pass it to the coremltools.converters.caffe.convert method:

import coremltools
coreml_model = coremltools.converters.caffe.convert('my_caffe_model.caffemodel')

    Then, save the result in Core ML model format:

coreml_model.save('my_model.mlmodel')

For formats not supported by Core ML, such as TensorFlow models, you have to write your own conversion tool that maps the model’s inputs, outputs, and architecture into the Core ML format. This means carefully following the conversion tools provided with Core ML Tools, which demonstrate how to convert various third-party models by defining each layer of the model architecture and the connections between layers.

    How do I add and use Core ML models in my application?

Take MarsHabitatPricer.mlmodel, a trained model that predicts the price of a habitat on a Mars colony, as an example:

    First we need to add the model to the Xcode project:

That is, drag the model file into the Project Navigator.

For MarsHabitatPricer.mlmodel, Xcode generates corresponding interfaces to represent the model itself (MarsHabitatPricer), the model input (MarsHabitatPricerInput), and the model output (MarsHabitatPricerOutput).

    Using the constructor of the generated MarsHabitatPricer class, you can create this model:

    let model = MarsHabitatPricer()

    Get the input value and pass it to the model:

    The sample application uses UIPickerView to get input values for the model from the user.

func selectedRow(for feature: Feature) -> Int {
    return pickerView.selectedRow(inComponent: feature.rawValue)
}

let solarPanels = pickerDataSource.value(for: selectedRow(for: .solarPanels), feature: .solarPanels)
let greenhouses = pickerDataSource.value(for: selectedRow(for: .greenhouses), feature: .greenhouses)
let size = pickerDataSource.value(for: selectedRow(for: .size), feature: .size)

Use the model to make a prediction:

guard let marsHabitatPricerOutput = try? model.prediction(solarPanels: solarPanels, greenhouses: greenhouses, size: size) else {
    fatalError("Unexpected runtime error.")
}

By reading the price property of marsHabitatPricerOutput, you can get the predicted price and display it in the application’s UI.

let price = marsHabitatPricerOutput.price
priceLabel.text = priceFormatter.string(for: price)

Note: the generated prediction(solarPanels:greenhouses:size:) method can throw an error; in this sample application the expected input type is Double.

A common error with Core ML is passing input data whose type does not match what the model expects, for example an image supplied in the wrong format.
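As a small hedged sketch (the MarsHabitatPricer names come from the sample above; the error handling itself is illustrative, not Apple’s sample code), catching the thrown error instead of calling fatalError makes such mismatches easier to diagnose:

do {
    let output = try model.prediction(solarPanels: solarPanels,
                                      greenhouses: greenhouses,
                                      size: size)
    priceLabel.text = priceFormatter.string(for: output.price)
} catch {
    // Typical cause: an input whose type does not match what the model expects
    // (Double in this sample), or an image supplied in the wrong pixel format.
    priceLabel.text = "Prediction failed: \(error.localizedDescription)"
}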

    Build and run the Core ML application

Xcode compiles the Core ML model into a resource that is optimized to run on the device. This optimized model representation is included in your application bundle and is used for predictions while the application runs on the device.

Apple says the optimized image recognition on the iPhone can be up to six times faster than on the Google Pixel.

    What might be the impact of Core ML

WWDC isn’t over yet, and with the Keynote having grabbed everyone’s attention, five Core ML sessions remain on the conference schedule:

    • Introducing Core ML

    • Vision Framework: Building on Core ML

    • Core ML in depth

    • Core ML and Natural Language Processing Lab

    • Core ML & Natural Language Processing Lab

The focus is on the application-oriented Vision framework and natural language processing, and Apple’s experience and support are sure to excite plenty of developers: find the right scenario, and new phenomenal AI applications like Prisma will keep appearing.

As the barrier to using machine learning in iOS applications drops, more and more engineers will focus on concrete applications of AI. With the influx of iOS developers, machine learning is no longer the preserve of algorithm engineers.

    On the other end of the spectrum, it will be interesting to see how Google’s TensorFlow Lite for Android will perform when it launches this year.

After all, Google doesn’t make mobile chips, so how to optimize phone hardware for AI applications is a big question for Google: will it work with Qualcomm’s Neural Processing Engine, or miniaturize its own TPU?

As for Caffe2, how Facebook will optimize AI applications at the system level is an equally interesting question.

    As Gartner points out, things get a lot more interesting when AI becomes a major battleground in the tech industry.

References:

    https://developer.apple.com/documentation/coreml

    https://developer.apple.com/wwdc/schedule/

    http://www.gartner.com/smarterwithgartner/gartners-top-10-technology-trends-2017/

    https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b#.cg37ae5f0

    https://zhuanlan.zhihu.com/p/26746283

    https://developer.apple.com/videos/play/wwdc2017/703/