This article is a review of WWDC 2018 Session 708 and Session 709; the videos and accompanying PDFs are linked here: What’s New in Core ML, Part 1 and What’s New in Core ML, Part 2. We will first review the basic background of Core ML, and then focus on the updates to Core ML’s applications and tools.

Core ML review

Core ML is a machine learning framework introduced by Apple in 2017. It mainly supports image classification and text information processing. The basic process has three steps: acquire a model, import it and generate an interface, and program against that interface. Let’s break down each step in detail:

  1. Get the model. In 2017 there were two main ways: downloading ready-made models from Apple’s website, or converting models generated by third-party frameworks into the Core ML format. To facilitate conversion, Apple introduced Core ML Tools, written in Python. In 2018, the native Create ML framework was added to generate Core ML models directly from data. Core ML Tools has simple, straightforward syntax, but at the time it supported only a few third-party frameworks, and the models it produced were large and not customizable.

  2. Import the model and generate an interface. Drag the model file into Xcode, which automatically generates the corresponding machine learning model interface, with no manual steps required. It is very convenient and friendly; the fly in the ointment is that the generated interface is fixed, with no way to add custom interfaces.

  3. Program against the interface. Write your code against the generated API. In 2017, a Core ML model supported only single-input prediction, not batch prediction, which made it inefficient.
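As a rough sketch of step 3, prediction through the untyped MLModel API looks like this (the class Xcode generates wraps the same calls; the model URL and the "image"/"classLabel" feature names here are assumptions for illustration, not from the sessions):

```swift
import CoreML
import Foundation

// Sketch: load a compiled model and run one prediction against it.
// `modelURL` is a placeholder for a compiled .mlmodelc bundle in the app;
// "image" and "classLabel" are assumed feature names.
func classLabel(for image: MLFeatureValue, modelURL: URL) throws -> String? {
    let model = try MLModel(contentsOf: modelURL)                        // load the model
    let input = try MLDictionaryFeatureProvider(dictionary: ["image": image])
    let output = try model.prediction(from: input)                       // single prediction
    return output.featureValue(for: "classLabel")?.stringValue
}
```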

The Core ML framework launched in 2017 is easy to use, but it is also rather rudimentary. Developers could only customize a model when it was generated, most likely relying on a third-party framework; after that, they could only program against the fixed model generated by Core ML, which is very limiting: there was no way to optimize prediction efficiency, reduce model size, add new layers, or otherwise customize the model.

Apple addressed these flaws with this year’s Core ML 2, whose improvements boil down to three things: smaller, faster, and highly customizable.

New features

The headline new feature in Core ML 2 is a batch prediction API in the generated model interface. The following code shows the old API and the new API:

// Predict a single input
public func prediction(from: MLFeatureProvider,
                       options: MLPredictionOptions) throws -> MLFeatureProvider

// Predict multiple inputs
public func predictions(from: MLBatchProvider,
                        options: MLPredictionOptions) throws -> MLBatchProvider

Operations that previously required a for loop can now be done with a single method call. And rather than simply nesting the single-prediction implementation inside a loop, the new batch method is optimized for batches.
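In application code, the new API pairs with MLBatchProvider. A minimal sketch, assuming a compiled model shipped in the app bundle (the URL is a placeholder):

```swift
import CoreML
import Foundation

// Sketch: one batch call instead of a for loop of single predictions.
// `modelURL` is a placeholder for a compiled .mlmodelc in the app bundle.
func batchPredict(inputs: [MLFeatureProvider], modelURL: URL) throws -> [MLFeatureProvider] {
    let model = try MLModel(contentsOf: modelURL)
    let batch = MLArrayBatchProvider(array: inputs)        // wrap all inputs once
    let results = try model.predictions(from: batch, options: MLPredictionOptions())
    return (0..<results.count).map { results.features(at: $0) }
}
```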

With the old for-loop approach, every input had to be read in full, preprocessed, and sent to the GPU, and the result had to be copied back from the GPU and returned to the program after each computation. The new batch method eliminates these per-input preprocessing and extraction round trips: all data is sent to the GPU at once, and results are retrieved one after another as the GPU pipeline computes them. Moreover, because the GPU stays busy the whole time, it can optimize the computation uniformly, processing similar data faster and faster. Overall performance is therefore much better, as shown in the figure below:

Apple demonstrated the efficiency difference between the two methods live: with 40 images, the new batch prediction method was nearly 5 seconds faster than the for loop of single predictions, almost doubling the efficiency.

In addition, Core ML Tools has increased the number of supported third-party machine learning frameworks from 6 to 11, including well-known ones such as TensorFlow, IBM Watson, and MXNet, an improvement in both quantity and quality.

Performance optimization

Performance optimization is a big part of Core ML 2, with Apple claiming it is 30% faster. Here’s a look at what Apple has done:

  • Weight quantization. A Core ML model can now quantize its weights as needed. The fewer bits per weight, the smaller the model, the faster it runs, and the less memory it occupies, but the worse the results.

  • Multi-size support. For image processing, Core ML can now handle images of different resolutions with a single model. Compared with the previous one-resolution-per-model approach, such a model is more flexible, and because a large amount of the underlying code is shared, it is much smaller than the sum of the separate single-resolution models it replaces.

Let’s focus on weight quantization. In 2017, every Core ML model stored its weights as 32-bit values. This gives very high accuracy, but it also makes Core ML models very large (20+ MB). Size is a very important factor in app development, so, in the same spirit as its other app-size optimizations, Apple has optimized model size for Core ML. Developers can now use Core ML Tools to quantize the original 32-bit weights, with 16-bit, 8-bit, and 4-bit weights supported as needed. The fewer the bits, the smaller the model and the faster it runs, but the worse the results, so you still need to choose according to actual needs. The figure below compares the sizes and output quality of the differently quantized models.

With weight quantization, we can produce the smallest model our requirements allow; with multi-size support, we can merge multiple similar single-resolution models into one multi-size image model. Core ML Tools also provides APIs for customizing the quantization of weights. Combining these measures minimizes the size of a Core ML model, making loading and computation more efficient.


Customization

Apple has introduced two approaches to customization: custom neural network layers and custom models. Let’s start with custom neural network layers.

The internal implementation of many Core ML models is a multi-layer neural network: each layer takes input from the previous layer, processes it, and passes the result to the next layer. For example, the process of identifying the animal in a photo as a horse is shown below:

Each layer of the neural network is fixed, automatically generated and optimized by the Core ML framework, and there is nothing we can do about it. This greatly limits what a model can do: if we want to build a new model on top of the one above that can identify not only horses but also birds, fish, cats, dogs, and other animals, the simplest way is to replace the layer that classifies the animal as a horse. Core ML 2 now provides exactly this capability. Here’s how:

  1. Generate a model that contains the special layer. The common approach is to rely on a third-party neural network library, such as Keras.
  2. Use Core ML Tools to convert the model containing the special layer into the corresponding Core ML model. Here we need to supply a custom conversion method for the special layer. The code is as follows:
# Generate the model with the Keras neural network library;
# the special layer is GridSampler
model = keras.models.load_model('spatial_transformer_MNIST.h5',
                                custom_objects={'GridSampler': GridSampler})

# Conversion method for the special layer GridSampler
def convert_grid_sampler(keras_layer):

    params = NeuralNetwork_pb2.CustomLayerParams()

    # Define the class name and description
    params.className = 'AAPLGridSampler'
    params.description = 'Custom grid sampler layer for the spatial transformer network'

    # Define the layer parameters; here we only need the output height and width
    params.parameters["output_height"].intValue = keras_layer.output_size[0]
    params.parameters["output_width"].intValue = keras_layer.output_size[1]

    return params

# Convert the Keras model with Core ML Tools, registering convert_grid_sampler
# as the conversion method for the GridSampler layer
coreml_model = coremltools.converters.keras.convert(model,
    add_custom_layers=True,
    custom_conversion_functions={'GridSampler': convert_grid_sampler})
  3. Import the Core ML model into Xcode and implement the interface for the special layer. The corresponding class must implement the MLCustomLayer protocol, which defines the behavior of a custom neural network layer. For details about each method, refer to Apple’s official documentation: MLCustomLayer.
public protocol MLCustomLayer {
  public init(parameters: [String : Any]) throws

  public func setWeightData(_ weights: [Data]) throws

  public func outputShapes(forInputShapes: [[NSNumber]]) throws -> [[NSNumber]]

  public func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws
}

Meanwhile, the session also showed the concrete implementation of the GridSampler class mentioned above.
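Since the slide itself is not reproduced here, the following is only a skeleton sketch of what such a class looks like; the actual grid-sampling math in evaluate is omitted:

```swift
import CoreML

// Skeleton sketch of the custom layer class named in convert_grid_sampler.
// The real sampling computation is not reproduced here.
@objc(AAPLGridSampler)
class AAPLGridSampler: NSObject, MLCustomLayer {
    let outputHeight: Int
    let outputWidth: Int

    required init(parameters: [String : Any]) throws {
        // Reads the parameters set by the Python conversion method
        outputHeight = parameters["output_height"] as? Int ?? 0
        outputWidth = parameters["output_width"] as? Int ?? 0
        super.init()
    }

    func setWeightData(_ weights: [Data]) throws {
        // The grid sampler has no trainable weights
    }

    func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] {
        // Shapes are rank 5 (Seq, Batch, Channel, Height, Width);
        // only height and width change
        var shape = inputShapes[0]
        shape[3] = NSNumber(value: outputHeight)
        shape[4] = NSNumber(value: outputWidth)
        return [shape]
    }

    func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
        // CPU path: the real implementation samples the input grid here.
        // A GPU path can be added via the optional encode(commandBuffer:...) method.
    }
}
```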

Of course, not all models are implemented as neural networks, so Apple also introduced custom models. Implementing a custom model is as simple as adopting the MLCustomModel protocol:

public protocol MLCustomModel {
  public init(modelDescription: MLModelDescription, parameters: [String : Any]) throws

  public func prediction(from: MLFeatureProvider,
                         options: MLPredictionOptions) throws -> MLFeatureProvider

  optional public func predictions(from: MLBatchProvider,
                                   options: MLPredictionOptions) throws -> MLBatchProvider
}
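As an illustration of the protocol shape only, here is a minimal conforming type that simply echoes its input; a real custom model would run its own inference logic in prediction(from:options:):

```swift
import CoreML

// Illustrative only: a custom model that returns its input features unchanged.
// A real MLCustomModel would read weights and configuration from
// `modelDescription` and `parameters` and compute a prediction.
class PassthroughModel: NSObject, MLCustomModel {
    required init(modelDescription: MLModelDescription,
                  parameters: [String : Any]) throws {
        super.init()
    }

    func prediction(from input: MLFeatureProvider,
                    options: MLPredictionOptions) throws -> MLFeatureProvider {
        return input
    }
}
```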

For details, please refer to Apple’s official documentation.


Core ML 2 builds on the 2017 release with new features, along with corresponding optimizations to model size and performance. Its companion tool, Core ML Tools, supports more machine learning frameworks and lets developers customize neural network layers and Core ML models. In addition, Apple’s Create ML greatly eases the limitations around model acquisition. Core ML is already used extensively in native apps such as Siri, Photos, and QuickType, and 182 third-party apps use it as well. We believe that in the near future, Core ML will become standard in all major apps.