Overview

  • To support similar-image retrieval (“search by image”), this project implements a search-by-image system built on Faiss vector indexing and the VGG16 image feature extraction model.
  • Source code: github.com/thirtyonele…

Retrieval scenario

  • Inference: each image is read and the model generates its feature vector
  • Feature storage: the feature vectors are stored in ES (Elasticsearch)
  • Retrieval: vectors are searched online, in real time
  • The specific process is as follows:
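The three stages above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: extract_features is a toy histogram standing in for the VGG16 model, a dict stands in for the feature store, and the names index_image and retrieve are hypothetical, not taken from the project.

```python
import math

def extract_features(pixels, bins=4):
    """Toy stand-in for a CNN feature extractor: an L2-normalized grey-level histogram."""
    hist = [0.0] * bins
    for p in pixels:                                  # p is a grey level in 0..255
        hist[min(p * bins // 256, bins - 1)] += 1.0
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]

# Feature storage: a plain dict stands in for the real feature store.
feature_store = {}

def index_image(name, pixels):
    """Inference + storage: compute the feature vector and keep it under the image name."""
    feature_store[name] = extract_features(pixels)

def retrieve(pixels, k=1):
    """Retrieval: score the query against every stored vector by inner product."""
    q = extract_features(pixels)
    scored = [(sum(a * b for a, b in zip(q, v)), name)
              for name, v in feature_store.items()]
    scored.sort(reverse=True)                         # highest similarity first
    return [name for _, name in scored[:k]]

index_image("dark.jpg", [10, 20, 30, 40])
index_image("bright.jpg", [200, 220, 240, 250])
result = retrieve([15, 25, 35, 60], k=1)              # a dark query image
```

In the real system the histogram would be replaced by the model's feature vector and the linear scan by a vector index, but the order of the stages stays the same.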

About Faiss

  • Faiss is a library for efficient similarity search and clustering of dense vectors, developed by Facebook AI Research. Its main features are:
    • Multiple search methods are provided
    • Search is fast
    • Indexes can be kept in memory or stored on disk
    • Implemented in C++, with Python wrappers
    • Many algorithms also have GPU implementations

Faiss retrieval implementation

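  • Faiss provides exact (brute-force) indexes for two metrics, Euclidean (L2) distance and inner product. In both snippets, d is the vector dimension, xb holds the database vectors and xq the query vectors (float32 arrays of shape (n, d)), and k is the number of neighbors to return.
Euclidean (L2) distance:
import faiss                   # make faiss available
index = faiss.IndexFlatL2(d)   # build an exact L2 index for d-dimensional vectors
index.add(xb)                  # add the database vectors to the index
D, I = index.search(xq, k)     # D: squared L2 distances, I: ids of the k nearest vectors
Inner product:
import faiss                   # make faiss available
index = faiss.IndexFlatIP(d)   # build an exact inner-product index
index.add(xb)                  # add the database vectors to the index
D, I = index.search(xq, k)     # D: inner products (largest first), I: ids of the top k vectors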
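To make the difference between the two metrics concrete, here is a minimal pure-Python sketch of what an exact L2 search and an exact inner-product search compute. It does not call Faiss; the helper names l2_search, ip_search, and normalize are made up for this illustration, and the sample vectors are arbitrary.

```python
import math

def l2_search(xb, xq, k):
    """For each query, ids of the k database vectors with the smallest squared L2 distance."""
    out = []
    for q in xq:
        dists = [(sum((a - b) ** 2 for a, b in zip(q, v)), i) for i, v in enumerate(xb)]
        out.append([i for _, i in sorted(dists)[:k]])
    return out

def ip_search(xb, xq, k):
    """For each query, ids of the k database vectors with the largest inner product."""
    out = []
    for q in xq:
        scores = [(sum(a * b for a, b in zip(q, v)), i) for i, v in enumerate(xb)]
        scores.sort(key=lambda s: -s[0])              # largest inner product first
        out.append([i for _, i in scores[:k]])
    return out

def normalize(vectors):
    """Scale each vector to unit L2 norm (assumes no all-zero vectors)."""
    result = []
    for v in vectors:
        norm = math.sqrt(sum(a * a for a in v))
        result.append([a / norm for a in v])
    return result

xb = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]             # database vectors
xq = [[1.0, 0.2]]                                     # one query vector

l2_ids = l2_search(xb, xq, 2)
ip_ids = ip_search(xb, xq, 2)
cos_ids = ip_search(normalize(xb), normalize(xq), 2)  # inner product on unit vectors
```

With these vectors the L2 search ranks vector 0 first, while the raw inner product ranks the long vector 2 first because its norm is large. After normalizing both sides, the inner product equals cosine similarity and vector 0 is ranked first again, which is why features are usually normalized before being added to an inner-product index.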

How to run

  • Download the project source code: github.com/thirtyonele…
  • Operation 1: Build the base index
python index.py
    --train_data: path to the training images folder; default '<ROOT_DIR>/data/train'
    --index_file: path where the index file is stored; default '<ROOT_DIR>/index/train.h5'
  • Operation 2: Run a similarity search
python retrieval.py --engine=faiss
    --test_data: path to the test image; default '<ROOT_DIR>/data/test/001_accordion_image_0001.jpg'
    --index_file: path to the index file (an .h5 file, as built in operation 1)
    --db_name: name of the ES or Milvus index; default 'image_retrieval'
    --engine: search engine type; default 'numpy'; options are numpy, faiss, es, or milvus
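For readers who want to see how options like these fit together, the following is a hypothetical argparse sketch that mirrors the flags described above. It is not the project's actual code: the function name build_parser is invented, the lowercase engine names es and milvus are an assumption, and the default for --index_file is left unset because it is not given here.

```python
import argparse

def build_parser(root_dir="<ROOT_DIR>"):
    """Hypothetical parser mirroring the documented retrieval options."""
    parser = argparse.ArgumentParser(description="Similarity search (illustrative sketch)")
    parser.add_argument("--test_data",
                        default=f"{root_dir}/data/test/001_accordion_image_0001.jpg",
                        help="path to the query image")
    parser.add_argument("--index_file", default=None,
                        help="path to the .h5 index file (default not shown in the text)")
    parser.add_argument("--db_name", default="image_retrieval",
                        help="ES or Milvus index name")
    parser.add_argument("--engine", default="numpy",
                        choices=["numpy", "faiss", "es", "milvus"],  # assumed spellings
                        help="search engine type")
    return parser

# Equivalent to passing --engine=faiss on the command line.
args = build_parser().parse_args(["--engine=faiss"])
```

Restricting --engine with choices means a misspelled engine name is rejected at startup instead of failing later during the search.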

Further reading

  • Optimizing retrieval speed
  • Optimizing index memory usage
  • Running inference on the GPU

That’s all!