Users in the community told us they hoped MMClassification could support k-fold cross-validation. The developers scheduled the work immediately and planned to deliver the feature within 24 hours. During development, however, a problem appeared: the Config object produced by a deep copy had no dump method. It turned out that the deep-copied object was not a Config at all. That left only one explanation: something was wrong with the deep copy. Here is an example that reproduces the problem:

# https://github.com/open-mmlab/mmcv/blob/v1.4.5/mmcv/utils/config.py
>>> from mmcv import Config
>>> from copy import deepcopy
>>> cfg = Config.fromfile("./tests/data/config/a.py")
>>> new_cfg = deepcopy(cfg)
>>> type(cfg) == type(new_cfg)
False
>>> type(cfg), type(new_cfg)
(mmcv.utils.config.Config, mmcv.utils.config.ConfigDict)

As you can see, the deep-copied object new_cfg is unexpectedly of type mmcv.utils.config.ConfigDict rather than the expected mmcv.utils.config.Config.

Of course, the problem was solved and the new feature went live. I’ve heard a lot of feedback about deep copy problems, so I’m going to share the whole process of solving this one here, hoping it helps you understand deep copy better. To solve a deep copy problem, we first need to understand what a deep copy is and how it differs from a shallow copy.

Shallow copy vs deep copy

When the copied object is an immutable object, such as a string or a tuple of immutable elements, shallow and deep copies make no difference. Both copies return the copied object, that is, no copy has taken place.

>>> import copy
>>> a = (1, 2, 3)  # the tuple's elements are all immutable
>>> b = copy.copy(a)  # shallow copy
>>> c = copy.deepcopy(a)  # deep copy
>>> id(a), id(b), id(c)  # check the memory addresses
(140093083446128, 140093083446128, 140093083446128)

As the example shows, a, b, and c have the same address: no copy took place, and all three names refer to the same object.

When the copied object is mutable, such as a dictionary or a list, shallow and deep copies behave differently. A shallow copy creates a new object and fills it with references to the elements of the original. A deep copy also creates a new object, but then recursively copies the elements of the original. (A tuple with mutable elements is a special case: copy.copy returns the same tuple, while copy.deepcopy builds a new one.) Here is an example showing that both shallow and deep copies of a list create a new object.

>>> import copy 
>>> a = [1, 2, 3] 
>>> b = copy.copy(a) 
>>> c = copy.deepcopy(a) 
>>> id(a), id(b), id(c) 
(140093084981120, 140093585550464, 140093085038592) 

As the example shows, a, b, and c have different addresses and refer to different objects. That is, both the shallow and the deep copy created a new object. But if a contains mutable elements, then modifying one of those elements in place through a also changes what b shows, while c is unaffected.

The following example copies a list that contains a mutable element.

>>> import copy
>>> a = [1, 2, [3, 4]]
>>> b = copy.copy(a)
>>> c = copy.deepcopy(a)
>>> id(a), id(b), id(c)
(140093082172288, 140093090759296, 140093081717760)
>>> id(a[2]), id(b[2]), id(c[2])  # a[2] and b[2] are the same object; c[2] is a new one
(140093087982272, 140093087982272, ...)
>>> a[2].append(5)
>>> a, b, c
([1, 2, [3, 4, 5]], [1, 2, [3, 4, 5]], [1, 2, [3, 4]])

As the example shows, after the mutable element inside a is modified, b (the shallow copy) changes as well, while c (the deep copy) does not.
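The same behaviour can be checked with `is` instead of comparing addresses by eye. The following is a minimal sketch; the variable names are made up for illustration, and the last part covers the special case of a tuple that contains a mutable element.

```python
import copy

# A list that contains a mutable element (another list).
original = [1, 2, [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Both copies are new outer lists.
outer_is_new = (shallow is not original, deep is not original)

# The shallow copy shares the inner list; the deep copy owns a separate one.
shallow_shares_inner = shallow[2] is original[2]
deep_shares_inner = deep[2] is original[2]

# Mutating the inner list through the original shows up in the shallow copy only.
original[2].append(5)

# Special case: a tuple with a mutable element.
t = (1, [2, 3])
t_shallow = copy.copy(t)    # tuples are immutable, so the same tuple comes back
t_deep = copy.deepcopy(t)   # the inner list is copied, so a new tuple is built
```

Comparing with `is` avoids depending on concrete addresses, which differ from run to run.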

Emergence of a problem

Now that we understand the difference between shallow and deep copies, let’s return to the main question of this article: why does deep-copying a Config produce a ConfigDict? The answer is that Config does not implement the __deepcopy__ magic method. So does every class that does not implement __deepcopy__ end up with a different type after a deep copy? Let’s start with an example.

>>> from copy import deepcopy 
>>> class HelloWorld:
...     def __init__(self):
...         self.attr1 = 'attribute1'
...         self.attr2 = 'attribute2'
...
>>> hello_world = HelloWorld() 
>>> new_hello_world = deepcopy(hello_world) 
>>> type(hello_world), type(new_hello_world) 
(__main__.HelloWorld, __main__.HelloWorld) 

As you can see, the new_hello_world object produced by the deep copy has the same type as the original hello_world object. So why, when neither Config nor HelloWorld provides a __deepcopy__ method, does the type change for the former but not for the latter? To find out, we need to look at the source code of the copy module. Here is the deep copy code from that module.

# https://github.com/python/cpython/blob/3.10/Lib/copy.py#L128
# Excerpt; in copy.py the dispatch table is defined after deepcopy()

# _deepcopy_dispatch is a dictionary that maps a type to the function used to deep-copy it
_deepcopy_dispatch = d = {}

def _deepcopy_atomic(x, memo):
    # Atomic (immutable) objects are returned as-is
    return x
d[int] = _deepcopy_atomic
d[float] = _deepcopy_atomic
d[str] = _deepcopy_atomic

# Deep-copy a list by deep-copying each of its elements
def _deepcopy_list(x, memo, deepcopy=deepcopy):
    y = []
    memo[id(x)] = y
    append = y.append
    for a in x:
        append(deepcopy(a, memo))
    return y
d[list] = _deepcopy_list

def deepcopy(x, memo=None, _nil=[]):
    """Deep copy operation on arbitrary Python objects.

    See the module's __doc__ string for more info.
    """

    if memo is None:
        memo = {}

    # Return the existing copy if x has already been copied
    d = id(x)
    y = memo.get(d, _nil)
    if y is not _nil:
        return y

    cls = type(x)

    # Types registered in _deepcopy_dispatch are copied with their registered function
    copier = _deepcopy_dispatch.get(cls)
    if copier is not None:
        y = copier(x, memo)
    else:
        if issubclass(cls, type):
            y = _deepcopy_atomic(x, memo)
        else:
            # Use the object's __deepcopy__ if getattr can find one
            copier = getattr(x, "__deepcopy__", None)
            if copier is not None:
                y = copier(memo)
            else:
                # https://github.com/python/cpython/blob/3.10/Lib/copyreg.py
                reductor = dispatch_table.get(cls)
                if reductor:
                    rv = reductor(x)
                else:
                    # __reduce_ex__ and __reduce__ are used for serialization;
                    # they return a string or a tuple
                    # https://docs.python.org/3/library/pickle.html#object.__reduce__
                    reductor = getattr(x, "__reduce_ex__", None)
                    if reductor is not None:
                        rv = reductor(4)
                    else:
                        reductor = getattr(x, "__reduce__", None)
                        if reductor:
                            rv = reductor()
                        else:
                            raise Error(
                                "un(deep)copyable object of type %s" % cls)
                if isinstance(rv, str):
                    y = x
                else:
                    # Rebuild the new object from the value returned by the reductor
                    y = _reconstruct(x, memo, *rv)

    # If is its own copy, don't memoize.
    if y is not x:
        memo[d] = y
        _keep_alive(x, memo)  # Make sure x lives at least as long as d
    return y

For the HelloWorld object hello_world, copy.deepcopy(hello_world) first calls __reduce_ex__ to serialize the object and then calls _reconstruct to create the new object. For the Config object cfg, copy.deepcopy(cfg) should call Config’s __deepcopy__ method to copy the object. But getattr(x, "__deepcopy__", None) (in the deepcopy function above) cannot find a __deepcopy__ method on Config, because Config does not implement one. The lookup therefore falls back to Config’s __getattr__(self, name) method, which returns the __deepcopy__ method of _cfg_dict (of type ConfigDict). As a result, the object produced by new_cfg = copy.deepcopy(cfg) is of type ConfigDict.
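The HelloWorld path described above can be traced by hand. The sketch below calls __reduce_ex__(4) directly and then rebuilds the object in roughly the way _reconstruct does for a plain object. It simplifies the standard library logic and does not replace it; the names factory, args, state, and rebuilt are chosen here for illustration.

```python
import copy
import copyreg


class HelloWorld:
    def __init__(self):
        self.attr1 = 'attribute1'
        self.attr2 = 'attribute2'


hello_world = HelloWorld()

# Step 1: ask the object how to rebuild itself (protocol 4, as deepcopy does).
rv = hello_world.__reduce_ex__(4)
factory, args, state = rv[0], rv[1], rv[2]

# Step 2: rebuild by hand, roughly what copy._reconstruct does for a plain object.
rebuilt = factory(*args)                       # empty HelloWorld created via __new__
rebuilt.__dict__.update(copy.deepcopy(state))  # restore deep-copied attributes
```

Because the factory receives the class itself as its argument, the rebuilt object always has the original type, which is why HelloWorld survives a deep copy unchanged.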

# https://github.com/open-mmlab/mmcv/blob/v1.4.4/mmcv/utils/config.py
class Config:
    def __getattr__(self, name):
        return getattr(self._cfg_dict, name)
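The effect of this delegation can be reproduced without MMCV. In the sketch below, Wrapper and InnerDict are hypothetical stand-ins for Config and ConfigDict (they are not MMCV classes), and dump is a placeholder method that exists only on the wrapper.

```python
import copy


class InnerDict(dict):
    """Hypothetical stand-in for ConfigDict: a dict subclass with its own __deepcopy__."""

    def __deepcopy__(self, memo):
        other = InnerDict()
        memo[id(self)] = other
        for key, value in self.items():
            other[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
        return other


class Wrapper:
    """Hypothetical stand-in for Config: forwards unknown attributes to the inner dict."""

    def __init__(self, data):
        self._inner = InnerDict(data)

    def __getattr__(self, name):
        # Only called when normal lookup fails, for example for "__deepcopy__".
        return getattr(self._inner, name)

    def dump(self):
        # Placeholder for a method that exists only on the wrapper.
        return repr(self._inner)


wrapper = Wrapper({"lr": 0.1, "steps": [8, 11]})

# deepcopy looks up "__deepcopy__" on the instance, __getattr__ forwards the
# lookup to the inner dict, and the inner dict's method produces the copy.
clone = copy.deepcopy(wrapper)
clone_type_name = type(clone).__name__
clone_has_dump = hasattr(clone, "dump")
```

Here the clone is an InnerDict rather than a Wrapper and has no dump method, which mirrors the symptom described at the start of the article.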

Problem solving

To avoid calling the __deepcopy__ method of _cfg_dict, we need to add a __deepcopy__ method to Config. Then copier = getattr(x, "__deepcopy__", None) finds Config’s own __deepcopy__, and that method performs the deep copy of the object.

# https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/config.py
class Config:
    def __deepcopy__(self, memo):
        cls = self.__class__
        # Use __new__ to create an empty object
        other = cls.__new__(cls)
        # Add the other object to the memo to avoid creating the same object over and over again
        # See https://pymotw.com/3/copy/
        memo[id(self)] = other
        # Initialize the other object
        for key, value in self.__dict__.items():
            super(Config, other).__setattr__(key, copy.deepcopy(value, memo))
        return other
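The same pattern can be tried on a small class outside MMCV. This is a sketch under assumptions: Wrapper and InnerDict are hypothetical stand-ins for Config and ConfigDict, and only the __deepcopy__ body follows the method shown above.

```python
import copy


class InnerDict(dict):
    """Hypothetical stand-in for ConfigDict."""

    def __deepcopy__(self, memo):
        other = InnerDict()
        memo[id(self)] = other
        for key, value in self.items():
            other[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
        return other


class Wrapper:
    """Hypothetical stand-in for Config, now with its own __deepcopy__."""

    def __init__(self, data):
        self._inner = InnerDict(data)

    def __getattr__(self, name):
        return getattr(self._inner, name)

    def __deepcopy__(self, memo):
        cls = self.__class__
        # Create an empty instance without running __init__.
        other = cls.__new__(cls)
        # Register the copy first so that reference cycles resolve to it.
        memo[id(self)] = other
        # Copy every instance attribute, bypassing any custom attribute hooks.
        for key, value in self.__dict__.items():
            super(Wrapper, other).__setattr__(key, copy.deepcopy(value, memo))
        return other


wrapper = Wrapper({"lr": 0.1, "steps": [8, 11]})
clone = copy.deepcopy(wrapper)

# Mutating the clone's nested list must not affect the original.
clone._inner["steps"].append(14)
```

Because __deepcopy__ is now found by normal lookup, __getattr__ is never consulted for it, and the clone keeps the Wrapper type with an independent inner dictionary.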

The developers submitted a PR to MMCV (github.com/open-mmlab/…) that finally resolved the problem. Here is an example from the PR message.

  • Before incorporating the PR (MMCV version <= 1.4.5)
>>> from mmcv import Config 
>>> from copy import deepcopy 
>>> cfg = Config.fromfile("./tests/data/config/a.py") 
>>> new_cfg = deepcopy(cfg) 
>>> type(cfg) == type(new_cfg) 
False 
>>> type(cfg), type(new_cfg) 
(mmcv.utils.config.Config, mmcv.utils.config.ConfigDict) 

As you can see, the object produced by copy.deepcopy from a Config has type ConfigDict, which is not what we expected.

  • After incorporating the PR (MMCV version > 1.4.5)
>>> from mmcv import Config 
>>> from copy import deepcopy 
>>> cfg = Config.fromfile("./tests/data/config/a.py") 
>>> new_cfg = deepcopy(cfg) 
>>> type(cfg) == type(new_cfg) 
True 
>>> type(cfg), type(new_cfg) 
(mmcv.utils.config.Config, mmcv.utils.config.Config) 
>>> print(cfg._cfg_dict == new_cfg._cfg_dict) 
True 
>>> print(cfg._cfg_dict is new_cfg._cfg_dict) 
False 

After the PR is merged, the copied Config object meets expectations.
