Dictionaries and sets

Dictionaries and sets are usually introduced together, in part because both are written with braces {}.

Basic operations on dictionaries and sets

Dictionaries first

Dictionaries are made up of key-value pairs: each key maps to a value. To be clear, dictionaries were unordered before Python 3.6. Their length is variable, and elements can be deleted and changed at will.
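
As a minimal sketch of those properties (the name scores and its values are made up for illustration), the following adds, changes, and deletes entries, and the length changes along the way:

```python
# Hypothetical example data, not from the original post
scores = {"A": 1, "B": 2}

scores["C"] = 3        # add a new key-value pair
scores["A"] = 10       # change the value stored under an existing key
del scores["B"]        # delete a pair by key

print(scores)          # {'A': 10, 'C': 3}
print(len(scores))     # 2
```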

To check whether dictionaries really are unordered, I ran the following code in an online Python environment:

my_dict = {}
my_dict["A"] = "A"
my_dict["B"] = "B"
my_dict["C"] = "C"
my_dict["D"] = "D"

for key in my_dict:
    print(key)

The output there did show the keys out of order.

In a local test on Python 3.8, the keys were not out of order.

So when someone asks whether Python dictionaries are ordered, don't just answer "unordered". They now keep insertion order, and the language guarantees this from Python 3.7 on.
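
A small sketch of what insertion order means in practice, assuming Python 3.7 or later: updating an existing key keeps its position, while deleting a key and adding it again moves it to the end.

```python
my_dict = {"A": 1, "B": 2, "C": 3}

my_dict["A"] = 100     # updating a value does not move the key
print(list(my_dict))   # ['A', 'B', 'C']

del my_dict["A"]       # removing and re-adding a key ...
my_dict["A"] = 1       # ... places it after the existing keys
print(list(my_dict))   # ['B', 'C', 'A']
```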

Because they are looked up by key, dictionaries are better suited than lists and tuples to adding, deleting, and searching for elements.

Indexing a key that does not exist raises a KeyError. This is a very common error.

my_dict = {}
my_dict["A"] = "A"
my_dict["B"] = "B"
my_dict["C"] = "C"
my_dict["D"] = "D"

print(my_dict["F"])

The error message is as follows:

Traceback (most recent call last):
  File ".\demo.py", line 7, in <module>
    print(my_dict["F"])
KeyError: 'F'

If you don't want this exception, use get(key, default) instead of indexing by key. It returns default when the key is missing.

print(my_dict.get("F","None"))
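
Two other common ways to handle a possibly missing key, sketched here alongside get: test with the in operator first, or catch the KeyError. Note that get without a second argument returns None rather than raising. The "missing" default is just an illustrative value.

```python
my_dict = {"A": "A", "B": "B"}

# get returns the default (None when omitted) for a missing key
print(my_dict.get("F"))             # None
print(my_dict.get("F", "missing"))  # missing

# Check membership before indexing
if "F" in my_dict:
    print(my_dict["F"])
else:
    print("F is not a key")

# Or index directly and handle the exception
try:
    value = my_dict["F"]
except KeyError:
    value = "missing"
print(value)                        # missing
```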

Now for sets

Sets and dictionaries have the same basic structure. The biggest difference is that a set has no key-value pairs: it is an unordered collection of unique elements. Sets do not support indexing, which means the following code is bound to raise an error.

my_set = {"A","B","C"}
print(my_set[0])

TypeError: 'set' object is not subscriptable
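
Since a set cannot be indexed, its elements are usually reached by testing membership, by iterating, or by converting to a list first. A short sketch (the order of a converted list is not guaranteed, so it is sorted here):

```python
my_set = {"A", "B", "C"}

print("A" in my_set)        # True: membership test

for item in my_set:         # iteration works, in no particular order
    print(item)

as_list = sorted(my_set)    # a list supports indexing
print(as_list[0])           # A
```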

The other important thing to remember is that sets are often used for de-duplication.
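
For example, passing a list to set() drops the duplicates. This is a sketch with made-up data. Because a set does not preserve the original order, dict.fromkeys is shown as one alternative for when order matters (assuming Python 3.7 or later).

```python
numbers = [3, 1, 3, 2, 1]   # hypothetical sample data

unique = set(numbers)       # duplicates removed, order not preserved
print(len(unique))          # 3

# Keep first-seen order by using dict keys, which are unique and ordered
ordered_unique = list(dict.fromkeys(numbers))
print(ordered_unique)       # [3, 1, 2]
```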

Dictionary and set sorting

I won't explain the basic operations again here. If you need them, go back to the first round of the snowball series. The emphasis here is on sorting, because it touches on some extended topics. You can get a first look at them now, and later posts will cover them in detail.

Before you start, remember that when you pop from a set, the element you get back is indeterminate, because a set is unordered. You can test this with the following code:

my_set = {"A","B","C"}
print(my_set.pop())

To sort a dictionary, we can use techniques we already know.

The following is the result of running under Python 3.6.

Use the sorted function to sort a dictionary. You can choose whether to sort by key or by value. The example below sorts by value in ascending order.

my_dict = {}
my_dict["A"] = "4"
my_dict["B"] = "3"
my_dict["C"] = "2"
my_dict["D"] = "1"
sorted_dict = sorted(my_dict.items(),key=lambda x:x[1])
print(sorted_dict)

The output is as follows, sorted by dictionary value. Lambda (anonymous) functions will be covered in later posts.

[('D', '1'), ('C', '2'), ('B', '3'), ('A', '4')]
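
The same call can be varied, as sketched here with the same dictionary: x[0] sorts by key, reverse=True sorts in descending order, and dict() turns the sorted pairs back into a dictionary (which keeps that order on Python 3.7 or later). Note that the values in this example are strings, so they are compared as text rather than as numbers.

```python
my_dict = {"A": "4", "B": "3", "C": "2", "D": "1"}

# Sort by key, descending
by_key_desc = sorted(my_dict.items(), key=lambda x: x[0], reverse=True)
print(by_key_desc)  # [('D', '1'), ('C', '2'), ('B', '3'), ('A', '4')]

# Sort by value and rebuild a dictionary from the sorted pairs
by_value = dict(sorted(my_dict.items(), key=lambda x: x[1]))
print(by_value)     # {'D': '1', 'C': '2', 'B': '3', 'A': '4'}
```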

Sets are sorted with the sorted function as well. The result is a list, not a set.
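
A minimal sketch of sorting a set. sorted returns a new list and leaves the set itself unchanged:

```python
my_set = {"C", "A", "B"}

sorted_list = sorted(my_set)          # a new list in ascending order
print(sorted_list)                    # ['A', 'B', 'C']
print(sorted(my_set, reverse=True))   # ['C', 'B', 'A']
print(type(sorted_list))              # <class 'list'>
```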

The efficiency of dictionaries and sets

The efficiency of dictionaries and sets is best judged against lists. Suppose we have student numbers paired with weights and need to count how many distinct weights there are. For example, there are 4 students, given as (student number, weight) tuples: (1, 90), (2, 90), (3, 60), and (4, 100). The expected output is 3, because there are three different weights. Written with a list, the code looks like this:

def find_unique_weight(students):
    # List that collects the distinct weights
    unique_list = []
    # Loop over all students
    for student_id, weight in students:
        # Add the weight only if it is not already in unique_list
        if weight not in unique_list:
            unique_list.append(weight)
    # Count the distinct weights
    ret = len(unique_list)
    return ret

students = [
    (1, 90),
    (2, 90),
    (3, 60),
    (4, 100)
]
print(find_unique_weight(students))

Next, rewrite the code above to use a set:

def find_unique_weight(students):
    # Set that collects the distinct weights
    unique_set = set()
    # Loop over all students
    for student_id, weight in students:
        # A set ignores values it already contains
        unique_set.add(weight)
    # Calculate the set length
    ret = len(unique_set)
    return ret

Written out, the two versions look much alike, but the difference shows once the data grows to, say, tens of thousands of records. The timing code below uses time.perf_counter(). It returns a number of seconds measured from an unspecified reference point A, so only the difference between two readings is meaningful. The first call gives the seconds from A to the current moment B1, and the second call gives the seconds from A to the current moment B2. Subtracting the first reading from the second yields the time elapsed between B1 and B2. First, run the following code with the list-based function:

import time

id = [x for x in range(1, 10000)]
weight = [x for x in range(1, 10000)]
students = list(zip(id, weight))

start_time = time.perf_counter()
# Call the list-based function
find_unique_weight(students)
end_time = time.perf_counter()
print("Run time: {}".format(end_time - start_time))

The running time was 1.7326523 seconds. Machines differ in speed, so your figure will vary. Changing the code above to call the set-based function brought the result down to 0.0030606 seconds. The gap is already this large at about 10,000 records, and it widens as the data grows by orders of magnitude, so you know which one to use.
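
To reproduce the comparison in a single run, the sketch below times both approaches on the same data. The names count_with_list, count_with_set, and timed are made up here. The first two mirror the two versions of find_unique_weight, and the data size is reduced to keep the run short. Absolute timings depend on the machine, so no expected numbers are shown.

```python
import time


def count_with_list(students):
    # `in` on a list scans element by element
    unique_list = []
    for _, weight in students:
        if weight not in unique_list:
            unique_list.append(weight)
    return len(unique_list)


def count_with_set(students):
    # A set discards duplicates on insertion
    unique_set = set()
    for _, weight in students:
        unique_set.add(weight)
    return len(unique_set)


def timed(func, data):
    # Return (result, elapsed seconds) from two perf_counter readings
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start


students = list(zip(range(1, 5000), range(1, 5000)))
list_result, list_time = timed(count_with_list, students)
set_result, set_time = timed(count_with_set, students)

print("list: {} unique, {:.6f} s".format(list_result, list_time))
print("set:  {} unique, {:.6f} s".format(set_result, set_time))
```

Both functions return the same count. Only the elapsed time differs, because each membership check on the list has to scan the elements collected so far.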

Summary of this blog post

This post filled in some knowledge about dictionaries and sets. One topic was skipped: how dictionaries and sets are stored, which involves the structure of hash tables. If your program needs to look up data and remove duplicates efficiently, put both to use.