Preface:

Using the Bayesian formula, we guess a person's gender from their Chinese name. Without further ado, let's get started.

Development tools

Python version: 3.6.4

Related modules:

PyQt5 module;

And some of the modules that come with Python.

Environment setup

Install Python and add it to the PATH environment variable, then use pip to install the required modules (for example, pip install PyQt5).

Introduction to the principle

Let's start with a brief introduction to the Bayesian formula, and then move on to the code implementation. The probability of event A occurring given that event B has already occurred is:

P(A | B) = P(A ∩ B) / P(B)

If A and B are two independent events, then:

P(A | B) = P(A), or equivalently P(A ∩ B) = P(A) P(B)

We can use the formula above to check whether two events are independent. Next comes the total probability formula (the superscript C denotes the complement):

P(A) = P(A | B) P(B) + P(A | B^C) P(B^C)

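The formula above is easy to understand from a Venn diagram (image sourced from the internet): B and B^C split the sample space into two disjoint parts, so A is made up of the two disjoint pieces A ∩ B and A ∩ B^C.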

Based on the conclusions above, we can easily derive the Bayesian formula:

P(B | A) = P(A | B) P(B) / P(A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B^C) P(B^C)]
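As a quick numerical check of these formulas, here is a minimal sketch that enumerates a small made-up sample space and confirms that the total probability formula and the Bayesian formula agree with direct counting. The die, the events A and B, and the helper names prob and cond are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = set(range(1, 7))
A = {2, 4, 6}        # event A: the roll is even
B = {1, 2, 3, 4}     # event B: the roll is at most 4
B_c = omega - B      # complement of B


def prob(event):
    """P(event) when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))


def cond(x, y):
    """Conditional probability P(x | y) = P(x ∩ y) / P(y)."""
    return prob(x & y) / prob(y)


# Total probability: P(A) = P(A|B) P(B) + P(A|B^C) P(B^C)
total = cond(A, B) * prob(B) + cond(A, B_c) * prob(B_c)

# Bayesian formula: P(B|A) = P(A|B) P(B) / P(A)
bayes = cond(A, B) * prob(B) / total

print(total == prob(A))     # True
print(bayes == cond(B, A))  # True
```

In this particular example P(A | B) equals P(A), so A and B also happen to be independent.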

Applied to guessing gender from a name, the quantity we want is:

P(gender | name)

By the Bayesian formula, we have:

P(gender | name) = P(name | gender) P(gender) / P(name)

From the name data, we already know how often each character appears in male names and in female names.

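We assume that the characters in a name are independent of each other given the gender, so for a name made up of the characters c1, c2, ..., cn:

P(name | gender) = P(c1 | gender) × P(c2 | gender) × ... × P(cn | gender)

In code, the prior multiplied by this product is computed as follows: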

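# Note: self is not a parameter here; the function is assumed to be defined
# inside a method of the model object, which holds male_total, female_total and total.
def genderprob(name, probs, type_='male'):
  # probs maps each character to a tuple (P(c | male), P(c | female))
  assert type_ in ['male', 'female']
  if type_ == 'male':
    # prior P(male) times the per-character likelihoods
    p = self.male_total / self.total
    for c in name:
      p *= probs.get(c, (0, 0))[0]
  else:
    # prior P(female) times the per-character likelihoods
    p = self.female_total / self.total
    for c in name:
      p *= probs.get(c, (0, 0))[1]
  return p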
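Because the function above depends on self, it cannot be run on its own. A minimal standalone sketch of the same computation might look like the following. It assumes the totals are passed in explicitly, that the overall total is simply the sum of the male and female totals, and that probs maps each character to a (P(c | male), P(c | female)) tuple. The name gender_score, its parameters, and the numbers in the table are hypothetical.

```python
def gender_score(name, probs, male_total, female_total, type_='male'):
    """Return P(gender) times the product of P(c | gender) over the name.

    This is the unnormalized posterior; it is not yet divided by P(name).
    """
    assert type_ in ('male', 'female')
    total = male_total + female_total  # assumption: no other categories
    if type_ == 'male':
        p, idx = male_total / total, 0
    else:
        p, idx = female_total / total, 1
    for c in name:
        # Characters missing from the table get probability 0, as in the original.
        p *= probs.get(c, (0, 0))[idx]
    return p


# Made-up table with placeholder letters: character -> (P(c | male), P(c | female)).
probs = {'a': (0.5, 0.25), 'b': (0.25, 0.5)}

male = gender_score('ab', probs, male_total=60, female_total=40, type_='male')
female = gender_score('ab', probs, male_total=60, female_total=40, type_='female')
print(male, female)
```

With these made-up numbers the male score is 0.6 × 0.5 × 0.25 and the female score is 0.4 × 0.25 × 0.5, so the sketch leans towards male for the placeholder name.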

Take Liu Yifei as an example:

P(female) = number of occurrences of female names / total number of occurrences

P(Liu | female) = number of occurrences of "Liu" in female names / total number of female names
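To make these two ratios concrete, here is a small sketch that estimates them by counting over a toy list of (name, gender) records. The records, the function name build_probs, and its return layout are assumptions for illustration, not the author's data or code. The sketch also assumes a character is counted once per name it appears in and that both genders occur at least once in the records.

```python
from collections import Counter


def build_probs(records):
    """Estimate P(gender) and P(c | gender) by counting.

    records: iterable of (name, gender) pairs, gender in {'male', 'female'}.
    Returns (priors, probs) where probs[c] == (P(c | male), P(c | female)).
    """
    name_counts = Counter()  # number of names seen per gender
    char_counts = {'male': Counter(), 'female': Counter()}
    for name, gender in records:
        name_counts[gender] += 1
        # Count each character once per name it appears in.
        char_counts[gender].update(set(name))

    total = sum(name_counts.values())
    priors = {g: name_counts[g] / total for g in ('male', 'female')}
    chars = set(char_counts['male']) | set(char_counts['female'])
    probs = {
        c: (char_counts['male'][c] / name_counts['male'],
            char_counts['female'][c] / name_counts['female'])
        for c in chars
    }
    return priors, probs


# Toy records using placeholder letters instead of real name characters.
records = [('ab', 'male'), ('ac', 'male'), ('bc', 'female'), ('cc', 'female')]
priors, probs = build_probs(records)
print(priors)      # {'male': 0.5, 'female': 0.5}
print(probs['c'])  # (0.5, 1.0)
```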

The denominator P(name) is the same for both genders, so it cancels out when we normalize the two results and never has to be calculated:

male_prob = genderprob(name, self.name_probs, 'male')
female_prob = genderprob(name, self.name_probs, 'female')
result = {'male': male_prob / (male_prob + female_prob), 'female': female_prob / (male_prob + female_prob)}
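Two practical issues can arise with this normalization. A character missing from the table gives probability 0 for both genders, which makes the denominator zero, and long products of small numbers can underflow. A common workaround, which is not part of the original code, is to floor unseen probabilities at a small value and work with log-probabilities. A minimal sketch, in which the function posterior, the floor value eps, and the sample table are all hypothetical:

```python
import math


def posterior(name, probs, prior_male, eps=1e-6):
    """Return {'male': p, 'female': q} with p + q == 1, computed in log-space.

    prior_male must lie strictly between 0 and 1.
    """
    log_m = math.log(prior_male)
    log_f = math.log(1.0 - prior_male)
    for c in name:
        pm, pf = probs.get(c, (0.0, 0.0))
        # Floor at eps so an unseen character never produces log(0).
        log_m += math.log(max(pm, eps))
        log_f += math.log(max(pf, eps))
    # Subtract the larger log before exponentiating to avoid underflow.
    top = max(log_m, log_f)
    m, f = math.exp(log_m - top), math.exp(log_f - top)
    return {'male': m / (m + f), 'female': f / (m + f)}


probs = {'a': (0.5, 0.25), 'b': (0.25, 0.5)}   # made-up values
print(posterior('ab', probs, prior_male=0.6))
print(posterior('zz', probs, prior_male=0.6))  # unseen characters: falls back to the prior
```

When every character is unseen, both genders receive the same floor, so the result reduces to the prior instead of raising a division error.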

Finally, we use PyQt5 to build a simple graphical interface for this small name-to-gender model.
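The original interface code is not shown here, so the following is only a rough sketch of what a minimal PyQt5 window for such a model could look like: one input box, one button, and one label for the result. The functions predict, format_result, and run_gui are hypothetical, and predict is a stub that would have to be replaced by the real model.

```python
import sys


def predict(name):
    """Placeholder model: replace with the real name-to-gender prediction."""
    # Hypothetical stub that always returns a 50/50 split.
    return {'male': 0.5, 'female': 0.5}


def format_result(result):
    """Turn the probability dict into the text shown in the window."""
    return 'male: {:.1%}, female: {:.1%}'.format(result['male'], result['female'])


def run_gui():
    # Imported here so the helpers above can be used without PyQt5 installed.
    from PyQt5.QtWidgets import (QApplication, QLabel, QLineEdit,
                                 QPushButton, QVBoxLayout, QWidget)

    app = QApplication(sys.argv)
    window = QWidget()
    window.setWindowTitle('Guess gender from name')

    name_edit = QLineEdit()
    button = QPushButton('Guess')
    output = QLabel('')

    # Clicking the button runs the model on the text in the input box.
    button.clicked.connect(
        lambda: output.setText(format_result(predict(name_edit.text()))))

    layout = QVBoxLayout(window)
    for widget in (name_edit, button, output):
        layout.addWidget(widget)

    window.show()
    sys.exit(app.exec_())
```

Calling run_gui() opens the window; the formatting helper is kept separate from the widgets so it can be checked without starting a GUI.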

That's all for now. Thank you for reading.
