Introduction: As the Internet big data industry keeps growing, more and more people are working in it, and many others are strongly interested and want to join. Starting with this issue, we will introduce the industry in four installments, each covering a different big data role: data mining & machine learning, data analysis, algorithms & deep learning, and data product manager.

Data source: Our data for these four installments comes mainly from Lagou. The currently popular recruitment websites Liepin, Boss Zhipin and Lagou all carry many Internet job listings. We chose Lagou for the following reasons: 1. Salaries are mostly stated directly, and few are listed as negotiable. 2. Its coverage of employers is fairly complete and includes essentially all Internet-related companies. 3. Its URLs follow a regular pattern, which makes batch crawling convenient. The data display page is as follows:

We crawl the listings with Selenium in Python, using the following code:

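# Assumes `driver` is an already-initialized Selenium WebDriver on a Lagou result page,
# and that job_desc and job_code are lists created beforehand.
# Note: the find_element_by_* methods are the Selenium 3 API; Selenium 4 uses
# driver.find_element(By.XPATH, xpath) and friends instead.
while True:
    try:
        # Read up to 15 list items from the current result page.
        for j in range(15):
            xpath = '//*[@id="s_position_list"]/ul/li[' + str(j + 1) + ']'
            a = driver.find_element_by_xpath(xpath)
            job_desc.append(a.text)
            job_code.append(a.find_element_by_class_name(
                'position_link').get_attribute('data-lg-tj-cid'))
        # Scroll down so the "next page" button is in view, then click it.
        js = "var q=document.documentElement.scrollTop=10000"
        driver.execute_script(js)
        driver.find_element_by_class_name('pager_next').click()
    except Exception:
        # Stop as soon as an expected element cannot be found or clicked.
        break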
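
The scraped text still has to be turned into numbers before any salary chart can be drawn. The sketch below shows one possible way to do that. It assumes salaries appear as ranges such as "15k-25k" and takes the midpoint of the range as the monthly salary; both the format and the midpoint rule are assumptions, the function name monthly_salary_k is hypothetical, and the sample texts are made up rather than taken from the crawl.

```python
import re

# Assumed format: salary ranges such as "15k-25k" (an assumption, not confirmed by the article).
SALARY_RANGE = re.compile(r"(\d+)\s*[kK]\s*-\s*(\d+)\s*[kK]")

def monthly_salary_k(text):
    """Return the midpoint of the first salary range found in text (in K), or None."""
    match = SALARY_RANGE.search(text)
    if match is None:
        return None
    low, high = (int(group) for group in match.groups())
    return (low + high) / 2

# Hypothetical scraped card texts, not real data.
sample_cards = [
    "Data Mining Engineer [Beijing]\n15k-25k 3-5 years / Bachelor",
    "Machine Learning Engineer [Shanghai]\n20K-40K 5-10 years / Master",
    "Algorithm Intern [Wuhan]\nNegotiable",
]
salaries = [monthly_salary_k(card) for card in sample_cards]
print(salaries)
```

Listings without a parsable range come back as None, so they can be filtered out before averaging instead of silently counting as zero.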

Salary: We will look at the data from several angles, starting with the number of openings and the average monthly salary in each city, shown in the following chart (bubble size indicates the number of jobs and bar height indicates the average monthly salary):

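The two quantities in the city chart, job count and average monthly salary, come from a simple group-by. The following is a minimal sketch of that aggregation, assuming each listing has already been reduced to a (city, monthly salary in K) pair. The records and the function name summarize_by_city are hypothetical and do not reproduce the article's data.

```python
from collections import defaultdict

# Hypothetical (city, monthly salary in K) records, not the article's data.
records = [
    ("Beijing", 25.0), ("Beijing", 30.0), ("Beijing", 20.0),
    ("Shanghai", 22.0), ("Shanghai", 26.0),
    ("Wuhan", 15.0),
]

def summarize_by_city(rows):
    """Return {city: (job_count, average_salary)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for city, salary in rows:
        totals[city][0] += 1
        totals[city][1] += salary
    return {city: (n, total / n) for city, (n, total) in totals.items()}

summary = summarize_by_city(records)
# Sort by job count, largest first, as in a ranked city chart.
for city, (count, avg) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{city}: {count} jobs, {avg:.1f}K average")
```
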
Wuhan, the eighth city in the list, has only one fortieth as many jobs as Beijing, and the cities at the bottom of the list have fewer than 20 jobs each. To some extent this reflects how concentrated data mining & machine learning positions are in Beijing, Shanghai, Guangzhou, Shenzhen and Hangzhou. Beyond these five major cities, Chengdu, Nanjing and Wuhan also show strong potential for the future. Let's take a look at the number of jobs and the salaries for different levels of work experience:

You can see that most of the job opportunities on Lagou are aimed at experienced job seekers. Three years and five years of work experience are two important thresholds, each accompanied by a clear jump in salary, which shows how much companies value experience. Here are the companies' education requirements:



Note that the education requirement listed on Lagou is a minimum, so the average education level of people actually doing the job will be much higher than the chart suggests. Next, we combine city and experience to see how much experience raises pay in different cities:

Beijing leads the country in salary at every level of work experience, which underlines the capital's status as an Internet hub. At 5-10 years of work experience, salary growth in Guangzhou lags behind the other big cities. Readers who work in Guangzhou are welcome to tell us whether this matches reality. Next, the average monthly salary offered by major companies:



BAT and TMD are among the 15 companies posting the most jobs on the site, along with Sogou, Weibo and NetEase. Surprisingly, the highest average salary on offer comes from Sina Weibo. A company's actual average salary is a complicated matter, so these figures are for reference only. We plotted the charts above with ggplot2, using code like the following (taking the company salary chart as an example):

library(ggplot2)
library(ggthemes)  # provides theme_wsj(), scale_fill_wsj() and scale_color_wsj()

# company_com is assumed to hold one row per company (15 rows) with columns company and salary.
ggplot(company_com, aes(x = reorder(company, -salary), y = salary,
                        fill = as.character(rep(1:5, each = 3)))) +
  geom_bar(stat = 'identity') +
  geom_text(aes(label = round(salary, 2), y = salary + 1), size = 5) +
  theme_wsj() +
  scale_fill_wsj() +
  scale_color_wsj() +
  ggtitle('Average monthly salary for Posts by Type of Company (K)') +
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_blank(),
        plot.title = element_text(hjust = 0.5, size = 25),
        legend.position = 'none',
        panel.grid = element_blank(),
        axis.title = element_blank(),
        axis.text = element_text(face = 'bold', hjust = 0.8, size = 10, angle = 15))

Expected salary calculation: With a linear regression model we can easily estimate the expected monthly salary (unit: K). We used only three factors, experience, city and education, and did not include interaction terms or higher-order terms. The results are for reference only, since the real situation is much more complicated:



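A linear regression on three categorical factors with no interaction terms amounts to an intercept plus one additive offset per factor level. The sketch below illustrates that idea in plain Python using dummy variables and ordinary least squares. It is not the model fitted for this article: the level names, the BONUS table and the training rows are synthetic and chosen only so that the result is easy to check by hand.

```python
from itertools import product

# Levels for each factor; the first level of each is the baseline (its dummy is dropped).
CITIES = ["Wuhan", "Shanghai", "Beijing"]
EXPERIENCE = ["1-3 years", "3-5 years", "5-10 years"]
EDUCATION = ["Bachelor", "Master"]
FACTORS = [CITIES, EXPERIENCE, EDUCATION]

def encode(row):
    """Intercept plus one 0/1 dummy per non-baseline level (no interaction terms)."""
    features = [1.0]
    for value, levels in zip(row, FACTORS):
        features += [1.0 if value == level else 0.0 for level in levels[1:]]
    return features

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, salaries):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    x = [encode(row) for row in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, salaries)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, row):
    """Expected monthly salary (K) for a (city, experience, education) tuple."""
    return sum(b * f for b, f in zip(beta, encode(row)))

# Synthetic training data from a made-up additive rule (NOT the article's data).
BONUS = {"Wuhan": 0, "Shanghai": 6, "Beijing": 8,
         "1-3 years": 0, "3-5 years": 5, "5-10 years": 12,
         "Bachelor": 0, "Master": 3}
rows = list(product(CITIES, EXPERIENCE, EDUCATION))
salaries = [10 + sum(BONUS[value] for value in row) for row in rows]

beta = fit(rows, salaries)
estimate = predict(beta, ("Beijing", "5-10 years", "Master"))
print(round(estimate, 2))
```

Because the synthetic salaries follow the additive rule exactly, the fitted intercept should come out close to the base value of 10 and the estimate close to 10 + 8 + 12 + 3 = 33. With real listings the fit would only approximate the data, which is why the article's results are for reference only.
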
Skills & Benefits: Beyond the hard requirements above, the skills you actually have matter even more for getting a good salary. Let's take a look at the skills you need to get into data mining & machine learning:



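A skills chart of this kind can be built by counting how many job descriptions mention each skill. The following is a minimal sketch of such a count. The keyword list, the sample descriptions and the function name count_skills are hypothetical; the article does not say how its skill statistics were produced.

```python
import re
from collections import Counter

# Hypothetical skill keywords and job descriptions, for illustration only.
SKILLS = ["python", "sql", "spark", "hadoop", "r"]
descriptions = [
    "Proficient in Python and SQL; experience with Spark is a plus.",
    "Familiar with Hadoop, Spark and Python.",
    "Strong R or Python skills; good SQL knowledge.",
]

def count_skills(texts, skills):
    """Count how many descriptions mention each skill as a whole word."""
    counts = Counter()
    for text in texts:
        # Tokenize into lowercase words so that "r" does not match inside "spark".
        words = set(re.findall(r"[a-z+#]+", text.lower()))
        counts.update(skill for skill in skills if skill in words)
    return counts

skill_counts = count_skills(descriptions, SKILLS)
for skill, n in skill_counts.most_common():
    print(skill, n)
```

Matching whole words rather than substrings matters for short names such as R, which would otherwise be counted in almost every description.
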
And what benefits can you expect after joining a company? Take a look at the following figure:

This article is from Ali Cloud developer community

The original link: developer.aliyun.com/article/617…