Hello, I’m Yue Chuang.

If you already know the basics of Python, congratulations on having picked up this simple and efficient language. You can skip these lessons, or use them as a refresher to fill in anything you missed. You are also welcome to share your tips for learning and using Python in the comments section.

So you may have a question in mind right now: is it necessary to know Python to do data analysis well?

My answer is that if you want to do data analysis well, you had better know Python. Why do I say that?

First, in a survey of development languages, 80% of developers who have used Python have Python as their primary language. Python has become the fastest growing mainstream programming language, standing out from the crowd of development languages and becoming a favorite among developers.

Second, in data analysis, Python is used by more developers than all other languages combined. Finally, Python is a compact language with a large number of third-party libraries, and it is powerful enough to solve most data analysis problems, which I’ll discuss in detail below.

Python’s greatest strength is its brevity. Although it is written in C, it does away with C’s pointers, which makes the code very concise. One line of Python code can be equivalent to five lines of Java code. Python code reads almost as intuitively as English, which allows programmers to focus on solving the problem rather than on the language itself.

Of course, beyond the features of the language itself, Python also has a powerful set of developer tools. It has a number of well-known data science libraries, including the scientific computing tools NumPy and Pandas, the deep learning tools Keras and TensorFlow, and the machine learning tool scikit-learn.

In conclusion, if you want to make a difference in data science, whether in data analysis or machine learning, it is essential to know a programming language, Python in particular, along with the tools we just mentioned.

1. Installation and IDE environment

Now that you know why to learn Python, let’s get you to your first Python program quickly, starting with how to install Python and set up an IDE.

1.1 Python version selection

There are two major versions of Python: 2.7.x and 3.x. The two versions differ, but not by much: less than 10% of the syntax is different.

Another fact is that most Python libraries support both Python 2.7.x and 3.x. Although Python 2.7 is officially maintained only until 2020, I would like to tell you this: don’t ignore Python 2.7. It will live on well beyond 2020, and it will remain the dominant Python version for the next two years. A survey showed that version 2.7 was still the dominant version in commercial projects in 2017, accounting for 63.7%. Adoption of Python 3.x has grown quickly in recent years, even though Python 3.x has actually been around since 2008.

So you might ask: How do you choose between these two versions?

The criterion for choosing a version is whether your project depends on packages that require Python 2.7. If it does, use Python 2.7. Otherwise, you can start a new project with Python 3.x.

1.2 Python IDE Recommendation

Once you have settled the version question, how do you choose a Python IDE? There are many excellent choices; here are some recommendations.

  1. PyCharm

    PyCharm is a cross-platform Python development tool that helps users improve their productivity with debugging, syntax highlighting, code navigation, auto-completion, smart hints, and more.

  2. Sublime Text

    Sublime Text is a well-known editor. Sublime Text 3 launches in about a second and is very responsive. It also has good Python support, with code highlighting, syntax hints, auto-completion, and more.

  3. Vim

    Vim is a clean, efficient tool that is fast and can do anything without crashing. Vim, however, is a bit more difficult to get started with than Sublime Text and a bit more cumbersome to configure.

  4. Eclipse+PyDev

    Eclipse will be familiar to Java developers, so Eclipse with the PyDev plugin is a good choice for developers who already know Eclipse.

If you haven’t used any of the above IDEs much, Sublime Text is recommended for its ease of use and quick response.

2. Basic Python syntax

With the environment configured, let’s quickly go through the basic Python syntax that you must know. I’m assuming you are new to Python but already have some foundation in other programming languages. Let’s look at the pieces one by one.

2.1 Input and Output

name = raw_input("What's your name?")  # Python 2.7; in Python 3.x use input() instead
sum = 100 + 100
print('hello,%s' % name)
print('sum = %d' % sum)

raw_input is the input function in Python 2.7; in Python 3.x, use the input function instead. Here it assigns what the user types to name. print is the output function, and % name substitutes the value of the variable name into the string.

Here is the result:

What's your name?cy
hello,cy
sum = 200
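To make the % formatting above a little more concrete, here is a minimal sketch that needs no keyboard input. The helper name format_greeting is hypothetical, introduced only for this illustration; %s stands for a string and %d for an integer.

```python
def format_greeting(name, total):
    # Hypothetical helper: %s inserts a string, %d inserts an integer.
    greeting = 'hello,%s' % name
    summary = 'sum = %d' % total
    return greeting, summary


greeting, summary = format_greeting('cy', 100 + 100)
print(greeting)  # hello,cy
print(summary)   # sum = 200
```

Separating the formatting from the input call makes it easy to try different values without typing them each time.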

2.2 Conditional statement: if … else …

score = 10
if score >= 90:
	print('Excellent')
else:
	if score < 60:
		print('Fail')
	else:
		print('Good Job')

if … else … is the classic conditional statement. Note that there is a colon after the if expression and also a colon after else.

Also note that Python does not use {} or begin … end to delimit blocks of code as other languages do. Instead, it uses indentation and colons to distinguish levels of code, so indentation is part of Python’s syntax. What happens if you don’t indent your code consistently, say, by mixing tabs and spaces? An error is raised. Code at the same level must use the same indentation.
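The indentation rule can be checked without breaking a real file. The sketch below assumes Python 3 and uses the built-in compile function on two small source strings; the helper name indentation_error and the sample strings are made up for this illustration.

```python
def indentation_error(source):
    """Return the name of the indentation error raised by source, or None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as err:  # TabError is a subclass of IndentationError
        return type(err).__name__
    return None


# Both lines of the block are indented with four spaces.
consistent = "if True:\n    x = 1\n    y = 2\n"
# One line is indented with a tab, the next with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"

print(indentation_error(consistent))
print(indentation_error(mixed))
```

The first string compiles cleanly, while the second is rejected before any of its code runs, which is why it is worth configuring your editor to insert spaces only.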

2.3 Loop statements: for … in

sum = 0
for number in range(11):
    sum = sum + number
print(sum)

Running results:

55

The for loop is an iterative loop mechanism that repeats the same logic for each item. If we know the number of iterations, we can use the range function, which is often used in for loops.

  • range(11) runs from 0 to 10, not including 11, which is the same as range(0,11)
  • You can also add steps to range. For example, range(1,11,2) represents [1,3,5,7,9].
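The two points above can be checked directly. This short sketch wraps each range in list() so the values become visible (in Python 3, range itself is a lazy sequence rather than a list):

```python
# range(11) and range(0, 11) produce the same numbers: 0 through 10.
print(list(range(11)))
print(list(range(0, 11)))

# A third argument sets the step: start at 1, stop before 11, step by 2.
print(list(range(1, 11, 2)))  # [1, 3, 5, 7, 9]
```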

2.4 Loop statements: while

sum = 0
number = 1
while number < 11:
	sum = sum + number
	number = number + 1
print(sum)

Running results:

55

The sum from 1 to 10 can also be computed with a while loop, where the condition controls how many times the loop runs. The while loop is a conditional loop, so its variables can be updated in more flexible ways. The while loop therefore suits loops whose number of iterations is not known in advance, while the for loop suits loops with a fixed number of iterations.
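As an illustration of a loop whose iteration count is not known in advance, here is a minimal sketch. The function name count_halvings is hypothetical; it keeps halving a number until the value drops below 1, so the condition rather than a counter decides when to stop.

```python
def count_halvings(value):
    """Count how many times value can be halved before it drops below 1."""
    count = 0
    while value >= 1:  # the loop ends when the condition becomes false
        value = value / 2
        count = count + 1
    return count


print(count_halvings(100))  # 7
```

Writing this with a for loop would require knowing the answer beforehand, which is exactly the case where while is the more natural choice.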

2.5 Data types: List, tuple, dictionary, and set

2.5.1 List: []

lists = ['a', 'b', 'c']
lists.append('d')
print(lists)
print(len(lists))
lists.insert(0, 'mm')
lists.pop()
print(lists)

Running results:

['a', 'b', 'c', 'd']
4
['mm', 'a', 'b', 'c']

The list is a common data structure in Python, equivalent to an array. It supports adding, deleting, modifying, and searching elements. append() adds an element at the tail, insert() inserts an element at a given position, and pop() removes the element at the tail.
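The example above covers adding and deleting. As a small complementary sketch, modifying and searching can look like this (the variable names found, position, and removed are chosen only for illustration):

```python
lists = ['a', 'b', 'c']

lists[1] = 'z'                # modify: replace the element at index 1
found = 'c' in lists          # search: membership test
position = lists.index('c')   # search: index of the first match
removed = lists.pop(0)        # delete: pop also accepts an index

print(lists)     # ['z', 'c']
print(found)     # True
print(position)  # 2
print(removed)   # a
```

Note that position is computed before pop(0) runs, so it refers to the list while it still had three elements.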

2.5.2 Tuple: ()

tuples = ('tupleA', 'tupleB')
print(tuples[0])

Running results:

tupleA

A tuple is very similar to a list, but a tuple cannot be modified once initialized. Because it cannot be modified, it has no append() or insert() methods. Its elements can be read by index like an array, such as tuples[0], but they cannot be assigned to.
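A quick way to see this immutability is to attempt an assignment and catch the resulting error. This is only a sketch; the flag assignment_failed is a name made up for the illustration.

```python
tuples = ('tupleA', 'tupleB')
assignment_failed = False

try:
    tuples[0] = 'changed'  # tuples do not support item assignment
except TypeError as err:
    assignment_failed = True
    print('TypeError:', err)

print(tuples)  # ('tupleA', 'tupleB')
```

The tuple is left unchanged, which is what makes tuples a safe choice for values that should stay fixed.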

2.5.3 Dictionary: {}

# -*- coding: utf-8 -*-
# Define a dictionary
score = {'guanyu': 95, 'zhangfei': 96}
# Add an element
score['zhaoyun'] = 98
print(score)
# Delete an element
score.pop('zhangfei')
# Check whether the key exists
print('guanyu' in score)
# Look up the value of a key
print(score.get('guanyu'))
print(score.get('yase', 99))

Running results:

{'guanyu': 95, 'zhaoyun': 98, 'zhangfei': 96}
True
95
99

A dictionary is a collection of {key: value} pairs. If you assign a value to the same key several times, the later value overwrites the earlier one. Adding a dictionary element is just an assignment, such as score['zhaoyun'] = 98; deleting an element uses pop, and querying uses get. If the queried key does not exist, we can also supply a default value, such as score.get('yase', 99).
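The overwrite behavior and the default value of get can be seen in a few lines. This is a small sketch reusing the names from the example above:

```python
score = {'guanyu': 95}

# Assigning to the same key again replaces the earlier value.
score['guanyu'] = 90
score['guanyu'] = 99
print(score)  # {'guanyu': 99}

# get returns None for a missing key unless a default is supplied.
print(score.get('yase'))      # None
print(score.get('yase', 99))  # 99
```

Using get with a default never adds the key to the dictionary; it only affects the value returned by that one call.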

2.5.4 Set: set

s = set(['a', 'b', 'c'])
s.add('d')
s.remove('b')
print(s)
print('c' in s)

Running results:

{'a', 'd', 'c'}
True

A set is similar to a dictionary, except that it is only a collection of keys and does not store values. It also supports adding and deleting: add with add, delete with remove, and use in to check whether an element is in the set.
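Because a set holds only keys, each element appears at most once. The following sketch (with made-up sample values) shows duplicates being dropped and a repeated add having no effect:

```python
s = set(['a', 'b', 'a', 'c', 'b'])
print(len(s))  # 3

s.add('a')     # 'a' is already present, so nothing changes
print(len(s))  # 3

print('a' in s)  # True
print('z' in s)  # False
```

This makes a set a convenient tool for removing duplicates from a list when the order of elements does not matter.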

2.6 Comments: #

Comments in Python use #. If the comments contain Chinese, add # -*- coding: utf-8 -*- at the top of the file.

For multi-line comments, use three single quotes, or three double quotes, as in:

# -*- coding: utf-8 -*-
'''
This is a multi-line comment
written with three single quotes.
'''

2.7 Reference modules/packages: import

# Import a module
import module_name
# Import multiple modules
import module_name1, module_name2
# Import the specified module from the package
from package_name import module_name
# Import all modules in the package
from package_name import *

import is easy to use in Python: simply write import module_name. What does import really do? In essence, it is a path search. What you import can be a module or a package.

For a module, what we actually reference is a .py file. For a package, you can use from … import …; in this case the module is imported from a directory, and that directory contains an __init__.py file (required in Python 2.7, and still used for regular packages in Python 3.x).
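The placeholder names above cannot be run as written, so here is a sketch of the same import forms using standard library names instead. It assumes a standard CPython installation, where json ships as a package directory and math as a single module.

```python
import math               # import a module
import json               # import a package (a directory of modules)
from json import decoder  # import one module from that package
import os.path

print(math.sqrt(16))  # 4.0

# A package's __file__ points at the __init__ file inside its directory.
print(os.path.basename(json.__file__))

# The imported module exposes its own names, such as the JSONDecoder class.
print(decoder.JSONDecoder.__name__)  # JSONDecoder
```

Printing __file__ is a handy way to see where the path search actually found a module.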

2.8 Function: def

def addone(score):
    return score + 1

print(addone(99))

Running results:

100

A function’s code block begins with the def keyword, followed by the function name and parentheses. Parameters are passed in inside the parentheses, and return gives back the result of the function.
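A function can take more than one parameter, and one function can call another. In this sketch the name add is hypothetical, introduced only to show addone being built on top of a more general function:

```python
def add(a, b):
    # Two parameters are listed inside the parentheses, separated by a comma.
    return a + b


def addone(score):
    # Reuse add rather than repeating the arithmetic.
    return add(score, 1)


print(add(1, 5))   # 6
print(addone(99))  # 100
```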

2.9 A + B Problem

With the basic syntax described above, we can run Python code in the Sublime Text editor. In addition, here is a very efficient way to level up: make full use of a site for practicing problems, acm.zju.edu.cn/onlinejudge… This is the OnlineJudge of Zhejiang University ACM.

What is OnlineJudge?

OnlineJudge will tell you the result of the run. If the result is correct, it will say “Accepted”. If it is wrong, it will say “Wrong Answer”.

Don’t take such problems lightly. A submission can also fail with a compilation error, a memory overrun, a runtime timeout, and so on, so the bar for code quality is fairly high. Now I’m going to show you the A+B problem, so you can try the exercise yourself and submit your answer to the judge.

2.10 Problem: A+B

Input format: A series of integer pairs A and B, separated by spaces.

Output format: For each integer pair A and B, output the sum of A and B.

Input and output examples:

INPUT
1 5
OUTPUT
6

For this problem, I gave the following answer:

while True:
    try:
        line = input()
        a = line.split()
        print(int(a[0]) + int(a[1]))
    except (EOFError, ValueError, IndexError):
        # Stop at the end of input, or at a blank or malformed line
        break
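One possible variation, shown here only as a sketch, is to put the logic in a function that reads lines from any file-like object. That makes it testable without a judge: in a real submission you would pass sys.stdin, while here an in-memory io.StringIO stands in for it. The names solve and sample are hypothetical, and skipping incomplete lines is an assumption about the input rather than part of the problem statement.

```python
import io


def solve(stream):
    """Return the sum of each integer pair found on the lines of stream."""
    results = []
    for line in stream:
        parts = line.split()
        if len(parts) < 2:
            continue  # assumption: blank or incomplete lines are skipped
        results.append(int(parts[0]) + int(parts[1]))
    return results


sample = io.StringIO("1 5\n10 20\n")
for total in solve(sample):
    print(total)
```

Iterating over the stream ends naturally at end of input, so no exception handling is needed to detect it.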

Of course, everyone can have a different solution, and there are official Python answers. Here’s why we introduce OnlineJudge:

  1. You can get feedback online, submit your code, and the system will tell you whether it’s right or wrong. And you can see the correct rate of each question, and the status of feedback after submission;
  2. There are community forums for exchange and learning;
  3. Of course, it is also a great help for applying data mining algorithms flexibly and for strengthening your overall programming foundation.

3. Summary

We now know that Python is without a doubt the most dominant language for data analysis. Today we covered a good deal of Python’s basic syntax, and you have probably noticed how concise it is. If you have a background in other programming languages, you can easily carry it over to Python syntax. So here we are, started with Python. Is there a way to quickly improve your Python programming from here? I’ll share my thoughts with you.

In our daily work, the problems we solve are not really difficult problems, and most of us do development work rather than research projects. So the main thing we need to improve is proficiency, and the only way to proficiency is practice, practice, practice!

If you’re new to Python, don’t worry: the best way to learn it is to write it. Run through the examples above yourself, typing them in by hand.

If you want to improve your programming fundamentals, especially the algorithm and data structure skills that you will use in future development, ACM Online Judge is a great choice. Be brave enough to open that door and use it as a tool for your advancement.

You can start with the problems that have a high Accepted ratio. The more problems you get right, the higher your ranking, which means that your programming skills, including algorithms and data structures, are improving. In addition, learning alongside everyone in the community, with rankings like in a game, makes learning more interesting and less lonely.

Many times in this article I emphasize the importance of practice to increase your proficiency in data analysis. So I’ve given you two practice questions, so you can think about how to do them, and feel free to post your answers in the comments, and I’ll discuss them with you in the comments section.

  1. How do I reference the scikit-learn library in Python?
  2. How do I compute the sum 1 + 3 + 5 + 7 + … + 99 in Python?

You are welcome to share today’s lesson with your friends and help them master the powerful Python language.

Student Share 1:

3. Online Judge recommended by teachers

Getting started with Python: the book “Python Programming: From Getting Started to Practice” is enough.

IDE: PyCharm, Jupyter Notebook + Spyder3, Sublime Text 3

Database: PostgreSQL (pleasant to use), MySQL (open source, mainstream)

Python version: choose Python 3 without hesitation (since Python 2 will be out of maintenance in 2020)

Improving: there is not much to say, just do it. Write more and practice more, and you will naturally get a feel for it; once you have written more code, you will see problems at a different level. So be tough on yourself, and don’t keep lingering at the doorstep.

Student Share 2:

  1. I have used PyCharm, Sublime, and Jupyter. I think PyCharm suits larger projects. For the small scripts I usually write on my own I use Sublime, which is simple and convenient. Lately I have been using Jupyter, which is better suited to data analysis, chart display, and visualization; getting a result for each line of code is very convenient, and I wrote out all of today’s lesson in Jupyter.
  2. Sum: sum(range(1, 100, 2)). The signature is sum(iterable, start), and the first argument must be an iterable.
  3. The error “'int' object is not iterable”: I ran into it today in Jupyter too. It is probably because the teacher’s earlier examples used sum as a variable name and this exercise then calls sum() as a function, hence the error. Restarting Jupyter fixes it, or using the magic command %reset to clear the variables should also work.

The answer:

(1)
sum = 0
for i in range(1, 100, 2):
    sum += i
print(sum)

(2)
def sum(x):
    if x > 99:
        return 0
    num = sum(x + 2)
    return x + num
print(sum(1))

Q1: Not a Python built-in library

pip install scikit-learn

Reference the library: import sklearn

Q2:

Method 1: sum function

print(sum(range(1, 100, 2)))

Method 2: for loop

a = 0
for i in range(1, 100, 2):
    a += i
print(a)

Method 3: while loop

i = 1
b = 0
while i < 100:
    if i % 2 != 0:
        b += i
    i += 1
print(b)
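As a sanity check on the answers to Q2, the sketch below compares the built-in sum, an explicit loop, and the closed form: the sum of the first n odd numbers is n squared, and there are 50 odd numbers from 1 to 99. The variable names are chosen only for this illustration, and the block assumes sum has not been reassigned as a variable earlier in the same session.

```python
n = 50  # how many odd numbers lie between 1 and 99

by_builtin = sum(range(1, 100, 2))

by_loop = 0
for i in range(1, 100, 2):
    by_loop += i

by_formula = n * n  # the sum of the first n odd numbers is n squared

print(by_builtin, by_loop, by_formula)  # 2500 2500 2500
```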