Chapter 2 Network Applications

Section 1 Computer network application architecture

1.1 Classification of computer network applications

Computer network applications can be divided into three types by architecture:

  • Client/server (C/S) structure
  • Pure P2P structure
  • Hybrid structure


1.2 C/S structure network application

  • C/S structure network applications are the most typical and most basic network applications.
  • Commonly used applications such as the WWW, file transfer (FTP), and email are all C/S structure network applications.


Characteristics

In C/S network applications, communication takes place only between clients and servers; clients do not communicate directly with each other.

The schematic diagram of C/S structure network application is as follows:




1.3 Pure P2P network application

What is peer-to-peer (P2P)

P2P stands for peer-to-peer. In a P2P network, every end system acts as both a client and a server.


Characteristics

  • Each peer has the characteristics of both the client and the server of a C/S application. It combines server and client, and communication between peers is established dynamically.
  • P2P applications aggregate and make full use of the computing capacity and network bandwidth of the end systems, and depend little on servers.

The schematic diagram of pure P2P network application is as follows:




1.4 Hybrid structure network applications

Characteristics

A hybrid structure network application combines the C/S and P2P approaches: it has a central server, and peers also communicate directly with each other.

The schematic diagram of hybrid structure network application is as follows:




Section 2 Basic principles of Network Application communication

2.1 Basic Principles of Network Application Communication

Whether a network application uses the C/S, P2P, or hybrid structure, its basic communication principle is C/S communication.


The basic communication process of network applications

Application processes running on different hosts communicate in C/S mode.


Basic principles of C/S communication

The server runs the server process, which waits passively for clients to request service. The client runs the client process, which initiates communication and requests service from the server process. Application processes exchange application layer messages according to an application layer protocol.
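
The roles described above can be sketched with two small functions on the loopback interface. This is only an illustration: the "echo" reply, the function names, and the use of a thread to stand in for the server process are assumptions made for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: wait passively, then answer a single request."""
    conn, _client_addr = listener.accept()   # blocks until a client connects
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"echo: " + request)    # application layer reply


def request_service(server_addr: tuple[str, int], message: bytes) -> bytes:
    """Client side: initiate communication and request the service."""
    with socket.create_connection(server_addr, timeout=5) as client:
        client.sendall(message)
        return client.recv(1024)


def demo() -> bytes:
    # Port 0 asks the OS for any free port; loopback keeps the demo local.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = request_service(listener.getsockname(), b"hello")
        server.join()
        return reply
```

Calling demo() starts the passive server, lets the client send b"hello", and returns the server's reply. A single recv call is enough here only because the messages are tiny.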


2.2 Network Application and Transport Layer Services

  • Network applications require the transport layer to provide end-to-end transport services.
  • The Internet transport layer provides only two types of service:
    • Connection-oriented, reliable byte-stream transport service (TCP)
    • Connectionless, unreliable datagram transport service (UDP)
  • Neither of these services guarantees end-to-end throughput or delay.


2.3 Application programming interfaces

  • The socket is a typical network application programming interface.
  • A socket passes messages between an application process and the underlying protocols. It is therefore the channel through which an application process sends messages to and receives messages from other application processes.


2.4 Process Identification

A process is identified by the IP address of the host on which it runs and the port number bound to its socket.
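
As a rough illustration, two sockets on the same host share the IP address and are told apart only by their port numbers. The sketch below assumes UDP sockets on the loopback address; the helper name bound_endpoint is made up for the example.

```python
import socket


def bound_endpoint() -> tuple[socket.socket, tuple[str, int]]:
    """Create a UDP socket on loopback and return it with its (IP, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
    sock.settimeout(5)               # avoid blocking forever in the demo
    return sock, sock.getsockname()


sock_a, addr_a = bound_endpoint()
sock_b, addr_b = bound_endpoint()

# Same host IP, different ports: the port selects the receiving socket.
sock_a.sendto(b"ping", addr_b)
data, sender = sock_b.recvfrom(1024)

sock_a.close()
sock_b.close()
```

The sender address reported by recvfrom is the (IP, port) pair of the first socket, which is how the receiving process knows where a reply should go.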


Section 3 Domain name System

Domain Name System

The Domain Name System (DNS) is a distributed database that stores the mappings between domain names and IP addresses on the network and provides the domain name resolution service.


Domain name resolution

Domain name resolution is the process of mapping a domain name to an IP address.


3.1 Hierarchical Domain name Space

Domain names are named in a hierarchical tree structure, as shown in the following figure.

Schematic diagram of domain name structure:

The first layer below the root consists of the top-level domains. There are three types of top-level domain:

  • Country top-level domains (nTLD)
    • cn: China
    • us: the United States
    • uk: the United Kingdom
  • Generic top-level domains (gTLD)
    • com: companies and enterprises
    • net: network service organizations
    • org: non-profit organizations
    • edu: educational institutions
    • gov: government departments
    • mil: military departments
    • int: international organizations
  • Infrastructure domain
    • arpa: used for reverse domain name resolution

A domain name consists of a sequence of labels separated by dots, for example "… . third-level domain name . second-level domain name . top-level domain name". Each label represents a different level of the domain name.
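
A small sketch of reading the levels out of a dotted name. The function name is an assumption, and mail.hit.edu.cn is used only as a sample name.

```python
def domain_levels(name: str) -> list[str]:
    """Return the labels of a domain name, top-level domain first."""
    labels = name.rstrip(".").split(".")   # tolerate a trailing root dot
    return list(reversed(labels))


levels = domain_levels("mail.hit.edu.cn")
# levels[0] is the top-level domain, levels[1] the second-level domain, and so on.
```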


3.2 Domain Name Server

Domain name server

Resolving domain names requires a distributed database that stores the mappings between domain names and IP addresses on the network. This database is stored on domain name servers, which provide domain name resolution services in response to users’ requests.


Classification of domain name servers

According to the information they store and the role they play in domain name resolution, domain name servers can be classified into the following four categories:

  • Root domain name server
    • Currently, there are 13 root domain name servers, named A to M: a.root-servers.net, …, m.root-servers.net.
    • Each root domain name server holds the domain names and IP addresses of all top-level domain name servers.
  • Top-level domain name server
    • Manages all second-level domain names registered under that top-level domain.
  • Authoritative domain name server
    • The domain name server responsible for a zone. It stores the mappings between the domain names and IP addresses of all hosts in the zone.
  • Intermediate domain name server

Local domain name server:

  • When the network address of a host is configured, one domain name server is set as its default domain name server. This default server is usually called the local domain name server.
  • The local domain name server is the first domain name server a host queries during domain name resolution.


3.3 Domain name Resolution Process

Domain name Resolution classification

The domain name resolution process is divided into recursive query and iterative query.


Recursive resolution

The queried server performs further domain name queries on behalf of the querying host or server and returns the final resolution result to it.

Schematic diagram of the recursive resolution process:


Iterative resolution

The queried server simply tells the querying host or server which server to query next.

Schematic diagram of the iterative resolution process:
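
The difference between the two query styles can be sketched with a toy, in-memory hierarchy. Everything here is hypothetical: the server names, the sample domain www.example.edu.cn, and the address 192.0.2.10 are made up for illustration, and no real DNS messages are sent.

```python
# Hypothetical "servers": each maps a domain suffix either to the next
# server to ask (a referral) or to a final IP address (an answer).
SERVERS = {
    "root": {"cn": ("referral", "tld-cn")},
    "tld-cn": {"example.edu.cn": ("referral", "auth-example")},
    "auth-example": {"www.example.edu.cn": ("answer", "192.0.2.10")},
}


def ask(server: str, name: str) -> tuple[str, str]:
    """Return the record on `server` whose key is a suffix of `name`."""
    for suffix, record in SERVERS[server].items():
        if name == suffix or name.endswith("." + suffix):
            return record
    raise LookupError(f"{server} knows nothing about {name}")


def resolve_iteratively(name: str) -> tuple[str, list[str]]:
    """The querier itself follows each referral until it gets an answer."""
    server, asked = "root", []
    while True:
        asked.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, asked
        server = value            # referral: the querier asks the next server


def resolve_recursively(name: str, server: str = "root") -> str:
    """Each server queries onward on the querier's behalf."""
    kind, value = ask(server, name)
    return value if kind == "answer" else resolve_recursively(name, value)
```

In the iterative version the querier contacts every server in turn, which is why the list of servers asked is visible to it. In the recursive version the querier makes one call and only sees the final result.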


Section 4 World Wide Web Applications

4.1 Web application structure

The web application structure is mainly composed of the following three parts:

  • Web server
  • Browser
  • Hypertext Transfer Protocol (HTTP)

The schematic diagram of the Web application is as follows:


URL

Each URL consists of two parts: the host name of the server where the object is stored and the path name of the object.

For example, in http://www.abc.edu.com/cs/index.html, www.abc.edu.com is the host name of the Web server and /cs/index.html is the path name.
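
A short sketch of splitting the example URL into these two parts with Python's standard urllib.parse module. The variable names are chosen for the example.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.abc.edu.com/cs/index.html")
host_name = parts.hostname   # host name of the server storing the object
path_name = parts.path       # path name of the object on that server
```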


4.2 HTTP

4.2.1 Overview of HTTP

HTTP is an application-layer protocol for Web applications. It defines how the browser sends requests to the Web server and how the Web server responds to the browser.


4.2.2 HTTP Connections

HTTP can be divided into non-persistent HTTP and persistent HTTP according to different TCP connection policies.


(1) Non-persistent connection

Concept: With a non-persistent connection, the HTTP client establishes a new TCP connection with the HTTP server for every object it requests, and the connection is closed once that object has been transferred.

Features: Objects are requested serially, and a new TCP connection is established for each one.

Disadvantages: Because each serially requested object needs a new TCP connection, every transfer goes through the slow start phase of TCP congestion control. The connection therefore works at low throughput, and the latency is noticeable.

To improve HTTP performance, there are two typical optimization techniques: parallel connections and persistent connections.

The request transmission process is shown as follows:


(2) Parallel connection

Concept: Establish multiple parallel TCP connections, send HTTP requests and receive HTTP responses in parallel.

Features: Objects are requested in parallel, and a separate TCP connection is established for each object.

The request transmission process is shown as follows:


(3) Persistent connection

Concept: Reuse established TCP connections to send new HTTP requests and receive HTTP responses.

Depending on the strategy of transferring multiple objects using persistent connections, there are two types:


  1. Non-pipelined persistent connection

Concept: Over the persistent connection, the client can send the request message for the next object only after it has received the response message for the previous one.

The request transmission process is shown as follows:


  2. Pipelined persistent connection

Concept: Over the persistent connection, the client sends the request messages for several subsequent objects back to back, without waiting for the response to the previous object, and then receives the server’s response messages over the same connection.

The request transmission process is shown as follows:
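
To compare the connection strategies, the sketch below counts round-trip times (RTTs) under a deliberately simplified model. The model is an assumption of this example and not part of the notes: one base page plus a number of embedded objects, one RTT to set up a TCP connection, one RTT per request/response exchange, and no transmission time, slow start, or server processing.

```python
import math


def rtts_non_persistent(objects: int) -> int:
    """Serial, one TCP connection per object: 1 RTT connect + 1 RTT fetch."""
    return 2 * (1 + objects)          # the base page counts as one fetch


def rtts_parallel(objects: int, connections: int) -> int:
    """Base page first, then objects in batches of `connections`."""
    return 2 + 2 * math.ceil(objects / connections)


def rtts_persistent(objects: int) -> int:
    """One connection, non-pipelined: each request waits for the last reply."""
    return 1 + (1 + objects)


def rtts_pipelined(objects: int) -> int:
    """One connection, pipelined: object requests are sent back to back."""
    return 1 + 1 + (1 if objects else 0)
```

For a page with 10 embedded objects this model gives 22 RTTs for non-persistent connections, 6 with 5 parallel connections, 12 for a non-pipelined persistent connection, and 3 with pipelining. Real figures depend on object sizes and network conditions.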


4.2.3 HTTP Messages

HTTP messages are classified into request messages and response messages. Both consist of four parts:

  • Start line
  • Header lines
  • Blank line
  • Entity body
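
A minimal sketch of this four-part layout, assuming CRLF line endings and HTTP/1.1 in the start line. The helper names are made up for the example, and the host and path reuse the sample URL from section 4.1.

```python
def build_request(method: str, path: str, headers: dict[str, str], body: str = "") -> str:
    """Assemble start line, header lines, blank line, and entity body."""
    start_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The empty string between headers and body becomes the blank line.
    return "\r\n".join([start_line, *header_lines, "", body])


def split_message(message: str) -> tuple[str, list[str], str]:
    """Split a message back into start line, header lines, and body."""
    head, _, body = message.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    return start_line, header_lines, body


request = build_request("GET", "/cs/index.html", {"Host": "www.abc.edu.com"})
```

A GET request usually has an empty entity body, so the message built here ends right after the blank line.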


The request message

Definition: Sent by the browser (client) to the Web server.

The request message structure is shown in the figure:

There are several request methods:

  • GET: Requests the information identified by a URL. This is the most common method.
  • HEAD: Requests only the response headers for the information identified by the URL, without the entity body.
  • POST: Submits information to the server, for example when registering a user.
  • OPTIONS: Requests information about the available options.
  • PUT: Stores a document at the specified URL.


The response message

Definition: Sent by the Web server to the browser.

The response message structure is shown in the figure:




HTTP status code classification:




Common HTTP status codes:

Status code   Meaning
100           Continue: the initial portion of the request was received successfully, and the client should continue.
200           Success: the requested information is in the response message.
301           Redirect: the requested object has been permanently moved, and the new URL is given in the response message. The browser usually sends a request to the new URL automatically.
400           Bad request: the server cannot understand the client’s request.
401           Unauthorized: a user name and password are required.
404           Not found: the object requested by the client does not exist on the server.
415           Unsupported media type: the server may reject the request because its method or parameters do not match the server’s requirements.
505           The HTTP version of the request is not supported by the server.
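
Status codes are grouped by their first digit. The sketch below uses Python's standard http.HTTPStatus enum to look up the reason phrase; the class names in the dictionary follow the usual HTTP grouping, and the function name is an assumption of the example.

```python
from http import HTTPStatus

# Classes of status codes, keyed by the first digit.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({CLASSES[code // 100]})"
```

For example, describe(404) returns "404 Not Found (client error)".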


4.3 Cookies

Definition: A cookie is a small text file. It refers to data that some websites store on the user’s local terminal to identify the user and track the user’s session.

Cookies can be divided into permanent cookies and session cookies.


Permanent cookies

A permanent cookie is stored on the computer’s hard disk, and its validity period can be set. Closing the browser does not affect the validity period of the cookie.


Session cookies

A session cookie is stored in the computer’s memory and is valid only for the duration of the browser session. Once the browser window is closed, the cookie disappears.


How cookies work

  1. The server generates cookies and sends them to the browser.
  2. The browser saves the Cookie to a text file in a directory.
  3. The Cookie is sent to the server the next time the same web site is requested.
  4. The server retrieves historical user behavior data based on Cookie values.
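
The four steps can be sketched with Python's standard http.cookies module. The cookie name user_id, the value u42, the in-memory history store, and the function names are all hypothetical, and the browser side is reduced to remembering one name=value pair.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side store: cookie value -> pages that user visited.
history: dict[str, list[str]] = {}


def server_response_header(user_id: str) -> str:
    """Step 1: the server generates a cookie and sends it in the response."""
    cookie = SimpleCookie()
    cookie["user_id"] = user_id
    cookie["user_id"]["max-age"] = 3600   # a validity period makes it permanent
    history.setdefault(user_id, [])
    return cookie.output(header="Set-Cookie:")


def server_handle_request(cookie_header: str, page: str) -> list[str]:
    """Steps 3 and 4: the cookie comes back; the server looks up the history."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    visits = history[cookie["user_id"].value]
    visits.append(page)
    return visits


set_cookie = server_response_header("u42")
stored = "user_id=u42"   # step 2: what the browser would save and send back
```

Each later call such as server_handle_request(stored, "/cart") finds the same history entry, which is what lets the server track one user across requests.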

Schematic diagram of working principle:




Common Uses of Cookies

  • Websites can use the cookie ID to count accurately the actual number of visitors, the numbers of new and repeat visitors, visit frequency, and other data.
  • Web sites can use cookies to restrict access to certain users.
  • Websites can store users’ unique operating habits and preferences, and provide targeted services for users.
  • E-commerce sites can use cookies to achieve the “shopping cart” function.


Section 5 Internet E-mail

5.1 Email System Architecture

The email system consists of mail servers, the Simple Mail Transfer Protocol (SMTP), user agents, and mail reading protocols, as shown in the figure:




Mail server

  • The mail server sends and receives mail and reports the delivery status to the sender. It is the core of the email architecture.
  • Because mail servers send and receive mail in client/server mode, a mail server usually contains both a mail sending process and a mail receiving process.
  • SMTP is the application layer protocol used to send mail between mail servers.
  • The default port number bound to the mail receiving process is 25.

The email address format is: recipient mailbox name@domain name of the host where the mailbox resides. For example, in the address user_A@mail.hit.edu.cn, the domain name of the mail server is mail.hit.edu.cn and the mailbox name of the user is user_A.


User agent

The user agent is the client software of the email application. It provides the interface through which users work with email, for example QQ Mail.


5.2 SMTP

  • SMTP is the core application layer protocol of email. It transfers mail between mail servers or from a user agent to a mail server.
  • SMTP uses the transport layer protocol TCP for reliable data transfer; the port number is 25.
  • SMTP completes mail transmission through three phases of application layer interaction: the handshake phase, the mail transfer phase, and the closing phase.
  • The basic interaction mode of SMTP is that the SMTP client sends commands with parameters and the SMTP server replies to each command.


Characteristics

  • SMTP can only transfer 7-bit ASCII text. Other content (such as images, audio, and video) must be converted to 7-bit ASCII text before sending, and the recipient restores the original content.

  • The body of an SMTP mail must not contain “CRLF.CRLF”, because this sequence marks the end of the mail content. If the content contains this sequence, it must be escaped during transmission.

  • SMTP is a push protocol. When a client sends an email to the server, the client initiates a TCP connection with the server and pushes the email to the server.

  • SMTP uses persistent TCP connections. After one mail has been transferred, the client does not have to enter the closing phase; if it has more mail to send, it can reuse the established TCP connection.
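
The escaping mentioned above is commonly done by "dot-stuffing": the sender adds an extra "." in front of any body line that starts with ".", and the receiver removes it again. The sketch below shows only that transformation, with made-up function names; it is not a full SMTP client.

```python
def dot_stuff(body: bytes) -> bytes:
    """Sender side: prepend '.' to every line that starts with '.'."""
    lines = body.split(b"\r\n")
    return b"\r\n".join(
        b"." + line if line.startswith(b".") else line for line in lines
    )


def dot_unstuff(body: bytes) -> bytes:
    """Receiver side: drop the extra leading '.' again."""
    lines = body.split(b"\r\n")
    return b"\r\n".join(
        line[1:] if line.startswith(b".") else line for line in lines
    )
```

After stuffing, a body line consisting of a single "." becomes "..", so the end-of-mail sequence can no longer appear inside the content.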


5.3 Email Formats and MIME

Email format

An email consists of a header, a blank line, and a body. The header contains one or more header lines.

Common header line keywords:

Keyword    Meaning
To         Email address of the recipient
From       Email address of the sender
Subject    Subject of the email
Cc         Addresses that receive a copy of the email
Date       Date the email was sent
Reply-To   Address to which replies should be sent
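
The header, blank line, and body layout can be seen by building a message with Python's standard email.message.EmailMessage. The addresses, subject, and body text are hypothetical.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"       # hypothetical addresses
msg["To"] = "recipient@example.com"
msg["Cc"] = "copy@example.com"
msg["Subject"] = "Meeting notes"
msg.set_content("See you at ten.")

raw = msg.as_string()
# The first blank line separates the header lines from the body.
header_part, _, body_part = raw.partition("\n\n")
```

Printing raw shows the header lines first, one per line, followed by the blank line and the body. The library also adds a few MIME header lines of its own.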


MIME

MIME is short for Multipurpose Internet Mail Extensions. Its function is to convert content that is not 7-bit ASCII text into 7-bit ASCII text so that it can be transferred with SMTP.

MIME consists of three main parts:

  • Five MIME header fields that can be included in a message header:

    Keyword                     Meaning
    MIME-Version                Identifies the MIME version
    Content-Description         A human-readable description of the email content
    Content-Id                  A unique identifier for the message
    Content-Transfer-Encoding   Describes how the message body is encoded for transmission
    Content-Type                Describes the type and format of the email body

  • Definitions of many mail content formats, which standardize how multimedia email is represented.

  • Definitions of transfer encodings that can convert content of any format into a form suitable for SMTP delivery.
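
Base64 is one widely used transfer encoding of this kind. The sketch below, using Python's standard base64 module, turns arbitrary bytes into 7-bit ASCII text and back. The sample bytes are arbitrary.

```python
import base64

binary = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])   # arbitrary non-ASCII bytes
encoded = base64.b64encode(binary)                      # only 7-bit ASCII characters
decoded = base64.b64decode(encoded)                     # receiver restores the bytes
```

The encoded form is longer than the original, which is the price of fitting binary content into a text-only channel.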


5.4 Mail Read Protocol

When users need to access their own mailbox and read the mail, they need to use the mail read protocol.

The mail reading protocols in common use today are:

  • Post Office Protocol version 3 (POP3)
  • Internet Message Access Protocol (IMAP)
  • HTTP


POP3

With POP3, users download mail to the local host and work on it locally (moving, searching, reading, deleting, and so on). These local operations are not synchronized back to the mail server.

This is like cloning a GitHub repository and modifying the code locally without pushing the changes: other people still see the unchanged code in the repository.


IMAP

With IMAP, the user’s operations on mail (moving, searching, reading, deleting, and so on) are synchronized to the server, so the mailbox is up to date no matter which machine the user accesses it from.

This is like editing the code directly in the GitHub repository: everyone else sees the changes afterwards.


HTTP

With web-based mail, the browser reads mail over HTTP, so HTTP serves as the mail reading protocol.


Section 6 FTP

6.1 Overview of FTP

FTP is short for File Transfer Protocol. It is an application layer protocol used to transfer files between two hosts on the Internet.


Role

FTP reduces or eliminates the incompatibilities in file handling between different operating systems and hides the details of each computer system. It is suitable for transferring files between any heterogeneous computers on the network.


Characteristics

  • FTP is a typical client/server network application. It implements bidirectional file transfer between the client and the server in C/S mode.

  • FTP applications use two “parallel” TCP connections: the control connection and the data connection.


Schematic diagram of FTP application structure

When a user uses the FTP service, the client process first requests to establish a TCP connection with port 21 of the FTP server, which is called a control connection.

To transfer file content, the client process requests to establish a TCP connection with port 20 of the FTP server, called a data connection.


6.2 Control Connection and data connection

Control connection

When a user uses the FTP service, the client process establishes a control connection with port 21 of the FTP server. The connection is used to transfer control information (such as user ID, password, and file upload commands) between the client and the FTP server.

The control connection is persistent and remains open throughout the session.


Data connection

Because the control connection does not carry file content, the client process also needs to create a data connection to port 20 of the server for file transfer.

The data connection is non-persistent and is closed after the file transfer.