Browser mechanism

1. Processes and threads

- A process is like a factory: the factory has its own resources, and factories are independent of each other. Threads are the workers in the factory: a factory has one or more workers, multiple workers collaborate to complete a task, and the workers share the factory's space.
- Mapping the analogy back: factory resources -> system-allocated memory (a separate block of memory); factories independent of each other -> processes independent of each other; multiple workers collaborating on a task -> multiple threads in a process collaborating to complete a task; one or more workers in a factory -> a process consisting of one or more threads; shared space between workers -> threads in the same process share the process's memory (including code segments, data sets, heap, etc.).

Finally, in more official terms:

A process is the smallest unit of resource allocation (the smallest unit that can run on its own with independent resources); a thread is the smallest unit of CPU scheduling (a thread is a unit of program execution built on top of a process, and a process can have multiple threads). Different processes can also communicate with each other, but at a higher cost. The common terms "single-threaded" and "multi-threaded" both refer to one or many threads within a single process. (So, at its core, a thread still belongs to a process.)

2. Browsers are multi-process

A browser is multi-process; it can run because the system allocates resources (CPU, memory) to its processes.

3. What processes do browsers contain?

1. Browser process: there is only one, the browser's main process (responsible for coordination and overall control). It handles browser interface display and user interaction (forward, backward, etc.); page management, creating and destroying the other processes; drawing the in-memory bitmap produced by the Renderer process onto the user interface; and network resource management and downloads.
2. Third-party plug-in process: each type of plug-in corresponds to one process, created only when the plug-in is used.
3. GPU process: at most one, used for 3D drawing and the like (it shows up again in the communication flow below).
4. Browser Renderer process (the browser kernel): internally multi-threaded; by default, one process per Tab page. Its main job is page rendering, script execution, event handling, and so on.
  • Mnemonic: opening a web page in the browser is like starting a new process (and that process has its own multiple threads)

4. Advantages of multi-process browsers

Compared with a single-process browser, multi-process has the following advantages: a crash in a single page does not bring down the entire browser; a crash in a third-party plug-in does not bring down the entire browser either; multiple processes make full use of multi-core CPUs; and the sandbox model can conveniently isolate processes such as plug-ins, improving browser stability.

Put simply: if the browser were a single process, then one crashed Tab page would take down the whole browser, a terrible experience. Likewise, a crashed plug-in would affect the entire browser. And multiple processes bring many other advantages…

Of course, memory and other resource consumption will be higher: a classic case of trading space for stability.

5. Focus on the browser kernel (rendering process)

The point is: of all the processes mentioned above, which one matters most for everyday front-end work? The answer is the renderer process.

Page rendering, JS execution, and the event loop all take place in this process. Next, we focus on it.

Browser renderers are multithreaded

At last we get back to the familiar concept of threads 😭. Let's see which threads the renderer contains (naming only the main resident threads):

1. GUI rendering thread. Responsible for rendering the browser interface: parsing HTML and CSS, building the DOM tree and RenderObject tree, layout and painting, etc. This thread executes when the interface needs repainting or when some operation causes a reflow. Note that the GUI rendering thread and the JS engine thread are mutually exclusive: the GUI thread is suspended (frozen) while the JS engine is executing, and GUI updates are stored in a queue to be executed as soon as the JS engine is idle.
2. JS engine thread. Also known as the JS kernel, it handles JavaScript scripts (the V8 engine, for example). The JS engine thread is responsible for parsing JavaScript and running code. It waits for tasks to arrive in the task queue and then processes them; there is only one JS thread running in a Tab page (renderer process) at any one time. Again, because the GUI rendering thread and the JS engine thread are mutually exclusive, if JS execution takes too long, page rendering becomes incoherent and rendering feels blocked.
3. Event trigger thread. Belongs to the browser, not the JS engine, and is used to control the event loop (understandably, the JS engine is busy enough on its own and needs the browser to open another thread to help). When the JS engine executes a code block such as setTimeout (or when other browser-kernel threads fire, such as mouse clicks or AJAX asynchronous requests), the corresponding task is handed to the event trigger thread. When an event meets its trigger condition, this thread adds the event to the end of the pending queue, waiting for the JS engine to process it. Note that, due to the single-threaded nature of JS, these queued events must wait in line for the JS engine to process them (whenever the JS engine is idle).
4. Timer trigger thread. The thread behind the legendary setInterval and setTimeout. Browser timers are not counted by the JavaScript engine (the JavaScript engine is single-threaded, and a blocked thread would hurt timing accuracy); a separate thread times and fires them instead (when a timer completes, the callback is added to the event queue to wait for the JS engine to be idle). Note that the HTML standard requires setTimeout intervals below 4ms to be treated as 4ms.
5. Asynchronous HTTP request thread. After an XMLHttpRequest connects, the browser opens a new thread for the request. When a state change is detected, if a callback function was set, the asynchronous thread generates a state-change event and places the callback in the event queue, to be executed by the JavaScript engine.

6. Communication between Browser process and Renderer process

If you open the task manager and then open a browser, you can see two new processes appear in the task manager (one is the main process, and one is the renderer process for the opened Tab page). With that premise, let's look at the whole flow (much simplified):

1. The Browser process receives a request from the user. It first fetches the page content (for example, downloading resources over the network), then passes the task to the Render process via the RendererHost interface.
2. The Renderer interface of the Render process receives the message, briefly interprets it, hands it to the render thread, and rendering begins.
3. The render thread receives the request, loads the page, and renders the page; this may require the Browser process to fetch resources and the GPU process to help render. JS threads may also manipulate the DOM along the way (which can cause reflow and repaint).
4. Finally, the Render process passes the result back to the Browser process.
5. The Browser process receives the result and draws it.

7. Tease out the relationship between threads in the browser kernel

  • GUI rendering threads are mutually exclusive with JS engine threads

    Since JavaScript can manipulate the DOM, if the interface were rendered while these element attributes were being modified (that is, the JS thread and the UI thread running at the same time), the element data obtained before and after rendering might be inconsistent.

    Therefore, to prevent unexpected results from rendering, the browser sets the GUI rendering thread and the JS engine to be mutually exclusive. The GUI thread is suspended when the JS engine executes, and GUI updates are stored in a queue until the JS engine thread is idle.

  • JS blocks page loading

    Try to avoid overly long JS execution, which makes page rendering incoherent and gives the feeling that rendering is blocked.

  • WebWorker, JS multithreading?

    As mentioned above, the JS engine is single-threaded, and long-running JS blocks the page. So is JS really incapable of CPU-intensive computation? Later, HTML5 added support for Web Workers.

    MDN's official explanation: Web Workers provide an easy way for web content to run scripts in background threads. The worker thread can perform tasks without interfering with the user interface.

    A worker is an object created using a constructor (e.g. new Worker()) that runs a named JavaScript file containing the code to run in the worker thread. Workers run in a global context different from the current window, so using the window shortcut to get the current global scope (instead of self) within a Worker returns an error.

    Think about it this way:

    When creating a Worker, the JS engine asks the browser to open a child thread (the child thread is opened by the browser, is completely controlled by the main thread, and cannot operate on the DOM). The JS engine thread communicates with the worker thread in a specific way (the postMessage API, which exchanges data between threads by serializing objects).

    Therefore, if you have very time-consuming work, open a separate Worker thread for it. No matter how earth-shattering the computation inside, it will not affect the JS engine's main thread; just wait for the result and pass it back to the main thread. Perfect!

    Also note that the essence is unchanged: the JS engine is still single-threaded. You can think of the Worker as a plug-in the browser provides to the JS engine, dedicated to heavy computation problems.
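
    A minimal, hedged sketch of the flow just described (the file names and message format are illustrative assumptions):

    // main.js: offload a heavy computation to a worker thread
    const worker = new Worker('heavy.js');  // the browser opens the child thread
    worker.postMessage({ n: 1e9 });         // data is serialized (structured clone), not shared
    worker.onmessage = (e) => {
      console.log('computed off the main thread:', e.data);
    };

    // heavy.js: runs in its own global scope (self, not window; no DOM access)
    self.onmessage = (e) => {
      let sum = 0;
      for (let i = 0; i < e.data.n; i++) sum += i; // this loop would freeze the UI on the main thread
      self.postMessage(sum);                       // hand the result back to the JS engine thread
    };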

8. WebWorker SharedWorker

  • A WebWorker belongs to a single page and is not shared with the Render processes (browser kernel processes) of other pages.

    . So Chrome simply creates a new thread inside the Render process (each Tab page is a Render process) to run the JavaScript in the Worker.

  • A SharedWorker is shared by all pages in the browser, so it cannot be implemented the same way as a Worker: it does not belong to any single Render process and can be shared by multiple Render processes.

    . So Chrome creates a separate process for the SharedWorker to run its JavaScript, and for any given piece of JavaScript there is only one SharedWorker process in the browser, no matter how many times it is created.

It's essentially the difference between a process and a thread: a SharedWorker is managed by a separate process, while a WebWorker is just a thread belonging to the Render process.
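
A minimal sketch of the SharedWorker API for contrast (the file name and message handling are illustrative assumptions):

    // page.js: every page that constructs this worker shares the same instance
    const shared = new SharedWorker('shared.js');
    shared.port.start();                          // communication goes through a port
    shared.port.postMessage('hello from a tab');
    shared.port.onmessage = (e) => console.log(e.data);

    // shared.js: onconnect fires once per connecting page
    let connections = 0;
    self.onconnect = (e) => {
      const port = e.ports[0];
      connections++;
      port.onmessage = () => port.postMessage('tabs connected so far: ' + connections);
    };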

9. Browser rendering process

- The user enters a URL; the browser main process takes over, opens a download thread, and makes an HTTP request (leaving out DNS lookup, IP addressing, etc.). It waits for the response, retrieves the content, and passes it to the Renderer process via the RendererHost interface; the browser rendering process then begins.

Once the browser kernel gets the content, rendering can be broken down into the following steps:

1. Parse HTML to build the DOM tree
2. Parse CSS to build the render tree (CSS code is parsed into a tree data structure, then combined with the DOM into the render tree)
3. Lay out the render tree (layout/reflow): compute each element's size and position
4. Paint the render tree: draw each node of the page
5. The browser sends the information of each layer to the GPU, which composites the layers and displays them on the screen

The load event is triggered after the rendering is complete. Can you distinguish the load event from the DOMContentLoaded event?

  • DOMContentLoaded fires when the DOM has been loaded and parsed, not waiting for stylesheets and images. (For example, scripts loaded with async may not have finished yet.)
  • onLoad fires when all the DOM, stylesheets, scripts, and images on the page have finished loading (and rendered).
  • Order: DOMContentLoaded -> load
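
A quick way to observe the order yourself (a hedged sketch; the log wording is mine):

    document.addEventListener('DOMContentLoaded', () => {
      console.log('DOM parsed; stylesheets/images may still be loading');
    });
    window.addEventListener('load', () => {
      console.log('everything loaded; always logs after DOMContentLoaded');
    });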

Does CSS loading block DOM tree rendering?

First of all, we all know that CSS is downloaded asynchronously by a separate download thread.
  • CSS loading does not block DOM tree parsing (DOM builds as usual for asynchronous loading)
  • But it blocks the render tree until the CSS is loaded because the render tree needs CSS information.

This may also be an optimization mechanism for browsers.

Loading CSS may change the styles of the DOM nodes below, so if CSS did not block rendering, the render tree might have to be repainted or reflowed after the CSS finished loading, which is unnecessary waste. It is better to parse the DOM tree structure first, finish what can be done, and then render the render tree according to the final styles once the CSS has loaded.

Normal and composite layers

The concept of composite is mentioned in the render step.

To put it simply, there are two main categories of layers rendered by browsers: normal layers and composite layers

First of all, the normal document flow can be understood as one composite layer (called the default composite layer; no matter how many elements are added to it, they all stay in the same composite layer).

Second, absolute positioning (and likewise fixed) can leave the normal document flow, but it still belongs to the default composite layer.

You can then declare a new composite layer, hardware-accelerated, with its own separately allocated resources (and of course out of the normal document flow, so that no matter what happens in that composite layer, it does not trigger reflow or repaint in the default composite layer).

On the GPU, each composite layer is drawn separately, so they do not affect each other, which is why hardware acceleration works so well in some scenarios.

In Chrome DevTools, More Tools -> Rendering -> Layer Borders shows composite layer information with yellow borders.

How to Become a Composite Layer (Hardware-accelerated)

Turning an element into a composite layer is what is known as hardware acceleration:

  • Most commonly used: translate3d, translateZ

  • Animating (compositing layers are created only during animation execution, and elements return to their previous state before animation starts or ends)

  • The will-change property, which normally works with opacity and translate (per testing, properties other than the hardware-accelerating ones above do not produce a composite layer). It tells the browser about the change in advance, so the browser can start doing some optimization work. (Best released after use; see the sketch below.)

  • Video Iframe Canvas WebGL and other elements

  • Take the old Flash plugin

    The difference between absolute positioning and hardware acceleration: absolute can leave the normal document flow, but it cannot leave the default composite layer. So even if changes inside an absolute element do not change the render tree of the normal document flow, the browser still draws the entire composite layer when painting; changes inside absolute elements therefore still affect the drawing of the whole composite layer, and the browser repaints it. If the composite layer holds a lot of content, the drawing triggered by absolute changes becomes too heavy and the resource cost is severe.

    What does a composite layer do? Generally, an element becomes a composite layer after hardware acceleration is enabled, making it independent of the normal document flow; after a change, a repaint of the whole page can be avoided, improving performance. But try not to use composite layers in large numbers, otherwise the excessive resource consumption makes the page even more janky.

    When using hardware acceleration, use z-index as much as possible to prevent the browser from creating composite layers for subsequent elements by default. In WebKit CSS3, if an element has hardware acceleration with a low z-index level, the elements after it (at a higher or the same level, with the same relative or absolute positioning) are rendered as composite layers by default. In other words: if A is a composite layer and B sits on top of A, then B is implicitly promoted to a composite layer as well. This deserves special attention.
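
As a hedged sketch of the "release after use" advice above (the element class name is an illustrative assumption):

    const box = document.querySelector('.animated-box'); // hypothetical element
    box.addEventListener('mouseenter', () => {
      box.style.willChange = 'transform'; // hint: the browser may promote the element to its own composite layer
    });
    box.addEventListener('transitionend', () => {
      box.style.willChange = 'auto';      // release the hint once the transition finishes
    });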

Browser Compatibility Issues

To put it plainly: browser compatibility problems arise because different browsers (yes, different browsers) implement certain standards differently. As the saying goes: no IE, no harm.

- normalize.css: default styles differ between browsers. Use normalize.css to smooth out the differences, or of course a reset stylesheet or your own business CSS.

    <link href="https://cdn.bootcss.com/normalize/7.0.0/normalize.min.css" rel="stylesheet">

- The simple and crude method:

    * { margin: 0; padding: 0; }

- html5shiv.js solves the problem that browsers below IE9 do not recognize the new HTML5 tags:

    <!--[if lt IE 9]>
    <script type="text/javascript" src="https://cdn.bootcss.com/html5shiv/3.7.3/html5shiv.min.js"></script>
    <![endif]-->

- respond.js solves the problem that browsers below IE9 do not support CSS3 Media Queries:

    <script src="https://cdn.bootcss.com/respond.js/1.4.2/respond.min.js"></script>

- picturefill.js solves the problem that IE 9/10/11 and other browsers do not support the <picture> tag:

    <script src="https://cdn.bootcss.com/picturefill/3.0.3/picturefill.min.js"></script>

- IE conditional comments apply to Internet Explorer only and are ignored by other browsers.

- IE filters (the more common hack method): different characters can target the styles of specific IE versions.

- Browser CSS vendor prefixes:

    -o-transform: rotate(7deg);      /* Opera */
    -ms-transform: rotate(7deg);     /* IE */
    -moz-transform: rotate(7deg);    /* Firefox */
    -webkit-transform: rotate(7deg); /* Chrome */
    transform: rotate(7deg);         /* the unprefixed standard always goes last */

- Hyperlink styles stop working after a link is visited: hover and active take no effect unless the pseudo-classes are declared in the order link, visited, hover, active (L-V-H-A).

- placeholder compatibility: old IE does not support the placeholder attribute, so simulate it with a default value plus focus/blur handlers:

    <input type="text" value="Name *"
           onFocus="this.value = '';"
           onBlur="if (this.value == '') { this.value = 'Name *'; }">

- Clearing floats, best practice:

    .fl { float: left; }
    .fr { float: right; }
    .clearfix:after { display: block; clear: both; content: ""; visibility: hidden; height: 0; }
    .clearfix { zoom: 1; }

- Use a BFC to solve margin overlap: when adjacent elements both set margins, the margins collapse to the larger value. To keep margins from collapsing, wrap the child in a parent element and turn the parent into a BFC, e.g. with overflow: hidden;

    <div class="box" id="box">
      <p>Lorem ipsum dolor sit.</p>
      <div style="overflow: hidden;">
        <p>Lorem ipsum dolor sit.</p>
      </div>
      <p>Lorem ipsum dolor sit.</p>
    </div>

- IE6 double-margin bug on floated elements: add display: inline;

- Browsers below IE9 cannot use opacity:

    opacity: 0.5;
    filter: alpha(opacity = 50);
    filter: progid:DXImageTransform.Microsoft.Alpha(style = 0, opacity = 50);

- IE6 does not support position: fixed; hack:

    *html, *html body {
      background-image: url(about:blank);
      background-attachment: fixed;
    }
    *html #menu {
      position: absolute;
      top: expression(((e = document.documentElement.scrollTop) ? e : document.body.scrollTop) + 100 + 'px');
    }

- IE6 background flicker: links and buttons that use CSS sprites as a background flicker in IE6, because IE6 does not cache the background image and reloads it every time hover is triggered. Fix:

    document.execCommand("BackgroundImageCache", false, true);

- Markup order tip: place the date <span> tag before the title <a> tag.

- IE6 does not support min-height; hack:

    min-height: 350px;
    _height: 350px;

- IE7/IE8 do not support CSS3 background-size. A file called background-size polyfill (backgroundsize.min.htc) makes IE7 and IE8 support it: it inserts an img element into the container and recalculates width, height, left, and top to simulate background-size:

    html { height: 100%; }
    body {
      height: 100%;
      margin: 0;
      padding: 0;
      background-image: url('img/37.png');
      background-repeat: no-repeat;
      background-size: cover;
      -ms-behavior: url('css/backgroundsize.min.htc');
      behavior: url('css/backgroundsize.min.htc');
    }

- In IE6-7, line-height does not work when an img sits next to text; fix by floating both the image and the text.

- width: 100% is handy in IE, which searches up the tree layer by layer for a width value, ignoring floated layers; in Firefox you must add width: 100% to all floated layers yourself.

- cursor: IE6/7/8 support cursor: hand, while Safari and Firefox do not; all of them support cursor: pointer, so use pointer.

- Auto-wrapping long text in table cells: set table { table-layout: fixed; } and td { word-wrap: break-word; }.

- Getting the keyCode compatibly:

    var inp = document.getElementById('inp')
    var result = document.getElementById('result')
    function getKeyCode(e) {
      e = e ? e : (window.event ? window.event : "")
      return e.keyCode ? e.keyCode : e.which
    }
    inp.onkeypress = function(e) {
      result.innerHTML = getKeyCode(e)
    }

- Getting viewport and document dimensions compatibly:

    // visible area of the browser window (excluding toolbars and scrollbars), e.g. 1600 * 525
    var client_w = document.documentElement.clientWidth || document.body.clientWidth;
    var client_h = document.documentElement.clientHeight || document.body.clientHeight;
    // actual width/height of the page content (including the part beyond the scrollbars)
    var scroll_w = document.documentElement.scrollWidth || document.body.scrollWidth;
    var scroll_h = document.documentElement.scrollHeight || document.body.scrollHeight;
    // actual width/height of the page content (excluding toolbars and scrollbar borders)
    var offset_w = document.documentElement.offsetWidth || document.body.offsetWidth;
    var offset_h = document.documentElement.offsetHeight || document.body.offsetHeight;
    // scrolled distance
    var scroll_Top = document.documentElement.scrollTop || document.body.scrollTop;

- An event compatibility shim:

    var eventshiv = {
      // event object compatibility
      getEvent: function(event) {
        return event ? event : window.event;
      },
      // type compatibility
      getType: function(event) {
        return event.type;
      },
      // target compatibility
      getTarget: function(event) {
        return event.target ? event.target : event.srcElement;
      },
      // add an event handler
      addHandler: function(elem, type, listener) {
        if (elem.addEventListener) {
          elem.addEventListener(type, listener, false);
        } else if (elem.attachEvent) {
          elem.attachEvent('on' + type, listener);
        } else {
          // bracket notation because 'on' + type is computed
          elem['on' + type] = listener;
        }
      },
      // remove an event handler
      removeHandler: function(elem, type, listener) {
        if (elem.removeEventListener) {
          elem.removeEventListener(type, listener, false);
        } else if (elem.detachEvent) {
          elem.detachEvent('on' + type, listener);
        } else {
          elem['on' + type] = null;
        }
      },
      // event delegation
      addAgent: function(elem, type, agent, listener) {
        elem.addEventListener(type, function(e) {
          if (e.target.matches(agent)) {
            listener.call(e.target, e); // this points to e.target
          }
        });
      },
      // preventDefault compatibility
      preventDefault: function(event) {
        if (event.preventDefault) {
          event.preventDefault();
        } else {
          event.returnValue = false;
        }
      },
      // stopPropagation compatibility
      stopPropagation: function(event) {
        if (event.stopPropagation) {
          event.stopPropagation();
        } else {
          event.cancelBubble = true;
        }
      }
    };

Website front-end performance optimization

1. The browser's page-rendering process
    Why CSS goes inside <head> and JS goes before </body>, plus asynchronous script loading (async, defer) and similar optimizations.
2. Reduce HTTP requests
    . Merge and bundle CSS/JS files
    . Replace small icons with an iconfont
    . Use base64-format images: some small images have fairly complex colors, where an iconfont no longer fits;
      convert them to base64 (not cacheable) and embed them directly in src, for example via the limit option of webpack's url-loader
    . Reduce the size of static assets
3. Reduce the size of static assets
    Compress static assets: bundled js and css files tend to be large, and so do some images, so compression is a must. For
    projects with separated front and back ends, both gulp and webpack have the corresponding tooling. Individual images can
    sometimes be handled separately; I personally often use tinypng.com for online compression.

    Write efficient CSS: nearly every project now uses a CSS preprocessor (Less, Sass, Stylus, and so on), which leads some
    junior developers to abuse nesting: 5-6 levels deep, sometimes even 7-8. Scary! Nesting that deep not only slows down the
    browser's selector matching, it also produces a lot of redundant bytes. This needs fixing; the usual advice is to nest
    at most 3 levels.

    Enable gzip compression on the server: a big win I only learned about recently, and it is impressive: typical css and
    js files compress by 60-70%, and the ratio is configurable. With separated front and back ends, if the front end is
    served with node/express, the compression middleware is all it takes to enable gzip:
        // server.js
        var express = require('express');
        var compress = require('compression');
        var app = express();
        app.use(compress());

4. Use caching
    Set the cache-related fields in the HTTP headers for further optimization.
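
    A minimal sketch, assuming the same express setup as the gzip example above (the maxAge value and the 'public' directory are illustrative choices):

        // server.js
        app.use(express.static('public', {
          maxAge: '30d',       // sets Cache-Control: max-age for strong caching
          etag: true,          // ETag enables conditional revalidation (304 Not Modified)
          lastModified: true   // Last-Modified serves as a fallback validator
        }));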

5. Script loading optimization
    <1> Dynamic loading
        Dynamically loading a script means using JavaScript code to load it: typically you create a script element by
        hand and insert it into the document once the HTML has finished parsing. This gives precise control over when
        the script loads, avoiding blocking problems.
        
        function loadJS(src) {
          const script = document.createElement('script');
          script.src = src;
          document.getElementsByTagName('head')[0].appendChild(script);
        }
        loadJS('http://example.com/scq000.js');
        
    <2> Asynchronous loading
        As we all know, synchronous patterns in computer programs cause blocking. To keep synchronous script parsing from
        blocking browser rendering, loading scripts asynchronously is a good choice. The script element's async and defer
        attributes implement exactly this:
        
        <script type="text/javascript" src="./a.js" async></script>
        <script type="text/javascript" src="./b.js" defer></script>
        
        虽然利用了这两个属性的script标签都可以实现异步加载,同时不阻塞脚本解析。但是使用async属性的脚本执行
        顺序是不能得到保证的。而使用defer属性的脚本执行顺序可以得到保证。另一方面,defer属性是在html文档解
        析完成后,DOMContentLoaded事件之前就会执行js。async一旦加载完js后就会马上执行,最迟不超过window.onload
        事件。所以,如果脚本没有操作DOM等元素,或者与DOM时候加载完成无关,直接使用async脚本就好。如果需要DOM,
        就只能使用defer了。

    <3> Handling dependencies between asynchronously loaded scripts
        The asynchronous loading described above is not perfect. How to handle the dependencies between scripts during
        loading becomes the question to consider. On the one hand, independent scripts on a page, such as user-analytics
        plugins, can safely be loaded asynchronously. On the other hand, for scripts that genuinely depend on one
        another, the industry already has mature solutions: RequireJS with the AMD specification, or even labjs (now
        obsolete), which used a hack (tricking the browser into downloading scripts without executing them). If you are
        familiar with promises, you know they are a powerful tool for handling asynchrony in JS. Below, promises are
        used to handle the dependency problem when loading scripts asynchronously:
        
        // execute a script
        function exec(src) {
            const script = document.createElement('script');
            script.src = src;
        
              // return an independent promise
            return new Promise((resolve, reject) => {
                var done = false;
        
                script.onload = script.onreadystatechange = () => {
                    if (!done && (!script.readyState || script.readyState === "loaded" || script.readyState === "complete")) {
                      done = true;
        
                      // avoid memory leaks
                      script.onload = script.onreadystatechange = null;
                      resolve(script);
                    }
                }
        
                script.onerror = reject;
                document.getElementsByTagName('head')[0].appendChild(script);
            });
        }
        
        function asyncLoadJS(dependencies) {
            return Promise.all(dependencies.map(exec));
        }
        
        asyncLoadJS(['https://code.jquery.com/jquery-2.2.1.js', 'https://cdn.bootcss.com/bootstrap/3.3.7/js/bootstrap.min.js']).then(() => console.log('all done'));

    As you can see, we create a promise object for each script dependency to manage its state. Scripts are managed by
    dynamic insertion, and the script's onload and onreadystatechange events (for compatibility) monitor whether the
    script has finished loading; once it has, the promise's resolve method fires. Finally, the dependencies are handled
    by Promise.all, which resolves only once every promise object has resolved, so we can be sure that by the time the
    callback runs, all dependent scripts have been loaded and executed.

    <4> Lazy loading (lazyload)
        Lazy loading is an on-demand loading technique, also commonly called deferred loading. The main idea is to delay
        loading the relevant resources, improving the page's loading and response speed. Two techniques are introduced
        here: the virtual proxy and lazy initialization.
        a. Virtual proxy loading
            A virtual proxy provides a stand-in, or placeholder, for the object that will actually be loaded. The most
            common scenario is lazy-loading images: first occupy the slot with a loading placeholder image, then load
            the real image asynchronously and fill it into the image node once it has finished loading.
            // the real image URLs are stashed in advance in the data-src attribute
            const lazyLoadImg = function() {
              const images = document.getElementsByTagName('img');
              for(let i = 0; i < images.length; i++) {
                  if(images[i].getAttribute('data-src')) {
                      images[i].setAttribute('src', images[i].getAttribute('data-src'));
                      images[i].onload = () => images[i].removeAttribute('data-src');
                  }
              }
            }
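
            As a hedged aside: to defer the swap until an image actually scrolls into view (rather than swapping all
            of them at once as above), a sketch using IntersectionObserver could look like this, assuming browser support:

            const io = new IntersectionObserver((entries) => {
              entries.forEach((entry) => {
                if (entry.isIntersecting) {
                  const img = entry.target;
                  img.src = img.dataset.src;        // the real URL was stashed in data-src
                  img.removeAttribute('data-src');
                  io.unobserve(img);                // each image only needs loading once
                }
              });
            });
            document.querySelectorAll('img[data-src]').forEach((img) => io.observe(img));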

        b. Lazy initialization
            Lazy initialization is a pattern commonly used in program design. As the name implies, it postpones the
            moment code is initialized (especially initialization that consumes significant resources) in order to
            improve performance.

            jQuery's famous ready method uses this technique. Its purpose is to let you act on DOM elements as soon as
            the page's DOM has loaded, without waiting for all resources to finish loading. Compared with the browser's
            native onload event, it can step in and touch the DOM much earlier; when a page contains lots of images and
            other resources, the benefit shows. Internally, jQuery keeps a flag recording whether the page has finished
            loading; if it has not, the functions to be executed are cached, and once the page has loaded they are
            executed one by one. In this way, code that would otherwise run immediately is deferred until the page has
            finished loading.
6. Use webpack to optimize script loading
    webpack also offers very friendly support for lazy loading. Two approaches are introduced here.

        <1> The import() method
            As we know, native ES6 syntax provides import and export to manage modules, but the import keyword is
            static and does not support dynamic binding. The stage-3 proposal, however, introduced a new method,
            import(), which makes dynamically loading modules possible. So you can write code like this in a project:

                $('#button').click(function() {
                  import('./dialog.js')
                    .then(dialog => {
                        // do something
                    })
                    .catch(err => {
                        console.log('module failed to load');
                    });
                });

                // or, more elegantly
                $('#button').click(async function() {
                    const dialog = await import('./dialog.js');
                    // do something with dialog
                });
            Since this syntax is promise-based, make sure the project uses es6-promise or promise-polyfill if older
            browsers must be supported. Also, if you are using babel, you need to add the syntax-dynamic-import plugin.
        <2> require.ensure
            require.ensure is similar to import(): it is also a promise-based way of loading modules asynchronously.
            It was the official lazy-loading solution of the webpack 1.x era and has since been superseded by the
            import() syntax; it is covered here for completeness.
            During compilation, webpack statically parses the modules inside require.ensure and adds them to a
            separate chunk, loading that code on demand.
            The syntax is as follows:
                require.ensure(dependencies: String[], callback: function(require), errorCallback:
                function(error), chunkName: String)

        A very common example is a single-page application using this technique to load code on demand per route:
            const routes = [
                {path: '/comment', component: r => require.ensure([], () => r(require('./Comment')), 'comment')}
            ];
7. Preloading
    The user's experience while actually using a page also matters. If the user's behavior can be anticipated and the
    needed resources loaded ahead of time, the page can respond to user actions quickly, making for a much better user
    experience. Issuing network requests ahead of time also cuts the waiting caused by slow networks. Enter "preloading".
        <1> The preload spec
            preload is a new W3C standard. It uses the link tag's rel attribute to declare "preload" and thereby achieve
            preloading, like this:
                <link rel="preload" href="example.js" as="script">

            The rel attribute tells the browser to enable preload, while the as attribute states the type of resource
            to preload: not just js scripts (script) but also images (image), css (style), video (media), and so on.
            When the browser sees this attribute, it loads the resource in advance.

            Browser compatibility for this spec is still not great, so for now a passing acquaintance is enough. webpack
            already has a related plugin; if you are interested, see preload-webpack-plugin.

        <2> DNS Prefetch
            Another way to speed up a page is DNS pre-resolution. Like preload, DNS prefetch optimizes resource loading
            at the network level. Front-end DNS optimization mainly means reducing the number of DNS requests, plus
            fetching DNS records ahead of time; dns-prefetch implements the latter. Usage is simple: just add the
            corresponding attributes to link tags.

            <meta http-equiv="x-dns-prefetch-control" content="on" /> <!-- tells the browser this page should do DNS pre-resolution -->
            <link rel="dns-prefetch" href="//example.com">

The front end of SEO

Why optimize: to increase the site's weight and search-engine friendliness, thereby improving rankings, increasing traffic, improving the experience of (potential) users, and boosting sales.

How to achieve:

1. Optimize the site's structure and layout: keep it as simple as possible and get straight to the point; a flat structure is preferred. In general, the fewer levels a site has, the easier it is for the "spider" to crawl and the easier pages are to index. For small and medium sites, if the directory structure goes deeper than three levels, the "spider" will be unwilling to crawl down. And according to survey data, if a visitor has not found the information they need within three jumps, they are likely to leave; so a three-level directory structure is also an experience requirement. To that end we need to do the following:
(1) Control the number of links on the home page. The home page carries the highest weight on the site. If it has too few links, there is no "bridge" and the "spider" cannot continue down to the inner pages, which directly hurts the number of indexed pages. But home-page links cannot be excessive either: too many links without substance easily hurts the user experience, also reduces the home page's weight, and yields poor results.
(2) Flatten the directory hierarchy, so that the "spider" can reach any inner page of the site within 3 jumps.
(3) Navigation optimization. Navigation should use text where possible; it can be paired with image navigation, but the image code must be optimized: an <img> tag must carry the "alt" and "title" attributes to tell the search engine where the navigation leads, so that users still see a hint even when the image fails to display. Second, every page should include breadcrumb navigation. Benefits: for the user experience, it lets users see their current position and the current page's position within the whole site, helps them quickly grasp how the site is organized, forms a better sense of place, and provides a way back at every level; for the "spider", it clarifies the site structure while adding a large number of internal links, easing crawling and lowering the bounce rate.
(4) Site structure and layout: details that cannot be ignored. Page head: logo and main navigation, plus user information. Page body: body text on the left, including breadcrumb navigation and content; popular and related articles on the right. Benefits: they retain visitors and make them stay longer, and for the "spider" these articles count as related links, strengthening the page's relevance and weight. Page bottom: copyright information and friendly links.
(5) Put important HTML content first in the layout. Search engines crawl HTML from top to bottom; exploiting this, the main code can be made to be read first, with ads and other unimportant code placed at the bottom.
(6) Control page size, reduce HTTP requests, and improve loading speed. A page had better not exceed 100K; too large, and the page loads slowly. When it is slow, the user experience suffers, visitors are not retained, and the "spider" leaves on timeout.
2. Optimize the page code
(1) Highlight the important content: design the title, description, and keywords sensibly. <title>: emphasize only the key points; try to put important keywords first, do not repeat keywords, and try not to give every page the same <title> content.
<meta keywords>: list a few important keywords, without piling them up. <meta description>: a highly summarized description of the page; do not excessively stack keywords, and give different pages different descriptions.
(2) Write semantic HTML that conforms to W3C standards. Make the code semantic: use the appropriate tag in the appropriate place, the right tag for the right job, so that both readers of the source code and the "spider" understand it at a glance. For example: h1-h6 for heading levels, the <nav> tag for the page's main navigation, ul or ol for list code, strong for important text, and so on.
(3) <a> tags: for in-page links, add a "title" attribute explaining the link, so both visitors and the "spider" know where it goes. For external links to other sites, add rel="nofollow" to tell the spider not to crawl them, because once the spider crawls an external link, it won't come back.
    <a href="https://www.360.cn" title="360 Security Center" class="logo"></a>
(4) Headings: the spider considers the <h1> tag the most important. A page may have at most one h1, placed above the page's most important title; for example, the home page logo can get an h1. Use <h2> for subheadings, and do not use h tags indiscriminately elsewhere.
(5) <img> should always carry an "alt" attribute describing the image:
    <img src="cat.jpg" width="300" height="200" alt="cat" />
(6) The <caption> element defines a table title. The caption tag must immediately follow the table tag, and only one may be defined:
    <table border='1'>
        <caption>Table title</caption>
        <tbody>
            <tr>
                <td>apple</td>
                <td>100</td>
            </tr>
            <tr>
                <td>banana</td>
                <td>200</td>
            </tr>
        </tbody>
    </table>
(7) <br> tags: only for line breaks within text, e.g.:
    <p>
        First line of text<br/>
        Second line of text<br/>
        Third line of text
    </p>
(8) Do not output important content from JS: spiders do not read content inside JS, so important content must be in the HTML.
(9) Use iframe frames as little as possible, since spiders generally do not read their contents.
(10) Use display: none with caution: for text you do not want to show, set z-index or a negative text-indent large enough to push it out of the viewport instead, because search engines filter out content inside display: none.
3. Optimize the site's front-end performance
(1) Reduce the number of HTTP requests.
    A. CSS Sprites: a solution that reduces HTTP requests by merging multiple images into one; the image content is accessed via the CSS background property. This scheme also reduces the total bytes of the images.
    B. Merge CSS and JS files. The front end has many build tools, such as grunt, gulp, and webpack. To reduce the number of HTTP requests, use them to merge multiple CSS or JS files into one before release.
    C. Lazyload, also called lazy loading: content on the page does not load (and sends no request) at first, and is loaded immediately only when a user action actually needs it. This controls how many page resources are requested at once.
(2) Control the loading priority of resource files. The browser parses HTML from top to bottom; when it meets a link or script tag it loads the corresponding href or src. To show the page to the user as early as possible, the CSS must be loaded up front and must not be held up by JS loading.
(3) Use external CSS and JS files wherever possible (separating structure, presentation, and behavior). This keeps the page code clean and also helps future maintenance:
    <link rel="stylesheet" href="assets/css/style.css" />
    <script src="assets/js/main.js"></script>
(4) Use the browser cache. The browser cache stores network resources locally while awaiting the next request for them: if the resources are already there, there is no need to ask the server again; they are read directly from local storage.
(5) Reduce reflow. Basic principle: a reflow happens when a DOM change affects an element's geometry (width and height). The browser recalculates the element's geometry and invalidates the affected part of the render tree. It also re-validates the visibility property of all other nodes in the DOM tree, which is why reflow is inefficient; if reflow happens too often, CPU usage rises sharply. To reduce reflow: if you need to add styles during DOM manipulation, add a class attribute instead of manipulating styles via style.
(6) Reduce DOM operations.
(7) Replace icons with IconFont.
(8) Do not use CSS expressions; they hurt efficiency.
(9) Use a CDN network cache to speed up user access and reduce server pressure.
(10) Enable gzip compression: pages load faster, and search-engine spiders crawl more information.
(11) Pseudo-static settings. For dynamic pages, you can enable the pseudo-static function to make the spider "mistake" them for static pages, which better suit the spider's appetite; even better if the URL carries keywords.
    Dynamic address: http://www.360.cn/index.php
    Pseudo-static address: http://www.360.cn/index.html
Keep a correct view of SEO: do not over-optimize. A site is, above all, about its content.

The HTTP protocol

1. What is the structure of HTTP packets?

For TCP, a transmission unit is divided into two parts: the TCP header and the data part. HTTP is similar: it also has a header + body structure; specifically, start line + headers + blank line + body. Request messages and response messages differ slightly, so we introduce them separately.

The starting line

For request messages, the start line looks like this: GET /home HTTP/1.1 (method + path + HTTP version). For response messages, the start line looks like this: HTTP/1.1 200 OK. The start line of a response is also called the status line; it consists of the HTTP version, the status code, and the reason phrase. Note that in the start line each part is separated by a single space, the last part is followed by a newline, and the ABNF grammar is strictly followed.

The head

The request headers and the response headers occupy the same position in the message: between the start line and the blank line.

Whether it is a request header or a response header, there are quite a few fields in the header, and there are many features of HTTP involved. Here we will not list them all.

  • Field names are case-insensitive
  • Field names cannot contain spaces or underscores
  • A field name must be immediately followed by a colon

A blank line

It serves to separate the headers from the entity.

Q: What if you deliberately put a blank line in the middle of the headers?

Everything after the first blank line would be treated as part of the entity.

entity

This is the actual data, the body part. A request message carries the request body; a response message carries the response body.

2. How to understand the HTTP request method?

What are the request methods?

HTTP/1.1 specifies the following request methods (note: all uppercase):
  • GET: Usually used to obtain resources
  • HEAD: obtains the meta information of the resource
  • POST: data is submitted, that is, uploaded
  • PUT: Modifies data
  • DELETE: Deletes a resource (hardly needed)
  • CONNECT: establishes the connection tunnel for the proxy server
  • OPTIONS: Lists the request methods that can be applied to resources for cross-domain requests
  • TRACE: Traces the transmission path of the request and response

What’s the difference between GET and POST?

  • From a caching perspective, GET requests are actively cached by the browser, leaving a history, whereas POST requests are not cached by default.
  • From an encoding perspective, GET can only be URL-encoded and can only accept ASCII characters, while POST has no such limitations.
  • From a parameter perspective, GET is generally placed in the URL and therefore is not secure, while POST is placed in the request body and is better suited for transmitting sensitive information.
  • From an idempotent point of view, GET is idempotent and POST is not. (Idempotent means to perform the same operation with the same result.)
  • From the TCP perspective, a GET request sends the request packet at once, while a POST is split into two TCP packets, with the header part first, and the body part if the server responds with 100(continue). (Except for Firefox, where POST requests only send a TCP packet)

3. How to understand URIs?

The Uniform Resource Identifier (URI) is simply a Uniform Resource Identifier that distinguishes different resources on the Internet.

The structure of the URI

The truly complete structure of a URI looks like this: scheme://user:passwd@host:port/path?query#fragment

  • scheme stands for the protocol name, such as http, https, file, and so on. It must be followed by ://.
  • user:passwd@ indicates the user information used to log in to the host. Not recommended and not commonly used.
  • host:port indicates the host name and port.
  • path indicates the request path, marking the location of the resource.
  • query represents the query parameters, in the form key=val, with multiple pairs separated by &.
  • fragment represents an anchor within the resource located by the rest of the URI; the browser can jump to the corresponding location based on it.

Here’s an example:

In https://www.baidu.com/s?wd=HTTP&rsv_spt=1, https is the scheme part, www.baidu.com is the host:port part (note that the default ports for HTTP and HTTPS are 80 and 443 and are omitted), /s is the path part, and wd=HTTP&rsv_spt=1 is the query part.

URI encoding

URIs may only use ASCII; characters outside ASCII are not supported, and some symbols are delimiters that can cause parsing errors if left untreated. Therefore, URIs introduce an encoding mechanism that converts all non-ASCII characters and delimiter characters into hexadecimal byte values preceded by %. For example, a space is escaped to %20, and the Chinese word 三元 is escaped to %E4%B8%89%E5%85%83.
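
In browser JavaScript this escaping is exposed as encodeURIComponent; a quick illustration:

    encodeURIComponent(' ');       // "%20"
    encodeURIComponent('三元');    // "%E4%B8%89%E5%85%83"
    encodeURIComponent('a=1&b=2'); // "a%3D1%26b%3D2", since '=' and '&' are delimiters and get escaped too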

4. How to understand HTTP status codes?

The status codes fall into five classes:

1xx: intermediate state of protocol processing; further action is required.
2xx: success.
3xx: redirection; the resource's location changed and a new request is needed.
4xx: the request message is wrong.
5xx: an error occurred on the server.

1xx
101 Switching Protocols: when HTTP upgrades to WebSocket, the server sends status code 101 if it agrees to the change.

2xx
200 OK: the most commonly seen success status code; there is usually data in the response body.
204 No Content: same meaning as 200, but with no body data after the response headers.
206 Partial Content: as the name implies, partial content; used for HTTP ranged downloads and resumable transfers, and it naturally comes with the corresponding Content-Range response header field.

3xx
301 Moved Permanently, versus 302 Found (moved temporarily). For example, if your site is upgraded from HTTP to HTTPS and the old address is no longer used, return 301; the browser caches the redirect by default and goes straight to the new address on the second visit. If the address is only temporarily unavailable, simply return 302; unlike 301, the browser does not cache the redirect.
304 Not Modified: returned when the negotiated cache is hit. See the browser-caching section for details.

4xx
400 Bad Request: often confusing to developers; it is just a generic error with no indication of what went wrong.
403 Forbidden: not a failure of the request message itself; the server forbids access for any of many reasons, such as legal restrictions or sensitive information.
404 Not Found: the resource was not found on the server.
405 Method Not Allowed: the request method is not allowed by the server.
406 Not Acceptable: the resource cannot satisfy the client's requirements.
408 Request Timeout: the server waited too long.
409 Conflict: multiple requests conflicted.
413 Request Entity Too Large: the data in the request body is too large.
414 Request-URI Too Long: the URI in the request line is too long.
429 Too Many Requests: the client sent too many requests.
431 Request Header Fields Too Large: the request header fields are too large.

5xx
500 Internal Server Error: the server errored; like 400, a catch-all.
501 Not Implemented: the feature the client requested is not yet supported.
502 Bad Gateway: the server itself is fine, but an error occurred while it was accessing an upstream server.
503 Service Unavailable: the server is busy and temporarily cannot respond.

5. What are the features of HTTP? What are the disadvantages of HTTP?

HTTP features

The features of HTTP are summarized as follows:

(1) Flexible and extensible, reflected in two ways. One is semantic freedom: only the basic format is specified (spaces separate words, newlines separate fields), and the other parts carry no strict syntactic restriction. The other is the variety of transmitted content: not just text but also images, video, and arbitrary data, which is very convenient.
(2) Reliable transmission. HTTP is based on TCP/IP, so it inherits this property. This is a TCP characteristic and will not be covered in detail.
(3) Request-response. And requester and responder are not only client and server: if a server acts as a proxy connecting to a back-end server, the proxy also plays the requester role.
(4) Stateless. State here means the context of the communication: every HTTP request is independent and unrelated, and by default no state information needs to be kept.

HTTP shortcomings

Statelessness. In scenarios requiring long connections, where lots of context must be kept to avoid re-transmitting large amounts of duplicate information, statelessness is a disadvantage of HTTP. At the same time, though, some applications just fetch data and need no connection context; there, statelessness reduces network overhead and becomes an advantage of HTTP.

Plaintext transmission. The protocol's messages (mainly the headers) use text rather than binary data. This certainly helps debugging, but it also exposes HTTP message information to the outside world and helps attackers. WiFi traps exploit exactly this flaw: lure you onto a hotspot, then frantically capture all your traffic to obtain your sensitive information.

Head-of-line blocking. When HTTP keep-alive is on, a shared TCP connection can process only one request at a time; if the current request takes too long, the other requests are blocked. This is the famous head-of-line blocking problem, discussed in a later section.

6. What do you know about the Accept family of fields?

The introduction to the Accept family of fields is divided into four parts: data format, compression method, supported languages, and character set.

The data format

In the last section, we talked about HTTP's flexibility: it supports a huge number of data formats. With so many formats arriving at the client, how does the client know which format it is? First we need a standard: MIME (Multipurpose Internet Mail Extensions). It was first used in email systems to let mail carry arbitrary types of data, which also suits HTTP. HTTP takes part of the MIME type to mark the data type of the message body. The sender expresses these types in the Content-Type field; if the receiver wants specific types of data, it can use the Accept field. Both fields take values in the following categories:

text: text/html, text/plain, text/css, etc.
image: image/gif, image/jpeg, image/png, etc.
audio/video: audio/mpeg, video/mp4, etc.
application: application/json, application/javascript, application/pdf, application/octet-stream

Compression method

Of course, these data are usually encoded and compressed. The compression method used is reflected in the sender's Content-Encoding field, and the methods the receiver accepts in its Accept-Encoding field. The values include:

gzip: today's most popular compression format
deflate: another well-known compression format
br: a compression algorithm invented specifically for HTTP

// sender
Content-Encoding: gzip
// receiver
Accept-Encoding: gzip

Supported languages

For the sender there is also a Content-Language field, which in internationalization scenarios specifies the supported language; the receiver's counterpart is Accept-Language. For example:

// sender
Content-Language: zh-CN, zh, en
// receiver
Accept-Language: zh-CN, zh, en

Character set

Finally, a special field: Accept-Charset on the receiving side specifies the acceptable character sets. On the sending side there is no Content-Charset; instead, the charset goes directly into Content-Type, specified by the charset attribute. For example:

// sender
Content-Type: text/html; charset=utf-8
// receiver
Accept-Charset: utf-8
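
A minimal sketch of sending these negotiation headers from Node (the host and the logged expectations are illustrative assumptions):

    const http = require('http');

    http.get({
      host: 'example.com',                        // assumed host
      path: '/',
      headers: {
        'Accept': 'text/html, application/json',  // acceptable data formats
        'Accept-Encoding': 'gzip',                // acceptable compression methods
        'Accept-Language': 'zh-CN, zh, en',       // acceptable languages
        'Accept-Charset': 'utf-8'                 // acceptable character set
      }
    }, (res) => {
      console.log(res.headers['content-type']);     // e.g. "text/html; charset=utf-8"
      console.log(res.headers['content-encoding']); // e.g. "gzip"
    });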

7. How does HTTP transfer data of fixed and variable length?

Fixed-length body

For a fixed-length body, the sender usually carries Content-Length to indicate the length of the body.

const http = require('http');

const server = http.createServer();

server.on('request', (req, res) => {
  if (req.url === '/') {
    res.setHeader('Content-Type', 'text/plain');
    res.setHeader('Content-Length', 10);
    res.write("helloworld");
  }
})

server.listen(8081, () => {
  console.log("started successfully");
})

After starting, visit localhost:8081 and the browser shows:

helloworld

Now let's try setting the length smaller:

res.setHeader('Content-Length', 8);

Restart and visit again; the browser now shows:

hellowor

The rest was truncated directly inside the HTTP response body.

Then let's try making the length larger:

res.setHeader('Content-Length', 12);

This time the page cannot display properly. As you can see, Content-Length plays a key role in HTTP transmission: set improperly, it can break the transfer outright.

Variable-length body

The above covers fixed-length bodies; how are variable-length bodies transmitted? Another HTTP header field must be introduced here:

Transfer-Encoding: chunked. This indicates chunked data transfer. Setting this field automatically produces two effects: the Content-Length field is ignored, and content can be pushed continuously over a long-lived connection.

The Node.js program is as follows:

const http = require('http');

const server = http.createServer();

server.on('request', (req, res) => {
  if (req.url === '/') {
    res.setHeader('Content-Type', 'text/html; charset=utf8');
    res.setHeader('Content-Length', 10);
    res.setHeader('Transfer-Encoding', 'chunked');
    res.write("<p>Here it comes</p>");
    setTimeout(() => {
      res.write("first transmission<br/>");
    }, 1000);
    setTimeout(() => {
      res.write("second transmission");
      res.end()
    }, 2000);
  }
})

server.listen(8009, () => {
  console.log("started successfully");
})

8. How does HTTP handle transfer of large files?

For large files of hundreds of megabytes or even gigabytes, transferring everything in one go is obviously unrealistic: there would be long waits, seriously hurting the user experience. So HTTP addresses this scenario with range requests, which let the client request only part of a resource.

How to support

Of course, the premise is that the server supports range requests. To advertise this capability, it must add the response header Accept-Ranges: bytes, informing the client that range requests are supported. (Accept-Ranges: none would mean the opposite: ranges are not supported.)

Disassemble the Range field

As for the client, it uses the Range request header field, bytes=x-y, to specify which part of the resource to request. Let's discuss the format of Range values:

0-499 means from the start to the 499th byte.
500- means from the 500th byte to the end of the file.
-100 means the last 100 bytes of the file.

Upon receiving the request, the server first verifies that the range is valid and returns a 416 error code if it is out of bounds, otherwise it reads the fragment and returns a 206 status code.

At the same time, the server needs to add the Content-range field, which is formatted differently depending on the Range field in the request header.

Specifically, the response header is different when requesting a single segment of data than when requesting multiple segments of data.

Here’s an example:

// single range
Range: bytes=0-9
// multiple ranges
Range: bytes=0-9, 30-39

So let's look at the two cases separately.

A single range

For a single range, the response looks like this:

HTTP/1.1 206 Partial Content
Content-Length: 10
Accept-Ranges: bytes
Content-Range: bytes 0-9/100

i am xxxxx

Multiple ranges

For multiple ranges, the response looks like this:

HTTP/1.1 206 Partial Content
Content-Type: multipart/byteranges; boundary=00000010101
Content-Length: 189
Connection: keep-alive
Accept-Ranges: bytes

--00000010101
Content-Type: text/plain
Content-Range: bytes 0-9/96

i am xxxxx
--00000010101
Content-Type: text/plain
Content-Range: bytes 20-29/96

eex jspy e
--00000010101--

The key field here is Content-Type: multipart/byteranges; boundary=00000010101, which conveys the following information:

The response body contains multiple ranges of data, and the delimiter in the response body is 00000010101.

Therefore, the data segments in the response body are separated by the delimiter specified here, and the final delimiter carries an extra -- at the end.

That’s what HTTP does for large file transfers.
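
A hedged sketch of issuing a range request from the client (the URL and byte range are illustrative assumptions):

    fetch('/videos/big-file.mp4', {
      headers: { Range: 'bytes=0-9' }  // ask for the first 10 bytes only
    }).then((res) => {
      console.log(res.status);                       // 206 if the server honors the range, 416 if out of bounds
      console.log(res.headers.get('Content-Range')); // e.g. "bytes 0-9/96"
      return res.arrayBuffer();
    });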

9. How is submission of form data handled in HTTP?

In HTTP, there are two main ways to submit a form, embodied in two different content-Type values:

application/x-www-form-urlencoded
multipart/form-data

Since form submissions are generally POST requests and rarely GET, we consider below the default case where the submitted data is placed in the request body.

application/x-www-form-urlencoded

Form content in application/x-www-form-urlencoded format has the following characteristics:

- The data is encoded as key-value pairs separated by &, e.g. a=1&b=2.
- The characters are URL-encoded. Conversion process: {a: 1, b: 2} -> a=1&b=2 -> (final form) "a%3D1%26b%3D2".
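A rough sketch of this conversion, runnable in Node or the browser (the object literal is just sample data):

// Encode { a: 1, b: 2 } as & separated key-value pairs.
const params = new URLSearchParams({ a: 1, b: 2 });
console.log(params.toString()); // "a=1&b=2"

// URL-encoding that pair string escapes '=' and '&' as %3D and %26:
console.log(encodeURIComponent(params.toString())); // "a%3D1%26b%3D2"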

multipart/form-data

For multipart/form-data:

- The Content-Type field in the request header contains a boundary, whose value is generated by the browser by default. Example: Content-Type: multipart/form-data; boundary=----WebkitFormBoundaryRRJKeWfHPGrS4LKe.
- The data is split into multiple parts, with adjacent parts separated by the delimiter. Each part carries HTTP-style headers describing it, such as Content-Disposition and Content-Type, and the final delimiter has -- appended to mark the end.

The corresponding request body looks like this:

----WebkitFormBoundaryRRJKeWfHPGrS4LKe
Content-Disposition: form-data; name="data1";
Content-Type: text/plain

data1
----WebkitFormBoundaryRRJKeWfHPGrS4LKe
Content-Disposition: form-data; name="data2";
Content-Type: text/plain

data2
----WebkitFormBoundaryRRJKeWfHPGrS4LKe--
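From the page's side, a minimal sketch (field names and the URL are illustrative); the browser generates the boundary on its own when a FormData body is sent:

const form = new FormData();
form.append('data1', 'data1');
form.append('data2', 'data2');

// Do not set Content-Type manually here, otherwise the browser-generated
// boundary would be missing from the header.
fetch('/upload', { method: 'POST', body: form });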

summary

It is worth mentioning that the multipart/form-data format is characterized by each form element being an independent resource representation. In addition, you may never have noticed the boundary while writing business code, but if you open a packet-capture tool you can indeed see the delimiters between form elements. You do not feel them because the browser and HTTP encapsulate this series of operations for you.

Moreover, in real-world scenarios, multipart/form-data is basically the standard for uploading pictures and other files, rather than application/x-www-form-urlencoded, because files then need no URL encoding, which would take a lot of time and occupy extra space.

10. How does HTTP/1.1 solve the HTTP head-of-line blocking problem?

What is HTTP head-of-line blocking?

As we saw in the previous sections, HTTP transmission is based on the request-response mode, where messages go out and come back one at a time. Notably, the requests on a connection form a task queue that is processed sequentially: if the request at the head of the queue is handled too slowly, all subsequent requests are blocked. This is the famous HTTP head-of-line blocking problem.

Concurrent connections

A domain name is allowed to open multiple persistent connections at once, which multiplies the task queues so that one queue's tasks cannot block everything else. RFC 2616 capped the number of concurrent connections per client at 2, but today's browsers go well beyond that; Chrome, for example, allows 6.

Domain name subdivision

Only 6 concurrent persistent connections per domain name? Then just add more domain names, such as content1.sanyuan.com and content2.sanyuan.com. A domain like sanyuan.com can be split into many secondary domain names that all point to the same server. The number of concurrent persistent connections grows accordingly, which mitigates the head-of-line blocking problem more effectively.

11. What do you know about cookies?

Introduction of cookies

HTTP is a stateless protocol: each HTTP request is independent and unrelated to the others, and by default no state information needs to be carried. But sometimes state does need to be saved. What then?

HTTP introduces cookies for this purpose. A Cookie is essentially a small text file stored in the browser, internally as key-value pairs (as you can see in the Application section of the Chrome developer panel). All requests sent to the same domain name carry the same Cookie, and the server gets the Cookie for parsing, then it can get the state of the client. The server can write cookies to the client through the set-cookie field in the response header. Examples are as follows:

// Request header
Cookie: a=xxx;

// Response header
Set-Cookie: a=xxx
Set-Cookie: b=xxx
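A minimal Node sketch of both directions (the port and values are illustrative):

const http = require('http');

http.createServer((req, res) => {
  // The browser automatically sends back whatever was set for this domain.
  console.log('Cookie header:', req.headers.cookie); // e.g. "a=xxx; b=xxx"

  // Multiple Set-Cookie headers are passed as an array.
  res.setHeader('Set-Cookie', ['a=xxx', 'b=xxx']);
  res.end('ok');
}).listen(8011);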

Cookie attribute

Life cycle

Cookie expiration dates can be set using the Expires and max-age attributes.

- Expires specifies an absolute expiration time.
- Max-Age specifies a time interval in seconds, counted from the moment the browser receives the response.
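For example (the values are illustrative):

Set-Cookie: a=xxx; Expires=Wed, 21 Oct 2026 07:28:00 GMT
Set-Cookie: b=xxx; Max-Age=3600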

If the Cookie expires, the Cookie is deleted and not sent to the server.

scope

There are also two properties related to scope: Domain and Path, which bind a domain name and a path to the Cookie. If the domain name or the path does not match these two properties when a request is about to be sent, the Cookie is not carried. It is worth noting that for Path, / means the Cookie is allowed on every path under the domain name.

Safety related

If Secure is enabled, cookies can be transmitted only through HTTPS.

If the cookie field is marked with HttpOnly, it indicates that it can only be transmitted through HTTP and cannot be accessed through JS, which is also an important means to prevent XSS attacks.

Accordingly, there is the SameSite attribute for CSRF attack prevention.

SameSite can be set to three values, Strict, Lax, and None.

a. In Strict mode, the browser completely forbids third-party requests from carrying the Cookie. For example, sanyuan.com's Cookie is carried only for requests made from the sanyuan.com domain itself, never from other websites.
b. In Lax mode, cross-site requests carry the Cookie only for top-level GET navigations, such as a form submitted with the GET method or a link (a tag) being followed.
c. In None mode, requests automatically carry the Cookie. (This was originally the default; note that Chrome now defaults to Lax and requires Secure when setting SameSite=None.)
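Putting the attributes above together, a response header might look like this (all values are purely illustrative):

Set-Cookie: id=a3fWa; Max-Age=2592000; Domain=sanyuan.com; Path=/; Secure; HttpOnly; SameSite=Strict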

Disadvantages of the Cookie

- Capacity defects. Cookies have a maximum size of 4KB and can only store a small amount of information.
- Performance defects. Cookies follow the domain name: regardless of whether a given address under the domain needs them, the request carries the complete Cookie. As the number of requests grows, this causes real performance waste, since requests carry a lot of unnecessary content. It can be mitigated by restricting the scope with Domain and Path.
- Security defects. Since cookies are passed in plain text between the browser and the server, they can easily be intercepted, tampered with, and resent to the server within the Cookie's validity period, which is quite dangerous. In addition, when HttpOnly is not set, the Cookie can be read directly by JS scripts.

12. How to understand HTTP proxies?

We know that HTTP is a protocol based on the request-response model. Generally, the client sends the request and the server responds.

Of course, there are special cases, namely proxy servers. Once a proxy is introduced, the proxy server acts as a middleman with a dual identity: to the client, it is the server sending responses; to the origin server, it is the client initiating requests.

So what exactly does a proxy server do?

function

1. Load balancing. The client's request reaches only the proxy server first; the client does not know how many origin servers there are or what their IP addresses are. The proxy server can therefore take the request and distribute it to different origin servers through a specific algorithm, keeping the load on each origin server as even as possible (a minimal sketch follows this list). There are many such algorithms: random, round-robin, consistent hashing, LRU (least recently used), and so on; they are not the focus of this article, but you can explore them yourself if interested.
2. Security. A heartbeat mechanism monitors the origin servers in the background, and once a faulty machine is found it is kicked out of the cluster. In addition, the proxy server can filter upstream and downstream data and rate-limit illegal IP addresses.
3. Caching proxy. Content is cached on the proxy server so that clients can fetch it directly from the proxy instead of going all the way to the origin server.
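A minimal round-robin reverse-proxy sketch in Node (upstream addresses and ports are assumptions; real deployments would use something like nginx):

const http = require('http');

const upstreams = [
  { host: '127.0.0.1', port: 9001 },
  { host: '127.0.0.1', port: 9002 },
];
let i = 0;

http.createServer((clientReq, clientRes) => {
  const target = upstreams[i++ % upstreams.length]; // round-robin pick
  const proxyReq = http.request({
    host: target.host,
    port: target.port,
    path: clientReq.url,
    method: clientReq.method,
    headers: clientReq.headers,
  }, (proxyRes) => {
    // Relay the origin server's response back to the client.
    clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
    proxyRes.pipe(clientRes);
  });
  clientReq.pipe(proxyReq); // relay the request body upstream
}).listen(8080);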

Related header field

Via

What if the proxy server needs to identify itself and leave its mark on HTTP traffic?

This is recorded through the Via field. For example, suppose there are two proxy servers in the middle; after the client sends a request, it goes through this process:

Client -> Proxy 1 -> Proxy 2 -> Origin server

When the source server receives the request, it gets this field in the request header:

Via: proxy_server1, proxy_server2

When the source server responds, the client ends up with a response header like this:

Via: proxy_server2, proxy_server1

As you can see, the order of proxies in Via is the order in which messages arrive during HTTP transport.

X-Forwarded-For

As the name suggests ("forwarded for"), X-Forwarded-For records the IP address of the requesting party. (Note the distinction from Via: Via records the proxies a message passes through, while X-Forwarded-For records the requester's IP.)

X-Real-IP

X-Real-IP is a field that records the user's real IP address. No matter how many proxies the request passes through, this field always holds the original client's IP address.

Correspondingly, there are also X-Forwarded-Host and X-Forwarded-Proto, which record the client's domain name and protocol name respectively (note: the client's, not the proxies').
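For instance, after one proxy hop the origin server might see headers like these (the addresses and names are illustrative):

X-Forwarded-For: 203.0.113.7
X-Real-IP: 203.0.113.7
X-Forwarded-Host: www.sanyuan.com
X-Forwarded-Proto: https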

X-Forwarded-For

As we saw, X-Forwarded-For records the requester's IP, which means the field must change at every hop: each proxy adds the IP of the party it received the request from. From the client to proxy 1, the field holds the client's IP; from proxy 1 to proxy 2, proxy 1's IP is added as well. But this creates two problems:

1. The proxy must parse the HTTP request headers and then modify them, which costs more than simply forwarding the data untouched.

2. Under the encryption of HTTPS communication, the original message cannot be modified.

This led to the Proxy Protocol, which generally uses its plaintext version and simply prepends a line in the following format before the HTTP request line:

// PROXY + TCP4/TCP6 + requester IP + receiver IP + request port + receiving port
PROXY TCP4 0.0.0.1 0.0.0.2 1111 2222
GET / HTTP/1.1
...

In this way, the problems with X-Forwarded-For can be solved.

Front-end componentization, engineering, modular development

Object-oriented programming