Ayazero · 2015/09/18 0:57

The article is reproduced from: www.ayazero.com/?p=75

0x00 Takeaway


0x01 DDOS Classification


Before discussing defense, here is a brief introduction to the various attack types. DDOS is a class of attacks rather than a single attack, and DDOS defense is a largely automated but not fully automated process: automation may fail to recognize many evolving attack patterns, which then need an expert’s manual judgment.

  • Network-layer attacks

    • SYN-flood

      Exploiting the “vulnerability” of the three-way handshake in TCP connections, the attacker sends SYN packets with forged source addresses through raw sockets, so that the target host can never complete the handshake. The half-open connections fill the system’s protocol-stack queue, resources cannot be released, and service is denied. This is one of the most significant DDOS attacks on the Internet. Hardening measures found online, such as tuning kernel parameters, reduce waiting and retries and speed up resource release. They help against low-volume SYN floods but are ineffective against high-volume ones. SYN proxy, SYN cookies, and first-packet discarding (dropping the first SYN packet of a request) are common SYN-flood defenses.
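
      As a rough illustration of the SYN cookie idea mentioned above, the sketch below derives the server’s initial sequence number from the connection 4-tuple and a coarse time counter, so that no per-connection state is needed until a valid final ACK arrives. It is a simplified model, not any real kernel’s encoding (real implementations also pack in details such as the MSS); `SECRET`, `make_cookie`, and `ack_is_valid` are names made up for this example.

```python
import hashlib
import hmac
import struct

SECRET = b"hypothetical-server-secret"  # assumption: a per-server secret key


def make_cookie(src_ip, src_port, dst_ip, dst_port, counter):
    """Derive a 32-bit initial sequence number from the 4-tuple and a time counter."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{counter}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def ack_is_valid(src_ip, src_port, dst_ip, dst_port, ack_number, counter):
    """Accept the final ACK only if it acknowledges cookie + 1 for a recent counter."""
    for c in (counter, counter - 1):  # tolerate one time-slot rollover
        cookie = make_cookie(src_ip, src_port, dst_ip, dst_port, c)
        if ack_number == (cookie + 1) & 0xFFFFFFFF:
            return True
    return False
```

      A SYN with a forged source address never produces the matching ACK, so it costs the server one hash computation instead of a queue slot.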

    • ACK-flood

      On receiving a forged ACK packet, the target device simply replies with an RST packet and discards it, so the damage is much lower than that of a SYN flood. This is a primitive form of DDOS.

    • UDP-flood

      The attacker uses raw sockets to forge large numbers of UDP packets with bogus source addresses. Currently, DNS is the protocol most often involved.

    • ICMP-flood

      Ping flood, an old technique.

  • Application-layer attacks

    • CC

      The name comes from “Challenge Collapsar”: Collapsar (“black hole”) is the anti-DDOS device from NSFOCUS, a well-known security company in China. The attacker sends a large number of genuine HTTP requests to the target through botnet hosts or anonymous proxy servers, which eventually exhausts the target’s concurrency resources and slows the whole website down or makes it deny service altogether.

      Internet architectures pursue scalability, which in essence means improving concurrency. The various SQL performance optimizations (eliminating slow queries, splitting databases and tables, indexing, optimizing data structures, limiting search frequency) are in essence ways to reduce resource consumption. CC does exactly the opposite: it fills up the server’s concurrent connections, makes requests bypass the cache as far as possible and read the database directly, and picks the most resource-consuming queries, preferably ones that cannot use an index so that every query is a full table scan. This achieves maximum denial of service with minimal attack resources.

      Internet products and services rely on data analysis to drive improvement and ongoing operations. So besides the OLTP systems (front-end apps, middleware, and databases) there is also OLAP: a big-data platform that runs from log collection and storage through data processing and analysis. A CC attack does not affect only the OLTP part. It also generates a large volume of logs that directly affects the downstream OLAP, in two ways: first, that day’s statistics are completely wrong; second, the surge of access logs during the attack increases the load on back-end data processing.

      CC is currently one of the main application-layer attack methods. Some defenses exist, but none solves the problem perfectly.

    • DNS flood

      Massive numbers of DNS requests with forged source addresses are used to flood the target DNS server. When attacking the authoritative DNS of a specific enterprise, the attacker can set the source addresses to the addresses of the various ISPs’ DNS servers to get past whitelist restrictions, and randomize the queried names under the target enterprise’s domain. Queries that miss the cache increase the server load.

      DNS is served not only on UDP port 53 but also over TCP. One defense is therefore to force UDP queries to retry over TCP, which verifies the request source: a forged source address never answers, so no reply is sent to it. For an enterprise’s own authoritative DNS servers, most legitimate requests come from the recursive resolvers of ISPs, so the whitelist can be set to the list of ISP DNS servers. Requests that forge an ISP DNS server as their source address can be further identified by their TTL value.
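
      The TTL check mentioned above can be sketched as follows, assuming the defender has already learned a typical hop count for each whitelisted ISP resolver. The list of initial TTL values, the `baseline_hops` input, and the tolerance are assumptions made for illustration.

```python
# Assumption: senders start from one of these common initial TTL values.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimated_hops(observed_ttl):
    """Guess the hop count by assuming the smallest common initial TTL >= observed."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("TTL out of range")


def looks_spoofed(observed_ttl, baseline_hops, tolerance=2):
    """Flag a packet whose inferred hop count is far from the learned baseline."""
    return abs(estimated_hops(observed_ttl) - baseline_hops) > tolerance
```

      A packet that claims to come from a resolver normally 12 hops away but arrives with a TTL implying 18 hops is a candidate for dropping. Routing changes also shift the hop count, which is why the sketch uses a tolerance instead of an exact match.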

    • Slow connection attack

      For HTTP, start with the well-known slowloris attack: establish an HTTP connection, declare a large Content-Length, and send only a few bytes at a time, so that the server keeps believing the HTTP header has not been fully transmitted. With enough such connections, the server’s available connections are quickly exhausted.

      Variants include slow HTTP POST and slow read attacks, which are based on the same principle.

    • DOS attack

      Some server programs have bugs, security vulnerabilities, or architectural flaws. An attacker can send malformed requests that the server cannot process correctly, leaving it stuck in a denial-of-service state. For example, some versions of application server programs contain a buffer overflow that can be triggered but cannot be used to obtain a shell. The attacker can still alter the program’s execution flow so that it jumps to a null pointer or an address that cannot be handled. A user-mode error brings the process down, and if the kernel cannot recover from the error, the system may crash.

      Problems of this kind also result in denial of service, but in essence they are vulnerabilities that can be fixed by applying the latest patches. The author does not consider them part of DDOS.

  • Attack methods

    • Hybrid

      In actual high-traffic attacks, TCP and UDP traffic are mixed, and the network layer and application layer attacks are carried out simultaneously.

    • Reflection

      DRDOS was first disclosed in 2004. The attacker sets the source address of SYN packets to the target’s address and sends large numbers of them to real TCP servers. Each server that receives a SYN packet tries to complete the three-way handshake by “responding” with a SYN|ACK packet to the target address, which completes a “reflection” attack. The attacker stays hidden, but there is a problem: the traffic the attacker generates and the traffic the target receives are in a 1:1 ratio, and the target answers each SYN|ACK with an RST right away, so the return on investment of the attack is low.

      The essence of a reflection attack is to abuse a “challenge-response” protocol: the source address of the query packet is set to the target’s address through a raw socket, so the response packets are all sent to the target. If the response is larger than the query, or the protocol has a recursive effect, the attack traffic is amplified, making this a cost-effective volumetric attack.

      Protocols used in reflection attacks include NTP, Chargen, SSDP, DNS, and RPC Portmap.

    • Flow amplification

      Take SSDP, one of the protocols common in the DRDOS attacks described above: the attacker sets the search type to ALL to search for all available devices and services. This recursive effect produces very large amplification. With only a small amount of query traffic carrying forged source addresses, an attacker can direct tens or even hundreds of times as much response traffic at the target.
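
      The arithmetic behind amplification is simple enough to write down. The sketch below uses made-up packet sizes purely for illustration; real ratios depend on the protocol and on what each reflector returns.

```python
def amplification_factor(request_bytes, response_bytes):
    """Bytes delivered to the target per byte the attacker sends."""
    return response_bytes / request_bytes


def reflected_traffic_mbps(attacker_mbps, factor):
    """Traffic arriving at the target for a given attacker uplink."""
    return attacker_mbps * factor


# Hypothetical sizes: a 100-byte query that triggers 3000 bytes of replies.
factor = amplification_factor(100, 3000)
target_mbps = reflected_traffic_mbps(10, factor)
```

      With these assumed numbers the factor is 30, so a 10 Mbps uplink becomes 300 Mbps at the target. The SYN reflection case described earlier corresponds to a factor of about 1, which is why it pays off so poorly.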

    • Pulse type

      Many attacks last very short periods of time, usually less than five minutes, and show up on traffic charts as spikes.

      Such attacks are popular because “hit and stop” works best: as soon as the defense threshold is triggered and the defense mechanism takes effect, the attack stops, and then the cycle repeats. It is like a mosquito that does not bite but keeps buzzing around your ears: turn on the light and it vanishes, turn off the light and it comes back, and you cannot sleep.

      Automated defense mechanisms are mostly triggered by preset thresholds. Although many vendors claim second-level response for their defenses, in practice this is hard to achieve.

      Network-layer attack detection is usually either flow-based or packet-based. The former detects DDOS attacks from Netflow data sampled at a certain ratio (for example, 1000:1). The latter, packet-by-packet detection, is more accurate and responds faster, but costs more, and vendors generally will not disregard TCO to deploy it. Even with packet-by-packet detection, activating the cleaning policy still depends on a threshold. Cleaning devices are usually not deployed inline, so once triggered, traffic must first be diverted before it can be cleaned. In most scenarios, therefore, detection can happen within seconds but defense cannot, and near-source cleaning and cloud cleaning take even longer to trigger and switch over. An attacker can exploit this gray period before the defense rules take effect and finish the attack before the defense is triggered, which shows up as a pulse.
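
      The gray period described above can be made concrete with a toy model. The sketch assumes a detector that fires after traffic stays above a threshold for `detect_window` consecutive seconds, and a fixed `divert_delay` before cleaning takes over; all parameter names and numbers are hypothetical.

```python
def unmitigated_seconds(traffic, threshold, detect_window, divert_delay):
    """Count the seconds of above-threshold traffic that hit the target uncleaned."""
    consecutive = 0       # seconds in a row above the threshold
    cleaning_from = None  # second at which cleaning becomes active
    exposed = 0
    for second, rate in enumerate(traffic):
        cleaning = cleaning_from is not None and second >= cleaning_from
        if rate > threshold:
            consecutive += 1
            if not cleaning:
                exposed += 1
        else:
            consecutive = 0
        if cleaning_from is None and consecutive >= detect_window:
            # Detection fired; cleaning starts only after traffic is diverted.
            cleaning_from = second + 1 + divert_delay
    return exposed


pulse = [10] * 10 + [100] * 60 + [10] * 10   # a 60-second burst
sustained = [10] * 10 + [100] * 300          # a 5-minute attack
```

      With a 30-second detection window and a 60-second diversion delay, the whole 60-second pulse lands before cleaning starts, while the sustained attack is exposed for only its first 90 seconds. This is the asymmetry that pulse attacks exploit.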

    • Link flood

      As DDOS techniques evolved, a new attack mode appeared: the link-flooding attack, which does not attack the target directly but congests the upstream links of the target network. For an enterprise network using IP Anycast, conventional DDOS traffic is “dispersed” across infrastructure at different addresses, which effectively mitigates high-volume attacks. Attackers therefore devised a new method: attacking the penultimate hop on the traceroute path to the target network, that is, the upstream router, to congest the link. Domestic ISPs do not currently allow Anycast, so whether there is any need for this kind of attack there remains to be seen.

      Attacks on tier-1 ISPs and on IXPs can both cause link congestion.

0x02 Multi-Layer Defense Structure


DDOS is essentially a kind of attack that can only be mitigated, never completely defended against. Unlike a vulnerability, it cannot be fixed with a patch: even an enterprise that has bought and deployed the most competitive defense solution on the market is not fully cured. Security products such as firewalls, IPS, and WAF advertise some anti-DDOS capability, but they are effective only against low-volume application-layer attacks, not against high-volume DDOS attacks.

Taking mid-2015 as an example, there were roughly 10-20 DDOS attacks per month in China whose traffic reached 300 Gbps. That figure will grow further as domestic household broadband speeds increase. How to defend against 200-500 Gbps of attack traffic is described below. The complete defense structure can be divided into four layers.

These four layers are not defense in depth in the strict sense, and not every defense involves all of them; sometimes only two take part. For high-volume attacks, however, all four layers are needed. Contrary to what vendors’ marketing may suggest, protection against high-volume attacks is not something a single vendor can deliver alone, but the result of several layers defending together.

  • ISP/WAN layer

    This layer is usually invisible to end users, and a small business may really have no access to it. For a large Internet company, a public cloud vendor, or even a cloud cleaning vendor, however, this layer is essential, because once traffic exceeds what they can handle themselves they must rely on the capabilities of the telecom operators. Large Internet companies hold large bandwidth reserves, but not enough to shrug off high-volume DDOS; after all, bandwidth is one of the most expensive items in IDC costs. Whenever a cloud computing vendor or large Internet company announces that it withstood an attack of more than 200 Gbps, the operators’ anti-DDOS capability is behind it. Third-party cloud cleaning platforms likewise depend on the telecom operators to some degree, since their own cleaning capacity alone is very limited.

    At present, for example, China Telecom’s dedicated anti-DDOS service, Yundi (云堤), offers [near-source cleaning] and [traffic suppression]. Vendors that buy the service can specify the IP addresses to blackhole, and the telecom equipment acts on them in coordination. Black-hole routing is a crude method: along with the attack traffic, some real users’ traffic is blackholed too. It degrades the user experience, and in essence it is a way to preserve link bandwidth for other users. Such a paid service exists because the alternative is that the entire site becomes unreachable for all users. Cloud cleaning vendors, in fact, also need to coordinate black-hole routing with the telecom operators.

    By contrast, the divert, clean, and re-inject approach to attack traffic harms the user experience less, but it costs the defender more than black-hole routing and takes longer to trigger.

    The main participants at the carrier layer are large Internet companies, public cloud vendors, and cloud cleaning vendors. The main value of this layer is as a supplementary mitigation for huge attack traffic that exceeds their own bandwidth reserves and DDOS defense capability.

  • CDN/Internet layer

    A CDN is not an anti-DDOS product, but for Web services it happens to provide some anti-DDOS capability. Take the flash sales of large e-commerce companies as an example: the volume of page views is so large that by many measures it is no less than a CC attack. On the platform side, however, most requests are filtered out by CAPTCHAs at the CDN layer, and the requests that finally reach the database are a small fraction of the total.

    For HTTP CC attacks, the CDN absorbs the attack with its own bandwidth. Dynamic requests that it cannot absorb, or that pass through the CDN, go back to the origin site. If the anti-DDOS capability or the bandwidth in front of the origin is limited, the attack still succeeds.

    The strategy proposed by cloud cleaning vendors is as follows: set the website’s CNAME in advance so that the domain points to the cloud cleaning vendor’s DNS servers. Normally, the vendor’s DNS still sends requests that pass through the CDN back to the origin site. When an attack is detected, the domain is pointed at the vendor’s own cleaning cluster, and the cleaned traffic is then sent back to the origin.

    The main detection method is to deploy a reverse proxy (nginx) in front of the customer’s website to terminate all concurrent connections. To divert attack traffic, the vendor prepares a pool of IP addresses for the domain name. When one IP address is attacked, it is disabled and the next address in the pool is enabled, and so on.
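
    The address-pool rotation described above amounts to a small piece of bookkeeping, sketched below. The class and method names are hypothetical, the addresses come from a documentation range, and the actual DNS update that would publish the new address is deliberately left out.

```python
from collections import deque


class AddressPool:
    """Tracks which IP address of a domain is currently published in DNS."""

    def __init__(self, addresses):
        self._standby = deque(addresses)
        self.active = self._standby.popleft()
        self.disabled = []

    def rotate(self):
        """Retire the attacked active address and promote the next standby one."""
        if not self._standby:
            raise RuntimeError("address pool exhausted")
        self.disabled.append(self.active)
        self.active = self._standby.popleft()
        return self.active


pool = AddressPool(["203.0.113.1", "203.0.113.2", "203.0.113.3"])
next_address = pool.rotate()  # "203.0.113.1" is under attack, switch away
```

    The pool is finite, so the sketch raises an error when it runs out; how large a pool to hold is a cost decision for the vendor.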

    The above is Cloudflare’s approach, and domestic cloud cleaning vendors work on similar principles. Such schemes share a common problem, however: because the domestic environment does not support Anycast, diversion is done through DNS and takes a long time to take effect. The delay depends on how quickly recursive DNS servers pick up the change, which is completely uncontrollable. In addition, CDNs work only for Web services, not for services such as games that connect directly over TCP.

    The process of onboarding many of these anti-DDOS services can be summed up in one sentence: change the CNAME and wait for the recursive DNS servers to pick it up.

  • DC layer

    Huawei’s latest anti-DDOS product supports terabit-class bandwidth of 1440 Gbit/s. Its working principle is shown in the following figure. A DDOS probe is deployed at the DC egress using port mirroring or optical splitting. When an attack is detected, traffic is diverted to the out-of-path DDOS cleaning device, and the cleaned traffic is re-injected toward the service hosts, completing a closed loop.

    Deploying the equipment is not in itself very technical; the technical part is packaged inside the product box as defense algorithms. It should be noted, however, that the algorithms and learning capabilities of the ADS boxes on the market are limited and still need human intervention, for example:

    • Adaptive learning of the business traffic baseline is limited. For example, the traffic patterns of the e-commerce industry may exceed what ADS can learn, so that normal traffic triggers an attack verdict
    • The automatic trigger mechanism is based on thresholds, which means it is not fully automated, because a threshold is a value that depends on experience and the business scenario
    • The global policy is a one-size-fits-all policy and cannot protect each sub-service individually. A sub-service may already have been DDOSed while the global policy has not yet triggered
    • ADS can handle common DDOS types automatically, but unusual types may need to be identified manually
    • The default template policy may not suit some services and needs to be customized

    Beyond the overall architecture design of DDOS defense, the examples above are where professional skills are required. ADS devices currently cover solutions for layers 3, 4, and 7.

    In general, ADS devices can mitigate most common DDOS attacks. Relatively speaking, layer 3-4 attack methods are fairly fixed, whereas layer 7 attack methods keep changing because many protocols are involved, so ADS sometimes cannot fully cover layer 7. It offers measures such as CAPTCHAs, but these still do not solve the problem well. Application-layer protection has to be combined with the business, with ADS playing a supporting role where it can. For example, for a game’s private packet protocol, a fingerprint agreed between client and server can be added to each packet, and ADS then checks whether incoming packets carry it.

  • OS/APP layer

    This is the last line of defense against DDOS. Its main value is to filter and mitigate the traffic that gets past the ADS devices one last time, and to supplement protection for application-layer protocols that ADS cannot protect. For example, against NTP reflection, monlist can be disabled in the server configuration, and if no UDP-based service is offered, UDP can be blocked outright at the border.

    Web services make up the largest share of online services on the Internet, so some Internet companies like to build DDOS protection at the system level, for example against CC. These functions can sometimes also serve business security, for example against scalpers who grab orders.

    The reverse proxy forwards complete HTTP requests to a detection server, which examines them along the following dimensions:

    • Concurrent requests from the same IP
    • Concurrent requests with the same IP+cookie
    • The same configurable HTTP header fields
    • The same request URL

    The detection server then keeps a counting table of client connection information. For example, each time the same IP+cookie makes another connection request within the unit of time, the counter for that client is incremented until the threshold is reached, and the server then blocks the connection.
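
    A minimal sketch of such a counting table, keyed on IP+cookie with a sliding time window, might look like this. The class name, the limit, and the window length are assumptions for illustration; a production system would also need shared state across proxies and eviction of idle keys.

```python
import time
from collections import defaultdict, deque


class RequestCounter:
    """Sliding-window request counter keyed by (ip, cookie)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # (ip, cookie) -> request timestamps

    def allow(self, ip, cookie, now=None):
        """Record one request and report whether it stays within the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[(ip, cookie)]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        # Blocked requests are recorded too, so a client that keeps
        # hammering stays blocked until it slows down.
        hits.append(now)
        return len(hits) <= self.limit
```

    The same structure works for the other dimensions in the list by changing the key, for example to the IP alone or to the request URL.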

    The above is an example of CC defense. ADS devices themselves provide HTTP 302 redirects, CAPTCHAs, JavaScript challenges, and other defenses. Although more can be done at the OS and application layers, doing so carries the cost of in-house development and long-term maintenance, and whether it pays off depends on the specific situation.

  • Link bandwidth

    Increasing bandwidth is rarely mentioned in articles on DDOS defense strategy, yet it underlies all the defenses above. The CDN in the second layer, for example, is really bandwidth pooling, and many large enterprises build their own CDN, which in essence means adding bandwidth themselves. Beyond the CDN, make sure there is no obvious single-point bottleneck among the multiple ISP links at the DC egress, the backup links, and the switches down to the rack level. Otherwise the anti-DDOS layers may cope, only for a switch of some other brand at one node along the path to be overwhelmed, and the end result is still an outage.

    Internet companies with mature operations generally practice capacity management and plan ahead for promotions, peak-hour bandwidth, and the peaks and troughs of IDC resources. For DDOS defense, the main task is to eliminate the single-point bottlenecks inherent in these network designs.

0x03 Different Types of Enterprises


DDOS defense is essentially a contest of resources. The complete four-layer defense works well but has an obvious TCO problem: apart from the top 10 companies in the Internet industry, few can bear the cost. Even if you can afford it, it may take up too large a share of overall IT investment, and being able to afford it does not make it the right investment. The following therefore describes the direction and choice of DDOS mitigation solutions for different kinds of enterprises.

  • Large platforms

    The word “large” here has several meanings:

    • The company is so rich that it can afford to pour subsidies into particular businesses without caring “too much” about the outlay, and simply picks the most effective solution
    • The company is not necessarily highly profitable yet, but its IDC investment is large enough that, to protect the existing investment, keeping total security spending at a fixed percentage is necessary; there is no point in saving “small change” when the IDC footprint is very large
    • The investment in DDOS defense is negligible compared with the potential loss from service outages

    Mapped to real-world companies, these correspond to:

    • Internet companies valued at $10 billion or more
    • Large public cloud vendors, IaaS and PaaS platforms
    • Companies with large IDCs. How large counts as large is hard to judge; 10,000 physical servers, for instance, should now count only as medium
    • High-profit businesses such as games and online payments, which are also targeted by DDOS more often

    For wealthy companies with large IDCs, the anti-DDOS mantra is “lean on the operators and build out data centers”: with all the DDOS defense mechanisms in place, keep raising the barrier of the IDC infrastructure so that attackers face a higher threshold. Companies whose own network traffic is high have a natural advantage in anti-DDOS, because the infrastructure expansion driven by rapid business growth automatically becomes defense capability. Companies with little business traffic, on the other hand, are unlikely to want to pay for a pile of bandwidth.

    Companies that have money but not many online servers may not need to invest heavily in IDC construction; they should instead buy as many DDOS defense mechanisms as they can.

  • Small and medium-sized enterprises (SMEs)

    Competing on resources is certainly not an SME’s strength, so pursuing ROI is the primary anti-DDOS strategy. The first mode is extreme cost saving: run unprotected in normal times, and when an attack comes, find an anti-DDOS vendor for emergency traffic diversion. The results of this approach are barely satisfactory, yet it is probably the mindset of the vast majority of enterprises: muddle along and save wherever possible. There is nothing wrong with that, but at the critical moment you must know whom to contact and what to do.

    The second mode cares about results and looks for good cost performance. If your service runs on a public cloud platform, you can directly use the anti-DDOS capability provided by the cloud platform. For Web services, you can purchase services from a cloud cleaning vendor in advance.

0x04 Different Types of Services


Different types of services require different DDOS defense mechanisms. Therefore, the preceding four layers cannot be applied mechanically.

  • Web

    For Web services, end users can tolerate a certain amount of delay during an attack. All four layers of defense apply, and large platforms generally use all four.

  • Games

    Light games built on Web APIs are similar to Web services. For large online games that use TCP sockets, however, even a slight delay hurts the user experience and can easily cause disconnections. Cloud WAF, CDN, and similar measures do not work in this scenario because they target the Web; only DNS diversion and ADS can be used for cleaning. For the parts that ADS cannot defend perfectly, auxiliary measures can be taken by modifying the client-server communication protocol, for example adding a tag to packets and discarding any packet found without it. Such defenses are basically tricks that rely on information asymmetry. For DNS diversion, vendors that offer HTTPDNS can be used to shorten the time recursive DNS takes to pick up changes.
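
    One possible way to implement the packet tag mentioned above is a truncated HMAC over the payload. This is an assumption of this sketch, not something the original prescribes; `SHARED_KEY`, `TAG_LEN`, `tag_packet`, and `accept_packet` are hypothetical names.

```python
import hashlib
import hmac

TAG_LEN = 8  # assumption: 8 bytes of tag appended to every packet
SHARED_KEY = b"hypothetical-key-shipped-with-client"


def _tag(payload):
    return hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()[:TAG_LEN]


def tag_packet(payload):
    """Client side: append the tag to the payload."""
    return payload + _tag(payload)


def accept_packet(packet):
    """Filter side: keep only packets whose trailing tag matches the payload."""
    if len(packet) <= TAG_LEN:
        return False
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    return hmac.compare_digest(tag, _tag(payload))
```

    Because the key ships inside the client, a determined attacker can extract it, which is exactly the information asymmetry the text refers to: the tag filters out generic flood tools, not a targeted adversary.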

  • Service strategy

    • Classification strategy

      On a platform, DDOS against certain services makes the whole site unavailable. If DNS is unavailable, every online service is effectively unavailable. For applications built around a strong account system, such as e-commerce and games, if SSO login is unavailable then every online service is unavailable too. Attackers need only take down these services, following the saying “to catch the bandits, first catch their king”. Different assets should therefore get different levels of protection: based on BCM requirements, classify assets by their availability SLA and protect each SLA level accordingly. Concretely, services or functions that can cause a platform-level SPOF (single point of failure) deserve higher-cost defenses. Higher cost means not only buying more ADS devices but possibly also building multiple disaster recovery nodes, and giving these services higher priority in monitoring and response.

    • Failover mechanisms

      DDOS defense depends not only on the four layers described above but also on the strength of the infrastructure: geographically distributed disaster recovery, mirror sites, hot sites, and standby systems, with cloud systems deployed across AZs, all of which can be switched over at any time. Putting all your eggs in one basket leaves you with few options.

      Redundancy at the infrastructure and application levels is the technical foundation. Having a matching set of DRP and BCP policies is not enough either; real periodic drills are needed to be able to cope with high-volume attacks.

    • Service degradation

      When designing application services, try to avoid single-point bottlenecks, so that one component being DDOSed does not take the whole product down. Ideally, some services can be taken offline without affecting the other online services (or functions). When necessary, in a “sacrifice the pawn to save the king” situation, you can choose to cut services off. The choice is not just between “0” and “1”; it can be a gray scale: for example, of 10 services originally online, keep only the 3 most important ones running under attack.
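
      The gray-scale choice can be expressed as a ranking over services. The sketch below assumes each service has been given a priority in advance (1 = most important); the service names and priorities are invented for the example.

```python
def services_to_keep(priorities, keep):
    """Return the `keep` most important services, most important first."""
    ranked = sorted(priorities.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:keep]]


# Hypothetical priorities agreed on before any attack happens.
PRIORITIES = {
    "login": 1,
    "payment": 2,
    "catalog": 3,
    "reviews": 7,
    "recommendations": 9,
}
```

      Deciding these priorities ahead of time, and rehearsing the cut, matters more than the code itself.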

  • Supplementary means

    DDOS attacks are not always aimed purely at taking a service down. In games, for example, players have been known to DDOS a server because they failed to grab the first room in the list. Factors like this can be cured at the root through product design. Many application-layer DDOS attacks are likewise just a means to some other goal and have nothing to do with the four layers of defense above; they are a matter of product design. So defense against DDOS should start from the attacker’s motive rather than respond blindly.

0x05 NIPS


ADS is essentially a packet-filtering device. Its function differs from that of an IPS, but the two are similar. As mentioned before, a site-wide IPS sometimes needs to provide virtual patching. ADS devices can take on this role, but only for a limited number of rules. The payload to match can be customized.

A typical security team can easily build custom rules by hand: run the PoC exploit, capture packets to identify the characteristics of the attack payload, and write hexadecimal matching rules.

0x06 Breaking Defenses and Countermeasures
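
As a sketch of what such a hexadecimal matching rule does, the snippet below checks a payload against byte patterns written as hex strings. The rule name and bytes are invented for the example, and real ADS or IPS devices have their own rule syntax, which this does not try to reproduce.

```python
def compile_rule(hex_pattern):
    """Turn a hex string such as '90 90 eb 1f' into raw bytes."""
    return bytes.fromhex(hex_pattern)


def matching_rules(payload, rules):
    """Return the names of all rules whose byte pattern occurs in the payload."""
    return [name for name, pattern in rules.items() if pattern in payload]


# Hypothetical signature taken from a captured PoC packet.
RULES = {"demo-poc-overflow": compile_rule("41 41 41 41 90 90 eb 1f")}
```

A plain substring match like this is easy to evade with small payload changes, so it fits the virtual-patch role (buying time until the real patch lands) rather than long-term protection.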


From a security manager’s perspective, even a complete four-layer defense is not perfect. A platform claiming 400-500 Gbps of protection capacity can still be taken down. Full protection capacity rests on people, strategy, and technical means all working; if any link fails, the whole anti-DDOS process may fail. Examples include the following:

  • The IP addresses for black-hole routing must be defined by the defender and coordinated with the operator. Coordination involves notifying the upstream device through an interface to block the IP addresses. If that interface is unavailable, the function fails, so ISP-layer defense may fail
  • CDN and cloud cleaning services rely on the cleaning provider taking over domain name resolution. If the provider’s own DNS is unavailable, defense at this layer fails too. There are many issues like this, and these anti-DDOS vendors are not invulnerable
  • ADS devices may not be deployed inline in normal times, but once attack traffic is diverted they become the front-end device for the service. If the device itself can be DOSed, the defense fails completely even when diversion is triggered. In addition, ADS devices are usually connected to a management node, and problems there also cause a series of problems for ADS defense
  • High-volume attacks need human intervention. The most basic requirement is that security or operations staff can reach the management node from the office network and operate the ADS devices remotely. If the management link from the office network fails, remote operation is cut off and the tension of the on-site emergency rises by an order of magnitude

The above is not meant to offer new ideas for attackers, but to give anti-DDOS solution designers another perspective from which to examine the weaknesses of their solutions.

0x07 Filing a Case and Tracing


It is now possible to file a police case for attacks with more than 100 Gbps of traffic, which is a big improvement. In the past, without local connections you could not even get a case filed. But filing a case is only the first step of a long march. To actually find the person, all of the following steps must succeed:

  • Among the mass of attack data, find clues that lead back to the IP address or related domain name of the C&C server
  • “Hack the hackers” and take down the C&C server
  • Physically locate the attacker through the IP addresses they log in from, or with a third party’s big-data resources on APTs (if you can get them)
  • Accompany the police to make the arrest at the suspect’s home
  • Go to court

If the person has no special background, you may get the outcome you want; if they do, you will have spent months for nothing. Whether the counter-hack succeeds depends on the security team’s penetration skills and on having the time to spare. The process is still rather costly for most companies, and needing a strong security team is by itself enough to rule most of them out. I just happened to have worked with such a team in the past.