It's Double Eleven today and I'm not writing any code!! I'll quietly write an article instead :-D

On the eve of New Year's Day 2018, traffic to the company's business system suddenly increased 5-6 times. We had done no rehearsal or preparation for that, so the system broke down and the money stopped coming in. For a small company, that is painful and happy at the same time!

After the firefighting was over, I wrote up a few small lessons. Big companies probably have no use for them, but they may be of some help to small and micro companies with only a handful of people.

Our tech stack is PHP, but the principles are the same everywhere! Let's not argue about which is the best language in the world 🙂

Short version

  • Upgrade to PHP7
  • Enable OPcache
  • Use logs to identify potential risks and bottlenecks
  • Clear out slow database queries
  • Disable the Debug and Log functions
  • Increase cache time

Extended version

  • If you have not yet upgraded to PHP7, please do so if you can. In my opinion, anyone with a bit of technical ambition should not cling to things from N years ago and refuse to let go. Is the "stability" of an ancient CentOS 6.x, or even 5.x, really stable? Technical debt can be owed, but not too much of it. Reinventing wheels is not most small firms' forte, and an old technology stack is a pain to maintain.

This is not meant as an attack on CentOS. I have tried CentOS, Fedora, openSUSE, Mandriva Linux, Elementary OS, Debian, Ubuntu, Ultimate Edition, Red Flag Linux, Ylmf OS (Rain Forest Wind OS, now named StartOS), Linux Deepin (Deepin OS, not the Ghost images that computer malls usually install!) … After going through every Linux distribution I could get my hands on, I decided that Ubuntu was the friendliest and best suited to novices like me. After graduation I came across Linux Mint, and I have been using it for over 5 years now. The year before last I saved up for a laptop with an i7, a GTX 1070, 64 GB of RAM, a 1 TB SSD and a 4K screen, and the first thing I did when I got it was install Linux Mint. My previous laptop (i7, 16 GB, SSD) still runs the same system. In my second year after graduation I did iOS development in a virtual machine, which felt smoother than a low-spec Mac Mini, though of course not as smooth as the MacBook Pro the company provided.

  • Enable OPcache. It is the first item in "Several Tips for Maximum PHP7 Performance".

  • Pay attention to the following two logs:

    • PHP-FPM slow log (for example, does the code call a function like sleep? Is it really necessary?)
    • Database slow query log (missing indexes, triggers, stored procedures, etc. may cause slow queries or even table locks)
  • Clear out slow database queries

Personally, I think that for a typical small Internet company, the first system bottleneck usually shows up in the database. In normal times business volume is small and hardware resources are sufficient, so the problem stays hidden and nobody pays much attention to it. The database connection pool is not the bottleneck: in the face of a large number of slow queries, no connection pool is big enough. Fix the slow queries first. A slow query, or even a locked table, may come down to nothing more than a missing index…

  • Disabling debug mode needs no explanation, but I'm afraid I'll forget it!! As for logs, watch out for synchronous, potentially blocking logging (for example, writing a log entry synchronously on every request is not appropriate).
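To make the point about blocking logs concrete, here is a minimal sketch of the same principle in Python rather than PHP (an illustration only, not what our system used). The request path only puts records on an in-memory queue, and a background listener does the slow writing. `ListHandler` and the logger name `app.requests` are hypothetical stand-ins.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Hypothetical stand-in for a slow sink (file, network); it only collects messages."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log_queue = queue.Queue()
sink = ListHandler()

# The listener drains the queue on its own thread and feeds the slow sink.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

logger = logging.getLogger("app.requests")
logger.setLevel(logging.INFO)
logger.propagate = False
# The request path only enqueues the record, so it never waits on the sink.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

for request_id in range(3):
    logger.info("handled request %d", request_id)

listener.stop()  # processes everything still queued, then joins the thread
print(sink.messages)
```

The trade-off is that records still sitting in the queue are lost if the process dies, which is usually acceptable for request logs but not for audit data.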

The too-long-to-read version (verbose)

On New Year's Eve in 2018, system traffic suddenly soared 5-6 times, even though we had not run any advertising recently. My first reaction was that we were under attack, but on calmer reflection, who would be idle enough to target a small company? We did not have enemies that big. After checking various data, I inferred that this was organic traffic.

Because we had no drills or emergency plans, the system went down and the company's income basically stopped. Painful and happy at the same time!

My first suggestion was to upgrade PHP. After one of the servers was upgraded, its memory and CPU consumption was cut in half (note: this did not really solve the problem).

Then the other servers were upgraded, but embarrassingly, not long after, the system crashed again, with many more error logs. The PHP framework our company uses is ThinkPHP 3.x, which could not be upgraded smoothly because of accumulated legacy problems. In the end we downgraded back to PHP 5.6.

Back to square one, with the problem still unsolved!!

So the next attempt was to add servers and raise the upper limit on database connections (Alibaba Cloud database). But the pressure was still not relieved! Not at all! In other words, that was not where the problem was, and we were just throwing good money after bad.

Then I got an SSH account for a server.

Note: my personal responsibilities are rather mixed, from Android to iOS to Java Web to PHP Web… But I had never been in charge of maintaining the company's servers. Of course, on the outsourcing projects I am personally responsible for, I handle server purchase, system environment configuration, development, deployment and maintenance… everything from 0 to 1.

I enabled OPcache first, then tweaked some Nginx and PHP-FPM parameters, which helped but didn't fix the problem.

Then, in passing, I turned on the PHP-FPM slow log. Because the system was jammed, there were a huge number of log entries, and I failed to spot the key one. Nothing came of it that day.

A tool worth recommending: mosh. It doesn't drop the connection at critical moments.

After all, users need to rest too. Traffic slowly fell off and the system returned to normal. But no solution came to me in my dreams that night 😀

The next day I kept investigating, and all kinds of tweaks got nowhere. After lunch, my mother called to ask whether I was working overtime. Well…

After hanging up the phone, my sharp eyes noticed one small log entry. Following the line number given in the log, I looked through the code and found that the line called usleep, for 5 seconds. Oh my God. There was no need for it, so I commented it out and restarted PHP-FPM. I then went through every usleep call in the code, commented them all out, and restarted PHP-FPM again.

The system returned to normal service and the money started flowing in…

But the story doesn’t end there!

Traffic gradually climbed again in the afternoon, and by dinner time the system was slow to respond again. I was eating out at the time, without my computer, and could not go back, so I had no choice but to try from my phone. Fortunately, I had had the foresight that afternoon to prepare a tool for restarting PHP-FPM on all servers in one batch. One restart brought things back to normal. When the system clogged up again after a while, another restart worked too. But traffic kept growing and the restarts gradually stopped helping. By then I had finished dinner and was back home. Computer on, back to the fight!

The tool in question: PSSH (Parallel SSH)
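For readers who have not used a parallel SSH tool, the idea of a batch restart can be sketched in a few lines of Python. This is not PSSH itself and not the tool I prepared, just an illustration of running one command on many hosts at once. The host names are hypothetical, and the service name `php-fpm` is an assumption (it differs between distributions).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumption: the PHP-FPM service is called "php-fpm" and is managed by systemd.
RESTART_CMD = "sudo systemctl restart php-fpm"


def ssh_argv(host, remote_cmd=RESTART_CMD):
    """Build the ssh command line for one host (never prompts for a password)."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, remote_cmd]


def run_ssh(argv):
    """Really run the command; returns its exit code, or 124 on timeout."""
    try:
        return subprocess.run(argv, capture_output=True, timeout=30).returncode
    except subprocess.TimeoutExpired:
        return 124


def dry_run(argv):
    """Print the command instead of running it."""
    print(" ".join(argv))
    return 0


def restart_all(hosts, runner=run_ssh, max_parallel=10):
    """Run the restart on every host in parallel; returns {host: exit code}."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        codes = list(pool.map(lambda host: runner(ssh_argv(host)), hosts))
    return dict(zip(hosts, codes))


# Hypothetical host names; dry_run keeps this example from touching any server.
HOSTS = ["web1.example.internal", "web2.example.internal"]
results = restart_all(HOSTS, runner=dry_run)
print(results)
```

Passing `runner=run_ssh` (the default) would actually connect to the servers, so the example above deliberately uses the dry run.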

Finally, my colleague tracked down the cause: the session table in the database of another subsystem had no index, so batch-deleting sessions when they expired took a very long time, and the table was presumably locked.

I had already gone through the main system's other flaws (this one I was not aware of), and its session driver had been switched from MySQL to Redis ☺. What tripped us up was a subsystem, the one carrying all the traffic from an old app, which still used the database.

After the session table was indexed, the system returned to normal.
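The effect of the missing index is easy to reproduce on a toy table. The sketch below uses SQLite from Python's standard library as a stand-in for MySQL, so the plan text differs from what MySQL's EXPLAIN prints, and the table and column names (`sessions`, `expires_at`) are hypothetical. It only shows the principle: without an index, deleting expired rows means scanning the whole table.

```python
import sqlite3


def delete_plan(conn, cutoff):
    """Return SQLite's query plan for deleting sessions that expired before `cutoff`."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN DELETE FROM sessions WHERE expires_at < ?", (cutoff,)
    ).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT, expires_at INTEGER)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    ((f"s{i}", "x", i) for i in range(10000)),
)

# Without an index on expires_at the delete has to look at every row.
plan_before = delete_plan(conn, 5000)

conn.execute("CREATE INDEX idx_sessions_expires_at ON sessions (expires_at)")

# With the index the expired rows are located directly.
plan_after = delete_plan(conn, 5000)

print("before:", plan_before)
print("after: ", plan_after)
```

On a real MySQL table the equivalent check would be to run EXPLAIN on the cleanup statement before and after adding the index.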

In fact, enabling OPcache, disabling the database triggers and increasing the cache time of the API responses did not address the root of the problem. This time they did not even seem to treat the symptoms.

The breakthrough in this rescue came from the logs: one is the PHP-FPM slow execution log, the other is the MySQL slow query log. Once you identify why requests are queuing up in PHP-FPM, you can solve the problem.
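When the slow log is flooded, counting which call sits at the top of each stack trace helps the key entry stand out. Below is a minimal Python sketch of that idea. The sample text reflects my understanding of the usual PHP-FPM slow log layout (a header line, a `script_filename` line, then stack frames with the innermost call first); the paths, addresses and function names in it are made up.

```python
import re
from collections import Counter

# Made-up sample in the assumed PHP-FPM slow log layout.
SAMPLE_LOG = """\
[01-Jan-2018 20:15:03]  [pool www] pid 1234
script_filename = /var/www/app/index.php
[0x00007f1c2a0131d0] usleep() /var/www/app/Lib/Order.php:88
[0x00007f1c2a013100] pay() /var/www/app/index.php:12

[01-Jan-2018 20:15:04]  [pool www] pid 1240
script_filename = /var/www/app/index.php
[0x00007f1c2a0131d0] usleep() /var/www/app/Lib/Order.php:88
[0x00007f1c2a013100] pay() /var/www/app/index.php:12

[01-Jan-2018 20:15:05]  [pool www] pid 1251
script_filename = /var/www/app/index.php
[0x00007f1c2a0141d0] mysql_query() /var/www/app/Lib/Db.php:40
[0x00007f1c2a014100] find() /var/www/app/index.php:30
"""

# One stack frame: "[0xADDRESS] function() /path/to/file.php:LINE"
FRAME = re.compile(r"^\[0x[0-9a-fA-F]+\]\s+(\S+)\(\)\s+(\S+):(\d+)")


def top_frames(log_text):
    """Count the innermost frame (function, file, line) of every slow log entry."""
    counts = Counter()
    expect_top = False
    for line in log_text.splitlines():
        if line.startswith("script_filename"):
            expect_top = True  # the next frame line is the innermost call
            continue
        match = FRAME.match(line)
        if match and expect_top:
            counts[(match.group(1), match.group(2), int(match.group(3)))] += 1
            expect_top = False
    return counts


for (function, path, line_no), hits in top_frames(SAMPLE_LOG).most_common():
    print(f"{hits:4d}  {function}()  {path}:{line_no}")
```

A single `usleep` line repeated hundreds of times at the top of the list is much harder to miss than the same line buried in a scrolling log.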

The number of PHP-FPM processes is limited for a given amount of server resources. If a process sleeps (for example in usleep) or the number of database connections reaches its upper limit (because of slow queries), nothing visibly goes wrong while server resources are sufficient. However, once requests start queuing for PHP-FPM processes, an avalanche effect can follow, and the system goes down.

The above is a summary of my experience. It may contain many mistakes and shortcomings; corrections and discussion are welcome. Thank you!

I graduated from an ordinary 985 engineering university with a major in business administration and a double degree. I have worked at a small startup for 5+ years as a full-stack (and full-rookie) engineer. I wanted to move somewhere closer to home and was recently looking for a new position in the Pearl River Delta region, leaning toward the back end. Referrals welcome (update: I have since changed jobs)!
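A bit of arithmetic shows why a sleeping process is harmless at low traffic and fatal at high traffic. The numbers below are hypothetical (50 worker processes, a 0.25 s request, 100 requests per second at the peak); only the 5-second sleep comes from the story above.

```python
def max_requests_per_second(workers, seconds_per_request):
    """Upper bound on throughput when each worker handles one request at a time."""
    return workers / seconds_per_request


def backlog_after(seconds, arrival_rate, workers, seconds_per_request):
    """Requests still waiting after `seconds` of steady arrivals (0 if capacity is enough)."""
    capacity = max_requests_per_second(workers, seconds_per_request)
    return max(0.0, (arrival_rate - capacity) * seconds)


WORKERS = 50              # hypothetical number of PHP-FPM worker processes
FAST_REQUEST = 0.25       # hypothetical request time in seconds, without the sleep
SLOW_REQUEST = 0.25 + 5   # the same request with a 5-second usleep added
PEAK_RATE = 100           # hypothetical requests per second at the peak

fast_capacity = max_requests_per_second(WORKERS, FAST_REQUEST)
slow_capacity = max_requests_per_second(WORKERS, SLOW_REQUEST)
queued = backlog_after(60, PEAK_RATE, WORKERS, SLOW_REQUEST)

print(f"capacity without sleep: {fast_capacity:.1f} req/s")
print(f"capacity with sleep:    {slow_capacity:.1f} req/s")
print(f"queued after 60 s at {PEAK_RATE} req/s: {queued:.0f}")
```

Under these assumptions the sleep cuts capacity by a factor of about 20. At a fraction of the peak rate nobody notices, but at the peak the queue grows by thousands of requests per minute, which is the avalanche described above.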