(From Lebyte)

As programmers, we all know the story of the genius programmer. Start coding at a young age, create your first profitable website at age 11, go to college at 16, start a company at 17, and become a billionaire at 23.

We love the stories, we love the heroes. They inspire us with productive programming projects and trend-setting ways. From solving complex NP problems to raising millions in Series A funding, they never seem to miss a beat.

But here’s the thing: every developer, even the great ones like Lebyte, will mess things up and get over it.

The only difference is scale: if we screw up, some database records are corrupted; if they screw up, it could be a billion-dollar mistake. Why are we so afraid of being wrong? Mistakes are good, and there is no better teacher than failure.

But it also carries a certain stigma. No one wants to talk about mistakes because no one wants to be seen as a fool among geniuses.

But such repression has consequences. When a developer makes a mistake, it’s often seen as a personal failure, and the individual gets the blame. Statements like “Mike forgot to update the release document” or “Bill picked the wrong branch” are counterproductive.

Failures are usually systemic and provide a good opportunity to identify and correct business deficiencies. There is no better teacher than failure, and we should not be afraid to talk about failure.

In that spirit, I’m coming clean about three of the worst mistakes I made as a budding software developer. I’ll go on to explain how I’ve grown from each failure, and why I’m thankful for each one.

1

Deleted thousands of URLs

When I worked at a large financial institution, I developed a system to clean up unused routes in the F5 network layer. The F5 routing pool could only support about 5,000 URLs before it started blocking. The system automatically monitored traffic to these URLs, notified the owners of unused resources, and cleaned those resources up, so that the F5 system would not crash and the team was freed from constant manual work.

The system had been working fine. Then one Sunday I woke up to an email saying that 1,000 routes had been deleted the night before, and users were complaining that these were active, live URLs!

With everyone’s weekend ruined, our team sprang into action. It turned out that an old .yaml configuration file deployed with the application container told the system to remove routes that had been inactive for a week instead of a month. Fortunately, I had a fail-safe to prevent the deletion of production resources, but the problem was still serious: had my program deleted heavily used active resources, it could have caused a company-wide outage.
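The original tool’s code isn’t shown here, but the two safeguards in this story (an inactivity window and a production fail-safe) can be sketched in a few lines. This is only an illustration under my own assumptions: the `Route` type, the `select_stale_routes` function, and the 30-day `MIN_INACTIVITY` floor are hypothetical names, not part of the real system. The point is that a stale config value gets rejected loudly instead of being silently obeyed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical policy floor: never delete anything inactive for less than this.
MIN_INACTIVITY = timedelta(days=30)


@dataclass(frozen=True)
class Route:
    """Hypothetical record describing one route in the pool."""
    url: str
    last_seen: datetime   # last time traffic was observed on this route
    is_production: bool   # fail-safe flag: production routes are never deleted


def select_stale_routes(routes, now, inactivity):
    """Return the routes that are safe to delete.

    Raises ValueError if the configured window is shorter than the policy
    floor, so an outdated config file fails fast instead of deleting routes.
    """
    if inactivity < MIN_INACTIVITY:
        raise ValueError(
            f"inactivity window {inactivity} is below the minimum {MIN_INACTIVITY}"
        )
    cutoff = now - inactivity
    return [
        route for route in routes
        if not route.is_production and route.last_seen < cutoff
    ]
```

With a check like this, a config that says “one week” would stop the job with an error before anything is removed, and someone would have to look at it.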

It turns out that most resources that are inactive for a week remain inactive for a month. In other words, important applications don’t go inactive for a week. As a result, the eventual damage was manageable: of the 1,000 URLs removed, only a few drew complaints.

But it was also a tremendous amount of blame and pressure for me and my manager, especially in the early stages when the damage was not clear. So we set up a “war room” and diverted the entire team’s resources to manually rebuilding those lost resources.

Why did it happen?

At first, I thought it was all my fault. But, in hindsight, it was also a systemic failure. First, the existing F5 route management system did not meet the business needs, and there was no clear backup/rollback policy, which was a big problem.

In addition, the old configuration files were still hanging around because of the unnecessarily complex deployment process. It was too bureaucratic, and that made it error-prone. Finally, the critical task was given to me alone (i.e., no code review or team involvement) and the deadline was so loose that it was a recipe for disaster.

We never took it seriously enough. Now, looking back with experience, this outcome seems inevitable.

How did I grow?

I’m grateful to my colleagues who stepped up to get us out of this mess.

When my managers and most senior developers told me that they had lost faith in me as an engineer and would no longer allow me to work on important projects, I felt professional pressure I had never felt before.

In other words, they couldn’t believe I had done such a stupid thing, and they didn’t believe I could continue with this project or any other important project (which they eventually withdrew).

It was embarrassing, but I admit it made me cry.

One of my teammates took me out for a beer later, and when I relayed our conversation to him, he said it was unfair and told me how much he and the rest of the team appreciated me.

I had been stressed out all week, and when I heard someone say that, I finally broke down. My supervisor also took me to lunch to help me through it. All of these things are still fresh in my mind.

It taught me that while the code was well controlled, the infrastructure and data were often poorly controlled. It is critical to manage these components of your system with tools such as DBMate (for database migrations) and Terraform (for infrastructure as code), and to treat them as just as important as your application code.

Limiting access is also critical in a production environment.

For example, I don’t even keep shared branches checked out locally in my IDE, preferring instead to block all direct pushes to non-feature branches team-wide.

By default, database and cloud accounts should be read-only and have clear backup and recovery policies.

For example, in my next job, a developer accidentally deleted files from the prod S3 bucket. If I hadn’t turned on S3 versioning a week earlier (it’s off by default; Amazon sucks!), those files could have been lost forever.
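For readers who want to check their own buckets, here is a minimal sketch of turning versioning on with boto3. The helper name `enable_versioning` and the bucket name in the usage comment are placeholders of mine, not what was actually used at that job; `put_bucket_versioning` and `get_bucket_versioning` are the standard S3 client calls. The function takes the client as a parameter so it can be tried against a stub before touching a real account.

```python
def enable_versioning(s3_client, bucket_name):
    """Turn on object versioning for a bucket and confirm it took effect.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns True if the bucket reports versioning as enabled afterwards.
    """
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
    # Read the setting back rather than assuming the call worked.
    response = s3_client.get_bucket_versioning(Bucket=bucket_name)
    return response.get("Status") == "Enabled"


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the bucket name is a placeholder):
#
#   import boto3
#   enable_versioning(boto3.client("s3"), "my-prod-bucket")
```

Once versioning is on, deleting an object only adds a delete marker, so earlier versions can still be restored.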

Finally, I learned my last and most important lesson, which is empathy.

Recently one of my teammates had a similar situation: he put the wrong code into production and we had to manually modify some data.

He felt guilty about what he had done. I took the opportunity to explain clearly that it happened because our deployment and data migration process was not robust enough, that it was our team’s failure, not his, and that something like this was bound to happen.

At the same time, I reminded him how important the great features he created were to us and the company. The error simply reminded us to re-examine our tools/processes, which prompted him to contribute to the solution. Mistakes are opportunities.

2

Emailed code outside the company

Before I left that job, I emailed some code to myself. After nearly a year of studying the Spring library, I had created some pretty good test patterns. I didn’t want to forget these good ideas, and I planned to write a series of articles about them on Medium.

About a month later, on my first day at the new job, I received a text message that made me turn pale: “Dude, we’ve got a team incident. Someone emailed code outside the company, and there are legal implications. Do you know who did it?”

I immediately called my former manager, but there was no answer. Calls to my former colleagues went unanswered too. The legal department had stepped in and told them to cut off contact with me. It was really terrible. My new manager sensed something was wrong and asked me about it. He used to be a lawyer, so he told me to hire one just in case. I urgently called my wife’s family lawyer and discussed the situation. Since only utility code was involved, it was unlikely they would come after me, but it was possible.

My wife was in a good mood when she picked me up that day. She asked me how my first day was. “I think I blew it,” I replied, and her face changed. When I told her what had happened, she told me that it was stupid, but reassured me that we could get through it. I spent the next week in a fog, until my former company’s legal team approached me and told me they wouldn’t sue me if I signed an agreement to remove the code immediately.

Why did it happen?

I was trapped by old beliefs, simple as that. Although it may look like a nefarious conspiracy, the simple fact is that I was very proud of the patterns and utilities I had built, and I thought that if I lost them, I would lose something as a developer.

I had some big ideas about how this could lead to some interesting blog posts, and because I was so focused on them, the benefits somehow seemed to outweigh the risks.

To this day, I still feel bad about how this incident affected my former teammates. I was 100% responsible for this mistake, but they had to deal with the consequences. A team’s reputation can suffer, and dealing with audit issues causes a lot of trouble for everyone. It damaged my professional reputation and sadly broke many relationships.

How did I grow?

Most importantly, I have since become very cautious about company emails and internal communications.

Within a week of starting my new job, several employees were fired because of inappropriate conversations in Slack messages. It was a pretty messy affair: they were all fired, and the rest of us ended up having to go through mandatory human resources training on workplace harassment.

However incompetent your company’s tech people may seem, you should always assume they have full visibility into your private messages.

Also important is how my wife and parents were there for me.

Faced with such a situation, I felt very frustrated and my mind went blank. It was their calm and understanding that brought me back to reality.

I was on the brink of an existential crisis: how could I have a PhD and be so careless and stupid at the same time? Would this ruin my future? Without the support of my family, I might have overreacted and made the whole situation worse.

Their advice and guidance led me to hire a lawyer, which kept my situation from getting worse.

YAGNI (You Ain’t Gonna Need It) is not just a principle of software.

Does this code really need to be looked at again?

Is it worth the risk, even if it leads to multiple blog posts?

Definitely not. If you’re leaving a job, or starting a new life, just leave.

Don’t take anything, don’t look back, just look forward.
