Confessions of a hacker: I once sounded the charge
This piece originally had a different title; halfway through writing it, I felt the experience deserved the current one.
Darkness comes before dawn
We have to go back more than six years. That was the darkness before dawn for China's cybersecurity: phishing sites were rampant. Kingsoft and 360 Security Guard had both launched compensation guarantees for online shopping, and most of the resulting pressure fell on the anti-phishing program.
At that time, I had been working on the Kingsoft Cloud Check project for four years, from a fresh graduate to a development manager, and had been working in the anti-Trojan field, witnessing the building of the first cloud security system in China, which is still running now.
By chance, I took over the anti-phishing project, which had only one product manager, one operations person, half a client-side developer and half a server-side developer, plus me: four people in total. I later learned that our competitors had invested dozens of times more manpower in anti-phishing than we did.
The one thing in our favor was that I reported directly to a VP and could direct product, operations, client-side and server-side development with zero communication cost. An anti-phishing idea that I came up with in the morning could go live on the public network by noon, and I could see the data in the afternoon. This was a huge advantage that I did not appreciate until much later: quick decisions.
At the time, the project was criticized for its slow response time, with the back-end system taking an average of five minutes to go through the complete analysis and release process. Of course, the amount of analysis that goes on every day is enormous.
So, the first thing I did when I took over the project was to spend a week rewriting the analysis code for the anti-phishing system. However, the new code never went live, and I did not write any code for the next year. In the process I had realized that the code, while problematic, was not the bottleneck, and that I had more important things to do.
Trade-offs and gains
Several things were done in the first month.
Improved response time, from five minutes to one minute. This required no code changes, just streamlining the process and making some trade-offs.
Got a colleague from virus analysis to reverse-engineer a competitor's anti-phishing module and back-test it against our week-old detections. This was only for comparing effectiveness; we did not trust the competitor's verdicts either. Of course, our competitors were doing the same thing to us.
Started blocking gambling sites. Because there had been a debate about whether gambling sites counted as phishing, they had not been blocked before. After I took over, I simply blocked them.
These three changes, to put it bluntly, made the product's numbers look good across the board. But the product was still far from what I wanted it to be.
Because of the lack of manpower, there were many things we could not do and had to give up. For example:
Back in 2010-2011, Dr. Ye led an algorithms team that tried using machine learning to block gambling and phishing sites, which in hindsight was very forward-looking. But its false-positive rate was too high, and we had to drop it because nobody was available to do the engineering work.
Many phishing sites use encrypted code to evade string-rule matching. The right response is to run the page in a browser engine to obtain the real content. Again, we had no one to do it, so we gave that up too.
Counterfeit websites, such as fake bank and Taobao sites, are generally handled with image-similarity comparison. Unsurprisingly, we gave up on that as well.
With no further technical investment available, how do you go about blocking phishing sites?
Finding a way out of an impasse
We no longer had any advantage in passively detecting phishing sites. We might as well go down fighting and switch from defense to offense! The attack strategy was simple: raise the cost of running a phishing scam, so that scammers go from risking nothing to losing money.
Sometimes having no choice is what forces you into the truly right choice.
Exactly how to do this required a targeted analysis of each type of phishing site. Because of staffing constraints, we focused only on the three most harmful categories: Taobao, airline tickets and train tickets.
Cutting interception time to zero seconds
The first category we analyzed was Taobao phishing. The conventional technical approach is to analyze page features, similarity to the real Taobao website, and so on. As unorthodox outsiders with little technology to lean on, the first thing we did was analyze how victims actually got scammed.
Thanks to the functionality, we could not only reconstruct the whole fraud process technically, but also talk to the user directly, much like reconstructing a criminal case. Through this analysis we found that if we failed to pop up a warning at the very first interaction between scammer and user, before any trust had been established, the lure of a low price would lead the user into the scam.
A bit of criminal psychology.
Further, how does the scammer gain the user's initial trust? He sends a URL that resembles taobao.com, for example . Because the URLs of Taobao product pages are particularly long, users do not read them carefully, and such a domain name is enough to pass for the real thing.
So, for Taobao phishing, I made a bold decision: drop all content detection and, for certain free top-level domains, turn on host-level blacklist matching by regular expression directly on the public network. In other words, as long as a domain name looks like Taobao and has a suffix like , it is judged a phishing site outright. The response time of this rule is zero seconds.
This simple rule worked surprisingly well: real-time blocking, zero false positives, and a consistently very high block rate, with nothing more than ongoing maintenance of the rule.
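The host-level rule described above can be sketched roughly as follows. This is only an illustration, not the original rule: the suffixes `example` and `test` are placeholders (the article does not name the real ones), and the pattern and the function name `is_taobao_lookalike` are assumptions.

```python
import re
from urllib.parse import urlsplit

# Hypothetical placeholder suffixes; the real free top-level domains
# targeted by the rule are not given in the article.
FREE_SUFFIXES = ("example", "test")

# Assumed pattern: the host contains "taobao" (allowing 0-for-o swaps)
# and ends in one of the listed suffixes.
LOOKALIKE_HOST = re.compile(
    r"ta[o0]ba[o0].*\.(?:"
    + "|".join(re.escape(s) for s in FREE_SUFFIXES)
    + r")$"
)

def is_taobao_lookalike(url: str) -> bool:
    """Judge a URL by its host name only; page content is never fetched."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if absent
    if host is None:
        return False
    return LOOKALIKE_HOST.search(host) is not None
```

Because the check looks only at the host name, it can run at the moment the link is first seen, which is where the "zero seconds" comes from. The real taobao.com is never matched here because its suffix is not in the list.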
This unorthodox approach also caused some headaches for the competitors who were monitoring our detection rates.
Blocking off channels of dissemination
Blocking Taobao phishing by going on the offensive gave us a taste of success and a boost of confidence. We immediately turned our guns on airline ticket phishing, which involved large sums of money and was hard to detect.
These phishing sites have no such distinctive features. Most of them can even present authentic airline ticket agent credentials, and colleagues in anti-phishing operations had been fooled into manually whitelisting such sites. If an experienced person cannot tell the difference, how can a machine?
To attack phishing sites, zero false positives is a primary requirement and a source of confidence. So it is imperative to find a way to tell, with 100% certainty, whether the real airline ticket agent credentials used by the phishing site actually belong to it or not.
As the saying goes, the devil may climb a foot, but the Way climbs ten. Although it required checking the IATA website plus a second, manual confirmation, we now had the precondition for attack firmly in hand: zero false positives. All that remained was to find a key point at which to intercept.
Again, we reconstructed several fraud cases in full and found a striking similarity: paid airfare promotions on search engines were where the scams were hiding. They were especially rampant on a certain "dog" search engine, and most users were scammed through this channel.
If this distribution channel were blocked, airline ticket phishers would pay for promotion and get no revenue in return. As a precaution, I wrote a script that monitored the airfare promotions on several search engines, and we verified the results manually every day. As expected, apart from the legitimate travel sites, these promotions were invariably phishing.
Without delay, airline ticket anti-phishing also dropped content detection and switched on a "not white means black" mode directly on the public network: if a site came from an airline ticket ad promotion on the dog search engine and was not on the whitelist, it was reported as black outright.
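The "not white means black" rule can be sketched as a small decision function. This is a minimal illustration under stated assumptions: the whitelist hosts are placeholders, and the function name `verdict`, its parameters, and the three return labels are hypothetical, not taken from the original system.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist of manually verified travel sites (placeholder hosts).
WHITELIST = {"travel.example", "airline.example"}

def verdict(landing_url: str, from_airfare_promotion: bool) -> str:
    """Return "white", "black", or "unknown" for a landing page."""
    if not from_airfare_promotion:
        # The rule only covers traffic arriving through the promotion channel.
        return "unknown"
    host = urlsplit(landing_url).hostname or ""
    # Accept a whitelisted host itself or any of its subdomains.
    for good in WHITELIST:
        if host == good or host.endswith("." + good):
            return "white"
    # Default deny: anything else from this channel is treated as phishing.
    return "black"
```

The design choice is default deny within one narrow channel. That only stays at zero false positives if the whitelist is complete for that channel, which is why the manual daily verification matters.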
The blocking effect was immediate, but we soon received an email from a certain dog browser complaining about false positives: we had blocked more than 100 of their airline ticket promotions, while our competitor had blocked only 30. Two things can be seen from this: how rampant airline ticket phishing was at the time, and how effective this blocking method was.
It was not long before the dog browser ended its commercial partnership with us, and soon afterwards our own browser went into secret development, but that is a story for another time.
The bottleneck is in the people
Anti-phishing is a battle of wits with criminal syndicates: the opponents are numerous and driven by strong financial interest. As a hacker, I could intercept not only by technical means but also through analysis of psychology, economics and other non-technical angles. The bottleneck in this model is obvious: people. Once the people leave the anti-phishing system, the results fall apart.
Is there a once-and-for-all solution? That is what I have been thinking about. Can AI, which is all the rage right now, solve this problem? I doubt it. At best it can broaden the range of sites intercepted and reduce the manual effort. It is still all defense.
In my opinion, the only way to make phishing sites go away is to go on the offensive. After the author of the Panda Burning Incense virus was caught, malicious infectious viruses became rare in China. Yet although domestic security companies hold abundant, corroborated evidence of phishing crimes, we rarely hear of anyone being jailed for them. Perhaps we have simply been on the defensive for too long.