Artificial intelligence has become part of our lives. Its enormous potential is now obvious to everyone, and more and more is being said about innovative products that offer a glimpse of an AI-operated future - but far less about the risks of introducing these technologies, primarily because they are not yet considered relevant.
Nevertheless, as developers of data protection solutions that incorporate AI, we can confidently say that these risks will become a real problem not in years, but in months.
Let's look into it. Alongside promising technological trends such as machine learning and AI, cyberthreat technologies grow and develop at the same pace, if not faster - the recent WannaCry and NotPetya outbreaks prove that.
AI algorithms, for all their advantages, have a fundamental problem: data sensitivity. The general weakness of most algorithms created so far is that they are trained not to understand information, but to recognize the right answers. Knowing these principles allows attackers to find ways to deceive them.
There are several potential ways of doing this that we know of. The first is a poisoning attack carried out at the training stage. The second is forcing the algorithm to make a wrong decision at the stage of application - often called an evasion attack.
We do not present real cases of such attacks; however, desk experiments on images have already been described, in which a program mistakes what is in the picture. The same goes for text, where a chatbot is forced to give incorrect answers to users' questions. For now this does not pose a serious danger, but as machines take on more and more responsible decisions - and that is inevitable - it can become a big problem.
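The training-stage (poisoning) attack can be illustrated with a toy sketch. The data, the nearest-centroid classifier, and the injected points below are all hypothetical, chosen only to show the mechanism: an attacker who plants mislabeled samples into the training set drags the decision boundary off target.

```python
import numpy as np

def centroid_predict(train_x, train_y, test_x):
    # Toy 1-D nearest-centroid classifier: predict whichever class
    # centroid the test point is closer to.
    c0 = train_x[train_y == 0].mean()
    c1 = train_x[train_y == 1].mean()
    return (np.abs(test_x - c1) < np.abs(test_x - c0)).astype(int)

# Clean training data: class 0 clustered near -2, class 1 near +2.
train_x = np.array([-2.2, -2.0, -1.8, 1.8, 2.0, 2.2])
train_y = np.array([0, 0, 0, 1, 1, 1])
test_x = np.array([-2.5, -1.5, 1.4, 2.5])
test_y = np.array([0, 0, 1, 1])

acc_clean = (centroid_predict(train_x, train_y, test_x) == test_y).mean()

# Poisoning: the attacker injects points deep in class-1 territory
# but labels them as class 0, pulling the class-0 centroid to the right.
poison_x = np.array([4.0, 4.0, 4.0])
poison_y = np.array([0, 0, 0])
px = np.concatenate([train_x, poison_x])
py = np.concatenate([train_y, poison_y])
acc_poisoned = (centroid_predict(px, py, test_x) == test_y).mean()
```

With the clean set the classifier is perfect on this tiny test; after the injection, points near the original boundary start being misclassified.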
One of the most interesting cases of such attacks is attacks on computer vision algorithms. In a well-known example, an image of a panda - easily recognizable by a person - is overlaid with "noise": gradient points form a completely different picture, and the AI confidently recognizes it as a gibbon.
A far more intimidating example used the same gradient points to mask an image of a road "stop" sign - the human eye still saw a stop sign, but computer vision no longer recognized it as one.
In this case, there is clearly a high probability that a self-driving car, having reached the intersection, simply would not stop. Such cases have already been described by researchers, which means that real attacks of the same kind are a matter of the near future.
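The mechanism behind these image attacks can be shown in miniature. The sketch below uses a hypothetical linear scorer standing in for a vision model (the weights and input are random placeholders, not a real network): for a linear model the gradient of the score with respect to the input is the weight vector itself, so a tiny perturbation of `eps * sign(gradient)`, applied against the current prediction, pushes the score toward the decision boundary - the core idea behind gradient-sign attacks.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=100)   # stand-in for learned weights of a linear scorer
x = rng.normal(size=100)   # stand-in for a flattened "image"

def score(v):
    # Sign of the score is the predicted class (>0 one class, <0 the other).
    return float(w @ v)

# Worst-case small perturbation: eps * sign(gradient), directed against
# the current prediction. Each pixel changes by at most eps.
eps = 0.1
x_adv = x - eps * np.sign(w) * np.sign(score(x))

# The per-pixel change is tiny, yet the score moves toward the boundary.
```

With a large enough `eps` (or an expressive enough model) the predicted class flips outright, which is what happens to the panda and the stop sign.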
On the one hand, being aware of AI vulnerabilities at an early stage of its development gives scientists the opportunity to close these gaps and make algorithms more robust. On the other hand, the same knowledge lets cybercriminals improve their methods of attacking algorithms, and it is only a matter of time before intruders start exploiting it. The main thing that still deters them is access to computing resources.
So what can be done about it? In our opinion, the most important step is to handle the initial data correctly.
Make sure you use trusted sources
We ourselves use only our own controlled sources and data from our users' computers, which we are able to verify.
Do a qualitative analysis
Data preprocessing is needed at all stages, but applying it during the training phase removes anomalies from the original data, reduces the risk of using incorrect data and, consequently, the risk of such attacks. For example, when processing image databases, using the pictures in their original form leaves them vulnerable to a poisoning attack, while smoothing the images drastically reduces the probability of a successful attack.
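The smoothing idea can be sketched in a few lines. The "image" here is a hypothetical 1-D row of pixels and the filter is a simple 3-tap box average (real pipelines would more likely use a Gaussian blur); the point is that high-frequency adversarial noise shrinks sharply after averaging, while the underlying signal survives.

```python
import numpy as np

# A flat "image" row and a high-frequency adversarial perturbation.
row = np.array([10.0, 10.0, 10.0, 10.0, 10.0])
perturbed = row + np.array([0.0, 5.0, -5.0, 5.0, 0.0])

def box_smooth(x):
    # 3-tap box filter with edge padding: each pixel becomes the
    # average of itself and its two neighbours.
    padded = np.pad(x, 1, mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

residual_before = np.abs(perturbed - row).max()                      # 5.0
residual_after = np.abs(box_smooth(perturbed) - box_smooth(row)).max()
```

After smoothing, the worst-case residual of the perturbation drops from 5.0 to well under 2 - the attacker's carefully placed pixel-level noise largely cancels out.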
One more example. Knowing how a classic antivirus works, attackers regularly re-encrypt the body of a malicious program, changing its hash so that, at least for a while, it is not recognized by signature-based antiviruses. We have been familiar with such tricks for a long time - they give us additional features for the AI algorithm, which eventually help identify the malicious program.
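Why re-encryption defeats signatures but not feature-based detection can be shown with a toy example. The byte strings below are hypothetical stand-ins for two variants of the same program, and the byte histogram is a deliberately crude stand-in for the richer features a real detector would use: changing a single byte produces a completely different SHA-256 hash, yet the coarser feature barely moves.

```python
import hashlib

# Two hypothetical "program" variants differing by a single byte.
sample_v1 = b"\x4d\x5a" + b"\x90" * 64 + b"payload"
sample_v2 = b"\x4d\x5a" + b"\x90" * 63 + b"\x91" + b"payload"

# Signature view: the hashes share nothing.
h1 = hashlib.sha256(sample_v1).hexdigest()
h2 = hashlib.sha256(sample_v2).hexdigest()

# Feature view: a byte histogram (crude stand-in for real ML features)
# changes by only two counts.
def byte_hist(data):
    hist = [0] * 256
    for b in data:
        hist[b] += 1
    return hist

diff = sum(abs(a - b) for a, b in zip(byte_hist(sample_v1), byte_hist(sample_v2)))
```

The hash comparison fails completely while the histogram distance is 2, which is exactly why features that survive re-encryption are valuable training signals.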
Cover the maximum number of situations during the algorithm's training phase
Every day we train a new model and constantly receive data from Active Protection about what is happening on our customers' machines. We assume that it is possible to reverse-engineer our algorithm, devise an attack, and start feeding us inaccurate data, which would degrade the quality of the algorithm's performance.
In theory this could work, but in practice, by regularly monitoring the algorithm's quality and comparing it with yesterday's or last week's results, we know how accurate it is. If we see that accuracy has dropped, we analyze the sources of new data and determine whether there is a real threat. Constant synchronization and change tracking will not prevent the attack itself, but will save you from its consequences.
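The monitoring loop described above can be sketched in a few lines. The accuracy figures and the alert threshold below are assumed values for illustration only: keep a rolling baseline of recent measured accuracy and flag any drop beyond a tolerance, which is the signal to audit incoming data sources.

```python
# Daily measured accuracy of the deployed model (hypothetical numbers).
history = [0.981, 0.979, 0.982, 0.980]
today = 0.951

# Rolling baseline over the recent window.
baseline = sum(history) / len(history)

# Tolerance is a tunable assumption; a drop beyond it suggests that
# poisoned or low-quality data may have entered the pipeline.
TOLERANCE = 0.02
alert = (baseline - today) > TOLERANCE

if alert:
    print("accuracy drop detected - audit new data sources")
```

Here the baseline is about 0.98, so today's 0.951 trips the alert; the point is not the specific numbers but that degradation is caught by comparison against recent history rather than against an absolute target.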
Do not neglect technical methods
In addition to the correct organization of data processing, do not forget about technical tools: for example, the dropout method allows the network to carry on functioning correctly even if a certain number of neurons fail, which improves the neural network's robustness.
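A minimal sketch of (inverted) dropout, assuming a plain NumPy setting rather than any particular framework: during training each activation is zeroed with probability `p` and the survivors are rescaled by `1 / (1 - p)` so the expected output is unchanged, which forces the network not to rely on any single neuron.

```python
import numpy as np

def dropout(activations, p, rng, training=True):
    # At inference time the layer is an identity.
    if not training:
        return activations
    # Keep each unit with probability 1 - p; rescale survivors so the
    # expected value of the output matches the input.
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

rng = np.random.default_rng(0)
a = np.ones(10_000)
out = dropout(a, p=0.5, rng=rng)
# Roughly half the units are silenced, yet the mean stays close to 1.0.
```

The rescaling is what makes the trick safe: the network sees the same expected signal during training and inference, while never being allowed to depend on any individual unit.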
The trend of using AI in cybersecurity obliges us to pay maximum attention to two important components: high accuracy of ransomware recognition and protection of the algorithms themselves from possible attacks. We gave simple examples of how, knowing the principles behind an algorithm, attackers can find ways to defeat it.
Therefore, it is important to stay aware, pay maximum attention to the quality of the initial data, and remember that AI can be used not only in cybersecurity but also in cybercrime - we can already assume that future malicious programs will use AI algorithms in their attacks.
Article by Sergey Ulasen, Director of Development at Acronis, PhD.