DolphinAttack or How To Stealthily Fool Voice Assistants

Six researchers from Zhejiang University published an excellent paper describing DolphinAttack: a new attack against voice-based assistants such as Siri or Alexa. As usual, the objective is to force the assistant to accept a command that its owner did not issue. The attack is more powerful if the owner does not detect its occurrence (except, of course, for the potential consequences of the accepted command). Ideally, the owner should not hear a recognizable command, or better yet, should hear nothing at all.

Many attacks try to fool the speech recognition system by crafting inputs that deceive the machine learning model powering the recognition without using actual phonemes. The proposed approach is different: the objective is to fool the audio capture system rather than the speech recognition itself.

Humans do not hear ultrasound, i.e., frequencies greater than 20 kHz. Speech usually ranges from a few hundred Hz up to 5 kHz. The researchers' great idea is to exploit the characteristics of the acquisition system.

  1. The acquisition chain is a microphone, an amplifier, a low-pass filter (LPF), and an analog-to-digital converter (ADC), regardless of the speech recognition system in use. The LPF filters out frequencies over 20 kHz, and the ADC samples at 44.1 kHz.
  2. Any electronic system creates harmonics due to non-linearity. Thus, if you modulate a signal at frequency fm with a carrier at fc, many harmonics appear in the Fourier domain, such as fc − fm, fc + fm, and fc, as well as their multiples.

You may have guessed the trick. If the attacker modulates the command (fm) with an ultrasonic carrier (fc), the resulting signal is inaudible. However, the non-linearity of the acquisition chain produces an intermodulation component back at fm, in the audible band, and the LPF removes the carrier and its harmonics before the ADC. The residual command is thus present in the filtered signal and may be understood by the speech recognition system. Of course, a real command is more complex than a single frequency, but the principle remains valid.
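
To see the principle at work, here is a minimal simulation sketch in Python. All parameters are illustrative, not the paper's: a single tone stands in for the voice command, a quadratic term models the non-linearity of the microphone and amplifier, and a brute-force FFT filter plays the role of the LPF.

```python
import numpy as np

fs = 192_000   # simulation sample rate (Hz), high enough to represent ultrasound
fc = 25_000    # ultrasonic carrier (Hz), inaudible
fm = 1_000     # a single tone standing in for the voice "command" (Hz)
t = np.arange(0, 0.05, 1 / fs)

# Amplitude-modulate the command onto the inaudible carrier
command = np.sin(2 * np.pi * fm * t)
transmitted = (1 + 0.5 * command) * np.sin(2 * np.pi * fc * t)

# A quadratic term models the non-linearity of the microphone/amplifier
received = transmitted + 0.1 * transmitted ** 2

# Brute-force LPF via FFT: drop everything above 20 kHz,
# as the filter in front of the ADC would
spectrum = np.fft.rfft(received)
freqs = np.fft.rfftfreq(len(received), 1 / fs)
spectrum[freqs > 20_000] = 0
demodulated = np.fft.irfft(spectrum, n=len(received))

# The command frequency survives the filtering
magnitude = np.abs(np.fft.rfft(demodulated))
print("Dominant non-DC component: %.0f Hz" % freqs[np.argmax(magnitude[1:]) + 1])
```

The dominant surviving component sits at fm: the inaudible transmission has turned back into an in-band "voice" signal.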

They amplitude-modulated a carrier with a vocal command, using carrier frequencies in the range of 20 kHz to 25 kHz, and experimented with many hardware devices and speech recognition systems. As one may guess, the attack is highly hardware dependent: the optimal carrier frequency varies from device to device (due to the variety of microphones). Nevertheless, with the right parameters for a given device, they seem to have fooled most of the tested devices. Of course, the attack requires an ultrasonic speaker and an adapted amplifier; usual speakers have a response curve that cuts off before 20 kHz.

I love this attack because it thinks outside the box and exploits "characteristics" of the hardware. It is also a good illustration of Law N°6: Security is not stronger than its weakest link.

A good paper to read.

Zhang, Guoming, Chen Yan, Xiaoyu Ji, Taimin Zhang, Tianchen Zhang, and Wenyuan Xu. "DolphinAttack: Inaudible Voice Commands." In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS '17), 103–17. Dallas, Texas, USA: ACM, 2017. http://arxiv.org/abs/1708.09537

Picture by http://maxpixel.freegreatpicture.com/Dolphin-Fish-Animal-Sea-Water-Ocean-Mammal-41436

Subliminal images deceive Machine Learning video labeling

Current machine learning systems are not designed to defend against malevolent adversaries. Several teams have already succeeded in fooling image or voice recognition systems. Three researchers from the University of Washington fooled Google's video tagging system. Google offers a Cloud Video Intelligence API that returns the most relevant tags of a video sequence. It is based on deep learning.

The researchers' idea is simple. They insert the same still image into the video every 50 frames. The API then returns the tags related to the inserted image (with over 90% confidence) rather than the tags related to the actual video content. Thus, rather than returning "tiger" for a video showing a tiger 98% of the time, the API returns "Audi".
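
For illustration, here is a minimal sketch of such a forgery using OpenCV. The file names, decoy image, and codec are my placeholders, not the paper's material:

```python
import cv2

reader = cv2.VideoCapture("tiger.mp4")            # hypothetical input video
fps = reader.get(cv2.CAP_PROP_FPS)
width = int(reader.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(reader.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("forged.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"),
                         fps, (width, height))

decoy = cv2.imread("audi.jpg")                    # hypothetical decoy image
decoy = cv2.resize(decoy, (width, height))

index = 0
while True:
    ok, frame = reader.read()
    if not ok:
        break
    # Replace every 50th frame with the decoy image
    writer.write(decoy if index % 50 == 0 else frame)
    index += 1

reader.release()
writer.release()
```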

It has never been demonstrated that subliminal images are effective on people. This team demonstrated that subliminal images can be effective on machine learning systems. Such an attack has many interesting potential uses.

This short paper is worth reading. Testing this attack on other APIs would be interesting.

Hosseini, Hossein, Baicen Xiao, and Radha Poovendran. "Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos." arXiv, March 31, 2017. https://arxiv.org/abs/1703.09793

AlphaGo: Round three – The Supremacy

This will most probably be my last post on AlphaGo. AlphaGo is the supreme go player.

As announced, at the end of May 2017, the "Future of Go Summit" took place in China. During this event, AlphaGo Master won three games against Ke Jie, the top grandmaster. After this magisterial success, AlphaGo played its last competitive match. The DeepMind team will now focus on new challenges.

AlphaGo: round two

In January 2016, AlphaGo won against a top go player. This victory was a huge success for Artificial Intelligence (AI). It was also a huge shock for the go community, which has started to study AlphaGo and explore new ways to play go. New styles may emerge.

At the end of May, there will be a round two. The Chinese government and the DeepMind team have set up an event with Chinese professional players to explore this new opportunity. The central event will be AlphaGo playing against the world's number one player. Many lessons may come from this event.

To be followed…

Neural Networks learning security

In October 2016, Martin Abadi and David Andersen, two Google researchers, published a paper that made headlines, titled "Learning to Protect Communications with Adversarial Neural Cryptography." The newspapers announced that two neural networks had autonomously learned how to protect their communication. This statement was intriguing.

As usual, many newspapers simplified the outcome of the publication. Indeed, the experiment operated under some significant limitations that the newspapers rarely highlighted.

The first limitation is the adversarial model. Usually, in security, we expect Eve not to be able to understand the communication between Alice and Bob. The usual assumptions for Eve are that she is either passive, i.e., she can only listen to the communication, or active, i.e., she can tamper with the exchanged data. In this case, Eve is passive, and Eve is a neural network trained by the experimenters. Eve is neither a human nor a customized piece of software. In other words, she has limited capacities.

The second limitation is the definition of success and secrecy:

  • For the training of Alice and Bob to be successful, an average reconstruction error rate of 0.05 bits on a 16-bit protected message is accepted. In cryptography, the reconstruction error must be null; we cannot accept any error in the decryption process.
  • The protected message output by the neural network is not required to look random. Usually, randomness of the output is an expected feature of any cryptosystem.

Under these working assumptions, Alice and Bob succeeded in hiding their communication from Eve after 20,000 training iterations. Unfortunately, the paper does not explain how the neural networks succeeded, nor what types of mathematical methods they implemented (although the researchers modeled a symmetric-like cryptosystem, i.e., Alice and Bob shared a common key). Nor was there an attempt to protect a textual message and challenge cryptanalysts to break it.
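
For the curious, here is a minimal sketch of the adversarial setup in Python with PyTorch. The architecture (small fully connected networks) and the loss weighting are my simplifications; the paper's actual networks mix fully connected and convolutional layers:

```python
import torch
import torch.nn as nn

N = 16  # message and key length in bits, as in the paper

def net(in_dim, out_dim):
    # Tiny stand-in network; values live in [-1, 1] thanks to tanh
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_dim), nn.Tanh())

alice = net(2 * N, N)   # input: message || shared key
bob = net(2 * N, N)     # input: ciphertext || shared key
eve = net(N, N)         # input: ciphertext only

opt_ab = torch.optim.Adam(list(alice.parameters()) + list(bob.parameters()), lr=1e-3)
opt_e = torch.optim.Adam(eve.parameters(), lr=1e-3)

for step in range(20_000):
    msg = torch.randint(0, 2, (256, N)).float() * 2 - 1   # bits in {-1, 1}
    key = torch.randint(0, 2, (256, N)).float() * 2 - 1

    # Train Eve to reconstruct the message from the ciphertext alone
    cipher = alice(torch.cat([msg, key], dim=1)).detach()
    eve_loss = (eve(cipher) - msg).abs().mean()
    opt_e.zero_grad(); eve_loss.backward(); opt_e.step()

    # Train Alice and Bob: Bob must reconstruct, Eve must do no
    # better than chance (mean absolute error of 1 on {-1, 1} bits)
    cipher = alice(torch.cat([msg, key], dim=1))
    bob_loss = (bob(torch.cat([cipher, key], dim=1)) - msg).abs().mean()
    eve_adv = (eve(cipher) - msg).abs().mean()
    ab_loss = bob_loss + (1 - eve_adv) ** 2
    opt_ab.zero_grad(); ab_loss.backward(); opt_ab.step()
```

Note that Alice and Bob are not rewarded for confusing Eve completely, only for keeping her at chance level, which mirrors the paper's objective.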

Thus, it is an interesting theoretical work in the field of machine learning, but most probably not useful in the field of cryptography. By the way, with the current trend in cryptography of requiring formal proofs of security, any neural-network-based system would fail this formal proof step.

Abadi, Martín, and David G. Andersen. "Learning to Protect Communications with Adversarial Neural Cryptography." arXiv:1610.06918, October 21, 2016. http://arxiv.org/abs/1610.06918

Artificial Intelligence vs. Genetic Algorithm

AI and Deep Learning are hot topics. Their progress is impressive (see AlphaGo). Nevertheless, they open new security challenges. I am not speaking here about the famous singularity point, but rather about basic security issues. This topic raised my interest. Thus, expect to hear from me on the subject.

Can AI be fooled? For instance, can recognition software be fooled into recognizing something other than what is expected? The answer is yes, and some studies seem to indicate that, at least in some fields, it may be relatively easy. What does the following image represent?

You most probably recognized a penguin. So did a well-trained deep neural network (DNN), as we might expect. According to you, what does the following image represent?

This time, you most probably did not recognize a penguin. Yet the same DNN decided that it was a penguin. Of course, this image is not a random image. A. Nguyen, J. Yosinski, and J. Clune studied how to fool such a DNN in their paper "Deep Neural Networks Are Easily Fooled." They used a genetic algorithm (or evolutionary algorithm) to create such fooling images. Genetic algorithms try to mimic evolution under the assumption that only the fittest elements survive (so-called natural selection). These algorithms start from an initial population. The population is evaluated through a fitness function (here, the DNN's recognition score for the image) to select the fittest samples. The selected samples then mutate and cross over, and the resulting offspring go through the same selection process. After several generations (usually on the order of thousands), the result is an optimized solution, as sketched below. The researchers applied this technique to fool the DNN with success.
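
Here is a minimal sketch of such an evolutionary loop in Python. The fitness function below is a toy placeholder; in the paper, the fitness is the DNN's confidence for the target class:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness_fn(image):
    # Toy placeholder; in the paper this would be the DNN's
    # confidence score for the target class (e.g., "penguin")
    return -np.abs(image - 0.5).mean()

population = rng.random((50, 28, 28))   # random initial images
for generation in range(1000):
    scores = np.array([fitness_fn(img) for img in population])
    # Selection: keep the fittest half as parents
    parents = population[np.argsort(scores)[-25:]]
    # Crossover: mix the pixels of two random parents
    idx_a = rng.integers(0, 25, 25)
    idx_b = rng.integers(0, 25, 25)
    mask = rng.random((25, 28, 28)) < 0.5
    children = np.where(mask, parents[idx_a], parents[idx_b])
    # Mutation: perturb roughly 10% of the pixels
    children += rng.normal(0, 0.05, children.shape) * (rng.random(children.shape) < 0.1)
    population = np.concatenate([parents, np.clip(children, 0, 1)])

best = population[np.argmax([fitness_fn(img) for img in population])]
```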

They also attempted to train the DNN with these fooling images as decoys so that it would reject them, and then ran the same process with new generations. The DNN newly trained with decoys did not perform any better at rejecting fooling images. The researchers also succeeded with totally random noisy images, although these images are aesthetically less satisfactory.

Interestingly, the characteristics of the fooling images, such as color, patterns, or repetition, may give some hints about which features the DNN actually uses as its main differentiators.

This experiment highlights the risks of AI and DNNs. They operate as black boxes. Currently, practitioners have no tools to verify whether an AI or DNN will operate properly under adverse conditions. Usually, with signal processing, researchers can calculate a theoretical false positive rate. To the best of my knowledge, this is no longer the case with DNNs. Unfortunately, false positive rates are an important design factor in security and in pattern recognition related to security or safety matters. With AI and DNNs, we are losing predictability of the behavior. This limitation may soon become an issue if we cannot expect them to react properly in non-nominal conditions. Rule 1.1: Always Expect the Attackers to Push the Limits.

A very interesting paper to read.

Nguyen, A., J. Yosinski, and J. Clune. "Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images." In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 427–36, 2015. doi:10.1109/CVPR.2015.7298640. Available at https://arxiv.org/pdf/1412.1897.pdf

A Milestone in AI: a computer won against a Go champion

I usually blog only about security or sci-fi. Nevertheless, I will blog about an entirely unrelated topic, as I believe we have reached an important milestone. Artificial Intelligence (AI) has been around for many decades with various successes. For several years, AI, through machine learning, has made tremendous progress, with some fascinating products and services deployed. For instance, Google Photos has leapfrogged the exploitation of image databases. It can automatically detect pictures featuring the same person over decades! Some friends told me that it even differentiated identical twins.

Nevertheless, I had always believed that the game of go was out of reach of AI. Go is an ancient, multi-millennial game with extremely simple rules (indeed, only three rules). It is played on a goban of 19 × 19 positions. Each player adds stones (white or black) to create the largest territory. The game is extremely complex, not only because of the number of possible combinations (said to be greater than the number of atoms in the universe) but also because of the infinite possible strategies. It exceeds the complexity of chess by several orders of magnitude. A great game!!!

On January 27, 2016, Google proved my belief wrong. For the first time, their software, AlphaGo, won five games to zero against a professional go player. AlphaGo was first trained with 30 million moves. Then, it reinforced itself by playing against itself thousands of times. The result is software at the level of a professional go player. Evidently, AI has passed a milestone.

Machine learning will smoothly invade security practices. Training software on logs to detect incidents will be a good starting point, as sketched below.
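
As a hint of what that could look like, here is a minimal sketch using scikit-learn's IsolationForest on toy log-derived features. The feature choice and the data are entirely illustrative:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated "normal" traffic features extracted from logs:
# [requests per minute, distinct ports contacted, MB sent]
normal = rng.normal(loc=[60, 3, 5], scale=[10, 1, 2], size=(1000, 3))

# A few simulated incidents: a port scan, a data exfiltration, a flood
incidents = np.array([[70, 200, 4], [65, 2, 500], [400, 5, 6]])

model = IsolationForest(contamination=0.01, random_state=0)
model.fit(normal)

# predict() returns -1 for anomalies and 1 for normal samples
print(model.predict(incidents))    # expected: [-1 -1 -1]
print(model.predict(normal[:3]))   # expected mostly: [1 1 1]
```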