Some useful lessons from Microsoft Hack

In July, Microsoft disclosed that a Chinese hacker group was able to access the mailboxes of some organizations. The attack used stolen signing keys. Recently, Microsoft published a post-mortem analysis of the incident and its remediation.
The analysis is an interesting read. There are many lessons and best practices. The following are my preferred ones.

Our investigation found that a consumer signing system crash in April of 2021 resulted in a snapshot of the crashed process (“crash dump”).  The crash dumps, which redact sensitive information, should not include the signing key.  In this case, a race condition allowed the key to be present in the crash dump (this issue has been corrected).  The key material’s presence in the crash dump was not detected by our systems (this issue has been corrected).

Memory dumps are critical for security.  An attacker may find a key within the dumped memory.  There are many techniques, such as entropy detection, brute force (as in the muslix64 attack against AACS), pattern detection for PEM-encoded keys, etc.
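To make two of these techniques concrete, here is a minimal sketch of scanning a dump for PEM private-key markers and for high-entropy windows.  The window size, the entropy threshold, and the function names are my own assumptions for illustration; real scanners are far more refined.

```python
import math
import re

# Matches PEM headers such as "-----BEGIN RSA PRIVATE KEY-----".
PEM_MARKER = re.compile(rb"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte."""
    if not block:
        return 0.0
    counts = [0] * 256
    for value in block:
        counts[value] += 1
    total = len(block)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def scan_dump(dump: bytes, window: int = 64, threshold: float = 5.5):
    """Return (offsets of PEM markers, offsets of high-entropy windows).

    The window size and threshold are illustrative assumptions.
    Windows are aligned on multiples of `window`, which is crude.
    """
    pem_offsets = [match.start() for match in PEM_MARKER.finditer(dump)]
    suspicious = [
        offset
        for offset in range(0, len(dump) - window + 1, window)
        if shannon_entropy(dump[offset:offset + window]) >= threshold
    ]
    return pem_offsets, suspicious
```

A defender can run the same kind of scan on crash dumps before they leave the production environment, which is the spirit of the second safeguard listed by Microsoft.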

Microsoft lists two impressive sets of security safeguards:

  1. Redaction of sensitive information from crash dumps before issuing them.
  2. Verification of the absence of key material (as GitHub proposes when scanning code and binaries).

Every secure software developer must know the risks associated with memory dumps.  Clear keys should stay in memory only for the strictly necessary time.  They should be erased or overwritten with random data as soon as the code no longer needs them.
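The pattern can be sketched as follows.  This is only an illustration: `ephemeral_key`, `sign`, and the `load_key` callback are hypothetical names, and in a garbage-collected language such as Python, the runtime or the libraries may keep other copies of the key, so the erasure is best effort.  Languages with explicit memory control offer stronger guarantees.

```python
import hashlib
import hmac
from contextlib import contextmanager


@contextmanager
def ephemeral_key(load_key):
    """Hold a clear key in a mutable buffer and zero it on exit.

    `load_key` is a hypothetical callback returning the key bytes.
    """
    key = bytearray(load_key())
    try:
        yield key
    finally:
        # Overwrite in place; a bytes object could not be modified.
        for index in range(len(key)):
            key[index] = 0


def sign(message: bytes, load_key) -> bytes:
    """Compute an HMAC while keeping the key alive as briefly as possible."""
    with ephemeral_key(load_key) as key:
        return hmac.new(key, message, hashlib.sha256).digest()
```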

Black Hat 2023 Day 2

  1. Keynote: Acting national cyber director  discusses the national cybersecurity strategy  and workforce efforts (K. WALDEN)

The White House has a new team of about 100 people dedicated to this task. No comment.


The people deciding which features require security reviews are not security experts. Can AI help?

The first issue is that engineering language differs from everyday language.   There are a lot of jargon and acronyms.  Thus, a standard LLM may fail.

They explored several strategies of ML.

They used unsupervised training to define the vector size (300 dimensions).  Then, they fed these vectors to a convolutional network to make the decision.

The presentation is a good high-level introduction to basic techniques and the journey.

Results: 2% missed and 5% false positives.


The standard does not forbid JWE and JWS with asymmetric keys.  By changing the header, an attacker can confuse the default behavior.

The second attack targets applications that use two different libraries, one for crypto and one for claims.  Each library parses JSON differently.   It is then possible to create inconsistencies.

The third attack is a DoS that sets the PBKDF2 iteration value extremely high.

My Conclusion

As a developer, ensure that validation accepts only a limited set of known algorithms and parameters.
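A minimal sketch of such a check on a token header follows.  The allowlist and the iteration cap are assumptions chosen for illustration, and `check_header` is a hypothetical helper; `alg` and `p2c` are the standard JOSE header parameters for the algorithm and the PBES2 iteration count.  The sketch only inspects the header and does not verify any signature.

```python
import base64
import json

# Assumed policy: the application decides which algorithms it expects.
ALLOWED_ALGS = {"ES256", "PBES2-HS256+A128KW"}
# Hypothetical bounds on the PBKDF2 iteration count ("p2c").
MIN_P2C, MAX_P2C = 1_000, 1_000_000


def b64url_decode(data: str) -> bytes:
    """Decode base64url text that may lack its padding."""
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))


def check_header(token: str) -> dict:
    """Reject tokens whose header is outside the allowlist."""
    header = json.loads(b64url_decode(token.split(".", 1)[0]))
    if not isinstance(header, dict):
        raise ValueError("header is not a JSON object")
    alg = header.get("alg")
    if alg not in ALLOWED_ALGS:
        raise ValueError(f"algorithm not allowed: {alg!r}")
    if alg.startswith("PBES2"):
        p2c = header.get("p2c")
        if isinstance(p2c, bool) or not isinstance(p2c, int):
            raise ValueError("p2c must be an integer")
        if not MIN_P2C <= p2c <= MAX_P2C:
            raise ValueError(f"p2c out of bounds: {p2c}")
    return header
```

The important design choice is to decide the accepted algorithms in the application rather than letting the token dictate them.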

ChatGPT demonstrates how bad humans are at testing

When someone demonstrates a model, are we sure they are not using training data as input to the demonstration?  This trick ensures PREDICTABILITY.

Train yourself in ML as you will need it.

Very manual methodology using traditional reverse engineering techniques

LAION-5B is THE dataset of 5 billion images.   It is a list of URLs.  But registered domains expire and can be bought.  Thus, the images they serve may be poisoned.  It is not a targeted attack, as the attacker does not control who uses it.

0.01% may be sufficient to poison.

It shows the risk of untrusted Internet data.  Curated data may be untrustworthy.

The attack is to use Java polymorphism to override the normal deserialization.  The purpose is to detect this chain.

Their approach uses taint analysis followed by fuzzing.

Black Hat 2023 Day 1

  1. Introduction (J. MOSS)

Jeff MOSS (Founder of DefCon and Black Hat) highlighted some points:

  • AI is about using predictions. 
  • AI brings new issues with intellectual property.   He cited the example of Zoom™, which just decided that all our interactions could be used for its ML training.
  • Need for authentic data.

The current ML models are insecure, but people trust them.  Labs had LLMs available for many years but kept them in-house.  OpenAI going public with its model started the race.

She presents trends for enterprise:

  • Enterprise’s answer to ChatGPT is Machine Learning as a Service (MLaaS).  But these services are not secure.
  • The next generation should be multi-modal models (using audio, image, video, text…).  More potent than monomodal ones such as LLMs.
  • Autonomous agents combine the data collection of LLMs with the ability to make decisions and take actions.  These models will need secure authorized access to enterprise data.  Unfortunately, their actions are non-deterministic.
  • Data security for training is critical.  It is even more challenging when using real-time data.

She pointed to an interesting paper about poisoning multi-modal data via image or sound.


Often, the power LED is more or less at the entry of the power supply circuit.  Thus, its intensity is correlated with the power consumption.

They recorded only the image of the LED to see the effect of the rolling shutter.  Thus, they increase the sampling rate on the LED with the same video frequency.  This is a clever, “cheap” trick.

To attack ECDSA, they used the Minerva attack (2020)

Conclusion: They turned timing attacks into a power attack.  The attacks need two conditions:

  1. The implementation must be prone to some side-channel timing attack.
  2. The target must have a power LED in a simple setting, such as a smart card reader, or USB speakers. 

Despite these limitations, it is clever.


Once more, users trust AI blindly.

The global environment is complex and extends further than ML code.

All traditional security issues are still present, such as dependency injection.

The current systems are not secure against adversarial examples.  They may not even present the same robustness for all data points.

Explainability is insufficient if it is not trustworthy.  Furthermore, the fairness and trustworthiness of the entity using the explanation are essential.


The Multi-Party Computation (MPC) protocol Lindell17 specifies that all further interactions should be blocked when a finalized signature fails.  In other words, the wallet should be blocked.  They found a way to exfiltrate the key share if the wallet is not blocked (which was the case for several wallets).

In the case of GG18 and GG20, they gained the full key by zeroing the ZKP using the CRT (Chinese Remainder Theorem) and choosing a small factor prime.

Conclusion: Add ZKPs to protocols to ensure that the design hypotheses are enforced.


They created H26Forge to generate malicious H.264 content.  They attack syntax elements by setting them to values outside their specified ranges.  Decoders may not test all of them.  The tool handles the creation of the forged H.264 streams.

Conclusion

This may be devastating if combined with fuzzing.

Enforce the limits in the code.
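As an illustration of enforcing the limits, here is a minimal sketch that validates a few already-parsed sequence parameter set fields before the decoder uses them.  The ranges are those of the H.264 specification as I understand them and should be checked against the standard; `validate_sps` and the dictionary representation are hypothetical.

```python
# Inclusive (min, max) ranges for a few SPS syntax elements.
# Assumption: values taken from the H.264 specification; verify before use.
SPS_LIMITS = {
    "log2_max_frame_num_minus4": (0, 12),
    "pic_order_cnt_type": (0, 2),
    "log2_max_pic_order_cnt_lsb_minus4": (0, 12),
}


def validate_sps(fields: dict) -> None:
    """Raise ValueError if any known field is outside its specified range.

    `fields` maps syntax element names to already-decoded integers.
    Fields that are absent from the stream are simply not checked.
    """
    for name, (low, high) in SPS_LIMITS.items():
        if name not in fields:
            continue
        value = fields[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside [{low}, {high}]")
```

A decoder that sizes buffers from `log2_max_frame_num_minus4` without such a check would trust whatever the forged stream claims.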


If the EKU (Extended Key Usage) is not properly verified for its purpose, bingo.

Some tested implementations failed the verification.  The speaker forged the signing tools to accept domain-validated certificates for signing code.
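The missing check is small.  The sketch below assumes the certificate has already been parsed and its signature chain validated, and that the EKU extension is available as a set of OID strings; `allowed_for` is a hypothetical helper, and rejecting certificates without an EKU extension is a strict policy choice of mine, not something taken from the talk.

```python
# Standard key purpose OIDs.
ID_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
ID_KP_CODE_SIGNING = "1.3.6.1.5.5.7.3.3"


def allowed_for(purpose_oid: str, eku_oids) -> bool:
    """Return True only if the certificate explicitly lists the purpose.

    `eku_oids` is the set of OIDs found in the EKU extension,
    or None when the certificate has no such extension.
    """
    if eku_oids is None:
        # Assumed strict policy: no EKU extension means no code signing.
        return False
    return purpose_oid in eku_oids


# A domain-validated TLS certificate typically lists only serverAuth.
tls_certificate_eku = {ID_KP_SERVER_AUTH}
```

With this check, `allowed_for(ID_KP_CODE_SIGNING, tls_certificate_eku)` is false, so a signing tool would refuse the domain-validated certificate.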


Politically correct but not really informative.

Breaching the Samsung S9 Keystore

Most Android devices implement an Android Hardware-backed Keystore.  The Rich Execution Environment (REE) applications, i.e., the unsecure ones, use a hardware root of trust and an application in the Trusted Execution Environment (TEE).  Usually, as all the cryptographic operations occur only in the trusted part, these keys should be safe.

Three researchers from Tel Aviv University demonstrated that it is not necessarily the case.  ARM’s TrustZone is one of the most used TEEs.  Each vendor must write its own Trusted Application (TA) that executes in the TrustZone for its key store.  The researchers reverse-engineered the Samsung implementation for S8, S9, S20, and S21.  They succeeded in breaching the keys protected by the key store.

The breach is not due to a vulnerability in TrustZone.  It is due to design errors in the TA.

When the REE requests the generation of a new key, the TA returns a wrapped key, i.e., a key encrypted with a key stored in the root of trust.  In a simplified explanation, the wrapped key is the newly generated key AES-GCM-encrypted with an IV provided by the REE application and a Hardware-Derived Key (HDK) derived from some information supplied by the REE application and the hardware root of trust key.

In other words, the REE application provides the IV and some data that generate the HDK.  AES-GCM encrypts like a stream cipher (it uses AES in CTR mode), and thus it is sensitive to IV reuse.  With a stream cipher, you must never reuse an IV with the same key.  Otherwise, an attacker who knows one plaintext encrypted under the same key and IV can easily recover any other message encrypted under them.  In this case, the attacker has access to the IV used to encrypt the wrapped key and can provide the same `seed` for generating the HDK.   Game over!
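The IV-reuse problem can be shown with a toy stream cipher.  The sketch below is NOT AES-GCM: the keystream is built from SHA-256 in counter mode purely as a stand-in, and `wrap` and the variable names are hypothetical.  The point is only that two plaintexts encrypted with the same key and IV share the same keystream.

```python
import hashlib


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream (stand-in for AES-CTR): SHA-256(key || iv || counter)."""
    output = b""
    counter = 0
    while len(output) < length:
        output += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return output[:length]


def xor(left: bytes, right: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(left, right))


def wrap(hdk: bytes, iv: bytes, key_material: bytes) -> bytes:
    """What the toy 'trusted application' does: XOR with the keystream."""
    return xor(key_material, keystream(hdk, iv, len(key_material)))


# The HDK never leaves the toy TA, but the caller chooses the IV.
hdk = b"hardware-derived-key-stand-in!!!"
iv = b"attacker-iv!"
victim_key = b"victim secret key, 32 bytes long"
victim_blob = wrap(hdk, iv, victim_key)

# The attacker has a key of its own choice wrapped with the same IV and HDK.
known_key = b"A" * len(victim_key)
known_blob = wrap(hdk, iv, known_key)

# Same keystream on both sides: it cancels out, no HDK needed.
recovered = xor(xor(victim_blob, known_blob), known_key)
```

Here `recovered` equals `victim_key`, although the attacker never saw the HDK.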

In S20 and S21, the key derivation function adds some randomness for each new HDK.  The attacker can no longer generate the same HDK.  Unfortunately, the S20 and S21 TA still contains the old derivation function.  The researchers found a way to downgrade to the S9 HDK.  Once more, game over!

Lessons:

  1. Never reuse an IV with a stream cipher.  Do not trust the user to generate a new IV; do it yourself.
  2. A Trusted Execution Environment does not protect from a weak/wicked “trusted” application. 
  3. If not necessary, remove all unused software from the implementation.  You reduce the attack surface.

Reference

A. Shakevsky, E. Ronen, and A. Wool, “Trust Dies in Darkness: Shedding Light on Samsung’s TrustZone Keymaster Design,” Cryptology ePrint Archive, Report 2022/208, 2022.  Available: http://eprint.iacr.org/2022/208

The fall of Titans?

Two French security researchers, Victor Lomne and Thomas Roche, published in January an impressive 55-page report.  The report describes a successful Electro-Magnetic side-channel attack on Google’s Titan security key.  They succeeded in extracting the ECDSA private key.

The Titan security key is Google’s FIDO U2F-compliant hardware key.  It is functionally similar to Yubikeys.  Its purpose is to serve as a physical token for Two-Factor Authentication (2FA).

Mounting side-channel attacks on secure components like smart cards is “common.”  It usually assumes the attacker has samples to analyze and that the attacker can store arbitrary known secrets in the samples.  This knowledge provides some reference points during the attack.  Once the attack is fine-tuned with the samples using a known secret, it is possible to extract the target’s secret. Unfortunately, this is not true in this specific use case.  When registering, the token generates its ECDSA key pair.  The private key never leaves the token.  It is why it is not possible to back up such tokens.  Thus, it is possible to purchase Titan tokens, but not to feed an arbitrary key pair.  The researchers used an interesting methodology to overcome this issue.

They first identified the secure component used by Titan. They removed the plastic cover and identified NXP A7005.  They found out that some JavaCards have similar characteristics to the NXP A7005.  Thus, they used JavaCards using NXP P5x chips.

Using a 500µm coil with 10µm precision micromanipulators, they measured the EM signature of the ECDSA signing for both Titan and the JavaCard.  The comparison of the two EM signatures confirmed that they used the same implementation.  Thus, they concentrated their effort on the JavaCard to design the exploit.  They reverse-engineered the implementation using the EM traces to guess the calculations. They discovered a sensitive leakage and could mount a complex side-channel attack.  The document details the complexity of the attack.  With 4,000 sampled signatures for 2TB of data, they succeeded in extracting the key that they had fed to the smart card.

Then, they implemented the same attack on the Titan chip.  They increased the number of samples to 6,000 for 3TB of data.   They succeeded in extracting the private key.

How devastating is this attack?

  • The specialized equipment costs about 10K€ (about $12K). The needed skill set is high.  On the Common Criteria (CC) scale, it has a rating of 27, corresponding to attackers with moderate attack potential.  The corresponding chips are old and are no longer covered by CC certificates.
  • The attack requires the attacker to get the Titan key for several hours to collect the 6,000 samples.  It is not possible to clone it.
  • The attack requires opening the plastic casing.  The operation seems destructive.  For stealthiness, the attacker must be able to repackage the chip in a legitimate case.
  • The attacker needs to return the “borrowed” recased key to the legitimate owner. Else this owner may detect the loss and block the access.
  • This attack impacts not only the Titan token but a long list of components.

Thus, we may forecast that such an attack would be efficient only against very high-profile targets.

Conclusions

The attack is an impressive piece of work.  Reading the document gives an overview of the issues that a side-channel attack has to solve. It is extremely interesting.

Diversity of implementation across different products is a costly but secure option.

Continue to use your 2FA tokens.  It is more secure than not using them.  If you lose your 2FA token, change your accounts to use a new one as soon as possible (which should be the case, independently of this attack).

Use 2FA tokens as much as possible.

Reference

Lomne, Victor, and Thomas Roche. “A Side Journey to Titan.” NinjaLab, January 7, 2021. https://ninjalab.io/wp-content/uploads/2021/01/a_side_journey_to_titan.pdf.

My preferred papers at Black Hat 2019

I attended the briefings at Black Hat 2019.  All the presentations I attended were engaging.  Nevertheless, here is the feedback on my preferred ones.  The link gives access to the corresponding slide decks.

A Decade After Bleichenbacher ’06, RSA Signature Forgery Still Works (Sze Yiu Chau)

One of the mitigations to Bleichenbacher’s attack is that the public exponent e should be large. Unfortunately, in some standards e is still small, typically 3.  But even with larger exponents, forgery is possible due to vulnerable software.

Forgery uses the fact that many verifiers do not check for trailing garbage, parameter lengths, and padding.

He provides a list of vulnerable libraries (that are now fixed).

Lessons:

  • Check everything. No corner cutting.
  • Parsing in security should be bulletproof.  The complexity of the structures and syntax may become an issue.  Complexity is the enemy of security.
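The classic form of the forgery with e = 3 can be sketched in a few lines.  The sketch assumes a lax verifier that stops parsing after the hash and ignores the trailing bytes; `N` is an arbitrary 2048-bit odd number standing in for a real RSA modulus, because the forgery never needs the private key.  All function names are hypothetical, and this is not the code of any of the libraries from the talk.

```python
import hashlib

# ASN.1 DigestInfo prefix for SHA-256 (PKCS#1 v1.5).
DIGEST_INFO = bytes.fromhex("3031300d060960864801650304020105000420")
KEY_BYTES = 256
# Stand-in modulus (assumption): any 2048-bit odd number works for the demo.
N = (1 << 2047) | 1


def cube_root_ceil(value: int) -> int:
    """Smallest integer r such that r**3 >= value (binary search)."""
    low, high = 0, 1 << ((value.bit_length() + 2) // 3 + 1)
    while low < high:
        middle = (low + high) // 2
        if middle ** 3 >= value:
            high = middle
        else:
            low = middle + 1
    return low


def forge(message: bytes) -> int:
    """Build 00 01 FF 00 | DigestInfo | hash | garbage and take a cube root."""
    head = b"\x00\x01\xff\x00" + DIGEST_INFO + hashlib.sha256(message).digest()
    block = head + b"\x00" * (KEY_BYTES - len(head))
    # Cubing the result only disturbs the low-order garbage bytes.
    return cube_root_ceil(int.from_bytes(block, "big"))


def lax_verify(message: bytes, signature: int) -> bool:
    """Vulnerable: accepts short padding and ignores trailing bytes."""
    em = pow(signature, 3, N).to_bytes(KEY_BYTES, "big")
    if em[:2] != b"\x00\x01":
        return False
    index = 2
    while index < len(em) and em[index] == 0xFF:
        index += 1
    if index == 2 or index >= len(em) or em[index] != 0:
        return False
    expected = DIGEST_INFO + hashlib.sha256(message).digest()
    return em[index + 1:index + 1 + len(expected)] == expected


def strict_verify(message: bytes, signature: int) -> bool:
    """Rebuild the full expected block and compare every byte."""
    em = pow(signature, 3, N).to_bytes(KEY_BYTES, "big")
    tail = DIGEST_INFO + hashlib.sha256(message).digest()
    padding = b"\xff" * (KEY_BYTES - 3 - len(tail))
    return em == b"\x00\x01" + padding + b"\x00" + tail
```

The forged value passes `lax_verify` and fails `strict_verify`, which is exactly the "check everything" lesson: comparing against the complete expected encoding leaves no room for garbage.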

Lessons from 3 years of crypto and blockchain audit (Jean-Philippe Aumasson)

Jean-Philippe is a Kudelski Security expert. 

He provides a view of the most common mistakes found in deployed code.  Most are well known.  A few that I liked:

  • Weak key derivation from a password. Use a real derivation function.
  • Avoid panicking on recoverable errors; otherwise, the panic may become a potential DoS.
  • There is no way to securely erase sensitive memory in languages with garbage collection (for instance, Go).
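For the first point, a real derivation function means a deliberately slow, salted function rather than a bare hash of the password.  Here is a minimal sketch using PBKDF2 from the standard library; the iteration count is an assumption to be tuned against current guidance, and `derive_key` is a hypothetical helper.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes = None, iterations: int = 600_000):
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256.

    Returns (salt, key); the salt is not secret and is stored with the data.
    The default iteration count is an illustrative assumption.
    """
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return salt, key
```

The weak pattern seen in audits is typically something like a single unsalted `hashlib.sha256(password)` call, which an attacker can brute-force at very high speed.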

His preferred language for crypto is Rust.

The slide deck is an excellent refresher of what not to do.  Practitioners should have a look.

Breaking Encrypted Databases: Generic Attacks on Range Queries (Marie-Sarah Lacharite)

She presented how to use access pattern leakage and volume attack leakage to guess the content of the database even if encrypted.

Independently of the presented attacks, the researcher reminded us that when server-side encryption uses a common encryption key (and the same IV), it is still possible to perform a range query because the same cleartext generates the same ciphertext.  This may be a PII issue. 

There are some partial solutions to this problem:

  • Order preserving encryption solves the issue
  • Order revealing encryption is even better

Pattern leakages measure the number of returned records per request. She used PQ trees to rebuild the order of the observed answer of access pattern. For N values in the database, N log N queries were needed.

Volume leakage is easier because the attacker may just monitor the communication. For N values in the database, N² log N observed queries are needed.

Some possible mitigations:

  • Restricting query types
  • Dummy records
  • Dummy values

The two last solutions may introduce some accuracy issues if not filtered out.

Everybody be Cool, This is a Robbery! (Gabriel Campana, Jean-Baptiste Bédrune)

They studied the actual security of Hardware Security Modules (HSM).  HSMs are rarely studied because they are expensive and, if attacked, they erase their secrets.

They used the PKCS11 API.

The targeted HSM used an old version of Linux (10 years old). Furthermore, every process ran as root, and there was NO secure boot.  The researchers used fuzzing to find 14 vulnerabilities.  Exploiting a few of them, they could get access to the private root key!

Lesson:

Hardware Tamper Resistance and controlled API are not enough.  The software should assume that the enclave has been breached and be protected correspondingly. 

Breaking Samsung’s ARM TrustZone (Maxime Peterlin, Alexandre Adamski, Joffrey Guilbon)

Samsung’s TrustZone implementation works only on Samsung’s Exynos chips and not on Qualcomm’s Snapdragon.

The secure OS is Kinibi by Trustonic.

Once more, the adversaries used a fuzzer (AFL-based).

Currently, the trustlet has neither ASLR nor PIE (Position Independent Executable). They used a buffer overflow on the trustlet and a vulnerable trusted driver to go inside the TrustZone. From there, they attacked mmap to access Kinibi.

They were able to read and write memory arbitrarily. For instance, they accessed the master key both from EL3 and from EL1.  With the master key, the attacker has access to all the secrets in the TrustZone.

Lesson:

Once more, protect code within the secure enclave.  Defense in depth is critical.

Meltdown and Spectre

In January 2018, security researchers disclosed two attacks coined Meltdown and Spectre. These attacks bypass the memory isolation of modern CPUs by exploiting side-channel attacks on hardware-based optimization features of these CPUs. Thus, Meltdown and Spectre can gain arbitrary access to confidential information in the memory of the computer.

Modern CPUs, so-called superscalar processors, no longer execute instructions strictly sequentially. They implement many hardware-based optimization techniques that modify the normal instruction flow. For instance, the CPU executes multiple instructions concurrently to keep the processor’s sub-units as busy as possible (see Eben Upton’s post). Out-of-order execution speculatively executes instructions further down the instruction flow as soon as all needed resources are available. Thus, the CPU may execute an instruction before it is sure that the instruction is needed. If the CPU later determines that the instruction was not needed, it discards the corresponding results from its registers. This mechanism is sound at the architectural level but not at the microarchitectural level: the cache memory still holds the discarded results. Unfortunately, for many years, security researchers have designed side-channel attacks that leak confidential information from the cache. Similarly, modern CPUs’ branch predictors attempt to guess the future control flow and execute the instructions of the predicted path ahead of time. If the prediction is wrong, the CPU discards the “results” of the speculative instructions. Once more, this mechanism is sound architecturally. Unfortunately, the results remain in the cache memory, where cache side-channel attacks can retrieve them.

The attacks

The goal of Meltdown is to dump the kernel memory space from a user-space process. In a simplified explanation, Meltdown operates in two steps. During the first step, Meltdown entices the CPU to access the kernel space through out-of-order instructions. When the instruction flow reaches this execution point, it detects the violation and triggers an exception handling that blocks actual access to the kernel space. During the second step, Meltdown uses covert-channel cache attacks to retrieve the cached “inaccessible” data. Intel memory management maps privileged kernel memory in the user-space. Thus, kernel memory becomes accessible. The usual security assumption is that kernel memory is secure and not accessible on a computer without root access. Meltdown breaks the hardware-enforced isolation between kernel space and user-space.
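The two steps can be illustrated with a toy model.  This is in no way an exploit: `ToyCPU` is a hypothetical simulation in which the cache is a Python set and the access times are two made-up constants.  It only shows why a value that is discarded architecturally can still be recovered from the cache state.

```python
FAST, SLOW = 1, 100  # made-up access times for cached and uncached lines


class ToyCPU:
    """Toy model: architectural protection works, the cache still leaks."""

    def __init__(self, kernel_memory: bytes):
        self._kernel = kernel_memory
        self._cached_lines = set()

    def flush_probe_array(self) -> None:
        self._cached_lines.clear()

    def probe_time(self, line: int) -> int:
        return FAST if line in self._cached_lines else SLOW

    def user_load_kernel_byte(self, address: int) -> None:
        # Step 1: executed out of order, before the permission check retires.
        value = self._kernel[address]
        # A dependent access touches probe line number `value`.
        self._cached_lines.add(value)
        # The check finally fires: the value never reaches the caller.
        raise PermissionError("user access to kernel memory")


def leak_byte(cpu: ToyCPU, address: int) -> int:
    """Step 2: recover the byte from which probe line became fast."""
    cpu.flush_probe_array()
    try:
        cpu.user_load_kernel_byte(address)
    except PermissionError:
        pass  # the attacker simply handles the exception
    timings = [cpu.probe_time(line) for line in range(256)]
    return timings.index(min(timings))


cpu = ToyCPU(b"kernel secret")
leaked = bytes(leak_byte(cpu, address) for address in range(13))
```

In the model, `leaked` equals the protected content even though every read raised an exception.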

Meltdown may affect any CPU using out-of-order execution and is OS-independent. Meltdown has been successfully tested on Intel x86, Intel XEON processors, and ARM Cortex A57. Meltdown was also mounted successfully on cloud containers, such as Docker. The software countermeasures use KAISER. KAISER is a software patch that prevents the mapping of kernel memory into the user space, thus thwarting Meltdown. The KAISER patch is available for Windows 10, Linux, macOS and iOS.

The goal of Spectre is to reach information from another process. Spectre exploits branch prediction and speculative execution. It operates in three steps. During the first step, Spectre mistrains the branch predictor by repeatedly executing a given branching. During the second step, Spectre entices the branch predictor to mispredict the control flow. The CPU then executes the speculative code that should perform the “illegal” operations, such as reading unauthorized memory. As in Meltdown, the third step exfiltrates the cached data using a covert-channel cache attack. Spectre accesses from a given user-space the memory of another user-space. Spectre breaks the hardware-enforced isolation between processes.

Spectre has been successfully implemented on recent Intel processors, AMD Ryzen, AMD FX, and AMD PRO. Spectre was implemented on Windows and Linux-based OS. It was written in C and also in JavaScript. The countermeasure would be to halt speculative execution on sensitive execution paths. This is a difficult task as the current instruction set is not fit for that purpose. The alternative solution is to implement in the code mechanisms that reduce the impact of the leaked information (for instance, combining conditional select and conditional move). In other words, developers must be aware of the covert-channel cache attack and implement adequate countermeasures. Compilers may also implement some tricks.

As Spectre can be mounted with JavaScript, malicious adware may become the first exploits using Spectre in the field. Thus, browsers are receiving patches to mitigate the risk. The exploitability via JavaScript is worrying.

Google’s Project Zero released concurrently three vulnerabilities, coined variant 1 to 3. These three vulnerabilities are identical to Meltdown and Spectre. Variant 1 and 2 correspond to Spectre whereas variant 3 maps to Meltdown.

Conclusion

Meltdown and Spectre are not due to bugs. They are the consequences of a new breed of side-channel attacks exploiting information leaking at the microarchitectural level for speed optimization.

It is interesting to notice that Paul Kocher is one of the researchers disclosing Meltdown and Spectre. In 1996, Paul designed the first side channel attack. His attack disrupted the security of smart cards. Since 1996, side-channel attacks have been among the most prolific, complex fields of research in security.

We want/need the CPUs to be faster. Thus, silicon designers added these optimization features to go faster. Unfortunately, most trivial countermeasures would defeat the benefit. For instance, cache attacks may be defeated by randomizing or equalizing the access time, which would annihilate the purpose of the cache. New hardware architectures, as well as new instruction sets, will help to defend. Nevertheless, we have a new class of side-channel attacks to take into account. No doubt that variants will soon flourish.