Towards Understanding Google Titan

1 Introduction

Recently, attacks on highly privileged software (e.g., BIOS and UEFI firmware) have been identified [3, 2]. Attackers are able to compromise the boot firmware, remotely or locally, and install persistent rootkits. To defend against such powerful low-level attacks, Google has designed and built Titan, a trusted chip that hardens the Google Cloud Platform (GCP).

Titan was first introduced at Google Cloud Next 2017. It is a secure, low-power chip purpose-built by Google to establish a hardware root of trust for both machines and peripherals on cloud infrastructure, allowing legitimate access to be securely identified and authenticated at the hardware level. Titan not only provides private key storage and management, secure boot and a hardware-based root of trust, but also offers two important additional security properties: remediation and first-instruction integrity. Remediation allows trust to be re-established after bugs in Titan firmware are found and patched, and first-instruction integrity allows us to identify the earliest code that runs on each machine’s startup cycle.

In the rest of this article, we do our best to work out how Google Titan operates, although few documents describe its design in detail. We also compare Titan with the TPM, which provides similar functionality.

2 Titan vs. TPM

2.1 Topological Location

According to Google’s online documents [5, 6], Titan communicates with the main CPU via the Serial Peripheral Interface (SPI) bus, and interposes between the first privileged component, e.g., the Baseboard Management Controller (BMC) or Platform Controller Hub (PCH), and that component’s boot firmware flash (e.g., the BIOS flash), allowing Titan to observe every byte of boot firmware. In addition, Titan is able to gate PCH/BMC access to the boot firmware flash until after it has verified the flash content. Thus, Titan should be located on the SPI bus, between the boot firmware flash and the PCH/BMC. The TPM is usually connected via the LPC bus, although SPI is sometimes used as well [7, 8]. The topological locations of Titan and the TPM (shown in Figure 1) make it clear why Titan is able to hold the machine in reset while the TPM cannot.

Figure 1: The locations of Titan and the TPM on a typical Intel system. Titan is connected via the SPI bus, while the TPM is usually attached to the LPC bus.

2.2 Hardware Features

The TPM, as a trusted hardware module, comprises the following components:

An execution engine (processor),
A cryptographic co-processor,
A hardware random number generator,
A non-volatile memory,
Some accessible Platform Configuration Registers (PCRs).

Similarly, in order to provide secure boot, remediation and first-instruction integrity, Titan is designed as a chip that comprises the following hardware components:

A secure application processor,
A cryptographic co-processor,
A hardware random number generator,
An embedded static RAM (SRAM),
An embedded flash (non-volatile memory),
A read-only memory block,
A monotonic counter.

Based on the above observations, Titan and the TPM have many similar hardware features (i.e., application processor, cryptographic co-processor, random number generator, non-volatile memory, monotonic counter), which mainly serve key storage and management, secure boot and the hardware root of trust, and defend against rollback attacks. The differences in hardware features are limited. One difference is that Titan provides SRAM that works together with an application processor that seems to be more powerful, for better performance. Another major difference is that Titan is able to provide proactive measurement (Titan proactively measures the BIOS/UEFI code), while the TPM only has a passive working mode (it never proactively measures the outside environment).
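To make the passive working mode concrete, the following is a minimal Python sketch of the PCR "extend" operation, in which the TPM only folds in digests that host software chooses to send: the new PCR value is the hash of the old value concatenated with the measurement. The stage names and the choice of a SHA-256 bank are assumptions made for illustration, not details from any Titan or TPM document cited here.

```python
import hashlib

PCR_SIZE = 32  # assumption: a SHA-256 PCR bank


def measure(component: bytes) -> bytes:
    """Digest computed by host software (not by the TPM) over the next stage."""
    return hashlib.sha256(component).digest()


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold a measurement into a PCR: new = H(old || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


# The PCR starts at all zeros after reset; each boot stage measures the
# next one and hands the digest to the TPM. The stage contents below are
# hypothetical placeholders.
pcr = bytes(PCR_SIZE)
for stage in (b"bios image", b"boot loader", b"os kernel"):
    pcr = pcr_extend(pcr, measure(stage))
```

Because the TPM never reads the firmware itself, the first measurement is only as trustworthy as the code that takes it, which is the gap Titan’s proactive measurement is meant to close.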

2.3 Security Features

2.3.1 Secure Boot

The secure boot sequence for a host machine with Titan is as follows, starting from power up/reset:

1. Titan holds the machine in reset.
2. Titan’s application processor executes code from its embedded read-only memory (boot ROM).
3. Titan runs a Memory Built-In Self-Test. This step is to ensure that: (a) all memory (including boot ROM) has not been tampered with; (b) the firmware in the on-chip flash has not been tampered with.
4. The code from boot ROM loads and executes the verified firmware from the on-chip flash.
5. Titan’s firmware verifies the host’s boot firmware flash, e.g., BIOS/UEFI.
6. Titan signals readiness to release the rest of the machine from reset.
7. The CPU then loads the basic firmware (BIOS or UEFI) from the boot firmware flash, which performs further hardware/software configuration.
8. Once the machine is sufficiently configured, the boot firmware accesses the “boot sector” on the machine’s persistent storage, and loads a special program called the “boot loader” into the system memory.
9. The boot firmware then passes execution control to the boot loader, which loads the initial OS image from storage into system memory and passes execution control to the operating system.
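Steps 1 to 6 above can be sketched as a short chain of checks. This is only an illustration under stated assumptions: the class name, the digests stored in it and the plain SHA-256 comparison are hypothetical, and a real chip would presumably verify signatures rather than compare against fixed digests.

```python
import hashlib
import hmac


def digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()


class TitanBootSketch:
    """Hypothetical model of steps 1-6; not Titan's actual implementation."""

    def __init__(self, expected_titan_fw: str, expected_host_fw: str):
        self.expected_titan_fw = expected_titan_fw  # assumed known to boot ROM
        self.expected_host_fw = expected_host_fw    # assumed known to Titan firmware
        self.host_in_reset = True                   # step 1: host is held in reset

    def boot(self, titan_fw: bytes, host_flash: bytes) -> bool:
        # Steps 2-4: boot ROM code checks the on-chip firmware before running it.
        if not hmac.compare_digest(digest(titan_fw), self.expected_titan_fw):
            return False
        # Step 5: the verified firmware checks the host's boot firmware flash.
        if not hmac.compare_digest(digest(host_flash), self.expected_host_fw):
            return False
        # Step 6: only now is the rest of the machine released from reset.
        self.host_in_reset = False
        return True
```

The point of the ordering is that the host stays in reset on every failure path, so unverified boot firmware never gets to run.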

TPM-based secure boot assumes that the BIOS, as the root of trust in ROM, is secure. Thus, the TPM-based boot sequence skips the first 6 steps and starts from Step 7 of the Titan-based secure boot. It is noteworthy that many existing attacks indicate that this assumption is not reliable in practice [3, 2]. Fortunately, the TPM together with Intel TXT [4] or AMD SVM [1] is able to provide a dynamic root of trust. This technology allows the system to build up a root of trust in Step 9, without trusting any of the previous steps.

2.3.2 Remediation

Remediation is based on strong (cryptographic) identity. To provide strong identity, the Titan chip manufacturing process generates unique keying material for each chip. At the same time, Google securely stores this material, along with provenance information, in a registry database. This means that Google is able to identify each Titan chip by using the unique keying material and the registry database. The contents of this database are cryptographically protected using keys maintained in an offline quorum-based Titan Certification Authority (CA). Individual Titan chips can generate Certificate Signing Requests (CSRs) directed at the Titan CA, which, under the direction of a quorum of Titan identity administrators, can verify the authenticity of the CSRs using the information in the registry database before issuing identity certificates.
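The registry-backed CSR check described above might look roughly like the following Python sketch. Several things here are assumptions for illustration only: the class and function names are hypothetical, an HMAC over a shared per-chip key stands in for the public-key signature a real CSR would carry, and the quorum is reduced to counting approving administrators.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Set, Tuple


def make_csr(chip_id: str, key: bytes, subject: str) -> Tuple[bytes, bytes]:
    """Chip side: build a request body and authenticate it with chip keying material."""
    body = f"{chip_id}|{subject}".encode()
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body, tag


class TitanCASketch:
    """Hypothetical CA: checks a CSR against the manufacturing registry."""

    def __init__(self, registry: Dict[str, bytes], quorum: int):
        self.registry = registry  # chip_id -> keying material recorded at manufacture
        self.quorum = quorum      # number of administrators who must approve

    def issue(self, chip_id: str, body: bytes, tag: bytes, approvals: Set[str]) -> bool:
        key = self.registry.get(chip_id)
        if key is None or len(approvals) < self.quorum:
            return False  # unknown chip, or not enough administrators
        if not body.startswith(chip_id.encode() + b"|"):
            return False  # request does not name the chip it claims to come from
        expected = hmac.new(key, body, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


# Usage with made-up identifiers:
chip_key = secrets.token_bytes(32)
ca = TitanCASketch({"chip-1": chip_key}, quorum=2)
csr_body, csr_tag = make_csr("chip-1", chip_key, "machine-42")
issued = ca.issue("chip-1", csr_body, csr_tag, {"admin-a", "admin-b"})
```

The sketch shows why the registry matters: a request is only honored if it can be tied back to keying material recorded when the chip was manufactured.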

The Titan-based identity system is able to provide several interesting security features, which are listed as follows:

1. Enable back-end systems to securely provision secrets and keys to individual Titan-enabled machines, or jobs running on those machines.
2. Verify the firmware running on the chips, as the code identity of the firmware is hashed into the on-chip key hierarchy.
3. Fix bugs in Titan firmware, and issue certificates that can only be wielded by patched Titan chips.
4. Chain and sign critical audit logs, making those logs tamper-evident.

To offer tamper-evident logging capabilities, Titan cryptographically associates the log messages with successive values of a secure monotonic counter maintained by Titan, and signs these associations with its private key. This binding of log messages with secure monotonic counter values ensures that audit logs cannot be altered or deleted without detection, even by insiders with root access to the relevant machine.
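The tamper-evident logging idea can be sketched in a few lines of Python. This is an illustration under assumptions, not Titan’s actual log format: an HMAC stands in for the private-key signature, each entry is additionally chained to the previous tag, and the verifier is assumed to learn the current counter value from the chip so that truncation at the tail is also detectable.

```python
import hashlib
import hmac
from typing import List, Tuple

Entry = Tuple[int, str, bytes]  # (counter value, message, tag)
GENESIS = bytes(32)


def _tag(key: bytes, prev: bytes, counter: int, message: str) -> bytes:
    payload = prev + counter.to_bytes(8, "big") + message.encode()
    return hmac.new(key, payload, hashlib.sha256).digest()


class AuditLogSketch:
    """Hypothetical log that binds each message to a monotonic counter value."""

    def __init__(self, key: bytes):
        self._key = key
        self.counter = 0      # stands in for the secure monotonic counter
        self._prev = GENESIS
        self.entries: List[Entry] = []

    def append(self, message: str) -> None:
        self.counter += 1     # the counter only ever moves forward
        tag = _tag(self._key, self._prev, self.counter, message)
        self.entries.append((self.counter, message, tag))
        self._prev = tag


def verify(entries: List[Entry], key: bytes, final_counter: int) -> bool:
    """Check counter continuity, every tag, and that no tail entries are missing."""
    prev = GENESIS
    for expected, (counter, message, tag) in enumerate(entries, start=1):
        if counter != expected:
            return False      # gap or reordering in the counter sequence
        if not hmac.compare_digest(_tag(key, prev, counter, message), tag):
            return False      # message or chain was altered
        prev = tag
    return len(entries) == final_counter
```

Editing a message breaks its tag, deleting a middle entry breaks the counter sequence, and dropping the newest entries is caught by comparing against the counter value reported by the chip.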

The TPM does not provide the identity-based remediation feature, although each TPM has a unique identity, e.g., the Endorsement Key (EK). On the contrary, TPM-based security services aim to anonymize the identity of the TPM to provide better privacy protection.

2.3.3 Remote Attestation

The TPM is able to provide remote attestation. There is no direct evidence or documentation that explicitly describes Titan-based remote attestation. However, based on the existing hardware and software features (e.g., tamper-evident logging and a sophisticated key hierarchy), I believe it would not be hard to extend Titan to provide remote attestation.

2.3.4 First-instruction Integrity

The first instruction in First-instruction Integrity (FiI) is the first instruction of the boot firmware, i.e., BIOS/UEFI, not the first instruction of Titan’s code. TPM does not support this feature. To achieve FiI, Titan should satisfy the following requirements:

1. R1: Titan’s code executes first, before any code in BIOS/UEFI.
2. R2: Titan’s code should be trusted or verified.
3. R3: BIOS/UEFI’s code should be verified before it gains control of the CPU.
4. R4: The execution transfer from Titan to the boot firmware flash should be non-interruptible.

Both Titan and the boot firmware flash are on the SPI bus, and Titan is located between the BMC/PCH and the boot firmware flash (Figure 1). This topological advantage allows Titan to gate PCH/BMC access to the boot firmware flash and execute its own code before the BIOS/UEFI code (achieving R1). It is noteworthy that this conclusion for R1 rests on an implicit assumption: no hardware sits between Titan and the PCH/BMC that could gate PCH/BMC access to Titan. An insider attacker may physically violate this assumption.

The code in Titan is in two parts: (1) C1: code in the read-only boot ROM, and (2) C2: code in the on-chip flash. Code C1 is implicitly trusted, as the boot ROM is physically protected (inside the Titan chip) and read-only. Code C2 is cryptographically verified by C1 every time the chip boots (achieving R2).

The topological location allows Titan’s code to access the code on the boot firmware flash. Thus, after achieving R2, Titan’s code is also able to verify the BIOS/UEFI code (achieving R3).

No document discusses the R4 requirement. It seems that this requirement is implicitly achieved by Titan and its topological location. It is noteworthy that any attack that is able to break R4 can break the FiI property.

Overall, these four requirements (R1, R2, R3 and R4) are explicitly or implicitly achieved through the combination of hardware and software design.
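The gating behaviour behind R1 and R3 can be pictured with a small Python sketch of an interposer that refuses host reads until the flash content has been verified. The class name, the in-memory flash and the digest comparison are all assumptions made for illustration; R4 (the non-interruptible hand-over) is not modeled here.

```python
import hashlib
import hmac


class FlashGateSketch:
    """Hypothetical interposer sitting between the PCH/BMC and the boot flash."""

    def __init__(self, flash: bytes, expected_digest: bytes):
        self._flash = flash
        self._expected = expected_digest  # assumed to be held by Titan firmware
        self._verified = False            # gate is closed at power up

    def verify(self) -> bool:
        # R3: the boot firmware is checked before the host may fetch any of it.
        actual = hashlib.sha256(self._flash).digest()
        self._verified = hmac.compare_digest(actual, self._expected)
        return self._verified

    def host_read(self, offset: int, length: int) -> bytes:
        # R1: no host access, and hence no BIOS/UEFI instruction, before Titan has run.
        if not self._verified:
            raise PermissionError("boot flash is gated until verification succeeds")
        return self._flash[offset:offset + length]
```

In this picture the first instruction the host ever fetches comes from flash content that has already passed verification, which is the essence of FiI.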


3 Conclusions

In this article, we compared Titan and the TPM from three aspects: (1) topological location, (2) hardware features and (3) security features. Putting all comparisons together, we get Table 1. Overall, Titan not only provides almost all TPM features (remote attestation is unclear), but also provides remediation and FiI. In addition, an essential difference is that Titan is able to provide proactive measurement while the TPM cannot.

References

[1] AMD. AMD64 architecture programmer’s manual volume 2: System programming. March 2017.
[2] Furtak Andrew, Gorobets Mikhail, Bazhaniuk Oleksandr, and Bulygin Yuriy. Firmware is the new black - analyzing past three years of BIOS/UEFI security vulnerabilities. July 2017.
[3] Monroe Bruce, Branco Rodrigo, and Zimmer Vincent. Firmware is the new black - analyzing past three years of BIOS/UEFI security vulnerabilities. July 2017.
[4] Intel. Intel 64 and IA-32 architectures software developer’s manual combined volumes: 1, 2a, 2b, 2c, 2d, 3a, 3b, 3c, 3d and 4. March 2017.
[5] Google Cloud team. Google infrastructure security design overview. January 2017.
[6] Google Cloud Security team. Titan in depth: Security in plaintext. August 2017.
[7] Allan Tomlinson. Introduction to the TPM, pages 155–172. Springer US, Boston, MA, 2008.
[8] Wikipedia. Trusted platform module. September 2017.

This article is reprinted from Baidu Security Laboratory.

Harry Ren
