
The CXL Arbitrator & Multiplexer in a Nutshell

CXL (Compute Express Link) is an emerging interconnect technology for high-bandwidth devices such as accelerators and GPUs. With the growing need for High-Performance Computing (HPC), CXL offers high-bandwidth, low-latency connectivity between a Host (typically a CPU) and Devices such as accelerators and memory expansion devices.

 

CXL leverages the existing PCIe 5.0 Physical Layer infrastructure and the PCIe alternate protocol negotiation process with some added advancements to support data transfer from multiple protocols.

 

CXL introduces a new component, the Arbitrator and Multiplexer (Arb-Mux), to facilitate reuse of the legacy PCIe Physical Layer. The Arb-Mux dynamically multiplexes data coming from multiple protocols (CXL.IO and CXL.Cache-Mem) and routes it to the Physical Layer. This approach helps the industry transition to CXL and take advantage of its new capabilities without major updates to the Physical Layer, which is one of the most complex components to design.

 

(Snippet showing placement of Arb-Mux in a CXL Stack)

 

Following are some salient features of the Arb – Mux:

 

  1. Used for sharing the same physical layer with multiple link layers.
  2. Ability to multiplex traffic from multiple protocols.
  3. Provides arbitration (Tx) and data steering (Rx) of the CXL.IO & CXL.Cache-Mem flits.
  4. Supports an Arb-Mux bypass feature if the link is to be accessed in PCIe-only mode.
  5. Virtual Link State Machines (vLSMs) help each link layer stay in sync with the state of the link.
  6. Virtualized Active/PM states per link layer allow one protocol to be in a PM state while the other is in the Active state.
  7. Status synchronization keeps handshakes robust across multiple LTSSM transitions to Recovery.

 

VIRTUAL LINK STATE MACHINE (vLSM):

 

  • Maintains a Virtual Link State Machine (vLSM) for each CXL link layer.
  • Coordinates state transitions with the remote Arb/Mux through link management packets (ALMPs), which feed into the Arbiter on Tx and are consumed on Rx.
  • Determines the link state request for the Flex Bus Physical Layer.
  • As of CXL 2.0, CXL.IO is the minimum requirement for an Arb – Mux link to operate.
  • Below is an overview of the vLSM states and their corresponding LTSSM states, where applicable.

 

For a detailed description of all vLSM state transitions, refer to the CXL 2.0 base specification.

ARB – MUX LINK MANAGEMENT PACKETs (ALMP):

 

  • The ARB – MUX uses ALMPs to communicate virtual link state transition requests and status.
  • An ALMP is a 1 DW (4-byte) packet that is replicated 4 times across the lower 16 bytes of a 528-bit flit to provide data integrity protection.
  • ALMPs fall into two types:
    1. Request ALMPs, used to initiate a vLSM Active/PM state transition.
    2. State Status ALMPs, used to communicate the current state to the remote partner.
  • Any ALMP error, or reception of an unexpected ALMP, results in an LTSSM transition to Recovery.
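
As a rough illustration of the replication check described above, the Python sketch below packs a 1 DW ALMP into a flit and verifies it on receipt. The byte layout (four back-to-back copies in bytes 0 to 15, zero padding elsewhere), the function names, and the example payload are assumptions made for illustration; the CXL specification defines the actual ALMP format.

```python
from typing import Optional

ALMP_BYTES = 4            # 1 DW
COPIES = 4                # replicated four times -> 16 bytes
FLIT_BYTES = 528 // 8     # 66 bytes in a 528-bit flit


def pack_almp(almp: bytes) -> bytes:
    """Place four copies of the ALMP in the lower 16 bytes, zero-pad the rest."""
    if len(almp) != ALMP_BYTES:
        raise ValueError("an ALMP is exactly 1 DW (4 bytes)")
    replicated = almp * COPIES
    return replicated + bytes(FLIT_BYTES - len(replicated))


def unpack_almp(flit: bytes) -> Optional[bytes]:
    """Return the ALMP if all four copies agree, else None (an ALMP error)."""
    if len(flit) != FLIT_BYTES:
        raise ValueError("expected a 528-bit flit")
    copies = [flit[i * ALMP_BYTES:(i + 1) * ALMP_BYTES] for i in range(COPIES)]
    return copies[0] if all(c == copies[0] for c in copies) else None


flit = pack_almp(b"\x01\x02\x03\x04")           # placeholder payload, not a real encoding
corrupted = bytes([flit[0] ^ 0xFF]) + flit[1:]  # flip bits in the first copy only
```

In this sketch, `unpack_almp(flit)` returns the original 4 bytes, while `unpack_almp(corrupted)` returns `None`, the case a receiver would treat as an ALMP error and escalate to Recovery.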

 

ALMP HANDSHAKEs:

 

Since PM and Active are virtualized states, the following ALMP handshakes ensure that Tx and Rx stay in sync with the remote partner's vLSM state.

 

  1. Entry/Transit to Active State:
    • The first entry to Active must be initiated by the DP (Downstream Port); subsequent transitions can be initiated by either the DP or the UP (Upstream Port).
    • Requires Active Request ALMPs to be sent and received by both sides.
    • An Active Status ALMP is returned once the port is ready to receive flits.
    • Two pairs of Active Request and State Status ALMPs are seen on the link before both ports' vLSMs move to Active (one for the DP's request to the UP, and vice versa).

 

(Snippet showing Entry to Active state)
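
The Active entry handshake described above can be sketched as a small Python model. The `Vlsm` class, the `"Reset"` starting state, the message labels, and the strict request-then-status ordering are all simplifying assumptions for illustration; real ports may interleave the two pairs differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vlsm:
    role: str                       # "DP" or "UP"
    state: str = "Reset"            # assumed starting state for this sketch
    status_sent: bool = False       # this port has signalled it can receive flits
    status_received: bool = False   # the partner has answered this port's request


def enter_active(initiator: Vlsm, partner: Vlsm,
                 first_entry: bool = True) -> List[Tuple[str, str]]:
    """Run the two Request/Status pairs and return the ALMPs seen on the link."""
    if first_entry and initiator.role != "DP":
        raise ValueError("the first entry to Active must be initiated by the DP")
    wire: List[Tuple[str, str]] = []
    for requester, responder in ((initiator, partner), (partner, initiator)):
        wire.append((requester.role, "Request:Active"))
        # The responder answers only once it is ready to receive flits.
        wire.append((responder.role, "Status:Active"))
        responder.status_sent = True
        requester.status_received = True
    for port in (initiator, partner):
        # A vLSM moves to Active after it has both answered and been answered.
        if port.status_sent and port.status_received:
            port.state = "Active"
    return wire


dp, up = Vlsm("DP"), Vlsm("UP")
wire = enter_active(dp, up)
```

After the call, `wire` holds four ALMPs (two Request/Status pairs) and both `dp.state` and `up.state` read `"Active"`. Calling `enter_active(up, dp)` on a first entry raises `ValueError`, mirroring the DP-only rule.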

 

  2. Transit to L1.x state:
    • Can only be initiated by the UP, by sending a PM Request ALMP.
    • If the DP is ready to transition, it moves its vLSM to the requested PM state and responds with a PM State Status ALMP.
    • The UP can transition its vLSM after receiving the PM State Status ALMP.

 

(Snippet showing entry to L1 PM state)
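
A minimal Python sketch of the PM entry handshake above follows. The function name, the `"L1.x"` label, and the behaviour when the DP is not ready (the sketch simply sends no status and leaves both states unchanged) are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def enter_pm(states: Dict[str, str], initiator: str, dp_ready: bool,
             target: str = "L1.x") -> List[Tuple[str, str]]:
    """Model a PM entry request; `states` maps "DP"/"UP" to vLSM states."""
    if initiator != "UP":
        raise ValueError("PM entry can only be initiated by the UP")
    wire: List[Tuple[str, str]] = [("UP", f"Request:{target}")]
    if not dp_ready:
        # Sketch assumption: a DP that is not ready sends no status,
        # so neither vLSM changes state.
        return wire
    states["DP"] = target                    # DP transitions first...
    wire.append(("DP", f"Status:{target}"))  # ...then reports its new state
    states["UP"] = target                    # UP follows after receiving the status
    return wire


states = {"DP": "Active", "UP": "Active"}
wire = enter_pm(states, "UP", dp_ready=True)
```

The ordering inside the function reflects the point made in the list: the UP never enters the PM state before the DP has confirmed its own transition.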

 

  3. Status Synchronization Protocol:
    • Required to keep handshakes robust across Recovery transitions of the LTSSM.
    • Status ALMPs must be sent from each vLSM (on both the DP and the UP) after the Physical Layer exits Recovery, before any Link Layer flits are transmitted.
    • The state indicated in the transmitted State Status ALMP is a snapshot of the vLSM state (that is, the vLSM state at the time the LTSSM exits Recovery).
    • The state is then resolved from the Status ALMPs sent and received during the synchronization exchange. If the resolved state is not equal to the state requested by the link layer, a new State Request and Status ALMP exchange follows.

 

(Snippet showing Status Exchange during LTSSM exit from Recovery)
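
The flow of the status synchronization exchange can be sketched in Python as below. The resolution rule is deliberately a placeholder: `placeholder_resolve` only handles the case where both snapshots match and is a hypothetical stand-in for the resolution table in the CXL specification, which is not reproduced here. All names and message labels are likewise assumptions.

```python
from typing import Callable, List, Tuple


def status_sync(local: str, remote: str, requested: str,
                resolve: Callable[[str, str], str]) -> Tuple[str, List[str]]:
    """Exchange snapshot states after Recovery; return (resolved state, ALMPs)."""
    # Both vLSMs send a State Status ALMP carrying their snapshot state.
    wire = [f"Status:{local}", f"Status:{remote}"]
    resolved = resolve(local, remote)
    if resolved != requested:
        # The link layer wants a different state, so a fresh Request/Status
        # exchange follows the synchronization exchange.
        wire += [f"Request:{requested}", f"Status:{requested}"]
    return resolved, wire


def placeholder_resolve(sent: str, received: str) -> str:
    """Hypothetical rule: only matching snapshots are handled in this sketch."""
    if sent != received:
        raise NotImplementedError("mismatch cases are defined by the specification")
    return sent


resolved, wire = status_sync("Reset", "Reset", "Active", placeholder_resolve)
same_resolved, same_wire = status_sync("Active", "Active", "Active", placeholder_resolve)
```

In the first call the resolved state differs from what the link layer requested, so the returned list includes a follow-up Request/Status pair; in the second call the exchange ends after the two Status ALMPs.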

 

ARBITRATION POLICY:

 

Arb – Mux provides arbitration (Tx) and Data steering (Rx) of the CXL.IO and CXL.Cache-Mem flits.

The arbitration policy is a weighted round robin, with designated registers to program the relative weights associated with CXL.IO and CXL.Cache-Mem. The CXL 2.0 memory-mapped register space contains the Arb – Mux registers that define the variables controlling these weights.

Below is a pictorial representation showing how the Arb – Mux arbitrates the IO and Cache-Mem flits using the weighted round robin (WRR) method.

 

(Snippet showing Weighted Round Robin (WRR) Arbitration Policy)

 

There are two buffers: a CXL.IO flit buffer and a CXL.Cache-Mem flit buffer. Consider a situation where both buffers are full, the weight set for CXL.IO is 4, and the weight set for CXL.Cache-Mem is 2. This means 4 flits of CXL.IO will be followed by 2 flits of CXL.Cache-Mem.

With the classical WRR approach, and three flits scheduled per cycle as in this example, the 1st cycle schedules the first 3 flits from the CXL.IO flit buffer, i.e., P1, P2 and P3. The 2nd cycle schedules 1 more flit from the IO buffer, P4, which completes the CXL.IO weight of 4; the remaining 2 flits in that cycle come from the Cache-Mem buffer, i.e., C1 and C2. Arbitration continues in the same way for the rest of the flits available in the buffers.

If either buffer is empty, the Arb – Mux sends flits only from the buffer that is non-empty.
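
The worked example above can be reproduced with a short Python sketch of classical weighted round robin. The function name, the buffer contents, and the grouping into three flits per cycle are taken from or assumed for this example only; a real Arb – Mux reads its weights from registers rather than a dictionary.

```python
from collections import deque
from typing import Deque, Dict, List


def wrr_arbitrate(buffers: Dict[str, Deque[str]],
                  weights: Dict[str, int]) -> List[str]:
    """Drain the buffers in classical weighted round robin order."""
    order: List[str] = []
    while any(buffers.values()):
        for name, weight in weights.items():
            queue = buffers[name]
            # Send up to `weight` flits from this source; an empty buffer
            # gives up its turn so the other source keeps flowing.
            for _ in range(weight):
                if not queue:
                    break
                order.append(queue.popleft())
    return order


io_buf = deque(f"P{i}" for i in range(1, 9))   # P1..P8
cm_buf = deque(f"C{i}" for i in range(1, 5))   # C1..C4
schedule = wrr_arbitrate({"io": io_buf, "cachemem": cm_buf},
                         {"io": 4, "cachemem": 2})

# Group the schedule into cycles of three flits, as in the example above.
cycles = [schedule[i:i + 3] for i in range(0, len(schedule), 3)]
```

Here `cycles[0]` is `['P1', 'P2', 'P3']` and `cycles[1]` is `['P4', 'C1', 'C2']`, matching the description. Passing an empty CXL.IO buffer yields only Cache-Mem flits, which is the empty-buffer behaviour noted above.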

 

VERIFICATION CHALLENGES:

 

Since the Arb – Mux is sandwiched between multiple link layers and an existing physical layer, a bug-free implementation of this component is critical.

Below are some challenges addressed by eInfochips during Arb – Mux Verification:

  • Establishing initial traffic flow: For the first team to verify the CXL and Arb – Mux designs, a major pain point is establishing the initial traffic flow. Because the Arb – Mux is relatively new code, verifying the traffic flow becomes a critical item.
  • Transfer Interface: There are two types of interface associated with the Arb – Mux, a Data Interface and a Control Interface, each present between:
    • The Link Layer and the Arb – Mux.
    • The Arb – Mux and the Physical Layer.

    It is of utmost importance that these interfaces route valid data between the Physical and Link Layers, in both directions, during:
    • Back pressure from the Physical Layer (Phy transmitting Ordered Sets).
    • LTSSM entry to and exit from Recovery.
    • Link and Physical Layer transitions to PM states.
    • And many more.
  • Status Synchronization: Arb – Mux vLSM states must be exchanged correctly, since it is the Arb – Mux that communicates the state of the physical link to the upper layers.
  • Data Arbitration & Multiplexing: The Arb – Mux should never prioritize data from one protocol only. As seen above, the data arbitration policy is a weighted round robin. Also, when passing data from the Arb – Mux to the link layers, each flit must be routed to the valid channel, for example a CXL.IO flit to the CXL.IO link layer and a CXL.Cache-Mem flit to the CXL.Cache-Mem link layer.

 

The challenges are not limited to the items listed above. Verification is a field of surprises, with many run-time obstacles that are sometimes difficult to put into words.

 

DEALING with VERIFICATION BOTTLENECKS:

 

Dealing proficiently with the verification challenges and bottlenecks mentioned above requires hard work, perseverance, and a solid level of technical expertise. The attributes listed below show how we, as a team, were able to address those challenges.

  • More than 15 Years of Expertise in PCIe: Since CXL is built along similar lines to PCIe, knowledge of and expertise in the PCIe protocol play a vital role in CXL Verification as well.
  • Adapting to new technologies: VLSI is a competitive industry; it is a race against time for customers to bring their product(s) to market. Adapting quickly to a new technology helped us serve this and many other requirements of our customers effectively.
  • Root Causing Design Failures: We went above and beyond to aid the designers by root-causing design failures. This involved pinpointing design defects within the RTL or source code.

 

The pointers above are not limited to us; other teams can also use them to gear up their Arb – Mux verification and their CXL Verification as a whole.

 

CONCLUSION

 

VLSI is an ever-evolving industry, and it will continue to see similar advances in technology in the years to come. For the last two decades, eInfochips has made sure to embrace these new technologies and deliver on clients' expectations by providing concrete solutions and support. CXL is just another feather in the cap.

 

eInfochips provides a wide range of Verification solutions in the field of Semiconductor. Here at eInfochips, we provide end to end SoC and ASIC Design Services, IP/VIP Verification services across the entire connected product design flow, from test consulting and implementation to the end of life testing support, ensuring high product quality, operational excellence, and agility. To know more contact eInfochips.

 

About the Author

 

Vinit Sheth

Vinit Sheth works as an ASIC Design Verification Engineer at eInfochips. He has 3 years of working experience in the field of Verification and has worked on some complex protocols like CXL, PCIe, CCIX and USB. 
