PCI-Express (peripheral component interconnect express) is a high-speed serial computer expansion bus standard. Its original name was "3GIO"; it was proposed by Intel in 2001 and aims to replace the older PCI, PCI-X and AGP bus standards.
PCIe is a high-speed, point-to-point serial interconnect offering dual simplex channels and high bandwidth. Connected devices are each assigned dedicated link bandwidth rather than sharing bus bandwidth. The standard supports active-state power management, error reporting, end-to-end reliable transmission, hot swapping, quality of service (QoS), and other functions.
The 3GIO proposal was renamed "PCI Express" (or "PCIe" for short) after certification by the PCI-SIG (PCI Special Interest Group). Its main advantages are a high data transfer rate and considerable room for further development.
PCI Express comes in a range of link widths, from PCI Express ×1 up to PCI Express ×32, covering both low-speed devices and the high-speed devices expected for some time to come. The newest widely deployed interface is PCIe 3.0, with a per-lane bit rate of 8 GT/s, roughly double the usable bandwidth of the previous generation. It introduces a series of important new features, such as transmitter and receiver equalization, PLL improvements, and clock data recovery, to improve data transmission and data protection performance.
Suppliers of PCIe flash memory cards include Intel, IBM, LSI, OCZ, Samsung (planned), SanDisk, STEC, SuperTalent, and Toshiba (planned). As massive data growth pushes users toward larger, more scalable, and more powerful systems, PCIe 3.0 enables greater system design flexibility, for example when combined with recent LSI MegaRAID controllers and high-performance HBA products. As of January 2019, mainstream motherboards all support PCIe 3.0.
PCIe has many improvements over previous standards, including higher maximum system bus throughput, lower I/O pin count and smaller physical size, better scaling of bus device performance, a more detailed error detection and reporting mechanism (Advanced Error Reporting, AER), and native hot-swap functionality. More recent revisions of the PCIe standard provide hardware support for I/O virtualization.
The PCI Express electrical interface is also used in various other standards, most notably ExpressCard (an expansion card interface for notebook computers) and SATA Express (a computer storage interface).
The main upgrade in the PCI Express 2.0 specification is data transfer speed: the per-lane rate doubles from 2.5 GT/s to 5 GT/s, which means a PCI Express 2.0 ×16 interface delivers an impressive 8 GB/s of bus bandwidth per direction (1 GB/s = 8 Gbps).
The newest version usable in mainstream production personal computers is PCI-E 3.0; PCI-E 1.0 has not yet been fully retired either. The AMD RD890 chipset, released in the second quarter of 2009, was announced as the first to support the PCI-E 3.0 version. The bandwidth of 2.0 doubles that of 1.0, and the bandwidth of 3.0 doubles that of 2.0 again.
1. The key change is the increased PCI Express bus frequency: the data rate of each serial lane doubles from 2.5 Gbps to 5 Gbps, and the bandwidth doubles accordingly.
2. Better support for future high-end graphics cards: even cards drawing 225 W or 300 W can be powered entirely through PCI Express.
3. The new "I/O Virtualization" (IOV) technology allows multiple virtual machines to share PCI devices such as network cards.
4. The PCI-E cabling sub-standard allows PCIe devices to be connected to computers through standardized copper cables, at up to 2.5 Gbps per lane, which suits adding multiple network cards as I/O expansion modules for high-end servers.
5. Finally, there is the long-term plan codenamed "Geneseo", developed in cooperation with industry giants such as Intel and IBM, which allows coprocessors such as graphics processing units and encryption processors to connect more closely to the central processor.
6. Improvements to dynamic link speed and link width management, and to active state power management (ASPM).
EMC's recently updated caching strategy has consolidated the position of solid-state PCI Express storage in servers, and together with other IT vendors it will play an important role in improving the efficiency of enterprise data storage.
But whether PCI Express flash has fundamentally changed the industry, and whether it is attractive to a typical data center, is still open to discussion. Solid-state storage technology cuts both ways, and IT companies are cautious about new challenges. But no one denies the superior performance of PCIe, whether it is used as cache or as primary storage.
The main advantage of PCIe is its ability to reduce latency. A PCIe device attaches directly to the PCIe bus, bringing cache and data closer to the CPU and eliminating the overhead of traditional storage protocols; EMC believes that, under the right conditions, this can deliver far better performance than the serial-SCSI and SATA solid-state drives sold in 2008.
Conceptually, the PCI Express bus is a high-speed serial replacement for the older PCI/PCI-X bus. One of the main differences between PCI Express and the old PCI is the bus topology. PCI uses a shared parallel bus architecture, in which the PCI host and all devices share a common set of address, data, and control lines. In contrast, PCI Express is based on a point-to-point topology, with separate serial links connecting each device to the root complex (host). Due to its shared bus topology, access to the PCI bus is arbitrated (in the case of multiple masters) and limited to one master at a time, in a single direction. Furthermore, the older PCI clocking scheme limits the bus clock to the slowest peripheral on the bus (regardless of which devices are involved in the bus transaction). In contrast, a PCI Express bus link supports full-duplex communication between any two endpoints, with no inherent limitation on concurrent access across multiple endpoints.
In terms of bus protocol, PCI Express communication is encapsulated in packets. The work of packetizing and de-packetizing data and status-message traffic is handled by the transaction layer of the PCI Express port. Radical differences in electrical signaling and bus protocol require the use of a different mechanical form factor and expansion connectors (and thus new motherboards and new adapter boards); PCI slots and PCI Express slots are not interchangeable. At the software level, PCI Express preserves backward compatibility with PCI; legacy PCI system software can detect and configure newer PCI Express devices without explicit support for the PCI Express standard, though new PCI Express features are then inaccessible.
The PCI Express link between two devices can consist of anywhere from 1 to 32 lanes. In a multi-lane link, the packet data is striped across the lanes, and peak data throughput scales with the overall link width. The lane count is automatically negotiated during device initialization and can be restricted by either endpoint. For example, a single-lane PCI Express (×1) card can be inserted into a multi-lane slot (×4, ×8, etc.), and the maximum mutually supported lane count is negotiated automatically during the initialization cycle. The link can dynamically down-configure itself to use fewer lanes, providing fault tolerance in case bad or unreliable lanes are present. The PCI Express standard defines slots and connectors for multiple widths: ×1, ×4, ×8, ×12, ×16, and ×32. This allows the PCI Express bus to serve both cost-sensitive applications where high throughput is not needed and performance-critical applications such as 3D graphics, networking (10 Gigabit Ethernet or multi-port Gigabit Ethernet), and enterprise storage (SAS or Fibre Channel).
For reference, a PCI-X (133 MHz, 64-bit) device and a PCI Express 1.0 device operating over four lanes (×4) have roughly the same peak unidirectional transfer rate of about 1 GB/s (1064 MB/s versus 1000 MB/s). The PCI Express bus outperforms the PCI-X bus when multiple devices transfer data simultaneously, or when communication with a PCI Express peripheral is bidirectional.
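As a quick sanity check on the comparison above, both peak rates can be computed directly. This is a rough sketch; the function names are ours, and 250 MB/s per lane is the commonly cited PCIe 1.0 figure:

```python
# Peak-rate comparison of parallel PCI-X vs. serial PCIe 1.0 (illustrative).

def pci_x_peak_mb(clock_mhz: float, bus_bits: int) -> float:
    """Peak transfer rate of a parallel PCI-X bus in MB/s: clock * bus width."""
    return clock_mhz * (bus_bits / 8)

def pcie1_peak_mb(lanes: int, per_lane_mb: float = 250.0) -> float:
    """Peak unidirectional rate of a PCIe 1.0 link in MB/s: lanes * per-lane rate."""
    return lanes * per_lane_mb

print(pci_x_peak_mb(133, 64))   # 1064.0 MB/s for 133 MHz x 64-bit PCI-X
print(pcie1_peak_mb(4))         # 1000.0 MB/s for a x4 PCIe 1.0 link
```

Note that the PCI-X figure is shared by every device on the bus, while each PCIe device gets the full link rate in each direction.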
PCI Express devices communicate via logical connections called interconnects or links. A link is a point-to-point communication channel between two PCI Express ports allowing both of them to send and receive ordinary PCI requests (configuration, I/O, or memory read/write) and interrupts (INTx, MSI, or MSI-X). At the physical level, a link is composed of one or more lanes. Low-speed peripherals (such as an 802.11 Wi-Fi card) use a single-lane (×1) link, while a graphics adapter typically uses a much wider and faster 16-lane link.
A lane is composed of two differential signaling pairs, one for receiving data and the other for transmitting, so each lane consists of four wires or signal traces. Conceptually, each lane is used as a full-duplex byte stream, transporting packets in 8-bit "byte" format simultaneously in both directions between the link endpoints. A physical PCI Express link may contain from 1 to 32 lanes; more precisely, 1, 2, 4, 8, 12, 16, or 32 lanes. Lane counts are written with an "×" prefix (for example, "×8" denotes an eight-lane card or slot), with ×16 being the largest size in common use.
A bonded serial bus architecture was chosen over the traditional parallel bus because of the latter's inherent limitations, including half-duplex operation, excess signal count, and inherently lower bandwidth due to timing skew. Timing skew results from the separate electrical signals of a parallel interface traveling over wires of different lengths, on potentially different printed circuit board (PCB) layers, and at possibly different signal speeds. Despite being transmitted simultaneously as a single word, signals on a parallel interface have different travel durations and arrive at their destinations at different times. When the interface clock period is shorter than the largest difference between signal arrival times, recovery of the transmitted word is no longer possible. Since timing skew on a parallel bus can amount to a few nanoseconds, the resulting bandwidth is limited to a few hundred megahertz.
A serial interface does not exhibit timing skew, because there is only one differential signal in each direction within each lane, and there is no external clock signal since clocking information is embedded in the serial signal itself. As such, typical bandwidth limits on serial signals are in the multi-gigahertz range. PCI Express is one example of the general trend toward replacing parallel buses with serial interconnects; other examples include Serial ATA (SATA), USB, Serial Attached SCSI (SAS), FireWire (IEEE 1394), and RapidIO. In digital video, commonly used examples are DVI, HDMI, and DisplayPort.
The multi-lane serial design adds flexibility, since fewer lanes can be allocated to slower devices.
A PCIe card fits into a slot of its physical size or larger (with ×16 as the largest used), but may not fit into a smaller PCIe slot; for example, a ×16 card may not fit into a ×4 or ×8 slot. Some slots use open-ended sockets to permit physically longer cards and negotiate the best available electrical and logical connection.
The number of lanes actually connected to a slot may also be fewer than the number supported by the physical slot size. An example is a ×16 slot that runs at ×1, ×2, ×4, ×8, or ×16: when running a ×4 connection, only four lanes are provided. Such a slot is described as "×16 (×4 mode)", and the "×size @ ×speed" notation ("×16 @ ×4") is also common. The advantage is that such a slot can accommodate a wider range of PCIe cards without requiring motherboard hardware that supports the full transfer rate.
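The width negotiation described above amounts to training the link to the largest width both sides support. A minimal sketch with hypothetical names, assuming each side advertises the set of widths it can run at:

```python
# Hypothetical model of PCIe link-width negotiation: the card and the slot
# each advertise supported widths, and the link trains to the widest width
# common to both. (x1 support is mandatory for all devices.)

def negotiate_width(card_widths: set, slot_widths: set) -> int:
    """Return the widest mutually supported lane count."""
    common = card_widths & slot_widths
    if not common:
        raise ValueError("no mutually supported width (x1 is mandatory)")
    return max(common)

# A x16 card inserted into a "x16 (x4 mode)" slot trains to x4:
print(negotiate_width({1, 2, 4, 8, 16}, {1, 2, 4}))  # 4
```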
Cards are designed and manufactured in various sizes. For example, solid-state drives (SSDs) in PCI Express card form often use HHHL (half-height, half-length) and FHHL (full-height, half-length) to describe their physical dimensions.
All PCI Express cards may consume up to 3 A at +3.3 V (9.9 W). The amount of +12 V power and total power they may consume depends on the type of card:
×1 cards are limited to 0.5 A at +12 V (6 W) and 10 W combined.
×4 and wider cards are limited to 2.1 A at +12 V (25 W) and 25 W combined.
A full-sized ×1 card may draw up to the 25 W limit after initialization and software configuration as a "high-power device".
A full-sized ×16 graphics card may draw up to 5.5 A at +12 V (66 W) and 75 W combined after initialization and software configuration as a "high-power device".
Optional connectors add 75 W (6-pin) or 150 W (8-pin) of +12 V power, for up to 300 W total (2×75 W + 1×150 W). Some cards use two 8-pin connectors, but this has not been standardized, so such cards cannot carry the official PCI Express logo. This configuration allows 375 W in total (1×75 W + 2×150 W) and may be standardized by PCI-SIG with the PCI Express 4.0 standard. The 8-pin PCI Express connector can be confused with the EPS12V connector, which is mainly used to power SMP and multi-core systems.
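The slot and connector limits above can be summarized in a small lookup table. This is a sketch with our own names; the wattages are the specification ceilings quoted in the text, not measurements:

```python
# Power ceilings per card type (slot) and per auxiliary +12 V connector,
# as quoted in the text above.

SLOT_LIMIT_W = {
    "x1": 10,              # standard x1 card
    "x1_high_power": 25,   # full-size x1 card configured as high-power device
    "x4_and_wider": 25,
    "x16_graphics": 75,    # full-size graphics card configured as high-power
}
AUX_CONNECTOR_W = {"6pin": 75, "8pin": 150}

def max_card_power(slot: str, aux: list) -> int:
    """Total power ceiling for a card: slot limit plus auxiliary connectors."""
    return SLOT_LIMIT_W[slot] + sum(AUX_CONNECTOR_W[c] for c in aux)

# A graphics card with one 6-pin and one 8-pin auxiliary connector:
print(max_card_power("x16_graphics", ["6pin", "8pin"]))  # 300
```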
The PCI Express Mini Card (also known as Mini PCI Express, Mini PCIe, Mini PCI-E, mPCIe, and PEM), based on PCI Express, is a replacement for the Mini PCI form factor developed by PCI-SIG. The host device supports both PCI Express and USB 2.0 connectivity, and each card may use either standard. Most laptop computers built after 2005 use PCI Express for expansion cards; however, as of 2015, many vendors are moving to the newer M.2 form factor for this purpose.
Due to different sizes, PCI Express mini cards are not compatible with standard full-size PCI Express slots; however, there are passive adapters that allow them to be used in full-size slots.
1. Physical size
A full PCI Express Mini Card measures 30×50.95 mm (width × length). There is a 52-pin edge connector, consisting of two staggered rows on a 0.8 mm pitch. Each row has eight contacts, a gap equivalent to four contacts, then a further 18 contacts. Boards are 1.0 mm thick, excluding components. A "Half Mini Card" (sometimes abbreviated HMC) is also specified, having approximately half the physical length at 26.8 mm.
2. Electrical interface
PCI Express Mini Card Edge Connector provides multiple connections and buses:
PCI Express ×1 (with SMBus)
Wires to diagnostic LEDs for wireless network (i.e., Wi-Fi) status on the computer's chassis
A SIM card for GSM and WCDMA applications (UIM signals in the specification)
Future extension for another PCIe lane
1.5 V and 3.3 V power supply
3. Mini-SATA (mSATA) variant
Although sharing the Mini PCI Express form factor, the mSATA slot is not necessarily compatible with Mini PCI Express. Therefore, only certain notebooks are compatible with mSATA drives. Most compatible systems are based on Intel’s Sandy Bridge processor architecture and use the Huron River platform. Lenovo ThinkPad T, W and X series laptops released from March to April 2011 support mSATA SSD cards in the WWAN card slot. ThinkPad Edge E220s / E420s and Lenovo IdeaPad Y460 / Y560 also support mSATA.
Some laptops (notably the ASUS Eee PC, Apple MacBook Air, and Dell mini9 and mini10) use a variant of the PCI Express Mini Card as a solid-state drive. This variant uses the reserved and several non-reserved pins to implement SATA and IDE interface pass-through, keeping only USB, ground lines, and sometimes the core PCIe ×1 bus intact. This makes the "miniPCIe" flash and solid-state drives sold with netbooks largely incompatible with true PCI Express Mini implementations.
In addition, the typical Asus miniPCIe SSD is 71 mm long, which led to Dell's 51 mm model often being (incorrectly) called half-length. A true 51 mm Mini PCIe SSD was announced in 2009, with two stacked PCB layers that allow for higher storage capacity. The announced design retained the PCIe interface, making it compatible with the standard mini PCIe slot. No working product has yet been developed.
Intel has numerous desktop boards with a PCIe ×1 Mini-Card slot that typically do not support mSATA SSDs. A list of desktop boards that natively support mSATA in the PCIe ×1 Mini-Card slot (typically multiplexed with a SATA port) is provided on the Intel support site.
4. Mini PCIe v2
M.2, a new version of the Mini PCI Express form factor, replaces the mSATA standard. Computer bus interfaces provided through the M.2 connector are PCI Express 3.0 (up to four lanes), Serial ATA 3.0, and USB 3.0 (a single logical port for each of the latter two). It is up to the manufacturer of the M.2 host or device to select which interfaces to support, depending on the desired level of host support and device type.
5. PCI Express external wiring
The PCI Express External Cabling (also known as External PCI Express, Cabled PCI Express, or ePCIe) specification was released by PCI-SIG in February 2007.
Standard cables and connectors have been defined for ×1, ×4, ×8, and ×16 link widths, with a transfer rate of 250 MB/s per lane. PCI-SIG also expects the norm to evolve to reach 500 MB/s, as in PCI Express 2.0. The maximum cable length remains undetermined. An example of the uses of Cabled PCI Express is a metal enclosure containing a number of PCIe slots and PCIe-to-ePCIe adapter circuitry; this device would not be possible without the ePCIe specification.
Several other types of expansion cards come from PCIe, these include:
· Low-height card
· ExpressCard: a successor to the PC Card form factor (with ×1 PCIe and USB 2.0; hot-pluggable)
· PCI Express ExpressModule: a hot-pluggable modular form factor defined for servers and workstations
· XQD card: a PCI Express-based flash card standard by the CompactFlash Association
· XMC: similar to the CMC/PMC form factor (VITA 42.3)
· AdvancedTCA: a complement to CompactPCI for larger applications; supports serial-based backplane topologies
· AMC: a complement to the AdvancedTCA specification; supports processor and I/O modules on ATCA boards (×1, ×2, ×4 or ×8 PCIe)
· FeaturePak: a small expansion card format (43×65 mm) for embedded and small-form-factor applications, implementing two ×1 PCIe links on a high-density connector along with USB, I2C, and up to 100 points of I/O
· Universal IO: a variant from Super Micro Computer Inc designed for use in low-profile rack-mounted chassis. Its connector bracket is reversed, so it cannot fit in a normal PCI Express socket, but it is pin-compatible and may be inserted if the bracket is removed
· Thunderbolt: a variant from Intel that combines DisplayPort and PCIe protocols in a form factor compatible with Mini DisplayPort. Thunderbolt 3 also merges USB 3.1 and uses the USB Type-C form factor in place of Mini DisplayPort
· Serial Digital Video Out: some 9xx-series Intel chipsets allow an additional output for the integrated video to be placed in a PCIe slot (mostly dedicated and fully wired ×16)
· M.2 (formerly known as NGFF)
· M-PCIe brings PCIe 3.0 to mobile devices (such as tablets and smartphones) through the M-PHY physical layer
· U.2 (formerly known as SFF-8639).
During early development, PCIe was initially called HSI (High Speed Interconnect), and underwent a name change to 3GIO (third-generation I/O) before finally settling on its PCI-SIG name, PCI Express. A technical working group named the Arapahoe Work Group (AWG) drew up the standard. For the initial drafts, the AWG consisted of Intel engineers only; subsequently, the working group was expanded to include industry partners.
PCI Express is an evolving and improving technology.
As of 2013, PCI Express version 4 had been drafted, with the final specification expected in 2017. At the 2016 PCI-SIG annual developer conference and the Intel Developer Forum, Synopsys demonstrated a system running PCIe 4.0, with Mellanox providing a suitable network card.
In 2003, PCI-SIG introduced PCIe 1.0a, with a per-lane data rate of 250 MB/s and a transfer rate of 2.5 gigatransfers per second (GT/s). Transfer rate is expressed in transfers per second rather than bits per second because each transfer includes overhead bits that provide no additional throughput; PCIe 1.x uses an 8b/10b encoding scheme, costing 20% (= 2/10) of the raw lane bandwidth.
In 2005, PCI-SIG introduced PCIe 1.1. This updated specification includes clarifications and several improvements, but is fully compatible with PCI Express 1.0a. The data rate has not changed.
PCI-SIG announced the PCI Express Base 2.0 specification on January 15, 2007. The PCIe 2.0 standard doubles the transfer rate from PCIe 1.0's 2.5 GT/s to 5 GT/s, and the per-lane throughput rises from 250 MB/s to 500 MB/s. Consequently, a 32-lane PCIe connector (×32) can support an aggregate throughput of up to 16 GB/s.
PCIe 2.0 motherboard slots are fully backward compatible with PCIe v1.x cards. PCIe 2.0 cards are also generally backward compatible with PCIe 1.x motherboards, using the available bandwidth of PCI Express 1.1. Overall, graphics cards or motherboards designed for v2.0 will work with the other being v1.1 or v1.0a.
PCI-SIG also stated that PCIe 2.0 has improvements to the point-to-point data transfer protocol and its software architecture.
Intel's first chipset to support PCIe 2.0 was the X38, with boards shipping from various vendors (Abit, Asus, Gigabyte) as of October 21, 2007. AMD started supporting PCIe 2.0 with its AMD 700 chipset series, and nVidia with the MCP72. Intel's prior chipsets, including the Intel P35 chipset, supported PCIe 1.1 or 1.0a.
Like 1.x, PCIe 2.0 uses an 8b/10b encoding scheme, so its 5 GT/s raw data rate delivers an effective maximum transfer rate of 4 Gbit/s per lane.
PCI Express 2.1 (specification dated March 4, 2009) supports a large proportion of the management, support, and troubleshooting systems planned for full implementation in PCI Express 3.0; the speed, however, is the same as in PCI Express 2.0. The increase in power drawn from the slot breaks backward compatibility between PCI Express 2.1 cards and some older motherboards with 1.0/1.0a slots, but most motherboards with PCI Express 1.1 connectors are provided with manufacturer BIOS updates to support backward compatibility with PCIe 2.1 cards.
The PCI Express 3.0 Base Specification, revision 3.0, was made available in November 2010, after multiple delays. In August 2007, PCI-SIG announced that PCI Express 3.0 would carry a bit rate of 8 gigatransfers per second (GT/s) and would be backward compatible with existing PCI Express implementations. At that time, it was also announced that the final specification of PCI Express 3.0 would be delayed until the second quarter of 2010. New features of the PCI Express 3.0 specification include a number of optimizations for enhanced signaling and data integrity, including transmitter and receiver equalization, PLL improvements, clock data recovery, and channel enhancements for the currently supported topologies.
Following a six-month technical analysis of the feasibility of scaling the PCI-SIG interconnect bandwidth, PCI-SIG found that 8 gigatransfers per second can be manufactured in mainstream silicon process technology and deployed with existing low-cost materials and infrastructure, while maintaining full compatibility (with negligible impact) with the PCI Express protocol stack.
PCI Express 3.0 upgrades the encoding scheme from the previous 8b/10b encoding to 128b/130b, reducing the bandwidth overhead from 20% with PCI Express 2.0 to approximately 1.54% (= 2/130). This is achieved by a technique called "scrambling", which applies a known binary polynomial to the data stream in a feedback topology. Because the scrambling polynomial is known, the data can be recovered by running it through a feedback topology using the inverse polynomial. PCI Express 3.0's 8 GT/s bit rate effectively delivers 985 MB/s per lane, practically doubling the lane bandwidth of PCI Express 2.0.
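The per-lane figures quoted above follow directly from the line rate and the encoding overhead. A minimal sketch (the function name is ours) reproduces the 250/500/985 MB/s numbers:

```python
# Usable per-lane throughput after encoding overhead:
# PCIe 1.x/2.0 use 8b/10b (8 payload bits per 10 line bits),
# PCIe 3.0 uses 128b/130b (128 payload bits per 130 line bits).

def lane_throughput_mb(gt_per_s: float, payload_bits: int, coded_bits: int) -> float:
    """Usable MB/s per lane (1 MB = 1e6 bytes): line rate * coding efficiency / 8."""
    return gt_per_s * 1e9 * (payload_bits / coded_bits) / 8 / 1e6

print(round(lane_throughput_mb(2.5, 8, 10)))     # 250  (PCIe 1.x)
print(round(lane_throughput_mb(5.0, 8, 10)))     # 500  (PCIe 2.0)
print(round(lane_throughput_mb(8.0, 128, 130)))  # 985  (PCIe 3.0)
```

The jump from 2.0 to 3.0 thus comes from both the higher line rate (5 to 8 GT/s) and the much lower coding overhead (20% down to about 1.54%).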
On November 18, 2010, the PCI Special Interest Group officially released the completed PCI Express 3.0 specification to its members to build devices based on the new version of PCI Express.
In September 2013, the PCI Express 3.1 specification was announced for release in late 2013 or early 2014, consolidating various improvements to the published PCI Express 3.0 specification in three areas: power management, performance, and functionality. It was released in November 2014.
On November 29, 2011, PCI-SIG announced PCI Express 4.0, providing a 16 GT/s bit rate that doubles the bandwidth of PCI Express 3.0, while maintaining backward compatibility in both software support and the mechanical interface in use. The PCI Express 4.0 specification also brings OCuLink-2, an alternative to the Thunderbolt connector: OCuLink version 2 will offer up to 16 GT/s (8 GB/s total over ×4 lanes), while the maximum bandwidth of the Thunderbolt 3 connector is 5 GB/s. Additionally, active and idle power optimizations are to be investigated. The final specification was expected to be released in 2017.
In August 2016, Synopsys demonstrated a test setup running PCIe 4.0 at the Intel Developer Forum. Its IP has been licensed to several firms planning to present chips and products by the end of 2016.
In June 2018, the SD Association essentially completed the new-generation SD 7.0 standard specification, planning to announce it officially at the MWC Shanghai conference on June 26-28, 2018. The SD Express feature it defines carries data over a PCIe 3.0 link using the NVMe protocol.
Some vendors offer PCIe-over-fiber products, but these typically find use only in specific cases where transparent PCIe bridging is preferable to using a more mainstream standard (such as InfiniBand or Ethernet) that may require additional software to support it. Current implementations focus on distance rather than raw bandwidth, and typically do not implement a full ×16 link.
Thunderbolt was co-developed by Intel and Apple as a general-purpose high-speed interface combining a logical PCIe link with DisplayPort. It was originally conceived as an all-fiber interface, but due to early difficulties in creating a consumer-friendly fiber interconnect, nearly all early implementations are hybrid copper-fiber systems. A notable exception is the Sony VAIO Z VPC-Z2, which uses a nonstandard USB port with an optical component to connect to an external PCIe display adapter. Apple was the main driver of Thunderbolt adoption through 2011, though several other vendors have since announced new products and systems featuring Thunderbolt.
The mobile PCIe specification (abbreviated as M-PCIe) allows the PCI Express architecture to run on MIPI Alliance's M-PHY physical layer technology. Based on the widely adopted M-PHY and its low-power design, mobile PCIe allows PCI Express to be used in tablets and smartphones.
OCuLink (standing for "Optical-Copper Link") is an extension of Cabled PCI Express, positioned as a competitor to version 3 of the Thunderbolt interface. Version 1.0 of OCuLink, released in the autumn of 2015, supports up to PCIe 3.0 ×4 lanes (8 GT/s, 3.9 GB/s) over copper cabling; a fiber-optic version may appear in the future.
PCIe links are built around dedicated unidirectional pairs of serial (1-bit) point-to-point connections known as lanes. This contrasts sharply with the earlier PCI connection, a bus-based system in which all devices share the same bidirectional 32-bit or 64-bit parallel bus.
PCI Express is a layered protocol, consisting of a transaction layer, a data link layer, and a physical layer. The data link layer is subdivided to include a media access control (MAC) sublayer. The physical layer is subdivided into logical and electrical sublayers. The physical logical sublayer contains a physical coding sublayer (PCS). These terms are borrowed from the IEEE 802 networking protocol model.
The PCIe physical layer (PHY, PCIEPHY, PCI Express PHY, or PCIe PHY) specification is divided into two sublayers, corresponding to electrical and logical specifications. The logical sublayer is sometimes further divided into a MAC sublayer and a PCS, although this division is not formally part of the PCIe specification. The PHY Interface for PCI Express (PIPE) specification, published by Intel, defines the MAC/PCS functional partitioning and the interface between these two sublayers. The PIPE specification also identifies the physical media attachment (PMA) layer, which includes the serializer/deserializer (SerDes) and other analog circuitry; however, since SerDes implementations vary greatly among ASIC vendors, PIPE does not specify an interface between the PCS and the PMA.
At the electrical level, each lane consists of two unidirectional LVDS pairs operating at 2.5, 5, 8, or 16 Gbit/s, depending on the negotiated capabilities. Transmit and receive are separate differential pairs, for a total of four data wires per lane.
A connection between any two PCIe devices is known as a link, and is built up from a collection of one or more lanes. All devices must minimally support single-lane (×1) links. Devices may optionally support wider links composed of 2, 4, 8, 12, 16, or 32 lanes. This allows for very good compatibility in two ways:
A PCIe card fits physically (and works correctly) in any slot that is at least as large as it is (for example, a ×1-sized card will work in any sized slot);
A slot of a larger physical size (e.g., ×16) can be wired electrically with fewer lanes (e.g., ×1, ×4, ×8, or ×12), as long as it provides the ground connections required by the larger physical slot size.
In both cases, PCIe negotiates the highest mutually supported number of lanes. Many graphics cards, motherboards, and BIOS versions are verified to support ×1, ×4, ×8, and ×16 connectivity on the same connection.
Even though the two would be signal-compatible, it is not usually possible to place a physically larger PCIe card (e.g., a ×16-sized card) into a smaller slot, though if the PCIe slot is open-ended or a riser card is used, most motherboards will allow it. The width of a PCIe connector is 8.8 mm, the height 11.25 mm, and the length variable. The fixed section of the connector is 11.65 mm long and contains two rows of 11 pins each (22 pins total), while the length of the other section varies with the number of lanes. The pins are spaced at 1 mm intervals, and the thickness of the card entering the connector is 1.8 mm.
PCIe sends all control messages, including interrupts, over the same links used for data. The serial protocol can never be blocked, so latency remains comparable to that of conventional PCI, which has dedicated interrupt lines.
Data transmitted on multiple-lane links is interleaved, meaning that each successive byte is sent down successive lanes. The PCIe specification refers to this interleaving as data striping. While significant hardware complexity is required to synchronize (or deskew) the incoming striped data, striping can significantly reduce the latency of the nth byte on a link. Although the lanes are not tightly synchronized, lane-to-lane skew is limited to 20/8/6 ns for 2.5/5/8 GT/s respectively, so the hardware buffers can re-align the striped data. Due to padding requirements, striping may not necessarily reduce the latency of small data packets on a link.
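The byte-level striping described above can be sketched as a round-robin distribution over lanes. This is illustrative only; real hardware stripes data together with framing symbols and deskew logic:

```python
# Round-robin byte striping across lanes: successive bytes of a packet go
# to successive lanes, and the receiver de-stripes them back into order.

def stripe(data: bytes, lanes: int) -> list:
    """Distribute successive bytes of `data` over `lanes` per-lane streams."""
    return [data[i::lanes] for i in range(lanes)]

def destripe(striped: list) -> bytes:
    """Reassemble the original byte order from the per-lane streams."""
    out = bytearray()
    for i in range(max(len(s) for s in striped)):
        for s in striped:
            if i < len(s):
                out.append(s[i])
    return bytes(out)

pkt = b"ABCDEFGH"
parts = stripe(pkt, 4)          # [b'AE', b'BF', b'CG', b'DH']
assert destripe(parts) == pkt   # round-trips back to the original packet
```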
As with other high-data-rate serial transmission protocols, the clock is embedded in the signal. At the physical level, PCI Express 2.0 utilizes the 8b/10b encoding scheme to ensure that strings of consecutive identical digits (zeros or ones) are limited in length. This coding prevents the receiver from losing track of where the bit edges are. In this coding scheme, every eight (uncoded) payload bits of data are replaced with 10 (encoded) bits of transmit data, causing a 20% overhead in electrical bandwidth. To improve the available bandwidth, PCI Express version 3.0 instead employs 128b/130b encoding with scrambling. 128b/130b encoding relies on the scrambling to limit the run length of identical-digit strings in data streams and to ensure the receiver stays synchronized to the transmitter. It also reduces electromagnetic interference (EMI) by preventing repeating data patterns in the transmitted data stream.
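The key property of the scrambling mentioned above is that XORing the data with a known LFSR-generated keystream is its own inverse, so the receiver can descramble by running the same polynomial. The sketch below demonstrates this with a generic 16-bit LFSR; the taps and seed here are arbitrary illustrations, not the polynomials mandated by the PCIe specification:

```python
# Additive scrambling demo: data XOR keystream, applied twice, restores the
# original data. (Generic Fibonacci LFSR; NOT the actual PCIe polynomial.)

def lfsr_stream(seed: int, taps: int, nbytes: int) -> bytes:
    """Generate `nbytes` of keystream from a 16-bit Fibonacci LFSR."""
    state, out = seed, bytearray()
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            bit = bin(state & taps).count("1") & 1   # XOR of the tapped bits
            state = ((state << 1) | bit) & 0xFFFF
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

def scramble(data: bytes, seed: int = 0xFFFF, taps: int = 0x8016) -> bytes:
    """XOR the data with the LFSR keystream (self-inverse)."""
    ks = lfsr_stream(seed, taps, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

payload = b"\x00\x00\x00\x00TLP data"
assert scramble(scramble(payload)) == payload   # descrambling recovers data
assert scramble(payload) != payload             # long runs get broken up
```

Note how the all-zero prefix of the payload becomes a pseudo-random pattern after scrambling, which is exactly what limits run lengths and EMI on the wire.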
The data link layer performs three important services for the PCIe link:
sequencing the transaction layer packets (TLPs) generated by the transaction layer,
ensuring reliable delivery of TLPs between two endpoints via an acknowledgement protocol (ACK and NAK signaling) that explicitly requires replay of unacknowledged/bad TLPs, and
initializing and managing flow control credits.
On the transmit side, the data link layer generates an incrementing sequence number for each outgoing TLP. This serves as a unique identification tag for each transmitted TLP and is inserted into the header of the outgoing TLP. A 32-bit cyclic redundancy check code (known in this context as the Link CRC, or LCRC) is also appended to the end of each outgoing TLP.
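A minimal sketch of this framing, under stated assumptions: the function names are hypothetical, the 12-bit sequence number is packed into two bytes for simplicity, and `zlib.crc32` stands in for the real LCRC (which uses the same CRC-32 polynomial but different bit ordering and seeding).

```python
import struct
import zlib

def frame_tlp(seq: int, tlp: bytes) -> bytes:
    """Prepend a 12-bit sequence number and append a 32-bit CRC.
    zlib.crc32 is a stand-in for the actual PCIe LCRC computation."""
    body = struct.pack(">H", seq & 0x0FFF) + tlp
    return body + struct.pack(">I", zlib.crc32(body))

def check_tlp(frame: bytes):
    """Receiver-side validation: returns (seq, payload), or None on CRC failure
    (in which case the receiver would issue a NAK)."""
    body, lcrc = frame[:-4], frame[-4:]
    if struct.pack(">I", zlib.crc32(body)) != lcrc:
        return None
    seq = struct.unpack(">H", body[:2])[0]
    return seq, body[2:]

f = frame_tlp(5, b"payload")
assert check_tlp(f) == (5, b"payload")

# Flipping a single bit makes the CRC check fail:
corrupted = f[:-1] + bytes([f[-1] ^ 0xFF])
assert check_tlp(corrupted) is None
```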
On the receive side, the received TLP's LCRC and sequence number are validated in the link layer. If either the LCRC check fails (indicating a data error) or the sequence number is out of range (not consecutive with the last valid received TLP), then the bad TLP, as well as any TLPs received after it, are considered invalid and discarded. The receiver sends a negative acknowledgement message (NAK) carrying the sequence number of the invalid TLP, requesting retransmission of all TLPs from that sequence number onward. If the received TLP passes the LCRC check and has the correct sequence number, it is treated as valid. The link receiver increments the sequence number (which tracks the last received good TLP) and forwards the valid TLP to the receiver's transaction layer. An ACK message is sent to the remote transmitter, indicating that the TLP was successfully received (and, by implication, all TLPs with earlier sequence numbers).
If the transmitter receives a NAK message, or if no acknowledgement (NAK or ACK) arrives before a timeout period expires, the transmitter must retransmit all TLPs that lack a positive acknowledgement (ACK). Barring a persistent malfunction of the device or transmission medium, the link layer thus presents a reliable connection to the transaction layer, since the transmission protocol ensures delivery of TLPs over an unreliable medium.
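The transmitter-side replay logic described above can be sketched as a small state machine. This is an illustrative model with hypothetical names: TLPs remain in the replay buffer until acknowledged, an ACK implicitly covers all earlier sequence numbers, and sequence numbers wrap modulo 4096 (12 bits).

```python
from collections import OrderedDict

class Transmitter:
    """Minimal sketch of the data-link-layer replay protocol (illustrative)."""

    SEQ_MOD = 4096  # 12-bit sequence numbers wrap modulo 4096

    def __init__(self):
        self.next_seq = 0
        self.replay_buffer = OrderedDict()  # seq -> TLP, in transmission order

    def send(self, tlp: bytes) -> int:
        """Transmit a TLP, keeping a copy until it is ACKed."""
        seq = self.next_seq
        self.replay_buffer[seq] = tlp
        self.next_seq = (self.next_seq + 1) % self.SEQ_MOD
        return seq

    def on_ack(self, seq: int):
        """An ACK for seq implicitly acknowledges all earlier outstanding TLPs."""
        for s in list(self.replay_buffer):
            del self.replay_buffer[s]
            if s == seq:
                break

    def on_nak_or_timeout(self) -> list:
        """Replay every TLP still lacking a positive acknowledgement."""
        return list(self.replay_buffer.values())

tx = Transmitter()
for p in (b"A", b"B", b"C"):
    tx.send(p)
tx.on_ack(1)                               # acknowledges seq 0 and 1
assert tx.on_nak_or_timeout() == [b"C"]    # only seq 2 must be replayed
```

The size of `replay_buffer` is one of the two limits on TLPs in flight discussed below.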
In addition to sending and receiving TLPs generated by the transaction layer, the data link layer also generates and consumes data link layer packets (DLLPs). The ACK and NAK signals are communicated via DLLPs, as are some power management messages and flow control credit information (on behalf of the transaction layer).
In practice, the number of unacknowledged TLPs in flight on the link is limited by two factors: the size of the transmitter's replay buffer (which must store a copy of all transmitted TLPs until the remote receiver acknowledges them), and the flow control credits issued by the receiver to the transmitter. PCI Express requires all receivers to issue a minimum number of credits, to guarantee that a link allows sending PCIConfig TLPs and message TLPs.
PCI Express implements split transactions (transactions with request and response separated in time), allowing the link to carry other traffic while the target device gathers data for the response.
PCI Express uses credit-based flow control. In this scheme, a device advertises an initial amount of credit for each receive buffer in its transaction layer. The device at the opposite end of the link, when sending transactions to this device, counts the number of credits each TLP consumes from its account. The sending device may only transmit a TLP when doing so does not make its consumed credit count exceed its credit limit. When the receiving device finishes processing a TLP from its buffer, it signals a return of credits to the sending device, which increases the credit limit by the restored amount. The credit counters are modular counters, and the comparison of consumed credits to the credit limit requires modular arithmetic. The advantage of this scheme (compared to other methods such as wait states or handshake-based transfer protocols) is that the latency of credit return does not affect performance, provided the credit limit is not encountered. This assumption is generally met if each device is designed with adequate buffer sizes.
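The modular-counter bookkeeping described above can be sketched as follows. This is a simplified, hypothetical model: the 8-bit counter width and credit costs are illustrative, and real PCIe maintains separate credit pools per TLP type (posted, non-posted, completion) and per unit (headers vs. data).

```python
MOD = 1 << 8  # illustrative 8-bit modular counters

class CreditLink:
    """Transmitter-side view of credit-based flow control (illustrative)."""

    def __init__(self, initial_credits: int):
        self.credit_limit = initial_credits % MOD  # advertised by the receiver
        self.credits_consumed = 0                  # tracked by the transmitter

    def can_send(self, cost: int) -> bool:
        # Modular comparison: remaining = (limit - consumed) mod 2^8.
        remaining = (self.credit_limit - self.credits_consumed) % MOD
        return cost <= remaining

    def send(self, cost: int):
        if not self.can_send(cost):
            raise RuntimeError("would exceed credit limit; TLP must wait")
        self.credits_consumed = (self.credits_consumed + cost) % MOD

    def credits_returned(self, amount: int):
        # The receiver freed buffer space; the limit advances modularly,
        # so both counters may wrap without breaking the comparison.
        self.credit_limit = (self.credit_limit + amount) % MOD

link = CreditLink(initial_credits=4)
link.send(3)
assert not link.can_send(2)   # only 1 credit remains; TLP stalls
link.credits_returned(3)      # receiver drained its buffer
assert link.can_send(2)       # transmission may resume
```

Because only the modular difference of the two counters is compared, credit return latency costs nothing as long as the transmitter never actually runs out of credits.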
PCIe 1.x is often quoted as supporting a data rate of 250 MB/s per lane in each direction. This figure is calculated by dividing the physical signaling rate (2.5 gigabits per second) by the encoding overhead (10 bits per byte). This means a sixteen-lane (×16) PCIe card would theoretically be capable of 16 × 250 MB/s = 4 GB/s in each direction. While this is correct in terms of data bytes, a more meaningful calculation is based on the usable data payload rate, which depends on the traffic profile, itself a function of the high-level (software) application and intermediate protocol levels.
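The 250 MB/s-per-lane figure can be reproduced from the numbers in the text; the function name below is hypothetical.

```python
# Reproducing the per-lane data rate quoted above: raw signaling rate,
# divided by the line-code overhead, converted from bits to bytes.

def lane_mb_per_s(raw_gbit_s: float, payload_bits: int, coded_bits: int) -> float:
    bits_per_s = raw_gbit_s * 1e9 * payload_bits / coded_bits
    return bits_per_s / 8 / 1e6  # bits -> bytes -> MB

assert lane_mb_per_s(2.5, 8, 10) == 250.0        # PCIe 1.x, one lane
assert 16 * lane_mb_per_s(2.5, 8, 10) == 4000.0  # x16 link: 4 GB/s per direction
print(round(lane_mb_per_s(8.0, 128, 130)))       # PCIe 3.0: ~985 MB/s per lane
```

Note these are raw byte rates before TLP headers, DLLPs, and acknowledgement traffic; the payload rates discussed next are lower.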
Like other high-data-rate serial interconnect systems, PCIe has protocol and processing overhead due to its added transfer robustness (CRCs and acknowledgements). Long continuous unidirectional transfers (such as those typical of high-performance storage controllers) can approach 95% of PCIe's raw (lane) data rate. Such transfers also benefit the most from an increased number of lanes (×2, ×4, etc.). But in more typical applications (such as a USB or Ethernet controller), the traffic profile is characterized by short data packets with frequent mandatory acknowledgements. This type of traffic reduces the efficiency of the link, due to the overhead of packet parsing and forced interrupts (whether in the device's host interface or the PC's CPU). As a protocol for devices connected to the same printed circuit board, PCIe does not require the same tolerance for transmission errors as a protocol for long-distance communication, so this loss of efficiency is not specific to PCIe.
PCI Express operates in consumer, server, and industrial applications as a motherboard-level interconnect (linking motherboard-mounted peripherals), as a passive backplane interconnect, and as an expansion card interface for add-in boards.
In virtually all modern (as of 2012) PCs, from consumer laptops and desktops to enterprise data servers, the PCIe bus serves as the primary motherboard-level interconnect, connecting the host system processor with both integrated peripherals (surface-mounted ICs) and add-on peripherals (expansion cards). In most of these systems, the PCIe bus coexists with one or more legacy PCI buses for backward compatibility with the large body of legacy PCI peripherals.
As of 2013, PCI Express has replaced AGP as the default interface for graphics cards on new systems. Almost all graphics cards released since 2010 by AMD (ATI) and Nvidia use PCI Express. Nvidia uses PCIe's high-bandwidth data transfer for its Scalable Link Interface (SLI) technology, which allows multiple graphics cards of the same chipset and model number to run in tandem for higher performance. AMD has also developed a multi-GPU system based on PCIe called CrossFire. AMD and Nvidia have released motherboard chipsets that support as many as four PCIe ×16 slots, allowing three- and four-GPU card configurations.
Theoretically, external PCIe could give a laptop the graphics power of a desktop by connecting it to any PCIe desktop graphics card (enclosed in its own external housing with its own power supply and cooling), possibly over an ExpressCard or Thunderbolt interface. The ExpressCard interface provides a bit rate of 5 Gbit/s (0.5 GB/s throughput), whereas the Thunderbolt interface provides bit rates of up to 40 Gbit/s (5 GB/s throughput).
External card hubs, introduced in 2010, can be connected to a laptop or desktop computer through a PCI ExpressCard slot and can accept full-size graphics cards. Examples include the MSI GUS, the ViDock from Village Instruments, the ASUS XG Station, the Bplus PE4H V3.2 adapter, and more improvised DIY devices. However, such solutions are limited by the size (often only ×1) and version of the PCIe slot available on the laptop.
In 2008, AMD announced its ATI XGP technology, based on a proprietary cabling system compatible with PCIe ×8 signal transmission. This connector was available on the Fujitsu Amilo and the Acer Ferrari One laptops. Fujitsu later launched the AMILO GraphicBooster enclosure for XGP, and around 2010 Acer launched the Dynavivid graphics dock for XGP. The Intel Thunderbolt interface has given newer and faster products the opportunity to connect with a PCIe card externally. Magma has released the ExpressBox 3T, which can hold up to three PCIe cards (two at ×8 and one at ×4). MSI also released the Thunderbolt GUS II, a PCIe chassis dedicated to graphics cards. Other products, such as Sonnet's Echo Express and mLogic's mLink, are Thunderbolt PCIe chassis in a smaller form factor. However, all of these products require a computer with a Thunderbolt port, such as the Apple MacBook Pro models released in late 2013.
For the professional market, Nvidia has developed the Quadro Plex external PCIe family of GPUs for advanced graphics applications. These video cards require a PCI Express ×8 or ×16 slot for the host-side card, which connects to the Plex via a VHDCI carrying eight PCIe lanes.
The PCI Express protocol can be used as a data interface for flash memory devices, such as memory cards and solid state drives (SSD).
The XQD card is a PCI Express-based memory card format developed by the CompactFlash Association, with transfer rates of up to 500 MB/s.
Many high-performance enterprise-class SSDs are designed as PCI Express RAID controller cards with flash memory chips placed directly on the circuit board, using proprietary interfaces and custom drivers to communicate with the operating system; compared to Serial ATA or SAS drives, this allows much higher transfer rates (over 1 GB/s) and IOPS (over one million I/O operations per second). For example, in 2011 OCZ and Marvell co-developed a native PCI Express solid-state drive controller for a PCI Express 3.0 ×16 slot, with a maximum capacity of 12 TB and performance of 7.2 GB/s in sequential transfers and up to 2.52 million IOPS in random transfers.
SATA Express is an interface for connecting SSDs that provides multiple PCI Express lanes to the attached storage device as a pure PCI Express connection. M.2 is a specification for internally mounted computer expansion cards and associated connectors that also uses multiple PCI Express lanes.
PCI Express storage devices can implement the AHCI logical interface for backward compatibility, or the NVM Express logical interface for much faster I/O operations by exploiting the internal parallelism offered by such devices. Enterprise-class SSDs can also implement SCSI over PCI Express.
Certain data center applications (such as large computer clusters) require fiber-optic interconnects due to the distance limitations inherent in copper cabling. Typically, a network-oriented standard such as Ethernet or Fibre Channel suffices for these applications, but in some cases the overhead introduced by routable protocols is undesirable, and lower-level interconnects such as InfiniBand, RapidIO, or NUMAlink are needed. Local-bus standards such as PCIe and HyperTransport can in principle be used for this purpose, but as of 2015, such solutions are only available from niche vendors such as Dolphin ICS.
Other communications standards based on high-bandwidth serial architectures include InfiniBand, RapidIO, HyperTransport, Intel QuickPath Interconnect, and the Mobile Industry Processor Interface (MIPI). The differences reflect trade-offs between flexibility and extensibility on one hand and latency and overhead on the other. For example, adding a complex header to a transmitted packet allows for complex routing (PCI Express supports this through an optional end-to-end TLP prefix feature). The extra overhead reduces the effective bandwidth of the interface and complicates bus discovery and initialization software. Also, making the system hot-pluggable requires software to track network topology changes. InfiniBand is one such technology.
Another example is making the packets shorter to decrease latency (as is required if a bus must operate as a memory interface). Smaller packets mean the packet headers consume a higher percentage of the packet, which reduces the effective bandwidth. Examples of bus protocols designed for this purpose are RapidIO and HyperTransport.
PCI Express falls somewhere in the middle, targeted by design as a system interconnect (local bus) rather than a device interconnect or routed network protocol. In addition, its design goal of software transparency constrains the protocol and raises its latency somewhat.
Delays in PCIe 4.0 implementations led to the launch of the Gen-Z consortium, the CCIX effort, and the open Coherent Accelerator Processor Interface (CAPI), all announced by the end of 2016.