| The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low |
| level communications API across PCIe currently implemented for MIC. Currently |
| SCIF provides inter-node communication within a single host platform, where a |
| node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of |
| communicating over the PCIe bus while providing an API that is symmetric |
| across all the nodes in the PCIe network. An important design objective for SCIF |
| is to deliver the maximum possible performance given the communication |
| abilities of the hardware. SCIF has been used to implement an offload compiler |
| runtime and OFED support for MPI implementations for MIC coprocessors. |
| |
| ==== SCIF API Components ==== |
| The SCIF API has the following parts: |
| 1. Connection establishment using a client server model |
| 2. Byte stream messaging intended for short messages |
| 3. Node enumeration to determine online nodes |
| 4. Poll semantics for detection of incoming connections and messages |
| 5. Memory registration to pin down pages |
| 6. Remote memory mapping for low latency CPU accesses via mmap |
| 7. Remote DMA (RDMA) for high bandwidth DMA transfers |
| 8. Fence APIs for RDMA synchronization |
| |
| SCIF exposes the notion of a connection which can be used by peer processes on |
| nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A |
| process in a SCIF node initiates a SCIF connection to a peer process on a |
| different node via a SCIF "endpoint". SCIF endpoints support messaging APIs |
| which are similar to connection oriented socket APIs. Connected SCIF endpoints |
| can also register local memory which is followed by data transfer using either |
| DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and |
| kernel mode clients which are functionally equivalent. |
| |
| ==== SCIF Performance for MIC ==== |
| DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus |
| SCIF shows the performance advantages of SCIF for HPC applications and runtimes. |
| |
| Comparison of TCP and SCIF based BW |
| |
| Throughput (GB/sec) |
| 8 + PCIe Bandwidth ****** |
| + TCP ###### |
| 7 + ************************************** SCIF %%%%%% |
| | %%%%%%%%%%%%%%%%%%% |
| 6 + %%%% |
| | %% |
| | %%% |
| 5 + %% |
| | %% |
| 4 + %% |
| | %% |
| 3 + %% |
| | % |
| 2 + %% |
| | %% |
| | % |
| 1 + |
| + ###################################### |
| 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+- |
| 1 10 100 1000 10000 100000 |
| Transfer Size (KBytes) |
| |
| SCIF allows memory sharing via mmap(..) between processes on different PCIe |
| nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap |
| latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs. |
| |
| SCIF has a user space library which is a thin IOCTL wrapper providing a user |
| space API similar to the kernel API in scif.h. The SCIF user space library |
| is distributed @ https://software.intel.com/en-us/mic-developer |
| |
| Here is some pseudo code for an example of how two applications on two PCIe |
| nodes would typically use the SCIF API: |
| |
| Process A (on node A) Process B (on node B) |
| |
| /* get online node information */ |
| scif_get_node_ids(..) scif_get_node_ids(..) |
| scif_open(..) scif_open(..) |
| scif_bind(..) scif_bind(..) |
| scif_listen(..) |
| scif_accept(..) scif_connect(..) |
| /* SCIF connection established */ |
| |
| /* Send and receive short messages */ |
| scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..) |
| |
| /* Register memory */ |
| scif_register(..) scif_register(..) |
| |
| /* RDMA */ |
| scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..) |
| |
| /* Fence DMAs */ |
| scif_fence_signal(..) scif_fence_signal(..) |
| |
| mmap(..) mmap(..) |
| |
| /* Access remote registered memory */ |
| |
| /* Close the endpoints */ |
| scif_close(..) scif_close(..) |