VirtuosoNext™ Designer ignites TI’s C6678 RoC on Parsec VF360

By eric.verhulst - Posted on 05 August 2016

Altreonic has ported VirtuosoNext™ Designer to the Texas Instruments 8-core C6678 DSP on Sundance’s Parsec VF360 VPX board. The board carries eight floating-point DSP cores and an Altera Stratix-V FPGA, and the C6678 itself is a true single-chip embedded signal processing supercomputer. Running at 1.25 GHz, the eight cores together deliver up to 224 GFlops with a peak bandwidth of 16 Gbytes/s.

Sundance’s 3U OpenVPX board packs a lot of processing power and a lot of high-speed, advanced I/O onto a single board. Harnessing that power, however, is a complex undertaking. At Altreonic we call the TI C6678 a “RoC” (a “Rack on a Chip”): it really is an embedded real-time parallel supercomputer on a chip. The TI chip differs from the mainstream processors found in servers and desktops in several respects. First of all, it is designed to be a real-time computer. While the chip can have about 1000 interrupt sources, it also has numerous DMA engines, dedicated high-speed memory blocks and a very fast TeraNet switching bus to move the data around. In parallel processing, data communication must match the processing power; the bottleneck is first of all the communication latency and only then the bandwidth. Secondly, to keep the communication and processing subsystems busy all the time, the application must be written as a set of concurrent Tasks so that computation can overlap with communication. And because the applications are embedded, power consumption must be low and the form factor small.
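The idea of overlapping computation with communication can be sketched in a few lines. The example below is only an illustration in Python, not VirtuosoNext™ code: two threads stand in for Tasks, a bounded queue stands in for a FIFO between them, and the names `run_pipeline`, `transfer_task` and `compute_task` are hypothetical. While the compute task works on one block, the transfer task can already fetch the next one.

```python
import queue
import threading


def run_pipeline(blocks, depth=2):
    """Overlap a simulated transfer task with a compute task via a bounded FIFO."""
    fifo = queue.Queue(maxsize=depth)  # bounded buffer between the two tasks
    results = []

    def transfer_task():
        for block in blocks:
            fifo.put(list(block))      # stands in for a DMA copy of one block
        fifo.put(None)                 # end-of-stream marker

    def compute_task():
        while True:
            block = fifo.get()
            if block is None:
                break
            # Stands in for the signal processing done on each block.
            results.append(sum(x * x for x in block))

    tasks = [threading.Thread(target=transfer_task),
             threading.Thread(target=compute_task)]
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    return results
```

The `depth` parameter plays the role of double buffering: with a depth of two, one block can be in flight while another is being processed. The sketch shows the structure only; Python threads say nothing about the timing achievable on the DSP.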

To reach the best performance, an efficient software architecture is needed. This is where VirtuosoNext™ comes in. VirtuosoNext™ Designer is the fifth generation of the original Virtuoso RTOS, which was developed 25 years ago with parallel processing in mind. At the time it was used on systems with up to 12000 processors, among them TI’s C40 parallel DSP and Analog Devices’ SHARC. Its reliability was proven on ESA’s Rosetta mission.

Since then, the whole RTOS has been redeveloped from scratch using formal methods, together with a supporting modelling and code generation environment. This has made the RTOS kernel not only much smaller (by up to a factor of 10) but also much more robust and easier to maintain and support. Binary images of small compiled applications on TI’s C66xx DSPs start at about 20 Kbytes and grow to about 25 Kbytes with all services included. The benefit of the small code size is that more space is left in the L1 cache for application code. The full kernel with the application can also be placed in the on-chip L2 cache, which is then partly configured as SRAM. Small is beautiful in the embedded domain, as it saves memory, cost and energy.

The small code size and low latency benefit applications directly. A semaphore loop (using 4 kernel services and 6 context switches) takes only 2.81 microseconds. The interrupt latency from IRQ to ISR shows a histogram with a spread between 160 and 260 nanoseconds. Entering a Task from an interrupt shows a spread between 940 and 1728 nanoseconds. All tests were compiled with the O3 switch.
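For readers unfamiliar with this kind of benchmark, the sketch below shows one plausible shape of a semaphore loop: two tasks signal each other through two semaphores, which gives four semaphore operations per round trip. This is an assumption about the benchmark's structure, written in Python with standard threading primitives rather than VirtuosoNext™ kernel services, and the function name `semaphore_loop` is hypothetical. It does not reproduce the context-switch count or the timings quoted above.

```python
import threading
import time


def semaphore_loop(iterations):
    """Ping-pong between two threads; returns (round_trips, seconds_per_loop)."""
    sema_a = threading.Semaphore(0)
    sema_b = threading.Semaphore(0)
    echoed = 0

    def echo_task():
        nonlocal echoed
        for _ in range(iterations):
            sema_a.acquire()   # operation 2: wait for the signal
            echoed += 1
            sema_b.release()   # operation 3: signal back

    worker = threading.Thread(target=echo_task)
    worker.start()
    start = time.perf_counter()
    for _ in range(iterations):
        sema_a.release()       # operation 1: signal the echo task
        sema_b.acquire()       # operation 4: wait for the reply
    elapsed = time.perf_counter() - start
    worker.join()
    per_loop = elapsed / iterations if iterations else 0.0
    return echoed, per_loop
```

Calling `semaphore_loop(10000)` returns the number of completed round trips and the mean time per loop on the host machine, which will be far larger than the figure measured on the C6678.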

To illustrate the communication performance, Altreonic developed tests that share the EDMA engine for copying data between the chip’s memory banks. These show application-level bandwidths of up to 8.36 Gbytes/s (MSMC to MSMC memory copying in blocks of 512 Kbytes).
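To put the bandwidth figure in perspective, the short calculation below derives the time needed to copy one 512 Kbyte block and the fraction of the quoted 16 Gbytes/s peak that is reached. It assumes 1 Kbyte = 1024 bytes and 1 Gbyte = 10^9 bytes, which the text does not specify, and the helper names are hypothetical.

```python
def copy_time_us(block_bytes, bandwidth_bytes_per_s):
    """Time in microseconds to move one block at the given bandwidth."""
    return block_bytes / bandwidth_bytes_per_s * 1e6


def fraction_of_peak(measured, peak):
    """Measured bandwidth as a fraction of the peak (same units for both)."""
    return measured / peak


block = 512 * 1024                      # assumption: 1 Kbyte = 1024 bytes
measured = 8.36e9                       # assumption: 1 Gbyte = 10**9 bytes
print(round(copy_time_us(block, measured), 1))   # about 62.7 microseconds
print(round(fraction_of_peak(8.36, 16.0), 2))    # about 0.52 of the peak
```

Under these assumptions a 512 Kbyte block copy takes roughly 63 microseconds, and the application-level figure corresponds to a little over half of the peak bandwidth.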

What really sets VirtuosoNext™ apart is its programming model, which makes parallel and concurrent programming very easy to achieve. Using the VirtuosoNext™ Visual Designer, developers build their application from Tasks and Interaction Entities (e.g. semaphores, FIFOs, etc.), independently of the target processor and network topology. They can even cross-develop on a PC. Next, they map the Tasks and Interaction Entities to specific processing nodes and simply recompile the code. This allows, for example, all Tasks to transparently use any of the C6678 peripherals and on-chip resources (for example the EDMA engine) or even the FPGA on the VF360 board. The result is a massive gain in productivity, smaller code size and hence better performance.
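The separation between application description and node mapping can be illustrated with a small model. The sketch below is not the Visual Designer's actual data format or API; the `Application` class, the `place` function and the node names are hypothetical, chosen only to show that remapping changes the mapping table while the application description stays untouched.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Topology-independent description: only names, no node information."""
    tasks: list
    entities: list  # e.g. semaphores, FIFOs


def place(app, mapping, default_node):
    """Return {node: [names]} without modifying the application description."""
    placement = {}
    for name in app.tasks + app.entities:
        node = mapping.get(name, default_node)
        placement.setdefault(node, []).append(name)
    return placement


app = Application(tasks=["sampler", "filter"], entities=["fifo"])

# Cross-development: everything runs on a single (hypothetical) PC node.
on_pc = place(app, {}, default_node="pc")

# Target deployment: only the mapping changes, the application does not.
on_dsp = place(app, {"filter": "core1"}, default_node="core0")
```

In this toy model `on_pc` places all three names on one node, while `on_dsp` moves the `filter` Task to a second core by changing a single mapping entry.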

Besides high performance and productivity, VirtuosoNext™ is also designed for safety- and security-critical applications. As the code is generated as a static image, it eliminates many of the runtime errors that can occur with a more traditional dynamic (RT)OS. The packet switching architecture also reduces typical pointer errors and provides extra security. The kernel itself is formally developed and is available with an optional Qualification Package should certification be a must. Full documentation also comes with the Open Technology License. Last but not least, the protected version of VirtuosoNext™ has built-in support for error detection and recovery as well as fine-grained space and time partitioning. The latter makes use of the C6678 MPU and allows memory access violations to be caught at the Task level. The unique architecture of VirtuosoNext™ provides the protection of a traditional hypervisor with the real-time responsiveness of an optimised RTOS.

The combination of VirtuosoNext™ with TI’s C6678 8-core DSP SoC and Altera’s Stratix FPGA provides a very powerful, reliable and flexible embedded single-board computer, and more powerful systems can be built by connecting several boards together in a single rack. The backplane for the VF360 is VITA65 OpenVPX, which offers plenty of bandwidth: PCI Express or Serial RapidIO (SRIO) as a fast switched serial interface for board-to-board communication, high-speed LVDS parallel I/O, and Ethernet TCP/IP for secure and long-distance interfaces.

About Altreonic

Altreonic specializes in trustworthy systems and software engineering, using a unified systems engineering methodology. The latter is supported by GoedelWorks, an end-to-end systems engineering environment that supports producing qualification and certification during the engineering activities. VirtuosoNext™ Designer is based on a formally developed network-centric RTOS kernel with supporting tools such as Visual Designer for modeling and code generation and Event Tracer for visual analysis of the application behavior.

Altreonic has a long history of supporting customers in the aerospace and defense domain. The technology is also applied internally to the development of a lightweight electric vehicle platform.

For more information about Altreonic, visit the website.

Other publications: 

- A Goedel Series booklet discussing the performance on MP targets.

- An overview of the VirtuosoNext benchmark data

About Sundance

Sundance designs, develops, manufactures and markets high-performance signal processing and reconfigurable systems internationally for original equipment manufacturers in the MILCOM, Communication and Signal Processing markets. Leveraging its multiprocessor expertise and experience, Sundance provides OEMs with modular systems as well as data acquisition, I/O, communication and interconnectivity products for multiprocessor systems where scalability and performance are essential.

For more information about Sundance, visit the website here. More information about VF360 is available here. More information about VITA65/OpenVPX is here. More information about VITA57/FMC is here. More information about C667x DSPs from Texas Instruments is here.


