CEA job vacancy search engine

Hardware acceleration of nonlinear functions for Vision Transformer (ViT) inference

Vacancy details

General information


The French Alternative Energies and Atomic Energy Commission (CEA) is a key player in research, development and innovation in four main areas:
• defence and security,
• nuclear energy (fission and fusion),
• technological research for industry,
• fundamental research in the physical sciences and life sciences.

Drawing on its widely acknowledged expertise, and thanks to its 16,000 technicians, engineers, researchers and support staff, the CEA actively participates in collaborative projects with a large number of academic and industrial partners.

The CEA operates ten centers spread throughout France.



Unit description

The French Alternative Energies and Atomic Energy Commission (CEA) is a key player in research, development, and innovation. This technological research organization is active in three major fields: energy, health and information technologies, and defense. Recognized as an expert in its fields, the CEA is fully integrated into the European research area and has a growing international presence. Located in Île-de-France (Saclay), the Laboratory for Systems and Technology Integration (LIST) has the mission of contributing to technology transfer and promoting innovation in the field of embedded systems.

Position description


Engineering science



Job title

Hardware acceleration of nonlinear functions for Vision Transformer (ViT) inference


Hardware implementation to accelerate nonlinear functions for ViT inference

Contract duration (months)


Job description

The internship concerns the development of dedicated hardware to accelerate nonlinear functions for Vision Transformer (ViT) inference. ViTs are widely used in computer vision (CV) for tasks such as image classification, object detection, and image segmentation. The input image of a ViT is divided into a set of image patches, called "visual tokens". The visual tokens are embedded into a set of coded vectors of fixed dimension. The position of each patch in the image is combined with its coded vector and fed into the transformer's encoder network. The ViT encoder comprises several blocks, each of which consists of three main processing elements: Layer Normalization, a Multi-head Self-Attention network (MSA), and a Multi-Layer Perceptron (MLP). These processing elements include several nonlinear functions, such as Layer Normalization, Softmax, and the GELU activation, which perform operations like exponentiation, hyperbolic tangent, and division. When implemented in hardware, these functions introduce significant latency and demand substantial resources, leading to a significant increase in power consumption.
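For illustration only, the three nonlinear functions mentioned above can be sketched in a few lines of NumPy. This is a floating-point reference, not the hardware implementation sought by the internship; it simply makes explicit where the exponential, hyperbolic tangent, square root, and division operations appear.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each token vector to zero mean and unit variance;
    # the square root and division are the hardware-costly steps.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    # Row-wise softmax used on the attention scores;
    # requires an exponential per element and a division per row.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gelu(x):
    # Tanh-based approximation of GELU, common in ViT implementations;
    # the hyperbolic tangent is the expensive operation here.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Example: a tiny token matrix (4 visual tokens, embedding dimension 8)
tokens = np.random.randn(4, 8)
normalized = layer_norm(tokens)
attention_weights = softmax(tokens)   # each row sums to 1
activated = gelu(tokens)
```

Each of these functions is applied to every token in every encoder block, which is why their hardware cost accumulates quickly.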


As part of this internship, the candidate will propose a suitable hardware architecture for the nonlinear functions. This architecture should reduce computational complexity, power consumption, and resource utilization while maintaining reasonable accuracy. The candidate will first analyze the Vision Transformer (ViT) model and characterize the nonlinear functions that constitute the encoder; this analysis will allow them to propose an optimized implementation of these functions. For performance evaluation, the candidate will design the hardware for the proposed solution and implement it on an FPGA. The results of the internship could be published at an international conference.
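As a hint of what such an optimized solution might look like, one widely used hardware-friendly technique (an illustrative assumption here, not the architecture the internship prescribes) is to replace the exponential in Softmax with a small lookup table (LUT). The sketch below simulates this in Python; the table size and input range are arbitrary choices for the example.

```python
import numpy as np

# Hypothetical LUT parameters: 256 entries over [-8, 0]. After the usual
# max-subtraction, softmax inputs are non-positive, so this range suffices.
LUT_BITS = 8
X_MIN, X_MAX = -8.0, 0.0
GRID = np.linspace(X_MIN, X_MAX, 2**LUT_BITS)
EXP_LUT = np.exp(GRID)  # precomputed once, stored in on-chip memory

def exp_lut(x):
    # Quantize the input to the nearest table index instead of computing exp().
    idx = np.round((x - X_MIN) / (X_MAX - X_MIN) * (2**LUT_BITS - 1))
    idx = np.clip(idx, 0, 2**LUT_BITS - 1).astype(int)
    return EXP_LUT[idx]

def softmax_lut(x):
    z = x - x.max(axis=-1, keepdims=True)   # z <= 0, inside the LUT range
    e = exp_lut(np.clip(z, X_MIN, X_MAX))
    return e / e.sum(axis=-1, keepdims=True)

# Compare against the exact softmax on random attention scores
scores = np.random.randn(4, 16)
ref = np.exp(scores - scores.max(-1, keepdims=True))
ref = ref / ref.sum(-1, keepdims=True)
max_abs_error = np.abs(softmax_lut(scores) - ref).max()
```

In hardware, the table lookup trades an exponential unit for a small ROM and an index computation; the residual division per row is itself a candidate for further optimization (e.g., shift-based or reciprocal-approximation schemes).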

This internship will enable the candidate to acquire knowledge in the field of neural networks and to develop skills in hardware design and FPGA implementation.

Applicant Profile

Requested profile:     

Master's degree (BAC+5)

Skills: VHDL, C/C++, AI, Computer Vision, DNN

Required documents: CV + cover letter + rankings

Position location



Job location

France, Ile-de-France, Essonne (91)



Candidate criteria


  • English (Fluent)
  • French (Fluent)

Prepared diploma

Bac+5 - Engineering school degree (Master's level)

PhD opportunity



Position start date