AI Kernel Optimization Engineer
The Role:
As an AI Kernel Optimization Engineer, you will play a key role in pushing the limits of AI inference performance on Openchip RISC-V platforms.
You will design, implement, and optimize AI compute kernels (generative AI large language models, AI vision, CNNs, etc.) and runtime components to fully exploit the underlying hardware architecture, from vector/matrix units and memory hierarchies down to the assembly level.
Your work will directly influence how efficiently AI models run on Openchip SoCs, shaping the performance of next-generation inference accelerators. You will collaborate closely with hardware architects, compiler engineers, and AI framework developers to achieve optimal hardware–software co-design.
Key Responsibilities:
· Develop, optimize, profile, and debug AI compute kernels (e.g., GEMM, attention, activation) targeting Openchip RISC-V architectures.
· Identify and resolve performance bottlenecks at the ISA, compiler, and runtime levels.
· Collaborate with the hardware and architecture teams to guide design decisions that improve real-world performance.
· Contribute to the development and tuning of AI runtime and graph execution engines.
· Evaluate and benchmark AI inference workloads on Openchip platforms.
· Implement performance analysis tools and scripts to automate profiling and validation.
· Work with AI frameworks (e.g., PyTorch, ONNX Runtime, TensorRT, TVM) to ensure efficient mapping to Openchip targets.
· Stay up to date on AI kernel optimization trends, emerging hardware acceleration techniques, and open-source contributions.
Required Qualifications:
· MSc or PhD in Computer Engineering or Computer Science, or equivalent practical experience.
· 3+ years of experience in performance optimization for AI inference or HPC workloads.
Technical skills:
· Strong background in low-level performance optimization (vectorization, memory access optimization, loop