MENAJOBS.ai - The Middle East's Elite AI Talent Matrix

Pantheion AI

AI Engineer — The Edge (Pantheion-Nano)

Abu Dhabi, UAE · Full-Time · On-site · Posted Apr 6, 2026
Model Compression · Quantization · C/C++ · Embedded Linux · MCU · TensorFlow Lite

Role overview

Pantheion-Nano takes frontier Arabic AI capability and compresses it into the most constrained hardware environments imaginable — microcontrollers, mobile SoCs, smart city sensors, and connected vehicles — without losing the Arabic language understanding that makes it useful. As AI Engineer for The Edge, you will own this compression and optimization challenge end to end: designing the quantization pipelines, hardware-specific inference runtimes, and embedded SDKs that let GCC hardware partners bake Pantheion-Nano directly into their products.

What you will do

  • Design and implement the full Pantheion-Nano model compression pipeline: knowledge distillation from Pantheion-1, structured and unstructured pruning, post-training quantization (PTQ) and quantization-aware training (QAT) across GGUF, GPTQ, AWQ, and INT4/INT8 formats
  • Develop hardware-specific inference runtimes and optimization profiles for target deployment environments: ARM Cortex-M/A series, Qualcomm Hexagon DSP, MediaTek APU, NVIDIA Jetson Orin, and mobile SoCs
  • Build the Pantheion-Nano C/C++ inference runtime library for embedded Linux, bare-metal MCU, and Android/iOS mobile deployment targets
  • Design and implement the Arabic language capability retention evaluation framework for Nano variants: measuring which compression techniques best preserve dialect-aware NLP capability at each model size tier
  • Develop on-device Arabic speech recognition and keyword spotting pipelines for IoT sensor integration use cases — optimized for Arabic phoneme sets and Gulf dialect acoustic patterns
  • Build hardware certification testing suites for GCC OEM and smart city platform partners — automated benchmarking of latency, memory footprint, power consumption, and Arabic NLP accuracy
  • Develop and maintain the Pantheion-Nano embedded SDK: developer-facing Python wrappers, C++ APIs, hardware abstraction layers, and deployment guides targeting GCC hardware partner engineering teams
  • Collaborate with hardware partners (Qualcomm, MediaTek, ARM) on chipset-level AI accelerator integration and NPU optimization
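To give a flavor of the compression work above, here is a minimal sketch of symmetric per-tensor post-training INT8 quantization (PTQ) in Python with NumPy. This is an illustrative toy, not the Pantheion-Nano pipeline: real PTQ for the formats listed (GGUF, GPTQ, AWQ) uses per-group scales, calibration data, and activation-aware weighting.

```python
# Illustrative symmetric per-tensor INT8 PTQ: quantize, dequantize,
# and measure the reconstruction error the compression introduces.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights onto the signed 8-bit range [-127, 127]."""
    scale = float(np.abs(w).max()) / 127.0   # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
mse = float(np.mean((w - w_hat) ** 2))

print(w.nbytes, "->", q.nbytes)   # 4x smaller weight storage
print("MSE:", mse)
```

Quantization-aware training (QAT) differs in that this round-trip is simulated inside the training loop so the model learns weights that survive the rounding.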

Skills profile

Required skills

Model Compression · Quantization · C/C++ · Embedded Linux · MCU · TensorFlow Lite

Required qualifications

Domain knowledge

  • 5+ years of AI/ML engineering experience with at least 2 years specializing in on-device AI, edge ML, or embedded systems
  • Deep expertise in model compression: knowledge distillation, pruning, quantization (PTQ, QAT, GPTQ, AWQ, GGUF), and neural architecture search for size-constrained deployment
  • Hands-on experience building inference runtimes or deploying LLMs on resource-constrained hardware (mobile, embedded Linux, MCU)
  • Strong C/C++ proficiency for embedded systems programming, alongside Python for model development and pipeline tooling
  • Experience with embedded AI frameworks: TensorFlow Lite, ONNX Runtime, llama.cpp, ExecuTorch, or hardware-specific SDKs (Qualcomm AI Engine, MediaTek NeuroPilot)
  • Understanding of hardware architecture: memory hierarchies, NPU/DSP capabilities, and power/performance tradeoffs across embedded SoC families
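The memory-hierarchy and power/performance tradeoffs above usually start with a back-of-the-envelope weight-budget calculation. The sketch below shows that arithmetic for a hypothetical 1B-parameter Nano variant (the parameter count and bit-widths are illustrative assumptions, not Pantheion figures); KV cache, activations, and runtime overhead come on top of this.

```python
# Estimate weight-memory footprint at different quantization bit-widths --
# the first question when sizing a model for an embedded target.
def weight_bytes(n_params: int, bits: int) -> int:
    """Bytes needed to store n_params weights at a given bit-width (rounded up)."""
    return (n_params * bits + 7) // 8

n = 1_000_000_000  # hypothetical 1B-parameter edge variant
for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_bytes(n, bits) / 2**20:.0f} MiB")
# FP16: 1907 MiB, INT8: 954 MiB, INT4: 477 MiB
```

At INT4 the same model fits in a quarter of the FP16 budget, which is why the 4-bit formats listed above dominate MCU- and mobile-class deployment.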

Preferred qualifications

Bonus domain experience

  • Experience optimizing Arabic NLP models (ASR, NLU, or LLM) for edge deployment — understanding of how Arabic morphological complexity affects tokenization and inference at constrained model sizes
  • Prior work with smart city IoT platforms, industrial edge AI, or automotive embedded AI
  • Familiarity with GCC-relevant hardware ecosystem: Qualcomm Snapdragon platforms, ARM Mali GPUs, or NVIDIA Jetson in smart city or security applications
  • Experience with on-device speech recognition, keyword spotting, or wake-word detection for Arabic language
  • Contributions to open-source edge AI projects (llama.cpp, MLC-LLM, TensorFlow Lite, or equivalent)