AI Computer Vision Engineer Jobs
Engineer an advanced generative AI pipeline that transforms the context of existing datasets, shifting time of day, changing seasons, or altering biomes and weather systems, while preserving small target objects such as drones. Take ownership of a multi‑pass diffusion pipeline that adapts scene context while maximizing physical realism. Improve custom masking and high‑resolution depth‑patching algorithms that anchor small objects in 3D space and eliminate artifacts. Generate large‑scale augmented datasets and quantify their impact on downstream model performance, designing experiments that measure the effect of synthetic data on the accuracy, recall, and robustness of object detectors against real‑world edge cases.
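The depth‑patching idea above can be illustrated with a minimal sketch (function names and the compositing rule are illustrative, not this pipeline's actual implementation): paste a small object into a generated scene only where the object is closer to the camera than the scene's estimated depth, so it stays anchored in 3D and is correctly occluded.

```python
import numpy as np

def depth_composite(scene_rgb, scene_depth, obj_rgb, obj_depth, obj_mask):
    """Paste object pixels only where the object is nearer than the scene.

    scene_rgb:   (H, W, 3) generated background image
    scene_depth: (H, W)    estimated depth of the generated scene (metres)
    obj_rgb:     (H, W, 3) rendered object layer, aligned to the scene
    obj_depth:   (H, W)    object depth (np.inf where there is no object)
    obj_mask:    (H, W)    boolean object silhouette
    """
    visible = obj_mask & (obj_depth < scene_depth)  # per-pixel z-test
    out = scene_rgb.copy()
    out[visible] = obj_rgb[visible]
    return out, visible

# Toy example: a 4x4 scene at depth 10 m, a 2x2 "drone" at depth 5 m
scene = np.zeros((4, 4, 3), dtype=np.uint8)
scene_d = np.full((4, 4), 10.0)
obj = np.full((4, 4, 3), 255, dtype=np.uint8)
obj_d = np.full((4, 4), np.inf)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
obj_d[1:3, 1:3] = 5.0
scene_d[2, 2] = 3.0  # an occluder (e.g., foliage) closer than the drone

composited, visible = depth_composite(scene, scene_d, obj, obj_d, mask)
```

The z‑test is what prevents the "floating sticker" artifact: the drone pixel at the occluded location is suppressed rather than pasted over nearer scene content.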
The AI Research Engineer will build and maintain end‑to‑end data pipelines for large‑scale image and video datasets, covering collection, filtering, augmentation, conditioning alignment, and efficient storage and sampling. They will implement model architectures such as diffusion, autoregressive, flow‑based, and diffusion‑transformer models, and maintain high‑throughput PyTorch training loops for large‑scale image and video diffusion models. The role involves running and managing large‑scale training experiments on multi‑GPU and multi‑node setups, and debugging training instabilities, loss spikes, and convergence issues. The engineer will apply quantization, pruning, and knowledge distillation to compress models without sacrificing quality; collaborate with researchers to translate state‑of‑the‑art research papers into working implementations; and build and maintain evaluation pipelines for image quality, video consistency, and perceptual metrics. They will also set up and maintain human annotation and evaluation pipelines using services such as Amazon SageMaker Ground Truth; profile and optimize training speed, GPU memory utilization, and iteration time; implement inference optimizations to reduce latency and compute cost; and work with acceleration toolchains such as torch.compile, Triton, TensorRT, or ONNX where appropriate.
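The compression work this role describes can be sketched with a minimal post‑training weight quantization example: symmetric per‑tensor int8, which is a simplification of what libraries like torch.ao provide, not any particular team's recipe. The sketch uses numpy to stay self‑contained.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.05, size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()  # bounded by half a quantization step
```

Per‑tensor scaling is the crudest variant; production pipelines usually move to per‑channel scales and calibration over activation statistics precisely because the single‑scale error bound shown here is too loose for outlier‑heavy layers.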
As an AI/Computer Vision Intern, you will design and train state‑of‑the‑art object detection models that operate on sequences of frames, tailored to specific mission‑critical targets. You will integrate these models into existing end‑to‑end tracking algorithms that maintain lock under high dynamics and occlusion, and profile and optimize them with TensorRT or NPU‑specific toolchains for real‑time inference on low‑power onboard hardware. You will curate and manage high‑quality datasets drawn from both real‑world flight footage and synthetic data, integrate the vision pipeline into the flight stack in collaboration with the GNC team to convert detections into flight commands, and benchmark and validate performance through quantitative metrics and field testing in diverse environmental conditions.
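Associating fresh detections with an existing track, so that lock survives clutter and brief occlusion, is commonly done with IoU gating; here is a minimal greedy sketch (the threshold and function names are illustrative assumptions, not this team's tracker).

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def associate(track_box, detections, min_iou=0.3):
    """Pick the detection that best overlaps the current track, or None
    (e.g., during occlusion) so the tracker can coast on its motion model."""
    scored = [(iou(track_box, d), i) for i, d in enumerate(detections)]
    best_iou, best_i = max(scored, default=(0.0, None))
    return best_i if best_iou >= min_iou else None

track = (10, 10, 50, 50)
dets = [(100, 100, 140, 140), (12, 11, 52, 49)]
best = associate(track, dets)  # the second detection overlaps the track
```

Returning None on a failed gate is the important design point: the track is not deleted, it simply receives no measurement that frame, which is how lock is maintained through short occlusions.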
Conduct research on state‑of‑the‑art Computer Vision methodologies and participate in creating and curating training and validation datasets. Perform statistical analyses and develop visualization tools to ensure data quality. Build and refine training pipelines and metrics to enhance model performance. Develop and optimize Computer Vision algorithms for multiple robotics/aerospace projects. Implement ML/CV models into production‑ready environments, ensuring seamless integration with Harmattan AI’s systems and conducting rigorous code reviews. Test algorithms in real‑world environments and develop monitoring tools to track model performance and continuously improve deployed solutions. Work closely with software and simulation teams to align development with system requirements and communicate findings effectively to stakeholders.
The Computer Vision Engineer is responsible for developing the front end of the visual‑inertial odometry (VIO) algorithmic stack, including feature matching between frames and stereo pairs, calibration of camera intrinsic and extrinsic parameters, and obstruction detection. They will implement and optimize the algorithmic stack for embedded platforms, conduct testing, validation, and monitoring of algorithms in simulation and real‑world environments, and develop inspection and monitoring tools. The role also involves cross‑team collaboration, working closely with system engineers, optical engineers, and software engineers, and effectively communicating findings to stakeholders.
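VIO front‑end matching between frames or stereo pairs is typically built on nearest‑neighbour descriptor matching with Lowe's ratio test; the following self‑contained sketch uses random vectors as stand‑ins for real feature descriptors (names and the ratio value are illustrative).

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in A to its nearest neighbour in B, keeping
    the match only if it is clearly better than the second-nearest
    neighbour (Lowe's ratio test), which rejects ambiguous matches."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        if len(order) >= 2 and dists[order[0]] < ratio * dists[order[1]]:
            matches.append((i, int(order[0])))
    return matches

rng = np.random.default_rng(1)
desc_b = rng.normal(size=(50, 32))          # descriptors in the second frame
idx = [3, 17, 41]                           # true correspondences
desc_a = desc_b[idx] + rng.normal(scale=0.01, size=(3, 32))  # noisy copies
matches = match_ratio_test(desc_a, desc_b)
```

The ratio test is what keeps repetitive texture (windows, gratings) from flooding the pose estimator with wrong correspondences; outliers that survive it are usually handled downstream by RANSAC over the epipolar geometry.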
Design and implement algorithms for 3D point cloud processing, object recognition, and segmentation. Enhance and optimize SLAM algorithms for real‑time application in mobile and static environments. Integrate and optimize AI technologies such as Open3D and 2D+3D inference models into existing systems for improved 2D and 3D data analysis and visualization. Collaborate with cross‑functional teams in Pakistan and Hong Kong to integrate new features into SpatialSense. Conduct R&D to explore new techniques in computer vision and machine learning for infrastructure monitoring. Ensure the robustness and accuracy of computer vision applications under various operational conditions.

Design and develop computer vision algorithms and models for object detection, image classification, segmentation, and tracking. Optimize computer vision algorithms and models to leverage NVIDIA hardware such as GPUs and specialized accelerators. Collaborate with hardware engineers to utilize the latest features of NVIDIA hardware platforms. Conduct performance profiling and benchmarking on NVIDIA hardware to identify bottlenecks and optimize resource use. Implement and integrate computer vision algorithms into scalable, robust, real‑time systems on NVIDIA hardware. Collaborate with researchers and academic partners to evaluate state‑of‑the‑art computer vision techniques on NVIDIA hardware.
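Plane extraction is a typical first step in the 3D point cloud processing mentioned above (Open3D exposes it as `segment_plane`); here is a dependency‑free RANSAC sketch under assumed names, fitting the dominant plane of a synthetic cloud.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.05, rng=None):
    """Fit a dominant plane with RANSAC: sample 3 points, build the plane,
    count inliers within `threshold` metres, keep the best model."""
    rng = rng or np.random.default_rng(0)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (near-collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - p0) @ normal)  # point-to-plane distances
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

rng = np.random.default_rng(2)
ground = np.column_stack([rng.uniform(-5, 5, 300),
                          rng.uniform(-5, 5, 300),
                          rng.normal(0, 0.005, 300)])   # z ~ 0 ground plane
clutter = rng.uniform(-5, 5, (60, 3)) + [0, 0, 3]       # points well above it
cloud = np.vstack([ground, clutter])
inliers = ransac_plane(cloud)
```

Removing the dominant plane first is what makes the subsequent object recognition and segmentation tractable: clustering on the remaining points no longer has to explain the floor or walls.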
Computer Vision Engineer, Geometry & Perception
Lead and manage the acquisition program lifecycle, including due diligence, integration, and adoption to completion across multiple acquisitions. Collaborate with cross‑functional stakeholders and establish program management foundations and processes to ensure successful implementations within Anduril.
Research Engineer – Synthetic Data for Vision
Build and maintain synthetic data generation pipelines such as neural rendering, diffusion/score‑based models, controllable generative priors, and procedural assets with controls for pose, expression, illumination, materials, and sensor characteristics. Apply transfer learning and domain adaptation techniques including self‑supervised pretraining, style/appearance transfer, and sim‑to‑real to bridge distribution gaps between synthetic and real data. Integrate off‑the‑shelf and open‑source components as appropriate, fine‑tuning or distilling models to meet latency, memory, and quality targets on target hardware. Establish end‑to‑end systems covering capture, calibration, generation, data curation, quality gates, rendering/evaluation suites, and deployment. Define evaluation frameworks for datasets and models focusing on coverage, bias, sim‑to‑real gaps, and task‑level KPIs such as gaze error, iterating based on quantitative results. Survey literature across graphics, vision, and generative machine learning; prototype, adapt, and create new approaches to advance facial reconstruction, appearance modeling, and synthetic data quality.
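One common way to quantify the sim‑to‑real gap this role must evaluate is a Fréchet‑style distance between feature statistics, as in FID. The sketch below uses a diagonal‑Gaussian simplification (full FID uses the complete covariance and Inception features) on synthetic stand‑in data, so all numbers here are illustrative.

```python
import numpy as np

def frechet_gap(feats_a, feats_b):
    """Frechet distance between two feature sets under a diagonal-Gaussian
    approximation: sum over dims of (mu_a - mu_b)^2 + (sd_a - sd_b)^2.
    (Full FID uses the full covariance matrix; this keeps the sketch
    dependency-free.)"""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    sd_a, sd_b = feats_a.std(0), feats_b.std(0)
    return float(((mu_a - mu_b) ** 2).sum() + ((sd_a - sd_b) ** 2).sum())

rng = np.random.default_rng(3)
real = rng.normal(0.0, 1.0, size=(2000, 16))           # "real" features
synth_close = rng.normal(0.05, 1.0, size=(2000, 16))   # small domain shift
synth_far = rng.normal(1.0, 1.5, size=(2000, 16))      # large domain shift
gap_close = frechet_gap(real, synth_close)
gap_far = frechet_gap(real, synth_far)
```

Tracking such a distribution distance alongside task‑level KPIs (e.g., gaze error) separates two failure modes: synthetic data that looks wrong versus synthetic data that looks right but fails to cover the task's hard cases.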
Senior Machine Learning Engineer, Computer Vision
The Senior Computer Vision Engineer leads the development of multi‑camera perception and localization systems, focusing on image‑based search, vector database integration, and re‑ranking strategies. The role involves algorithm and system design for object tracking, scene understanding, cross‑camera reasoning, and scalable visual matching and retrieval across large‑scale deployments.
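The retrieval‑plus‑re‑ranking pattern named above is usually two stages: a cheap top‑k lookup by embedding similarity (what a vector database serves), then a more expensive re‑scoring of those k candidates. A minimal sketch with random embeddings and a hypothetical re‑rank hook (all names are illustrative):

```python
import numpy as np

def retrieve_and_rerank(query, index, k=5, rerank_fn=None):
    """Stage 1: top-k by cosine similarity over the index.
    Stage 2: re-order those k candidates with a finer-grained score
    (e.g., geometric verification), supplied as rerank_fn(index_id)."""
    sims = index @ query / (np.linalg.norm(index, axis=1)
                            * np.linalg.norm(query))
    topk = np.argsort(-sims)[:k]
    if rerank_fn is None:
        return list(topk)
    return sorted(topk, key=lambda i: -rerank_fn(i))

rng = np.random.default_rng(4)
index = rng.normal(size=(100, 64))                    # gallery embeddings
query = index[42] + rng.normal(scale=0.05, size=64)   # noisy copy of item 42
hits = retrieve_and_rerank(query, index, k=5)
```

The split matters at scale: the approximate first stage keeps cross‑camera search cheap over millions of embeddings, while the second stage spends compute only on a handful of candidates.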