
Efficient Deployment of AI Applications in the Edge-Network-Cloud Continuum


Toward Scalable and Sustainable AI Across Heterogeneous Resources

Supervisor: Frédéric Giroire, CNRS Director of Research, 3IA chair holder.

Laboratory: COATI team, I3S laboratory (Université Côte d’Azur/CNRS) and Inria

Place: Centre Inria de l’Université Côte d’Azur, 2004 route des Lucioles, Sophia Antipolis, France

Context and Motivation

The deployment of AI applications is undergoing a paradigm shift with the advent of 5G/6G networks, the Internet of Things (IoT), and edge computing. This evolution enables services to be deployed across the edge-network-cloud continuum, leveraging heterogeneous resources from edge devices (e.g., smartphones, microcontrollers) to cloud data centers. This new paradigm addresses critical challenges such as computing resource constraints, bandwidth limitations, memory availability, and energy efficiency, while introducing new complexities in resource allocation, model deployment, and system optimization.

At the same time, AI models, especially deep neural networks, are becoming increasingly complex, requiring substantial computational power, memory, and energy. For instance, large models often exceed the capabilities of edge devices, while cloud-centric deployments face bandwidth and latency bottlenecks. The need for efficient deployment strategies that balance these constraints is more pressing than ever.

Scientific Objectives

This thesis aims to develop methods for the efficient deployment of AI applications in the edge-network-cloud continuum, under resource constraints spanning computing, bandwidth, memory, and energy. The research will tackle the following challenges:

Efficient Deployment Strategies

  • Model Compression: Investigate techniques such as quantization, pruning, and knowledge distillation to reduce the computational and memory footprint of deep learning models without sacrificing accuracy.
  • Cascade Systems: Explore early-exit architectures and multi-stage inference to dynamically select the most appropriate model (from lightweight to heavyweight) based on real-time constraints (battery level, network latency, device memory).
  • Federated Learning: Study federated learning (FL) as a means to distribute AI training and inference across edge devices, reducing the need for data centralization and lowering the resource costs (computing, bandwidth, energy) associated with data transfer and cloud compute. FL allows models to be trained locally on devices, with only model updates (not raw data) being communicated, thus improving efficiency and privacy.
  • Resource-Aware Scheduling: Design algorithms to optimize task placement (edge vs. cloud) and scheduling policies for AI workloads, balancing latency, bandwidth, compute, memory, and energy.
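To illustrate the cascade idea above, the following is a minimal sketch of confidence-based two-stage inference: a lightweight model answers on the edge, and the heavyweight model is invoked only when confidence falls below a threshold. The model functions here are hypothetical stand-ins (returning a label and a confidence score), not part of any existing framework.

```python
def edge_model(x):
    # Hypothetical lightweight classifier: cheap, but less certain on hard inputs.
    return ("cat", 0.95) if x < 0.5 else ("dog", 0.55)

def cloud_model(x):
    # Hypothetical heavyweight classifier: accurate, but costly to invoke.
    return ("dog", 0.99)

def cascade_predict(x, threshold=0.8):
    """Return (label, path), escalating to the cloud only when needed."""
    label, conf = edge_model(x)
    if conf >= threshold:
        return label, "edge"          # early exit: cheap path
    label, conf = cloud_model(x)      # fallback: expensive path
    return label, "cloud"
```

In a real deployment, the threshold itself becomes a tuning knob trading accuracy against bandwidth and energy, which connects directly to the trade-off analysis below.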

Trade-offs Between Efficiency and Performance

  • Quantitative Analysis: Measure the resource usage (computing, bandwidth, memory, energy) of AI workloads across different deployment scenarios (edge, network, cloud) and model configurations.
  • Adaptive Configurations: Develop adjustable models that can be reconfigured on-the-fly to adapt to varying resource constraints and application requirements.
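One way to frame this trade-off analysis is Pareto dominance: among candidate model configurations measured on (accuracy, energy), keep only those not dominated by another configuration. The sketch below uses made-up configuration names and numbers purely for illustration.

```python
def pareto_front(configs):
    """Keep configurations not dominated in (accuracy up, energy down)."""
    front = []
    for name, acc, energy in configs:
        dominated = any(a >= acc and e <= energy and (a > acc or e < energy)
                        for _, a, e in configs)
        if not dominated:
            front.append(name)
    return front

# Illustrative (accuracy, energy-per-inference) measurements, arbitrary units.
configs = [
    ("int8-pruned", 0.88, 1.0),
    ("int8",        0.90, 1.5),
    ("fp16",        0.92, 3.0),
    ("fp32",        0.92, 5.0),  # dominated by fp16: same accuracy, more energy
]
```

An adaptive deployment would then pick a point on this front at runtime according to the current battery level or latency budget.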

Operational Impact

  • Extend existing frameworks to estimate the resource consumption (compute, memory, bandwidth, energy) of AI deployments, accounting for both local and distributed execution.
  • Propose deployment strategies that minimize resource waste, improve scalability, and ensure reliable performance across heterogeneous environments.
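A first-order estimator of the kind mentioned above might model deployment energy as compute energy plus data-transfer energy. All coefficients below are illustrative placeholders, not measured values; real frameworks would calibrate them per device and link.

```python
def deployment_energy_j(flops, flops_per_joule, bytes_moved, joules_per_byte):
    """First-order energy estimate (joules): compute term + transfer term."""
    return flops / flops_per_joule + bytes_moved * joules_per_byte

# Illustrative comparison: run a 2 GFLOP inference locally vs. offload it.
local  = deployment_energy_j(flops=2e9, flops_per_joule=1e9,
                             bytes_moved=0, joules_per_byte=0)
remote = deployment_energy_j(flops=2e9, flops_per_joule=50e9,
                             bytes_moved=6e5, joules_per_byte=2e-6)
```

Even this crude model exposes the core tension: offloading trades an efficient compute term against a transfer term that grows with input size and link cost.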

Research Activities

  1. Analyze resource usage of AI deployments in the edge-network-cloud continuum.
  2. Design algorithmic methods for efficient scheduling of AI workloads.
  3. Investigate trade-offs between resource efficiency and model accuracy in compression techniques.
  4. Develop adaptive deployment frameworks using cascade systems and early-exit models.
  5. Evaluate environmental impact of proposed methods using lifecycle assessment tools.
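As a toy instance of activity 2, the sketch below shows a greedy edge-vs-cloud placement heuristic: keep a task on the edge while memory allows and its edge latency meets the deadline, otherwise offload it. Task tuples and values are hypothetical; a thesis contribution would replace this with principled optimization.

```python
def place_tasks(tasks, edge_mem):
    """Greedy placement of (name, mem, edge_lat, cloud_lat, deadline) tasks."""
    placement, free = {}, edge_mem
    for name, mem, edge_lat, cloud_lat, deadline in tasks:
        if mem <= free and edge_lat <= deadline:
            placement[name] = "edge"   # fits locally and meets its deadline
            free -= mem
        elif cloud_lat <= deadline:
            placement[name] = "cloud"  # offload: slower link, but feasible
        else:
            placement[name] = "reject" # no feasible placement
    return placement

# Illustrative workload: memory in MB, latencies and deadlines in ms.
tasks = [
    ("detect",     200,  30,  80,  50),
    ("transcribe", 900, 120,  60, 100),
    ("log",        100,   5,  40,  20),
]
```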

Required Skills and Profile

The ideal candidate should have:

  • Knowledge of machine learning, especially neural networks, graph neural networks, or federated learning.
  • Strong mathematical, optimization, and algorithmic background.
  • Programming expertise in Python, with experience in PyTorch or TensorFlow.
  • Familiarity with networking and edge computing (e.g., MEC, IoT, 5G/6G).
  • Analytical skills for designing and evaluating optimization algorithms.
  • Fluency in English.
Institution: 3IA Côte d'Azur