Optimizing AI Model Inference on Serverless Cloud Platforms: A Scalable Approach

Journal Title: International Journal of Current Science Research and Review - Year 2025, Vol 8, Issue 05

Abstract

The increasing prevalence of Artificial Intelligence (AI) and Machine Learning (ML) models across industries has highlighted the need for efficient and scalable deployment strategies. Traditional deployment methods often struggle to adapt to fluctuating demand while remaining cost-effective. Serverless computing has emerged as a promising solution to these challenges. This paper investigates the deployment of AI models within serverless architectures on Amazon Web Services (AWS), focusing on AWS Lambda and Knative. The study analyzes the limitations of conventional deployment approaches and proposes strategies that leverage the capabilities of serverless technologies. It also evaluates the performance characteristics of these serverless deployment strategies, discusses security and privacy considerations, incorporates illustrative real-world case studies, and outlines potential future research directions.

Authors and Affiliations

Prudhvi Naayini, Chiranjeevi Bura

  • EP ID EP765351
  • DOI 10.47191/ijcsrr/V8-i5-02

How To Cite

Prudhvi Naayini, Chiranjeevi Bura (2025). Optimizing AI Model Inference on Serverless Cloud Platforms: A Scalable Approach. International Journal of Current Science Research and Review, 8(05), -. https://www.europub.co.uk/articles/-A-765351