Artificial Intelligence: Official Machine Learning Blog of Amazon Web Services
- How Omada Health scaled patient care by fine-tuning Llama models on Amazon SageMaker AI by Breanne Warner on January 12, 2026 at 4:56 pm
This post is co-written with Sunaina Kavi, AI/ML Product Manager at Omada Health. Omada Health, a longtime innovator in virtual healthcare delivery, launched a new nutrition experience in 2025, featuring OmadaSpark, an AI agent trained with robust clinical input that delivers real-time motivational interviewing and nutrition education. It was built on AWS. OmadaSpark was designed…
- Crossmodal search with Amazon Nova Multimodal Embeddings by Tony Santiago on January 10, 2026 at 12:06 am
In this post, we explore how Amazon Nova Multimodal Embeddings addresses the challenges of crossmodal search through a practical ecommerce use case. We examine the technical limitations of traditional approaches and demonstrate how Amazon Nova Multimodal Embeddings enables retrieval across text, images, and other modalities. You learn how to implement a crossmodal search system by generating embeddings, handling queries, and measuring performance. We provide working code examples and share how to add these capabilities to your applications.
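The excerpt above describes retrieval across modalities through a shared embedding space. As a minimal sketch of the core idea, once text and images are embedded into the same vector space, crossmodal search reduces to nearest-neighbor lookup by cosine similarity. The four-dimensional vectors and file names below are stand-ins; a real system would obtain embeddings from Amazon Nova Multimodal Embeddings rather than hard-coding them:

```python
import math

def cosine_similarity(a, b):
    # Dot product over the product of magnitudes; embeddings from the same
    # model family share one vector space, so a text query can score images.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def crossmodal_search(query_embedding, catalog, top_k=2):
    # catalog: list of (item_id, embedding) pairs, e.g. product images
    # embedded offline; the query embedding comes from the text modality.
    scored = [(item_id, cosine_similarity(query_embedding, emb))
              for item_id, emb in catalog]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Toy 4-dimensional stand-ins for real model output.
catalog = [
    ("red-sneaker.jpg", [0.9, 0.1, 0.0, 0.2]),
    ("blue-jacket.jpg", [0.1, 0.8, 0.3, 0.0]),
    ("red-dress.jpg",   [0.8, 0.2, 0.1, 0.3]),
]
results = crossmodal_search([0.85, 0.15, 0.05, 0.25], catalog)
```

In production the catalog embeddings would live in a vector index rather than a Python list, but the ranking step is the same.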
- Accelerating LLM inference with post-training weight and activation quantization using AWQ and GPTQ on Amazon SageMaker AI by Pranav Murthy on January 9, 2026 at 6:09 pm
Quantized models can be seamlessly deployed on Amazon SageMaker AI using a few lines of code. In this post, we explore why quantization matters—how it enables lower-cost inference, supports deployment on resource-constrained hardware, and reduces both the financial and environmental impact of modern LLMs, while preserving most of their original performance. We also take a deep dive into the principles behind PTQ and demonstrate how to quantize the model of your choice and deploy it on Amazon SageMaker.
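To make the PTQ trade-off concrete, here is a deliberately simplified sketch of symmetric round-to-nearest int8 weight quantization. This is the naive baseline, not AWQ or GPTQ themselves, which improve on it with activation-aware scaling and second-order error correction respectively, but the storage and accuracy story is the same: int8 codes plus a float scale replace full-precision weights, at the cost of a bounded rounding error:

```python
def quantize_int8(weights):
    # Symmetric round-to-nearest quantization with one scale per tensor.
    # Each weight becomes an integer code in [-128, 127].
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights for inference.
    return [code * scale for code in q]

weights = [0.42, -1.27, 0.003, 0.9, -0.55]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Rounding error is bounded by half a quantization step (scale / 2).
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Real PTQ methods apply this per-channel or per-group and use calibration data to place the scales where they hurt accuracy least.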
- How Beekeeper by LumApps optimized user personalization with Amazon Bedrock by Mike Koźmiński on January 9, 2026 at 4:10 pm
Beekeeper’s automated leaderboard approach and human feedback loop system for dynamic LLM and prompt pair selection addresses the key challenges organizations face in navigating the rapidly evolving landscape of language models.
- Sentiment Analysis with Text and Audio Using AWS Generative AI Services: Approaches, Challenges, and Solutions by Caique de Almeida, Guilherme Rinaldo, Paulo Finardi, Victor Costa Beraldo, Vinicius Caridá on January 9, 2026 at 4:06 pm
This post, developed through a strategic scientific partnership between AWS and the Instituto de Ciência e Tecnologia Itaú (ICTi), an R&D hub maintained by Itaú Unibanco, the largest private bank in Latin America, explores the technical aspects of sentiment analysis for both text and audio. We present experiments comparing multiple machine learning (ML) models and services, discuss the trade-offs and pitfalls of each approach, and highlight how AWS services can be orchestrated to build robust, end-to-end solutions. We also offer insights into potential future directions, including more advanced prompt engineering for large language models (LLMs) and expanding the scope of audio-based analysis to capture emotional cues that text data alone might miss.
- Architecting TrueLook’s AI-powered construction safety system on Amazon SageMaker AI by Pranav Murthy on January 9, 2026 at 4:03 pm
This post provides a detailed architectural overview of how TrueLook built its AI-powered safety monitoring system using SageMaker AI, highlighting key technical decisions, pipeline design patterns, and MLOps best practices. You will gain valuable insights into designing scalable computer vision solutions on AWS, particularly around model training workflows, automated pipeline creation, and production deployment strategies for real-time inference.
- Scaling medical content review at Flo Health using Amazon Bedrock (Part 1) by Liza Zinovyeva on January 8, 2026 at 6:25 pm
This two-part series explores Flo Health’s journey with generative AI for medical content verification. Part 1 examines our proof of concept (PoC), including the initial solution, capabilities, and early results. Part 2 focuses on scaling challenges and real-world implementation. Each article stands alone while collectively showing how AI transforms medical content management at scale.
- Detect and redact personally identifiable information using Amazon Bedrock Data Automation and Guardrails by Himanshu Dixit on January 8, 2026 at 4:14 pm
This post shows an automated PII detection and redaction solution using Amazon Bedrock Data Automation and Amazon Bedrock Guardrails through a use case of processing text and image content in high volumes of incoming emails and attachments. The solution features a complete email processing workflow with a React-based user interface for authorized personnel to more securely manage and review redacted email communications and attachments. We walk through the step-by-step solution implementation procedures used to deploy this solution. Finally, we discuss the solution benefits, including operational efficiency, scalability, security and compliance, and adaptability.
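The detect-and-mask shape of such a workflow can be sketched in a few lines. In the solution described above, detection is handled by Amazon Bedrock Guardrails' managed sensitive-information filters; the regex patterns below are only illustrative stand-ins, not a production-grade detector:

```python
import re

# Illustrative patterns standing in for a managed PII filter.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text):
    # Replace each match with a typed placeholder so reviewers can see
    # what kind of PII was removed without seeing the value itself.
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.findall(text):
            findings.append((label, match))
        text = pattern.sub(f"[{label}]", text)
    return text, findings

email_body = "Reach Jane at jane.doe@example.com or 555-867-5309; SSN 123-45-6789."
redacted, findings = redact(email_body)
```

Keeping the findings list separate from the redacted text mirrors the review workflow in the post: authorized personnel can audit what was removed without the values appearing in the downstream document.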
- Speed meets scale: Load testing Amazon SageMaker AI endpoints with Observe.AI’s testing tool by Aashraya Sachdeva on January 8, 2026 at 4:12 pm
Observe.AI developed the One Load Audit Framework (OLAF), which integrates with SageMaker to identify bottlenecks and performance issues in ML services, offering latency and throughput measurements under both static and dynamic data loads. In this blog post, you will learn how to use the OLAF utility to test and validate your SageMaker endpoint.
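OLAF itself is Observe.AI's tool, but the kind of static-load measurement it reports can be sketched generically: fire a fixed batch of requests at a chosen concurrency, then report per-request latency percentiles and aggregate throughput. The stub below stands in for a real endpoint call (for example, `invoke_endpoint` via the SageMaker runtime client):

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def measure(invoke, num_requests=32, concurrency=8):
    # Time each call individually for latency, and the whole batch for
    # throughput; `invoke` stands in for a real endpoint invocation.
    def timed_call(_):
        start = time.perf_counter()
        invoke()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(num_requests)))
    wall = time.perf_counter() - wall_start

    qs = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return {
        "p50_s": qs[49],
        "p99_s": qs[98],
        "throughput_rps": num_requests / wall,
    }

# Stub endpoint that sleeps ~10 ms, standing in for real inference.
report = measure(lambda: time.sleep(0.01))
```

Comparing p50 against p99 under increasing concurrency is what surfaces the queueing bottlenecks a framework like OLAF is designed to find.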
- Migrate MLflow tracking servers to Amazon SageMaker AI with serverless MLflow by Rahul Easwar on December 29, 2025 at 5:29 pm
This post shows you how to migrate your self-managed MLflow tracking server to an MLflow App, a serverless tracking server on SageMaker AI that automatically scales resources based on demand while removing server patching and storage management tasks, at no cost. Learn how to use the MLflow Export Import tool to transfer your experiments, runs, models, and other MLflow resources, including instructions to validate your migration’s success.
- Build an AI-powered website assistant with Amazon Bedrock by Shashank Jain on December 29, 2025 at 4:42 pm
This post demonstrates how to build an AI-powered website assistant using Amazon Bedrock and Amazon Bedrock Knowledge Bases.
- Programmatically creating an IDP solution with Amazon Bedrock Data Automation by Raian Osman on December 24, 2025 at 5:26 pm
In this post, we explore how to programmatically create an IDP solution that uses the Strands SDK, Amazon Bedrock AgentCore, Amazon Bedrock Knowledge Bases, and Bedrock Data Automation (BDA). This solution is provided through a Jupyter notebook that enables users to upload multimodal business documents and extract insights using BDA as a parser to retrieve relevant chunks and augment a prompt to a foundation model (FM).
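The retrieve-and-augment step at the end of that pipeline can be sketched in plain Python. The keyword-overlap scoring below is a deliberately naive stand-in for the embedding-based ranking a knowledge base retriever actually performs, and the invoice chunks are hypothetical, but the shape of the final prompt sent to the FM is the same:

```python
import re

def score_chunk(query, chunk):
    # Naive keyword-overlap relevance; a real knowledge base ranks chunks
    # by embedding similarity instead of shared words.
    q_terms = set(re.findall(r"\w+", query.lower()))
    c_terms = set(re.findall(r"\w+", chunk.lower()))
    return len(q_terms & c_terms)

def build_augmented_prompt(query, chunks, top_k=2):
    # Retrieve the highest-scoring chunks and splice them into the prompt
    # that would be sent to the foundation model.
    ranked = sorted(chunks, key=lambda c: score_chunk(query, c), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical chunks as a parser might emit them from a business document.
chunks = [
    "Invoice 1042 totals 5,300 USD and is due on March 1.",
    "The vendor onboarding checklist has four approval steps.",
    "Invoice 1042 was issued to Example Corp for consulting work.",
]
prompt = build_augmented_prompt("What is the total of invoice 1042?", chunks)
```

The point of the sketch is the grounding contract: only retrieved chunks enter the prompt, so the model answers from the document rather than from memory.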
- AI agent-driven browser automation for enterprise workflow management by Kosti Vasilakakis on December 24, 2025 at 5:22 pm
Enterprise organizations increasingly rely on web-based applications for critical business processes, yet many workflows remain manually intensive, creating operational inefficiencies and compliance risks. Despite significant technology investments, knowledge workers routinely navigate between eight and twelve different web applications during standard workflows, constantly switching contexts and manually transferring information between systems. Data entry and validation tasks…
- Agentic QA automation using Amazon Bedrock AgentCore Browser and Amazon Nova Act by Kosti Vasilakakis on December 24, 2025 at 5:20 pm
In this post, we explore how agentic QA automation addresses these challenges and walk through a practical example using Amazon Bedrock AgentCore Browser and Amazon Nova Act to automate testing for a sample retail application.
- Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer by Josh Longenecker on December 24, 2025 at 5:17 pm
In this post, we demonstrate how to optimize large language model (LLM) inference on Amazon SageMaker AI using BentoML’s LLM-Optimizer to systematically identify the best serving configurations for your workload.
- Exploring the zero operator access design of Mantle by Anthony Liguori on December 23, 2025 at 10:18 pm
In this post, we explore how Mantle, Amazon’s next-generation inference engine for Amazon Bedrock, implements a zero operator access (ZOA) design that eliminates any technical means for AWS operators to access customer data.
- AWS AI League: Model customization and agentic showdown by Marc Karp on December 23, 2025 at 5:36 pm
In this post, we explore the new AWS AI League challenges and how they are transforming the way organizations approach AI development. The grand finale at AWS re:Invent 2025 was an exciting showcase of the participants' ingenuity and skills.
- Accelerate Enterprise AI Development using Weights & Biases and Amazon Bedrock AgentCore by James Yi on December 23, 2025 at 5:32 pm
In this post, we demonstrate how to use Foundation Models (FMs) from Amazon Bedrock and the newly launched Amazon Bedrock AgentCore alongside W&B Weave to help build, evaluate, and monitor enterprise AI solutions. We cover the complete development lifecycle from tracking individual FM calls to monitoring complex agent workflows in production.
- How dLocal automated compliance reviews using Amazon Quick Automate by Martin Da Rosa on December 23, 2025 at 5:24 pm
In this post, we share how dLocal worked closely with the AWS team to help shape the product roadmap, reinforce its role as an industry innovator, and set new benchmarks for operational excellence in the global fintech landscape.
- Advancing ADHD diagnosis: How Qbtech built a mobile AI assessment model using Amazon SageMaker AI by Antonio Martellotta on December 23, 2025 at 5:11 pm
In this post, we explore how Qbtech streamlined their machine learning (ML) workflow using Amazon SageMaker AI, a fully managed service to build, train, and deploy ML models, and AWS Glue, a serverless service that makes data integration simpler, faster, and more cost-effective. This new solution reduced their feature engineering time from weeks to hours, while maintaining the high clinical standards required by healthcare providers.