AWS Machine Learning Blog: Official Machine Learning Blog of Amazon Web Services
- Build a dynamic, role-based AI agent using Amazon Bedrock inline agents, by Ishan Singh on February 13, 2025 at 8:56 pm
In this post, we explore how to build an application using Amazon Bedrock inline agents, demonstrating how a single AI assistant can adapt its capabilities dynamically based on user roles.
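As a rough, hedged illustration of the pattern described above (not the post's implementation), the sketch below calls the Bedrock agent runtime's InvokeInlineAgent operation and swaps the agent's instruction based on the caller's role. The role-to-instruction mapping, model ID, and event-stream parsing are assumptions on my part.

```python
import uuid
import boto3

# Hypothetical role-to-instruction mapping; the post's actual role logic may differ.
ROLE_INSTRUCTIONS = {
    "analyst": "You answer questions about financial reports. Never give investment advice.",
    "support": "You help customers troubleshoot orders. Never discuss internal finances.",
}

client = boto3.client("bedrock-agent-runtime")

def ask(role: str, question: str) -> str:
    """Invoke an inline agent whose capabilities depend on the caller's role."""
    response = client.invoke_inline_agent(
        foundationModel="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
        instruction=ROLE_INSTRUCTIONS[role],
        sessionId=str(uuid.uuid4()),
        inputText=question,
    )
    # Assumes an event stream of chunks, similar to InvokeAgent; concatenate the text.
    parts = []
    for event in response["completion"]:
        chunk = event.get("chunk", {})
        if "bytes" in chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)

print(ask("analyst", "Summarize last quarter's revenue drivers."))
```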
- Use language embeddings for zero-shot classification and semantic search with Amazon Bedrock, by Tom Rogers on February 13, 2025 at 8:53 pm
In this post, we explore what language embeddings are and how they can be used to enhance your application. We show how, by using the properties of embeddings, we can implement a real-time zero-shot classifier and add powerful features such as semantic search.
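As a hedged sketch of the idea (not the post's code), the snippet below embeds candidate labels and an input text with an Amazon Titan embeddings model via the Bedrock runtime, then picks the label with the highest cosine similarity. The model ID is an assumption; the request and response shapes follow the Titan Text Embeddings API.

```python
import json
import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime")
MODEL_ID = "amazon.titan-embed-text-v2:0"  # assumed embeddings model

def embed(text: str) -> np.ndarray:
    """Return the embedding vector for a piece of text."""
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(response["body"].read())["embedding"])

def zero_shot_classify(text: str, labels: list[str]) -> str:
    """Pick the label whose embedding is closest (cosine similarity) to the text."""
    text_vec = embed(text)
    scores = {}
    for label in labels:
        label_vec = embed(label)
        scores[label] = float(
            text_vec @ label_vec / (np.linalg.norm(text_vec) * np.linalg.norm(label_vec))
        )
    return max(scores, key=scores.get)

print(zero_shot_classify(
    "My package never arrived",
    ["shipping issue", "billing issue", "product question"],
))
```

The same `embed` helper supports semantic search: embed each document once, store the vectors, and rank documents by cosine similarity to the embedded query.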
- Fine-tune LLMs with synthetic data for context-based Q&A using Amazon Bedrock, by Sue Cha on February 12, 2025 at 5:44 pm
In this post, we explore how to use Amazon Bedrock to generate synthetic training data to fine-tune an LLM. Additionally, we provide concrete evaluation results that showcase the power of synthetic data in fine-tuning when data is scarce.
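As an illustrative, hedged sketch of one way to generate synthetic Q&A pairs with Amazon Bedrock (not the post's exact pipeline), the snippet below uses the Converse API to ask a model for question-answer pairs grounded in a context passage. The model ID, prompt, and output handling are assumptions.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def synthesize_qa_pairs(context: str, n_pairs: int = 3) -> list[dict]:
    """Ask a Bedrock model to generate Q&A pairs grounded in the given context."""
    prompt = (
        f"Generate {n_pairs} question-answer pairs that can be answered only from the "
        "context below. Respond as a JSON list of objects with 'question' and 'answer' keys.\n\n"
        f"Context:\n{context}"
    )
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.7, "maxTokens": 1024},
    )
    text = response["output"]["message"]["content"][0]["text"]
    # Assumes the model returns valid JSON; production code would validate and retry.
    return json.loads(text)

pairs = synthesize_qa_pairs("Amazon Bedrock is a fully managed service for foundation models.")
print(pairs)  # pairs can then be written out as fine-tuning records
```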
- Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI, by Daniel Zagyva on February 12, 2025 at 5:41 pm
Researchers developed Medusa, a framework to speed up LLM inference by adding extra heads that predict multiple tokens simultaneously. This post demonstrates how to use Medusa-1, the first version of the framework, to speed up an LLM by fine-tuning it on Amazon SageMaker AI, and confirms the speedup with a deployment and a simple load test. Medusa-1 achieves an inference speedup of around two times without sacrificing model quality, with the exact improvement varying based on model size and the data used. In this post, we demonstrate its effectiveness with a 1.8 times speedup observed on a sample dataset.
- LLM-as-a-judge on Amazon Bedrock Model Evaluation, by Adewale Akinfaderin on February 12, 2025 at 5:36 pm
This blog post explores LLM-as-a-judge on Amazon Bedrock Model Evaluation, providing comprehensive guidance on feature setup, evaluation job initiation through both the console and the Python SDK and APIs, and demonstrating how this innovative evaluation feature can enhance generative AI applications across multiple metric categories, including quality, user experience, instruction following, and safety.
- From concept to reality: Navigating the Journey of RAG from proof of concept to production, by Vivek Mittal on February 12, 2025 at 5:27 pm
In this post, we explore the movement of RAG applications from their proof of concept or minimal viable product (MVP) phase to full-fledged production systems. When transitioning a RAG application from a proof of concept to a production-ready system, optimization becomes crucial to make sure the solution is reliable, cost-effective, and high-performing.
- Meta SAM 2.1 is now available in Amazon SageMaker JumpStart, by Marco Punio on February 11, 2025 at 11:09 pm
We are excited to announce that Meta’s Segment Anything Model (SAM) 2.1 vision segmentation model is publicly available through Amazon SageMaker JumpStart to deploy and run inference. Meta SAM 2.1 provides state-of-the-art video and image segmentation capabilities in a single model. In this post, we explore how SageMaker JumpStart empowers data scientists and ML engineers to discover, access, and deploy a wide range of pre-trained FMs for inference, including Meta’s most advanced and capable models to date.
- Falcon 3 models now available in Amazon SageMaker JumpStart, by Niithiyn Vijeaswaran on February 11, 2025 at 10:16 pm
We are excited to announce that the Falcon 3 family of models from TII is available in Amazon SageMaker JumpStart. In this post, we explore how to deploy these models efficiently on Amazon SageMaker AI.
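For both JumpStart announcements above, deployment follows the same SageMaker Python SDK pattern. The sketch below is a generic, hedged example; the model_id placeholder must be replaced with the actual JumpStart ID for Meta SAM 2.1 or a Falcon 3 variant (not listed in this digest), and the payload schema depends on the chosen model.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Placeholder: look up the real JumpStart model ID (e.g., for Meta SAM 2.1 or Falcon 3)
# in the SageMaker JumpStart model catalog before deploying.
model = JumpStartModel(model_id="<jumpstart-model-id>")

# Deploy to a real-time endpoint; instance type defaults come from the model's metadata
# and can be overridden, e.g. model.deploy(instance_type="ml.g5.2xlarge").
predictor = model.deploy()

# Run inference and clean up when done. The request format here assumes a text model.
result = predictor.predict({"inputs": "Hello, Falcon!"})
print(result)

predictor.delete_endpoint()
```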
- Building a virtual meteorologist using Amazon Bedrock Agents, by Salman Ahmed on February 11, 2025 at 8:53 pm
In this post, we present a streamlined approach to deploying an AI-powered agent by combining Amazon Bedrock Agents and a foundation model (FM). We guide you through the process of configuring the agent and implementing the specific logic required for the virtual meteorologist to provide accurate weather-related responses.
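Once an Amazon Bedrock agent such as the virtual meteorologist is configured, invoking it from code follows the standard agent-runtime pattern. The sketch below is a hedged, generic example; the agent ID, alias ID, and prompt are placeholders rather than the post's values.

```python
import uuid
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

def ask_agent(agent_id: str, agent_alias_id: str, question: str) -> str:
    """Send a question to a configured Bedrock agent and collect the streamed answer."""
    response = agent_runtime.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),
        inputText=question,
    )
    answer = []
    for event in response["completion"]:  # event stream of response chunks
        chunk = event.get("chunk", {})
        if "bytes" in chunk:
            answer.append(chunk["bytes"].decode("utf-8"))
    return "".join(answer)

# Placeholders: replace with the IDs of your own agent and alias.
print(ask_agent("<agent-id>", "<agent-alias-id>", "Will it rain in Seattle tomorrow?"))
```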
- Amazon Q Business simplifies integration of enterprise knowledge bases at scale, by Omar Elkharbotly on February 11, 2025 at 5:11 pm
In this post, we demonstrate how to build a knowledge base solution by integrating enterprise data with Amazon Q Business using Amazon S3. This approach helps organizations improve operational efficiency, reduce response times, and gain valuable insights from their historical data. The solution uses AWS security best practices to promote data protection while enabling teams to create a comprehensive knowledge base from various data sources.
- Faster distributed graph neural network training with GraphStorm v0.4, by Theodore Vasiloudis on February 11, 2025 at 5:03 pm
GraphStorm is a low-code enterprise graph machine learning (ML) framework that provides ML practitioners with a simple way of building, training, and deploying graph ML solutions on industry-scale graph data. In this post, we demonstrate how GraphBolt enhances GraphStorm’s performance in distributed settings. We provide a hands-on example of using GraphStorm with GraphBolt on SageMaker for distributed training. Lastly, we share how to use Amazon SageMaker Pipelines with GraphStorm.
- Transforming credit decisions using generative AI with Rich Data Co and AWS, by Daniel Wirjo on February 10, 2025 at 8:05 pm
The mission of Rich Data Co (RDC) is to broaden access to sustainable credit globally. Its software-as-a-service (SaaS) solution empowers leading banks and lenders with deep customer insights and AI-driven decision-making capabilities. In this post, we discuss how RDC uses generative AI on Amazon Bedrock to build these assistants and accelerate its overall mission of democratizing access to sustainable credit.
- Build agentic AI solutions with DeepSeek-R1, CrewAI, and Amazon SageMaker AI, by Surya Kari on February 10, 2025 at 7:33 pm
In this post, we demonstrate how you can deploy an LLM such as DeepSeek-R1 (or another FM of your choice) from popular model hubs like SageMaker JumpStart or the Hugging Face Hub to SageMaker AI for real-time inference. We explore inference frameworks like Hugging Face TGI, which help streamline deployment while integrating built-in performance optimizations to minimize latency and maximize throughput. Additionally, we showcase how SageMaker's developer-friendly Python SDK simplifies endpoint orchestration, allowing seamless experimentation and scaling of LLM-powered applications.
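A hedged sketch of the deployment pattern this summary describes, using the SageMaker Python SDK with a Hugging Face TGI container. The specific model ID, instance type, and environment settings here are assumptions, not the post's exact configuration.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes this runs inside a SageMaker environment

# TGI (text-generation-inference) serving container for LLMs.
image_uri = get_huggingface_llm_image_uri("huggingface")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",  # assumed distilled variant
        "SM_NUM_GPUS": "1",
        "MAX_INPUT_LENGTH": "4096",
        "MAX_TOTAL_TOKENS": "8192",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # assumed instance type; size to the chosen model
)

print(predictor.predict({"inputs": "Explain chain-of-thought prompting in one sentence."}))
```

The resulting real-time endpoint can then be wired into an agent framework such as CrewAI as the underlying LLM.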
- Automate bulk image editing with Crop.photo and Amazon Rekognition, by Rahul Bhargava on February 10, 2025 at 6:50 pm
In this post, we explore how Crop.photo uses Amazon Rekognition to provide sophisticated image analysis, enabling automated and precise editing of large volumes of images. This integration streamlines the image editing process for clients, providing speed and accuracy, which is crucial in the fast-paced environments of ecommerce and sports.
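As a small, hedged illustration of the kind of Amazon Rekognition call that can drive automated editing decisions (not Crop.photo's implementation), the snippet below detects labels and bounding boxes for an image stored in Amazon S3; the bucket and key names are placeholders.

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholders: point at your own bucket and image key.
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "<my-bucket>", "Name": "products/shoe-001.jpg"}},
    MaxLabels=10,
    MinConfidence=80,
)

# Bounding boxes (when present) give the geometry an editing pipeline could crop around.
for label in response["Labels"]:
    for instance in label.get("Instances", []):
        box = instance["BoundingBox"]
        print(label["Name"], f"{instance['Confidence']:.1f}%", box)
```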
- Revolutionizing business processes with Amazon Bedrock and Appian’s generative AI skills, by Sunil Bemarkar on February 10, 2025 at 6:37 pm
AWS and Appian’s collaboration marks a significant advancement in business process automation. By using the power of Amazon Bedrock and Anthropic’s Claude models, Appian empowers enterprises to optimize and automate processes for greater efficiency and effectiveness. This blog post will cover how Appian AI skills build automation into organizations’ mission-critical processes to improve operational excellence, reduce costs, and build scalable solutions.
- Governing the ML lifecycle at scale, Part 4: Scaling MLOps with security and governance controls, by Jia (Vivian) Li on February 7, 2025 at 8:25 pm
This post provides detailed steps for setting up the key components of a multi-account ML platform. This includes configuring the ML Shared Services Account, which manages the central templates, model registry, and deployment pipelines; sharing the ML Admin and SageMaker Projects Portfolios from the central Service Catalog; and setting up the individual ML Development Accounts where data scientists can build and train models.
- Accelerate your Amazon Q implementation: starter kits for SMBs, by Nneoma Okoroafor on February 7, 2025 at 5:29 pm
Starter kits are complete, deployable solutions that address common, repeatable business problems. They deploy the services that make up a solution according to best practices, helping you optimize costs and become familiar with these kinds of architectural patterns without a large investment in training. In this post, we showcase a starter kit for Amazon Q Business. If you have a repository of documents that you need to turn into a knowledge base quickly, or simply want to test out the capabilities of Amazon Q Business without a large investment of time at the console, then this solution is for you.
- Building the future of construction analytics: CONXAI’s AI inference on Amazon EKS, by Tim Krause on February 7, 2025 at 5:21 pm
CONXAI Technology GmbH is pioneering the development of an advanced AI platform for the Architecture, Engineering, and Construction (AEC) industry. In this post, we dive deep into how CONXAI hosts the state-of-the-art OneFormer segmentation model on AWS using Amazon Simple Storage Service (Amazon S3), Amazon Elastic Kubernetes Service (Amazon EKS), KServe, and NVIDIA Triton.
- How Untold Studios empowers artists with an AI assistant built on Amazon Bedrock, by Olivier Vigneresse on February 7, 2025 at 5:06 pm
Untold Studios is a leading, tech-driven creative studio specializing in high-end visual effects and animation. This post details how we used Amazon Bedrock to create an AI assistant (Untold Assistant), providing artists with a straightforward way to access our internal resources through a natural language interface integrated directly into their existing Slack workflow.
- Protect your DeepSeek model deployments with Amazon Bedrock Guardrails, by Satveer Khurpa on February 7, 2025 at 2:29 am
This blog post provides a comprehensive guide to implementing robust safety protections for DeepSeek-R1 and other open weight models using Amazon Bedrock Guardrails. By following this guide, you’ll learn how to use the advanced capabilities of DeepSeek models while maintaining strong security controls and promoting ethical AI practices.
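As a hedged sketch of checking content against an existing guardrail at inference time (not the full setup the post walks through), the snippet below uses the Bedrock runtime's ApplyGuardrail operation, which can screen prompts and responses for models hosted outside Bedrock, such as a self-deployed DeepSeek model. The guardrail ID and version are placeholders for a guardrail created beforehand in Amazon Bedrock Guardrails.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def allowed_by_guardrail(text: str, source: str = "INPUT") -> bool:
    """Return True if the guardrail permits the text, False if it intervenes."""
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier="<guardrail-id>",  # placeholder
        guardrailVersion="1",                  # placeholder
        source=source,                         # "INPUT" for prompts, "OUTPUT" for model responses
        content=[{"text": {"text": text}}],
    )
    return response["action"] != "GUARDRAIL_INTERVENED"

user_prompt = "Tell me how to do something harmful."
if not allowed_by_guardrail(user_prompt):
    print("Blocked by guardrail before reaching the DeepSeek model.")
```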