Together with BMW Group, we at AWS developed an agentic search solution that changes how users interact with massive datasets. By combining semantic search, SQL querying, and exhaustive analysis within an AI agent framework, the solution lets BMW employees at any skill level extract actionable insights from petabytes of data using natural language.

Read the full case study on the AWS Industries blog

The Challenge

BMW Group’s Cloud Data Hub stores 20 petabytes of data with an average daily ingestion of 110 TB. Traditional data analysis requires users to search through hundreds of data assets, write complex SQL queries, and interpret raw results – a significant barrier for non-technical users, especially with terminology variations across German and English descriptions.

Solution: Three Search Strategies

An AI agent selects among three complementary search approaches based on the characteristics of each query:

Hybrid Search

Combines semantic similarity with SQL filtering for contextual queries with structured constraints.

Example: “Find brake system feedback in F09 vehicles from Q4 2024”
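To make the combination concrete, here is a minimal sketch of a hybrid lookup under stated assumptions. The `Record` fields, the three-dimensional toy vectors, and the filter-then-rank order are all illustrative choices and are not taken from the BMW implementation, where S3 Vectors and Athena do this work.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    model: str              # structured column, e.g. "F09" (hypothetical schema)
    quarter: str            # structured column, e.g. "2024-Q4" (hypothetical schema)
    embedding: list[float]  # toy vector; the real index stores 1,024 dimensions


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(records, query_embedding, model, quarter, top_k=2):
    """Apply the structured filter, then rank the survivors by similarity."""
    candidates = [r for r in records if r.model == model and r.quarter == quarter]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r.embedding, query_embedding),
        reverse=True,
    )
    return [r.record_id for r in ranked[:top_k]]


RECORDS = [
    Record("r1", "F09", "2024-Q4", [0.9, 0.1, 0.0]),  # close to the query vector
    Record("r2", "F09", "2024-Q4", [0.1, 0.9, 0.0]),  # passes the filter, less similar
    Record("r3", "F09", "2024-Q3", [1.0, 0.0, 0.0]),  # wrong quarter
    Record("r4", "F00", "2024-Q4", [1.0, 0.0, 0.0]),  # wrong model
]
BRAKE_QUERY = [1.0, 0.0, 0.0]  # stand-in for the embedded query text

print(hybrid_search(RECORDS, BRAKE_QUERY, model="F09", quarter="2024-Q4"))
```

Records r3 and r4 are semantically closest to the query but never reach the ranking step, which is the point of pairing the similarity score with structured constraints.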

Exhaustive Search

Uses AI-powered evaluation to analyze all matching records when comprehensive coverage is needed.

Example: “How many brake-related issues occurred on the F00 model?”
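The difference from a top-k lookup is that every candidate record is evaluated, so a count is complete and not a sample. The sketch below illustrates that loop only. The `keyword_evaluator` function is a hypothetical stand-in for the AI-powered evaluation, and the record layout is assumed for illustration.

```python
from typing import Callable, Iterable


def exhaustive_count(
    records: Iterable[dict],
    is_relevant: Callable[[str], bool],
    model: str,
) -> dict:
    """Evaluate every record for the given model; there is no top-k cutoff."""
    matched_ids = []
    evaluated = 0
    for record in records:
        if record["model"] != model:
            continue
        evaluated += 1
        if is_relevant(record["text"]):
            matched_ids.append(record["id"])
    return {"evaluated": evaluated, "matched": len(matched_ids), "ids": matched_ids}


def keyword_evaluator(text: str) -> bool:
    """Hypothetical stand-in for the model-based evaluation (German and English keywords)."""
    lowered = text.lower()
    return "brake" in lowered or "brems" in lowered


SAMPLE = [
    {"id": "q1", "model": "F00", "text": "Brake pedal feels soft"},
    {"id": "q2", "model": "F00", "text": "Bremsscheibe quietscht"},
    {"id": "q3", "model": "F00", "text": "Infotainment reboot"},
    {"id": "q4", "model": "F09", "text": "Brake noise at low speed"},
]

print(exhaustive_count(SAMPLE, keyword_evaluator, model="F00"))
```

Passing the evaluator in as a callable keeps the counting logic independent of how relevance is judged, so a model-backed evaluator could replace the keyword check without changing the loop.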

SQL Query

Direct structured queries for pure analytics without semantic analysis.

Example: “Count total quality records by vehicle model and severity”
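A query like this maps to a plain aggregate. The sketch below runs one against an in-memory SQLite table as a local stand-in for Athena. The table name `quality_records` and its columns are assumptions made for illustration, not the actual Cloud Data Hub schema.

```python
import sqlite3

# Hypothetical table and column names; the real schema is not given in the source.
QUERY = """
SELECT vehicle_model, severity, COUNT(*) AS record_count
FROM quality_records
GROUP BY vehicle_model, severity
ORDER BY vehicle_model, severity
"""


def count_by_model_and_severity(rows):
    """Load rows into an in-memory table and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE quality_records (vehicle_model TEXT, severity TEXT)")
        conn.executemany("INSERT INTO quality_records VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


ROWS = [("F09", "high"), ("F09", "high"), ("F09", "low"), ("F00", "low")]

for model, severity, count in count_by_model_and_severity(ROWS):
    print(model, severity, count)
```

No embeddings are involved at any point, which is why this path suits pure analytics questions.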

Architecture

The solution combines AWS services for scalability and intelligence:

  • Amazon S3 Vectors – Semantic similarity search on 1,024-dimensional embeddings
  • Amazon Athena – Serverless SQL execution on structured data
  • Amazon Bedrock – Titan Text Embeddings for vector generation, Claude Sonnet 4.5 for orchestration
  • Strands Agents SDK – Agent orchestration and tool selection
  • AWS Lambda – Serverless data ingestion pipeline
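As an illustration of the embedding step, the sketch below builds a request for a 1,024-dimensional vector and sends it through a Bedrock runtime client's `invoke_model` call. The model ID is an assumption (the list above names only "Titan Text Embeddings", not a version), and `FakeBedrockClient` is a hypothetical offline stub so the example runs without AWS credentials. With real credentials, a client from `boto3.client("bedrock-runtime")` would take its place.

```python
import io
import json

MODEL_ID = "amazon.titan-embed-text-v2:0"  # assumption: the exact Titan version is not stated
DIMENSIONS = 1024                          # matches the embedding size listed above


def build_embedding_request(text: str) -> str:
    """Serialize the request body for a Titan Text Embeddings V2 call."""
    return json.dumps({"inputText": text, "dimensions": DIMENSIONS, "normalize": True})


def embed(client, text: str) -> list[float]:
    """Call invoke_model on a Bedrock runtime client and return the vector."""
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_embedding_request(text),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    vector = payload["embedding"]
    if len(vector) != DIMENSIONS:
        raise ValueError(f"expected {DIMENSIONS} dimensions, got {len(vector)}")
    return vector


class FakeBedrockClient:
    """Hypothetical offline stub that mimics the shape of the invoke_model response."""

    def invoke_model(self, modelId, body, contentType, accept):
        assert json.loads(body)["dimensions"] == DIMENSIONS
        fake_payload = json.dumps({"embedding": [0.0] * DIMENSIONS}).encode()
        return {"body": io.BytesIO(fake_payload)}


vector = embed(FakeBedrockClient(), "Bremsen quietschen bei niedriger Geschwindigkeit")
print(len(vector))
```

Taking the client as a parameter keeps the function testable offline and leaves credential handling to the caller.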

Figure: Agentic Search Solution Architecture

The incremental ingestion pipeline continuously indexes new quality records as they’re reported, enabling real-time data availability.
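One common way to index incrementally is to track a watermark and process only records newer than it. The sketch below shows that idea under stated assumptions: the `reported_at` field, the in-memory `index` dictionary, and the watermark approach itself are illustrative, and the source does not describe how the Lambda-based pipeline detects new records.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementalIndexer:
    """Indexes only records reported after the last processed timestamp."""

    watermark: str = ""  # ISO-8601 timestamp of the newest indexed record
    index: dict = field(default_factory=dict)

    def ingest(self, batch: list[dict]) -> int:
        """Index unseen records from the batch and return how many were new."""
        # Same-format ISO-8601 strings compare correctly as plain strings.
        # A record sharing the watermark's exact timestamp would be skipped.
        new_records = [r for r in batch if r["reported_at"] > self.watermark]
        for record in sorted(new_records, key=lambda r: r["reported_at"]):
            # A real pipeline would embed the text and write the vector here.
            self.index[record["id"]] = record["text"]
            self.watermark = record["reported_at"]
        return len(new_records)


indexer = IncrementalIndexer()
first_batch = [
    {"id": "a", "reported_at": "2024-10-01T08:00:00Z", "text": "Brake squeal"},
    {"id": "b", "reported_at": "2024-10-02T08:00:00Z", "text": "Door seal leak"},
]
later_record = {"id": "c", "reported_at": "2024-10-03T08:00:00Z", "text": "Bremse weich"}

print(indexer.ingest(first_batch))                   # both records are new
print(indexer.ingest(first_batch + [later_record]))  # only the later record is new
```

Re-sending already indexed records is harmless here, because anything at or before the watermark is ignored.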

Key Benefits

  • Unified data interface – Handles structured and unstructured data in a single conversation
  • Natural language access – Enables non-technical users to extract insights independently
  • Scalable – Serverless architecture scales to zero when not in use
  • Multilingual – Handles German and English terminology variations
  • Transparent – Explains the AI reasoning behind each result

This solution demonstrates how generative AI can transform enterprise data analytics, making petabyte-scale insights accessible to all users regardless of technical expertise.

Learn More

📖 AWS Industries Blog: BMW Group Unlocks Insights from Petabytes of Data with Agentic Search on AWS

☁️ Amazon S3 Vectors: AWS Documentation

🤖 Amazon Bedrock AgentCore: AWS Documentation

📦 Strands Agents SDK: GitHub Repository