Construct and Run Your Personal AI Agent within the Cloud

to carry out helpful work may be comparatively easy. For a easy utility, just a few traces of Python utilizing the boto3 library and the Bedrock API could also be all that’s wanted. You arrange entry to an LLM, ship a immediate and a few enter, obtain a response, and return it to the person.

That method turns into tougher to handle in case your utility has to tackle extra duties. As soon as the mannequin wants to keep up dialog context, select between instruments, observe detailed directions, or coordinate a number of steps, the encompassing utility code begins to resemble an agent framework.

Context Engineering for RAG : The 4 Typed Inputs Behind Each RAG Reply

Immediate Engineering Fails Quietly — Immediate Regression Is Why

That’s the place the usage of Strands and AgentCore comes into play.

Strands gives that agent layer, however working an agent reliably introduces a distinct set of issues. It wants someplace to run, an invocation interface, session isolation, scaling, safety, and probably providers corresponding to long-term reminiscence or managed entry to instruments. Amazon Bedrock AgentCore gives these operational capabilities with out defining how the agent itself ought to behave.

NB. Aside from being a person of their programs I’ve no affiliation or affiliation with AWS.

All photographs proven on this article, other than the headline picture which is AI generated, have been created by the creator.

Strands

Strands is an open-source agent framework from AWS. It gives the application-level parts wanted to create an agent.

An LLM, corresponding to Claude Sonnet, by way of Amazon Bedrock or native fashions by way of Ollama and different suppliers.
A system immediate that defines the agent’s function and behavior.
Instruments the mannequin can select to name.
Dialog messages and context.
An agent loop that sends requests to the mannequin, executes requested instruments, and returns instrument outcomes to the mannequin.

Consider Strands because the AWS equal to LangChain or CrewAI.

A fundamental Strands agent may be created with only a mannequin and a system immediate:

from strands import Agent

mannequin = 'your_LLM_choice'

agent = Agent(
    mannequin=mannequin,
    system_prompt="You're an academic SME assistant.",
)
response = agent("Clarify Newton's first regulation.")

Though I’m framing this text as a Strands/AgentCore double group, Strands itself doesn’t require AgentCore to run. A Strands agent is odd utility code and may run domestically or on different infrastructure. AgentCore is available in when the agent wants managed deployment on AWS infrastructure and manufacturing providers, corresponding to reminiscence, scalability and observability.

The agent we’re constructing

We’ll construct and deploy a Strands-based instructional assistant earlier than including AgentCore Reminiscence to protect person preferences between conversations. Our agent will assist questions on:

Arithmetic
Physics
Chemistry
Geography

The mannequin decides which topic most closely fits every query, so there’s no key phrase listing or separate routing instrument to keep up. It solutions supported questions because the related topic professional and declines something outdoors the 4 supported areas.

This can be a helpful first agent as a result of its behaviour is simple to grasp, but the applying nonetheless entails choices present in bigger programs: choosing a mannequin, writing directions, validating requests, testing mannequin behaviour, deploying the applying, and managing dialog classes.

I’m solely utilizing one Strands agent relatively than separate specialist brokers for every topic. Separate brokers change into worthwhile when topics want totally different fashions, instruments, directions, knowledge, or permissions, however for proper now, that will add pointless coordination and complexity to this preliminary implementation.

AgentCore

Amazon Bedrock AgentCore is a set of managed AWS providers for constructing, deploying, connecting, and working brokers on the AWS cloud. It’s framework-agnostic, so it will probably host brokers constructed with Strands, LangChain, OpenAI Brokers SDK, and different frameworks.
The primary AgentCore capabilities are:

+------------------+---------------------------------------------------------------------------------------------+
| Functionality       | Goal                                                                                     |
+------------------+---------------------------------------------------------------------------------------------+
| Runtime          | Hosts and scales brokers in session-isolated environments. Helps streaming and HTTP, MCP, |
|                  | and A2A protocols.                                                                          |
+------------------+---------------------------------------------------------------------------------------------+
| Reminiscence           | Shops dialog occasions and extracts sturdy info, preferences, summaries, or episodes  |
|                  | to be used throughout classes.                                                                    |
+------------------+---------------------------------------------------------------------------------------------+
| Gateway          | Exposes APIs, Lambda features, and MCP servers as managed instruments that brokers can uncover   |
|                  | and name.                                                                                   |
+------------------+---------------------------------------------------------------------------------------------+
| Id         | Manages inbound authentication and the credentials brokers use to entry exterior providers.  |
+------------------+---------------------------------------------------------------------------------------------+
| Coverage           | Applies Cedar authorisation guidelines to Gateway instrument calls earlier than they attain their targets.    |
+------------------+---------------------------------------------------------------------------------------------+
| Browser          | Offers managed browser classes for brokers that must work together with web sites.           |
+------------------+---------------------------------------------------------------------------------------------+
| Code Interpreter | Runs Python, JavaScript, or TypeScript in remoted managed sandboxes.                       |
+------------------+---------------------------------------------------------------------------------------------+
| Observability    | Sends agent logs, traces, and metrics to providers corresponding to CloudWatch and X-Ray.             |
+------------------+---------------------------------------------------------------------------------------------+
| Evaluations      | Measures agent behaviour and response high quality utilizing built-in or customized evaluators.          |
+------------------+---------------------------------------------------------------------------------------------+

These capabilities are unbiased, that means an agent can use solely those it wants and add others as its duties develop.

We’ll use AgentCore to scaffold our Strands agent and deploy it to AWS with AgentCore Runtime. This creates a traditional undertaking construction in your native system, an area growth workflow, a deployment configuration, and a runtime entry level.

Our SME agent has no exterior instruments and doesn’t want any of the opposite AgentCore capabilities but, however we’ll add a reminiscence part later.

How Strands and AgentCore match collectively

Strands controls what our agent does. The chosen mannequin, system immediate, dialog context, and any instruments are a part of the Strands utility.

AgentCore Runtime controls the place and the way that utility runs. It gives the managed surroundings across the agent: deployment, scaling, session isolation, streaming, and an invocation interface.
This distinction is value noting as a result of deploying an agent with AgentCore doesn’t outline the general system behaviour. That basically comes from the mannequin alternative and the underlying Strands code.

Putting in the event instruments

To observe alongside, you’ll want:

An AWS account.
AWS credentials configured domestically.
Node.js 20 or later.
Python 3.10 or later.
AWS CDK (utilized by AgentCore for deployment).
The AgentCore CLI.
Entry to the chosen mannequin in Amazon Bedrock.

My set up was performed on a Home windows PC utilizing PowerShell. Set up the AWS CLI and Node.js first if these instructions are unavailable.

PS C: > msiexec.exe /i https://awscli.amazonaws.com/AWSCLIV2.msi
PS C: > aws --version

#
# Output
#
aws-cli/2.22.15 Python/3.12.6 Home windows/11 exe/AMD64

Set up the AWS CDK globally with npm:

PS C: > npm set up -g aws-cdk
PS C: > cdk --version


#
# Output
#
2.1126.0 (construct a90d578)

Set up the AgentCore CLI:

PS C: > npm set up -g @aws/agentcore
PS C: > agentcore --version


#
# Output
#
The AgentCore CLI collects aggregated, nameless utilization
analytics to assist enhance the instrument.
To decide out: agentcore config telemetry.enabled false
To audit: agentcore config telemetry.audit true
To study extra: agentcore telemetry --help

0.19.0

The AgentCore CLI create command creates the Python surroundings and installs the generated undertaking’s dependencies when working domestically. The generated pyproject.toml file information dependencies, corresponding to Strands and the AgentCore SDK.

Configure AWS credentials utilizing the method applicable for the surroundings. For native growth, that is generally an AWS profile file.

Affirm that the LLM or inference profile you wish to use is obtainable from the supposed supply area. You are able to do that from the AWS CLI like this:

PS C: > aws bedrock list-foundation-models --region 
PS C: > aws bedrock list-inference-profiles --region

Selecting your mannequin

When creating an AgentCore undertaking, the –model-provider Bedrock command-line flag selects Amazon Bedrock because the supplier however doesn’t specify which basis mannequin the applying intends to make use of.

Strands makes use of certainly one of Anthropic’s fashions by default when no mannequin is provided. On the time of writing, the newest default Strands mannequin is international.anthropic.claude-sonnet-4–6, but it surely’s finest apply to at all times specify a specific mannequin you need Strands to make use of.

This instance code fragment explicitly specifies the usage of Anthropic Claude Sonnet 4.6 by way of a worldwide cross-region inference profile:

from strands.fashions import BedrockModel

mannequin = BedrockModel(
    model_id="international.anthropic.claude-sonnet-4-6",
    region_name="us-west-2",
    temperature=0.2,
    max_tokens=1_500,
)

Bedrock has a number of mannequin households to select from, and Claude Sonnet is an efficient place to begin for SME responses, whereas Claude Haiku or Amazon Nova Micro could swimsuit an easier, cost-sensitive routing workload.

World cross-region inference can enhance availability, however it might not fulfill each data-residency requirement. A geographic or utility inference profile is a more sensible choice when processing should stay inside an outlined geographic boundary, corresponding to Europe or Asia.

Creating the AgentCore undertaking

Use the AgentCore CLI to initialise a Strands undertaking with Bedrock as its mannequin supplier:

PS C:Usersthomaprojectsstrands-agentcore-demo> agentcore create --name SMETriage --framework Strands --protocol HTTP --model-provider Bedrock --build CodeZip --memory none

#
# Output
#
[done] Create SMETriage/ undertaking listing
[done] Put together agentcore/ listing
[done] Initialize git repository
[done] Add agent to undertaking
[done] Arrange Python surroundings

Created:
SMETriage/
app/SMETriage/ Python agent (Strands)
agentcore/ Config and CDK undertaking

Mission created efficiently!

To proceed, navigate to your new undertaking:

cd SMETriage

We already talked about the –mannequin flag. One other vital choice right here is: –construct CodeZip.

CodeZip is AgentCore’s direct-code deployment format for Python functions. As an alternative of constructing a Docker picture and utilizing ECR/ECS, with this flag, the AgentCore CLI:

Collects the agent’s Python supply recordsdata.
Resolves and packages its dependencies.
Creates a ZIP archive containing the applying and Linux
ARM64-compatible dependencies.
Uploads the archive to Amazon S3.
Configures AgentCore Runtime to run the Python entrypoint.

The undertaking information this alternative within the agentcore/agentcore.json file.

After the AgentCore create command has accomplished, you need to see a undertaking folder construction just like this:

SMETriage/
|-- agentcore/
|   |-- agentcore.json       # Mission and deployment configuration
|   |-- aws-targets.json     # Goal AWS account and area
|   `-- .env.native           # Native-only values; gitignored
`-- app/
    `-- SMETriage/
        |-- primary.py          # Agent entrypoint
        `-- pyproject.toml   # Python dependencies

The generated undertaking may be examined earlier than making any adjustments:

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore dev

This may open an internet web page at localhost:8080 or the same tackle, so open that, and you may ask the mannequin questions.

How AgentCore Manages IAM Permissions

The AgentCore CLI creates the agent’s Runtime execution function and provides the permissions required to invoke the chosen Bedrock mannequin, entry configured Reminiscence assets, and write Runtime logs.

You don’t must create this function manually for this instance. Nonetheless, the IAM identification working AgentCore deploy should have already got permission to create and go IAM roles, deploy CloudFormation assets, add CodeZip packages to S3, and handle the required AgentCore assets.

Customers and functions additionally want permission to invoke the deployed agent. Throughout growth, the generated IAM insurance policies are normally enough. Earlier than transferring into manufacturing, these must be reviewed and restricted to the precise Runtime, mannequin, Reminiscence useful resource, and supporting infrastructure.

Implementing the SME agent

Within the primary.py file beneath the app/SMETriage folder, exchange the contents with the next code:

import os
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent
from strands.fashions import BedrockModel

app = BedrockAgentCoreApp()

MODEL_ID = os.getenv(
    "BEDROCK_MODEL_ID",
    "international.anthropic.claude-sonnet-4-6",
)

REGION = os.getenv("AWS_REGION", "us-west-2")

SUBJECT_PREFIXES = {
    "arithmetic": (
        "That is your Math SME. "
        "The reply to your query is as follows."
    ),
    "physics": (
        "That is your Physics SME. "
        "The reply to your query is as follows."
    ),
    "chemistry": (
        "That is your Chemistry SME. "
        "The reply to your query is as follows."
    ),
    "geography": (
        "That is your Geography SME. "
        "The reply to your query is as follows."
    ),
}

def load_model() -> BedrockModel:
    return BedrockModel(
        model_id=MODEL_ID,
        region_name=REGION,
        temperature=0.2,
        max_tokens=1_500,
    )

def build_agent() -> Agent:
    prefixes = "n".be a part of(
        f"- {topic}: {prefix}"
        for topic, prefix in SUBJECT_PREFIXES.objects()
    )
    return Agent(
        mannequin=load_model(),
        system_prompt=(
            "You're an academic SME triage assistant. "
            "First resolve whether or not the person's query is primarily about "
            "arithmetic, physics, chemistry, geography, or an unsupported "
            "topic. "
            "Reply solely questions on arithmetic, physics, chemistry, "
            "or geography. "
            "For supported questions, start with the precise prefix for the "
            "chosen topic, then give a transparent and correct clarification. "
            "When a query overlaps topics, select the topic most "
            "vital to answering it. "
            "For unsupported questions, reply precisely: "
            ""I am sorry, I do not know the reply to that."nn"
            "Required topic prefixes:n"
            f"{prefixes}"
        ),
    )

@app.entrypoint
def invoke(payload, context):
    immediate = payload.get("immediate")
    if not isinstance(immediate, str) or not immediate.strip():
        return {"error": "A non-empty query is required."}
    if len(immediate) > 2_000:
        return {"error": "Query exceeds the utmost size."}
    response = build_agent()(immediate.strip())
    return {"response": str(response)}

if __name__ == "__main__":
    app.run()

There are three elements value inspecting: mannequin loading, routing directions, and the Runtime entrypoint.

1/ Preserving mannequin configuration specific

The load_model perform centralises the Bedrock configuration.
The BEDROCK_MODEL_ID surroundings variable can override the default LLM with out altering the supply, for instance

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> $env:BEDROCK_MODEL_ID = ""
PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore dev

That is helpful when evaluating fashions or utilizing totally different inference profiles between environments.

2/ Letting the mannequin route questions

The agent doesn’t use a topic key phrase listing to find out the right way to decode a query. The system immediate asks the mannequin to resolve whether or not a query is primarily about one of many supported topics.
That versatile method handles disparate questions corresponding to:

“Clarify the Riemann speculation to a 10-year-old.”
“Why can chrome steel resist corrosion?”
“Why do coastal areas typically have milder climates?”

A hard and fast key phrase classifier would want to anticipate phrases corresponding to Riemann, corrosion, and coastal. The mannequin already understands the relationships between these phrases and their topics.

Some questions sit naturally between topics. Think about: “How does the chemistry of the environment have an effect on local weather?” Chemistry and geography are each affordable selections, so the immediate asks the mannequin to pick whichever is most related to its reply.

This method is versatile, however not absolutely predictable. The mannequin could classify the identical ambiguous query otherwise from one run to the subsequent, or return a response within the flawed format. That is perhaps superb for an academic assistant, but it surely’s a lot riskier if routing controls entry to delicate instruments or knowledge. In that case, you’ll wish to put in place different controls, for instance, structured outputs.

3/ Validating the Runtime request

The entry level checks that the request accommodates a usable query.
That is odd request validation relatively than content material moderation. It prevents malformed and excessively massive requests from reaching the mannequin. The perform embellished with @app.entrypoint is the bridge between AgentCore Runtime and the Strands agent. Runtime provides the payload, the perform validates it, and the Strands agent produces the response.

Testing the agent

After stopping and restarting the agentcore dev command. We will ask the agent questions.

As proven, every reply ought to start with the corresponding SME prefix, and the up to date responses reveal that the brand new agent is in impact. Additionally, an unsupported query, corresponding to “Who wrote Pleasure and Prejudice?” ought to obtain the reply:

"I am sorry, I do not know the reply to that."

An overlapping query is beneficial for seeing how the mannequin resolves ambiguity:

The vital outcome isn’t whether or not it at all times chooses chemistry or geography. The reply ought to use one matching prefix and provides a related clarification.

Deploying the agent to the AWS Cloud

As soon as the agent behaves accurately domestically, deploy it to AWS utilizing:

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore deploy

AgentCore Deploy

Mission: SMETriage
Goal: us-east-2:0123456789

[done] Validate undertaking
[done] Verify dependencies
[done] Construct CDK undertaking
[done] Synthesize CloudFormation
[done] Verify stack standing
[done] Bootstrap AWS surroundings
[done] Computing diff adjustments...
[done] Publish belongings

╭────────────────────────────────────────────────╮
│ ✓ Deploy to AWS Full │
│ │
│ [████████████████████] 5/5 │
╰────────────────────────────────────────────────╯

Deployed 1 stack(s): AgentCore-SMETriage-default

Word: Transaction search enabled. It takes ~10 minutes for transaction search to be absolutely lively and for traces from
invocations to be listed.

Log: agentcore.clilogsdeploydeploy-20260611-104449.log

We will examine the deployed Runtime:

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore standing --type agent

Brokers
SMETriage: Deployed - Runtime: READY
(arn:aws:bedrock-agentcore:us-east-2:0123456789:runtime/SMETriage_SMETriage-r5adp27A24)
URL: https://bedrock-agentcore.us-east-2.amazonaws.com/runtimes/arnpercent3Aawspercent3Abedrock-agentcorepercent3Aus-east-2percent3A6963531187
45percent3Aruntimepercent2FSMETriage_SMETriage-r5adp27A24/invocations

Then invoke it with:

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore invoke "Clarify how likelihood is utilized in on a regular basis life." --stream

{
"response": "That is your Math SME. The reply to your query is as follows.nnProbability is the mathematical research of chance and probability, and it seems in lots of features of on a regular basis life. Listed below are some key examples:nn## Climate Forecastingn- Meteorologists use likelihood to foretell the **probability of rain or snow** (e.g., "70% probability of rain")n- These predictions are primarily based on historic knowledge and atmospheric modelsnn## Insurance coverage and Financen- Insurance coverage corporations calculate **danger possibilities** to set premium pricesn- Banks assess the **likelihood of mortgage default** when lending moneyn- Traders consider the **chance of returns** on investmentsnn## Medication and Healthn- Docs use likelihood to evaluate **diagnostic check accuracy** and illness riskn- Medical trials depend on likelihood to find out if a **therapy is efficient**n- Genetic counselors calculate the **likelihood of inheriting circumstances**nn## Video games and Gamblingn- Card video games, cube, and lotteries are all ruled by **mathematical likelihood**n- Understanding odds helps folks make **knowledgeable choices** about risknn## On a regular basis Choice-Makingn- Deciding whether or not to **carry an umbrella** primarily based on forecast probabilityn- Estimating the **chance of site visitors** when planning a commuten- Assessing **security dangers** in each day activitiesnn## High quality Controln- Producers use likelihood to **predict defect charges** in productionnnIn essence, likelihood helps us **quantify uncertainty** and make better-informed choices in an unpredictable world.n"
}

Session: 735d6d6b-b2de-40d3-ae3c-741821c4810f
To renew: agentcore invoke --session-id 735d6d6b-b2de-40d3-ae3c-741821c4810f
Log: C:Usersthomaprojectsstrands-agentcore-demosmetriageagentcore.clilogsinvokeinvoke-SMETriage-20260611-105138.log

The deployed Runtime wants permission to invoke the chosen Bedrock mannequin or inference profile. Anthropic fashions used to require a one-time use-case submission earlier than first use, however that’s not the case. The primary time you invoke the mannequin, you’re routinely entitled to make use of it, and also you’ll obtain a few emails to that impact.

If invocation fails, examine IAM permissions, mannequin availability, inference profile, supply area, and any organisation service management insurance policies.

NB. Bear in mind, at this level, AgentCore Runtime gives the managed execution surroundings, however the utility behaviour stays within the Strands code. Altering the system immediate or mannequin requires updating and redeploying the applying.

Including AgentCore Performance — Reminiscence

Initially, I discussed that AgentCore consists of many various elements, e.g., Gateway, Runtime, Observability, and so forth…

We’ve already seen the right way to use the Runtime. Now we’ll add Reminiscence to our instance. You will have observed that, beforehand, once we invoked AgentCore to ask our questions, a session ID was returned. These group associated turns in a single dialog. So in case you have a follow-up query, you’ll be able to go within the session ID of the unique query to get again associated info. For instance,

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore invoke "Clarify Newton's first regulation." --stream
{
"response": "physics: That is your Physics SME. The reply to your query is as follows.nn**Newton's First Legislation of Movement** (also referred to as the **Legislation of Inertia**) states:nn> *An object at relaxation stays at relaxation, and an object in movement stays in movement at a relentless velocity (identical velocity and course), except acted upon by a web exterior pressure.*nn---nn### Key Ideas:nn1. **Inertia** – That is the tendency of an object to withstand adjustments to its state of movement. The extra mass an object has, the higher its inertia.nn2. **At Relaxation** – If an object is stationary, it would stay stationary except a pressure acts on it. For instance, a guide sitting on a desk won't transfer by itself.nn3. **In Movement** – If an object is transferring, it would proceed transferring in a straight line on the identical velocity except a pressure (corresponding to friction, gravity, or utilized pressure) acts on it.nn4. **Internet Exterior Power** – It's the *mixed/resultant* pressure that issues. If all forces on an object cancel out (web pressure = 0), the item behaves as if no pressure is performing on it.nn---nn### On a regular basis Instance:nWhen a automobile stops abruptly, passengers lurch **ahead** — their our bodies have been in movement and have a tendency to **keep in movement**, demonstrating inertia.nnThis regulation basically defines what a **pressure** is: one thing that *adjustments* the state of movement of an object.n"
}

Session: 7dbe017a-e306-4891-88d1-b707a5d6f894
To renew: agentcore invoke --session-id 7dbe017a-e306-4891-88d1-b707a5d6f894
Log: C:Usersthomaprojectsstrands-agentcore-demosmetriageagentcore.clilogsinvokeinvoke-SMETriage-20260611-133213.log

We will use the Session ID: 7dbe017a-e306–4891–88d1-b707a5d6f894 to ask a follow-up.

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore invoke --session-id 7dbe017a-e306-4891-88d1-b707a5d6f894 "Give me an instance" --stream
 Scenario 

Session: 7dbe017a-e306-4891-88d1-b707a5d6f894
To renew: agentcore invoke --session-id 7dbe017a-e306-4891-88d1-b707a5d6f894
Log: C:Usersthomaprojectsstrands-agentcore-demosmetriageagentcore.clilogsinvokeinvoke-SMETriage-20260611-133413.log

Nonetheless, session IDs are NOT long-term reminiscence mechanisms. For that, we will use AgentCore’s reminiscence functionality.

Including AgentCore Reminiscence

AgentCore Reminiscence shops dialog occasions and may extract helpful long-term information from them. It has built-in methods that serve totally different functions:

+-----------------+---------------------------------------------------------------+
| Technique        | What it extracts                                              |
+-----------------+---------------------------------------------------------------+
| USER_PREFERENCE | A person's selections, most popular fashion, and recurring preferences. |
+-----------------+---------------------------------------------------------------+
| SEMANTIC        | Sturdy info from conversations.                             |
+-----------------+---------------------------------------------------------------+
| SUMMARIZATION   | Summaries of conversations.                                   |
+-----------------+---------------------------------------------------------------+
| EPISODIC        | Sequences of interactions that may inform later behaviour.    |
+-----------------+---------------------------------------------------------------+

For our SME agent, USER_PREFERENCE is the only match. It permits the agent to recollect statements corresponding to:

“Use UK English for spelling”
“All the time reply in Pirate converse.”
“Maintain solutions temporary.”

Actor IDs and session IDs

Reminiscence wants two identifiers:

+------------+-------------------------------------------------------------+---------------------------------------------+
| Identifier | That means                                                     | When it adjustments                             |
+------------+-------------------------------------------------------------+---------------------------------------------+
| actor_id   | The learner whose sturdy recollections are being saved and     | Maintain it secure for a similar learner.        |
|            | retrieved.                                                  |                                             |
+------------+-------------------------------------------------------------+---------------------------------------------+
| session_id | One dialog containing associated turns.                  | Use a brand new worth for a brand new dialog.     |
+------------+-------------------------------------------------------------+---------------------------------------------+

The AgentCore Runtime provides the session_id variable. The appliance provides the ID of the person by way of the actor_id. In our instance, it’s handed in a documented customized X-Learner-Id header which identifies the person (actor_id) making the request.

You must use an opaque application-generated identifier corresponding to learner-7f83a2 or a UUID for the actor_id. Don’t place an e mail tackle, title, or different private info within the ID.

The Runtime’s requestHeaderAllowlist makes X-Learner-Id accessible by way of context.request_headers. In a manufacturing utility, derive this worth from an authenticated identification and don’t belief a learner ID provided instantly by an untrusted shopper.

Create the Reminiscence useful resource

From the undertaking root, add a Reminiscence useful resource:

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore add reminiscence `
>> --name LearnerPreferences `
>> --strategies USER_PREFERENCE `
>> --expiry 30

This could add a brand new part to agentcore/agentcore.json:

{
  "recollections": [
    {
      "name": "LearnerPreferences",
      "eventExpiryDuration": 30,
      "strategies": [
        {
          "type": "USER_PREFERENCE",
          "namespaceTemplates": [
            "/users/{actorId}/preferences/"
          ]
        }
      ]
    }
  ]
}

The namespace retains every learner’s preferences separate. Its trailing slash prevents prefix collisions between comparable actor IDs. The 30-day expiry controls how lengthy uncooked reminiscence occasions are retained.

When deployed, AgentCore:

Creates the Reminiscence useful resource.
Provides the Runtime function permission to learn and write it.
Injects its ID into the Runtime as MEMORY_LEARNERPREFERENCES_ID.

Strands integrates with AgentCore Reminiscence by way of the AgentCoreMemorySessionManager.

For every invocation, the applying:

1. Will get the learner ID from the allowed X-Learner-Id header or a userId subject provided by an utility.

2. Will get the present dialog ID from the Runtime context.

3. Builds an AgentCore Reminiscence configuration for that learner and dialog.

4. Creates a Strands agent with the reminiscence session supervisor.

5. Lets the session supervisor retailer messages and retrieve related preferences.

The code creates a Strands agent per invocation as a result of its reminiscence configuration belongs to a single actor and session. The Bedrock mannequin and system immediate can nonetheless be reused.

Right here is the revised Strands Agent code.

import os

from bedrock_agentcore.reminiscence.integrations.strands.config import (
    AgentCoreMemoryConfig,
    RetrievalConfig,
)
from bedrock_agentcore.reminiscence.integrations.strands.session_manager import (
    AgentCoreMemorySessionManager,
)
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent
from strands.fashions import BedrockModel

app = BedrockAgentCoreApp()

MODEL_ID = os.getenv(
    "BEDROCK_MODEL_ID",
    "international.anthropic.claude-sonnet-4-6",
)
REGION = os.getenv("AWS_REGION", "us-west-2")
MEMORY_ID = os.getenv("MEMORY_LEARNERPREFERENCES_ID")

SUBJECT_PREFIXES = {
    "arithmetic": (
        "That is your Math SME. "
        "The reply to your query is as follows."
    ),
    "physics": (
        "That is your Physics SME. "
        "The reply to your query is as follows."
    ),
    "chemistry": (
        "That is your Chemistry SME. "
        "The reply to your query is as follows."
    ),
    "geography": (
        "That is your Geography SME. "
        "The reply to your query is as follows."
    ),
}

def load_model() -> BedrockModel:
    return BedrockModel(
        model_id=MODEL_ID,
        region_name=REGION,
        temperature=0.2,
        max_tokens=1_500,
    )

def build_system_prompt() -> str:
    prefixes = "n".be a part of(
        f"- {topic}: {prefix}"
        for topic, prefix in SUBJECT_PREFIXES.objects()
    )

    return (
        "You're an academic SME triage assistant. "
        "First resolve whether or not the person's query is primarily about "
        "arithmetic, physics, chemistry, geography, or an unsupported "
        "topic. "
        "Reply solely questions on arithmetic, physics, chemistry, "
        "or geography. "
        "For supported questions, start with the precise prefix for the "
        "chosen topic, then give a transparent and correct clarification. "
        "When a query overlaps topics, select the topic most "
        "vital to answering it. "
        "Use any retrieved learner preferences when deciding the extent, "
        "fashion, and examples within the reply. "
        "For unsupported questions, reply precisely: "
        ""I am sorry, I do not know the reply to that."nn"
        "Required topic prefixes:n"
        f"{prefixes}"
    )

MODEL = load_model()
SYSTEM_PROMPT = build_system_prompt()

def build_session_manager(
    actor_id: str,
    session_id: str,
) -> AgentCoreMemorySessionManager | None:
    if not MEMORY_ID:
        return None

    namespace = f"/customers/{actor_id}/preferences/"
    memory_config = AgentCoreMemoryConfig(
        memory_id=MEMORY_ID,
        actor_id=actor_id,
        session_id=session_id,
        retrieval_config={
            namespace: RetrievalConfig(
                top_k=3,
                relevance_score=0.5,
            )
        },
    )

    return AgentCoreMemorySessionManager(
        agentcore_memory_config=memory_config,
        region_name=REGION,
    )

def get_actor_id(payload, context) -> str | None:
    actor_id = payload.get("userId")
    if isinstance(actor_id, str) and actor_id.strip():
        return actor_id.strip()

    request_headers = getattr(context, "request_headers", None) or {}
    for title, worth in request_headers.objects():
        if title.casefold() == "x-learner-id":
            if isinstance(worth, str) and worth.strip():
                return worth.strip()

    return None

@app.entrypoint
def invoke(payload, context):
    immediate = payload.get("immediate")
    actor_id = get_actor_id(payload, context)

    if not isinstance(immediate, str) or not immediate.strip():
        return {"error": "A non-empty query is required."}

    if len(immediate) > 2_000:
        return {"error": "Query exceeds the utmost size."}

    if not actor_id:
        return {"error": "A non-empty userId is required."}

    if len(actor_id) > 128:
        return {"error": "userId exceeds the utmost size."}

    session_id = getattr(context, "session_id", None) or "local-session"
    session_manager = build_session_manager(actor_id, session_id)
    agent = Agent(
        mannequin=MODEL,
        session_manager=session_manager,
        system_prompt=SYSTEM_PROMPT,
    )

    response = agent(immediate.strip())
    return {
        "response": str(response),
        "memory_enabled": session_manager will not be None,
    }

if __name__ == "__main__":
    app.run()

Save this and re-deploy as earlier than. As soon as that’s performed, we will invoke it like this.

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore invoke `
"I favor my solutions in Pirate converse" `
-H "X-Learner-Id: learner-7f83a2" `
--stream

{
"response": "Arrr, I've famous yer choice for Pirate converse, matey! Go forward and ask me yer query about arithmetic, physics, chemistry, or geography, and I am going to reply ye within the best Pirate tongue on the seven seas! 🏴‍☠️⚓n",
"memory_enabled": true
}

Session: 2a94f2d9-9e40-417e-a53a-ad9393f0490d
To renew: agentcore invoke --session-id 2a94f2d9-9e40-417e-a53a-ad9393f0490d
Log: C:Usersthomaprojectsstrands-agentcore-demosmetriageagentcore.clilogsinvokeinvoke-SMETriage-20260611-135649.log

We see that the memory_enabled flag is ready to true, and now we simply go in our Learner-Id once we ask a query, like this…

PS C:Usersthomaprojectsstrands-agentcore-demosmetriage> agentcore invoke `
"Clarify Newton's second regulation." `
-H "X-Learner-Id: learner-7f83a2" `
--stream
{
"response": "physics: That is your Physics SME. The reply to your query is as follows.nnAhoy, matey! Collect 'spherical and lend me yer ear! ⚓nn**Newton's Second Legislation of Movement**, arrr, it's one of many best treasures of all science!nn---nn🏴‍☠️ **The Legislation, in plain Pirate phrases:**nn> **Power = Mass × Acceleration**n> *(or as we scrawl it on the ship's hull: **F = ma**)*nn---nn**What does it imply, ye ask?**nn- ⚓ **Power (F)** — This be the push or pull ye give somethin', measured in Newtons, arrr!n- 💀 **Mass (m)** — This be how a lot *stuff* makes up yer cannonball or treasure chest, measured in kilograms!n- 🌊 **Acceleration (a)** — This be how briskly yer velocity be *changin'*, measured in meters per second squared!nn---nn**A superb instance from the excessive seas:**nnIf ye hearth a **heavy cannonball** and a **mild cannonball** with the identical pressure, the lighter one flies off with MORE acceleration! The heavier one be sluggish, like a barnacle-covered hull!nnSo keep in mind, the **larger the mass**, the **tougher ye should push** to get the identical acceleration! nnNow hoist the sails and go apply yer physics, ye scallywag! 🦜n",
"memory_enabled": true
}

Session: d665e549-6bf2-470d-96ce-506b5be141e2
To renew: agentcore invoke --session-id d665e549-6bf2-470d-96ce-506b5be141e2
Log: C:Usersthomaprojectsstrands-agentcore-demosmetriageagentcore.clilogsinvokeinvoke-SMETriage-20260611-155936.log

Operating prices

The Strands Brokers SDK itself is open supply and free to make use of. You may run it:

Regionally in your laptop computer
On EC2
In a Docker container
On Lambda
Anyplace else you select

You solely pay for the LLM you’re calling (e.g. Amazon Nova, Anthropic Claude, OpenAI GPT, and so forth.) and any infrastructure you’re working it on (EC2, Lambda, and so forth.). So in the event you construct a Strands agent that calls Amazon Bedrock, your invoice is actually:

Bedrock mannequin inference prices +  EC2/Lambda/and so forth. prices (if relevant)

AgentCore is a managed platform on AWS that gives enterprise capabilities and is billable relying on which elements of AgentCore are getting used. In our examples, we used Runtime and reminiscence, each of which might incur prices.

For extra info, please see the hyperlinks under.

https://aws.amazon.com/bedrock/agentcore/pricing

https://aws.amazon.com/bedrock/pricing/

Abstract

This text mentioned the right way to create and run an agentic workflow on AWS. The 2 primary parts that can help you do that are referred to as Strands and AgentCore.

Strands is used to outline what your agent is able to and which mannequin it makes use of to do its work. Strands presents versatile mannequin assist. You should utilize any LLM in Amazon Bedrock that helps instrument use and streaming, a mannequin from Anthropic’s Claude mannequin household by way of the Anthropic API, a mannequin by way of Ollama, and plenty of different mannequin suppliers, corresponding to OpenAI by way of LiteLLM. You may moreover outline your personal customized mannequin supplier if wished.

AgentCore lets you check the agent domestically earlier than full deployment to AWS’s cloud infrastructure. AgentCore is agent-agnostic: you’ll be able to create Brokers utilizing Strands as we confirmed, in addition to different agentic authoring programs like CrewAI and LangGraph.

AgentCore has way more performance than this, together with the power so as to add Reminiscence, Observability, Analysis and extra to your agentic workflow.

The instance we coded was a subject-matter professional triage agent that helps answering questions in arithmetic, physics, chemistry, and geography. It used Bedrock, with the Anthropic Sonnet LLM, to grasp and interpret every query and choose the suitable professional response. Questions outdoors the supported topics have been politely declined.

I additionally talked in regards to the distinction between Runtime classes and long-term reminiscence. A Runtime session preserves context throughout a single dialog, permitting the agent to reply follow-up questions. That context is misplaced when a brand new session begins.

AgentCore Reminiscence addresses this by permitting helpful info to hold over throughout separate conversations. By associating preferences with a person by way of the actor_id and X-Learner-Id request header subject, the agent can keep in mind directions corresponding to retaining solutions brief or, within the instance I confirmed, responding to questions in pirate converse.

These preferences are retrieved when asking a query with the suitable learner ID, permitting the agent to reply utilizing a most popular fashion with out having to repeat it within the request.

if you wish to study extra, the official documentation for AWS Strands is obtainable right here. For extra info on AgentCore, click on this hyperlink.