As of early 2026, AWS has a number of related but distinct components that make up its agentic and LLM offerings.
- Bedrock is the model layer that provides access to large language models.
- Agents for Bedrock is the managed application layer. In other words, AWS runs the agents for you based on your requirements.
- Bedrock AgentCore is an infrastructure layer that lets AWS run agents you develop using third-party frameworks such as CrewAI and LangGraph.
Apart from these three services, AWS also has Strands, an open-source Python library for building agents outside of the Bedrock service, which can then be deployed on other AWS services such as ECS and Lambda.
It can be confusing because all three agent-related services have the term “Bedrock” in their names, but in this article, I’ll focus on the standard Bedrock service and show how and why you would use it.
As a service, Bedrock has only been available on AWS since early 2023. That should give you a clue as to why it was launched. Amazon could clearly see the rise of large language models and their impact on IT architecture and the systems development process. That’s AWS’s meat and potatoes, and they were keen that nobody was going to eat their lunch.
And although AWS has developed a number of LLMs of its own, it realised that to stay competitive, it would have to make the very top models, such as those from Anthropic, available to customers. And that’s where Bedrock steps in. As they said in their own blurb on their website,
… Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications, simplifying development while maintaining privacy and security.
How do I access Bedrock?
OK, so that’s the thinking behind the why of Bedrock, but how do we get access to it and actually use it? Not surprisingly, the first thing you need is an AWS account. I’m going to assume you already have this, but if not, click the following link to set one up.
https://aws.amazon.com/account
Usefully, after you register for a new AWS account, a number of the services you use will fall under the so-called “free tier” at AWS, which means your costs should be minimal for one year following your account creation – assuming you don’t go crazy and start firing up huge compute servers and the like.
There are three main ways to use AWS services.
- Via the console. If you’re a beginner, this will probably be your preferred route, as it’s the easiest way to get started.
- Via an API. If you’re handy at coding, you can access all of AWS’s services via an API. For example, for Python programmers, AWS provides the boto3 library. There are similar libraries for other languages, such as JavaScript, etc. (see the short boto3 sketch after this list).
- Via the command line interface (CLI). The CLI is an additional tool you can download from AWS that lets you interact with AWS services directly from your terminal.
Note that, to use the latter two methods, you should have login credentials set up on your local system.
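To give a flavour of the API route, here is a minimal boto3 sketch. It assumes your credentials are already configured (covered below) and uses us-east-1 purely as an example region. The “bedrock” client handles control-plane calls such as listing models, while the “bedrock-runtime” client used later in this article is the one that actually invokes them.
import boto3
# Control-plane client: used for listing and managing models
bedrock = boto3.client("bedrock", region_name="us-east-1")
# Print the IDs of the first few foundation models visible in this region
for model in bedrock.list_foundation_models()["modelSummaries"][:5]:
    print(model["modelId"])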
What can I do with Bedrock?
The short answer is that you can do most of the things you can do with the regular chat models from OpenAI, Anthropic, Google, and so on. Underlying Bedrock is a large set of foundation models that you can use with it, such as:
- Kimi K2 Thinking. A deep reasoning model
- Claude Opus 4.5. To many people, this is the top LLM available to date.
- GPT-OSS. OpenAI’s open-source LLM
And many, many others besides. For a full list, check out the following link.
https://aws.amazon.com/bedrock/model-choice
How do I use Bedrock?
To use Bedrock, we’ll use a combination of the AWS CLI and the Python API provided by the boto3 library. Make sure you have the following set up as prerequisites.
- An AWS account.
- The AWS CLI downloaded and installed on your system.
- An Identity and Access Management (IAM) user set up with appropriate permissions and access keys. You can do this via the AWS console.
- Your user credentials configured via the AWS CLI like this. Generally, three pieces of information need to be supplied, all of which you can get from the previous step. You will be prompted to enter the relevant information.
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]:
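Once that’s done, a quick way to confirm the CLI is picking up your credentials is to ask AWS who you are; this command simply returns your account ID and user ARN.
$ aws sts get-caller-identity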
Giving Bedrock access to a model
Back in the day (a few months ago!), you had to use the AWS Management Console to request access to particular models from Bedrock, but now access is automatically granted when you invoke a model for the first time.
Note that for Anthropic models, first-time users may have to submit use case details before they can access the model. Also note that access to top models from Anthropic and other providers will incur costs, so please make sure you monitor your billing regularly and remove any model access you no longer need.
However, we still need to know the model name we want to use. To get a list of all Bedrock-compatible models, we can use the following AWS CLI command.
aws bedrock list-foundation-models
This will return a JSON result set listing various properties of each model, like this.
{
    "modelSummaries": [
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/nvidia.nemotron-nano-12b-v2",
            "modelId": "nvidia.nemotron-nano-12b-v2",
            "modelName": "NVIDIA Nemotron Nano 12B v2 VL BF16",
            "providerName": "NVIDIA",
            "inputModalities": [
                "TEXT",
                "IMAGE"
            ],
            "outputModalities": [
                "TEXT"
            ],
            "responseStreamingSupported": true,
            "customizationsSupported": [],
            "inferenceTypesSupported": [
                "ON_DEMAND"
            ],
            "modelLifecycle": {
                "status": "ACTIVE"
            }
        },
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
...
...
...
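The full output can be lengthy, so it is worth knowing that the CLI supports server-side filters. As a sketch (the provider name here is just an example), the following narrows the list to Anthropic models:
aws bedrock list-foundation-models --by-provider anthropic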
Choose the model you want and note its modelId from the JSON output, as we’ll need this in our Python code later. An important caveat is that you’ll often see the following in a model description,
...
...
"inferenceTypesSupported": [
"INFERENCE_PROFILE"
]
...
...
This is reserved for models that:
- Are large or in high demand
- Require reserved or managed capacity
- Need explicit cost and throughput controls
For these models, we can’t just reference the modelId in our code. Instead, we need to reference an inference profile. An inference profile is a Bedrock resource that is bound to one or more foundation LLMs and a region.
There are two ways to obtain an inference profile you can use. The first is to create one yourself. These are called Application Profiles. The second way is to use one of AWS’s Supported Profiles. This is the easier option, as it’s pre-built for you and you just need to obtain the relevant profile ID associated with the inference profile to use in your code.
If you want to take the route of creating your own Application Profile, check out the appropriate AWS documentation, but I’m going to use a supported profile in my example code.
For a list of Supported Profiles in AWS, check out the link below:
For my first code example, I want to use Claude’s Sonnet 3.5 V2 model, so I clicked the link above and saw the following description.

I took note of the profile ID (us.anthropic.claude-3-5-sonnet-20241022-v2:0) and one of the valid source regions (us-east-1).
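If you prefer the command line, the profile IDs can also be listed directly. The following CLI call (shown as a sketch – the exact output fields may vary by region) returns the inference profiles available in your current region; the profile ID is what you later pass as the modelId in your code.
aws bedrock list-inference-profiles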
For my other two example code snippets, I’ll use OpenAI’s open-source LLM for text output and AWS’s Titan Image Generator for images. Neither of these models requires an inference profile, so you can just use the regular modelId for them in your code.
NB: Whichever model(s) you choose, make sure your AWS region is set to the correct value for each.
Setting Up a Development Environment
As we’ll be doing some coding, it’s best to isolate our environment so we don’t interfere with any of our other projects. So let’s do that now. I’m using Windows and the UV package manager for this, but use whichever tool you’re most comfortable with. My code will run in a Jupyter notebook.
uv init bedrock_demo --python 3.13
cd bedrock_demo
uv add boto3 jupyter
# To run the notebook, type this in
uv run jupyter notebook
Using Bedrock from Python
Let’s see Bedrock in action with a few examples. The first will be simple, and we’ll gradually increase the complexity as we go.
Example 1: A simple question and answer using an inference profile
This example uses the Claude Sonnet 3.5 V2 model we talked about earlier. As mentioned, to invoke this model, we use the profile ID associated with its inference profile.
import json
import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

profile_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 200,
    "temperature": 0.2,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is the capital of France?"}
            ]
        }
    ]
})

resp = brt.invoke_model(
    modelId=profile_id,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Claude responses come back as a "content" array, not OpenAI "choices"
print(data["content"][0]["text"])
#
# Output
#
The capital of France is Paris.
Note that invoking this model (and others like it) creates an implied subscription between you and the AWS Marketplace. This isn’t a recurring regular charge. It only costs you when the model is actually used, but it’s best to monitor this to avoid unexpected bills. You should receive an email outlining the subscription agreement, with a link to manage and/or cancel any existing model subscriptions you have set up.
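As an aside, the same request can be made through Bedrock’s provider-agnostic Converse API, which saves you from hand-crafting the Anthropic-specific request body. Here is a minimal sketch, reusing the brt client and profile_id from the example above.
resp = brt.converse(
    modelId=profile_id,
    messages=[{"role": "user", "content": [{"text": "What is the capital of France?"}]}],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)
# Converse returns a provider-neutral response structure
print(resp["output"]["message"]["content"][0]["text"])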
Example 2: Create an image
A simple image creation using AWS’s own Titan model. This model isn’t associated with an inference profile, so we can just reference it using its modelId.
import json
import base64
import boto3

brt_img = boto3.client("bedrock-runtime", region_name="us-east-1")

model_id_img = "amazon.titan-image-generator-v2:0"

prompt = "A hippo riding a motorbike."

body = json.dumps({
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": prompt
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,
        "height": 1024,
        "width": 1024,
        "cfgScale": 7.0,
        "seed": 0
    }
})

resp = brt_img.invoke_model(
    modelId=model_id_img,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Titan returns base64-encoded images in the "images" array
img_b64 = data["images"][0]
img_bytes = base64.b64decode(img_b64)

out_path = "titan_output.png"
with open(out_path, "wb") as f:
    f.write(img_bytes)

print("Saved:", out_path)
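Because the code runs in a Jupyter notebook, you can also display the saved image inline rather than opening the file separately. A small optional snippet:
from IPython.display import Image, display
# Render the generated PNG directly in the notebook output cell
display(Image(filename=out_path))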
On my system, the output image looked like this.

Example 3: A technical support triage assistant using OpenAI’s OSS model
This is a more complex and useful example. Here, we set up an assistant that can take problems reported to it by non-technical users and output further questions you might need the user to answer, as well as the most likely causes of the issue and what further steps to take. Like our previous example, this model isn’t associated with an inference profile.
import json
import re
import boto3
from pydantic import BaseModel, Field
from typing import List, Literal, Optional

# ----------------------------
# Bedrock setup
# ----------------------------
REGION = "us-east-2"
MODEL_ID = "openai.gpt-oss-120b-1:0"

brt = boto3.client("bedrock-runtime", region_name=REGION)

# ----------------------------
# Output schema
# ----------------------------
Severity = Literal["low", "medium", "high"]
Category = Literal["account", "billing", "device", "network", "software", "security", "other"]

class TriageResponse(BaseModel):
    category: Category
    severity: Severity
    summary: str = Field(description="One-sentence restatement of the issue.")
    likely_causes: List[str] = Field(description="Top plausible causes, concise.")
    clarifying_questions: List[str] = Field(description="Ask only what is needed to proceed.")
    safe_next_steps: List[str] = Field(description="Step-by-step actions safe for a non-technical user.")
    stop_and_escalate_if: List[str] = Field(description="Clear red flags that require an expert/helpdesk.")
    recommended_escalation_target: Optional[str] = Field(
        default=None,
        description="If severity is high, who to contact (e.g., IT admin, bank, ISP)."
    )

# ----------------------------
# Helpers
# ----------------------------
def invoke_chat(messages, max_tokens=800, temperature=0.2) -> dict:
    body = json.dumps({
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature
    })
    resp = brt.invoke_model(
        modelId=MODEL_ID,
        body=body,
        accept="application/json",
        contentType="application/json"
    )
    return json.loads(resp["body"].read())

def extract_content(data: dict) -> str:
    return data["choices"][0]["message"]["content"]

def extract_json_object(text: str) -> dict:
    """
    Extract the first JSON object from model output.
    Handles common cases like ```json blocks or extra text.
    """
    # Strip any markdown code fences, then scan for the first balanced object
    text = re.sub(r"```(?:json)?", "", text).strip()
    start = text.find("{")
    if start == -1:
        raise ValueError("No JSON object found.")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i+1])
    raise ValueError("Unbalanced JSON braces; could not parse.")

# ----------------------------
# The useful function
# ----------------------------
def triage_issue(user_problem: str) -> TriageResponse:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a careful technical support triage assistant for non-technical users. "
                "You must be conservative and safety-first. "
                "Return ONLY valid JSON matching the given schema. No extra text."
            )
        },
        {
            "role": "user",
            "content": f"""
User problem:
{user_problem}

Return JSON that matches this schema:
{TriageResponse.model_json_schema()}
""".strip()
        }
    ]

    raw = invoke_chat(messages)
    text = extract_content(raw)
    parsed = extract_json_object(text)
    return TriageResponse.model_validate(parsed)

# ----------------------------
# Example
# ----------------------------
if __name__ == "__main__":
    problem = "My laptop is connected to Wi-Fi but websites won't load, and Zoom keeps saying unstable connection."
    result = triage_issue(problem)
    print(result.model_dump_json(indent=2))
Here is the output.
"class": "community",
"severity": "medium",
"abstract": "Laptop computer reveals Wi‑Fi connection however can not load web sites and Zoom
studies an unstable connection.",
"likely_causes": [
"Router or modem malfunction",
"DNS resolution failure",
"Local Wi‑Fi interference or weak signal",
"IP address conflict on the network",
"Firewall or security software blocking traffic",
"ISP outage or throttling"
],
"clarifying_questions": [
"Are other devices on the same Wi‑Fi network able to access the internet?",
"Did the problem start after any recent changes (e.g., new software, OS update, VPN installation)?",
"Have you tried moving closer to the router or using a wired Ethernet connection?",
"Do you see any error codes or messages in the browser or Zoom besides "unstable connection"?"
],
"safe_next_steps": [
"Restart the router and modem by unplugging them for 30 seconds, then power them back on.",
"On the laptop, forget the Wi‑Fi network, then reconnect and re-enter the password.",
"Run the built‑in Windows network troubleshooter (Settings → Network & Internet → Status → Network troubleshooter).",
"Disable any VPN or proxy temporarily and test the connection again.",
"Open a command prompt and run `ipconfig /release` followed by `ipconfig /renew`.",
"Flush the DNS cache with `ipconfig /flushdns`.",
"Try accessing a simple website (e.g., http://example.com) and note if it loads.",
"If possible, connect the laptop to the router via Ethernet to see if the issue persists."
],
"stop_and_escalate_if": [
"The laptop still cannot reach any website after completing all steps.",
"Other devices on the same network also cannot access the internet.",
"You receive error messages indicating hardware failure (e.g., Wi‑Fi adapter not found).",
"The router repeatedly restarts or shows error lights.",
"Zoom continues to report a poor or unstable connection despite a working internet test."
],
"recommended_escalation_target": "IT admin"
}
Summary
This article introduced AWS Bedrock, AWS’s managed gateway to foundation large language models, explaining why it exists, how it fits into the broader AWS AI stack, and how you can use it in practice. We covered model discovery, region and credential setup, and the key distinction between on-demand models and those that require inference profiles – a common source of confusion for developers.
Through practical Python examples, we demonstrated text and image generation using both standard on-demand models and those that require an inference profile.
At its core, Bedrock reflects AWS’s long-standing philosophy: abstract away infrastructure complexity without removing control. Rather than pushing a single “best” model, Bedrock treats foundation models as managed infrastructure components – swappable, governable, and region-aware. This suggests a future where Bedrock evolves less as a chat interface and more as a model orchestration layer, tightly integrated with IAM, networking, cost controls, and agent frameworks.
Over time, we might expect Bedrock to move further toward standardised inference contracts (subscriptions) and a clearer separation between experimentation and production capacity. And with the Agents and AgentCore services, we’re already seeing deeper integration of agentic workflows with Bedrock, positioning models not as products in themselves but as durable building blocks within AWS systems.
For the avoidance of doubt, apart from being an occasional user of their services, I have no connection or affiliation with Amazon Web Services.