Skip to content

fix: use IAM roles for bedrock access#578

Open
nrfulton wants to merge 24 commits into
generative-computing:mainfrom
nrfulton:fix/bedrock_aws_access_key_id
Open

fix: use IAM roles for bedrock access#578
nrfulton wants to merge 24 commits into
generative-computing:mainfrom
nrfulton:fix/bedrock_aws_access_key_id

Conversation

@nrfulton

@nrfulton nrfulton commented Mar 5, 2026

Copy link
Copy Markdown
Member

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code as added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

@github-actions

github-actions Bot commented Mar 5, 2026

Copy link
Copy Markdown
Contributor

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify

mergify Bot commented Mar 5, 2026

Copy link
Copy Markdown

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@nrfulton

nrfulton commented Mar 5, 2026

Copy link
Copy Markdown
Member Author

@jakelorocco you are tagged as a reviewer because I changed core.py to only call post-processing if there are no exceptions. The status quo continues to give us pain.

@nrfulton nrfulton changed the title Fix/bedrock aws access key fix: use IAM roles for bedrock access Mar 5, 2026
@jakelorocco

Copy link
Copy Markdown
Contributor

I think that sounds good. I will review once the changes are done. We will need to make sure the finally block / telemetry stuff doesn't cause issues with this change.

@jakelorocco

Copy link
Copy Markdown
Contributor

Looks like there's another draft PR open to more comprehensively fix the post process issue: #580 (comment). I will take a closer look tomorrow.

@leothomas

leothomas commented Mar 12, 2026

Copy link
Copy Markdown

Hey folks - thank you so much for adding support for the BedRock non-mantle endpoints through the LiteLLM backend! I'm currently working on integrating Mellea with a fork of IBM's FactReasoner. I made a couple additional (local) edits to this PR in order to get it to work for me, which include:

  1. Validating authentication using boto3: this PR just checks for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables but users can also be authenticated through named AWS profiles (AWS_PROFILE env var) or something like an EC2 instance or Lambda function would directly have an IAM role attached. So I update it to use boto3 to validate that it can reach the Bedrock service, regardless of how it's authneticated
  2. Added explicit handling of logprobs: when executing queries against a Custom Model Import, with a model that supports logprob output (eg: Llama-X.X-instruct), I'm able to use the model_options parameter in the mfuncs.ainstruct function to request logprobs output alongside the main response. I also had to add logprobs support to LiteLLM, which I've forked and I'll be opening a PR for that
  3. Added a num_retries parameter to the bedrock.create_beckrock_litellm_backend() factory to take advantage of liteLLM's native retry-with-backoff - this is important for Custom Model Imports as they are cycled off of Bedrock's compute infrastructure after a period of un-used and therefore have a bit of a coldstart that needs to be managed.

Let me know if these changes would be valuable to contribue back to the main Mellea repo, I'm happy to either open a PR from my own fork (which would be branched off of this PR), or wait till this PR is merged to open mine, or any other way to contribute the changes that makes sense with your workflow; I'm just trying to avoid forking on top of a fork! (the edits are all in the backend/bedrock.py and backends/litellm.py files)

Once again, thank for getting the LiteLLM Bedrock backend working!

@jakelorocco

Copy link
Copy Markdown
Contributor

@leothomas, I think all these changes sound good. I will check in with @nrfulton today and see if we can get this base PR merged so that your PRs can be opened. I can't test this myself.

The logprobs is definitely something we've been looking into supporting for a little while (mainly just unifying it across backends). We'd love to see your implementation for litellm. Thank you!

@leothomas

Copy link
Copy Markdown

Hey there! I pulled @nrfulton 's branch into my own forks and then added the edits mentioned above. Here's the branch if you want to check it out: https://github.com/leothomas/mellea/tree/bedrock_aws_access_key_id

@jakelorocco I'll wait for y'all to merge this PR to open one from my branch!

@leothomas

leothomas commented Mar 13, 2026

Copy link
Copy Markdown

Also, here's the LiteLLM fork with LogProbs: https://github.com/leothomas/litellm/tree/feature/logprobs

I still have to write tests and all that good stuff before opening a PR against LiteLLM

@github-actions github-actions Bot added the bug Something isn't working label May 7, 2026
Assisted-by: Claude Code
Signed-off-by: Nathan Fulton <gitcommit@nfulton.org>
@nrfulton nrfulton marked this pull request as ready for review May 7, 2026 21:38
@nrfulton nrfulton requested a review from a team as a code owner May 7, 2026 21:38
@nrfulton nrfulton requested review from ajbozarth and planetf1 May 7, 2026 21:38
@nrfulton nrfulton mentioned this pull request May 7, 2026
7 tasks
@nrfulton nrfulton requested review from jakelorocco and removed request for planetf1 May 7, 2026 21:42
@nrfulton

nrfulton commented May 7, 2026

Copy link
Copy Markdown
Member Author

I changed up the auto-selected reviewers to match the reviewers assigned to #849 .

FYI @ajbozarth and @jakelorocco: TL;DR: same state as #849. This is the OG PR for that work. Switched here because maintainer commit rights were disabled for the #849 branch and there were a number of failing pre-commit hooks + a merge required to proceed.

I think this is good to go if tests pass. Please give a once over since there's been a fair amount of changes since this PR was first opened.

The main design decisions:

  1. logprobs will be an ad hoc field (see discussion on feat: bedrock aws access key #849)
  2. the test should already be skipped if the AWS token isn't present. We don't have that token in the testing infra, so it'll be skipped in nightlies.

FYI, all tests pass on my local machine with AWS_BEARER_TOKEN_BEDROCK set.

$ export AWS_BEARER_TOKEN_BEDROCK=XXXXXXXXX; pytest test/backends/test_bedrock_openai.py
================================================================================== test session starts ==================================================================================
platform darwin -- Python 3.12.9, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/nathan/dev/mellea
configfile: pyproject.toml
plugins: nbmake-1.5.5, recording-0.13.4, Faker-40.15.0, cov-7.1.0, xdist-3.8.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, asyncio-1.3.0, langsmith-0.8.3, anyio-4.13.0
timeout: 900.0s
timeout method: signal
timeout func_only: False
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 8 items                                                                                                                                                                       

test/backends/test_bedrock_openai.py ........                                                                                                                                     [100%]

==================================================================================== tests coverage =====================================================================================
___________________________________________________________________ coverage: platform darwin, python 3.12.9-final-0 ____________________________________________________________________

Coverage HTML written to dir htmlcov
Coverage JSON written to file coverage.json
=================================================================================== 8 passed in 5.91s ===================================================================================
$ export AWS_BEARER_TOKEN_BEDROCK=XXXXXXXXX; pytest test/backends/test_bedrock.py
================================================================================== test session starts ==================================================================================
platform darwin -- Python 3.12.9, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/nathan/dev/mellea
configfile: pyproject.toml
plugins: nbmake-1.5.5, recording-0.13.4, Faker-40.15.0, cov-7.1.0, xdist-3.8.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, asyncio-1.3.0, langsmith-0.8.3, anyio-4.13.0
timeout: 900.0s
timeout method: signal
timeout func_only: False
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 1 item                                                                                                                                                                        

test/backends/test_bedrock.py .                                                                                                                                                   [100%]

==================================================================================== tests coverage =====================================================================================
___________________________________________________________________ coverage: platform darwin, python 3.12.9-final-0 ____________________________________________________________________

Coverage HTML written to dir htmlcov
Coverage JSON written to file coverage.json
================================================================================== 1 passed in 10.37s ===================================================================================

@leothomas

Copy link
Copy Markdown

Thanks for following up on this and getting it over the finish line @nrfulton - again please let me know if I can support in any way!

Comment thread mellea/backends/model_ids.py Outdated
hf_model_name="openai/gpt-oss-20b", # OpenAI GPT-OSS 20B
ollama_name="gpt-oss:20b", # Ollama
bedrock_name="openai.gpt-oss-20b",
bedrock_litellm_name="bedrock/converse/openai.gpt-oss-120b-1:0",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to 20B.

)


def _assert_bedrock_auth() -> None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be calling this on the OpenAI path as well? Does the OpenAI SDK work with the creds path?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The split looks intentional — create_bedrock_openai_backend is bearer-token-only so there's nothing for _assert_bedrock_auth to check on that path. Happy for @nrfulton to confirm.

Comment thread mellea/backends/bedrock.py Outdated
Comment on lines +18 to +27
def _assert_region(region: str | None) -> None:
resolved_region = (
region
or os.environ.get("AWS_REGION_NAME")
or os.environ.get("AWS_DEFAULT_REGION")
or os.environ.get("AWS_REGION")
)
assert resolved_region is not None, (
"you must specify a region: pass `region` explicitly or set AWS_REGION_NAME, AWS_DEFAULT_REGION, or AWS_REGION."
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the OpenAI and LiteLLM SDKs check for these same env vars?

Use this instead of `create_bedrock_openai_backend` when you need auth with an AWS_ACCESS_KEY_ID.
"""
_assert_bedrock_auth()
_assert_region(region)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If region is passed in, this assertion could succeed but the region isn't passed into the actual LiteLLM Backend init.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

psschwei's commit addresses this — _resolve_region(region) result is now forwarded as aws_region_name in model_options. @nrfulton to confirm that's the intended fix.

Comment thread mellea/backends/bedrock.py Outdated
Comment on lines +132 to +133
# TODO litellm doesn't even appear to use this...?
backend._base_url = None # type: ignore

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this? or add a comment why setting the _base_url to None is necessary?

Comment thread mellea/backends/bedrock.py Outdated

def create_bedrock_mantle_backend(
def create_bedrock_litellm_backend(
model_id: ModelIdentifier | str, region: str | None = None, num_retries: int = 15

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 seems like a high default for retries. I did a quick search and it seems like Litellm uses exponential waits. I couldn't find a number, but that seems like this might resolve in long hangs.

Comment thread mellea/backends/ollama.py Outdated
# Extract top-level Ollama params that must not be forwarded into `options`.
logprobs = model_opts.pop("logprobs", None)
top_logprobs = model_opts.pop("top_logprobs", None)
MelleaLogger.get_logger().info(f"Tools for call: {tools.keys()}")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved back to the indented block above.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to add at least one qualitative test for the litellm code path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

Comment thread docs/docs/integrations/bedrock.md Outdated

m = MelleaSession(
backend=create_bedrock_mantle_backend(
backend=create_bedrock_openai_backend(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the litellm path given the import above.

Comment thread docs/docs/integrations/bedrock.md
Comment thread mellea/backends/bedrock.py Outdated
Comment thread mellea/backends/bedrock.py Outdated
Comment thread mellea/backends/bedrock.py

@planetf1 planetf1 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things separate from the existing thread. Inline comments above cover a misnamed function in the troubleshooting section, a boto3 import regression for base-install users, assert-as-config-validation (which gets stripped under -O), and the missing deprecation alias.

One more that cannot be pinned inline — docs/docs/integrations/bedrock.md line 45 still has a stray prose reference to create_bedrock_mantle_backend that was not updated by the rename.

psschwei added 2 commits June 4, 2026 18:00
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
@psschwei

psschwei commented Jun 4, 2026

Copy link
Copy Markdown
Member

To try to help move this along, I pushed a commit that attempted to address the outstanding review items from @jakelorocco and @planetf1 . If we're not happy with it, I don't mind reverting either.

@jakelorocco jakelorocco left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paul's changes addressed my concerns.

@psschwei

psschwei commented Jun 5, 2026

Copy link
Copy Markdown
Member

@leothomas we'll need you to sign-off on your commits before we can merge this : https://github.com/generative-computing/mellea/pull/578/checks?check_run_id=79626306036

@leothomas

Copy link
Copy Markdown

Hey @psschwei thank you for pushing this along!

@nrfulton would you be able to grant me write access to your repo (nrfulton/mellea.git) so that I can push the commit signoffs to your PR?

@leothomas

Copy link
Copy Markdown

@nrfulton just following up on being granted collaborator access to your fork so I push the commit signatures - let me know if that's not possible for any reason, I can pull my own fork off of your's, pushed the signed commits there and then make a PR to your branch, but I think the direct access grant would be simpler!

Thank you all again for helping move this forward!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feat: Native AWS SigV4 (IAM) Support for Bedrock OpenAI-Compatible Endpoints

5 participants