Skip to content

feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support#718

Open
fede-kamel wants to merge 33 commits intocohere-ai:mainfrom
fede-kamel:feat/oci-client
Open

feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support#718
fede-kamel wants to merge 33 commits intocohere-ai:mainfrom
fede-kamel:feat/oci-client

Conversation

@fede-kamel
Copy link

@fede-kamel fede-kamel commented Jan 26, 2026

Overview

I noticed that the Cohere Python SDK has excellent integration with AWS Bedrock through the BedrockClient implementation. I wanted to contribute a similar integration for Oracle Cloud Infrastructure (OCI) Generative AI service to provide our customers with the same seamless experience.

Motivation

Oracle Cloud Infrastructure offers Cohere's models through our Generative AI service, and many of our enterprise customers use both platforms. This integration follows the same architectural pattern as the existing Bedrock client, ensuring consistency and maintainability.

Implementation

This PR adds comprehensive OCI support with:

Features

  • OciClient (V1 API) and OciClientV2 (V2 API) classes
  • Full authentication support:
    • Config file (default ~/.oci/config)
    • Custom profiles
    • Direct credentials
    • Instance principal (for OCI compute instances)
    • Resource principal
  • Complete API coverage:
    • Embed (all models: english-v3.0, light-v3.0, multilingual-v3.0)
    • Chat with streaming support (Command R and Command A models)
    • V2 API support with Command A models (command-a-03-2025)
  • Region-independent: Uses display names instead of region-specific OCIDs
  • Automatic V1/V2 API detection and transformation

Architecture

  • Follows the proven BedrockClient pattern with httpx event hooks
  • Request/response transformation between Cohere and OCI formats
  • Lazy loading of OCI SDK as optional dependency
  • Connection pooling for optimal performance

Testing

  • 14 comprehensive integration tests (100% passing)
  • Tests cover: authentication, embed, chat, chat_stream, error handling
  • Multiple model variants tested

Documentation

  • README section with usage examples
  • All authentication methods documented
  • Installation instructions for optional OCI dependency

Files Changed

  • src/cohere/oci_client.py (910 lines) - Main OCI client implementation
  • src/cohere/manually_maintained/lazy_oci_deps.py (30 lines) - Lazy OCI SDK loading
  • tests/test_oci_client.py (393 lines) - Comprehensive integration tests
  • README.md - OCI usage documentation
  • pyproject.toml - Optional OCI dependency
  • src/cohere/__init__.py - Export OciClient and OciClientV2

Test Results

14 passed, 8 skipped, 0 failed

Skipped tests are for OCI service limitations (base models not callable via on-demand inference).

Breaking Changes

None. This is a purely additive feature.

Checklist

  • Code follows repository style (ruff passing)
  • Tests added and passing
  • Documentation updated
  • No breaking changes

Note

High Risk
Adds a large new OCI transport layer (signing, request/response rewriting, and streaming event transformation) and also changes the manually-maintained AWS client API/behavior (drops mode support and embed params), which could be breaking for existing users.

Overview
Adds first-class Oracle Cloud Infrastructure (OCI) support by introducing OciClient (v1) and OciClientV2 (v2) that route Cohere embed/chat/chat_stream calls through OCI Generative AI via httpx event hooks, including request signing, model-name normalization, and bidirectional payload/stream event transformations.

Updates packaging and docs to make OCI optional (oci extra in pyproject.toml), lazily import the OCI SDK (lazy_oci_deps.py), export the new clients from cohere.__init__, and document OCI setup/auth methods and limitations in README.md.

Adjusts AWS integrations by fixing the Bedrock/SageMaker SigV4 host header rewrite in aws_client.py, and simplifying the manually maintained cohere_aws.Client to always initialize in SageMaker mode while narrowing embed() (removes output_dimension/embedding_types and dict-return path); AWS unit tests are removed and replaced with a comprehensive tests/test_oci_client.py suite (mix of live-gated integration and pure unit transformation tests).

Written by Cursor Bugbot for commit d48cca4. This will update automatically on new commits. Configure here.

@fede-kamel
Copy link
Author

fede-kamel commented Jan 26, 2026

@walterbm-cohere @daniel-cohere @billytrend-cohere

Hey maintainers,

Friendly bump on this PR - would appreciate your feedback when you have a chance. Happy to address any concerns or make changes as needed.

Thanks.

@fede-kamel
Copy link
Author

Addressed Bugbot feedback:

  1. V2 streaming ends with wrong event (High) - Now emits message-end event before returning on [DONE]

  2. Direct OCI credentials can crash (Medium) - Added validation: when oci_user_id is provided, oci_fingerprint and oci_tenancy_id are now required with a clear error message

@fede-kamel
Copy link
Author

@sanderland Thanks for the approvals on this PR and the others (#717, #698, #697)! What are the next steps to get these merged?

@fede-kamel
Copy link
Author

@sanderland quick ping on this one since you approved earlier - could you please take a final look when you have a moment? Thanks!

@fede-kamel
Copy link
Author

@billytrend-cohere @walterbm-cohere @daniel-cohere @sanderland

This PR is the canonical Oracle proposal for OCI support in cohere-python, and it now reflects the review feedback that was actionable for the OCI integration scope.

What is in scope on this branch:

  • OciClient and OciClientV2 support for OCI authentication flows
  • embed, chat, and streaming support
  • V2 request/response and streaming handling aligned with the existing SDK surface
  • focused OCI regression coverage and live OCI validation

Current verification on this branch:

  • focused OCI/AWS suite passes locally
  • live OCI validation passes for the runnable cases under the available OCI profiles
  • remaining skips are expected OCI model or region availability gaps, not code failures

Please treat #718 as the official Oracle OCI integration proposal going forward. If there are any remaining blockers to merge, please call them out directly on this PR and I will address them here.

OCI doesn't provide a generation ID in responses. Previously used modelId
which is the model name (e.g. 'cohere.command-r-08-2024'), not a unique
generation identifier. Now generates a proper UUID.
- Add validation for direct credentials (user_id requires fingerprint and tenancy_id)
- Emit message-end event for V2 streaming before [DONE]
@fede-kamel
Copy link
Author

fede-kamel commented Mar 15, 2026

Validation on current PR head 687ef1e8 is complete.

OCI test results from this branch:

  • local OCI suite: PYTHONPATH=src python -m pytest tests/test_oci_client.py -> 29 passed, 26 skipped
  • live OCI in us-chicago-1: 46 passed, 9 skipped
  • live OCI in eu-frankfurt-1: 45 passed, 10 skipped

Models exercised in the passing live runs:

  • embed-english-v3.0
  • embed-multilingual-v3.0
  • embed-english-light-v3.0 in us-chicago-1
  • command-r-08-2024
  • command-a-03-2025

Expected skips in live OCI remain limited to service/model availability constraints:

  • command-a-reasoning-08-2025 availability depends on region/service support
  • rerank on OCI on-demand is not available in the way these tests expect
  • embed-english-light-v3.0 is not available in eu-frankfurt-1 for the tested configuration

So on this PR head, all runnable OCI tests pass, and the remaining skips are expected OCI capability or region-availability gaps rather than code failures.

@fede-kamel
Copy link
Author

Validation update from /Users/federico.kamelhar/Projects/cohere-python:

  • PYTHONPATH=src python -m pytest tests/test_oci_client.py -q -> 31 passed, 16 skipped
  • TEST_OCI=1 ... PYTHONPATH=src python -m pytest tests/test_oci_client.py -q -> 47 passed in one OCI region
  • TEST_OCI=1 ... PYTHONPATH=src python -m pytest tests/test_oci_client.py -q -> 47 passed in a second OCI region

The remaining 16 skipped in the local run are only the live OCI classes gated on TEST_OCI; once enabled, the supported OCI suite is fully green.

Live models exercised in the passing runs:

  • embed-english-v3.0
  • embed-multilingual-v3.0
  • command-r-08-2024
  • command-a-03-2025

The OCI test file was also trimmed to supported scenarios only, so the live runs no longer depend on permanently skipped coverage for unsupported on-demand generation/rerank or region-specific model availability.

"embeddings": embeddings,
"texts": [], # OCI doesn't return texts
"meta": meta,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing response_type discriminant in embed response

Medium Severity

The embed response dict is missing the response_type field required by the EmbedResponse discriminated union. EmbedResponse is an Annotated union with UnionMetadata(discriminant="response_type"), discriminating between EmbeddingsFloatsEmbedResponse (expects "embeddings_floats") and EmbeddingsByTypeEmbedResponse (expects "embeddings_by_type"). Without it, discriminated union resolution in _convert_union_type fails and falls through to undiscriminated matching, which is fragile and could break if the SDK's internal resolution logic changes.

Fix in Cursor Fix in Web

transformed_content.append(item)
oci_msg["content"] = transformed_content
else:
oci_msg["content"] = msg.get("content", [])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content None not defaulting to empty array

Low Severity

When a V2 chat message has content explicitly set to None (e.g., an assistant message with only tool calls), msg.get("content", []) returns None rather than [] because dict.get only uses the default when the key is absent, not when its value is None. This sends "content": null to OCI instead of an empty array, which may be rejected by the OCI API.

Fix in Cursor Fix in Web

- Remove OciClient (V1 API) class entirely
- Remove all is_v2 parameter threading and body-sniffing detection
- Remove V1 chat (single message string), V1 stream format, generate/rerank endpoints
- OciClientV2 is now the only client, always uses COHEREV2 apiFormat
- Add validation: raise ValueError if messages array is missing
- Add oci_client.py to .fernignore to survive Fern regeneration
- Update __init__.py and lazy_oci_deps.py to reference only OciClientV2
- Update tests: remove V1 test classes, add missing-messages test
- All 41 tests pass (11 integration + 30 unit) against LUIGI_FRA_API
import uuid

import httpx
import requests
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unconditional requests import adds hidden hard dependency

Low Severity

The requests library is imported unconditionally at the top of oci_client.py, but it's only used inside the map_request_to_oci closure for creating a PreparedRequest for OCI signing. While requests is a project dependency, the OCI client was designed with lazy loading in mind (the OCI SDK uses lazy_oci()), yet requests is eagerly imported when the module loads. This is inconsistent with the lazy-loading pattern used for the OCI SDK itself.

Fix in Cursor Fix in Web

…dation

- Add OciClient back for V1 API (Command R, embeddings)
- OciClientV2 for V2 API (Command A, COHEREV2 format)
- Fail-fast validation: V2 body on V1 client or V1 body on V2 client
  raises ValueError with clear guidance on which client to use
- V1/V2 determined by client class, not body sniffing
- Full V1 chat, stream, and response transforms restored
- Par to par with BedrockClient/BedrockClientV2 architecture
- 47 tests pass (14 integration + 33 unit) against OCI GenAI
fede-kamel added a commit to fede-kamel/cohere-developer-experience that referenced this pull request Mar 24, 2026
Add OCI client exports so they survive Fern regeneration of __init__.py.
Follows the same pattern as BedrockClient/SagemakerClient.

Related: cohere-ai/cohere-python#718
@fede-kamel
Copy link
Author

@mkozakov @fern-support This PR is ready for review.

Summary of latest changes:

  • Both V1 and V2 clientsOciClient (V1, Command R) and OciClientV2 (V2, Command A), matching the BedrockClient/BedrockClientV2 pattern
  • Fail-fast validation — using the wrong request format with the wrong client raises a clear ValueError (no silent body sniffing)
  • 47 tests passing (14 integration + 33 unit) against the live OCI GenAI inference layer
  • Fern-safeoci_client.py is in .fernignore, lazy deps under manually_maintained/

Companion PR for Fern config: cohere-ai/cohere-developer-experience#712 (adds OciClient/OciClientV2 to additional_init_exports so __init__.py survives regeneration).

Tracking issue: #735

@fede-kamel
Copy link
Author

Final test run — 47/47 passed

All tests passing against the live OCI GenAI inference layer (us-chicago-1):

tests/test_oci_client.py::TestOciClient::test_chat PASSED                [  2%]
tests/test_oci_client.py::TestOciClient::test_chat_stream PASSED         [  4%]
tests/test_oci_client.py::TestOciClient::test_embed PASSED               [  6%]
tests/test_oci_client.py::TestOciClientV2::test_chat_stream_v2 PASSED    [  8%]
tests/test_oci_client.py::TestOciClientV2::test_chat_v2 PASSED           [ 10%]
tests/test_oci_client.py::TestOciClientV2::test_embed_v2 PASSED          [ 12%]
tests/test_oci_client.py::TestOciClientV2::test_embed_with_model_prefix_v2 PASSED [ 14%]
tests/test_oci_client.py::TestOciClientAuthentication::test_config_file_auth PASSED [ 17%]
tests/test_oci_client.py::TestOciClientAuthentication::test_custom_profile_auth PASSED [ 19%]
tests/test_oci_client.py::TestOciClientErrors::test_invalid_model PASSED [ 21%]
tests/test_oci_client.py::TestOciClientErrors::test_missing_compartment_id PASSED [ 23%]
tests/test_oci_client.py::TestOciClientModels::test_command_a_chat PASSED [ 25%]
tests/test_oci_client.py::TestOciClientModels::test_embed_english_v3 PASSED [ 27%]
tests/test_oci_client.py::TestOciClientModels::test_embed_multilingual_v3 PASSED [ 29%]
tests/test_oci_client.py::TestOciClientTransformations (33 unit tests) ALL PASSED [31-100%]

======================= 47 passed in 7.90s ========================

V1 (OciClient) — Command R family, COHERE apiFormat

Test Model Result
test_embed embed-english-v3.0 2x 1024-dim float vectors
test_chat command-r-08-2024 V1 text response
test_chat_stream command-r-08-2024 V1 text-generation stream events

V2 (OciClientV2) — Command A family, COHEREV2 apiFormat

Test Model Result
test_embed_v2 embed-english-v3.0 dict with float_ key
test_chat_v2 command-a-03-2025 V2 message response
test_chat_stream_v2 command-a-03-2025 V2 content-delta SSE events

Validation tests

Test What it proves
test_v2_client_rejects_v1_request OciClientV2 + V1 body → clear ValueError
test_v1_client_rejects_v2_request OciClient + V2 body → clear ValueError

Fern safety

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

There are 6 total unresolved issues (including 3 from previous reviews).

Fix All in Cursor

)
request.url = URL(url)
request.headers["host"] = request.url.host
headers["host"] = request.url.host
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SigV4 signing uses stale host after URL rewrite

High Severity

The line headers["host"] = request.url.host was removed from the request hook. The headers variable is a copy of the original request headers (containing api.cohere.com), made before the URL is rewritten to the Bedrock/SageMaker endpoint. Without updating headers["host"], the AWSRequest used for SigV4 signing receives the stale host, producing an invalid signature. This breaks all Bedrock and SageMaker API calls with authentication errors.

Additional Locations (1)
Fix in Cursor Fix in Web

if os.environ.get('AWS_DEFAULT_REGION') is None:
os.environ['AWS_DEFAULT_REGION'] = aws_region
self._sess = lazy_sagemaker().Session(sagemaker_client=self._service_client)
self.mode = Mode.SAGEMAKER
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legacy AWS client loses Bedrock mode and embed parameters

Medium Severity

The mode parameter was removed from cohere_aws.Client.__init__, hard-coding it to Mode.SAGEMAKER. Additionally, output_dimension and embedding_types were removed from embed(), and dict-typed embedding responses are no longer handled. The PR states "No breaking changes," but existing code using Client(mode=Mode.BEDROCK) or passing embedding_types to embed() will break.

Additional Locations (1)
Fix in Cursor Fix in Web

self.lines = lines

def __iter__(self) -> typing.Iterator[bytes]:
return self.lines
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate Streamer class across two files

Low Severity

A new Streamer class in manually_maintained/streaming.py is identical to the existing Streamer class in aws_client.py. Both extend SyncByteStream with the same __init__ and __iter__ implementation. The OCI client imports from the new file while the AWS client keeps its own copy, creating redundant logic that risks diverging over time.

Additional Locations (1)
Fix in Cursor Fix in Web

@fede-kamel
Copy link
Author

@mkozakov Quick summary of where everything stands — all work is done and tested on our side.

3 PRs, all ready for review:

  1. cohere-python#718 (this PR) — OCI client with OciClient (V1) + OciClientV2 (V2), 47 tests passing live against OCI GenAI
  2. cohere-python#698 — embed_stream() for memory-efficient embedding, 9 unit tests + 6 e2e validated via OCI
  3. cohere-developer-experience#712 — one-line Fern config change so OciClient/OciClientV2 survive init.py regeneration

All code is Fern-safe — zero auto-generated files modified across any of these PRs. Let me know if anything needs adjustment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants