feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support#718
feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support#718fede-kamel wants to merge 33 commits intocohere-ai:mainfrom
Conversation
0e341c1 to
fdebc00
Compare
a284ea8 to
d7c7ef6
Compare
|
@walterbm-cohere @daniel-cohere @billytrend-cohere Hey maintainers, Friendly bump on this PR - would appreciate your feedback when you have a chance. Happy to address any concerns or make changes as needed. Thanks. |
49f92cc to
8b9c63d
Compare
|
Addressed Bugbot feedback:
|
3bf4c54 to
b78c63e
Compare
|
@sanderland Thanks for the approvals on this PR and the others (#717, #698, #697)! What are the next steps to get these merged? |
|
@sanderland quick ping on this one since you approved earlier - could you please take a final look when you have a moment? Thanks! |
1119be8 to
12938af
Compare
fb5c414 to
b3fcdee
Compare
|
@billytrend-cohere @walterbm-cohere @daniel-cohere @sanderland This PR is the canonical Oracle proposal for OCI support in What is in scope on this branch:
Current verification on this branch:
Please treat |
OCI doesn't provide a generation ID in responses. Previously used modelId which is the model name (e.g. 'cohere.command-r-08-2024'), not a unique generation identifier. Now generates a proper UUID.
- Add validation for direct credentials (user_id requires fingerprint and tenancy_id) - Emit message-end event for V2 streaming before [DONE]
a9f125f to
9f1e924
Compare
|
Validation on current PR head OCI test results from this branch:
Models exercised in the passing live runs:
Expected skips in live OCI remain limited to service/model availability constraints:
So on this PR head, all runnable OCI tests pass, and the remaining skips are expected OCI capability or region-availability gaps rather than code failures. |
|
Validation update from
The remaining Live models exercised in the passing runs:
The OCI test file was also trimmed to supported scenarios only, so the live runs no longer depend on permanently skipped coverage for unsupported on-demand generation/rerank or region-specific model availability. |
| "embeddings": embeddings, | ||
| "texts": [], # OCI doesn't return texts | ||
| "meta": meta, | ||
| } |
There was a problem hiding this comment.
Missing response_type discriminant in embed response
Medium Severity
The embed response dict is missing the response_type field required by the EmbedResponse discriminated union. EmbedResponse is an Annotated union with UnionMetadata(discriminant="response_type"), discriminating between EmbeddingsFloatsEmbedResponse (expects "embeddings_floats") and EmbeddingsByTypeEmbedResponse (expects "embeddings_by_type"). Without it, discriminated union resolution in _convert_union_type fails and falls through to undiscriminated matching, which is fragile and could break if the SDK's internal resolution logic changes.
| transformed_content.append(item) | ||
| oci_msg["content"] = transformed_content | ||
| else: | ||
| oci_msg["content"] = msg.get("content", []) |
There was a problem hiding this comment.
Content None not defaulting to empty array
Low Severity
When a V2 chat message has content explicitly set to None (e.g., an assistant message with only tool calls), msg.get("content", []) returns None rather than [] because dict.get only uses the default when the key is absent, not when its value is None. This sends "content": null to OCI instead of an empty array, which may be rejected by the OCI API.
- Remove OciClient (V1 API) class entirely - Remove all is_v2 parameter threading and body-sniffing detection - Remove V1 chat (single message string), V1 stream format, generate/rerank endpoints - OciClientV2 is now the only client, always uses COHEREV2 apiFormat - Add validation: raise ValueError if messages array is missing - Add oci_client.py to .fernignore to survive Fern regeneration - Update __init__.py and lazy_oci_deps.py to reference only OciClientV2 - Update tests: remove V1 test classes, add missing-messages test - All 41 tests pass (11 integration + 30 unit) against LUIGI_FRA_API
| import uuid | ||
|
|
||
| import httpx | ||
| import requests |
There was a problem hiding this comment.
Unconditional requests import adds hidden hard dependency
Low Severity
The requests library is imported unconditionally at the top of oci_client.py, but it's only used inside the map_request_to_oci closure for creating a PreparedRequest for OCI signing. While requests is a project dependency, the OCI client was designed with lazy loading in mind (the OCI SDK uses lazy_oci()), yet requests is eagerly imported when the module loads. This is inconsistent with the lazy-loading pattern used for the OCI SDK itself.
…dation - Add OciClient back for V1 API (Command R, embeddings) - OciClientV2 for V2 API (Command A, COHEREV2 format) - Fail-fast validation: V2 body on V1 client or V1 body on V2 client raises ValueError with clear guidance on which client to use - V1/V2 determined by client class, not body sniffing - Full V1 chat, stream, and response transforms restored - Par to par with BedrockClient/BedrockClientV2 architecture - 47 tests pass (14 integration + 33 unit) against OCI GenAI
Add OCI client exports so they survive Fern regeneration of __init__.py. Follows the same pattern as BedrockClient/SagemakerClient. Related: cohere-ai/cohere-python#718
|
@mkozakov @fern-support This PR is ready for review. Summary of latest changes:
Companion PR for Fern config: cohere-ai/cohere-developer-experience#712 (adds Tracking issue: #735 |
Final test run — 47/47 passedAll tests passing against the live OCI GenAI inference layer ( V1 (OciClient) — Command R family, COHERE apiFormat
V2 (OciClientV2) — Command A family, COHEREV2 apiFormat
Validation tests
Fern safety
|
| ) | ||
| request.url = URL(url) | ||
| request.headers["host"] = request.url.host | ||
| headers["host"] = request.url.host |
There was a problem hiding this comment.
SigV4 signing uses stale host after URL rewrite
High Severity
The line headers["host"] = request.url.host was removed from the request hook. The headers variable is a copy of the original request headers (containing api.cohere.com), made before the URL is rewritten to the Bedrock/SageMaker endpoint. Without updating headers["host"], the AWSRequest used for SigV4 signing receives the stale host, producing an invalid signature. This breaks all Bedrock and SageMaker API calls with authentication errors.
Additional Locations (1)
| if os.environ.get('AWS_DEFAULT_REGION') is None: | ||
| os.environ['AWS_DEFAULT_REGION'] = aws_region | ||
| self._sess = lazy_sagemaker().Session(sagemaker_client=self._service_client) | ||
| self.mode = Mode.SAGEMAKER |
There was a problem hiding this comment.
Legacy AWS client loses Bedrock mode and embed parameters
Medium Severity
The mode parameter was removed from cohere_aws.Client.__init__, hard-coding it to Mode.SAGEMAKER. Additionally, output_dimension and embedding_types were removed from embed(), and dict-typed embedding responses are no longer handled. The PR states "No breaking changes," but existing code using Client(mode=Mode.BEDROCK) or passing embedding_types to embed() will break.
Additional Locations (1)
| self.lines = lines | ||
|
|
||
| def __iter__(self) -> typing.Iterator[bytes]: | ||
| return self.lines |
There was a problem hiding this comment.
Duplicate Streamer class across two files
Low Severity
A new Streamer class in manually_maintained/streaming.py is identical to the existing Streamer class in aws_client.py. Both extend SyncByteStream with the same __init__ and __iter__ implementation. The OCI client imports from the new file while the AWS client keeps its own copy, creating redundant logic that risks diverging over time.
Additional Locations (1)
|
@mkozakov Quick summary of where everything stands — all work is done and tested on our side. 3 PRs, all ready for review:
All code is Fern-safe — zero auto-generated files modified across any of these PRs. Let me know if anything needs adjustment. |


Overview
I noticed that the Cohere Python SDK has excellent integration with AWS Bedrock through the
BedrockClientimplementation. I wanted to contribute a similar integration for Oracle Cloud Infrastructure (OCI) Generative AI service to provide our customers with the same seamless experience.Motivation
Oracle Cloud Infrastructure offers Cohere's models through our Generative AI service, and many of our enterprise customers use both platforms. This integration follows the same architectural pattern as the existing Bedrock client, ensuring consistency and maintainability.
Implementation
This PR adds comprehensive OCI support with:
Features
~/.oci/config)Architecture
Testing
Documentation
Files Changed
src/cohere/oci_client.py(910 lines) - Main OCI client implementationsrc/cohere/manually_maintained/lazy_oci_deps.py(30 lines) - Lazy OCI SDK loadingtests/test_oci_client.py(393 lines) - Comprehensive integration testsREADME.md- OCI usage documentationpyproject.toml- Optional OCI dependencysrc/cohere/__init__.py- Export OciClient and OciClientV2Test Results
Skipped tests are for OCI service limitations (base models not callable via on-demand inference).
Breaking Changes
None. This is a purely additive feature.
Checklist
Note
High Risk
Adds a large new OCI transport layer (signing, request/response rewriting, and streaming event transformation) and also changes the manually-maintained AWS client API/behavior (drops
modesupport and embed params), which could be breaking for existing users.Overview
Adds first-class Oracle Cloud Infrastructure (OCI) support by introducing
OciClient(v1) andOciClientV2(v2) that route Cohereembed/chat/chat_streamcalls through OCI Generative AI via httpx event hooks, including request signing, model-name normalization, and bidirectional payload/stream event transformations.Updates packaging and docs to make OCI optional (
ociextra inpyproject.toml), lazily import the OCI SDK (lazy_oci_deps.py), export the new clients fromcohere.__init__, and document OCI setup/auth methods and limitations inREADME.md.Adjusts AWS integrations by fixing the Bedrock/SageMaker SigV4 host header rewrite in
aws_client.py, and simplifying the manually maintainedcohere_aws.Clientto always initialize in SageMaker mode while narrowingembed()(removesoutput_dimension/embedding_typesand dict-return path); AWS unit tests are removed and replaced with a comprehensivetests/test_oci_client.pysuite (mix of live-gated integration and pure unit transformation tests).Written by Cursor Bugbot for commit d48cca4. This will update automatically on new commits. Configure here.