Today we're releasing three things: Oculus, a standalone hybrid-reasoning vision-language model that outperforms systems 10x its size; oceanir-search, semantic search across your analysis history; and Oceanir-Memory, persistent context that learns from every query.
For months, we've been working on a fundamental question: how do you build a vision model that's small enough to run anywhere, smart enough to reason through complex visual tasks, and integrated enough to remember what it's learned?
The answer is Oculus—and the ecosystem we're building around it.
What We're Shipping
01 — Model
Oculus-0.1
Hybrid-reasoning VLM built on the OO1 Architecture. Small model, large-model performance.
02 — Feature
oceanir-search
Semantic search across your entire analysis history using natural language.
03 — Feature
Oceanir-Memory
Persistent memory that learns from every analysis. Cross-session context.
A Small Model That Thinks Big
Oculus is a hybrid-reasoning vision-language model built on the Oceanir-Oculus OO1 Architecture. It's designed to outperform systems 10x larger on visual reasoning and perception tasks, and it's optimized to run on commodity GPUs or edge devices.
The key insight: instead of scaling parameters, we scale reasoning. Oculus uses structured thinking traces to work through complex visual tasks step by step, and perceptive tool calling to automatically zoom in on and crop the regions that matter.
"The best vision model isn't the largest one—it's the one that knows where to look and how to think about what it sees."
- OO1 Architecture Paper, 2025
from oceanir import Oculus
model = Oculus.from_pretrained("OceanirAI/Oculus-0.1")
# Basic VQA
answer = model.ask("image.jpg", "What is this?")
# With reasoning traces
answer = model.ask("scene.jpg", "Count the people", think=True)
# With focus/zoom for fine details
answer = model.ask("document.jpg", "Read the fine print", focus=True)
# Structured JSON output
result = model.generate("objects.jpg", prompt="Describe objects", mode="json")
How Oculus Works
Oculus introduces two core mechanisms that deliver large-model quality at small-model size:
Thinking Traces
When think=True, Oculus generates structured reasoning wrapped in <think>...</think> tags. This multi-step reasoning dramatically improves performance on counting, spatial relationships, and ambiguous scenes.
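The exact return shape of ask() with think=True isn't shown above; here's a minimal sketch assuming it returns the raw generation with the <think>...</think> block inline, so you can separate the trace from the final answer:

import re
from oceanir import Oculus

model = Oculus.from_pretrained("OceanirAI/Oculus-0.1")

# Assumption: with think=True, ask() returns the raw text including the
# <think>...</think> reasoning block described above.
raw = model.ask("scene.jpg", "Count the people", think=True)

# Separate the structured reasoning trace from the final answer.
match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
trace = match.group(1).strip() if match else ""
answer = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()

print("Reasoning trace:\n", trace)
print("Answer:", answer)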
Perceptive Focus
When focus=True, Oculus automatically identifies regions of interest and zooms/crops to examine them in detail. This enables fine-grained perception—reading small text, detecting tiny objects, analyzing dense scenes—without increasing model size.
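The snippet above shows think and focus separately; a minimal sketch, assuming the two flags can be combined in a single call, looks like this:

from oceanir import Oculus

model = Oculus.from_pretrained("OceanirAI/Oculus-0.1")

# Assumption: think=True and focus=True compose, so the model zooms into
# regions of interest and reasons over the crops step by step.
answer = model.ask(
    "shelf.jpg",
    "How many distinct product brands are visible on the top shelf?",
    think=True,
    focus=True,
)
print(answer)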
What Oculus Can Do
Oculus brings six core capabilities to visual understanding:
Reasoning via Thinking Traces
Structured reasoning traces for multi-step decisions. Improves understanding of small objects and performance on ambiguous spatial tasks.
Perceptive Focus (Zoom & Crop)
Automatic zooming and cropping to focus on relevant regions. Dramatically improves fine-grained perception.
Structured Outputs
Reliable JSON generation for predictable downstream integration. Points, boxes, polygons, and more (see the sketch after this list).
Complex OCR
Enhanced text recognition across cluttered, low-resolution, and distorted regions. Works on documents, diagrams, and dense scenes.
Desktop & UI Understanding
Better performance on UI navigation and everyday workflows. Detects UI elements with types and bounding boxes.
Edge-Ready Architecture
Optimized for commodity GPUs and edge devices. Small model footprint with large-model performance.
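To make the structured-output and UI-understanding capabilities concrete, here's a minimal sketch built on the mode="json" call shown earlier. The element schema (label, type, box fields) is an illustrative assumption, not a documented contract:

import json
from oceanir import Oculus

model = Oculus.from_pretrained("OceanirAI/Oculus-0.1")

# mode="json" appears in the SDK snippet above; the field names below are
# assumptions for illustration only.
result = model.generate(
    "screenshot.png",
    prompt="List every interactive UI element with its type and bounding box",
    mode="json",
)

elements = json.loads(result) if isinstance(result, str) else result
for el in elements:
    label = el.get("label", "?")
    kind = el.get("type", "?")
    box = el.get("box")  # e.g. [x0, y0, x1, y1]; the format is an assumption
    print(f"{kind:>10} | {label:<30} | {box}")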
Find What You've Seen
Traditional search requires exact matches—dates, filenames, coordinates. oceanir-search works differently. It understands what you're describing and retrieves matching analyses using semantic similarity.
Search your analysis history using natural language:
- "That street with the blue and white tiles in Portugal"
- "The intersection near that pink Art Deco building"
- "Photos from last week that showed rooftop terraces"
- "All analyses where we detected Japanese text"
oceanir-search indexes every analysis automatically—extracted features, detected objects, OCR text, reasoning traces—and makes it all searchable through a single query interface.
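We won't go into oceanir-search's internals here, but the underlying idea, ranking past analyses by semantic similarity to a natural-language query, can be sketched with an off-the-shelf embedding model (this uses the open-source sentence-transformers library, not the oceanir SDK):

from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Toy stand-in for indexed analysis records (features, objects, OCR, traces).
history = [
    "Street with blue and white azulejo tiles, Porto, Portugal",
    "Intersection next to a pink Art Deco cinema",
    "Rooftop terraces with dense Japanese signage below",
]

query = "that street with the blue and white tiles in Portugal"

# Embed the records and the query, then rank by cosine similarity.
history_emb = encoder.encode(history, convert_to_tensor=True)
query_emb = encoder.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, history_emb)[0]
best = int(scores.argmax())
print(f"Best match ({float(scores[best]):.2f}): {history[best]}")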
Context That Persists
Every geolocation tool we've used has the same limitation: no memory. Analyze an image of a street in Barcelona, and when a similar street shows up later, the system processes it as if it were seeing it for the first time. This is wasteful.
Oceanir-Memory changes this. Every analysis you run contributes to a persistent knowledge store that improves future queries:
Previously analyzed locations are recognized immediately. No redundant processing.
Architectural patterns, signage styles, and regional signatures are indexed automatically.
Insights from one session inform the next. Your knowledge graph grows with every analysis.
Memory is encrypted per-user. You can delete everything instantly—no backups, no holds.
Try It Now
Oculus-0.1 is available now for research and non-commercial use. Install the Python SDK and start experimenting:
pip install oceanir
oceanir-search and Oceanir-Memory are available to all Pro and Enterprise users on the Oceanir platform. Your analysis history is automatically indexed—just start searching.
Experience the Full Stack
Oculus, oceanir-search, and Oceanir-Memory are available now. Run Oculus locally or use the full platform with search and memory.

