Oceanir Intelligence
Oculus_v0.1
Unified_Stack
A vision model.
Reasoning and Instruct in one stack.
Oculus is a high-performance vision-language intelligence unit designed for standalone operation. It combines deep logical reasoning with direct instruction following in a single 8 GB architecture.
Designed for offline visual understanding, entity detection, and complex scene reasoning.
Reasoning and Instruct are not separate models—they are native capabilities of the Oculus stack.
Chain-of-thought analysis for complex scene decomposition and logical visual deduction.
Direct response generation for VQA, object counting, and high-speed entity detection.
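As a minimal sketch of how a hybrid stack might dispatch between the two paths above, the toy router below sends reasoning-flavored queries down a chain-of-thought path and everything else down the fast instruct path. All names here (`route_query`, `COT_KEYWORDS`, the keyword heuristic itself) are illustrative assumptions, not the actual Oculus API.

```python
# Hypothetical router between the two native modes described above.
# The keyword heuristic is purely illustrative.

COT_KEYWORDS = {"why", "explain", "compare", "reason", "decompose"}

def route_query(prompt: str) -> str:
    """Return 'cot' for queries that benefit from step-by-step
    reasoning, 'instruct' for fast direct answers (VQA, counting)."""
    words = set(prompt.lower().split())
    return "cot" if words & COT_KEYWORDS else "instruct"

def answer(prompt: str) -> str:
    mode = route_query(prompt)
    if mode == "cot":
        # Chain-of-thought path: intermediate reasoning before the answer.
        return f"[reasoning steps...] -> final answer for: {prompt}"
    # Instruct path: single direct generation, no intermediate steps.
    return f"direct answer for: {prompt}"
```

For example, `route_query("How many dogs are in the image")` takes the instruct path, while `route_query("Explain why the scene looks staged")` takes the chain-of-thought path.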
Core semantic feature extraction via Vision Transformer.
High-accuracy vision-language alignment module.
Unified core for captioning and VQA reasoning.
Fine-tuned visual question answering weights.
Cross-modal high-dimensional bridge.
Fast-path object localization units.
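The six components above read as stages of a processing pipeline. The sketch below wires hypothetical stubs for them in listed order to show the data flow; every function name, and the assumption that the stages run strictly in sequence (the fast-path detector may well be a parallel branch), is illustrative rather than the real Oculus implementation.

```python
from functools import reduce

# Hypothetical stage stubs, one per component listed above.
# Each stage appends a tag so the flow through the stack is visible.
def vit_encode(x):         return x + ["vit_features"]     # ViT extraction
def vl_align(x):           return x + ["aligned"]          # VL alignment
def unified_core(x):       return x + ["caption_vqa"]      # captioning/VQA core
def vqa_weights(x):        return x + ["vqa_tuned"]        # fine-tuned VQA
def cross_modal_bridge(x): return x + ["bridged"]          # cross-modal bridge
def detect_fast_path(x):   return x + ["detections"]       # fast localization

PIPELINE = [vit_encode, vl_align, unified_core,
            vqa_weights, cross_modal_bridge, detect_fast_path]

def run(image_tokens):
    """Thread an input through each stage in sequence."""
    return reduce(lambda acc, stage: stage(acc), PIPELINE, image_tokens)
```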
Memory Req: 8GB (VRAM / RAM)
Operation: Offline (100% standalone)
Logic: Hybrid (Instruct + CoT)
Encoders: Dual (DINOv2 + SigLIP)
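One common way to combine two vision backbones such as DINOv2 and SigLIP is to concatenate their per-patch embeddings before the alignment module. The sketch below shows that scheme with toy lists; fusion-by-concatenation, the dimensions, and the `fuse` helper are all assumptions for illustration, since the spec above does not state how Oculus fuses its encoders.

```python
# Illustrative dual-encoder fusion: concatenate per-patch embeddings
# from two backbones. The real Oculus fusion scheme is not specified.

def fuse(dino_embed, siglip_embed):
    """Concatenate matching patch embeddings from both encoders."""
    assert len(dino_embed) == len(siglip_embed), "patch counts must match"
    return [d + s for d, s in zip(dino_embed, siglip_embed)]

# Two toy patch embeddings per encoder (2 patches; dims 3 and 2).
dino   = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
siglip = [[1.0, 2.0],      [3.0, 4.0]]
fused  = fuse(dino, siglip)  # 2 patches, each of dim 3 + 2 = 5
```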
Research Release v0.1.0 — 2026