Envisioning is an emerging technology research institute and advisory.

Portuguese-Language Foundation Models (Sabiá)

Maritaca AI's Sabiá-3 LLM matches GPT-4o accuracy on 64 Brazilian professional exams at 3-4x lower cost per token, positioning it as the first sovereign foundation model for Portuguese.

Geography: Americas · South America · Brazil

Maritaca AI, a Campinas-based startup spun out of Unicamp's computer science department, developed the Sabiá family of large language models, trained specifically on Portuguese text. Sabiá-3, released in 2024, achieves accuracy comparable to GPT-4o across 64 Brazilian exams, including the OAB (bar exam), ENEM (university entrance), and ENADE (the national assessment of undergraduate students), while costing 3-4x less per token than frontier US models.

The strategic significance is linguistic sovereignty in the AI era. Portuguese is spoken by 260+ million people across Brazil, Portugal, Angola, Mozambique, and other Lusophone nations. Models trained primarily on English text underperform on Portuguese legal, medical, and cultural tasks, and the gap widens for domain-specific applications like legal analysis (the Juru model specializes in Brazilian law). Maritaca AI builds on high-quality Portuguese training datasets curated from Common Crawl with industrial-grade filtering.

Brazil's AI Plan 2024-2028 explicitly targets sovereign AI capability, including foundation models. Maritaca AI represents the private-sector complement: a commercially viable LLM that keeps Portuguese language processing under Brazilian control rather than depending entirely on OpenAI, Google, or Chinese alternatives. The model is being adopted by Brazilian enterprises for customer service, document analysis, and compliance workflows where Portuguese-language accuracy is mission-critical.

TRL: 7/9 (Operational)
Impact: 2/5
Investment: 4/5
Category: Software
