Overview
In the rapidly evolving world of AI, new language models are constantly emerging, each with its own strengths and specializations. Today, we’re diving into Shisa.AI, an open-source Japanese-English bilingual large language model suite developed by AKA Virtual Inc. in Tokyo, Japan. Officially launched on June 3rd, 2025, with Product Hunt recognition on June 6th, its flagship model, Shisa V2 405B, marks a watershed moment for Japanese AI: it is the highest-performing LLM ever trained in Japan and demonstrates that Japanese AI labs can compete on the global stage.
Key Features
Shisa.AI delivers a set of capabilities that raise the bar for Japanese AI:
World-Class Bilingual JA/EN Performance: Shisa V2 405B achieves a Japanese average score of 80.49, surpassing GPT-4 (74.45) and GPT-4 Turbo (76.49) and approaching state-of-the-art models such as GPT-4o (85.32) and DeepSeek-V3 (82.95) on Japanese benchmarks.
Comprehensive Model Family: Complete lineup from 7B to 405B parameters, with each model setting new state-of-the-art performance in its respective size class:
- 7B-14B models: Apache 2.0 and MIT licenses for unrestricted commercial use
- 32B-70B models: Enterprise-grade performance with extensive context support
- 405B flagship: Japan’s most powerful open-source model, rivaling GPT-4o
Advanced Synthetic Data Pipeline: Uses synthetic-data-driven post-training instead of expensive continued pre-training, achieving up to +32.6% improvement over base models on Japanese tasks.
Open-Source Excellence: Full transparency with models, datasets, and training code released under permissive licenses (Apache 2.0, MIT, Llama community licenses), enabling unrestricted research and commercial deployment.
Specialized Japanese Benchmarks: Includes purpose-built evaluation suites:
- shisa-jp-ifeval: Advanced Japanese instruction-following
- shisa-jp-rp-bench: Role-playing and conversational capabilities
- shisa-jp-tl-bench: High-quality Japanese-English translation
Multilingual CJK Support: 405B model incorporates Korean and Traditional Chinese data, making it truly multilingual for East Asian languages.
How It Works
Shisa.AI applies fine-tuning methodologies optimized specifically for Japanese. Built on strong foundation models (Llama 3.1, Qwen 2.5, Mistral, Phi-4), the team developed its own datasets through hundreds of experiments and evaluation cycles.
The core innovation lies in the ultra-orca-boros-en-ja-v1 dataset, a carefully filtered, regenerated, and resampled bilingual dataset that serves as one of the most effective resources for improving the Japanese capabilities of any base model. The dataset is freely available under the Apache 2.0 license, opening up high-quality Japanese AI development globally.
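To make the pipeline concrete, here is a minimal sketch of the data-preparation step that precedes supervised fine-tuning: loading the bilingual dataset and rendering one record with a base model’s chat template. The Hugging Face repository ids and field names below are assumptions for illustration, not the Shisa team’s published recipe.

```python
# Minimal sketch of the bilingual data-prep step (repo ids and field names are
# assumptions for illustration, not the published Shisa recipe).
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumed dataset repository id; substitute the id listed on the Shisa Hugging Face page.
ds = load_dataset("shisa-ai/ultra-orca-boros-en-ja-v1", split="train")
print(ds[0])  # inspect one record to see the actual field layout

# Any of the Shisa V2 base families could stand in here; Qwen 2.5 is used as an example.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Hypothetical field names ("instruction"/"output"); adjust to the real schema,
# then feed the rendered chat text to your SFT framework of choice.
example = ds[0]
messages = [
    {"role": "user", "content": example.get("instruction", "")},
    {"role": "assistant", "content": example.get("output", "")},
]
print(tok.apply_chat_template(messages, tokenize=False))
```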
Training also benefits from Japan’s copyright environment, which provides explicit legal protections for the use of copyrighted material in AI training, giving Japanese developers a significant advantage in data acquisition and model development.
The 405B model required over 50x the compute resources of the 70B version, with computational infrastructure provided by Ubitus K.K. and METI GENIAC, demonstrating significant investment in Japanese AI sovereignty.
Use Cases
Shisa.AI’s exceptional capabilities enable diverse applications across multiple sectors:
Japanese AI Research and Development: Provides researchers with world-class Japanese language models for advancing AI research, with performance validated against international standards.
Enterprise Bilingual Applications: Deploy production-ready chatbots, translation services, and content generation systems that excel in Japanese business contexts while maintaining English proficiency.
Educational Technology: Create sophisticated language learning platforms, automated essay scoring systems, and educational content generation specifically optimized for Japanese learners.
Content Creation and Media: Generate high-quality Japanese content, perform nuanced translation work, and create culturally appropriate marketing materials with unprecedented accuracy.
Government and Public Services: Support Japanese digital transformation initiatives with sovereign AI technology that doesn’t depend on foreign providers for critical language processing capabilities.
Academic Benchmarking: Establish new standards for Japanese LLM evaluation through comprehensive benchmark suites that measure real-world performance rather than generic metrics.
Technical Performance
Benchmark Leadership: Detailed performance comparison, with GPT-4.1 (2025-04-14) as the evaluator (a minimal judge-call sketch follows the table):
| Model | JA Avg | EN Avg | ELYZA 100 | JA MT-Bench | Rakuda |
|---|---|---|---|---|---|
| GPT-4.1 (2025-04-14) | 88.55 | 78.94 | 9.38 | 9.32 | 9.90 |
| GPT-4o (2024-11-20) | 85.32 | 73.34 | 9.30 | 9.52 | 9.88 |
| DeepSeek-V3 | 82.95 | 76.52 | 9.08 | 8.30 | 9.28 |
| Shisa V2 405B | 80.49 | 71.63 | 8.88 | 9.08 | 9.20 |
| GPT-4 Turbo | 76.49 | 68.64 | 8.66 | 8.12 | 8.35 |
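The scores above come from LLM-as-judge style evaluation, in which a strong model grades each candidate answer against a rubric. Below is a minimal sketch of a single judge call, assuming the OpenAI Python client; the question, answer, and grading prompt are hypothetical and do not reproduce the actual Shisa evaluation harness.

```python
# Minimal LLM-as-judge sketch (prompt, answer, and rubric are illustrative
# assumptions, not the exact setup behind the table above).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "日本の四季について簡単に説明してください。"      # hypothetical benchmark prompt
candidate = "日本には春・夏・秋・冬の四季があり、それぞれ..."  # hypothetical model answer

judge_prompt = (
    "You are grading a Japanese-language answer.\n"
    f"Question: {question}\n"
    f"Answer: {candidate}\n"
    "Rate the answer's correctness, fluency, and naturalness on a 1-10 scale. "
    "Reply with the number only."
)

resp = client.chat.completions.create(
    model="gpt-4.1",  # the evaluator used for the table above
    messages=[{"role": "user", "content": judge_prompt}],
)
print(resp.choices[0].message.content)  # e.g. "8"
```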
Scaling Efficiency: Demonstrable improvement across all model sizes:
- Llama 3.1 8B → Shisa V2 8B: +32.6% Japanese performance improvement
- Mistral Nemo 12B → Shisa V2 12B: +24.6% improvement
- Qwen 2.5 32B → Shisa V2 32B: +15.3% improvement
Pros & Cons
Advantages
- World-class Japanese performance: Highest-performing LLM ever developed in Japan, competitive with international flagship models
- Complete open-source ecosystem: Full model family, datasets, and training code freely available under permissive licenses
- Sovereign AI leadership: Demonstrates Japan’s capability for independent, world-class AI development
- Commercial readiness: Apache 2.0 and MIT licensed models enable unrestricted commercial deployment
- Continuous innovation: Active development with new benchmarks and evaluation frameworks
- Cost-effective scaling: Synthetic data approach provides superior efficiency compared to traditional pre-training methods
- Cultural alignment: Developed specifically for Japanese language nuances and cultural context
Disadvantages
- Substantial compute requirements: 405B model demands significant infrastructure for training and inference
- East Asian language focus: Primarily optimized for Japanese, English, and limited Korean/Chinese rather than global multilingual support
- Emerging ecosystem: Newer community compared to established international models like GPT-4 or Claude
- Resource intensity: Large models require substantial GPU resources for deployment and fine-tuning
How Does It Compare?
Shisa.AI occupies a unique position in the global LLM landscape as a sovereign AI achievement:
vs. GPT-4o: While GPT-4o keeps an edge in overall Japanese performance (85.32 vs. 80.49), Shisa V2 405B is open source, commercially deployable, and specifically optimized for Japanese use cases, with no dependency on foreign providers.
vs. DeepSeek-V3: DeepSeek-V3 offers broader multilingual capabilities, but Shisa V2 provides superior Japanese cultural alignment and is specifically designed for Japanese market needs with full transparency.
vs. Claude/Anthropic Models: Commercial models offer broader capabilities but lack the open-source transparency, Japanese optimization, and sovereign deployment options that Shisa.AI provides.
vs. Previous Japanese Models: Shisa V2 is a major step forward. Previous Japanese models typically lagged well behind international standards, while Shisa V2 competes directly at GPT-4-class performance.
Sovereign AI Impact
Shisa.AI represents more than a technological achievement; it demonstrates sovereign AI capabilities that let nations develop models reflecting their own language and culture. Japan’s explicit copyright protections for AI training provide strategic advantages that Shisa.AI successfully leverages.
The project shows that smaller AI labs can achieve world-class results through focused optimization, quality data curation, and specialized training methodologies, challenging the assumption that only large tech corporations can develop frontier AI models.
Getting Started
Getting started with Shisa.AI is straightforward:
- Download models: The complete Shisa V2 collection is available on Hugging Face (a minimal loading sketch follows this list)
- Try the live demo: Interactive chat at chat.shisa.ai featuring Shisa V2 405B (FP8)
- Access datasets: Core training datasets are available under the Apache 2.0 license
- Commercial deployment: Apache 2.0 and MIT licensed models enable unrestricted business use
- Research collaboration: Full transparency enables academic research and improvement
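As a quick start, the sketch below loads one of the smaller Shisa V2 checkpoints with Hugging Face transformers and runs a short bilingual generation. The repository id and settings are assumptions; check the Shisa V2 collection on Hugging Face for the exact model names.

```python
# Minimal sketch: run a mid-size Shisa V2 checkpoint locally.
# The repository id is an assumption; verify the exact name in the
# Shisa V2 collection on Hugging Face before running.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="shisa-ai/shisa-v2-llama3.1-8b",  # assumed repo id
    torch_dtype="auto",
    device_map="auto",  # requires accelerate; drop for CPU-only runs
)

messages = [
    {"role": "user", "content": "日本語と英語の両方で自己紹介してください。"},
]
out = generator(messages, max_new_tokens=200)
print(out[0]["generated_text"][-1]["content"])  # assistant reply
```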
Final Thoughts
Shisa.AI marks a notable shift in global AI development, showing that focused, culturally aligned AI research can achieve world-class results outside the traditional tech giants. The project’s success in approaching GPT-4o-level Japanese performance while maintaining complete open-source transparency sets a new standard for sovereign AI development.
For organizations requiring Japanese language excellence without foreign dependency, Shisa.AI provides unprecedented capabilities. The comprehensive model family ensures options for every deployment scenario, from edge devices (7B) to enterprise-scale applications (405B).
The project’s broader significance extends beyond Japanese AI—it demonstrates how nations can achieve AI sovereignty through strategic focus, quality data development, and innovative training methodologies. As AI becomes increasingly critical for national competitiveness, Shisa.AI serves as a blueprint for independent, world-class AI development.
Whether you’re a researcher, developer, or organization seeking Japanese AI capabilities, Shisa.AI offers a mature, performant, and completely open solution that rivals the best commercial offerings while providing the transparency and control that open-source enables.