7 min read · Guard0 Team

Introducing TrustVector: The Open-Source AI Trust Directory

TrustVector.dev provides independent, evidence-based security evaluations for AI models, agents, and MCP servers. 106 systems evaluated and growing.

Open Source · AI Evaluation · TrustVector · AI Security · MCP

When you're evaluating a new JavaScript library, you check npm downloads, GitHub stars, security advisories, and maybe the dependency tree. When you're evaluating a database, you look at benchmarks, compliance certifications, and security track records.

But when you're evaluating an AI system—a model you'll integrate into your product, an agent platform you'll deploy in your enterprise, an MCP server you'll connect to your data—what do you check?

Vendor marketing? That one blog post someone wrote? Trust vibes?

We think the AI ecosystem deserves better. That's why we built TrustVector.

* * *

What is TrustVector?

TrustVector.dev is an open-source directory of evidence-based trust assessments for AI systems. We independently evaluate models, agents, and MCP servers across security, privacy, performance, and trust dimensions.

The numbers so far:

  • 106 AI systems evaluated
  • 38 models (Claude, GPT, Gemini, Llama, and more)
  • 34 agents (enterprise platforms, open-source frameworks)
  • 34 MCP servers (official and community)

Each evaluation includes:

  • Security assessment: Vulnerability to injection, data leakage, abuse
  • Privacy analysis: Data handling, retention, third-party sharing
  • Performance benchmarks: Capability, reliability, consistency
  • Trust score: Composite rating for quick comparison
  • Methodology transparency: How we tested and what we found
* * *

Why We Built This

The Trust Gap

Every week, we see announcements of new AI models, agent platforms, and tool integrations. The pace is exhilarating—and terrifying.

Security and engineering teams are being asked to evaluate and approve AI systems faster than ever, often with inadequate information:

  • Vendor documentation focuses on capabilities, not risks
  • Security disclosures are inconsistent or absent
  • Third-party evaluations are rare and often vendor-sponsored
  • There's no standardized way to compare AI system security

This creates a trust gap. Organizations deploy AI systems based on incomplete information, hoping for the best.

THE AI TRUST GAP

Security teams are being asked to evaluate AI systems faster than ever — with vendor docs that focus on capabilities, not risks. TrustVector provides the independent, standardized evaluations that the ecosystem has been missing.

One Team's Untested AI Is Another Team's Security Incident

We've seen it happen:

  • A company integrates an AI model without knowing its prompt injection resistance
  • A team deploys an MCP server without understanding its permission model
  • An enterprise adopts an agent platform without assessing data handling

Then something goes wrong. Data leaks. Systems are compromised. Compliance is violated.

The information to prevent these issues often exists—it's just not accessible or standardized.

TrustVector changes that.

* * *

How TrustVector Works

Evaluation Framework

TRUSTVECTOR EVALUATION PIPELINE
AI System → Red Team Testing → Privacy Analysis → Performance Test → Trust Assessment → TrustVector Score

We evaluate every system across five dimensions:

  • Security: Prompt injection resistance, output safety, access controls
  • Privacy: Data handling, retention policies, third-party sharing
  • Performance: Capability, reliability, consistency across conditions
  • Trust: Transparency, documentation, responsible disclosure
  • Operational excellence: Monitoring, logging, enterprise readiness

Each dimension gets a score, and we calculate a composite TrustVector Score for quick comparison.
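As an illustration, a composite score of this kind can be computed as a weighted average of the per-dimension scores. The sketch below is a hypothetical example, not TrustVector's actual formula; the weights are invented for illustration:

```python
# Hypothetical composite trust score: a weighted average of per-dimension
# scores (0-100). The weights below are illustrative, not TrustVector's.
WEIGHTS = {
    "security": 0.30,
    "privacy": 0.25,
    "performance": 0.20,
    "trust": 0.15,
    "operational_excellence": 0.10,
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted average of dimension scores; WEIGHTS sums to 1.0."""
    return round(sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS), 1)

example = {
    "security": 79, "privacy": 68, "performance": 75,
    "trust": 71, "operational_excellence": 88,
}
print(composite_score(example))
```

Weighting security and privacy more heavily than operational concerns is one reasonable choice for a trust-oriented score; a real scheme would publish and justify its weights.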

Here is what a sample TrustVector score profile looks like for a well-regarded model. The radar view makes it immediately clear where strengths and weaknesses lie.

SAMPLE TRUSTVECTOR SCORE PROFILE
[Radar chart: Safety 82 · Privacy 68 · Reliability 75 · Fairness 71 · Transparency 88 · Security 79]

Testing Methodology

We don't just read documentation—we test systems:

For Models:

  • Red team testing for prompt injection and jailbreaks
  • Data extraction attempts
  • Output safety evaluation
  • Consistency testing across similar prompts
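Consistency testing of this kind can be sketched as follows. The `stub_model` function is a stand-in for a real model call, and the harness is an illustration, not TrustVector's actual tooling:

```python
from collections import Counter

def consistency_rate(ask, prompts: list[str]) -> float:
    """Fraction of paraphrased prompts that yield the modal answer.
    `ask` is any callable mapping a prompt to an answer string."""
    answers = [ask(p).strip().lower() for p in prompts]
    modal_count = Counter(answers).most_common(1)[0][1]
    return modal_count / len(answers)

# Stub standing in for a real model API call (illustrative only).
def stub_model(prompt: str) -> str:
    return "paris" if "capital" in prompt.lower() else "unknown"

paraphrases = [
    "What is the capital of France?",
    "France's capital city is?",
    "Name the capital of France.",
]
print(consistency_rate(stub_model, paraphrases))  # 1.0 for this stub
```

A score near 1.0 means the model gives the same answer however the question is phrased; real harnesses would also normalize answers more carefully than a lowercase/strip pass.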

For Agents:

  • Tool abuse testing
  • Privilege escalation attempts
  • Memory security evaluation
  • Behavioral anomaly analysis

For MCP Servers:

  • Authentication and authorization testing
  • Input validation assessment
  • Permission model analysis
  • Data exposure testing
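One piece of a permission-model analysis can be sketched as comparing the tools an MCP server declares against what a deployment actually approved. The tool names and manifest shape below are invented for the example:

```python
def excess_permissions(declared_tools: set[str], allowlist: set[str]) -> set[str]:
    """Tools the server exposes beyond what the deployment approved."""
    return declared_tools - allowlist

# Hypothetical server manifest and approval list (names are invented).
declared = {"read_file", "write_file", "execute_shell"}
approved = {"read_file", "write_file"}

print(excess_permissions(declared, approved))  # {'execute_shell'}
```

Anything in the excess set is a finding: the server can do more than the reviewer signed off on, which is exactly the gap a permission-model review is meant to surface.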

Transparency

Every evaluation includes:

  • What we tested
  • How we tested it
  • What we found
  • Evidence and artifacts
  • Date and version tested

You can see exactly how we reached our conclusions.

* * *

What's In the Directory

Models

We evaluate models from major providers:

  • Anthropic: Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5
  • OpenAI: GPT-5.4, GPT-5.4-mini, GPT-5.4-nano, o3, o4-mini
  • Google: Gemini 3 Pro, Gemini 3 Flash, Gemini 2.5 Pro
  • Meta: Llama 4, Llama 3.3
  • Mistral: Mistral Large 2, Mixtral 8x22B
  • And many more

For each model, you get security analysis, privacy assessment, and comparison with similar models.

Agents

We evaluate agent platforms across categories:

Enterprise Platforms:

  • Microsoft Copilot Studio
  • Salesforce AgentForce
  • ServiceNow AI Agents

Cloud Provider Agents:

  • AWS Bedrock Agents
  • Google Vertex AI Agents

Open-Source Frameworks:

  • LangChain agents
  • LangGraph agents
  • CrewAI agents
  • AutoGen agents

MCP Servers

MCP is new, but we're building comprehensive coverage:

Official Servers:

  • Anthropic MCP servers
  • Reference implementations

Community Servers:

  • Popular open-source MCP servers
  • Common integrations

We're adding new evaluations every week.

* * *

How to Use TrustVector

Before Deploying an AI System

  1. Search for the system on TrustVector
  2. Review the security and privacy assessment
  3. Check the TrustVector Score against your requirements
  4. Read the detailed findings for specific concerns
  5. Make an informed deployment decision

When Comparing Options

TrustVector lets you compare multiple systems:

  • GPT-5.4 vs Claude Opus 4.6 for your security use case
  • AgentForce vs Copilot Studio for your deployment
  • Different MCP servers for your integration needs

Compare scores and findings side by side.

For Ongoing Monitoring

Bookmark systems you've deployed. We update evaluations when:

  • New versions are released
  • Security issues are discovered
  • Our methodology improves

Stay informed about systems in your environment.

* * *

Contributing to TrustVector

TrustVector is open source and community-driven. There are several ways to contribute:

Submit a System for Evaluation

Know an AI system that should be in TrustVector? Submit it:

  1. Open an issue on our GitHub repository
  2. Provide system details and where to access it
  3. We'll add it to our evaluation queue

Contribute an Evaluation

Want to evaluate a system yourself? We welcome contributions:

  1. Fork the repository
  2. Follow our evaluation methodology
  3. Submit a pull request with your evaluation
  4. We'll review and merge

Report Issues

Found a problem with an evaluation? Discovered new information about a system?

  1. Open an issue with details
  2. We'll investigate and update

Improve the Methodology

Have ideas for better evaluation approaches?

  1. Open a discussion on GitHub
  2. Propose methodology improvements
  3. Help us make evaluations more rigorous
* * *

Supported by Guard0

TrustVector is an open-source project supported by Guard0, the agent accountability platform.

We believe the AI ecosystem needs independent, transparent evaluation. TrustVector is our contribution to that goal—no strings attached.

The project is MIT licensed. Use it, fork it, contribute to it.

* * *

What's Next

We're continuously expanding TrustVector:

Coming Soon:

  • Automated evaluation pipelines for faster coverage
  • API access for programmatic queries
  • Embeddable badges for evaluated systems
  • Integration with CI/CD for deployment gates
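Once API access and CI/CD gates ship, a deployment gate could look something like the sketch below. The response shape, field names, and thresholds are all assumptions for illustration; TrustVector has not published an API schema:

```python
# Hypothetical evaluation payload from a future TrustVector API
# (the shape and field names are assumed for this sketch).
evaluation = {
    "system": "example-mcp-server",
    "trustvector_score": 72,
    "dimensions": {"security": 65, "privacy": 70},
}

MIN_SCORE = 70     # illustrative overall floor for deployment
MIN_SECURITY = 60  # illustrative per-dimension floor

def gate(ev: dict) -> bool:
    """True if the evaluated system passes the deployment gate."""
    return (ev["trustvector_score"] >= MIN_SCORE
            and ev["dimensions"]["security"] >= MIN_SECURITY)

# A real CI gate would exit nonzero on FAIL to block the pipeline.
print("PASS" if gate(evaluation) else "FAIL")
```

The idea is the same as any other quality gate: the pipeline fails fast when a system in your dependency chain drops below the floor you set.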

On the Roadmap:

  • 500+ systems evaluated by end of 2026
  • Enterprise evaluation requests
  • Custom evaluation criteria
  • Community-driven rankings
* * *

Try It Now

Visit TrustVector.dev to:

  • Browse 106 evaluated AI systems
  • Search for specific models, agents, or MCP servers
  • Compare systems side by side
  • Contribute your own evaluations

And remember: One team's untested AI is another team's security incident.

Don't be that team.

* * *
Key Takeaways
  • TrustVector fills the AI evaluation gap with independent, evidence-based security assessments
  • 106 systems evaluated across models, agents, and MCP servers from all major providers
  • Every evaluation includes transparent methodology — see exactly how we test and what we find
  • Open source and community-driven under MIT license — contributions welcome
  • Free to use with no account required and no vendor lock-in
* * *

* * *

Need More Than Evaluation?

TrustVector tells you if an AI system is trustworthy. Guard0 makes your agents accountable — discovering every agent, assessing every risk, and proving every action.

Join the Beta → Get Early Access

* * *

See also AIHEM for hands-on AI security training. For automated security scanning, try g0.

TrustVector is maintained by the Guard0 team and the open-source community. Visit trustvector.dev or contribute on GitHub.
