{
  "session_id": "20260403_030040_296718",
  "model": "claude-opus-4-6",
  "base_url": "https://api.anthropic.com",
  "platform": "telegram",
  "session_start": "2026-04-03T03:00:40.422380",
  "last_updated": "2026-04-03T03:01:43.472156",
  "system_prompt": "You are Hermes Agent, an intelligent AI assistant created by Nous Research. You are helpful, knowledgeable, and direct. You assist users with a wide range of tasks including answering questions, writing and editing code, analyzing information, creative work, and executing actions via your tools. You communicate clearly, admit uncertainty when appropriate, and prioritize being genuinely useful over being verbose unless otherwise directed below. Be targeted and efficient in your exploration and investigations.\n\nConversation started: Friday, April 03, 2026 03:00 AM\nModel: claude-opus-4-6\nProvider: anthropic\n\nYou are on a text messaging communication platform, Telegram. Please do not use markdown as it does not render. You can send media files natively: to deliver a file to the user, include MEDIA:/absolute/path/to/file in your response. Images (.png, .jpg, .webp) appear as photos, audio (.ogg) sends as voice bubbles, and videos (.mp4) play inline. You can also include image URLs in markdown format ![alt](url) and they will be sent as native photos.",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "web_extract",
        "description": "Extract content from web page URLs. Returns page content in markdown format. Also works with PDF URLs (arXiv papers, documents, etc.): pass the PDF link directly and it is converted to markdown text. Pages under 5000 chars return full markdown; larger pages are LLM-summarized and capped at ~5000 chars per page. Pages over 2M chars are refused. If a URL fails or times out, use the browser tool to access it instead.",
        "parameters": {
          "type": "object",
          "properties": {
            "urls": {
              "type": "array",
              "items": {
                "type": "string"
              },
              "description": "List of URLs to extract content from (max 5 URLs per call)",
              "maxItems": 5
            }
          },
          "required": [
            "urls"
          ]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "web_search",
        "description": "Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions.",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string",
              "description": "The search query to look up on the web"
            }
          },
          "required": [
            "query"
          ]
        }
      }
    }
  ],
  "message_count": 11,
  "messages": [
    {
      "role": "user",
      "content": "Research DeepSeek API and Moonshot.ai (Kimi) API as of April 2026. For each: 1) Current model lineup and pricing, 2) Whether they have OpenAI-compatible endpoints, 3) Best models available (DeepSeek V3, R1, etc. and Kimi K2.5, etc.), 4) Any free tiers or promotional credits. Also research OpenRouter's current free model offerings and whether they still offer free access to new frontier models temporarily. Return detailed findings."
    },
    {
      "role": "assistant",
      "content": "",
      "reasoning": "Let me research all three APIs simultaneously.",
      "finish_reason": "tool_calls",
      "reasoning_details": [
        {
          "signature": "EukBClkIDBgCKkAsxxovnxgN9RvNlFvWVgR0G+fC/YB8XbDJBj8BIDm5kFfAMlWdolhZ7yCXkfTxEZODSf2QhvmKdIg1Ep1Wl5paMg9jbGF1ZGUtb3B1cy00LTY4ABIMaWDVDw/aznglgTNvGgxt+dhCNxpbebWfitMiMNq1Wu6LLvELrsg0rFLLbatitHgzzlGpra3dSoEBAQr71JrXQe5k7zXpNySDDcZHOyo+a7DKXffUJQGN2PBXBTytvIA7Y8hZ2/4g6fJo6aWM80BrP3rapuF/zTzc7aGrNH1PVU3hLi/YmMefOFhSV1EYAQ==",
          "thinking": "Let me research all three APIs simultaneously.",
          "type": "thinking"
        }
      ],
      "tool_calls": [
        {
          "id": "toolu_019T4J5F5AUuxCV2QsWi5GXG",
          "call_id": "toolu_019T4J5F5AUuxCV2QsWi5GXG",
          "response_item_id": "fc_toolu_019T4J5F5AUuxCV2QsWi5GXG",
          "type": "function",
          "function": {
            "name": "web_search",
            "arguments": "{\"query\": \"DeepSeek API pricing models 2025 2026\"}"
          }
        },
        {
          "id": "toolu_016gW3fE5q6f1AmKbiabHwey",
          "call_id": "toolu_016gW3fE5q6f1AmKbiabHwey",
          "response_item_id": "fc_toolu_016gW3fE5q6f1AmKbiabHwey",
          "type": "function",
          "function": {
            "name": "web_search",
            "arguments": "{\"query\": \"Moonshot.ai Kimi API pricing models 2025 2026\"}"
          }
        },
        {
          "id": "toolu_01Art2mfkvCP1atZw4JTNaub",
          "call_id": "toolu_01Art2mfkvCP1atZw4JTNaub",
          "response_item_id": "fc_toolu_01Art2mfkvCP1atZw4JTNaub",
          "type": "function",
          "function": {
            "name": "web_search",
            "arguments": "{\"query\": \"OpenRouter free models 2025 2026 frontier models\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "{\n  \"success\": true,\n  \"data\": {\n    \"web\": [\n      {\n        \"title\": \"DeepSeek Pricing Explained (2025): Models, Token Costs, and Tiers\",\n        \"url\": \"https://www.juheapi.com/blog/deepseek-pricing-explained-2025-models-token-costs-and-tiers\",\n        \"description\": \"GPT-5 and Claude Sonnet 4 show evident cost advantages via Wisdom Gate. DeepSeek model is entirely free until January 1, 2026 at Wisdom Gate.\",\n        \"position\": 1\n      },\n      {\n        \"title\": \"DeepSeek AI Pricing - Free Chat & Affordable API | 2025 Guide\",\n        \"url\": \"https://deepseek.ai/pricing\",\n        \"description\": \"# DeepSeek AI Pricing - Free Chat & Affordable API | 2025 Guide. [DeepSeek AI](https://deepseek.ai/). [Home](https://deepseek.ai/)Products Compare Resources[Pricing](https://deepseek.ai/pricing). 1.   [Home](https://deepseek.ai/). # DeepSeek AI Pricing. DeepSeek offers the most cost-effective AI API on the market. [Start Free](https://chat.deepseek.com/). *   DeepSeek-V3: $0.14 input / $0.28 output. *   DeepSeek-R1: $0.55 input / $2.19 output. [Get API Key](https://platform.deepseek.com/). [Contact Sales](https://deepseek.com/). ## Price Comparison: DeepSeek vs Competitors. | DeepSeek-V3 | $0.14 | $0.28 |. | DeepSeek-R1 | $0.55 | $2.19 |. ### Is DeepSeek free to use? Yes, DeepSeek offers free access to their chat interface at [chat.deepseek.com](https://chat.deepseek.com/). ### How much does the DeepSeek API cost? DeepSeek API pricing is significantly lower than competitors. DeepSeek-R1 (reasoning model) costs $0.55 per million input tokens and $2.19 per million output tokens. The DeepSeek API costs about 95% less than GPT-4 Turbo while offering comparable performance on many benchmarks. Try DeepSeek Chat for free or get an API key to integrate into your applications. 
[Try Free Chat](https://chat.deepseek.com/)[View Documentation](https://deepseek.ai/docs).\",\n        \"position\": 2\n      },\n      {\n        \"title\": \"Models & Pricing - DeepSeek API Docs\",\n        \"url\": \"https://api-docs.deepseek.com/quick_start/pricing\",\n        \"description\": \"# Models & Pricing. The prices listed below are in units of per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. We will bill based on the total number of input and output tokens by the model. ## Model Details​. **NOTE: The `deepseek-chat` and `deepseek-reasoner` correspond to the model version DeepSeek-V3.2 (128K context limit), which differs from the APP/WEB version.**. | MODEL | | deepseek-chat | deepseek-reasoner |. | MODEL VERSION | | DeepSeek-V3.2  (Non-thinking Mode) | DeepSeek-V3.2  (Thinking Mode) |. | PRICING | 1M INPUT TOKENS (CACHE HIT) | $0.028 | |. | 1M INPUT TOKENS (CACHE MISS) | $0.28 | |. | 1M OUTPUT TOKENS | $0.42 | |. The expense = number of tokens × price. Product prices may vary and DeepSeek reserves the right to adjust them. We recommend topping up based on your actual usage and regularly checking this page for the most recent pricing information.\",\n        \"position\": 3\n      },\n      {\n        \"title\": \"DeepSeek API Pricing Calculator & Cost Guide (Apr 2026) - CostGoat\",\n        \"url\": \"https://costgoat.com/pricing/deepseek-api\",\n        \"description\": \"## DeepSeek API Cost Calculator - Monthly Pricing. | DeepSeek V3.2 reasoning Thinking mode with Chain-of-Thought deepseek-reasoner | 128K 64K output | 82 | In: $0.15 Out: $0.42 (50% cache hits) | $0.36 |. | DeepSeek V3.2 Non-thinking mode for general tasks deepseek-chat | 128K 8K output | 63 | In: $0.15 Out: $0.42 (50% cache hits) | $0.36 |. 
The DeepSeek API provides programmatic access to DeepSeek's V3.2 model in two modes: non-thinking (deepseek-chat) for general tasks and thinking mode (deepseek-reasoner) for advanced reasoning. Use deepseek-chat (non-thinking mode) for general tasks with 8K max output. As of September 29, 2025, DeepSeek V3.2 powers both deepseek-chat and deepseek-reasoner with unified pricing: $0.028 cache hit, $0.28 cache miss, $0.42 output per million tokens. DeepSeek V3.2 is one of the most cost-effective APIs available at $0.28/$0.42 per million tokens (cache miss). DeepSeek V3.2 pricing per 1M tokens (unified for both modes): Cache hit $0.028, Cache miss $0.28, Output $0.42.\",\n        \"position\": 4\n      },\n      {\n        \"title\": \"DeepSeek V3.2 Speciale API Pricing 2026 - Price Per Token\",\n        \"url\": \"https://pricepertoken.com/pricing-page/model/deepseek-deepseek-v3.2-speciale\",\n        \"description\": \"Try Price Per Token MCP. Price Per TokenPrice Per Token. # DeepSeek V3.2 Speciale API Pricing 2026. Compare pricing, benchmarks, and providers for DeepSeek V3.2 Speciale. Compare DeepSeek V3.2 Speciale with 0 similar models by price. | DS Deepseek | DeepSeek V3.2 Speciale | $0.400 | $1.200 | 89.6 | 86.3 | 87.1 | 163,840 | Try |. \\\\* Some models use tiered pricing based on prompt length. DeepSeek V3.2 Speciale is available from multiple providers with different pricing and availability. How does DeepSeek V3.2 Speciale perform? Compare DeepSeek V3.2 Speciale's benchmark performance against all models. ## Try DeepSeek V3.2 Speciale. Test DeepSeek V3.2 Speciale directly in your browser. ## Compare DeepSeek V3.2 Speciale. DeepSeek V3.2 Speciale costs $0.000400 per 1,000 input tokens and $0.001200 per 1,000 output tokens. DeepSeek V3.2 Speciale is available from 0 provider(s). We compare DeepSeek V3.2 Speciale with 0 similarly-priced models. ## All Deepseek Models. See pricing for all Deepseek models. 
#### Models & Pricing.\",\n        \"position\": 5\n      }\n    ]\n  }\n}",
      "tool_call_id": "toolu_019T4J5F5AUuxCV2QsWi5GXG"
    },
    {
      "role": "tool",
      "content": "{\n  \"success\": true,\n  \"data\": {\n    \"web\": [\n      {\n        \"title\": \"Kimi API Pricing Calculator & Cost Guide (Apr 2026) - CostGoat\",\n        \"url\": \"https://costgoat.com/pricing/kimi-api\",\n        \"description\": \"# Kimi API Pricing Calculator & Complete Cost Guide. Calculate Kimi API costs for K2, K2 Thinking, and Moonshot models. ## Kimi API Cost Calculator - Monthly Pricing. Kimi API provides access to Moonshot AI's large language models, including the flagship Kimi K2.5 - a native multimodal model with vision, thinking modes, and agentic capabilities. Choose Kimi K2 for coding and agentic tasks where you need strong performance at budget-friendly prices. Cached tokens cost only $0.15/M vs $0.60/M for K2 models. Kimi's automatic context caching reduces input costs by 75% ($0.15/M vs $0.60/M for K2). Get alerts when cache hit rates drop, when K2 Turbo usage spikes, or when web search costs exceed thresholds. Kimi K2 API is not free but very affordable. Kimi API is fully compatible with OpenAI SDKs. Kimi K2 uses a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters but only 32 billion activated per request. Yes, Kimi API has automatic context caching.\",\n        \"position\": 1\n      },\n      {\n        \"title\": \"Kimi K2 0711 API Pricing 2026 - Costs, Performance & Providers\",\n        \"url\": \"https://pricepertoken.com/pricing-page/model/moonshotai-kimi-k2\",\n        \"description\": \"[![Image 2: Moonshot AI](https://pricepertoken.com/logos/moonshot-color.svg)](https://pricepertoken.com/pricing-page/provider/moonshotai \\\"View all Moonshotai models\\\"). 
| ![Image 3: Moonshot AI](https://pricepertoken.com/logos/moonshot-color.svg) [Moonshotai](https://pricepertoken.com/pricing-page/provider/moonshotai) | [Kimi K2 0711](https://pricepertoken.com/pricing-page/model/moonshotai-kimi-k2) | $0.550 | $2.200 | 55.6 | 82.4 | 76.6 | 131,000 | [Try](https://pricepertoken.com/playground?model=moonshotai-kimi-k2) Vote |. | ![Image 5: Mistral AI](https://pricepertoken.com/logos/Mistral-Ai--Streamline-Svg-Logos.svg) [Mistral AI](https://pricepertoken.com/pricing-page/provider/mistral-ai) | [Mistral Large 3 2512](https://pricepertoken.com/pricing-page/model/mistral-ai-mistral-large-2512) | $0.500 | $1.500 | — | — | — | 262,144 | [Try](https://pricepertoken.com/playground?model=mistral-ai-mistral-large-2512) Vote |. | ![Image 6: OpenAI](https://pricepertoken.com/logos/openai-svgrepo-com.svg) [OpenAI](https://pricepertoken.com/pricing-page/provider/openai) | [GPT-3.5 Turbo](https://pricepertoken.com/pricing-page/model/openai-gpt-3.5-turbo) | $0.500 | $1.500 | — | 46.2 | 29.7 | 16,385 | [Try](https://pricepertoken.com/playground?model=openai-gpt-3.5-turbo) Vote |. | ![Image 7: Ai21](https://pricepertoken.com/logos/ai21.svg) [Ai21](https://pricepertoken.com/pricing-page/provider/ai21) | [Jamba Instruct](https://pricepertoken.com/pricing-page/model/ai21-jamba-instruct) | $0.500 | $0.700 | — | — | — | 256,000 | [Try](https://pricepertoken.com/playground?model=ai21-jamba-instruct) Vote |. | ![Image 9: OpenAI](https://pricepertoken.com/logos/openai-svgrepo-com.svg) [OpenAI](https://pricepertoken.com/pricing-page/provider/openai) | [Babbage](https://pricepertoken.com/pricing-page/model/openai-babbage) | $0.500 | $0.500 | — | — | — | 2,048 | [Try](https://pricepertoken.com/playground?model=openai-babbage) Vote |. 
| ![Image 11: Qwen](https://pricepertoken.com/logos/qwen-color.svg) [Qwen](https://pricepertoken.com/pricing-page/provider/qwen) | [Qwen VL Max](https://pricepertoken.com/pricing-page/model/qwen-qwen-vl-max) | $0.520 | $2.080 | — | — | — | 131,072 | [Try](https://pricepertoken.com/playground?model=qwen-qwen-vl-max) Vote |. [Open Cost Calculator](https://pricepertoken.com/pricing-calculator/model/moonshotai-kimi-k2). [Open Playground](https://pricepertoken.com/playground?model=moonshotai-kimi-k2). [View all Moonshotai models →](https://pricepertoken.com/pricing-page/provider/moonshotai). *   [LLM Prices](https://pricepertoken.com/).\",\n        \"position\": 2\n      },\n      {\n        \"title\": \"New Kimi K2 Models & Updated Pricing\",\n        \"url\": \"https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter\",\n        \"description\": \"# New Kimi K2 Models & Updated Pricing. # New Kimi K2 Models & Updated Pricing. [platform](https://platform.moonshot.ai/blog/tags/platform). [Back](https://platform.moonshot.ai/blog). > Two big updates on the Kimi API: we're shipping new models and updating our pricing. ## New Model: `kimi-k2-thinking`[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#new-model-kimi-k2-thinking). Moonshot AI introduces **kimi-k2-thinking**&**kimi-k2-thinking-turbo** for complex reasoning, multi-step instructions, and agent-like tasks. ![Image 1: Kimi K2 Benchmark Chart](https://platform.moonshot.ai/blog/_next/image?url=%2Fblog%2F_next%2Fstatic%2Fmedia%2Fbenchmark_chart.2d6ede91.png&w=3840&q=75). ### Key Features:[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#key-features). ## Performance Benchmarks[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#performance-benchmarks). ![Image 2: Comparison Chart](https://platform.moonshot.ai/blog/_next/image?url=%2Fblog%2F_next%2Fstatic%2Fmedia%2Fcomparison_chart.88cc4947.jpg&w=3840&q=75). 
The new Kimi K2 models demonstrate superior performance across various benchmarks, positioning them as leading solutions for complex AI tasks. ## Updated Pricing[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#updated-pricing). ![Image 3: Pricing Chart](https://platform.moonshot.ai/blog/_next/image?url=%2Fblog%2F_next%2Fstatic%2Fmedia%2Fpricing_chart.2c14d2d2.png&w=3840&q=75). ### New Pricing Structure:[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#new-pricing-structure). ## Getting Started[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#getting-started). *   **Quick Start Guide**: [platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model (opens in a new tab)](https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model). *   **Technical Blog**: [moonshotai.github.io/Kimi-K2/thinking.html (opens in a new tab)](https://moonshotai.github.io/Kimi-K2/thinking.html). ## Community & Support[](https://platform.moonshot.ai/blog/posts/Kimi_API_Newsletter#community--support). *   **Twitter**: [@Kimi_Moonshot (opens in a new tab)](https://x.com/Kimi_Moonshot). _This newsletter announces the launch of Kimi K2 thinking models with enhanced reasoning capabilities and significantly reduced pricing, making advanced AI more accessible to developers and businesses._. 2026 © Moonshot AI[Console](https://platform.moonshot.ai/console)[Docs](https://platform.moonshot.ai/docs).\",\n        \"position\": 3\n      },\n      {\n        \"title\": \"Moonshot AI Models – Pricing & Specs | Requesty\",\n        \"url\": \"https://www.requesty.ai/models/moonshot\",\n        \"description\": \"kimi-k2.5. 262K, 262K, $0.60, $3.00. ⚡. kimi-k2-thinking. 131K, —, $0.60, $2.50 ... 2026 Requesty Ltd. All rights reserved. 
All systems operational.\",\n        \"position\": 4\n      },\n      {\n        \"title\": \"Kimi AI Pricing 2026: Plans, Membership Cost & API Token Rates\",\n        \"url\": \"https://kimik2ai.com/pricing/\",\n        \"description\": \"## Kimi AI Pricing (2026): Plans, Tokens, API Costs, and How to Pick the Best Value. Choose a membership for powerful features in the Kimi app, or use token based API billing when you’re building products and automations. | ****Kimi AI** (Moderato)** | **$19/mo** | Design→code + “agentic” tasks | Includes monthly quotas for Deep Research + “OK Computer” + weekly Kimi Code requests; **API fees not included** |. A third-party provider listing for Kimi K2.5 on OpenRouter shows token pricing and context length (useful as a market reference for developers): **$0.50/M input tokens and $2.80/M output tokens** (as displayed on that page at the time of writing). ## Kimi API pricing: how token billing works. Kimi often offers free access in some form depending on region and product surface, but premium features and higher quotas are gated behind membership, and API is typically paid usage. * ****Kimi API**** is **token-based** and varies by provider; market listings like OpenRouter and ArtificialAnalysis provide useful planning snapshots.\",\n        \"position\": 5\n      }\n    ]\n  }\n}",
      "tool_call_id": "toolu_016gW3fE5q6f1AmKbiabHwey"
    },
    {
      "role": "tool",
      "content": "{\n  \"success\": true,\n  \"data\": {\n    \"web\": [\n      {\n        \"title\": \"Models: 'free' - OpenRouter\",\n        \"url\": \"https://openrouter.ai/models/?fmt=cards&q=free&supported_parameters=tools&order=most-popular\",\n        \"description\": \"It is a reasoning model that is incredibly speed efficient even at long contexts.by stepfunJan 29, 2026 256K context$0/M input tokens$0/M output tokens. Trinity-Large-Preview reflects Arcee’s efficiency-first design philosophy, offering a production-oriented frontier model with open weights and permissive licensing suitable for real-world applications and experimentation.by arcee-aiJan 27, 2026 131K context$0/M input tokens$0/M output tokens. Engineered for efficient reasoning over long contexts (131k) with robust function calling and multi-step agent workflows.by arcee-aiDec 1, 2025 131K context$0/M input tokens$0/M output tokens. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.by nvidiaSep 5, 2025 128K context$0/M input tokens$0/M output tokens. *   OpenAI: gpt-oss-120b (free)gpt-oss-120b (free) 1.13B tokens   gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. 
The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.by openaiAug 5, 2025 131K context$0/M input tokens$0/M output tokens.\",\n        \"position\": 1\n      },\n      {\n        \"title\": \"Models: 'free' | OpenRouter\",\n        \"url\": \"https://openrouter.ai/models/?q=free\",\n        \"description\": \"It is a reasoning model that is incredibly speed efficient even at long contexts.](https://openrouter.ai/stepfun/step-3.5-flash:free)by [stepfun](https://openrouter.ai/stepfun)Jan 29, 2026 256K context$0/M input tokens$0/M output tokens. The model is fully open with open-weights, datasets and recipes so developers can easily customize, optimize, and deploy the model on their infrastructure for maximum privacy and security.](https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free)by [nvidia](https://openrouter.ai/nvidia)Dec 14, 2025 256K context$0/M input tokens$0/M output tokens. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.](https://openrouter.ai/nvidia/nemotron-nano-9b-v2:free)by [nvidia](https://openrouter.ai/nvidia)Sep 5, 2025 128K context$0/M input tokens$0/M output tokens. *   [OpenAI: gpt-oss-120b (free)gpt-oss-120b (free)](https://openrouter.ai/openai/gpt-oss-120b:free) 1.2B tokens   [gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. 
The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.](https://openrouter.ai/openai/gpt-oss-120b:free)by [openai](https://openrouter.ai/openai)Aug 5, 2025 131K context$0/M input tokens$0/M output tokens.\",\n        \"position\": 2\n      },\n      {\n        \"title\": \"OpenRouter Free Models: All 25 Listed (Mar 2026) - CostGoat\",\n        \"url\": \"https://costgoat.com/pricing/openrouter-free-models\",\n        \"description\": \"| [NVIDIA: Nemotron 3 Super (free)](https://openrouter.ai/nvidia/nemotron-3-super-120b-a12b:free) nvidia/nemotron-3-super-120b-a12b:free | NVIDIA | 262K | Tools |. | [Qwen: Qwen3 Next 80B A3B Instruct (free)](https://openrouter.ai/qwen/qwen3-next-80b-a3b-instruct:free) qwen/qwen3-next-80b-a3b-instruct:free | Qwen | 262K | Tools |. | [Qwen: Qwen3 Coder 480B A35B (free)](https://openrouter.ai/qwen/qwen3-coder:free) qwen/qwen3-coder:free | Qwen | 262K | Tools |. | [NVIDIA: Nemotron 3 Nano 30B A3B (free)](https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free) nvidia/nemotron-3-nano-30b-a3b:free | NVIDIA | 256K | Tools |. | [Free Models Router](https://openrouter.ai/openrouter/free) openrouter/free | OpenRouter | 200K | Vision Tools |. | [OpenAI: gpt-oss-120b (free)](https://openrouter.ai/openai/gpt-oss-120b:free) openai/gpt-oss-120b:free | OpenAI | 131K | Tools |. | [OpenAI: gpt-oss-20b (free)](https://openrouter.ai/openai/gpt-oss-20b:free) openai/gpt-oss-20b:free | OpenAI | 131K | Tools |. | [NVIDIA: Nemotron Nano 12B 2 VL (free)](https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free) nvidia/nemotron-nano-12b-v2-vl:free | NVIDIA | 128K | Vision Tools |. | [NVIDIA: Nemotron Nano 9B V2 (free)](https://openrouter.ai/nvidia/nemotron-nano-9b-v2:free) nvidia/nemotron-nano-9b-v2:free | NVIDIA | 128K | Tools |. 
| [Meta: Llama 3.3 70B Instruct (free)](https://openrouter.ai/meta-llama/llama-3.3-70b-instruct:free) meta-llama/llama-3.3-70b-instruct:free | Meta | 66K | Tools |. [Gemini API Pricing](https://costgoat.com/pricing/gemini-api)[Claude API Pricing](https://costgoat.com/pricing/claude-api)[Google Veo Pricing](https://costgoat.com/pricing/google-veo)[AI Cost Calculators](https://costgoat.com/ai-cost-calculators)[Replicate API Pricing](https://costgoat.com/)[OpenRouter API Pricing](https://costgoat.com/pricing/openrouter)[OpenRouter Free Models](https://costgoat.com/pricing/openrouter-free-models).\",\n        \"position\": 3\n      },\n      {\n        \"title\": \"Models: ')' | OpenRouter\",\n        \"url\": \"https://openrouter.ai/)\",\n        \"description\": \"It is a reasoning model that is incredibly speed efficient even at long contexts.by stepfunJan 29, 2026 256K context$0/M input tokens$0/M output tokens. *   NVIDIA: Nemotron 3 Super (free)Nemotron 3 Super (free) 593B tokens    Academia (#31) Finance (#22) SEO (#45) Programming (#6) Roleplay (#28)   NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Trinity-Large-Preview reflects Arcee’s efficiency-first design philosophy, offering a production-oriented frontier model with open weights and permissive licensing suitable for real-world applications and experimentation.by arcee-aiJan 27, 2026 131K context$0/M input tokens$0/M output tokens. The model is fully open with open-weights, datasets and recipes so developers can easily customize, optimize, and deploy the model on their infrastructure for maximum privacy and security.by nvidiaDec 14, 2025 256K context$0/M input tokens$0/M output tokens. 
If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.by nvidiaSep 5, 2025 128K context$0/M input tokens$0/M output tokens.\",\n        \"position\": 4\n      },\n      {\n        \"title\": \"Models - OpenRouter\",\n        \"url\": \"https://openrouter.ai/OpenRouter\",\n        \"description\": \"**Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/healer-alpha)by [openrouter](https://openrouter.ai/openrouter)Mar 11, 2026 262K context. Note: All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/aurora-alpha)by [openrouter](https://openrouter.ai/openrouter)Feb 9, 2026 128K context. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/pony-alpha)by [openrouter](https://openrouter.ai/openrouter)Feb 6, 2026 200K context. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/sherlock-dash-alpha)by [openrouter](https://openrouter.ai/openrouter)Nov 15, 2025 1.84M context. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/sherlock-think-alpha)by [openrouter](https://openrouter.ai/openrouter)Nov 15, 2025 1.84M context. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/polaris-alpha)by [openrouter](https://openrouter.ai/openrouter)Nov 6, 2025 256K context.\",\n        \"position\": 5\n      }\n    ]\n  }\n}",
      "tool_call_id": "toolu_01Art2mfkvCP1atZw4JTNaub"
    },
    {
      "role": "assistant",
      "content": "",
      "reasoning": null,
      "finish_reason": "tool_calls",
      "tool_calls": [
        {
          "id": "toolu_0182v9TdhVJArRLFTHS5nR6J",
          "call_id": "toolu_0182v9TdhVJArRLFTHS5nR6J",
          "response_item_id": "fc_toolu_0182v9TdhVJArRLFTHS5nR6J",
          "type": "function",
          "function": {
            "name": "web_extract",
            "arguments": "{\"urls\": [\"https://api-docs.deepseek.com/quick_start/pricing\", \"https://costgoat.com/pricing/kimi-api\", \"https://costgoat.com/pricing/openrouter-free-models\"]}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "{\n  \"results\": [\n    {\n      \"url\": \"https://api-docs.deepseek.com/quick_start/pricing\",\n      \"title\": \"Models & Pricing - DeepSeek API Docs\",\n      \"content\": \"![DeepSeek API Docs Logo](https://cdn.deepseek.com/platform/favicon.png)\\n![DeepSeek API Docs Logo](https://cdn.deepseek.com/platform/favicon.png)\\n\\n# Models & Pricing\\n\\nThe prices listed below are in units of per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. We will bill based on the total number of input and output tokens by the model.\\n\\n## Model Details[​](#model-details \\\"Direct link to Model Details\\\")\\n\\n**NOTE: The `deepseek-chat` and `deepseek-reasoner` correspond to the model version DeepSeek-V3.2 (128K context limit), which differs from the APP/WEB version.**\\n\\n`deepseek-chat`\\n`deepseek-reasoner`\\n\\n|  |  |  |  |\\n| --- | --- | --- | --- |\\n| MODEL | | deepseek-chat | deepseek-reasoner |\\n| BASE URL | | <https://api.deepseek.com> | |\\n| MODEL VERSION | | DeepSeek-V3.2  (Non-thinking Mode) | DeepSeek-V3.2  (Thinking Mode) |\\n| CONTEXT LENGTH | | 128K | |\\n| MAX OUTPUT | | DEFAULT: 4K MAXIMUM: 8K | DEFAULT: 32K MAXIMUM: 64K |\\n| FEATURES | [Json Output](/guides/json_mode) | ✓ | ✓ |\\n| [Tool Calls](/guides/tool_calls) | ✓ | ✓ |\\n| [Chat Prefix Completion（Beta）](/guides/chat_prefix_completion) | ✓ | ✓ |\\n| [FIM Completion（Beta）](/guides/fim_completion) | ✓ | ✗ |\\n| PRICING | 1M INPUT TOKENS (CACHE HIT) | $0.028 | |\\n| 1M INPUT TOKENS (CACHE MISS) | $0.28 | |\\n| 1M OUTPUT TOKENS | $0.42 | |\\n\\n## Deduction Rules[​](#deduction-rules \\\"Direct link to Deduction Rules\\\")\\n\\nThe expense = number of tokens × price.\\nThe corresponding fees will be directly deducted from your topped-up balance or granted balance, with a preference for using the granted balance first when both balances are available.\\n\\nProduct prices may vary and DeepSeek reserves 
the right to adjust them. We recommend topping up based on your actual usage and regularly checking this page for the most recent pricing information.\\n\\n![WeChat QRcode](https://cdn.deepseek.com/official_account.jpg)\",\n      \"error\": null\n    },\n    {\n      \"url\": \"https://costgoat.com/pricing/kimi-api\",\n      \"title\": \"Kimi API Pricing Calculator & Cost Guide (Apr 2026) - CostGoat\",\n      \"content\": \"# Kimi API Pricing & Cost Guide (April 2026)\\n\\nKimi API, powered by Moonshot AI, provides OpenAI-compatible access to high-performance models including the flagship K2.5. It features a 1-trillion parameter Mixture-of-Experts (MoE) architecture and built-in context caching.\\n\\n## 💰 Pricing TL;DR\\n*   **K2.5 (Multimodal):** $0.60 (Input) / $3.00 (Output) per 1M tokens.\\n*   **K2 / K2 Thinking:** $0.60 (Input) / $2.50 (Output) per 1M tokens.\\n*   **K2 Turbo:** $1.15 (Input) / $8.00 (Output) per 1M tokens.\\n*   **Context Caching:** $0.10–$0.15 per 1M tokens (**75–83% savings**).\\n*   **Web Search:** $0.005 per call.\\n*   **Bonus:** $5 voucher awarded upon reaching $5 cumulative recharge.\\n\\n---\\n\\n## 📊 Detailed Model Pricing (Per 1M Tokens)\\n\\n| Model | Context | Quality Score | Input Cost | Output Cost |\\n| :--- | :--- | :--- | :--- | :--- |\\n| **Kimi K2.5** | 262K | - | $0.60 | $3.00 |\\n| **Kimi K2 Thinking** | 262K | 80 | $0.60 | $2.50 |\\n| **Kimi K2 Thinking Turbo**| 262K | 80 | $1.15 | $8.00 |\\n| **Kimi K2 0905** | 262K | 61 | $0.60 | $2.50 |\\n| **Kimi K2 Turbo** | 262K | 51 | $1.15 | $8.00 |\\n| **Moonshot V1 (8K)** | 8K | 51 | $0.20 | $2.00 |\\n| **Moonshot V1 (32K)** | 32K | 51 | $1.00 | $3.00 |\\n| **Moonshot V1 (128K)** | 131K | 51 | $2.00 | $5.00 |\\n\\n---\\n\\n## 🚀 Key Features & Technical Specs\\n*   **OpenAI Compatibility:** Drop-in replacement using the `api.moonshot.ai/v1` endpoint. Supports tool calls, JSON mode, and streaming.\\n*   **Automatic Context Caching:** No configuration required. 
System automatically caches repeated context (e.g., long system prompts), reducing input costs by 75%.\\n*   **K2 Turbo High-Speed Mode:** Optimized for latency; outputs 60–100 tokens/second.\\n*   **Web Search Integration:** Enable via `$web_search` tool. Charges apply per call plus the tokens returned in the search results.\\n\\n---\\n\\n## 📈 Rate Limits & Tiers\\nLimits scale based on your **cumulative recharge** amount:\\n\\n| Tier | Recharge | Concurrency | RPM | TPM |\\n| :--- | :--- | :--- | :--- | :--- |\\n| **Tier 0** | $1 | 1 | 3 | 500K |\\n| **Tier 1** | $10 | 50 | 200 | 2M |\\n| **Tier 2** | $20 | 100 | 500 | 3M |\\n| **Tier 3** | $100 | 200 | 1,000 | 3M |\\n| **Tier 5** | $3,000 | 1,000 | 10,000 | - |\\n\\n*Note: RPM = Requests Per Minute; TPM = Tokens Per Minute. Tier 0 is capped at 1.5M tokens/day; Tier 1+ has no daily cap.*\\n\\n---\\n\\n## 💡 Cost Optimization Tips\\n1.  **Maximize Cache Hits:** Keep system messages and initial context consistent to trigger the $0.15/M cached rate.\\n2.  **Strategic Model Selection:** Use **K2 0905** for general tasks and **K2 Thinking** only for complex reasoning to save on output costs.\\n3.  **Monitor Context Length:** `kimi-latest` auto-selects pricing based on context size. Staying under 8K tokens is significantly cheaper ($0.20 vs $2.00).\\n4.  
**Selective Web Search:** Only trigger `$web_search` when real-time data is essential, as search results inflate input token counts.\\n\\n---\\n\\n## ❓ FAQ\\n*   **Is there a free trial?** No, but a $1 minimum recharge activates the account, and a $5 bonus is given after $5 in total recharges.\\n*   **Why is it so cheap?** The MoE architecture only activates 32B parameters per request (out of 1T), reducing compute costs while maintaining high quality.\\n*   **How do I start?** Sign up at `platform.moonshot.ai`, recharge $1, and generate an API key.\",\n      \"error\": null\n    },\n    {\n      \"url\": \"https://costgoat.com/pricing/openrouter-free-models\",\n      \"title\": \"OpenRouter Free Models: All 25 Listed (Mar 2026) - CostGoat\",\n      \"content\": \"# OpenRouter Free Models: Comprehensive Guide (April 2026)\\n\\nOpenRouter provides a selection of AI models at zero cost, requiring no credit card or upfront payment. These models are subsidized by OpenRouter and are accessible via an OpenAI-compatible API.\\n\\n## 📊 Complete List of Free Models (28)\\n\\n| Model ID | Provider | Context | Key Capabilities |\\n| :--- | :--- | :--- | :--- |\\n| `google/lyria-3-pro-preview` | Google | 1.0M | Vision |\\n| `google/lyria-3-clip-preview` | Google | 1.0M | Vision |\\n| `qwen/qwen3.6-plus-preview:free` | Qwen | 1.0M | Tools |\\n| `nvidia/nemotron-3-super-120b-a12b:free` | NVIDIA | 262K | Tools |\\n| `qwen/qwen3-next-80b-a3b-instruct:free` | Qwen | 262K | Tools |\\n| `qwen/qwen3-coder:free` | Qwen | 262K | Tools |\\n| `stepfun/step-3.5-flash:free` | StepFun | 256K | Tools |\\n| `nvidia/nemotron-3-nano-30b-a3b:free` | NVIDIA | 256K | Tools |\\n| `openrouter/free` | OpenRouter | 200K | Vision, Tools |\\n| `minimax/minimax-m2.5:free` | Minimax | 197K | Tools |\\n| `arcee-ai/trinity-mini:free` | Arcee AI | 131K | Tools |\\n| `openai/gpt-oss-120b:free` | OpenAI | 131K | Tools |\\n| `openai/gpt-oss-20b:free` | OpenAI | 131K | Tools |\\n| `z-ai/glm-4.5-air:free` | Z.ai | 
131K | Tools |\\n| `google/gemma-3-27b-it:free` | Google | 131K | Vision |\\n| `meta-llama/llama-3.2-3b-instruct:free` | Meta | 131K | - |\\n| `nousresearch/hermes-3-llama-3.1-405b:free` | Nous | 131K | - |\\n| `arcee-ai/trinity-large-preview:free` | Arcee AI | 131K | Tools |\\n| `nvidia/nemotron-nano-12b-v2-vl:free` | NVIDIA | 128K | Vision, Tools |\\n| `nvidia/nemotron-nano-9b-v2:free` | NVIDIA | 128K | Tools |\\n| `meta-llama/llama-3.3-70b-instruct:free` | Meta | 66K | Tools |\\n| `liquid/lfm-2.5-1.2b-thinking:free` | LiquidAI | 33K | Reasoning |\\n| `liquid/lfm-2.5-1.2b-instruct:free` | LiquidAI | 33K | - |\\n| `cognitivecomputations/dolphin-mistral-24b-venice-edition:free` | Venice | 33K | Uncensored |\\n| `google/gemma-3-4b-it:free` | Google | 33K | Vision |\\n| `google/gemma-3-12b-it:free` | Google | 33K | Vision |\\n| `google/gemma-3n-e2b-it:free` | Google | 8K | - |\\n| `google/gemma-3n-e4b-it:free` | Google | 8K | - |\\n\\n## ⚡ Rate Limits & Usage\\nFree models are subject to specific usage caps. Exceeding these results in a `429` error.\\n*   **Requests per Minute:** 20\\n*   **Requests per Day:** 200\\n*   **Cost:** $0 (No credit card required)\\n\\n> **Note:** Limits apply per-model. 
If you hit a limit on one model, you may still be able to use another free model until its specific limit is reached.\\n\\n## 🏆 Best Models by Use Case\\n\\n*   **Coding:** `Qwen3 Coder 480B` (Top choice), `DeepSeek R1`, `Llama 3.3 70B`.\\n*   **Reasoning:** `DeepSeek R1`, `Qwen3 235B Thinking`, `Qwen3 Next 80B`.\\n*   **Vision & Multimodal:** `Qwen3 VL 235B Thinking`, `NVIDIA Nemotron Nano 12B VL`.\\n*   **General Purpose:** `Llama 3.3 70B`, `Mistral Small 3.1 24B`, `Gemma 3 27B`.\\n\\n## 🛠️ Implementation Guide\\nTo use these models, point your OpenAI SDK to the OpenRouter endpoint:\\n\\n*   **Base URL:** `https://openrouter.ai/api/v1`\\n*   **Model ID:** Must end in `:free` (e.g., `qwen/qwen3-coder:free`)\\n\\n### Production Warning\\n> \\\"You can [use free models in production], but it's not recommended... Free models may be removed or have limits adjusted without notice. For production use, add credits... and use paid model variants for guaranteed availability.\\\"\\n\\n## ❓ FAQ Highlights\\n*   **Are they really free?** Yes, there are no hidden fees or API key charges.\\n*   **Do I need an account?** Yes, an OpenRouter account and API key are required.\\n*   **How do I handle rate limits?** Limits reset daily. For higher throughput, you must switch to paid variants by adding credits to your account.\",\n      \"error\": null\n    }\n  ]\n}",
      "tool_call_id": "toolu_0182v9TdhVJArRLFTHS5nR6J"
    },
    {
      "role": "assistant",
      "content": "",
      "reasoning": null,
      "finish_reason": "tool_calls",
      "tool_calls": [
        {
          "id": "toolu_01CAXiy3EVyVWHPHZ4N7ykTV",
          "call_id": "toolu_01CAXiy3EVyVWHPHZ4N7ykTV",
          "response_item_id": "fc_toolu_01CAXiy3EVyVWHPHZ4N7ykTV",
          "type": "function",
          "function": {
            "name": "web_search",
            "arguments": "{\"query\": \"DeepSeek API OpenAI compatible endpoint free credits 2025 2026\"}"
          }
        },
        {
          "id": "toolu_01Ag2zd76aiuCMe6XUmxpXoP",
          "call_id": "toolu_01Ag2zd76aiuCMe6XUmxpXoP",
          "response_item_id": "fc_toolu_01Ag2zd76aiuCMe6XUmxpXoP",
          "type": "function",
          "function": {
            "name": "web_search",
            "arguments": "{\"query\": \"OpenRouter free frontier models temporary promotion new model launch\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "{\n  \"success\": true,\n  \"data\": {\n    \"web\": [\n      {\n        \"title\": \"DeepSeek Free Plans, Trials, and Subscriptions: Token Grants ...\",\n        \"url\": \"https://www.datastudios.org/post/deepseek-free-plans-trials-and-subscriptions-token-grants-usage-caps-and-api-pricing-strategies\",\n        \"description\": \"# DeepSeek Free Plans, Trials, and Subscriptions: Token Grants, Usage Caps, and API Pricing Strategies for Late 2025/2026. This article details every available DeepSeek plan—free chat, multimillion-token API trials, pay-as-you-go billing, volume discounts, and best practices for maximizing no-cost and low-cost AI in late 2025/2026. Anyone can use DeepSeek Chat on web or mobile to access the latest public model (currently DeepSeek V3.2), sending unlimited messages and uploading files within daily reset quotas. ## **Every new DeepSeek API account receives a 5-million-token grant and a time-boxed trial window without requiring a credit card.**. When developers register for the API, DeepSeek credits each new account with **5 million free tokens** (roughly $8.40 value) valid for **30 days**. With its combination of unlimited free chat, multi-million-token API trials, discounted pay-as-you-go rates, and flexible enterprise features, DeepSeek is now an attractive platform for everyone from students to startups to global enterprises.\",\n        \"position\": 1\n      },\n      {\n        \"title\": \"DeepSeek Free Tier 2026 — Free Models, Credits & Limits\",\n        \"url\": \"https://pricepertoken.com/endpoints/deepseek/free\",\n        \"description\": \"# DeepSeek Free Tier. DeepSeek gives 5 million free tokens to new users on sign-up. Get the free PDF guide — practical tips to cut your token usage and API costs. ## Free Tier Details. Free Credits5 million tokens for new accounts. Models AvailableDeepSeek V3, R1, and all DeepSeek models. DeepSeek also offers free unlimited access through chat.deepseek.com. 
After free credits, pricing is very competitive: $0.28/M input (cache miss), $0.42/M output for DeepSeek V3.2. Yes, DeepSeek offers free access. DeepSeek gives 5 million free tokens to new users on sign-up. No. DeepSeek also offers free unlimited access through chat.deepseek.com. After free credits, pricing is very competitive: $0.28/M input (cache miss), $0.42/M output for DeepSeek V3.2. DeepSeek provides 5M tokens in free credits for new accounts. DeepSeek also offers free unlimited access through chat.deepseek.com. After free credits, pricing is very competitive: $0.28/M input (cache miss), $0.42/M output for DeepSeek V3.2. * Best LLM for GitHub Copilot. * Best LLM for Continue.dev.\",\n        \"position\": 2\n      },\n      {\n        \"title\": \"Complete Free Guide 2026 | DeepSeek R1 & V3 Tutorial - YUV.AI\",\n        \"url\": \"https://yuv.ai/learn/deepseek\",\n        \"description\": \"With models like DeepSeek V3 for general tasks, R1 for reasoning, and Coder for programming, it's become one of the most popular AI tools in 2026. * **Strong Coding** - DeepSeek Coder excels at programming tasks. from openai import OpenAI # DeepSeek uses OpenAI-compatible API - we can use our existing code! client = OpenAI( api_key=\\\"your-deepseek-api-key\\\", base_url=\\\"https://api.deepseek.com\\\" ) # Use DeepSeek V3 for general tasks response = client.chat.completions.create( model=\\\"deepseek-chat\\\", # V3 model messages=[ {\\\"role\\\": \\\"system\\\", \\\"content\\\": \\\"You are a helpful assistant.\\\"}, {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Explain quantum computing in simple terms\\\"} ] ) print(response.choices[0].message.content). ## DeepSeek for Coding: Complete Guide. DeepSeek Coder is one of the best free coding assistants available. We can paste our code and ask DeepSeek to review it. For sensitive data, we recommend running DeepSeek locally using Ollama or LM Studio for complete privacy. 
DeepSeek R1 is a reasoning-focused model that shows its thinking process step-by-step, similar to OpenAI's o1.\",\n        \"position\": 3\n      },\n      {\n        \"title\": \"Every Free AI API in 2026: The Complete Guide to Zero-Cost Inference\",\n        \"url\": \"https://awesomeagents.ai/tools/free-ai-inference-providers-2026/\",\n        \"description\": \"| **Fireworks AI** | Llama 3.1 405B, DeepSeek R1 | 10 RPM (free) | Limited without CC | No |. Mistral offers a free \\\"Experiment\\\" tier on their La Plateforme API that includes access to all Mistral models: Large, Small, Codestral, Pixtral 12B, and even their embedding and OCR models. The rate limits on the free tier are restrictive at 2 requests per minute, but you get 1 billion tokens per month, and the model quality is excellent. **Limits:** Rate-limited, cold starts on unpopular models, limited to models under ~10B parameters on free tier. SambaNova offers a truly persistent free tier, not just credits, with access to Llama 3.3 70B, Llama 3.1 (up to 405B), Qwen 2.5 72B, and other models on their custom RDU hardware. Together AI doesn't offer a free tier: they require a minimum $5 credit purchase to get started. **Best for coding tasks:** GitHub Models for quick testing, or Google AI Studio plus OpenCode for a full free coding assistant setup.\",\n        \"position\": 4\n      },\n      {\n        \"title\": \"[Free] Unlimited access to Deepseek AI models until 2026 + 55 ...\",\n        \"url\": \"https://www.reddit.com/r/juheapi/comments/1n1bjgu/free_unlimited_access_to_deepseek_ai_models_until/\",\n        \"description\": \"[Free] Unlimited access to Deepseek AI models until 2026 + 55 Million free tokens for other models (GPT-5, Claude, Gemini). Wisdom Gate is\",\n        \"position\": 5\n      }\n    ]\n  }\n}",
      "tool_call_id": "toolu_01CAXiy3EVyVWHPHZ4N7ykTV"
    },
    {
      "role": "tool",
      "content": "{\n  \"success\": true,\n  \"data\": {\n    \"web\": [\n      {\n        \"title\": \"Free Models Router - API Pricing & Providers - OpenRouter\",\n        \"url\": \"https://openrouter.ai/openrouter/free\",\n        \"description\": \"**Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/hunter-alpha). **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/healer-alpha). **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.](https://openrouter.ai/openrouter/polaris-alpha). Note: It’s free to use during this testing period, and prompts and completions are logged by the model creator for feedback and training.](https://openrouter.ai/openrouter/sonoma-dusk-alpha). Note: It’s free to use during this testing period, and prompts and completions are logged by the model creator for feedback and training.](https://openrouter.ai/openrouter/sonoma-sky-alpha). This is an improved version of Horizon Alpha Note: It’s free to use during this testing period, and prompts and completions are logged by the model creator for feedback and training.](https://openrouter.ai/openrouter/horizon-beta). Note: It’s free to use during this testing period, and prompts and completions are logged by the model creator for feedback and training.](https://openrouter.ai/openrouter/horizon-alpha).\",\n        \"position\": 1\n      },\n      {\n        \"title\": \"Qwen 3.6 Plus Preview Launches on OpenRouter with Free 1M ...\",\n        \"url\": \"https://gentic.news/article/qwen-3-6-plus-preview-launches-on\",\n        \"description\": \"Qwen 3.6 Plus Preview Launches on OpenRouter with Free 1M Token Context, Disrupting API Pricing. 
# Qwen 3.6 Plus Preview Launches on OpenRouter with Free 1M Token Context, Disrupting API Pricing. Alibaba's Qwen team has released a preview of Qwen 3.6 Plus on OpenRouter with a 1 million token context window, charging $0 for both input and output tokens. Qwen 3.6 Plus Preview Launches on OpenRouter with Free 1M Token Context, Disrupting API Pricing. The launch is notable for its aggressive pricing: the model offers a **1 million token context window** with **$0 cost for both input and output tokens** during the preview period. The Qwen 3.6 Plus preview is currently accessible via OpenRouter's API, allowing developers to experiment with long-context tasks—such as document analysis, codebase reasoning, and long-form content creation—at zero inference cost. Applications built assuming a permanently free or ultra-cheap long-context API may face sudden cost inflation if the preview ends abruptly or if pricing changes are severe. #model launch#commercial apis#open source ai#ai pricing.\",\n        \"position\": 2\n      },\n      {\n        \"title\": \"Free AI Models on OpenRouter\",\n        \"url\": \"https://openrouter.ai/collections/free-models\",\n        \"description\": \"[Skip to content](https://openrouter.ai/collections/free-models#skip). *   [Models](https://openrouter.ai/models). [Collections](https://openrouter.ai/collections)/Free Models. by[stepfun](https://openrouter.ai/stepfun)256K context$0/M input tokens$0/M output tokens. [![Image 3: Favicon for nvidia](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://nvidia.com/&size=256) ### NVIDIA: Nemotron 3 Super (free)](https://openrouter.ai/nvidia/nemotron-3-super-120b-a12b:free). by[nvidia](https://openrouter.ai/nvidia)262K context$0/M input tokens$0/M output tokens. 
[![Image 4: Favicon for arcee-ai](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://www.arcee.ai&size=256) ### Arcee AI: Trinity Large Preview (free)](https://openrouter.ai/arcee-ai/trinity-large-preview:free). [![Image 6: Favicon for nvidia](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://nvidia.com/&size=256) ### NVIDIA: Nemotron 3 Nano 30B A3B (free)](https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free). by[nvidia](https://openrouter.ai/nvidia)256K context$0/M input tokens$0/M output tokens. [![Image 8: Favicon for nvidia](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://nvidia.com/&size=256) ### NVIDIA: Nemotron Nano 12B 2 VL (free)](https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free). by[nvidia](https://openrouter.ai/nvidia)128K context$0/M input tokens$0/M output tokens. [![Image 9: Favicon for nvidia](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://nvidia.com/&size=256) ### NVIDIA: Nemotron Nano 9B V2 (free)](https://openrouter.ai/nvidia/nemotron-nano-9b-v2:free). by[nvidia](https://openrouter.ai/nvidia)128K context$0/M input tokens$0/M output tokens. [![Image 10: Favicon for minimax](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://minimaxi.com/&size=256) ### MiniMax: MiniMax M2.5 (free)](https://openrouter.ai/minimax/minimax-m2.5:free). by[minimax](https://openrouter.ai/minimax)197K context$0/M input tokens$0/M output tokens. by[qwen](https://openrouter.ai/qwen)262K context$0/M input tokens$0/M output tokens. by[qwen](https://openrouter.ai/qwen)262K context$0/M input tokens$0/M output tokens. by[meta-llama](https://openrouter.ai/meta-llama)66K context$0/M input tokens$0/M output tokens. 
[![Image 14: Favicon for liquid](https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://www.liquid.ai/&size=256) ### LiquidAI: LFM2.5-1.2B-Thinking (free)](https://openrouter.ai/liquid/lfm-2.5-1.2b-thinking:free). by[liquid](https://openrouter.ai/liquid)33K context$0/M input tokens$0/M output tokens. by[openai](https://openrouter.ai/openai)131K context$0/M input tokens$0/M output tokens. by[openai](https://openrouter.ai/openai)131K context$0/M input tokens$0/M output tokens.\",\n        \"position\": 3\n      },\n      {\n        \"title\": \"What are the best free openrouter models : r/SillyTavernAI - Reddit\",\n        \"url\": \"https://www.reddit.com/r/SillyTavernAI/comments/1rgdd52/what_are_the_best_free_openrouter_models/\",\n        \"description\": \"Openrouter can see and share your prompts publicly and train models on them. Account->settings->privacy and guardrails. These have to be on if\",\n        \"position\": 4\n      },\n      {\n        \"title\": \"Models - OpenRouter\",\n        \"url\": \"https://openrouter.ai/models\",\n        \"description\": \"[Text](https://openrouter.ai/models?fmt=cards&input_modalities=text). [Image](https://openrouter.ai/models?fmt=cards&input_modalities=image). [File](https://openrouter.ai/models?fmt=cards&input_modalities=file). [Audio](https://openrouter.ai/models?fmt=cards&input_modalities=audio). [Video](https://openrouter.ai/models?fmt=cards&input_modalities=video). [Text](https://openrouter.ai/models?fmt=cards&output_modalities=text). [Image](https://openrouter.ai/models?fmt=cards&output_modalities=image). [Audio](https://openrouter.ai/models?fmt=cards&output_modalities=audio). [Video](https://openrouter.ai/models?fmt=cards&output_modalities=video). [Embeddings](https://openrouter.ai/models?fmt=cards&output_modalities=embeddings). 
Note: The model collects prompt and completion data that can be used to improve the model.](https://openrouter.ai/qwen/qwen3.6-plus-preview:free)by [qwen](https://openrouter.ai/qwen)Mar 30, 2026 1M context$0/M input tokens$0/M output tokens. It builds on the agentic coding strengths of earlier versions, with a focus on large-scale production environments, multi-system coordination, and seamless integration across modern software stacks, while also supporting web aesthetics generation to produce production-grade landing pages and presentation decks.](https://openrouter.ai/kwaipilot/kat-coder-pro-v2)by [kwaipilot](https://openrouter.ai/kwaipilot)Mar 27, 2026 256K context$0.30/M input tokens$1.20/M output tokens. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use.](https://openrouter.ai/reka/reka-edge)by [rekaai](https://openrouter.ai/rekaai)Mar 20, 2026 16K context$0.10/M input tokens$0.10/M output tokens. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities, 256K context window.](https://openrouter.ai/xiaomi/mimo-v2-omni)by [xiaomi](https://openrouter.ai/xiaomi)Mar 18, 2026 262K context$0.40/M input tokens$2/M output tokens.\",\n        \"position\": 5\n      }\n    ]\n  }\n}",
      "tool_call_id": "toolu_01Ag2zd76aiuCMe6XUmxpXoP"
    },
    {
      "role": "assistant",
      "content": "Here is my detailed research summary:\n\n---\n\nDEEPSEEK API (as of April 2026)\n\n1) Model Lineup & Pricing:\n- Both API endpoints now run DeepSeek-V3.2 (128K context)\n- \"deepseek-chat\" = V3.2 Non-thinking Mode (max 8K output)\n- \"deepseek-reasoner\" = V3.2 Thinking Mode (max 64K output)\n- UNIFIED pricing for both:\n  -- Input (cache hit): $0.028/M tokens\n  -- Input (cache miss): $0.28/M tokens\n  -- Output: $0.42/M tokens\n- There's also a \"DeepSeek V3.2 Speciale\" variant at $0.40 input / $1.20 output\n\n2) OpenAI-Compatible: YES. Base URL is https://api.deepseek.com - works as a drop-in replacement with the OpenAI Python SDK. Supports tool calls, JSON mode, streaming, FIM completion.\n\n3) Best Models: DeepSeek-V3.2 is the current flagship powering both endpoints. DeepSeek-R1 (reasoning) is still available. DeepSeek Coder also exists for programming tasks.\n\n4) Free Tier: 5 million free tokens for new API accounts (valid 30 days, no credit card needed). Unlimited free chat at chat.deepseek.com.\n\n---\n\nMOONSHOT.AI / KIMI API (as of April 2026)\n\n1) Model Lineup & Pricing (per 1M tokens):\n- Kimi K2.5 (multimodal, 262K ctx): $0.60 in / $3.00 out\n- Kimi K2 Thinking (262K): $0.60 in / $2.50 out\n- Kimi K2 Thinking Turbo (262K): $1.15 in / $8.00 out\n- Kimi K2 0905 (262K): $0.60 in / $2.50 out\n- Kimi K2 Turbo (262K): $1.15 in / $8.00 out\n- Moonshot V1 legacy models (8K/32K/128K): $0.20-$2.00 in / $2.00-$5.00 out\n- Context caching available at $0.10-0.15/M (75-83% savings)\n- Web search: $0.005 per call\n\n2) OpenAI-Compatible: YES. Endpoint: api.moonshot.ai/v1 - fully compatible with OpenAI SDKs, supports tool calls, JSON mode, streaming.\n\n3) Best Models: Kimi K2.5 is the flagship (native multimodal with vision, thinking, agentic capabilities). K2 Thinking is best for complex reasoning. 
Architecture: MoE with 1 trillion total parameters, 32B activated per request.\n\n4) Free Tier: No free trial per se, but a $1 minimum recharge activates the account, and a $5 bonus voucher is given after $5 cumulative recharge.\n\n---\n\nOPENROUTER FREE MODELS (as of April 2026)\n\nCurrently ~28 free models are available at $0/M input and $0/M output tokens. Notable ones:\n\n- OpenAI gpt-oss-120b (free) - 117B MoE, 131K context\n- OpenAI gpt-oss-20b (free) - 131K context\n- Qwen 3.6 Plus Preview (free) - 1M context (launched March 30, 2026)\n- Qwen3 Coder 480B A35B (free) - 262K context\n- Qwen3 Next 80B A3B (free) - 262K context\n- NVIDIA Nemotron 3 Super 120B (free) - 262K context\n- StepFun Step 3.5 Flash (free) - 256K context\n- Meta Llama 3.3 70B (free) - 66K context\n- Google Gemma 3 27B (free) - 131K context, vision\n- MiniMax M2.5 (free) - 197K context\n- Multiple Google Gemma and NVIDIA Nemotron variants\n\nFree model promotion pattern: YES, OpenRouter continues to offer free access to new/preview models temporarily. Recent example: Qwen 3.6 Plus Preview launched March 30, 2026 with free 1M context. The openrouter/ namespace also lists alpha/beta test models (hunter-alpha, healer-alpha, polaris-alpha, etc.) that are free during testing periods, with prompts and completions logged by the provider.\n\nRate limits for free models: 20 requests/minute, 200 requests/day per model. No credit card required.\n\nOpenRouter uses an OpenAI-compatible API at https://openrouter.ai/api/v1\n\n---\n\nKEY TAKEAWAYS FOR COST-CONSCIOUS USAGE:\n- DeepSeek is the cheapest of the paid APIs compared here ($0.28/$0.42 per M tokens, with cache hits at $0.028)\n- Kimi K2.5 is competitive for multimodal at $0.60/$3.00\n- Both DeepSeek and Kimi are fully OpenAI-compatible (drop-in SDK replacement)\n- OpenRouter offers about 28 free models including some very capable ones (Qwen3 Coder 480B, gpt-oss-120b, Nemotron 3 Super)\n- OpenRouter continues its pattern of offering frontier model previews for free temporarily",
      "reasoning": "I now have comprehensive information on all three services. Let me compile the findings.",
      "finish_reason": "stop",
      "reasoning_details": [
        {
          "signature": "EpMCClkIDBgCKkBX6t0MPVyoZc7FxqrM6hmMWGMXElVuWY7KY3jAns65rifyzydlV+PZYsGACb9HQm3Ud+KMWT8A3LQjtAt/6/2XMg9jbGF1ZGUtb3B1cy00LTY4ABIMed0VQTE8BgYCWY/8Ggy3DXxV2sZ1lQ9GFZUiMNMJQ7K1r9FJwn75lT9pRNYWWI2IZ1AaZTPmQpCTiRlFh3Zq90IE05PALkgZmTpNqyponsvLc9aNQxAOenO8QF0xjkKoK2TbQoRvheI2PzQuq7E4DHtI1XSjIMAqGtIgc+mvmwbFirYeq5iHu/pacHVbvbcdXW8XgbdZB1R4dFUcK051FEX3ZvwSZDuN0mGL4F13bLnQorI3unQYAQ==",
          "thinking": "I now have comprehensive information on all three services. Let me compile the findings.",
          "type": "thinking"
        }
      ]
    }
  ]
}