NaN: the community's first month in numbers

On May 17 the platform completed its first month, and in this post we are going to look at how everything went in numbers.

Data window: 30 days (17/04/2026 to 17/05/2026)

1. The scale

In 30 days NaN served:

Metric	Value
Successful requests	3.678.787 (≈ 3,68 M)
Total tokens	117.674.297.968 (≈ 117,7 B)
Input tokens	116.222.578.716
Output tokens	1.451.719.252
Embedding tokens	697.810.776
Days with continuous traffic	31 / 31

More than 117 billion tokens generated. Roughly the equivalent of reading 235,000 complete copies of Don Quixote in a month.

2. The community

For now, signing up for the community happens through a waitlist that has not stopped growing over the last month. These are the numbers:

Status	Members
Joined the waitlist	1.027
Currently on the waitlist	151
Subscribed	305

Almost a quarter of those who joined the waitlist ended up entering the community.

Geographic distribution

NaN has been used from 21 countries. The top of the map looked like this:

Country	% requests
🇨🇴 Colombia	30,38 %
🇲🇽 Mexico	21,95 %
🇪🇸 Spain	15,10 %
🇺🇸 USA	13,05 %
🇫🇮 Finland	6,96 %
🇫🇷 France	4,07 %
🇩🇪 Germany	3,06 %
🇨🇦 Canada	1,35 %
🇵🇱 Poland	1,33 %
🇦🇷 Argentina	1,31 %
Rest (11 countries)	1,43 %

LATAM plus Spain add up to 67 % of the traffic. It is predominantly a Spanish-speaking platform for coding agents, with a real presence in Colombia, Mexico, Spain, Argentina, Ecuador, Peru, Uruguay, Chile, Puerto Rico, and El Salvador.

3. The real savings

If those same 115.5 B input tokens plus 1.45 B output (chat completions) had gone through closed providers, the monthly bill would be:

Provider (in/out price per 1M tokens)	Equivalent cost (30 days)
Claude Sonnet 4 ($3 / $15)	$368.374 USD
GPT-4o ($2,50 / $10)	$303.348 USD
Gemini 2.5 Pro ($1,25 / $10)	$158.935 USD
DeepSeek V3 ($0,27 / $1,10)	$32.791 USD
GPT-4o-mini ($0,15 / $0,60)	$18.201 USD

Depending on the model, had we used a private provider we would have spent between ~$18K and over $360K.

What each user saves

User type	Tokens/month	Costs in Claude Sonnet 4	Costs in GPT-4o	Pays in NaN
P50 (median)	112,6 M	$347,13	$287,36	70€ / $75
P90 (power user)	1,11 B	$3.509,97	$2.869,71	70€ / $75

35 members exceeded 1 billion tokens during the month.

The typical NaN user already consumes between $287 and $347 USD/month of value equivalent to GPT-4o or Claude Sonnet 4. The most active 10% sits between $2,800 and $3,500 USD/month of equivalent value. Everyone pays the same: 70€ or $75 depending on the region.

4. Performance

The section we are proudest of from the first month.

Metric	Value
Uptime (excluding client errors)	99,986 %
Global success rate	99,556 %
Our own 5xx errors	505 / 3.695.485 (0,014 %)
Client 4xx errors	13.378 (0,36 %)

Aggregate throughput

Metric	Value
Tokens/second (sustained avg)	~46.056
Tokens/second (peak)	285.270
Tokens/minute (peak)	17.116.195

Latency (chat completions, user view)

Metric	Value
TTFT (time to first token) P50	1.013 ms
TTFT P95	21.066 ms
Total request duration P50	2.660 ms
Total request duration P95	37.245 ms

Roughly 1 second from your request to the first token.

5. Available models

Every member has access to all the models in the stack:

Model	Function	Requests	Tokens
Qwen 3.6 (35B-A3B)	Main chat and coding	3.282.599	114,36 B
Gemma 4 (26B-A4B)	Fast chat, low latency	277.602	2,62 B
Qwen3 Embedding	Vector search, RAG	113.564	698 M
Whisper	Speech-to-text	3.993	N/A
Kokoro	Text-to-speech (af_heart, ef_dora, em_alex)	1.565	N/A

We offer a complete stack of models: LLMs, embeddings, transcription, and speech synthesis, all under the same membership. On top of that, this month we have started to explore the possibility of adding SOTA models.

The first one to arrive is DeepSeek V4 Flash. Next month there will be reports on this new tier of models that we have unlocked.

6. How NaN is used

Distribution by client / SDK:

Client	Requests	%
OpenAI Python SDK (sync + async)	1.666.766	45,32 %
opencode (coding agent in Bun)	742.336	20,18 %
OpenAI JS / Node / Bun	614.231	16,70 %
Python (httpx / requests raw)	378.851	10,30 %
Other	142.366	3,87 %
Go (SDK + raw)	86.142	2,34 %
Anthropic SDK (via proxy)	21.023	0,57 %
PHP (GuzzleHttp)	17.520	0,48 %
Cursor	5.378	0,15 %
Cline	3.513	0,10 %

Two takeaways:

The official OpenAI SDK works against NaN with no changes. You just need to point it at a base_url and a api_key. Most clients use this same communication standard. That explains 45% of the traffic.
opencode has established itself as the community's favorite coding agent: 20% of all traffic, with typically large prompts.

NaN is being used to do coding tasks in languages like Python, JS, Go, PHP, and Rust.

7. Usage patterns

Prompt size (chat completions, tokens)

Percentile	Tokens
P10	140
P50	4.443
P90	100.890
P99	202.467
Maximum	262.052

Half of the calls send more than 4,400 tokens of context. The largest 10% send more than 100,000. NaN is used for coding agents, with entire projects as context.

Day of the week

Day	Requests
Wednesday	644.104
Tuesday	620.146
Monday	561.809
Thursday	527.489
Sunday	466.320
Friday	434.390
Saturday	425.534

Weekdays are when NaN gets used the most, but usage does not drop below 66% on weekends either. So while the presence during working hours is higher, it never stops being used outside of them.

Day-by-day growth

Date	Requests	Tokens	Active users
17/04 (day 1)	28.843	1,06 B	25
30/04	60.634	4,47 B	81
08/05	218.620	4,09 B	127
15/05	107.014	4,58 B	170
16/05	257.787	6,70 B	177
17/05 (peak)	278.492	5,95 B	179

~10x in requests/day and ~7x in daily active users in the first 30 days.

8. Agents and Spaces

Two weeks ago we enabled the option to deploy a hermes agent for each user in their own private Sandbox (microVM). There are currently 128 active agents.
The latest feature released in NaN Cloud is that every community member now gets a private space with 2 vCPU, 4GB of RAM, and 20 GB of disk to deploy applications. There are currently 66 Spaces and 12 user applications deployed on the platform.

9. What is coming

DeepSeek V4 Flash is already available as on-demand SOTA for members who need it.
More inference capacity to sustain the pace of growth.
More open models as they appear, without changing the membership.
A project by and for the community. We will start driving Open Source projects to improve the community experience, especially around the current documentation, support, and the Discord bot.

10. A few recommendations

It is important to understand that Gemma and Qwen have a 256K context window. It is essential to set this limit correctly in the client you use (OpenCode, Pi, etc.) and likewise to define a margin to compact that context before reaching the limit. Example in Opencode.
Try not to drag out or reuse sessions unnecessarily. Do atomic tasks with a beginning and an end that should be born and die within a single session.
Find the right workflow. Something that has worked for several community users is using more powerful models to plan and validate code, and using Qwen or Gemma to execute all the tasks that you need. Now with DeepSeek we can use it as orchestrator/leader.
Using the clients (OpenCode, Pi, Hermes, etc.) exactly as they come by default does not work. The most important thing is your harness, because depending on it the model will get better or worse results.
Given the previous point, make the most of the different Discord channels. Explore and try out new skills, tools, CLIs, clients, and agents. The community is extremely active in answering doubts and questions and giving recommendations.
Remember that every month we will hold two sessions. Either an event or a workshop that you can watch recorded whenever you want on NaN .
Take advantage of Spaces to deploy applications or custom agents! (a short tutorial on how to set up a chatbot deployed on Spaces is coming soon)

NaN was born to bring together people who are building things and who can also take advantage of open inference models. That is how we set up our first server. Today it is already a cluster of 7 servers and 11 GPUs dedicated exclusively to serving models for NaN.

It has been a month of absolute madness and a lot of work to make all of this happen. For my part, all that is left is to thank you for your trust, and know that this is only the beginning. Onward! 🚀