Audio-native language model API — OpenAI-compatible, built for Southeast Asian languages.
Make a real API request with your key. The generated curl command updates as you type.
Transcription endpoint — returns the transcript with optional timestamps & speaker diarization.
curl -X POST http://meralion.org:8010/audio/transcription \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"audio_url": "data:audio/wav;base64,<BASE64_AUDIO>",
"return_diarization": true
}'http://meralion.org:8010For chat / Q&A the API is OpenAI-compatible — point any OpenAI SDK's base_url to http://meralion.org:8010/v1. For speech-to-text, call /audio/transcription directly (see example) — it is the correct endpoint for transcription.
Speech-to-text uses the dedicated /audio/transcription endpoint — it auto-chunks long audio and, with return_diarization: true, returns word-level timestamps and speaker labels. Use /v1/chat/completions only for conversational Q&A about audio, not bulk transcription.
# Speech-to-text: use the dedicated /audio/transcription endpoint.
# It auto-chunks long audio and can return word-level timestamps and
# speaker diarization. (Do NOT use /v1/chat/completions for transcription.)
curl -X POST http://meralion.org:8010/audio/transcription \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"audio_url": "data:audio/wav;base64,<BASE64_AUDIO>",
"return_diarization": true
}'import base64, pathlib, requests
audio_b64 = base64.b64encode(pathlib.Path("speech.wav").read_bytes()).decode()
# Speech-to-text via the dedicated ASR endpoint — handles long audio and,
# with return_diarization, adds word-level timestamps + speaker labels.
resp = requests.post(
"http://meralion.org:8010/audio/transcription",
headers={"Authorization": "Bearer YOUR_API_KEY"},
json={
"audio_url": f"data:audio/wav;base64,{audio_b64}",
"return_diarization": True, # speaker labels + timestamps
},
)
message = resp.json()["choices"][0]["message"]
print(message["content"]) # full transcript
print(message.get("segments")) # [{speaker, start, end, text}, ...]import fs from "node:fs";
const base64 = fs.readFileSync("speech.wav").toString("base64");
// Speech-to-text via the dedicated ASR endpoint (timestamps + diarization).
const resp = await fetch("http://meralion.org:8010/audio/transcription", {
method: "POST",
headers: {
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json",
},
body: JSON.stringify({
audio_url: `data:audio/wav;base64,${base64}`,
return_diarization: true, // speaker labels + timestamps
}),
});
const { message } = (await resp.json()).choices[0];
console.log(message.content); // full transcript
console.log(message.segments); // speaker-labelled segmentsAuthorization: Bearer YOUR_API_KEYX-API-Key: YOUR_API_KEY?api_key=YOUR_API_KEY/audio/transcriptionSpeech-to-text (ASR) — timestamps & diarization/v1/chat/completionsMultimodal chat / Q&A (not for bulk transcription)/keys/registerRegister for an API key/v1/modelsList available models/keys/usageCheck your usage & limits/keys/tiersView available tiersFull OpenAPI spec: http://meralion.org:8010/docs ↗