Text-to-Speech

Overview

Text-to-Speech (TTS) allows you to convert text into speech using a Replica voice. You can generate speech one line at a time or submit multiple lines in a batch for more efficient processing.

Prerequisites

Before starting, you'll need:

  • A valid API key

  • A Speaker ID (see Voice Selection)

  • Text content to convert to speech

Single Text-to-Speech Generation

The basic TTS process involves a single API call to generate speech from text:

import requests

url = "https://api.replicastudios.com/v2/speech/tts"

# Request payload
payload = {
    "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a", # Replace with your chosen Speaker ID
    "text": "Welcome to Replica Studios.",
    "model_chain": "latest",
    "language_code": "en"
}

headers = {
    "Content-Type": "application/json",
    "X-Api-Key": "..." # Replace with your API key
}

# Send the request and fail early on an HTTP error status
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
result = response.json()
print("Generated Audio URL:", result["url"])

Additional Parameters

You can customize the speech generation by including additional parameters in your payload:

{
  "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
  "text": "Welcome to Replica Studios.",
  "model_chain": "latest",
  "language_code": "en",
  "global_pace": 1.2,
  "global_pitch": 0.5,
  "global_volume": 0.5,
  "voice_preset_id": "custom-preset-uuid"
}

For more details on these parameters, see Global Controls.
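As a rough sketch, the optional controls can be merged into the base payload with a small helper. The helper name `build_tts_payload` is hypothetical, the values are simply those from the example above, and valid ranges are not checked here (see Global Controls for those).

```python
def build_tts_payload(speaker_id, text, **controls):
    """Build a TTS request payload, adding only the optional controls that were given."""
    payload = {
        "speaker_id": speaker_id,
        "text": text,
        "model_chain": "latest",
        "language_code": "en",
    }
    # Controls passed as None are left out of the request entirely.
    payload.update({k: v for k, v in controls.items() if v is not None})
    return payload


payload = build_tts_payload(
    "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",  # Replace with your chosen Speaker ID
    "Welcome to Replica Studios.",
    global_pace=1.2,
    global_pitch=None,  # not set, so it is omitted from the payload
)
print(payload)
```

The resulting dictionary can be passed as the `json` argument of `requests.post`, as in the single-generation example.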

Batch Text-to-Speech Generation

For multiple lines of text, you can use batch processing to submit them all at once via job specifications.

At a minimum, each job specification must include a text field; any other TTS parameter can be included as well.

Parameters in the main payload apply to every job in the batch. Parameters set in an individual job specification take priority over those in the main payload.

Batch operations are not available on all subscription tiers.

A minimal example would look like this:

import requests

url = "https://api.replicastudios.com/v2/speech/tts/submit"

# Batch payload
payload = {
    "job_specs": [
        {
            "text": "First line of dialogue.",
        },
        {
            "text": "Second line of dialogue.",
        }
    ],
    "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
    "model_chain": "latest",
    "language_code": "en",
}

headers = {
    "Content-Type": "application/json",
    "X-Api-Key": "..."  # Replace with your API key
}

# Submit the batch request and fail early on an HTTP error status
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
print(response.json())

The response contains one item per job specification. Each item includes a job_id, which you can use to check the job's status or retrieve the generated audio, and a status indicating whether the job was started successfully.

The response to the request above looks like this:

{
	"items": [
		{
			"text": "First line of dialogue.",
			"job_id": "fc243610-0083-4237-bad7-430cbd51cc20",
			"status": "CREATED",
			"error": null
		},
		{
			"text": "Second line of dialogue.",
			"job_id": "dbe60d4a-5534-428d-b35a-c88b6d97e6be",
			"status": "CREATED",
			"error": null
		}
	],
	"success_count": 2,
	"error_count": 0,
	"status": "SUBMITTED"
}
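A response like this can be split into started and failed jobs with a few lines of Python. This is a minimal sketch: the helper name `split_batch_response` is hypothetical, and it assumes a failed item carries a non-null `error` field, which the example above suggests but does not confirm.

```python
def split_batch_response(result):
    """Separate started jobs from failed ones in a batch submission response."""
    started, failed = [], []
    for item in result.get("items", []):
        # Assumption: a job that started has a job_id and a null error.
        if item.get("error") is None and item.get("job_id"):
            started.append(item["job_id"])
        else:
            failed.append(item)
    return started, failed


# Sample response copied from the example above.
result = {
    "items": [
        {
            "text": "First line of dialogue.",
            "job_id": "fc243610-0083-4237-bad7-430cbd51cc20",
            "status": "CREATED",
            "error": None,
        },
        {
            "text": "Second line of dialogue.",
            "job_id": "dbe60d4a-5534-428d-b35a-c88b6d97e6be",
            "status": "CREATED",
            "error": None,
        },
    ],
    "success_count": 2,
    "error_count": 0,
    "status": "SUBMITTED",
}

started, failed = split_batch_response(result)
print("Started jobs:", started)
print("Failed items:", failed)
```

In real use, `result` would be `response.json()` from the batch submission request.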

A more complex example payload with additional parameters:

# Batch payload
payload = {
    "job_specs": [
        {
            "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
            "text": "First line of dialogue.",
        },
        {
            "speaker_id": "07e62901-72c4-46e5-b009-aa0938d749df",
            "text": "Second line of dialogue with a different speaker.",
            "user_tags": ["Trialing"]
        }
    ],
    "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
    "model_chain": "latest",
    "language_code": "en",
}

This request returns the following response:

{
	"items": [
		{
			"speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
			"text": "First line of dialogue.",
			"job_id": "fc243610-0083-4237-bad7-430cbd51cc20",
			"status": "CREATED",
			"error": null
		},
		{
			"speaker_id": "07e62901-72c4-46e5-b009-aa0938d749df",
			"text": "Second line of dialogue with a different speaker.",
			"job_id": "dbe60d4a-5534-428d-b35a-c88b6d97e6be",
			"status": "CREATED",
			"error": null,
			"user_tags": [
				"Trialing"
			]
		}
	],
	"success_count": 2,
	"error_count": 0,
	"status": "SUBMITTED"
}
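The precedence rule described earlier (job-level parameters override the main payload) can be pictured as a dictionary merge. The sketch below is only a local illustration with a hypothetical helper, `effective_params`. It makes no API call and is not a description of how the service is implemented.

```python
def effective_params(payload):
    """Return the parameters each job would effectively use (local illustration only)."""
    defaults = {k: v for k, v in payload.items() if k != "job_specs"}
    # In a dict merge the later keys win, so job-level values override the defaults.
    return [{**defaults, **spec} for spec in payload["job_specs"]]


payload = {
    "job_specs": [
        {"text": "First line of dialogue."},
        {
            "speaker_id": "07e62901-72c4-46e5-b009-aa0938d749df",
            "text": "Second line of dialogue with a different speaker.",
        },
    ],
    "speaker_id": "9b1f5c24-a18b-4b9e-a785-b3a3b3b8751a",
    "model_chain": "latest",
    "language_code": "en",
}

for params in effective_params(payload):
    print(params["speaker_id"], "->", params["text"])
```

The first job inherits the speaker from the main payload, while the second uses its own `speaker_id`.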

Best Practices

Text Length

  • Minimum text length is 3 characters

  • Maximum text length is 2000 characters

  • For longer content, divide into multiple requests or use batch processing
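One possible way to divide longer content is to split it at sentence boundaries and pack sentences into chunks that stay within the 2000-character limit. The helper `split_text` below is a hypothetical sketch, not part of the API. It uses a simple punctuation-based sentence split and does not enforce the 3-character minimum.

```python
import re

MAX_CHARS = 2000  # documented maximum per text segment


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


long_text = "This is one sentence of narration. " * 120
chunks = split_text(long_text)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk can then be sent as its own request, or as one job specification in a batch.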

Performance

  • Use batch processing when generating multiple lines

  • Consider using the latest model chain for optimal quality

  • Include appropriate SSML tags for better control over speech characteristics

Limitations

  • Minimum 3 characters per text segment

  • Maximum 2000 characters per text segment

  • Batch operations are not available on all subscription tiers

  • Maximum of 100 lines (jobs) per batch request
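To stay within the per-batch limit, a larger list of lines can be sliced into groups of at most 100 job specifications before submission. This is a small sketch with a hypothetical helper, `chunk_job_specs`; it only prepares the groups and does not call the API.

```python
MAX_JOBS_PER_BATCH = 100  # documented limit per batch request


def chunk_job_specs(job_specs, size=MAX_JOBS_PER_BATCH):
    """Yield successive slices of job_specs, each with at most `size` entries."""
    for start in range(0, len(job_specs), size):
        yield job_specs[start:start + size]


# Illustrative input: 250 lines of placeholder dialogue.
job_specs = [{"text": f"Line number {i}."} for i in range(250)]
batches = list(chunk_job_specs(job_specs))
print([len(batch) for batch in batches])  # prints [100, 100, 50]
```

Each group would then be sent as the `job_specs` list of a separate batch submission request.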

For troubleshooting common issues, refer to our Troubleshooting Guide.