Create Educational Videos with Python | Generated by AI
To create educational short videos with a black background and captions synchronized with audio generated from text using the DeepSeek API and Google Cloud Text-to-Speech API, you can use Python to orchestrate the process. Below is a step-by-step guide and a Python script that accomplishes this. The script will:
- Use DeepSeek API to generate or refine a script (assuming you provide the educational content).
- Use Google Cloud Text-to-Speech API to convert the script into audio.
- Use a library like
moviepy
to create a video with a black background and captions synchronized with the audio.
Prerequisites
- DeepSeek API Key: Sign up at DeepSeek and obtain an API key.
- Google Cloud Text-to-Speech API:
- Set up a Google Cloud project and enable the Text-to-Speech API.
- Create a service account and download the JSON credentials file.
- Install the Google Cloud Text-to-Speech client library:
pip install google-cloud-texttospeech
.
- Python Libraries:
- Install required libraries:
pip install openai moviepy requests
.
- Install required libraries:
- FFmpeg: Ensure FFmpeg is installed for
moviepy
to handle video rendering (download from FFmpeg website or install via package manager).
Steps
- Generate or Refine Script with DeepSeek API: Use DeepSeek to create or polish the educational script, ensuring it’s concise and suitable for a 1-minute video.
- Convert Text to Audio with Google Cloud Text-to-Speech: Split the script into paragraphs, generate audio for each, and save as separate audio files.
- Create Video with MoviePy: Generate a video with a black background, display captions for each paragraph synchronized with the audio, and combine them into a final 1-minute video.
Python Script
The following script assumes you have a text file with the educational content (paragraphs) and generates a video with a black background and captions.
import os
from openai import OpenAI
from google.cloud import texttospeech
from moviepy.editor import TextClip, CompositeVideoClip, ColorClip, concatenate_videoclips
import requests
# Set up environment variables
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/google-credentials.json" # Update with your credentials file path
DEEPSEEK_API_KEY = "your_deepseek_api_key" # Update with your DeepSeek API key
# Initialize DeepSeek client
deepseek_client = OpenAI(api_key=DEEPSEEK_API_KEY, base_url="https://api.deepseek.com")
# Function to refine script with DeepSeek
def refine_script_with_deepseek(script):
prompt = f"""
You are an expert scriptwriter for educational videos. Refine the following script to be concise, clear, and engaging for a 1-minute educational video. Ensure it is suitable for spoken narration and fits within 60 seconds when spoken at a natural pace. Split the script into 2-3 short paragraphs for caption display. Return the refined script as a list of paragraphs.
Original script:
{script}
Output format:
["paragraph 1", "paragraph 2", "paragraph 3"]
"""
response = deepseek_client.chat.completions.create(
model="deepseek-chat",
messages=[{"role": "user", "content": prompt}],
stream=False
)
refined_script = eval(response.choices[0].message.content) # Convert string to list
return refined_script
# Function to generate audio for each paragraph using Google Cloud Text-to-Speech
def generate_audio(paragraphs, output_dir="audio"):
client = texttospeech.TextToSpeechClient()
if not os.path.exists(output_dir):
os.makedirs(output_dir)
audio_files = []
for i, paragraph in enumerate(paragraphs):
synthesis_input = texttospeech.SynthesisInput(text=paragraph)
voice = texttospeech.VoiceSelectionParams(
language_code="en-US",
name="en-US-Wavenet-D" # A natural-sounding English voice
)
audio_config = texttospeech.AudioConfig(
audio_encoding=texttospeech.AudioEncoding.MP3
)
response = client.synthesize_speech(
input=synthesis_input, voice=voice, audio_config=audio_config
)
audio_file = os.path.join(output_dir, f"paragraph_{i+1}.mp3")
with open(audio_file, "wb") as out:
out.write(response.audio_content)
audio_files.append(audio_file)
return audio_files
# Function to create video with captions and black background
def create_video(paragraphs, audio_files, output_file="educational_video.mp4"):
clips = []
for i, (paragraph, audio_file) in enumerate(zip(paragraphs, audio_files)):
# Create text clip for captions
text_clip = TextClip(
paragraph,
fontsize=40,
color="white",
font="Arial",
size=(1280, 720), # Standard HD resolution
method="caption",
align="center"
)
# Load audio and get its duration
audio_clip = AudioFileClip(audio_file)
duration = audio_clip.duration
# Set text clip duration to match audio
text_clip = text_clip.set_duration(duration)
# Create black background clip
bg_clip = ColorClip(size=(1280, 720), color=(0, 0, 0), duration=duration)
# Combine text and background
video_clip = CompositeVideoClip([bg_clip, text_clip.set_position("center")])
# Add audio to video clip
video_clip = video_clip.set_audio(audio_clip)
clips.append(video_clip)
# Concatinate all clips
final_clip = concatenate_videoclips(clips)
# Write final video
final_clip.write_videofile(output_file, fps=24, codec="libx264", audio_codec="aac")
final_clip.close()
for clip in clips:
clip.close()
# Main function
def main():
# Example input script (replace with your educational content)
input_script = """
Machine learning is a field of artificial intelligence that allows computers to learn from data without being explicitly programmed. It involves algorithms that identify patterns and make predictions. Applications include image recognition, natural language processing, and more. This technology is transforming industries like healthcare and finance.
"""
# Step 1: Refine script with DeepSeek
refined_paragraphs = refine_script_with_deepseek(input_script)
print("Refined Script:", refined_paragraphs)
# Step 2: Generate audio for each paragraph
audio_files = generate_audio(refined_paragraphs)
print("Audio files generated:", audio_files)
# Step 3: Create video with captions and black background
create_video(refined_paragraphs, audio_files)
print("Video created: educational_video.mp4")
if __name__ == "__main__":
main()
How to Use
- Set Up Credentials:
- Replace
"path/to/your/google-credentials.json"
with the path to your Google Cloud service account JSON file. - Replace
"your_deepseek_api_key"
with your DeepSeek API key.
- Replace
- Prepare Input Script:
- Modify the
input_script
variable in themain()
function with your educational content. The script should be a single string with the full text you want to convert into a video.
- Modify the
- Run the Script:
- Save the script as
create_educational_video.py
and run it withpython create_educational_video.py
. - The script will:
- Refine the script using DeepSeek API to ensure it’s concise and split into 2-3 paragraphs.
- Generate MP3 audio files for each paragraph using Google Cloud Text-to-Speech.
- Create a video with a black background, displaying each paragraph as captions synchronized with its corresponding audio.
- Save the script as
- Output:
- The final video will be saved as
educational_video.mp4
in the same directory as the script. - Audio files for each paragraph will be saved in the
audio
directory.
- The final video will be saved as
Notes
- DeepSeek API: The script uses the
deepseek-chat
model to refine the script. Ensure your API key is valid and you have sufficient credits. The DeepSeek API is used here to structure the script for video narration, as it excels in text generation and optimization. - Google Cloud Text-to-Speech: The script uses the
en-US-Wavenet-D
voice for natural-sounding English narration. You can change the voice by modifying thename
parameter inVoiceSelectionParams
(see Google Cloud Text-to-Speech documentation for other voice options). - MoviePy: The video is created in 1280x720 resolution (HD). You can adjust the
size
parameter inTextClip
andColorClip
for different resolutions. - Timing: The script ensures captions are synchronized with audio by setting the text clip duration to match the audio duration. For a 1-minute video, the DeepSeek prompt enforces a concise script.
- Dependencies: Ensure FFmpeg is installed and accessible in your system’s PATH for
moviepy
to work correctly.
Example Output
If your input script is about machine learning, the refined script might look like:
["Machine learning, a branch of AI, enables computers to learn from data.",
"It uses algorithms to find patterns and predict outcomes.",
"Applications include image recognition and healthcare innovations."]
- Each paragraph generates an audio file (e.g.,
paragraph_1.mp3
,paragraph_2.mp3
, etc.). - The final video shows a black background with white captions appearing sequentially, synchronized with the audio narration.
This approach is simple, cost-effective, and produces professional-looking educational videos suitable for platforms like YouTube or educational websites. If you need further customization (e.g., different fonts, caption styles, or additional effects), let me know!