UTF-8 is the most widely used text encoding and can represent every Unicode character, so a single file can hold text in virtually any language.

Why do characters look broken?

Broken characters often appear when software reads a file using the wrong encoding.

How to convert text and subtitle files to UTF-8

Encoding controls how characters are stored

When software reads a file with a different encoding from the one it was saved in, normal text can appear as broken symbols, question marks, or strange accented characters.
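As a small illustration of that mismatch, the sketch below stores the word "café" as bytes and then reads those bytes back with the wrong encoding. The sample word and the choice of cp1252 (Windows-1252) as the "wrong" encoding are assumptions made for the example, not something specific to any one tool.

```python
# Store a word as UTF-8 bytes, then read it back correctly and incorrectly.
text = "café"
raw = text.encode("utf-8")            # b'caf\xc3\xa9' (é takes two bytes)

correct = raw.decode("utf-8")         # 'café'
misread = raw.decode("cp1252")        # 'cafÃ©' (each byte shown as its own character)

# The reverse mistake: cp1252 bytes read as UTF-8.
legacy = text.encode("cp1252")        # b'caf\xe9'
try:
    legacy.decode("utf-8")            # strict decoding rejects the stray byte
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient decoding substitutes U+FFFD, which many programs draw as "?" or a box.
lenient = legacy.decode("utf-8", errors="replace")   # 'caf\ufffd'
```

The bytes never change in either case. Only the interpretation does, which is why the fix is to decode with the right encoding rather than to edit the visible characters.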

UTF-8 is widely used because it can encode every Unicode character, stays byte-compatible with plain ASCII, and is supported by virtually all modern editors, players, and browsers.
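To make "how characters are stored" concrete, here is a short sketch that prints how many bytes UTF-8 uses for a few characters. The sample characters are arbitrary picks for illustration.

```python
# UTF-8 is variable-width: one to four bytes per character.
samples = ["A", "é", "日", "♪", "😀"]

byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, size in byte_lengths.items():
    print(f"{ch!r} is stored in {size} byte(s)")

# Plain ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_compatible = "Hello".encode("utf-8") == "Hello".encode("ascii")
```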

Subtitle files often expose encoding problems

Subtitles may contain names, punctuation, music notes, and non-English text. If the encoding is misread, captions can become hard to read or fail on upload.

Converting to UTF-8 can make the file easier to reuse in editors, players, and publishing systems.
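A conversion of this kind can be sketched in a few lines of Python. This is a minimal example, not the method used by any particular converter: the function name `convert_to_utf8` and the file names in the usage note are hypothetical, and the caller is assumed to know (or be willing to guess) the source encoding.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding):
    """Re-encode the file at src into UTF-8 and write it to dst.

    source_encoding is the encoding the file is believed to use
    (for example "cp1252"). A wrong guess either raises
    UnicodeDecodeError or silently produces garbled text, so the
    output should still be checked by eye.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)          # strict: fails on invalid bytes
    # Writing bytes directly keeps the original line endings untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
    return text
```

For example, `convert_to_utf8("captions.srt", "captions.utf8.srt", "cp1252")` would read a hypothetical Windows-1252 subtitle file and write a UTF-8 copy next to it, leaving the original file unchanged.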

Check the converted result

After conversion, open the output and scan names, accented characters, and punctuation. This is the fastest way to catch a wrong guess about the source encoding.

If the original file is damaged or mixes several encodings, some manual cleanup may still be needed.
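Part of that scan can be automated. The sketch below flags lines that contain the Unicode replacement character or character pairs that commonly appear when UTF-8 is misread as Windows-1252. The marker list and the name `find_suspect_lines` are assumptions chosen for illustration, and the check is only a heuristic: "Ã" is a legitimate letter in Portuguese, for instance, so flagged lines still need a human look.

```python
# Heuristic markers (an assumption, not an exhaustive list):
#   "\ufffd"  replacement character left by lenient decoding
#   "Ã"       typical first half of a misread accented letter
#   "â€"      typical start of misread curly quotes and dashes
SUSPECT_MARKERS = ("\ufffd", "Ã", "â€")


def find_suspect_lines(text):
    """Return (line_number, line) pairs that contain a suspect marker."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(marker in line for marker in SUSPECT_MARKERS):
            hits.append((number, line))
    return hits
```

An empty result does not prove the conversion was right, but a non-empty one points directly at the lines worth checking first.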

