When you talk, there are “strings” (not actual strings, more like folds) in your throat called vocal cords that vibrate to make the sound of your voice. The faster they vibrate, the higher the sound they make. Children’s voices sound higher than adult voices because children’s vocal cords are shorter and thinner, so they vibrate faster and make a higher-pitched sound.
The same thing happens if you take that sound and play it back faster. You’re artificially making the vibrations of the sound happen faster, so it sounds higher pitched.
This is true of any sound, not just voices.
You do the same thing when you take a recording and run it faster. You increase the frequency of the audio wave pattern, so the pitch gets shifted up.
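So the “Explain it like I’m 18 and just haven’t taken physics for some reason” answer is that in audio, pitch is basically frequency, and frequency = wave speed / wavelength. For a recording, the “wave speed” is how fast the recorded pattern gets played back, and the “wavelength” is the length of one cycle as it’s stored on the recording. If you speed up the playback without changing the stored pattern, you get a higher frequency, which means a higher pitch.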
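If you want to see that with numbers, here’s a rough Python sketch. It isn’t real audio software, just an illustration: it builds one second of a 220 Hz tone, then “plays” the exact same samples at the normal rate and at double the rate, and counts how many cycles go by per second each time. The names `make_tone` and `estimate_pitch`, the 220 Hz tone, and the 8000 samples-per-second rate are all made up for this example.

```python
import math

def make_tone(freq_hz, sample_rate, seconds):
    """Return a list of samples of a sine wave (the 'recording')."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def estimate_pitch(samples, playback_rate):
    """Estimate frequency in Hz by counting upward zero crossings.

    playback_rate is how many samples per second we play back.
    The samples themselves are never changed.
    """
    cycles = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    duration = len(samples) / playback_rate  # seconds the playback lasts
    return cycles / duration

recording = make_tone(220.0, 8000, 1.0)       # recorded at 8000 samples/s

normal = estimate_pitch(recording, 8000)      # played at the original speed
fast = estimate_pitch(recording, 16000)       # same samples, played twice as fast

print(f"normal speed: about {normal:.0f} Hz")
print(f"double speed: about {fast:.0f} Hz")
```

The first number should come out close to 220 Hz and the second close to 440 Hz. The recording didn’t change at all; the same cycles just got squeezed into half the time, so the frequency doubled and the pitch went up by an octave.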