What would verbal components actually sound like?

Something I’ve always wondered is what a verbal component actually sounds like in-universe. Is it random sounds, gibberish, or do you anime it and chant something in actual words? Given that the deafened status applies a penalty, it’s safe to assume you have to be very accurate when performing your verbal components, so I’d like to think they’re not too hard to say. As far as I can find, there’s nothing that says what language, length, format, etc. they use. Other components are very clear on what you are actually doing and how you are doing it. The answer doesn’t have to be stated directly in the rulebook, but it must come from Paizo-approved material.

It’s also safe to assume Spellcraft’s identify function relies on something other than sound, since Spellcraft says “Identifying a spell as it is being cast requires no action, but you must be able to clearly see the spell as it is being cast, and this incurs the same penalties as a Perception skill check due to distance, poor conditions, and other factors.” Thus, whatever allows you to identify the spell is visual, not audible. As far as I know, there’s no description of what the inside of a spellbook looks like either; for all we know, it could just be a bunch of magic circles. If it actually had words and used the owner’s language, I’d argue you’d have to know the language it’s written in. If it’s in a universal language, then I guess that would work, but again, I don’t know what a spellbook looks like.

What does a verbal component actually sound like in practice?