It varies with the age of the video; newer ones do indeed have separate audio downloads. You can force audio only with:
yt-dlp -f bestaudio <url>
This makes yt-dlp consider only audio-only formats, which helps if bandwidth is a concern. How it decides which one is “best” is beyond me, though. For example, I tried one video and got a WebM file that contains only an audio track:
~ $ yt-dlp -f bestaudio https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] Extracting URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] dQw4w9WgXcQ: Downloading webpage
[youtube] dQw4w9WgXcQ: Downloading ios player API JSON
[youtube] dQw4w9WgXcQ: Downloading android player API JSON
[youtube] dQw4w9WgXcQ: Downloading m3u8 information
[info] dQw4w9WgXcQ: Downloading 1 format(s): 251
[download] Destination: /data/data/com.termux/files/home/storage/movies/ytdl/20091025__Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video.webm
[download] 100% of 3.28MiB in 00:00:00 at 6.91MiB/s
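For a rough idea of what "best" could mean, here is a minimal sketch in Python. It is not yt-dlp's actual ranking, which weighs more fields than this; it simply assumes that an audio-only format is one whose `vcodec` is `"none"` and that the highest audio bitrate (`abr`) wins. The field names mirror what yt-dlp reports per format, but `SAMPLE_FORMATS` and `pick_best_audio` are hypothetical and the values are made up for illustration.

```python
# Hypothetical sample data: field names follow yt-dlp's per-format fields,
# but the values are invented for this illustration.
SAMPLE_FORMATS = [
    {"format_id": "18", "ext": "mp4", "vcodec": "avc1", "acodec": "mp4a", "abr": 96},
    {"format_id": "140", "ext": "m4a", "vcodec": "none", "acodec": "mp4a", "abr": 128},
    {"format_id": "249", "ext": "webm", "vcodec": "none", "acodec": "opus", "abr": 50},
    {"format_id": "251", "ext": "webm", "vcodec": "none", "acodec": "opus", "abr": 130},
]


def pick_best_audio(formats):
    """Return the audio-only format with the highest bitrate, or None.

    Simplifying assumption: bitrate is the only criterion. The real tool
    considers more than that.
    """
    # Keep formats that carry audio but no video track.
    audio_only = [
        f for f in formats
        if f.get("vcodec") == "none" and f.get("acodec") != "none"
    ]
    if not audio_only:
        return None
    # Treat a missing bitrate as 0 so such entries sort last.
    return max(audio_only, key=lambda f: f.get("abr") or 0)


best = pick_best_audio(SAMPLE_FORMATS)
print(best["format_id"], best["ext"])
```

With this made-up data the sketch picks the Opus-in-WebM entry, which would at least be consistent with getting format 251 in the log above.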