bug-wget

Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request


From: Random Coder
Subject: Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?
Date: Tue, 31 Mar 2015 15:05:19 -0700

On Tue, Mar 31, 2015 at 10:11 AM, Stephen Wells <address@hidden> wrote:
> Dear all - I am currently trying to use wget to obtain mp3 files from the
> Google Translate TTS system. In principle this can be done using:
>
> wget -U Mozilla -O "${string}.mp3" "http://translate.google.com/translate_tts?tl=TL&q=${string}";
>
> ...
>
> http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C
>
> This of course produces a string of gibberish in the resulting mp3 file!


That URL is correct; it is exactly what a browser sends across the
wire for the same string.  Google appears to be returning gibberish
because of some User-Agent sniffing on their end.

If you change the user agent to something that's more complete, like
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2228.0 Safari/537.36" instead of just Mozilla, it should
work correctly.
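
For illustration, here is a minimal sketch of that suggestion as a POSIX
shell snippet. The function names (urlencode, tts_url, fetch_tts) are
made up for this example and are not part of wget. The User-Agent
string is the one quoted above, and the URL layout is taken from the
original message. It assumes the input text is already UTF-8, and it
has not been checked against Google's service.

```shell
#!/bin/sh
# Full browser-style User-Agent string, copied from the message above.
UA='Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'

# urlencode TEXT (hypothetical helper): percent-encode every byte of TEXT.
# Encoding plain ASCII letters too is redundant, but still a valid URL.
urlencode() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \t\n' | sed 's/../%&/g' | tr 'a-f' 'A-F'
}

# tts_url LANG TEXT (hypothetical helper): print the request URL
# in the form used earlier in this thread.
tts_url() {
    printf 'http://translate.google.com/translate_tts?tl=%s&q=%s\n' "$1" "$(urlencode "$2")"
}

# fetch_tts LANG TEXT (hypothetical helper): save the audio as TEXT.mp3,
# sending the full User-Agent instead of just "Mozilla".
fetch_tts() {
    wget -U "$UA" -O "$2.mp3" "$(tts_url "$1" "$2")"
}
```

Calling fetch_tts ru "мазать" should then request the same
percent-encoded URL shown in the quoted message; the only thing that
changes compared to the original command is the -U value.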


