По ПО

I bought a new Amazon Kindle 3G and thought I could read some Turkish clippings to learn the language. But Turkish has so many verb forms and cases for both nouns and adjectives (one could say as many as in our Russian tongue) that I soon found it really hard. So I searched the net for an inline Turkish-to-English dictionary for the Kindle. There are some "online" ones and some "Windows" ones, but I was unhappy not to find a good one in mobi format. So I tried to make one myself, and after a few days I succeeded.

First of all, I downloaded a Babylon setup and extracted the much-needed BLG file. Then I ran UnpackBLG.exe under wine to extract a TXT file (option Terms - Outside). This TXT was processed by two Perl scripts: one (A) removes empty lines, tab-delimits the entries, and puts correct UTF-8 characters in place of the bad ones in the UnpackBLG.exe output; the other (B) splits the pipe-delimited verb forms and cases into separate lines:

A:

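# Assumes $line1 holds raw bytes: single-byte Turkish (Windows-1254) codes in, UTF-8 sequences out.
$line1 =~ s/\n/\t/g if ($line1 =~ /(.+)\n/);  # non-empty line: newline becomes a tab
$line1 =~ s/\xFE/\xC5\x9F/g;  # ş, s with cedilla, as in Şekure (shows up as þ)
$line1 =~ s/\xFD/\xC4\xB1/g;  # ı, dotless i (shows up as ý)
$line1 =~ s/\xFC/\xC3\xBC/g;  # ü, u with diaeresis
$line1 =~ s/\xF6/\xC3\xB6/g;  # ö, o with diaeresis
$line1 =~ s/\xF0/\xC4\x9F/g;  # ğ, g with breve (shows up as ð)
$line1 =~ s/\xE2/\xC3\xA2/g;  # â, a with circumflex
$line1 =~ s/\xE7/\xC3\xA7/g;  # ç, c with cedilla, as in çocuk
$line1 =~ s/\xDE/\xC5\x9E/g;  # Ş, capital s with cedilla (shows up as Þ)
$line1 =~ s/\xDD/\xC4\xB0/g;  # İ, capital dotted i (shows up as Ý)
$line1 =~ s/\xDC/\xC3\x9C/g;  # Ü
$line1 =~ s/\xD6/\xC3\x96/g;  # Ö
$line1 =~ s/\xD0/\xC4\x9E/g;  # Ğ, capital g with breve (shows up as Ð)
$line1 =~ s/\xC2/\xC3\x82/g;  # Â
$line1 =~ s/\xC7/\xC3\x87/g;  # Ç
$line1 =~ s/\xE9/\xC3\xA9/g;  # é, French e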
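A rough Python sketch of the same character repair, for readers who want to check the idea; it is not the Perl script itself. It assumes the garbled text is simply Turkish single-byte (Windows-1254) data that was displayed as Latin-1, which is what the substitution pairs in script A suggest. The function name `repair_turkish` is made up for this example.

```python
def repair_turkish(garbled):
    """Re-read Latin-1-looking text as Windows-1254 (Turkish).

    Assumption: every character in `garbled` fits in Latin-1 and the
    underlying bytes really are Windows-1254. Bytes that Windows-1254
    leaves undefined would raise UnicodeDecodeError.
    """
    raw = garbled.encode("latin-1")   # back to the original single bytes
    return raw.decode("cp1254")       # interpret them as Turkish


if __name__ == "__main__":
    for sample in ("ýþýk", "Ýstanbul", "dað"):
        fixed = repair_turkish(sample)
        # .encode("utf-8") gives the bytes that end up in the dictionary file
        print(sample, "->", fixed, fixed.encode("utf-8"))
```

Letters that are the same in both code pages (ü, ö, ç, â, é) come through unchanged as characters; only their byte representation changes when the result is written out as UTF-8.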

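B:

# Entries with several pipe-separated forms get one output line per form.
if ($line1 =~ /(.+) \| (.+)/) {
    # forms and meaning are separated by the tab that script A put in
    my ($pipes, $meaning) = split("\t", $line1);
    my @list = split(' \| ', $pipes);
    my $counter = 0;
    foreach my $line (@list) {
        my $str = '';
        if (!$counter) {
            # the first form is the headword: it keeps the meaning
            $str = $line."\t".$meaning."\n";
        } else {
            # every other form points back to the headword
            $str = $line."\t".$list[0]."\n";
        }
        $counter++;
        # skip any entry that contains a slash
        print OUTF $str unless ($str =~ /\//);
    }
} else {
    # no pipe-separated forms: pass the line through unchanged
    print OUTF $line1;
}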

I manually removed a couple of junk lines that UnpackBLG had inserted, the ones tab2opf.py couldn't swallow:

python tab2opf.py -utf turk.txt > err 2>&1
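Hunting for such lines by hand is tedious, so here is a hedged Python sketch of a pre-check. It assumes that a usable input line is exactly one term, one tab, and one definition; the post does not say which lines actually tripped tab2opf.py, so treat the rule as a guess. The function name `find_suspect_lines` is hypothetical.

```python
def find_suspect_lines(lines):
    """Return 1-based numbers of lines that are not 'term<TAB>definition'.

    Assumption: a good line has exactly two non-empty tab-separated fields.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 2 or not fields[0].strip() or not fields[1].strip():
            suspects.append(number)
    return suspects


if __name__ == "__main__":
    sample = ["ev\thouse\n", "garbage line\n", "\n", "su\twater\n"]
    print(find_suspect_lines(sample))
```

For the real file, the same function could be applied to the lines of turk.txt opened with encoding="utf-8".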

At last I had an .opf file and 138 XML files of about 2 MB each to feed to mobigen:

wine mobigen/mobigen.exe turk.opf -unicode

On my dual-core desktop it took an hour of hard-disk rattling, and I was afraid of losing the drive, but in the end it worked, thank God.

Here is the 40 MB zipped file: http://82.146.44.218/turk.zip

Put it directly into the Kindle's documents directory via USB (no conversion required). Then choose it as the primary dictionary for Turkish texts. You will usually need to push the button twice: the first lookup reduces the form to its infinitive, the second gets the meaning. I think that is the handiest way.
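The two-step lookup follows from how the entries are built: an inflected form points to its headword, and only the headword carries the meaning. A toy Python sketch of that chain, with invented sample entries and a hypothetical function name `look_up`:

```python
def look_up(dictionary, word, max_steps=2):
    """Follow entries from `word` until the result is no longer a key.

    Returns the chain of lookups, starting with the word itself.
    """
    steps = [word]
    current = word
    for _ in range(max_steps):
        target = dictionary.get(current)
        if target is None:
            break                      # word is not in the dictionary
        steps.append(target)
        if target not in dictionary:
            break                      # reached a plain meaning
        current = target
    return steps


if __name__ == "__main__":
    # Toy data only; the real dictionary has far more entries.
    toy = {"gelmek": "to come", "geldi": "gelmek"}
    print(look_up(toy, "geldi"))
    print(look_up(toy, "gelmek"))
```

On the Kindle each step is one press of the button; the sketch just makes the indirection visible.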

By the way, do you know how to clip web sites to your Kindle? Look where I clip Turkish newspapers for reading on mine.


Friday, 23 September 2011
