Hello, this is a fairly specific technical question and I don’t know if anyone will be able to help me, but I know there are lots of smart computery folks here, so I figured it’s worth a shot.
I have a problem with a DejaVu X translation database that I’m trying to solve, either by using SQL commands to modify the database, or by exporting it to TMX (XML) and using a script to modify the XML, then importing it back into my translation memory program. I can also export the database to other formats such as Excel or Access.
The problem is that rather than keeping a single record for each source segment, with multiple languages, for example:
[Record #1]
[Source] Saturday
[French] samedi
[Italian] sabato
[Latin] aturdaysay
my database has duplicate records, one for each language:
[Record #1]
[Source] Saturday
[French] samedi
[Record #2]
[Source] Saturday
[Italian] sabato
[Record #3]
[Source] Saturday
[Latin] aturdaysay
I would like to combine these into a single record, as in the first example. The duplicate records do not necessarily appear consecutively in the database.
The other problem is that multiple translations into a language may exist for a single term in the database, and I would like to keep both. For example, “mouse” may have been translated into Italian as both “topo” (for the animal) and “mouse” (for the computer peripheral). It doesn’t matter which one gets merged into the multilingual record, as long as the other one is still kept.
Does anybody know of a way to do this?
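To make the goal concrete, here is a rough Python sketch of the merge logic on flat data. It assumes the records have already been exported (for example from Excel) into (source, language, translation) tuples; that layout and the name merge_records are hypothetical, not anything DejaVu X provides. The first translation seen for each language goes into the combined record, and any further, different translation into the same language is set aside so it is not lost.

```python
def merge_records(records):
    """Merge bilingual records that share the same source text.

    `records` is a list of (source, language, translation) tuples,
    an assumed flat layout such as rows from a spreadsheet export.
    Returns (merged, extras): `merged` maps each source text to a
    {language: translation} dict, and `extras` lists the additional
    translations that could not go into the combined record.
    """
    merged = {}   # dicts keep insertion order in Python 3.7+
    extras = []
    for source, lang, translation in records:
        entry = merged.setdefault(source, {})
        if lang not in entry:
            # First translation into this language: goes into the combined record.
            entry[lang] = translation
        elif entry[lang] != translation:
            # A different translation into the same language: keep it separately.
            extras.append((source, lang, translation))
    return merged, extras


# The records do not need to be adjacent in the input.
records = [
    ("Saturday", "French", "samedi"),
    ("mouse", "Italian", "topo"),
    ("Saturday", "Italian", "sabato"),
    ("mouse", "Italian", "mouse"),
    ("Saturday", "Latin", "aturdaysay"),
]
merged, extras = merge_records(records)
```

With this input, "Saturday" ends up as one record with three languages, and the second Italian rendering of "mouse" lands in extras, where it could be written back out as its own record.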
Here is a sample of a TMX file showing two duplicate records, the first containing an English-German pair and the second containing an English-Danish pair. I would like both to appear as part of the same record, with English as the source segment and both German and Danish as <tuv> elements of the same <tu>, if that makes sense.
<tu
tuid="3"
datatype="Text"
srclang="en-us"
>
<prop type="x-Project">7301447</prop>
<tuv
xml:lang="en-us"
creationdate="20051019T204809Z"
creationid=" "
>
<prop type="IsSource">True</prop>
<seg>INDICATION FOR USE</seg>
</tuv>
<tuv
xml:lang="de"
creationdate="20060302T013912Z"
creationid=" "
>
<prop type="IsSource">False</prop>
<seg>Einsatzbereich</seg>
</tuv>
</tu>
<tu
tuid="4"
datatype="Text"
srclang="en-us"
>
<prop type="x-Project">7301447</prop>
<tuv
xml:lang="en-us"
creationdate="20051019T204809Z"
creationid=" "
>
<prop type="IsSource">True</prop>
<seg>INDICATION FOR USE</seg>
</tuv>
<tuv
xml:lang="da"
creationdate="20051019T204809Z"
creationid=" "
>
<prop type="IsSource">False</prop>
<seg>Indikation</seg>
</tuv>
</tu>
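A minimal sketch of how the TMX route might look in Python with the standard library's xml.etree.ElementTree. It makes several assumptions that are not confirmed by the export above: the <tu> elements sit directly under a <body> element, the source <tuv> is the one whose xml:lang matches the <tu>'s srclang, and each <seg> holds plain text with no inline markup. The function names split_tuvs and merge_duplicates are made up for illustration. When a record brings a second translation into a language the merged record already has, that record is left in place, so both translations survive.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_tuvs(tu):
    """Return (source_text, [target <tuv> elements]) for one <tu>."""
    srclang = (tu.get("srclang") or "").lower()
    source_text, targets = None, []
    for tuv in tu.findall("tuv"):
        lang = (tuv.get(XML_LANG) or "").lower()
        if lang == srclang and source_text is None:
            source_text = tuv.findtext("seg")
        else:
            targets.append(tuv)
    return source_text, targets


def merge_duplicates(body):
    """Fold <tu> elements with the same source text into the first one seen."""
    first_seen = {}  # (srclang, source text) -> the <tu> we merge into
    for tu in list(body.findall("tu")):
        source_text, targets = split_tuvs(tu)
        if source_text is None:
            continue  # no recognizable source segment; leave untouched
        key = ((tu.get("srclang") or "").lower(), source_text)
        keeper = first_seen.setdefault(key, tu)
        if keeper is tu:
            continue  # first record for this source text
        taken = {(t.get(XML_LANG) or "").lower() for t in keeper.findall("tuv")}
        leftover = False
        for tuv in targets:
            lang = (tuv.get(XML_LANG) or "").lower()
            if lang in taken:
                leftover = True  # language already present: keep this record
            else:
                tu.remove(tuv)
                keeper.append(tuv)
                taken.add(lang)
        if not leftover:
            body.remove(tu)  # everything was moved, so drop the empty duplicate


# Cut-down version of the two records above, wrapped in an assumed <body>.
SAMPLE = """<body>
<tu tuid="3" srclang="en-us">
<tuv xml:lang="en-us"><seg>INDICATION FOR USE</seg></tuv>
<tuv xml:lang="de"><seg>Einsatzbereich</seg></tuv>
</tu>
<tu tuid="4" srclang="en-us">
<tuv xml:lang="en-us"><seg>INDICATION FOR USE</seg></tuv>
<tuv xml:lang="da"><seg>Indikation</seg></tuv>
</tu>
</body>"""

body = ET.fromstring(SAMPLE)
merge_duplicates(body)
```

For a real export, the same function could be applied to the <body> of a tree loaded with ET.parse and saved with tree.write. ElementTree does not write the DOCTYPE declaration back out, so the result should be tried on a copy of the file before importing it into the translation memory.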