Problems to Fix


This document lists the known outstanding major problems in the BibleTrans technology, and how I might try to fix them. Not included here are the numerous easily fixed small bugs.

Rule Table Space
Discourse Structure Nesting
Noun Management
 

Rule Table Space

I allowed for 31 rules in each of four types. This was already inadequate in the case of conditions and assignments, so there is a second set of 31 of each of these, and the doubled lists are already filling up. I need to partition the "Adat" resource numbers differently -- and probably also rename the resource "Ndat" to minimize collision during the transition -- so that we can have up to maybe 255 (or more) of each rule type. Unfortunately, the rule number space is hard-coded into much of the grammar rule processing, so this involves a significant rewrite of the grammar editor and the translation engine rule compiler. If done well, the existing grammar files can be imported into the new number space with minor (automated) fixups and still work correctly.
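
The automated fixup could amount to a simple renumbering pass. The sketch below is only an illustration under assumptions: the actual layout of the "Adat" resource numbers is not described here, so the (type, bank, index) triple, the stride of 256, and the function names are all hypothetical. Only the figures 31, 255, and four rule types come from the text above.

```python
from typing import Tuple

OLD_PER_BANK = 31   # rules per set in the current layout (from the text)
NEW_PER_TYPE = 255  # proposed capacity per rule type (from the text)
NUM_TYPES = 4       # four rule types (from the text)
BLOCK = 256         # hypothetical stride; slot 0 of each block is left unused


def new_rule_number(rule_type: int, bank: int, index: int) -> int:
    """Map an old (type, bank, index) triple into the wider number space.

    bank 0 is the original set of 31; bank 1 is the second set that only
    some rule types have. This triple is an assumed model of the old layout.
    """
    if not 0 <= rule_type < NUM_TYPES:
        raise ValueError(f"unknown rule type {rule_type}")
    if bank not in (0, 1):
        raise ValueError(f"unknown bank {bank}")
    if not 1 <= index <= OLD_PER_BANK:
        raise ValueError(f"index {index} outside 1..{OLD_PER_BANK}")
    slot = bank * OLD_PER_BANK + index  # 1..62, well under NEW_PER_TYPE
    return rule_type * BLOCK + slot


def split_rule_number(number: int) -> Tuple[int, int]:
    """Recover (rule_type, slot) from a number in the new space."""
    rule_type, slot = divmod(number, BLOCK)
    if not (0 <= rule_type < NUM_TYPES and 1 <= slot <= NEW_PER_TYPE):
        raise ValueError(f"{number} is not a valid rule number")
    return rule_type, slot
```

Under this scheme the two old sets of a rule type land in slots 1..62 of that type's block, leaving the remaining slots up to 255 free for new rules.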
 

Discourse Structure Nesting

Consistent with Steve Beale's advice to me when I was designing this, BibleTrans allows for two types of discourse structure: coordination between two or more propositions at the same logical level, and subordination of one proposition under another. This now seems unrealistically simplistic, as I'm sure Dr. Beale has also discovered.

The problem seems most pronounced where the Greek language apparently did not have an adequate vocabulary to fully express the author's intention, so (for example) the Apostle Paul needed to compose a single verbal idea from two or more Greek verbs conjoined. Because the BibleTrans ontology closely follows the Greek lexicon, Elizabeth Miles was forced into a comparable composite structure when she encoded Philippians, which is foreign both to Paul's expressed ideas and to the natural BibleTrans tree structure. Current thinking is to allow the ontology to expand in such cases, so that "expect and hope" in Php.1:20 becomes a single concept and a single proposition, with the attendant simplification of the tree structure. We must be careful not to let this liberty degenerate into runaway license, for that will greatly burden the linguists writing translation grammars for the target languages.

There are numerous cases in both Philippians and Luke, as encoded by Ms. Miles, where she has subordinating relations under coordinating relations. This makes no sense linguistically, unless her intent was to express that the subordinated proposition(s) were in fact subordinate to the composite idea expressed by the coordination. Somebody needs to evaluate each of these instances (they are all hollow nodes in the current implementation, except for Php.1:21 and 1:29, which I moved under the second of the two propositions in each case), and if necessary, invent new extended concepts to capture the composite verb sense, then clean up the tree structure to reflect this new encoding.
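
Listing the instances to be reviewed is a mechanical tree walk. The following is a minimal sketch, assuming a simplified tree in which every node carries a kind and a label; the Node class and the three kind names are hypothetical stand-ins, not the actual BibleTrans tree format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    # Hypothetical node model: kind is "coordination", "subordination",
    # or "proposition"; label identifies the node for the reviewer.
    kind: str
    label: str
    children: List["Node"] = field(default_factory=list)


def subordination_under_coordination(root: Node) -> List[Tuple[str, str]]:
    """Return (parent label, child label) for every subordinating relation
    that sits directly under a coordinating relation."""
    hits = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if node.kind == "coordination" and child.kind == "subordination":
                hits.append((node.label, child.label))
            stack.append(child)
    return hits
```

Each reported pair is only a candidate for review; whether it calls for a new extended concept remains a linguistic judgment.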
 

Noun Management

I originally designed this to have a separate noun list in each episode. Given the persistence of many nouns across episodes, this began to seem foolish, so I merged them all into a single ThingList at the book level. That has a different problem, which was not apparent in the small-scale encodings we have done so far: the noun list for a whole book the size of a gospel will be huge and unmanageable while building trees.

Current thinking is to trim the logical list by encoding the noun numbers by chapter times 1000, so that nouns appearing only in chapter 2 would have numbers in the range 2000-2999, and so on. Nouns that persist across chapters would be given numbers less than 1000. BibleTrans can facilitate this partition by designating a "current chapter" for assigning noun numbers, and for hiding those outside the designated chapter.
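
The numbering scheme is simple arithmetic on the noun number. Here is a minimal sketch, assuming the ThingList can be modelled as a mapping from noun number to name; the function names are hypothetical, and starting the common region at 1 rather than 0 is an assumption.

```python
from typing import Dict

CHAPTER_STRIDE = 1000  # chapter 2 owns 2000-2999, and so on (from the text)


def chapter_of(noun_number: int) -> int:
    """Chapter that owns a noun number; 0 means the common region."""
    return noun_number // CHAPTER_STRIDE


def visible_nouns(thing_list: Dict[int, str], current_chapter: int) -> Dict[int, str]:
    """Keep the common nouns plus those of the current chapter, hiding the rest."""
    return {
        number: name
        for number, name in thing_list.items()
        if chapter_of(number) in (0, current_chapter)
    }


def next_free_number(thing_list: Dict[int, str], current_chapter: int) -> int:
    """Lowest unused noun number in the given chapter (0 = common region)."""
    base = current_chapter * CHAPTER_STRIDE
    start = max(base, 1)  # assumption: noun number 0 is never assigned
    for number in range(start, base + CHAPTER_STRIDE):
        if number not in thing_list:
            return number
    raise ValueError(f"chapter {current_chapter} has no free noun numbers")
```

One consequence worth noting: the stride caps each chapter, and the common region, at roughly a thousand nouns.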

The user should be able to ask for a noun to be given a specific noun number (if not already taken), so as to move any noun into the common region (or to any other chapter), and the nouns should be sorted in the ThingList, but these are easy fixes.
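
Both fixes can be sketched in a few lines, again assuming the ThingList is modelled as a mapping from noun number to name. The function names are hypothetical, sorting by noun number is an assumption, and updating references to the noun inside the trees is omitted.

```python
from typing import Dict, List, Tuple


def renumber(thing_list: Dict[int, str], old: int, new: int) -> None:
    """Move a noun to a requested number, refusing if that number is taken."""
    if old not in thing_list:
        raise KeyError(old)
    if new == old:
        return
    if new in thing_list:
        raise ValueError(f"noun number {new} is already taken")
    thing_list[new] = thing_list.pop(old)


def sorted_nouns(thing_list: Dict[int, str]) -> List[Tuple[int, str]]:
    """Nouns ordered by number, so the common region comes first."""
    return sorted(thing_list.items())
```

Moving a chapter noun into the common region is then just a renumber to any free number below 1000.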
 

First draft: 2014 June 9