Re:[Corpora-List] canonical order

From: Geert-Jan M. Kruijff (gj@CoLi.Uni-SB.DE)
Date: Thu Dec 05 2002 - 09:45:42 MET


    Jim,

    > I have the very firm idea that canonical order is that in part
    > because it is the most frequent order in the language.

    I think this depends a bit on what you see as the function of word
    order in a language that has "free" word order. Many people have
    argued that word order helps realize information structure in such
    languages -- Prague School, Vallduvi's information packaging, etc.

    THAT variation in word order does indeed indicate different
    information structure can be seen from the 'fact' that, even though
    different variations might be equally well-formed, they are not
    necessarily equally interchangeable in a given context.

    If you adopt this view on the function of word order, the
    "canonical" word order would be the order that realizes an "all-focus"
    construction, i.e. one in which no item is indicated as being
    dependent on the preceding context ("given"). Needless to say, this
    does not need to be the most frequent order.

    (NOTE: I am purely concerned here with surface word order, not with
    "deep" word order.)

    (Shameless plug: See my dissertation for formal models of this view,
    based on the Prague school. Dissertation is available from my website.)

    > However, I have
    > done no research on this supposed fact, and cannot think of any
    > offhand. Does anyone know of any work on the relative frequency of
    > sentences in canonical order and those showing variation in that
    > order? Of course, this would be especially useful in a 'free word
    > order' language like Spanish, but anything would be welcome. Likewise,
    > she would be interested in the relative frequency of the different
    > orders of the basic elements, if anyone knows of any work on that (one
    > type of sentence and its variants that she is working with is SUBJECT -
    > VERB - OBJECT - CIRCUMSTANTIAL_COMPLEMENT -- the last is normally a
    > prepositional phrase or adverbial phrase; this would produce in
    > principle 24 different orders in this case, *all* of which are
    > attested and attestable in Spanish, though presumably with rather
    > different relative frequencies of use).

    ... but do the 24 orders presuppose identical contexts?
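
    (To make the arithmetic concrete: four constituents give 4! = 24
    possible orders. Below is a minimal Python sketch, not an established
    method, that enumerates the orders and tallies their relative
    frequency separately per context. The labels S, V, O, C, the context
    names, and the toy sample are all assumptions made up for
    illustration; they are not corpus data.)

```python
from collections import Counter
from itertools import permutations

# Hypothetical labels for the four constituents in the quoted message:
# Subject, Verb, Object, Circumstantial complement.
CONSTITUENTS = ("S", "V", "O", "C")


def all_orders(constituents=CONSTITUENTS):
    """Return every linear order of the constituents as a string."""
    return [" ".join(p) for p in permutations(constituents)]


def relative_frequencies(observations):
    """Map each context to the share of each order observed in it.

    observations: iterable of (context, order) pairs.
    """
    counts = {}
    for context, order in observations:
        counts.setdefault(context, Counter())[order] += 1
    return {
        context: {order: n / sum(c.values()) for order, n in c.items()}
        for context, c in counts.items()
    }


# Invented toy sample, for illustration only (not real corpus counts).
SAMPLE = [
    ("all-focus", "S V O C"),
    ("all-focus", "S V O C"),
    ("all-focus", "S V C O"),
    ("object-given", "O V S C"),
    ("object-given", "O V S C"),
    ("object-given", "O V S C"),
    ("object-given", "S V O C"),
]

if __name__ == "__main__":
    print(len(all_orders()))  # 4! = 24
    for context, shares in relative_frequencies(SAMPLE).items():
        print(context, shares)
```

    Keeping the counts apart per context is the point of the sketch: an
    order that is most frequent overall need not be the most frequent one
    in an all-focus context.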

    > She would also like to know who was the first person to coin the
    > term 'canonical order', or to whom it is attributed. (Or is it just an
    > idea that 'grew'? This last seems to me to be unlikely, but if anyone
    > has any really old references to the notion, I guess I might have to
    > accept it)

    Greenberg, in his "Some universals of grammar with particular
    reference to the order of meaningful elements" talks of "basic order",
    and refers to work on typology dating back to the nineteenth century
    (Footnote 4, page 105). Greenberg himself conceives of the basic order
    as the "dominant" order. Essentially, a dominant order is the order
    that always occurs as implicatum in universals about word order -- the
    preferred order, other things being equal -- i.e. capturing in a
    typological fashion the idea of canonical word order. (For a nice
    explanation, see Croft's book "Typology & universals", p.53ff.)

    When it comes to word order variation, neither Greenberg nor Hawkins
    (in his book "Word Order Universals") says much. Steele published on
    this in the 1970s, proposing to characterize variation on a discrete
    scale of "rigid", "mixed" and "free". This scale is more fine-grained
    than e.g. Skalicka's characterization of variability, based on
    morphology. (In my dissertation, I tried to extend Steele's
    characterization, and tie it into a characterization of information
    structure as typological category.)

    I'm not sure whether the above answers your questions completely :-)
    What it does point to, though, is that canonicity first of all seems
    to depend on what you consider to be the *function* of word
    order. Only once you have fixed THAT does it make sense to start
    collecting frequency data, I guess.

    Best regards,

    Geert-Jan

    =============================================================
    Dr.ir. Geert-Jan M. Kruijff

    Computational Linguistics Room 3.03, Building 17
    University of the Saarland Phone: +49.(681).302.4502
    Postfach 15 11 50 Mobile: +49 .179. 479.5892
    D-66041 Saarbruecken (Germany) Fax: +49.(681).302.4700

    gj@coli.uni-sb.de, gj@acm.org www.coli.uni-sb.de/~gj

    "Communications without intelligence is noise; Intelligence
     without communications is irrelevant."
     -- Alfred M. Gray


