Re: [Corpora-List] Looking for some corpora about why-questions, how-questions, and their answers.

From: Zhiping Zheng (zzheng@umich.edu)
Date: Thu Nov 21 2002 - 23:57:55 MET


    Dear all,

    I received several responses asking whether I plan to make my question
    list public, so I think I should answer this question on the whole list.

    I am willing to make it public but I am not sure if I should do it right
    now. Here are the reasons:

    1. Some questions ask for information about specific people: not only
    celebrities, but probably also the questioners themselves or people very
    close to them. This may raise privacy issues. I would prefer to remove
    these questions before making the question corpus public.

    2. Some questions, and not a small number of them, contain profanity. I
    do not think these questions are appropriate for a corpus.

    3. Many questions are not grammatically correct or contain spelling errors.
    I personally think this is OK because the questions come from the real
    world. I don't know what other researchers think about this.

    4. Different researchers may have different expectations. For example, the
    original poster of this thread asked for why- and how-questions, while
    other people have asked about statistical information on specific phrase
    groups. I would like to know whether there are requirements common to most
    or many researchers.

    5. After I process the question archive and make it public, I am
    thinking of updating the public question corpus from time to time. This
    will take more effort, and I am not sure I have enough energy for it. I
    hope someone is willing to join me.

    I look forward to your input. If you are willing to help build the
    corpus in particular, I would be happy to work with you.

    Many thanks.

    Zhiping

    On Wed, 20 Nov 2002, ZHIPING ZHENG wrote:

    >
    > Dear Tian-Zuo and others,
    >
    > I have a big corpus which contains over 40K unique questions collected
    > from real world users by my AnswerBus Question Answering System
    > (http://www.answerbus.com/). I am willing to do some research based on the
    > data together with other people who have the same interest.
    >
    > Zhiping
    >
    >
    > On Wed, 20 Nov 2002, tzshen wrote:
    >
    > > Dear all,
    > >
    > > I am working on finding answer patterns
    > > to help automatically answer complex questions, that is, questions that require a complex answer.
    > > I am focusing first on why-questions and how-questions,
    > > so I am eager to find corpora that contain a large number of these two types of questions with their corresponding answers.
    > > Does anyone know where I can find such corpora or related resources?
    > > Resources on other complex questions and answers beyond why- and how-questions are also welcome.
    > >
    > > THANK YOU ALL VERY MUCH.
    > >
    > > Tian-Zuo, Shen
    > >
    > >
    > >
    >
    >
    >
    >


