Indexing Methods and Database Design

  • There are many possible options for indexing a corpus.

    • Lucene/Elasticsearch

    • CWB

    • Emdros

  • We chose CWB; the deciding factor was, once again, linguistic.

"The IMS Open Corpus Workbench (CWB) is a collection of open-source tools for managing and querying large text corpora (ranging from 10 million to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP."
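To make the idea of querying a corpus "with linguistic annotations" concrete, here is a toy sketch in Python. It is not how CWB is implemented; the data, the attribute names (`word`, `pos`, `lemma`), and the helpers `build_index` and `find_sequence` are all invented for illustration. It stores one inverted index per token-level attribute and matches a sequence of per-token constraints, roughly what a CQP query such as `[pos="DT"] [lemma="cat"]` expresses.

```python
from collections import defaultdict

# Toy corpus (hypothetical data): one (word, pos, lemma) triple per token position.
TOKENS = [
    ("The", "DT", "the"),
    ("cats", "NNS", "cat"),
    ("sat", "VBD", "sit"),
    ("on", "IN", "on"),
    ("the", "DT", "the"),
    ("mat", "NN", "mat"),
]
ATTRIBUTES = ("word", "pos", "lemma")


def build_index(tokens):
    """Map each attribute name to {value: [corpus positions]}."""
    index = {name: defaultdict(list) for name in ATTRIBUTES}
    for position, token in enumerate(tokens):
        for name, value in zip(ATTRIBUTES, token):
            index[name][value].append(position)
    return index


def find_sequence(index, constraints):
    """Return start positions where consecutive tokens satisfy the constraints.

    `constraints` is a list of (attribute, value) pairs, one per token.
    """
    first_attr, first_value = constraints[0]
    hits = []
    # Use the first constraint to pick candidate start positions,
    # then check every constraint against the following tokens.
    for start in index[first_attr].get(first_value, []):
        if all(
            start + offset in index[attr].get(value, [])
            for offset, (attr, value) in enumerate(constraints)
        ):
            hits.append(start)
    return hits


INDEX = build_index(TOKENS)
# A determiner immediately followed by a token whose lemma is "cat".
MATCHES = find_sequence(INDEX, [("pos", "DT"), ("lemma", "cat")])
```

The point of the sketch is that each token carries several parallel annotations and a query can mix them position by position. That is the kind of linguistic query a plain full-text index is not primarily designed around.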
