Redesign of the CQi client-server protocol

The CQi protocol defines a client-server API to CWB/CQP with the following three goals:

  1. provide a uniform API to corpus data and CQP queries, which can easily be accessed from all programming languages (without having to link against C libraries or run slave processes in the background)
  2. allow remote access to a CQP process running on a dedicated server (e.g. from a thin client or Web interface)
  3. enable distributed processing, where multiple (sub)corpora can be queried in parallel by separate CQi server instances (either on the same server or on different nodes in a compute grid)
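
To make the third goal more concrete, here is a minimal sketch of the fan-out-and-merge pattern it implies: the same query is sent to several server instances in parallel, each holding one subcorpus, and the partial results are concatenated. Nothing here is part of CQi or cqpserver. The `FakeServer` class, its `query` method and the `query_all` helper are hypothetical stand-ins for illustration; a real client would talk to remote servers over the network.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeServer:
    """Hypothetical stand-in for one server instance holding one subcorpus."""

    def __init__(self, name, tokens):
        self.name = name
        self.tokens = tokens

    def query(self, word):
        # Return (subcorpus name, token position) for every matching token.
        # A real server would evaluate a CQP query instead of a word match.
        return [(self.name, i) for i, t in enumerate(self.tokens) if t == word]


def query_all(servers, word):
    """Run the same query on every server in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # pool.map yields results in the order of `servers`, so the merged
        # list is grouped by subcorpus in a deterministic order.
        parts = pool.map(lambda s: s.query(word), servers)
        return [hit for part in parts for hit in part]


servers = [
    FakeServer("part_a", ["the", "cat", "sat"]),
    FakeServer("part_b", ["on", "the", "mat", "the"]),
]
hits = query_all(servers, "the")
print(hits)
```

Whether the merge step belongs in the client or in a coordinating server is a design decision this sketch does not address.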

The official CWB distribution currently includes the CQi 1.0 protocol specification, a stand-alone CQi server (cqpserver), and a reference client implementation for Perl. CWB users have implemented similar clients for Java, Lisp and Python with varying success.

The CQi protocol was designed in 1999 within a few weeks, and without expert knowledge of client-server programming. Due to its various shortcomings, it has never found widespread use. However, there is now renewed interest in CQi, as its three goals are becoming increasingly important for the future of the CWB. Various parties plan to work on new CWB/CQP APIs (e.g. for R), on distributed processing and virtual corpora, etc.

It seems to be the right time, therefore, to redesign the CQi protocol from the ground up in preparation for new CQi-based software. The new design should be powerful and flexible enough to support features that may be introduced by future CWB releases (e.g. an extension to corpora of more than 2 billion words), it should make use of standard technologies where this is sensible, and it should try to solve performance problems of the current implementation.

developers/cqi_redesign.txt · Last modified: 2011/03/07 11:30 by stefan