
Compiling CWB from source under Windows

Compilation and installation of CWB version 3.0

The Cygwin port of the CWB is experimental. While the source code compiles without error messages and basic CQP queries work well, it is currently not possible to process large corpora (e.g. in the 100M word range).

The problems we have encountered may be due to limitations in the virtual memory management of Windows and the Cygwin emulation layer. Apparently, a user process is limited to a 2 GB address space in Windows, and Cygwin seems to impose further restrictions. Help from someone with more experience in Windows and Cygwin programming would be highly welcome, so that we can identify the true cause of the problems and try to find a workaround.

In the long term, we hope to offer a native Windows port of the CWB. Please join the CWBdev mailing list if you are interested in working on this port.

To compile and run the CWB tools under Windows, you need the Cygwin environment. A standard Cygwin installation will suffice, plus the following packages (you will find them in the Devel section of the Cygwin setup program):

  • bison
  • flex
  • gcc
  • libncurses-devel
  • make
  • perl

We recommend installing the simple text editor nano (from the Editors section) for editing configuration files, but you can also use your favourite Windows text editor.
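
Before compiling, it can be useful to confirm that the tools from the package list above are actually available in the Cygwin shell. The following is a minimal sketch and not part of the CWB distribution: the function name missing_tools is made up for illustration, and it assumes that each package provides a command of the same name. libncurses-devel is a library rather than a command, so this check does not cover it.

```shell
#!/bin/sh
# Hypothetical helper (not part of CWB): print every command from the
# argument list that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        # "command -v" is the POSIX way to test whether a command exists
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the build tools named in the package list.
missing=$(missing_tools bison flex gcc make perl)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All build tools found."
fi
```

If any tools are reported as missing, re-run the Cygwin setup program and select the corresponding packages.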

Get the source code from here, and unpack it:

tar xf cwb-XXXXXX.tgz

(Replace XXXXXX with the version number of the archive you downloaded; current version: 2.2.b99-RC1.)

Enter the new directory:

cd cwb-XXXXXX

Important note: In principle, you can unpack the CWB source code anywhere you like, but don't put it on a network drive (we've encountered some weird errors there) and make sure that the directory path does not contain blanks (which will happen e.g. if you put the source code on your Desktop). The best solution is probably to keep the source code somewhere in the Cygwin directory tree, e.g. your Cygwin home directory.
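
The warning about blanks in the directory path can be checked from the shell before starting the build. This is only a sketch under simple assumptions: the function name path_has_blank is hypothetical, it tests for space characters only, and it does not detect network drives.

```shell
#!/bin/sh
# Hypothetical helper (not part of CWB): succeed (exit status 0) if the
# given path contains a space character.
path_has_blank() {
    case "$1" in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Run this inside the unpacked cwb-XXXXXX directory.
if path_has_blank "$PWD"; then
    echo "Warning: '$PWD' contains blanks; move the CWB source elsewhere."
fi
```

A path such as your Cygwin home directory normally passes this check, whereas a Windows Desktop folder under "Documents and Settings" would not.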

First you need to set a few parameters in config.mk using your favourite text editor (but not Microsoft Word!). If you have installed the nano package as recommended above, just type the following command:

nano -w config.mk

Otherwise, navigate to the cwb-XXXXXX directory in Windows Explorer and open the file config.mk with a text editor.

In the platform directive (the include line at the end of the following block), insert cygwin:

# 
# PLATFORM-SPECIFIC CONFIGURATION (OS and CPU type)
#
# Pre-defined platform configuration files:
#       unix          standard Unix configuration [must set ENDIAN manually!]
#       linux         i386-Linux (generic)
#         linux-64       - configuration for 64-bit CPUs
#         linux-opteron  - with optimimzation for AMD Opteron processor
#       darwin        Mac OS X / Darwin [use one of the CPU-specific entries below]
#         darwin-g4      - with optimization for PowerPC G4 processor
#         darwin-g5      - with optimization for PowerPC G5 processor
#         darwin-i386    - configuration for i386-compatible processors
#         darwin-64      - 64-bit build on Intel Core2 and newer processors
#         darwin-core2   - optimised build for Core 2 CPU (requires Xcode 3.1)
#       solaris       SUN Solaris 8 for SPARC CPU
#       cygwin        Win32 build using Cygwin emulation layer (experimental)
#
include $(TOP)/config/platform/cygwin

The site directive (again the include line at the end of its block) also has to be changed to cygwin:

#
# SITE-SPECIFIC CONFIGURATION (installation path and other local settings)
#
# Pre-defined site configuration files:
#       standard        standard configuration (installation in /usr/local tree)
#       classic         "classic" configuration (CWB v2.2, uses /corpora/c1/registry)
#       osx-fink        Mac OS X installation in Fink's /sw tree
#       binary-release  Build binary package for release (static if possible, local install in build/ tree)
#         osx-release     ... for Mac OS X
#         linux-release   ... for i386 Linux
#         solaris-release ... for SUN Solaris 2.x
#         linux-rpm       ... build binary RPM package on Linux (together with rpm-linux.spec)
#       cygwin          Win32 / Cygwin configuration (experimental)
#       
include $(TOP)/config/site/cygwin
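
If you prefer not to edit config.mk by hand, the two changes above could be scripted. The sketch below is an illustration only, not a tool shipped with the CWB: the function name set_cwb_config is hypothetical, and it assumes that config.mk contains exactly one uncommented line beginning with "include $(TOP)/config/platform/" and one beginning with "include $(TOP)/config/site/", as in the excerpts shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CWB): point the platform and site
# include lines of a config.mk-style file at the given configurations.
# Usage: set_cwb_config FILE PLATFORM SITE
set_cwb_config() {
    file=$1
    platform=$2
    site=$3
    tmp="$file.tmp.$$"
    # Only lines that start with "include" are rewritten; the "#" comment
    # lines listing the available options are left untouched.
    sed -e "s|^include [\$](TOP)/config/platform/.*|include \$(TOP)/config/platform/$platform|" \
        -e "s|^include [\$](TOP)/config/site/.*|include \$(TOP)/config/site/$site|" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (run inside the cwb-XXXXXX directory, after making a backup):
# cp config.mk config.mk.orig
# set_cwb_config config.mk cygwin cygwin
```

Writing to a temporary file and renaming it avoids relying on the in-place option of any particular sed implementation. It is still worth opening config.mk afterwards to confirm that both include lines read as shown above.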

The easiest way to compile the CWB is to type

make all

at the command line and go fetch a cup of coffee (due to the overhead of the Cygwin emulation layer, compilation is much slower than on Unix systems).

Since the Cygwin port is still experimental, it is probably a good idea to compile each component of the CWB separately. This will make it easier to recognise compilation errors and warnings. First, clean up any old files and check dependencies:

make clean
make depend

Then, compile the editline library used by CQP, which is included in the CWB source code distribution:

make editline

Now compile the corpus library:

make cl

Then the utilities:

make utils

And finally CQP:

make cqp

You may also want to check that the manpages are up to date:

make man
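
The component-by-component build described above can also be driven by a small wrapper that stops at the first failing target and keeps a separate log for each one, which makes errors and warnings easier to locate. This is a sketch rather than part of the CWB build system: the function name build_stepwise and the log file names are made up for illustration, and the MAKE variable is just a convenience for substituting another make command.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of CWB): run the given make targets one
# at a time, write the output of each to build-TARGET.log, and stop at
# the first target that fails.
# MAKE can be set beforehand to use a different command; default is "make".
build_stepwise() {
    for target in "$@"; do
        echo "==> ${MAKE:-make} $target"
        if ! ${MAKE:-make} "$target" > "build-$target.log" 2>&1; then
            echo "FAILED: $target (see build-$target.log)" >&2
            return 1
        fi
    done
}

# Example (run from the top of the CWB source tree):
# build_stepwise clean depend editline cl utils cqp man
```

After a successful run, the log files can be searched for warnings, for example with grep -i warning build-*.log.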

Now we're ready to install the whole toolkit:

make install

If you have set up Cygwin with a separate administrator account, you may need to type sudo make install and enter the administrator password here.

  • install/compile_cygwin.txt
  • Last modified: 2010/12/03 09:30
  • by eros