What You Will Find Here
Datasets of linguistic interest are provided in delimited text and spreadsheet formats. One goal for these disparate datasets is to bring them together in a common RDF representation.
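As a rough illustration of what bringing a delimited file into RDF could look like, the sketch below turns each row into N-Triples statements using only the Python standard library. The `http://example.org/` namespace, the `item/` and `prop/` path segments, the column names, and the `rows_to_ntriples` helper are all hypothetical; the vocabulary the archive will actually adopt has not been decided here.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real conversion would use the project's own IRIs.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape the characters that N-Triples forbids inside a quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(text, id_column, delimiter=","):
    """Convert delimited text with a header row into a list of N-Triples lines.

    Each row becomes one subject, identified by the value in id_column;
    every other non-empty column becomes a plain string literal.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    triples = []
    for row in reader:
        key = row.get(id_column)
        if not key:
            continue  # skip rows without an identifier
        subject = f"<{BASE}item/{quote(key, safe='')}>"
        for column, value in row.items():
            # column is None for surplus fields; value is None for missing ones.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}prop/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = "id,word,gloss\n1,aqua,water\n"
    for line in rows_to_ntriples(sample, "id"):
        print(line)
```

Passing `delimiter="\t"` would handle tab-separated files in the same way. Spreadsheet files would need to be exported to delimited text first, or read with a separate library.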
Several books that are unencumbered by copyright restrictions have been digitized as plain text. The goals for the book collection are to correct errors in the text, apply formatting, and then publish the works in popular e-book formats.
Preliminary. Document artifacts are being collected, rediscovered, reviewed, and sorted from various old hard drives and backup media. Following this collection phase, the repository will most likely be split in two: one repository for datasets and one for e-books. Refinement and documentation of the assets will be ongoing.
While issues with the artifacts abound, specific defects are being tracked in the archive's Issue Tracker. Please feel free to add to it.