Perseus' Java Hopper
The source code and much of the content for Perseus' Java Hopper were first released in November 2007 under open source licenses. In the future, we intend to release a virtual machine image of the hopper that can be run on many platforms. For more information, please contact Rashmi Singhal.
Source Code - The source code can be downloaded from SourceForge.net
Text Files - These are the original XML text files. Download them if you are generating the data for the hopper yourself. Texts are licensed under the Creative Commons NonCommercial ShareAlike 3.0 License.
- Download all texts (425 MB)
- Download individual collections of texts. NOTE: If you download individual collections, place the downloaded directories in /sgml/texts/. Some files may appear in more than one collection.
  - Download Greek and Roman collection texts (80 MB)
  - Download Duke Databank of Documentary Papyri collection texts (36 MB)
  - Download Germanic collections texts (2 MB)
  - Download American History collection texts (208 MB)
  - Download Richmond Times collection texts (85 MB)
  - Download Renaissance collection texts (16 MB)
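Because some files appear in more than one collection, unpacking several collections into the same directory can overwrite files silently. The following is a minimal sketch of one way to handle that, not part of the hopper itself. It assumes the collection downloads are .tar.gz archives (this page only states that format for the data downloads), and the function name extract_collections and its arguments are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_collections(archives, texts_dir):
    """Extract each archive into texts_dir, keeping the first copy of any duplicate file.

    Returns the member names that were skipped because the file already existed.
    """
    texts_dir = Path(texts_dir)
    texts_dir.mkdir(parents=True, exist_ok=True)
    root = texts_dir.resolve()
    skipped = []
    for archive in archives:
        # Assumption: the download is a gzip-compressed tar archive.
        with tarfile.open(archive, "r:gz") as tar:
            for member in tar.getmembers():
                target = texts_dir / member.name
                # Refuse entries that would land outside texts_dir (e.g. "../x").
                if not target.resolve().is_relative_to(root):
                    raise ValueError(f"unsafe path in archive: {member.name}")
                if member.isfile() and target.exists():
                    skipped.append(member.name)
                    continue
                tar.extract(member, path=texts_dir)
    return skipped
```

Calling extract_collections(["greek_roman.tar.gz", "renaissance.tar.gz"], "/sgml/texts") would unpack both archives (the file names here are placeholders) and return the list of duplicated files, which can then be reviewed by hand. The path check uses Path.is_relative_to, so the sketch needs Python 3.9 or later.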
Data - Download these .tar.gz files if you prefer to use the provided database dumps and other generated data.
- Individual MySQL dumps:
  - hib_chunks (159 MB)
  - hib_citations (40 MB)
  - hib_entities (27 MB)
  - hib_entity_occurrences (36 MB)
  - hib_frequencies (298 MB)
  - hib_lang_abbrevs (867 bytes)
  - hib_languages (914 bytes)
  - hib_parses (10 MB)
  - hib_person_names (3 MB)
  - hib_toc_chunks (7 MB)
  - hib_tocs (28 KB)
  - hib_word_counts (39 KB)
  - metadata (299 KB)
  - morph_frequencies (36 MB)
  - prior_frequencies (198 MB)
  - scored_entity (80 MB)
- Download processed XML texts and cache files. These directories go in /sgml/xml/.
- Download Lucene indexes (250 MB). This directory goes in /sgml/reading/.
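Once the individual MySQL dumps have been unpacked, each one has to be loaded into a database. The sketch below shows one possible way to script that step. It assumes each archive unpacks to a .sql file, that the standard mysql command-line client is on the PATH, and that credentials come from a MySQL option file. The function names, the database name, and the user name are hypothetical and are not taken from the hopper's documentation.

```python
import subprocess
from pathlib import Path


def build_import_commands(dump_dir, database, user):
    """Pair each .sql file in dump_dir with the mysql client command that would load it."""
    commands = []
    for sql_file in sorted(Path(dump_dir).glob("*.sql")):
        # Equivalent to running: mysql --user=USER DATABASE < file.sql
        argv = ["mysql", f"--user={user}", database]
        commands.append((sql_file, argv))
    return commands


def run_imports(commands):
    """Feed each dump to the mysql client on stdin, stopping at the first failure."""
    for sql_file, argv in commands:
        with open(sql_file, "rb") as dump:
            subprocess.run(argv, stdin=dump, check=True)
```

Separating the planning step from the execution step makes it possible to print and inspect the commands before anything touches the database. For example, build_import_commands("dumps", "hopper_db", "hopper_user") only lists what would be run, and passing its result to run_imports performs the load.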
Perseus' Art & Archaeology Module
Source Code - The source code can be downloaded from SourceForge.net
Data - Download the data for the Art & Archaeology Module (4 MB)