TLG to Unicode Converter 1.9

tlgu converts an input_file from the Thesaurus Linguae Graecae (TLG) and Packard Humanities Institute (PHI) representation to a Unicode (UTF-8) output_file, which can then be read or searched with standard pattern-matching tools such as grep and awk. The TLG/PHI representation consists of "beta-code" text and citation information. The TLG, PHI and epigraphical corpora include the majority of classical Hellenic and Latin works and inscriptions. Several options are available, including splitting a file into works, removing hyphenation to re-join words so the material can be reformatted for reading, and formatting citation information.
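To give a rough idea of what converting beta-code text to Unicode involves, here is a minimal Python sketch. It is not tlgu's implementation and does not use its tables: the function name beta_to_unicode and both dictionaries are hypothetical, and the sketch assumes only the basic beta-code letters and diacritics, ignoring capitals (the "*" prefix), citation data and the many special codes that tlgu handles.

```python
import unicodedata

# Basic beta-code letters mapped to lowercase Greek (illustrative, not tlgu's tables).
LETTERS = {
    "A": "α", "B": "β", "G": "γ", "D": "δ", "E": "ε", "Z": "ζ",
    "H": "η", "Q": "θ", "I": "ι", "K": "κ", "L": "λ", "M": "μ",
    "N": "ν", "C": "ξ", "O": "ο", "P": "π", "R": "ρ", "S": "σ",
    "T": "τ", "U": "υ", "F": "φ", "X": "χ", "Y": "ψ", "W": "ω",
}

# Beta-code diacritics follow the letter they modify; map them to combining marks.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(text):
    """Convert lowercase-only beta code to precomposed (NFC) Unicode Greek."""
    src = text.upper()
    out = []
    for i, ch in enumerate(src):
        if ch in MARKS:
            out.append(MARKS[ch])
        elif ch == "S":
            # Use final sigma when no letter or diacritic follows.
            nxt = src[i + 1:i + 2]
            is_final = nxt == "" or not (nxt in LETTERS or nxt in MARKS)
            out.append("ς" if is_final else "σ")
        elif ch in LETTERS:
            out.append(LETTERS[ch])
        else:
            out.append(ch)  # spaces, punctuation and unsupported codes pass through
    # Compose base letters and combining marks into single code points.
    return unicodedata.normalize("NFC", "".join(out))


print(beta_to_unicode("MH=NIN A)/EIDE QEA/"))
```

Building the string from combining marks and leaving the composition to NFC keeps the tables small; a real converter would also need the capital-letter, punctuation and formatting codes.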

Tags hellenic latin epigraphical texts unicode search translation linguistic filter
License GNU GPL
State stable

Recent Releases

1.9 30 Sep 2024 11:58 cleanup: tlgu was first released in 2005. Version 1.9 changes: citation handling corrections.
1.8.2 30 May 2020 10:23 cleanup: tlgu was first released in 2005. Version 1.8.2 changes: addition of the -U option, which outputs vowels with an acute accent using code points from the Unicode 0x0370 block (the default is to output code points from the 0x1F00 block). Several special beta code representations were updated.
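The two encodings of acute-accented vowels mentioned for 1.8.2 matter when searching converted text, because the same visible letter can be stored as two different code points. The Python sketch below is not part of tlgu; the helper name same_after_nfc is hypothetical. It shows that alpha with acute from the 0x1F00 block (U+1F71, oxia) and from the 0x0370 block (U+03AC, tonos) compare unequal as raw strings but become identical after Unicode NFC normalization.

```python
import unicodedata

OXIA_ALPHA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA (0x1F00 block)
TONOS_ALPHA = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS (0x0370 block)


def same_after_nfc(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison treats the two encodings as different characters.
print(OXIA_ALPHA == TONOS_ALPHA)
# After normalization both collapse to the 0x0370-block code point.
print(same_after_nfc(OXIA_ALPHA, TONOS_ALPHA))
```

Normalizing both the search pattern and the converted text in the same way is one option for making searches independent of which block the accented vowels were written in.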