Notebook for Lemmatization¶

Setup¶

  • Check your Python version and install the CLTK library.
In [ ]:
!python --version
!which python
In [ ]:
%pip install cltk
In [ ]:
%pip install --upgrade jupyter ipywidgets
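
After installing, a quick sanity check (a minimal sketch using only the standard library's importlib.metadata, available since Python 3.8) confirms which CLTK version the kernel sees:
In [ ]:
from importlib.metadata import version, PackageNotFoundError

# Confirm which CLTK version is active in this kernel
try:
    print("cltk version:", version("cltk"))
except PackageNotFoundError:
    print("cltk is not installed in this environment")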

Processing a Single Chapter¶

  1. Familiarize yourself with the document you want to lemmatize.

  2. Read the document carefully.

  3. Document analysis:

    • List the first 80 characters.
    • Count the total number of characters (letters, spaces, punctuation).
    • Tokenize the document to count the number of words.
In [ ]:
with open("texts/C_II_all.txt", encoding="utf-8") as f:
    Chapter2_full = f.read()
In [ ]:
snippet = Chapter2_full[:80]
chars   = len(Chapter2_full)
tokens  = len(Chapter2_full.split())

print("First 80 characters:", snippet)
print("Character count:", chars)
print("Approximate token count:", tokens)

Running the Lemmatization Pipeline¶

  1. Import the NLP module.
  2. Apply lemmatization to the document.
  3. Print the first 20 lemmatized words as a quick test.
In [ ]:
from cltk import NLP
In [ ]:
cltk_nlp = NLP(language="lat")
In [ ]:
%time cltk_doc = cltk_nlp.analyze(text=Chapter2_full)
In [ ]:
# List of lemmas
print(cltk_doc.lemmata[:20])
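
Each entry in cltk_doc.words also keeps the surface form next to its lemma, which makes a quick spot-check easy (attribute names as in CLTK 1.x: string is the token, upos the universal POS tag):
In [ ]:
# Compare surface forms with their lemmas for the first ten tokens
for w in cltk_doc.words[:10]:
    print(f"{w.string:<15} → {w.lemma:<15} ({w.upos})")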

Formatting the Document for Data Visualization¶

  • Create a .txt file with one word per line (for archival purposes).
  • Create a .txt file containing the fully lemmatized chapter as continuous text, for visualization (e.g., co-occurrence, topic modeling).
In [ ]:
# Assuming cltk_doc.lemmata contains the lemmata data
lemmata = cltk_doc.lemmata

# Specify the file path where you want to save the result
file_path = "C_II_lem_col.txt"

# Open the file in write mode ('w'), this will overwrite any existing file with the same name
with open(file_path, 'w') as file:
    # Write each lemma to the file
    for lemma in lemmata:
        file.write(f"{lemma}\n")  # Each lemma on a new line

print(f"Results saved to {file_path}")

Reconstitute the text for co-occurrence analysis

In [ ]:
# Assuming cltk_doc.lemmata contains the lemmata data
lemmata = cltk_doc.lemmata

# Specify the file path where you want to save the result
file_path = "C_II_lem.txt"

# Open the file in write mode ('w'), this will overwrite any existing file with the same name
with open(file_path, 'w') as file:
    # Join the lemmata with spaces between them and write to the file
    file.write(" ".join(lemmata))  # All words in one line, separated by spaces

print(f"Results saved to {file_path}")

Processing a Batch of Documents¶

  1. Work with a large dataset using the glob function.

    • Example path: "texts/*/*.txt"
  2. Create the necessary folders and subfolders.

  3. Run the NLP pipeline.

In [1]:
import glob
import os
from cltk import NLP

# Initialize CLTK only once
cltk_nlp = NLP(language="lat")

# Input folder pattern: all .txt files inside subfolders, e.g. "texts/*/*.txt", or a single folder, e.g. "texts/Hyperius/*.txt"
INPUT_PATTERN = "texts/Unbekannt/*.txt"

# Output folder
OUTPUT_DIR = "lemmatized_outputs"
os.makedirs(OUTPUT_DIR, exist_ok=True)

print("Setup complete.")
𐤀 CLTK version '1.4.0'. When using the CLTK in research, please cite: https://aclanthology.org/2021.acl-demo.3/

Pipeline for language 'Latin' (ISO: 'lat'): `LatinNormalizeProcess`, `LatinStanzaProcess`, `LatinEmbeddingsProcess`, `StopsProcess`, `LatinLexiconProcess`.

⸖ ``LatinStanzaProcess`` using Stanza model from the Stanford NLP Group: https://stanfordnlp.github.io/stanza/ . Please cite: https://arxiv.org/abs/2003.07082
⸖ ``LatinEmbeddingsProcess`` using word2vec model by University of Oslo from http://vectors.nlpl.eu/ . Please cite: https://aclanthology.org/W17-0237/
⸖ ``LatinLexiconProcess`` using Lewis's *An Elementary Latin Dictionary* (1890).

⸎ To suppress these messages, instantiate ``NLP()`` with ``suppress_banner=True``.
Setup complete.
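
If the corpus later gains deeper subfolders, glob also supports recursive patterns; a short sketch (standard library only):
In [ ]:
# "**" with recursive=True matches .txt files at any depth under texts/
all_txt = glob.glob("texts/**/*.txt", recursive=True)
print(f"{len(all_txt)} .txt files found under texts/")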

Verify Documents to be Lemmatized¶

  • Display the list of documents to be processed.
In [2]:
files = glob.glob(INPUT_PATTERN)

print(f"Found {len(files)} text files:")
for f in files:
    print(" -", f)
Found 12 text files:
 - texts/Unbekannt/C_II_v2_cl.txt
 - texts/Unbekannt/C_II_v8_cl.txt
 - texts/Unbekannt/C_II_v6-7_cl.txt
 - texts/Unbekannt/C_II_v1_cl.txt
 - texts/Unbekannt/C_II_v13-14_cl.txt
 - texts/Unbekannt/C_II_cl.txt
 - texts/Unbekannt/C_II_v5-6_cl.txt
 - texts/Unbekannt/C_II_v11-12_cl.txt
 - texts/Unbekannt/C_II_v15_cl.txt
 - texts/Unbekannt/C_II_v9_cl.txt
 - texts/Unbekannt/C_II_v10_cl.txt
 - texts/Unbekannt/C_II_v3-4_cl.txt

Lemmatizing a Dataset¶

  1. Create a loop to process multiple documents.

  2. Choose a consistent naming convention for output files:

    filename = "FolderName_FileName.txt"
    
  3. Run the NLP processing line for each document.

  4. Save the results in a designated folder.

In [15]:
# Loop over every input file

for file_path in files:
    print("\nProcessing:", file_path)

    # -------------------------------------------------------------------
    # Extract folder name + filename
    # -------------------------------------------------------------------
    folder = os.path.basename(os.path.dirname(file_path))     # e.g. "Bullinger"
    filename = os.path.basename(file_path)                    # e.g. "C_II_v5-7_cl.txt"

    # Clean base name: remove extension + replace hyphens
    base = os.path.splitext(filename)[0].replace("-", "_")

    # Output name format: Folder_Filename_lem.txt
    output_name = f"{folder}_{base}_lem.txt"
    output_path = os.path.join(OUTPUT_DIR, output_name)

    # -------------------------------------------------------------------
    # Read the file
    # -------------------------------------------------------------------
    with open(file_path, "r", encoding="utf-8") as f:
        text = f.read()

    # -------------------------------------------------------------------
    # CLTK NLP
    # -------------------------------------------------------------------
    cltk_doc = cltk_nlp.analyze(text=text)
    lemmas = [w.lemma for w in cltk_doc.words]

    # -------------------------------------------------------------------
    # Save output
    # -------------------------------------------------------------------
    with open(output_path, "w", encoding="utf-8") as out:
        out.write("\n".join(lemmas))

    print("Saved →", output_path)

print("\nAll files processed!")
Processing: texts/Unbekannt/C_II_v2_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v2_cl_lem.txt

Processing: texts/Unbekannt/C_II_v8_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v8_cl_lem.txt

Processing: texts/Unbekannt/C_II_v6-7_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v6_7_cl_lem.txt

Processing: texts/Unbekannt/C_II_v1_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v1_cl_lem.txt

Processing: texts/Unbekannt/C_II_v13-14_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v13_14_cl_lem.txt

Processing: texts/Unbekannt/C_II_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_cl_lem.txt

Processing: texts/Unbekannt/C_II_v5-6_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v5_6_cl_lem.txt

Processing: texts/Unbekannt/C_II_v11-12_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v11_12_cl_lem.txt

Processing: texts/Unbekannt/C_II_v15_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v15_cl_lem.txt

Processing: texts/Unbekannt/C_II_v9_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v9_cl_lem.txt

Processing: texts/Unbekannt/C_II_v10_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v10_cl_lem.txt

Processing: texts/Unbekannt/C_II_v3-4_cl.txt
Saved → lemmatized_outputs/Unbekannt_C_II_v3_4_cl_lem.txt

All files processed!
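
Because the Stanza-based analysis is slow, rerunning the cell after an interruption re-processes every file. A small resume guard (a sketch reusing the naming convention above) skips inputs whose output already exists:
In [ ]:
import os

# Sketch: skip inputs that already have a lemmatized output
for file_path in files:
    folder = os.path.basename(os.path.dirname(file_path))
    base = os.path.splitext(os.path.basename(file_path))[0].replace("-", "_")
    output_path = os.path.join(OUTPUT_DIR, f"{folder}_{base}_lem.txt")
    if os.path.exists(output_path):
        print("Already done, skipping:", file_path)
        continue
    # ...otherwise run cltk_nlp.analyze() and save, as in the cell above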

List Processed Documents¶

  • Keep track of the documents that have been successfully lemmatized.
In [3]:
print("Generated files:")
for f in sorted(os.listdir(OUTPUT_DIR)):
    print(" -", f)
Generated files:
 - Aretius_C_II_cl_lem.txt
 - Aretius_C_II_v10_cl_lem.txt
 - Aretius_C_II_v11_cl_lem.txt
 - Aretius_C_II_v12_cl_lem.txt
 - Aretius_C_II_v13_cl_lem.txt
 - Aretius_C_II_v14_cl_lem.txt
 - Aretius_C_II_v15_cl_lem.txt
 - Aretius_C_II_v1_cl_lem.txt
 - Aretius_C_II_v1b_cl_lem.txt
 - Aretius_C_II_v2_cl_lem.txt
 - Aretius_C_II_v2b_cl_lem.txt
 - Aretius_C_II_v3_cl_lem.txt
 - Aretius_C_II_v6_cl_lem.txt
 - Aretius_C_II_v6b_cl_lem.txt
 - Aretius_C_II_v7_cl_lem.txt
 - Aretius_C_II_v8_cl_lem.txt
 - Aretius_C_II_v9_cl_lem.txt
 - Bugenhagen_C_II_cl_lem.txt
 - Bugenhagen_C_II_v11_cl_lem.txt
 - Bugenhagen_C_II_v1_cl_lem.txt
 - Bugenhagen_C_II_v4_cl_lem.txt
 - Bugenhagen_C_II_v5_cl_lem.txt
 - Bugenhagen_C_II_v6_cl_lem.txt
 - Bugenhagen_C_II_v8_cl_lem.txt
 - Bugenhagen_C_II_v8b_cl_lem.txt
 - Bullinger_C_II_cl_lem.txt
 - Bullinger_C_II_v11_15_cl_lem.txt
 - Bullinger_C_II_v15ep_cl_lem.txt
 - Bullinger_C_II_v15epb_cl_lem.txt
 - Bullinger_C_II_v1_2_cl_lem.txt
 - Bullinger_C_II_v1_cl_lem.txt
 - Bullinger_C_II_v3_4_cl_lem.txt
 - Bullinger_C_II_v5_7_cl_lem.txt
 - Bullinger_C_II_v8_cl_lem.txt
 - Bullinger_C_II_v9_10_cl_lem.txt
 - Cajetan_C_II_cl_lem.txt
 - Calvin_C_II_v11_15_cl_lem.txt
 - Calvin_C_II_v1_2_cl_lem.txt
 - Calvin_C_II_v2_4_cl_lem.txt
 - Calvin_C_II_v5_7_cl_lem.txt
 - Calvin_C_II_v8_10_cl_lem.txt
 - Hyperius_C_II_cl_lem.txt
 - Hyperius_C_II_v11_12_cl_lem.txt
 - Hyperius_C_II_v13_14_cl_lem.txt
 - Hyperius_C_II_v15_cl_lem.txt
 - Hyperius_C_II_v15ep_cl_lem.txt
 - Hyperius_C_II_v1_2_cl_lem.txt
 - Hyperius_C_II_v3_4_cl_lem.txt
 - Hyperius_C_II_v5_6_cl_lem.txt
 - Hyperius_C_II_v7_cl_lem.txt
 - Hyperius_C_II_v8_10_cl_lem.txt
 - Hyperius_C_II_v8_cl_lem.txt
 - Lambertus_C_II_cl_lem.txt
 - Lambertus_C_II_v10_cl_lem.txt
 - Lambertus_C_II_v11_cl_lem.txt
 - Lambertus_C_II_v12_cl_lem.txt
 - Lambertus_C_II_v12b_cl_lem.txt
 - Lambertus_C_II_v13_cl_lem.txt
 - Lambertus_C_II_v14_cl_lem.txt
 - Lambertus_C_II_v15_cl_lem.txt
 - Lambertus_C_II_v1_cl_lem.txt
 - Lambertus_C_II_v2_cl_lem.txt
 - Lambertus_C_II_v3_cl_lem.txt
 - Lambertus_C_II_v4_cl_lem.txt
 - Lambertus_C_II_v5_cl_lem.txt
 - Lambertus_C_II_v6_cl_lem.txt
 - Lambertus_C_II_v7_cl_lem.txt
 - Lambertus_C_II_v8_cl_lem.txt
 - Lambertus_C_II_v9_cl_lem.txt
 - Lefevre_C_II_cl_lem.txt
 - Pellicanus_C_II_v11_12_cl_lem.txt
 - Pellicanus_C_II_v13_14_cl_lem.txt
 - Pellicanus_C_II_v15_cl_lem.txt
 - Pellicanus_C_II_v1_2_cl_lem.txt
 - Pellicanus_C_II_v2_cl_lem.txt
 - Pellicanus_C_II_v3_4_cl_lem.txt
 - Pellicanus_C_II_v5_6_cl_lem.txt
 - Pellicanus_C_II_v6_7_cl_lem.txt
 - Pellicanus_C_II_v8_cl_lem.txt
 - Pellicanus_C_II_v9_10_cl_lem.txt
 - Unbekannt_C_II_cl_lem.txt
 - Unbekannt_C_II_v10_cl_lem.txt
 - Unbekannt_C_II_v11_12_cl_lem.txt
 - Unbekannt_C_II_v13_14_cl_lem.txt
 - Unbekannt_C_II_v15_cl_lem.txt
 - Unbekannt_C_II_v1_cl_lem.txt
 - Unbekannt_C_II_v2_cl_lem.txt
 - Unbekannt_C_II_v3_4_cl_lem.txt
 - Unbekannt_C_II_v5_6_cl_lem.txt
 - Unbekannt_C_II_v6_7_cl_lem.txt
 - Unbekannt_C_II_v8_cl_lem.txt
 - Unbekannt_C_II_v9_cl_lem.txt
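
The flat listing gets long; grouping by author prefix summarizes it (a sketch reusing OUTPUT_DIR from the setup cell):
In [ ]:
import os
from collections import Counter

# Count output files per author prefix
by_author = Counter(f.split("_")[0] for f in os.listdir(OUTPUT_DIR))
for author, n in sorted(by_author.items()):
    print(f"{author:<12} {n} files")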

Formatting Documents for Data Visualization¶

  • Create a .txt file with one word per line (for archives).
  • Create a .txt file containing each verse and its commentary as fully lemmatized text (for co-occurrence and topic-modeling visualization).
In [4]:
# Directory with lemma files
LEMMA_DIR = OUTPUT_DIR

# Directory for continuous text output
CONTINUOUS_DIR = "lemmatized"
os.makedirs(CONTINUOUS_DIR, exist_ok=True)

# Loop over all lemma files
for filename in os.listdir(LEMMA_DIR):
    if filename.endswith("_lem.txt"):
        input_path = os.path.join(LEMMA_DIR, filename)
        output_path = os.path.join(CONTINUOUS_DIR, filename)

        # Read lemma file
        with open(input_path, "r", encoding="utf-8") as f:
            lemmas = f.read().splitlines()

        # Join lemmas into continuous text
        continuous_text = " ".join(lemmas)

        # Save
        with open(output_path, "w", encoding="utf-8") as out:
            out.write(continuous_text)

        print("Converted →", output_path)

print("\nAll lemma converted to continuous text!")
Converted → lemmatized/Aretius_C_II_v14_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v4_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v5_7_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v3_4_cl_lem.txt
Converted → lemmatized/Calvin_C_II_v5_7_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v10_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v8_10_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v5_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v3_4_cl_lem.txt
Converted → lemmatized/Calvin_C_II_v2_4_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v2b_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v14_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v4_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v8_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v1_cl_lem.txt
Converted → lemmatized/Lefevre_C_II_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v13_cl_lem.txt
Converted → lemmatized/Calvin_C_II_v11_15_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v5_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v10_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v15epb_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v2_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v6b_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v5_6_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v8_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v7_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v3_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v1_2_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v1_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v2_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v1_2_cl_lem.txt
Converted → lemmatized/Calvin_C_II_v8_10_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v13_14_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v5_6_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v1b_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v1_2_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v3_4_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v13_14_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v6_7_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v6_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v15_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_cl_lem.txt
Converted → lemmatized/Cajetan_C_II_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v1_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v2_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v9_10_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v9_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v10_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v3_4_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v8_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v8_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v8b_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v15_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v8_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v11_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v11_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v3_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v6_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v15_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v11_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v7_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v8_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v5_6_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v9_10_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v15ep_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v12_cl_lem.txt
Converted → lemmatized/Calvin_C_II_v1_2_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v6_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v9_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_v1_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v12_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v2_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v15_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v7_cl_lem.txt
Converted → lemmatized/Bugenhagen_C_II_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v15ep_cl_lem.txt
Converted → lemmatized/Aretius_C_II_cl_lem.txt
Converted → lemmatized/Aretius_C_II_v13_cl_lem.txt
Converted → lemmatized/Hyperius_C_II_v11_12_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v15_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_cl_lem.txt
Converted → lemmatized/Lambertus_C_II_v12b_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v9_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v1_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v6_7_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v8_cl_lem.txt
Converted → lemmatized/Bullinger_C_II_v11_15_cl_lem.txt
Converted → lemmatized/Unbekannt_C_II_v11_12_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v11_12_cl_lem.txt
Converted → lemmatized/Pellicanus_C_II_v13_14_cl_lem.txt

All lemma files converted to continuous text!
  • Reconstitute each chapter by combining all .txt files that share the same author prefix, in natural alphanumeric order.
In [9]:
import os
from natsort import natsorted   # pip install natsort

# Folder containing your .txt files
folder = "lemmatized"

# Get all .txt files in folder
files = [f for f in os.listdir(folder) if f.endswith(".txt")]

# Extract prefix before first underscore
# Example: Aretius_C_II_v1 → prefix = "Aretius"
prefixes = {}
for f in files:
    prefix = f.split("_")[0]
    prefixes.setdefault(prefix, []).append(f)

# Process each prefix group
for prefix, grouped_files in prefixes.items():
    # Sort alphanumerically
    sorted_files = natsorted(grouped_files)

    # Output filename
    outfile = f"{prefix}_all_C_II_cl_lem.txt"

    print(f"Creating {outfile} from {len(sorted_files)} files...")

    with open(outfile, "w", encoding="utf-8") as out:
        for fname in sorted_files:
            path = os.path.join(folder, fname)
            with open(path, "r", encoding="utf-8") as infile:
                out.write(infile.read())
                out.write("\n")   # optional separator

print("Done.")
Creating Aretius_all_C_II_cl_lem.txt from 17 files...
Creating Lambertus_all_C_II_cl_lem.txt from 17 files...
Creating Bullinger_all_C_II_cl_lem.txt from 10 files...
Creating Hyperius_all_C_II_cl_lem.txt from 11 files...
Creating Calvin_all_C_II_cl_lem.txt from 5 files...
Creating Unbekannt_all_C_II_cl_lem.txt from 12 files...
Creating Bugenhagen_all_C_II_cl_lem.txt from 8 files...
Creating Pellicanus_all_C_II_cl_lem.txt from 10 files...
Creating Lefevre_all_C_II_cl_lem.txt from 1 files...
Creating Cajetan_all_C_II_cl_lem.txt from 1 files...
Done.
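
As a final check, counting the lemmata in each combined file gives a rough sense of corpus size per author (a sketch; the combined files were written to the working directory):
In [ ]:
import glob

# Lemma counts per combined per-author file
for fname in sorted(glob.glob("*_all_C_II_cl_lem.txt")):
    with open(fname, encoding="utf-8") as f:
        print(f"{fname}: {len(f.read().split())} lemmata")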

Documentation¶

  • Full documentation is available at CLTK Docs (https://docs.cltk.org/).

Citation¶

When using the CLTK, please cite the following publication:

Johnson, Kyle P., Patrick J. Burns, John Stewart, Todd Cook, Clément Besnier, and William J. B. Mattingly. "The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages." In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, pp. 20-29. 2021. DOI: 10.18653/v1/2021.acl-demo.3

BibTeX entry:

@inproceedings{johnson-etal-2021-classical,
    title = "The {C}lassical {L}anguage {T}oolkit: {A}n {NLP} Framework for Pre-Modern Languages",
    author = "Johnson, Kyle P.  and
      Burns, Patrick J.  and
      Stewart, John  and
      Cook, Todd  and
      Besnier, Cl{\'e}ment  and
      Mattingly, William J. B.",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-demo.3",
    doi = "10.18653/v1/2021.acl-demo.3",
    pages = "20--29",
}