Second Workshop on Computation and Written Language (CAWL 2024)

To be held in conjunction with LREC-COLING 2024
Torino, Italy, May 21, 2024

Annual CAWL workshops are organized under the guidance of the newly formed ACL Special Interest Group on Writing Systems and Written Language (SIGWrit).

Important Dates:
Paper submission deadline: February 22, 2024 (anywhere in the world)
Notification of acceptance: March 25, 2024
Camera-ready paper due: April 1, 2024
Workshop date: May 21, 2024

Invited Speaker: Nizar Habash (NYU Abu Dhabi)

CAWL 2024 will feature a special theme for workshop submissions: Writing Systems of Africa. Stay tuned for the first call for papers.

What's the workshop about?
Most work on NLP focuses on language in its canonical written form. This has often led researchers to ignore the differences between written and spoken language or, worse, to conflate the two. Instances of conflation are statements like “Chinese is a logographic language” or “Persian is a right-to-left language”, variants of which can be found frequently in the ACL Anthology. These statements confuse properties of the language with properties of its writing system. Ignoring differences between written and spoken language leads, among other things, to conflating different words that are spelled the same (e.g., English bass), or to treating words that have multiple spellings as different (e.g., Japanese umai ‘tasty’, which can be written 旨い, うまい, ウマい, or 美味い).

Furthermore, methods for dealing with written language issues (e.g., various kinds of normalization or conversion) or for recognizing text input (e.g., OCR and handwriting recognition or text entry methods) are often regarded as precursors to NLP rather than as fundamental parts of the enterprise, despite the fact that most NLP methods rely centrally on representations derived from text rather than (spoken) language. This general lack of consideration of writing has led much of the research on such topics to appear largely outside of ACL venues, in conferences or journals of neighboring fields such as speech technology (e.g., text normalization) or human-computer interaction (e.g., text entry).

This workshop will bring together researchers who are interested in the relationship between written and spoken language, the properties of written language, the ways in which writing systems encode language, and applications specifically focused on characteristics of writing systems. Topics of interest include but are not limited to:
