Digitizing Siddurim (PresenTense 2009)

One of the enduring challenges of the Open Siddur has been acquiring digitized siddur content that is in the public domain (or which is at least distributed with an Open Content copyleft license such as CC BY-SA). Our greatest advance so far has been attaining a digitized Public Domain text of the Leningrad Codex of the TaNaḤ (in XML).

Given that over 50% of the siddur is sourced in the TaNaḤ, and since that text can be referenced by chapter and verse in XML, our digitization efforts for the core content of the siddur can be considered over 50% complete. To obtain the rest from siddurim in the public domain, we need either an excellent Hebrew OCR program or a large team of (hopefully crowdsourced) transcribers. Both methods will require a rigorous quality control process.

However, there’s another, more obvious answer to the digitization challenge: finding siddurim that have already been digitized by others. Since arriving in Jerusalem last Wednesday I’ve already been clued in to three digital siddurim available on the web (in Ashkenazi, Sefardi, and Mizrachi nusḥaot). I am hopeful that their likely owners, the people responsible for digitizing the text at the DAAT project, will be agreeable to contributing them under a permissive license that will allow the Open Siddur to create a derivative XML-encoded text from them.

One might ask why the content of the siddur isn’t free from copyright to begin with. Well, it is in a sense: in the United States, all work published prior to 1923 is considered to be in the Public Domain. But copyright is applied even to transcribed texts, so unless the publisher has consented to their digital transcription having a permissive copyleft license (some rights reserved), it is protected by copyright (all rights reserved). So, projects like the Open Siddur that seek to work creatively with Jewish culture must work cooperatively to liberate the legacy of Jewish culture and tradition from the current restrictive climate determined by intellectual property law.

For those of us interested in working with Jewish texts, the idea of others claiming copyright on our foundational source texts, digitized or not, seems like an absurdity. We enliven the works of our ancestors by studying their teachings, and by meditating on and singing with their prayers. The inspired author or authors of these works gave their work freely to the Jewish people and to the world. All the tradition demands is correct attribution, as is taught in Pirkei Avot 6:6:
התורה נקנית בארבעים ושמונה דברים. ואלו הן: (….)והאומר דבר בשם אומרו. הא למדת כל-האומר דבר בשם אומרו מביא גאלה לעולם, שנאמר “ותאמר אסתר למלך בשם מרדכי”
…the Torah is acquired by means of forty-eight qualities, which are: (….) [and lastly] what the student has heard from others she will quote in the name of him of whom she has heard it. For so you have learned: He who quotes something in the name of the person who said it brings deliverance to the world. For it is said: “And Esther said to the King in the name of Mordechai.” [emphasis mine]

R. Samson Raphael Hirsch comments:

[A student] is careful to absorb and repeat accurately what they have heard from others and will never pass off as their own what others have told them.

