Wanted: texts with macrons to teach a "Macronizer" program

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Wanted: texts with macrons to teach a "Macronizer" program

Post by Phil- »

Hi all,

I've written a script for adding macrons to Latin texts. It's not as straightforward as that, of course, but it should be helpful for unambiguous forms at least. More info and download: http://fps-vogel.github.io/tools/ (under "the Macronizer"; if any links are missing, let me know, as I put up the site just now).

The problem is that I've been unsuccessful in finding texts that already have macrons, which I need in order to teach the program the word forms. I knew that macronized texts in plain-text format are basically nonexistent online, but I assumed I could OCR texts from PDF files. It turns out, though, that OCR software can't handle Latin text with macrons (except possibly FineReader, which is expensive).
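To make the idea concrete, here is a minimal sketch of "teaching" word forms from a macronized text and then marking only the unambiguous ones. It is an illustration under assumptions, not the actual Macronizer code: the function names (`strip_macrons`, `learn`, `macronize`) are hypothetical, and "unambiguous" here only means that the training text contained a single macronized form for a given spelling.

```python
import re
import unicodedata
from collections import defaultdict

MACRON = "\u0304"  # combining macron
WORD = re.compile(r"[^\W\d_]+")  # runs of letters only


def strip_macrons(word):
    """Remove macrons from a word and leave everything else untouched."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def learn(macronized_text):
    """Map each plain lowercase spelling to the macronized forms seen for it."""
    forms = defaultdict(set)
    text = unicodedata.normalize("NFC", macronized_text.lower())
    for word in WORD.findall(text):
        forms[strip_macrons(word)].add(word)
    return forms


def macronize(text, forms):
    """Add macrons where exactly one form is known; leave other words alone."""
    def replace(match):
        word = match.group(0)
        candidates = forms.get(word.lower(), ())
        if len(candidates) != 1:
            return word  # unknown or ambiguous: do not guess
        marked = next(iter(candidates))
        # Only an initial capital is restored in this sketch.
        return marked.capitalize() if word[0].isupper() else marked

    return WORD.sub(replace, unicodedata.normalize("NFC", text))


forms = learn("Rōma in Italiā est. Mālum edit. Malum est.")
print(macronize("Roma in Italia est; malum.", forms))
```

In this toy run "malum" is left unmarked because both "mālum" and "malum" were seen, while "Italia" is marked as "Italiā" only because the training text happened to contain just the ablative. A real word list would need far more text before a single attested form could be trusted.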

Does anyone know of such texts online in text (not PDF) format? If I don't find anything, I could simply teach the program the forms as I format texts by hand... but lots of manual formatting is what I'm trying to avoid in the first place.

Thanks.

pmda
Textkit Zealot
Posts: 1341
Joined: Tue Apr 27, 2010 5:15 am

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by pmda »

Take a look at this: http://dcc.dickinson.edu/

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

Thanks, pmda. That's exactly what I'm looking for, and what's more, some of the texts even have audio.

OCR-ing macrons is possible after all: http://fps-vogel.github.io/2014/10/28/t ... d-macrons/. Not without the usual correction of OCR errors, of course, but I have several scanned LLPSI volumes that I think would be worth working on.

EDIT: If anyone knows of good-quality scans of public-domain texts with accurate marking of vowel length, I would consider working on those as well. I'd actually prefer those, since I could freely share them afterwards.

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

I just re-discovered Laura Gibbs' round-up of online texts with macrons, which I mistakenly thought listed only PDFs: http://bestlatin.blogspot.com/2011/04/s ... s-for.html.

I decided not to work on LLPSI, since it will require just as much correction as any OCR-ed text, and in the end I wouldn't be able to share it. But I'm thinking of working on Puer Romanus instead, even though it's shorter and would require more correction since the PDF (on Archive) is not as good.

pmda
Textkit Zealot
Posts: 1341
Joined: Tue Apr 27, 2010 5:15 am

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by pmda »

Phil, I've sent you a DM. Chris Francese of Dickinson College sent me a spreadsheet of Henry Frieze’s Vergilian dictionary, whose macrons were OCR'd and hand-corrected by Derek Frymark (column D).

He adds that Derek is also working on OCR macron correction for the first six Books of the Aeneid, based on Knapp’s edition. He may have some preliminary versions to share with you already.

You need to email francese@dickinson.edu...

Paul

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

pmda wrote: Phil, I've sent you a DM. Chris Francese of Dickinson College sent me a spreadsheet of Henry Frieze’s Vergilian dictionary, whose macrons were OCR'd and hand-corrected by Derek Frymark (column D).

He adds that Derek is also working on OCR macron correction for the first six Books of the Aeneid, based on Knapp’s edition. He may have some preliminary versions to share with you already.

You need to email francese@dickinson.edu...
Thanks, Paul. Did you mean that you sent me a PM? If so, it seems I didn't receive it. I'll get in touch with Dr. Francese for sure. I had never heard of Dickinson College. It looks like he's doing good work over there--that series of online texts is great.

pmda
Textkit Zealot
Posts: 1341
Joined: Tue Apr 27, 2010 5:15 am

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by pmda »

Hi

I meant Direct Message. If you want to send me your email address by DM, I will forward the spreadsheet of Virgil vocabulary with macrons that he sent me.

Paul

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

Got it. I sent you a message. Thanks much!

-Felipe

pmda
Textkit Zealot
Posts: 1341
Joined: Tue Apr 27, 2010 5:15 am

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by pmda »

Funny, I didn't get a DM from you. Send it to me at (deleted)

metrodorus
Textkit Fan
Posts: 339
Joined: Sun Jun 03, 2007 7:19 pm
Location: London

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by metrodorus »

This is a great idea, but a warning: you need to be very careful with any macronised text you find. Unfortunately, they are all rife with errors, and I have seen and read just about every one currently scanned by Google etc.

There is one massive source of error: the confusion between a mark for syllables long by position (marked as such in the Gradus ad Parnassum for metrical purposes) and a macron that marks pronounced vowel length, i.e. syllables long by nature. Add to that the disputes that reign over words with hidden quantity.

Any macronised text you use to 'teach' a macroniser program will need to be gone over and checked meticulously, a word at a time; otherwise you will get a classic case of 'garbage in, garbage out'.

To add to the confusion, almost every macronised text I have seen is internally inconsistent in its application of macrons.

Evan der Millner
I run http://latinum.org.uk which provides the Adler Audio Latin Course, other audio materials, and additional free materials on YouTube.

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

Thanks, Evan, for the words of caution. I've seen these unfortunate complications for myself in the past few weeks as I've been gathering these texts. I'm working on ways to counteract them, mainly:

(1) a script that helps in applying hidden macrons to a text, using lots of find-and-replace strings, e.g. "stell" -> "stēll" for "stēlla" (but these would have to be checked afterwards, since "pactus", for example, can have a long "a" or not, depending on which verb it is derived from), and

(2) detailed records produced when the program "learns" words from a text: all new words similar to known words but with different vowel lengths (e.g. "comīs" and "cōmis") will be listed so that the user can check and if necessary correct them. (Alternate spellings complicate things, but at least the simple differences such as intervocalic i/j and "cu"/"quu" can be handled automatically.)
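Both ideas can be sketched roughly as follows. This is only an illustration under assumptions, not the real scripts: the rule table holds just the "stell" example, every function name is hypothetical, the replacement is case-sensitive, and the spelling normalization replaces every "j" (not only intervocalic ones) for simplicity.

```python
import re
import unicodedata
from collections import defaultdict

MACRON = "\u0304"  # combining macron
WORD = re.compile(r"[^\W\d_]+")  # runs of letters only

# Hypothetical rule table; a real one would hold many more pairs.
HIDDEN_QUANTITY_RULES = [("stell", "stēll")]


def apply_hidden_macrons(text, rules=HIDDEN_QUANTITY_RULES):
    """Apply the find-and-replace rules and log each change for review."""
    changed = []

    def fix(match):
        word = match.group(0)
        new = word
        for plain, marked in rules:
            new = new.replace(plain, marked)
        if new != word:
            changed.append((word, new))  # to be checked by hand afterwards
        return new

    return WORD.sub(fix, text), changed


def strip_macrons(word):
    """Remove macrons from a word and leave everything else untouched."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def normalize_spelling(word):
    """Fold simple spelling variants (j/i, quu/cu) but keep the macrons."""
    form = unicodedata.normalize("NFC", word.lower())
    return form.replace("j", "i").replace("quu", "cu")


def length_conflicts(known_words, new_words):
    """List new words that match a known word except for vowel length."""
    known_by_key = defaultdict(set)
    for word in known_words:
        form = normalize_spelling(word)
        known_by_key[strip_macrons(form)].add(form)

    report = {}
    for word in new_words:
        form = normalize_spelling(word)
        rivals = known_by_key.get(strip_macrons(form), set()) - {form}
        if rivals:
            report[word] = sorted(rivals)
    return report


print(apply_hidden_macrons("stella et stellae"))
print(length_conflicts(["cōmis", "maior"], ["comīs", "major", "equus"]))
```

Here "comīs" is reported against the known "cōmis", while "major" is not reported, because after spelling normalization it is identical to the known "maior".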

I didn't know what I was getting myself into, but I hope the end result will be time-saving at least.

pmda
Textkit Zealot
Posts: 1341
Joined: Tue Apr 27, 2010 5:15 am

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by pmda »

This may already be well known but I recently came across it:

http://vergil.classics.upenn.edu/vergil ... ument_id/1

It has a tool for showing macrons.

Phil-
Textkit Neophyte
Posts: 85
Joined: Tue May 27, 2014 12:44 am
Location: USA

Re: Wanted: texts with macrons to teach a "Macronizer" progr

Post by Phil- »

Thanks, pmda. Yes, that is a good resource for the Aeneid.

Shortly after I was working on this tool, Johan Winge made his own macronizer, which works beautifully: http://stp.lingfil.uu.se/~winge/macronizer/index.py. After I saw it, there was no reason to continue work on mine.
