For each sentence in an input stream, display the count of words, of vowels, and of alphanumeric characters. Also display the count of vowels and of alphanumerics for each of as many words as will fit on a line.
The program is a filter: it must read from STDIN, and send output to STDOUT.
The input will consist of zero or more paragraphs. Blank lines can occur anywhere in the input, outside of paragraphs.
A paragraph consists of one or more sentences, separated by sequences of whitespace characters (spaces and/or at most one newline).
A sentence consists of one or more words, separated by sequences of whitespace characters, where the last one ends in a period.
A word contains at least one alphanumeric character (letter or digit), and optionally some punctuation (. , ; : ' " ( ) & /). Note: This means that punctuation characters can never occur alone. But except for the period, which must end a sentence, punctuation can occur anywhere in a word.
An underscore is not a letter.
All input lines are properly newline terminated, and do not contain binary 0.
All input files have a total size so that they will fit comfortably in memory and still allow you ample memory to play with. Please note that the input file can be empty.
You may assume ASCII as the character set but you may not use Unicode-specific semantics.
All sentences will be fewer than 200 characters long and contain fewer than 100 vowels.
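The input rules above can be sketched as a small tokenizer. This is an illustrative, non-golfed sketch in Python rather than a contest entry (entries must be Perl). The function name `split_sentences` is hypothetical. It assumes well-formed input as described above, so a period only ever appears as the last character of a sentence.

```python
def split_sentences(text):
    """Group whitespace-separated words into sentences.

    Assumes well-formed input: a word ending in '.' closes the
    current sentence, and blank lines only occur between paragraphs.
    """
    sentences, current = [], []
    for word in text.split():      # split() swallows spaces, newlines and blank lines
        current.append(word)
        if word.endswith("."):     # the period must end a sentence
            sentences.append(current)
            current = []
    return sentences
```

Because words are separated by arbitrary whitespace, paragraph boundaries and blank lines need no special handling in this sketch, and an empty input simply yields no sentences.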
For each sentence in the input, you are to print two lines.
A line containing the number of words in the sentence, a colon, a horizontal tab, and up to 60 characters of text from the sentence, including the final period. Case must be preserved, and the words of the text must be separated by spaces to conform to the alignment described below.
If the whole sentence, thus aligned, cannot be printed in the given width, the last three characters of that width must be replaced by periods.
A line containing the total number of vowels (aeiouy, regardless of case) in the sentence, a slash, the total number of alphanumeric characters in the sentence, a colon, a horizontal tab, and similar counts for the words in the preceding line.
That is, for each word of which at least one character was printed on the previous line, you must print the number of vowels and alphanumerics in the whole word, again separated by a slash.
These pairs of counts must be aligned with spaces to begin in the same columns as the corresponding words, and must be separated by at least one space; if that is not possible (because they would touch or overlap), extra spaces must be inserted in the text line instead.
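One way to read the two-line format and its alignment rule is sketched below, again in Python purely for illustration. The function name `layout` and the `width` parameter are hypothetical. The sketch assumes that a word whose first character falls inside the three truncation periods counts as not printed, so it gets no count pair.

```python
VOWELS = set("aeiouyAEIOUY")

def layout(words, width=60):
    """Return the text line and the counts line for one sentence."""
    text, cols, pairs = "", [], []
    total_v = total_a = counts_end = 0
    for w in words:
        v = sum(ch in VOWELS for ch in w)
        a = sum(ch.isalnum() for ch in w)   # ASCII input assumed; '_' is not alphanumeric
        pair = f"{v}/{a}"
        # Start one column past whichever ends later: the previous
        # word or the previous count pair.
        col = max(len(text), counts_end) + 1 if cols else 0
        text = text.ljust(col) + w
        counts_end = col + len(pair)
        cols.append(col)
        pairs.append(pair)
        total_v += v
        total_a += a
    limit = width
    if len(text) > width:                   # sentence does not fit: truncate
        limit = width - 3
        text = text[:limit] + "..."
    counts = ""
    for col, pair in zip(cols, pairs):
        if col < limit:                     # at least one character was printed
            counts = counts.ljust(col) + pair
    return f"{len(words)}:\t{text}", f"{total_v}/{total_a}:\t{counts}"
```

For the words of `I am just a poor sample paragraph.` this sketch widens the text line to `I   am  just a   poor sample paragraph.`, because the pair `1/1` under `I` would otherwise touch the pair under `am`.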
The tiebreaker is calculated as the number of vowels in the script divided by the total number of letters and digits.
All output lines must be properly newline terminated.
You must not write to STDERR.
The program return code does not matter.
The average runtime of the program must be finite, but may be arbitrarily long.
The programs can be written as one or more lines. The score is the total number of characters you need (smaller is better). If your program is more than one line, you must count the newlines in between as one character each. The #! line is not counted. If you use options on the #! line, the options themselves are counted, including the leading space and ``-''.
If two (or more) golfers have the same score for the hole, the golfer with the lowest tie break score wins.
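The scoring and tiebreaker rules can be estimated with a short helper. This is a hedged sketch in Python, not the referees' tool. The name `score_and_tiebreak` is hypothetical. It assumes that the trailing newline is not counted, that the tiebreaker is computed over the program body without the #! line, and that vowels are aeiouy as in the hole itself. The rules above do not state these points explicitly.

```python
VOWELS = set("aeiouyAEIOUY")

def score_and_tiebreak(script):
    """Return (score, tiebreak) for a script given as one string."""
    lines = script.rstrip("\n").split("\n")   # assumption: trailing newline is free
    option_chars = 0
    if lines[0].startswith("#!"):
        shebang = lines.pop(0)                # the #! line itself is not counted
        space = shebang.find(" ")
        if space != -1:
            # options are counted, including the leading space and '-'
            option_chars = len(shebang) - space
    body = "\n".join(lines)                   # newlines in between count as one each
    score = len(body) + option_chars
    vowels = sum(ch in VOWELS for ch in body)
    alnum = sum(ch.isascii() and ch.isalnum() for ch in body)
    tiebreak = vowels / alnum if alnum else 0.0
    return score, tiebreak
```

With these assumptions, a script consisting of `#!/usr/bin/perl -p` followed by the lines `s/a/b/;` and `$_=uc` scores 13 characters for the body plus 3 for the option.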
All programs must work on perl 5.6.1.
Assume total memory is < 2**32 bytes. The runtime of your programs should be finite. If your program takes more than a reasonable time to run, the validation of your solution by the referees can of course take more time than usual.
The programs may only use the perl executable, no other executables on the system are allowed (the program may use itself though). You may use any of the perl 5.6.1 standard core modules (perldoc perlmodlib for a list of those core modules). Your solutions must be portable in the sense that they should work on all versions of 5.6.1 everywhere (however, it's perfectly fine to abuse perl 5.6.1 bugs).
When tested, your script will be named interlin.pl, and you must assume your script to have file permissions of 0444 (that is, read-only and non-executable, for Windows folks).
Given the input:
I am just a poor sample paragraph. Please don't treat me too hard as you split and separate me into vowels and stuff.
You are to output the following:
7:	I   am  just a   poor sample paragraph.
11/27:	1/1 1/2 1/4  1/1 2/4  2/6    3/9
16:	Please don't treat me  too hard as  you split and separat...
27/65:	3/6    1/4   2/5   1/2 2/3 1/4  1/2 3/3 1/5   1/3 4/8
The game starts June 1st (00:00 UTC) and ends June 8th (00:00 UTC).
A test program is provided to help screen entries (for those of you who feel courageous, there's also the Games::Golf version - be warned: it's still an experimental module, use at your own risk).
There's a new version (v1.04) of the test program!!
Any program that passes the test program should be submitted. If you are surprised that your solution passed the test program, please submit it anyway! That will help us identify bugs in the test program.
For the test program to work correctly, you will have to name your script interlin.pl and place it in the same directory as your test program. Run the test program:
$ perl tpr04.pl
to verify that your entries are valid.
Passing the test program does not assure your solution is valid. The referees have the final say.
You can submit your solutions here (you'll notice it's the same page as the Leaderboard).
Do not publish your solutions anywhere. That will spoil the game, as your solutions are meant to be secret. All solutions will be published at the end of the game.
Prizes (provided by O'Reilly and ActiveState) will be awarded to veteran and beginner winners. A prize may also be awarded to any especially interesting artistic and/or unorthodox solutions.
You can track your ranking through the leaderboard here. Beginners are encouraged to enter and there is a separate leaderboard for them.
There is also a special leaderboard for teams. There will be no prizes awarded to the best team, other than the admiration of your fellow golfers. If you are in a team, you can't also play individually.
We encourage you to send feedback as well as your ideas for future holes and tiebreakers to golf@theperlreview.com.
Samy Kamkar <skamkar@lucidx.com>
Ala Qumsieh <aqumsieh@hyperchip.com>
Lars Mathiesen <thorinn@diku.dk>
Dave Hoover <dave@redsquirreldesign.com>
Jerome Quelin <jquelin@wanadoo.fr>
If you want to be a referee next month, drop us a note: golf@theperlreview.com