Write a program to read English text to end-of-data (type control-D to indicate end of data at a terminal; see below for how to detect it), and print a count of word lengths, i.e. the total number of words of length 1 that occurred, the number of length 2, and so on. Define a word to be a sequence of alphabetic characters. You should allow for word lengths up to 25 letters.

Typical output should be like this:

length 1 : 10 occurrences
length 2 : 19 occurrences
length 3 : 127 occurrences
length 4 : 0 occurrences
length 5 : 18 occurrences
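
The end-of-data detection mentioned in the task can be sketched as follows. This is only an illustration, not part of the solution: count_chars is a hypothetical helper name, and the sketch assumes the input arrives on a standard C stream such as stdin. The key point is that the value returned by getc (or getchar) is stored in an int, so that EOF can be told apart from every real character.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the solution below): counts the
 * characters that can be read from `in` before end-of-data. */
long count_chars(FILE *in)
{
    int c;          /* int, not char: EOF must stay distinct from every character */
    long n = 0;

    /* getc returns EOF once there is nothing left to read, e.g. after
     * control-D has been typed at a terminal. */
    while ((c = getc(in)) != EOF)
        n++;
    return n;
}

/* Possible use from main():
 *     printf("%ld characters read\n", count_chars(stdin));
 */
```

The same loop shape, with getchar() in place of getc(in), is what the solution uses to fill its input buffer.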

Source Code

A brief explanation is provided after the source code.

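#include <stdio.h>
#include <ctype.h>
#include <string.h>
#define SIZE 500    /* capacity of the input buffer, including the terminating '\0' */
#define MAXLEN 25   /* longest word length that is counted */

int main(void) {
    int i, j, letter, length;
    int c;                              /* int rather than char, so EOF is distinct from every character */
    char text[SIZE], word[SIZE];        /* a word can never be longer than the text it comes from */
    int occurrence[MAXLEN + 1] = {0};   /* occurrence[n] counts the words of length n */

    /* Read the input into text, leaving room for the terminating '\0'. */
    for (i = 0; i < SIZE - 1 && (c = getchar()) != EOF; i++)
        text[i] = (char)c;
    text[i] = '\0';
    length = (int)strlen(text);

    /* Visit every character, including the terminating '\0' that ends the last word. */
    for (i = 0, j = 0; i <= length; i++) {
        if (text[i] != ' ' && text[i] != '\0' && text[i] != '\n')
            word[j++] = text[i];
        else {
            /* Look for a digit or punctuation character in the word just read. */
            for (letter = 0; letter < j; letter++) {
                if (ispunct((unsigned char)word[letter]) || isdigit((unsigned char)word[letter]))
                    break;
            }
            /* Count the word only if it is non-empty, clean and within range. */
            if (letter == j && j >= 1 && j <= MAXLEN)
                occurrence[j]++;
            j = 0;
        }
    }

    for (i = 1; i <= MAXLEN; i++)
        printf("length %d : %d occurrence(s)\n", i, occurrence[i]);
    return 0;
}

When the above program is compiled and run on Linux with the three lines of input shown first (followed by control-D), it produces the following result:

I am courageous
I am unstoppable
I am victorious
length 1 : 3 occurrence(s)
length 2 : 3 occurrence(s)
length 3 : 0 occurrence(s)
length 4 : 0 occurrence(s)
length 5 : 0 occurrence(s)
length 6 : 0 occurrence(s)
length 7 : 0 occurrence(s)
length 8 : 0 occurrence(s)
length 9 : 0 occurrence(s)
length 10 : 2 occurrence(s)
length 11 : 1 occurrence(s)
length 12 : 0 occurrence(s)
length 13 : 0 occurrence(s)
length 14 : 0 occurrence(s)
length 15 : 0 occurrence(s)
length 16 : 0 occurrence(s)
length 17 : 0 occurrence(s)
length 18 : 0 occurrence(s)
length 19 : 0 occurrence(s)
length 20 : 0 occurrence(s)
length 21 : 0 occurrence(s)
length 22 : 0 occurrence(s)
length 23 : 0 occurrence(s)
length 24 : 0 occurrence(s)
length 25 : 0 occurrence(s)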

Brief Explanation

  • The program starts by initializing the array occurrence, of size 26 (indices 0 to 25), to zero, i.e. every cell in the array is set to zero. This array keeps track of the number of words of each length up to 25 letters; index 0 is never printed.
  • Using a for loop and the function getchar, the input is read one character at a time into the array text of size 500, until end-of-data is reached or 499 characters have been stored, which leaves room for the string terminator. The function getchar returns the next input character each time it is called, or EOF when it encounters end of file. The symbolic constant EOF is defined in <stdio.h>.
  • Each character in the array text is copied into the character array word until a space, a newline or the string terminator is reached. The word just read is then checked for digits and punctuation: the inner loop stops early at the first such character, so the variable letter ends up less than the variable j and the word is not counted. Otherwise letter equals j, and the count for words of this length is incremented by 1. As a consequence, a word with punctuation attached, such as "victorious.", is skipped rather than counted as a word of length 10.
  • The last for loop goes through the occurrence array and displays the count for each word length from 1 to 25.
  • The return 0 at the end of the program indicates normal termination.
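
The solution above treats whitespace-separated tokens as words and skips any token that contains a digit or punctuation. The task itself defines a word as a sequence of alphabetic characters, which can also be counted directly while reading, without storing the text first. The following is a minimal sketch of that alternative, not the solution's own method; the names MAX_WORD_LEN, length_counter, feed and count_stream are all hypothetical and chosen only for illustration.

```c
#include <ctype.h>
#include <stdio.h>

#define MAX_WORD_LEN 25   /* hypothetical name; the task asks for lengths up to 25 */

/* Running state: the word in progress and the tallies so far. */
struct length_counter {
    int current;                     /* letters seen in the word being read */
    int counts[MAX_WORD_LEN + 1];    /* counts[n] = number of words of length n */
};

/* Process one value returned by getc/getchar, including the final EOF. */
void feed(struct length_counter *lc, int c)
{
    if (c != EOF && isalpha((unsigned char)c)) {
        lc->current++;               /* still inside a word */
    } else {
        /* Any non-letter (or EOF) ends the word in progress, if there is one.
         * Words longer than MAX_WORD_LEN are ignored. */
        if (lc->current >= 1 && lc->current <= MAX_WORD_LEN)
            lc->counts[lc->current]++;
        lc->current = 0;
    }
}

/* Tally the words on a stream until end-of-data. */
void count_stream(struct length_counter *lc, FILE *in)
{
    int c;                           /* int, so EOF stays distinct from every character */

    do {
        c = getc(in);
        feed(lc, c);                 /* the final EOF closes a word that ends the input */
    } while (c != EOF);
}

/* Possible use from main():
 *     struct length_counter lc = {0};
 *     count_stream(&lc, stdin);
 *     for (int n = 1; n <= MAX_WORD_LEN; n++)
 *         printf("length %d : %d occurrence(s)\n", n, lc.counts[n]);
 */
```

Under this definition "victorious." counts as one word of length 10, and "I'm" counts as two words of length 1, because the apostrophe ends the first word. There is also no 499-character limit on the input, since nothing is buffered.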
