Re: Hash Function

From: Jordi Gutiérrez Hermoso
Subject: Re: Hash Function
Date: Wed, 5 Dec 2012 10:35:02 -0500

On 5 December 2012 10:07, Jonathan Karsch <address@hidden> wrote:
> I am trying to figure out how many distinct words are in a text
> document, and how many instances there are of each.

Octave has neither hashes nor sufficiently flexible associative
arrays. I recommend using a language other than Octave for this task.
For example, here is how you can do it in Python:

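    #!/usr/bin/env python

    import sys
    from collections import defaultdict

    # Missing words start at a count of 0.
    wordcount = defaultdict(int)

    # Read the file named on the command line, one line at a time.
    with open(sys.argv[1]) as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace.
            for word in line.split():
                wordcount[word] += 1

    # Parenthesized print and items() work in both Python 2 and 3.
    for word, count in wordcount.items():
        print("%s: %d" % (word, count))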
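Your question also asks how many distinct words there are, which is just the number of keys in the table. Here is a minimal sketch of that, assuming words are whitespace-separated and case-sensitive as in the script above. It swaps in collections.Counter (available since Python 2.7) for the defaultdict, and the helper name count_words is only for illustration:

```python
#!/usr/bin/env python

import sys
from collections import Counter


def count_words(lines):
    """Count whitespace-separated words in an iterable of lines."""
    wordcount = Counter()
    for line in lines:
        # update() adds one to the count of every word in the list.
        wordcount.update(line.split())
    return wordcount


# Only run when a filename was given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        wordcount = count_words(f)

    # The number of distinct words is the number of keys.
    print("distinct words: %d" % len(wordcount))

    # most_common() yields (word, count) pairs, highest count first.
    for word, count in wordcount.most_common():
        print("%s: %d" % (word, count))
```

Note that "The" and "the", or "word" and "word,", count as different words here; lowercasing and stripping punctuation would be a separate step.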

Here is how to do it in Perl:

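    #!/usr/bin/env perl

    use strict;
    use warnings;

    my %wordcount;

    # Read each line of the file(s) named on the command line.
    while (<>) {
      # split with no arguments splits $_ on whitespace.
      my @words = split;
      foreach my $word (@words){
        $wordcount{$word}++;
      }
    }

    while (my ($word, $count) = each %wordcount){
      print "$word: $count\n";
    }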

Both languages are already installed on your Mac. Put either
of those programs into a file named count_words, give it executable
permissions, and do "./count_words somefile".

- Jordi G. H.
