Hermetic Word Frequency Counter Advanced
Hermetic Word Frequency Counter Advanced 9.01 is a word-processing product from hermetic.ch with a 5-star SoftSea rating. This program scans a text file, or text on the clipboard, and counts the number of occurrences of the different words (optionally ignoring common words such as "this"). The words found can be listed alphabetically or by frequency.
The term 'word' usually means a word in a natural language such as English or French, but for this program it has an extended meaning: Any sequence of characters consisting of letters from a European language plus (optionally) hyphens, numerals, underscores, colons, periods, apostrophes, @-signs, and forward and backward slashes. Thus not only can the text being scanned be in a language other than English, it can even be in a computer language such as C. The program even allows you to count words which include @-signs (if you are interested in email addresses).
Scannable files: The file the software acts on can have any filename extension, but it must consist almost entirely of text characters (either 8-bit text or 16-bit Unicode text). More precisely, every byte must either have a value in the range 32 through 255 or be one of the whitespace characters: linefeed (byte value 10), carriage return (13), tab (9), backspace (8) or page break (12). There are two exceptions: (i) Unicode text contains zero bytes, and (ii) up to 0.1% of the bytes (other than zero bytes in Unicode text files) may be "anomalous bytes", that is, bytes with values less than 32 that are not whitespace characters. The second exception covers the rare cases where a large text file contains, for whatever reason, a few anomalous bytes; these should not prevent the software from treating the file as a text file.
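The byte rules above can be sketched in Python as follows. This is only an illustration of the stated rules, not the program's actual code: the function name `looks_like_text`, the `is_unicode16` flag, the choice to exclude zero bytes of Unicode text from the 0.1% denominator, and the handling of empty input are all assumptions.

```python
# Whitespace byte values as listed in the description above:
# backspace, tab, linefeed, page break, carriage return.
WHITESPACE_BYTES = {8, 9, 10, 12, 13}

def looks_like_text(data: bytes, is_unicode16: bool = False) -> bool:
    """Return True if at most 0.1% of the counted bytes are anomalous."""
    counted = 0
    anomalous = 0
    for value in data:
        if value == 0 and is_unicode16:
            # Zero bytes are expected in 16-bit Unicode text; skipping them
            # in both counts is an assumption of this sketch.
            continue
        counted += 1
        if value < 32 and value not in WHITESPACE_BYTES:
            anomalous += 1
    # Integer arithmetic avoids floating-point rounding at the 0.1% limit.
    # Assumption: empty input is accepted.
    return anomalous * 1000 <= counted
```

For example, `looks_like_text("hi".encode("utf-16-le"), is_unicode16=True)` accepts the zero bytes, whereas the same bytes checked as 8-bit text are rejected.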
The input file would typically consist of natural language text (English, German, Spanish, etc.), but need not; it can consist of software code (e.g., a C++ source file) or can be an HTML or an XML document.
The Advanced Version does everything that the standard version does, so this page explains only the functions of the Advanced Version which are not present in the standard version (such as the capability to count words in multiple files and the capability to count phrases as well as words). The user manual for the standard version should therefore also be read, before or after reading this page (but note that the appearance of the main window and of the 'Set parameters' window differs in the Advanced Version).
The extended meaning of 'word' used by this program depends on the version:
In the standard version a word is any sequence of characters consisting of letters from a European language plus (optionally) hyphens, numerals, underscores, colons, periods, apostrophes, @-signs, and forward and backward slashes.
In the Advanced Version a word may (optionally) also include ampersands, grave accents, commas and parentheses (the last two thus allowing names of chemical compounds to be treated as words).
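One way to picture this extended notion of a word is a regular-expression character class that grows as optional characters are switched on. The sketch below is hypothetical and not taken from the program: the names `make_word_pattern` and `tokenize` are invented, and "letters from a European language" is approximated by ASCII letters plus the accented Latin letters up to U+017F.

```python
import re
import string

# Assumption: European letters are approximated by ASCII letters plus the
# accented letters of Latin-1 and Latin Extended-A.
LETTERS = "A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u017F"

# Optional word characters named in the text, one character each.
STANDARD_EXTRAS = "-_:.'@/\\" + string.digits
ADVANCED_EXTRAS = "&`,()"

def make_word_pattern(extras=""):
    """Build a pattern matching runs of letters plus the chosen extras."""
    # Escape each character individually so that '-', '\\' and '(' are
    # treated literally inside the character class.
    escaped = "".join(re.escape(ch) for ch in extras)
    return re.compile("[" + LETTERS + escaped + "]+")

def tokenize(text, extras=""):
    """Return the 'words' of text under the given set of extra characters."""
    return make_word_pattern(extras).findall(text)
```

With no extras, `tokenize("bob@example.com")` splits the address into three words; with `extras="@."` it is kept as a single word. Note that this simple sketch would also keep a sentence-final period attached to a word when periods are enabled.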
The standard version has only one mode of operation: count-all. The Advanced Version has two different modes of operation: count-all and count-only.
In count-all mode the program scans one or more files, or text on the clipboard, and counts the number of occurrences of the different words.
In count-only mode it scans one or more files, or text on the clipboard, and counts the occurrences only of a specified set of words or phrases.
The words (and phrases) found can be listed alphabetically or by frequency. In count-only mode the names of the files in which the words and phrases are found can be displayed, together with their frequencies of occurrence in each file.
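The difference between the two modes can be illustrated with a small Python sketch. It is a simplified model under stated assumptions rather than the program's implementation: the helper names (`simple_words`, `count_all`, `count_only`) are hypothetical, texts are passed in as a dict of file name to contents instead of being read from disk, words are lower-cased, and a phrase is matched as a run of consecutive words.

```python
import re
from collections import Counter

def simple_words(text):
    # Assumption: lower-case everything and treat runs of letters and
    # apostrophes as words; the real program offers many more options.
    return re.findall(r"[a-z']+", text.lower())

def count_all(texts, ignore=frozenset()):
    """Count every word in every text, skipping words in the ignore list."""
    totals = Counter()
    for text in texts.values():
        totals.update(w for w in simple_words(text) if w not in ignore)
    return totals

def count_only(texts, targets):
    """Count only the given words or phrases, reporting counts per file."""
    report = {target: {} for target in targets}
    for name, text in texts.items():
        tokens = simple_words(text)
        for target in targets:
            parts = simple_words(target)
            if not parts:
                continue  # nothing countable in this target
            n = len(parts)
            hits = sum(1 for i in range(len(tokens) - n + 1)
                       if tokens[i:i + n] == parts)
            if hits:
                report[target][name] = hits
    return report
```

A count-all result can then be listed by frequency with `totals.most_common()` or alphabetically with `sorted(totals.items())`.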
There are five main features in the Advanced Version which are not present in the standard version:
The capability to count phrases as well as words.
The capability to scan not just one file but all files in a folder, and optionally in all subfolders, and to return a single report on the frequencies of words and phrases in all files scanned.
The capability to specify not only a list of words to be ignored (such as common words in a natural language) but also a list of words and phrases which are to be counted, with all words not in this list being ignored.
The capability, in the latter case, to show for each word or phrase found the files in which it occurs.
The capability to generate data which can be used to test whether a corpus of text conforms to Zipf's Law.
Hermetic Word Frequency Counter Advanced is licensed as free-trial software and is priced at $47.50. You can download it free and try it before you buy; to get the full, unlimited version you need to purchase it.
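As background to the Zipf's Law feature in the list above: Zipf's Law says that a word's frequency is roughly inversely proportional to its frequency rank, so a plot of log frequency against log rank is close to a straight line with slope near -1. The listing does not say what data the program produces, so the Python sketch below is only one plausible form of such a test, with hypothetical function names.

```python
import math

def zipf_table(counts):
    """Return (rank, frequency, log10 rank, log10 frequency) rows.

    counts maps each word to a positive number of occurrences.
    """
    ranked = sorted(counts.values(), reverse=True)
    return [(rank, freq, math.log10(rank), math.log10(freq))
            for rank, freq in enumerate(ranked, start=1)]

def zipf_slope(table):
    """Least-squares slope of log frequency against log rank.

    Needs at least two rows; a value near -1 is consistent with Zipf's Law.
    """
    xs = [row[2] for row in table]
    ys = [row[3] for row in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

The input would typically be the word counts from a count-all run over the whole corpus.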

