Monday, 11 December, 2006

The Ancient Katapayadi Formula And The Modern Hashing Method

Abstract

The essence of the modern hashing technique in Computer Science is the derivation of a number from a nonnumeric key to index into a table where the record containing the key is stored. It is believed that the idea of hashing was first seriously considered by H.P.Luhn of IBM in 1953.In this paper, an interestingly similar technique used in South Indian musicology in the 18th century is described and the question of whether it is an anticipation of the Hashing technique is briey addressed.

1 Introduction

The problem of retrieving a record from a table based upon a given key has been studied extensively. A survey of work in this area can be found in (Sev-erance 1974). In this paper I describe one particular approach to this problem Hashing, and also an interesting earlier development very similar to it. Itis generally believed that the idea of hashing was originated by H.P.Luhn, in an internal IBM memorandum in 1953 (Knuth 1973) and described first in the open literature by Arnold Dumey (Dumey 1956). But is it possible that the Katapayadi scheme of deriving numbers from names, in conjunction with the applications to which it had been put, especially in classical South Indian musicology, is an early anticipation of the hashing technique? We will look at this issue in more detail here.

2 Hashing

A hash-table is a data structure in which it takes on average constant time to find any given element. This constant time is the time taken to compute a function called the hash-function of the element being searched for. This is in contrast to a Binary Search Tree data structure, for example, in which the time taken to find an element is on average proportional to the log2N, an array or linked linear list data structures in which the time is proportional to N where N is the total number of elements. The following example illustrates the use of hashing where the marks of ten students need to be stored in a table. It is a trivial one, but it is sufficient to bring out the essential principle behind hashing.

2.1 Example

Examination marks for ten students Amy, Ben, Cho, Dan, Eva, Fan, Gus, Hal,Ian and Jim need to be stored in a table. We might additionally want to retrieve the mark of a student on demand, and optionally modify it. One way of doing this is to store the marks sequentially in a table of size 10, and perform a sequential search on it each time we want to retrieve a particular record. This would mean that on average, we can expect to scan half the table (5 elements)before ending the desired record. A more anecient storage technique will be to store the elements in sorted order by name. In this case, we would expect to search the table log210 (approximately 3.2) times on average for each retrieval, because at each examination, our search space is efectively halved as the element we want is either current, in the upper half or lower half depending on whether it is equal to, less than or greater than the current element.

In contrast to these techniques, the hashing scheme derives a unique number corresponding to each name which gives us the cell address of the element in the table. If we used a hash function H(x) = (ascii(x[0])-5) % 10 + 1, where x is the name or value being hashed, x[0] is the first letter of that name, ascii() is a function that returns the ASCII value of a given letter, and % stood for the modulus or remainder operator, then the following arrangement of elements inthe table would be seen.

Addr 0 1 2 3 4 5 6 7 8 9

Name Amy Ben Cho Dan Eva Fan Mark Gus Hal Ian Jim

To retrieve an element, we would not have to scan any part of the table, but go directly to the record’s location by computing its hash value. For example,If Eva wants to know what her mark was, since ascii(‘E’) = 69, we compute(69-5) % 10 which gives 4, the location of Eva’s record in the table.

Of course, there are other important considerations, such as the number of elements that can be stored at any given table location (called a bucket), and how to accommodate overflows and handle collisions (two or more elements withthe same hash value). It has been pointed out to me by a reviewer of this paper that such considerations are equally important as the derivation of the index.But it can be argued that these are secondary in nature given the motivation ofthe hashing technique. Its essence can be said to be the derivation of a number from a given key, which is then subsequently used to index into an array where the element is stored with the purpose of eliminating a scan of any part of the array.

3 The Katapayadi Scheme

In classical India, letters of the Sanskrit alphabet were initially used to represent numbers. The grammarian Panini (4th or 5th century BC) who is believed to have written the first generative grammar for a natural language(Asher 1994) assigned the values 1 through 9 and 0 to the Sanskrit vowels a, i, u, etc. For example,Sutra(rule) v.i.30 of his grammar,Ashtadhyayi, is marked with the letter i, which indicates that the rule applies to the next tworules (Datta and Singh 1962, p.63). It is also known that various synonymsfor the number words existed. In one system, words with meanings evoca-tive of the numbers they represented were used. For example, the words indu(moon), dhara (earth) etc. stood for the number one since there was onlyone of each, netra (eyes), paksha (wings), etc. stood for two and so on. Amore comprehensive list of such synonyms can be found in (Ifrah 1985, p.446)who also gives the following instance of its use by Bhaskara I who in 629A.D. wrote the number 4,320,000 as vijadambaraqkaqa uqnjajamaraqmavedaorsky/atmostphere/space/void/primordial couple/Rama/Veda= 0000234. The term has been transliterated from the Sanskrit using the International PhoneticAlphabet. The palatal sibilant, commonly transcribed ass is represented usingconforming to the guidelines in (Halle and Clements 1983).

The Katapayadi scheme was initially just another such system of expressingnumbers through the use of letters (Sanskrit consonants in this case), withmore than one synonym for each number. The consonants themselves wereunevocative of the values they represented unlike the earlier scheme, but theynow possessed the powerful ability to form easily memorisable words throughthe insertion of vowels between them. Meaningful and mnemonic words couldnow be formed using these letters in much the same way as mnemonic wordsare coined today to represent commercial telephone numbers. In this sense,the Katapayadi scheme could be seen as just a mnemonic technique to helpremember numbers, or at best, a coding scheme like ASCII to derive numericvalues from non-numeric tokens, but it is noteworthy that the scheme continuedto be used long after the invention of numeric symbols and during this time was put to several applications. It is the application of the scheme to the particularinstance described in the next section which is remarkably similar to that of modern hashing.

The following Sanskrit verse describes one version of the Katapayadi scheme.(Fleet 1911) quotes this from C.M.Whish (Trans. Lit. Soc. of Madras, Part 1,p.57, 1827) who quotes this from an unspecied source, but (Datta and Singh1962) state that it is found inSadratnamala, which is a treatise on astronomypublished in 1823 by Prince Sankaravarman of Katattanat in North Malabar.The prince was an acquaintance of Mr Whish who spoke of him in high terms asa very intelligent man and acute mathematician" (Raja 1963).

Sadratnamala


Value 1 2 3 4 5 6 7 8 9 0

Velar and Palatal Stops k kh g gh 8 c ch , ,h 7

Retroex and dental stops P Ph h 9 t th d dh n

Labial stops p ph b bh m

Fricatives & Glides j r l v L s h

Table 1: The Katapayadi translation table

was published with a commentary in the Malayalam monthly Kavanodayam,vol.16, 1898, Calicut.

na7avaca ca unjani samk hja kaPapajaqdajah j

mi re tuqpaqnta hal samk hja na ca cintjo halasvarah k

(7 and n denote zeroes; the letters (in succession) beginning with k, P, p andj (the palatal glide, y in non-phonetic representation) denote the digits. In aconjoint consonant, only the last one denotes a number; and a consonant notjoined to a vowel should be disregarded)

There are said to be four variations of this scheme, which is claimed as thereason for its not coming into general use. The transcription scheme is moreeasily understood from the table 1. It lists the Sanskrit consonants, with theirassociated numeric values as specied in the verse. Each of the lines exceptthe last consists of stops in the following sequence - unvoiced and unaspirated,unvoiced and aspirated, voiced and unaspirated, voiced and aspirated, and nasal.In the rst line the velars are followed by the palatals and in the second line,the retroexes are followed by the dentals. The last line consists of fricatives.

The following interesting verse also appearing in Sadratnamala, illustratesan application of the scheme:

bhadrambudhisiddha,anmaga9ita raddhaqmajadbhuqpagih

If we translate this using the procedure described earlier in the verse aboutthe scheme, we get

bh= 4 (from table)

dr = 2 (only the last part of the conjoint consonant, r, is considered)

mb = 3 (similarly, only the b of mb is considered), etc.

This gives the nal value 423979853562951413. Since it is known that tra-ditional Indian practice was to write number words in ascending powers of10 (Ifrah 1985, p.445) (Menninger 1969, pp.398{399), the number representedabove, properly, is 314159265358979324 which is recognisable to be just thedigits of pi to 17 places (except that the last digit is incorrectit must be 3).


(Menninger 1969, p.275) also quotes an example1of the Indian name for the lu-nar cycle beinganantapura, which in addition to having semantic content itself,also gives the Katapayadi value 21600 (using the consonants n-n-t-p-r), whichis the number of minutes in the lunar half-month (15 25 60).

The originator of this scheme is not known, as with many other Indian inven-tions and discoveries, but it is believed that the scheme was probably familiarto the Indian mathematician and astronomer Aryabhata I in the 5th centuryA.D. and to Bhaskara I who lived in the 7th century A.D. (Sen 1971, p.175).The oldest datable text that employs the scheme isGrahacaranibandhana, writ-ten by Haridatta in 683 AD (Sarma 1972, pp.6{8). The scheme is said to havebeen used in a wide variety of contexts, including occultisms like numerology.A large number of South Indian chronograms have been composed using thisscheme (see for eg. Epigrahia Indica, 3: p.38, 4: pp.203{204, 11: pp.40{41,34: pp.205{206). It is also said that the Indian philosopher of the 7th Cen-tury, Sankara, was named such that the Katapayadi value of his name giveshis birthday215, indicating the fth day of the rst fortnight of the secondmonth in the Indian lunar calendar (Sambamurthy 1983). Not much else isknown about the status or application of this scheme since then. But in the18th century, we nd a novel revival of it in South Indian musicology which isarguably similar to modern hashing. This is described in the following section.

4 An Application of the Katapayadi Scheme

In classical South Indian music, the raga is roughly equivalent to the Western chord. These ragas are classied according to a unique scheme. What followsis a brief description of this classication as is pertinent to the subject of thispaper. A more comprehensive treatment of Indian musicology, its concepts andterms can be found in (Wade 1969).A raga can either be a Janaka(root) raga or a Janyaraga which is considered to be a descendant of one of the Janakaragas. The scale of a Janakaraga has seven notes in its ascent and the same seven notes in reverse in itsdescent. A Janyaraga is a modication of its parent Janakaraga through the insertion or deletion of one or more notes and/or possibly the re-ordering ofsome notes in either or both the ascent and descent of the scale. The seven notes are respectively called Sa (Shadjam), Ri (Rishabham), Ga (Gandharam),Ma (Madhyamam), Pa (Panchamam), Da (Dhaivatam) and Ni (Nishadam).These are the equivalents of the western solfa syllables Do, Re, Mi, Fa, So, La and Ti. The notes Sa and Pa (the fth) are considered xed, and must occur unchanged in all the Janakaragas. If we consider the octave to consist of the 12 notes C, C#, D, D#, E, F, F#, G, G#, A, A# and B, since C and G arexed, Ri and Ga can take any combination of two notes from C#, D, D# and E. Similarly Da and Ni can take any combination of two notes from G#, A,A# and B and Ma can take any of the two values F or F#. Thus there canbe a total of 24C24C2= 72 possible JanakaRagas. If we arrange these ragas systematically in a table, it is possible to derive the notes used by anyone of them from its index in the table. Accordingly, the table of 72 ragas is constructed as follows: The first 36 ragas in the table use F as the middle note Ma, and the second 36 use F#. In other respects they are identical. Each half of the table is further divided into 6 sections called Chakras, each of which has 6 ragas in it. Each of the 6 Chakras in each half use one of the 6 possible combi-nations of the notes Ri and Ga, while, within each Chakra, the notes Ri and Garemain constant, but Da and Ni take on each of their 6 possible combinations.Thus the arrangement in table 2 is arrived at.

This classication makes it easy for us to determine the notes of a raga givenits serial number in the table. For example if we were asked to play the scale ofraga number 65, we would know that it uses the note F# since 65 div 36+1=2.Since 65 % 36=29 and 29 div 6+1=5, we would know that it uses the fthpossible combination of Ri and Ga which is D & E. Also since 26 mod 6=5,we know it uses the fth possible combination of Da and Ni which is A and B.Thus the scale of Janaka Raga number 65 is: C, D, E, F#, G, A and B.

This means that given the name of a raga, one need only search for its raganumber. The notes can be mechanically derived from its number. However,an Indian raga has certain additional musical properties other than the notesit uses. Frequently, also, a janya raga which inherits some properties from itsjanaka raga is described in terms of the modications done to its parent whichresulted in that particular raga. These are usually discussed under a descriptionof the Janaka raga and its descendants, or in concise forms, given succinctlyalongside its name in a table. To get complete information about a JanakaRaga, then, a table search to nd its position given its name is presupposed.Things would be even simpler if we were able to derive the number of a ragadirectly from its name. This is precisely what was done by the South Indianmusicologists. Each raga was named in such a way that a Katapayadi translationof the rst two syllables of its name gives us its number in the table. Forexample, the raga Mechakalyani gives us the number 65 (derived from the rsttwo syllables Me and Cha) and Vanaspati gives 4. Thus it is now possible to godirectly to the raga’s position in its table from its name without having to do a search.

The exact person who coded the names of the ragas thus seems to be indispute, but it is fairly certain that such a codication was complete by the end ofthe 18th Century. (Aiyyangar 1972, p.189) states that although Venkatamakhilays a claim to this arrangement in 1660, it should really be credited to hisgrandson Muddu Venkatamakhi in the early 18th century who added it as asupplement to the former’s work Chaturdandi Prakasika.

5 Discussion

From an observation of the Katapayadi scheme, it seems that there are severalimportant dierences between it and modern hashing techniques. Notably, ahashing formula gives a valid bucket number for any given name, but the Kat-apayadi scheme only gives meaningful results for some names. For example, atrue hashing algorithm will never give a number greater than 72 in the aboveapplication, whatever the value hashed, but the Katapayadi scheme will.

A hashing algorithm can also take any input and return a number corre-sponding to its position in a table, whereas in the application of the Katapayadischeme above the names of the ragas have been carefully chosen for the purpose.Thus it seems more probable that the Katapayadi formula was intended as amnemonic technique to help people remember long numbers. Indeed, the versefromSadratnamalacoding the digits of pi seem to imply just that. In this sense,the scheme is an exact opposite of the modern hashing technique which aims toderive numbers from names, since it aims to derive names from numbers.

But then its application in South Indian musicology, where there are only72 admissible root ragas is clearly directed at liberating the tablelookup op-eration from the constraints imposed on it by the size of the table. This is thebasic aim of a hashing technique. A good hashing algorithm seeks to performthe operations of insertion, deletion and lookup with constant time complex-ity. The insert and delete operations are irrelevant to the application outlinedabove since the raga names were deliberately coined and already inserted intothe table. But once the table had been constructed, lookup took a constanttime because of the application of the Katapayadi scheme. The motivation forthis must have been similar as for a situation that warrants the applicationof a hashing strategy nowconstant time table lookup. The result too is thesame. Here, it is obvious that it bears a strong similarity to the modern hash-ing technique. To be sure, the Katapayadi scheme was initially developed asa mnemonic technique given the oral culture of education in early India. In-deed, Sir Monier Williams remarks that even the grammar of Panini was mainlyintended to aid the memory of teachers than learners by the briefest possiblesuggestions (Williams 1969). Nevertheless, it is possible for such a mnemonictechnique to gradually evolve into a scheme that bears strong similarity to ourmodern hashing technique. It is relatively easy for one to look at this particularapplication of the Katapayadi scheme and to come up with a hashing strategyfor some modern requirement. But whether the scheme actually inuenced laterdevelopment of the hashing technique is in doubt. It is not certain whether anyIndian scholar with knowledge of this technique was a close associate of any ofthe proponents of early hashing. Nor is it likely that the proponents of hashingknew about the Katapayadi technique. Thus the most we can say at this stageis that the Katapayadi scheme can be thought of as an early precursor to themodern hash functions and its application in South Indian musicology bears inretrospect an interesting similarity to modern hash tables.

1 comment:

Anonymous said...

REALLY REALLY GREAT!