Trends in Iranian and Persian Linguistics 9783110455793, 9783110453461

This set of essays highlights the state of the art in the linguistics of Iranian languages. The contributions span the f

English Pages 380 Year 2018

Table of contents:
Acknowledgments
Table of contents
Introduction
1 The alleged Persian-Germanic connection: A remarkable chapter in the study of Persian from the sixteenth through the nineteenth centuries
2 Huihuiguan zazi: A New Persian glossary compiled in Ming China
3 Glimpses of Balochi lexicography: Some iconyms for the landscape and their motivation
4 On some Iranian secret vocabularies, as evidenced by a fourteenth-century Persian manuscript
5 Specialization of an ancient object marker in the New Persian of the fifteenth century
6 Fillers, emphasizers, and other adjuncts in spoken Dari and Pashto
7 The historically unmotivated majhul vowel as a significant areal dialectological feature
8 Variability in Persian forms of address as represented in the works of Iranian playwrights
9 Some linguistic indicators of sociocultural formality in Persian
10 Spoken vs. written Persian: Is Persian diglossic?
11 Accounting for *yek ta in Persian
12 The associative plural and related constructions in Persian
13 Revisiting the status of -eš in Persian
14 ‘Difficult’ and ‘easy’ in Ossetic
15 Possessive construction in Kurdish
16 To bring the distant near: On deixis in Iranian oral literature
17 Extracting semantic similarity from Persian texts
List of contributors
Index

Alireza Korangy, Corey Miller (Eds.) Trends in Iranian and Persian Linguistics

Trends in Linguistics Studies and Monographs

Editor: Volker Gast
Editorial Board: Walter Bisang, Jan Terje Faarlund, Hans Henrich Hock, Natalia Levshina, Heiko Narrog, Matthias Schlesewsky, Amir Zeldes, Niina Ning Zhang
Editor responsible for this volume: Hans Henrich Hock

Volume 313

Trends in Iranian and Persian Linguistics
Edited by
Alireza Korangy
Corey Miller

ISBN 978-3-11-045346-1
e-ISBN (PDF) 978-3-11-045579-3
e-ISBN (EPUB) 978-3-11-045359-1
ISSN 1861-4302

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2018 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: RoyalStandard, Hong Kong
Printing and binding: CPI books GmbH, Leck
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com

Acknowledgments

The editors would like to thank their preternaturally patient contributors who braved through the long and arduous – yet delightful – preparation of this volume. The complexities associated with correctly representing linguistic details in all forms and shepherding computer irregularities stemming from multiple systems and countries are not to be taken lightly. We would especially like to thank Martin Schwartz who personifies the spirit of scholarship with his herculean attention to detail and who was a source of great motivation for the editors by example. Corey Miller would like to thank Lixin Yang for his emotional and technical support. With respect to Corey Miller, the views and opinions expressed in this volume are those of the authors and editors, and not those of The MITRE Corporation.

DOI 10.1515/9783110455793-202

Alireza Korangy dedicates this book to his daughter Iran Ghazal Korangy: “a little linguist, who is my motivation for all things good and positive”.

Table of contents

Acknowledgments
Introduction

Toon Van Hal
1 The alleged Persian-Germanic connection: A remarkable chapter in the study of Persian from the sixteenth through the nineteenth centuries

Shinji Ido
2 Huihuiguan zazi: A New Persian glossary compiled in Ming China

Adriano V. Rossi
3 Glimpses of Balochi lexicography: Some iconyms for the landscape and their motivation

Martin Schwartz
4 On some Iranian secret vocabularies, as evidenced by a fourteenth-century Persian manuscript

Agnès Lenepveu-Hotz
5 Specialization of an ancient object marker in the New Persian of the fifteenth century

Lutz Rzehak
6 Fillers, emphasizers, and other adjuncts in spoken Dari and Pashto

Youli Ioannesyan
7 The historically unmotivated majhul vowel as a significant areal dialectological feature

Zohreh R. Eslami, Mohammad Abdolhosseini, and Shadi Dini
8 Variability in Persian forms of address as represented in the works of Iranian playwrights

Hooman Saeli and Corey Miller
9 Some linguistic indicators of sociocultural formality in Persian

Behrooz Mahmoodi-Bakhtiari
10 Spoken vs. written Persian: Is Persian diglossic?

Lewis Gebhardt
11 Accounting for *yek ta in Persian

Jila Ghomeshi
12 The associative plural and related constructions in Persian

Shahrzad Mahootian and Lewis Gebhardt
13 Revisiting the status of -eš in Persian

Arseniy Vydrin
14 ‘Difficult’ and ‘easy’ in Ossetic

Z. A. Yusupova
15 Possessive construction in Kurdish

Carina Jahani
16 To bring the distant near: On deixis in Iranian oral literature

Katarzyna Marszalek-Kowalewska
17 Extracting semantic similarity from Persian texts

List of contributors
Index

Introduction
DOI 10.1515/9783110455793-001

The editors of this volume first met at a conference entitled “The Wide World of Persian: Connections and Contestations, 1500–Present”, held at the University of Maryland and the Library of Congress in 2014. That conference brought together literary and linguistic researchers focusing on chronologically and geographically diverse varieties of Persian. This volume, Trends in Iranian and Persian Linguistics, inherits from that conference its dedication to diversity in time and space, as well as a deep rooting in the cultural and historical underpinnings of so many linguistic facts. In fact, we have broadened the spatial aspect to encompass a wide range of Iranian languages in addition to Persian as the title implies. While we are more strictly in the linguistic camp, we span the broader range of linguistic endeavor, as will be outlined below.

There are many ways in which a volume of linguistic articles could be assembled. One, of course, would be by linguistic subfield; for example, a collection of articles on nasal vowels, with contributions discussing the phenomenon in different languages. Our collection is more akin to the kind that emanates from the highly successful biennial International Conference on Iranian Linguistics. In fact, we have had the good fortune to meet most of the authors at that conference.

In this introduction, we will situate the articles herein in the broader context of linguistics and Iranian studies. They span the gamut of linguistic studies, from both the historiography of the field and historical linguistics to studies examining specific morphological, phonetic, syntactic, semantic, and pragmatic questions, many providing a deep sociolinguistic context to the evolution and maintenance of the phenomena under study.

While the majority of the articles employ painstaking tools of analysis that have characterized the field since time immemorial, the article by Marszalek-Kowalewska illustrates the evolution of those tools into the modern computational realm.

While Hoenigswald (1999) has said that for linguists, “It is their glory to do their work without looking right or left, letting themselves be guided by their subject matter, language, and nothing else”, Van Hal’s article on “The alleged Persian-Germanic connection: A remarkable chapter in the study of Persian from the sixteenth through the nineteenth centuries” offers an important introspective look at the history of our field. An historiographic study such as this propels us to consider the contemporary societal motivations and influences on our work.

Coupled with the complexities of historical views of our field are the inevitable complexities offered by exocentric perspectives. Ido’s paper on “Huihuiguan zazi: A New Persian glossary compiled in Ming China” provides us with an unusual Eastern perspective on Persian. The phonetic and semantic perceptions of Persian from Ming China examined in Ido’s lexicographic study provide us with a more detailed canvas upon which to consider the evolution of the language, and enrich our understanding of subsequent developments.

Continuing the plumbing of the lexicographic vein is Rossi’s masterful study of Balochi vocabulary, “Glimpses of Balochi lexicography: Some iconyms for the landscape and their motivation”. Through an examination of the integral connection between language and space, Rossi shows us how language cannot be studied in a theoretical vacuum that ignores the unique spatio-temporal evolution of a people’s means of communication. Schwartz’s “On some Iranian secret vocabularies, as evidenced by a fourteenth-century Persian manuscript” is also a critical lexicographic study, this time underlining how Persian is the heritage of a rich melting pot of peoples and cultures, each of them fashioning the language to some extent in their own image, but contributing to the shared vocabulary and ultimately the legacy of the entire Sprachbund.

Shortly following the time period of the manuscript described by Schwartz is a morphosyntactic discussion by Lenepveu-Hotz, “Specialization of an ancient object marker in the New Persian of the fifteenth century”. Lenepveu-Hotz’s detailed examination of the variability of expression of direct and indirect objects foreshadows the set of articles probing the form and function of a wide set of phenomena in contemporary Iranian Persian discussed in this volume.

Following these diachronically focused articles, we are fortunate to have several contributions navigating the geographical extent of Persian and other Iranian languages, among these, two articles detailing phenomena characterizing its Eastern periphery.
Rzehak’s study, “Fillers, emphasizers, and other adjuncts in spoken Dari and Pashto”, while deepening our understanding of his sociopragmatic subject matter, also buttresses the thesis expressed in Miller et al. (2013) that Dari and Pashto in Afghanistan have, through their long history in the same space, developed much in common with each other and become distinct from other varieties of the same languages spoken in other countries. Ioannesyan’s “The historically unmotivated majhul vowel as a significant areal dialectological feature” emphasizes the value of examining contemporary varieties of Tajik and Dari in order to gain perspective on the development of phonetic and phonological features that would remain shrouded in mystery if one were to focus solely on contemporary Iranian Persian.

That is, of course, not to say that the study of contemporary Iranian Persian does not yield enormous treasures of its own, as evinced by a wide spectrum of articles exploring a range of grammatical topics in that variety. Two of our contributions deal with variation in forms of address due to sociolinguistic factors. Eslami, Abdolhosseini and Dini’s “Variability in Persian forms of address as represented in the works of Iranian playwrights” explores the sociopragmatic options available to Persian speakers and what each tells us about the interlocutors’ relationships. Saeli and Miller’s “Some linguistic indicators of sociocultural formality in Persian” focuses on differences between same-gender and different-gender interactions, while offering a novel accounting for the role played by variation between standard and colloquial morphology and lexis in politeness. The standard/colloquial distinction is thoroughly described in Mahmoodi-Bakhtiari’s “Spoken vs. written Persian: Is Persian diglossic?” All areas of grammar are described along with the pervasive register distinctions that uniquely characterize the language.

In the morphosyntactic realm, we have two articles dealing with particular phenomena in counting and plurality. Gebhardt’s “Accounting for *yek ta in Persian” explores both the number marker and numeral classifier properties of ta. Ghomeshi’s “The associative plural and related constructions in Persian” explores constructions with ina from different angles, including the associative plural, extenders, and compounding. Rounding out the morphosyntactic analyses of phenomena in contemporary Iranian Persian is Mahootian and Gebhardt’s “Revisiting the status of -eš in Persian”, which provides a thorough investigation of the clitic and agreement marker properties of -eš, including an important section in which speaker judgments were systematically elicited.

Our collection benefits from three articles exploring morphological, syntactic, and semantic phenomena, which extend our geographic scope to the north and west and the south and east.
Vydrin’s “‘Difficult’ and ‘easy’ in Ossetic” extends our bulwark of theoretical linguistic analyses into the Ossetian sphere. By examining the relationship of the construction under study in both the passive and modal constructions of possibility, Vydrin sheds light on a phenomenon that has not received sufficient attention in grammars of the language. Yusupova’s “Possessive construction in Kurdish” explores differences in the expression of this construction in northern and southern dialects on the basis of both poetry and folklore texts. Along similar lines, Jahani’s “To bring the distant near: On deixis in Iranian oral literature” probes the expression of deixis and its interaction with tense in two varieties of Balochi as well as Vafsi and Gorani. Finally, Marszalek-Kowalewska’s “Extracting semantic similarity from Persian texts” thrusts the linguistics of Iranian languages into the computer era. Pursuing lexicographic themes discussed in the earlier articles, Marszalek-Kowalewska shows how computational semantics offers new ways of exploring synonymy and the effects of normative efforts on the way the language is actually used.

It is our fervent hope that this volume will be useful both to those seeking an introduction to some of the themes prevalent in studies of the linguistics of Iranian languages and to those already deeply immersed in the field. Even though, as Sa‘di tells us:

بنی آدم اعضای یک پیکرند
[The children of Adam are limbs of one body]

Rumi teaches us that we may still find enrichment through our linguistic diversity and our attempts in reaching a common ground through linguistic understanding:

هین سخن تازه بگو تا دو جهان تازه شود
[Go speak a new language so to replenish both worlds anew!]

The Editors

References

Hoenigswald, Henry M. 1999. Review of Mark Durie & Malcolm Ross, The comparative method reviewed: Regularity and irregularity in language change. Diachronica 16 (1): 165–176.

Miller, Corey, Evan Jones, Rachel Strong & Mark Vinson. 2013. Reflections on Dari linguistic identity through toponyms. In Rudolf Muhr, Carla Amorós Negre, Carmen Fernández Juncal, Klaus Zimmermann, Emilio Prieto & Natividad Hernández (eds.), Exploring linguistic standards in non-dominant varieties of pluricentric languages, 319–330. Frankfurt: Peter Lang.

Toon Van Hal

1 The alleged Persian-Germanic connection: A remarkable chapter in the study of Persian from the sixteenth through the nineteenth centuries¹

Abstract: The idea that Modern Persian (Farsi) and the Germanic language group (especially Dutch and German) were connected was formulated around the end of the sixteenth century and remained influential even after the (re)discovery of Sanskrit at the end of the eighteenth century and the foundation of comparative linguistics in the first half of the nineteenth century. This contribution aims at outlining the history of this remarkably persistent idea and will discuss the linguistic arguments used by Western scholars to substantiate these claims. It will show how Western authors compiled lexical parallels between both language groups and how they explored morphological similarities. Rather than casting new light on the history of the Iranian languages and dialects, the article will reveal how the Persian language was studied, received, and even appropriated in Early Modern Europe.

Keywords: Early Modern language comparison, basic vocabulary, appropriation

1 Introduction

One year before his untimely death, the Leiden physician and Orientalist Johannes Elichmann (1601/1602–1639) expressed his admiration for the renowned female scholar Anna Maria van Schurman (1607–1678) by sending her a quatrain on parchment, calligraphically written in both Persian and Dutch (see Figure 1).² Despite the wide attention Van Schurman has attracted in the past few decades as one of the first female students admitted to a university, both the Dutch and the Persian lines have remained untranslated until recently.³ This most likely has to do with the fact that Elichmann, an expert in Persian, wanted this poem to demonstrate the affinity between these languages (affinitatis inter has linguas testandae gratia). He therefore selected words that were, in both languages, similar in meaning (ὁμοιόσημον) as well as in sound (ὁμοιόλογον); on top of all this, he ordered the words in the same sequence (ὁμοιοσύντακτον). Needless to say, this experiment resulted in a somewhat artificial and hard-to-understand text.

Figure 1: Quatrain by Johannes Elichmann destined for Anna Maria van Schurman (Royal Library of the Netherlands, The Hague: Ms 133 B 8, N 75)

This article aims to present a remarkable chapter in the early history of Western scholars studying Iranian languages, against the background of which Elichmann’s astonishing endeavor can be understood. After a period of about 1,000 years during which few scholars from Western Europe studied foreign languages other than Latin, the year 1492 marks the breakthrough of multilingualism on several fronts (see, e.g., Aarsleff 1982: 281). The discovery of America confronted Europe with numerous Amerindian languages, which had to be mastered by Spanish missionaries in order to conduct evangelization. In the wake of the 1492 conquest of Granada, a large number of Jews took refuge in northern cities of Europe, which catalyzed the study of Hebrew. It was also in 1492 that Antonio de Nebrija (1441–1522) published a grammar of Spanish (Gramática de la lengua castellana), thus beginning the trend of writing grammars of vernacular languages.

It would take another one hundred years for Early Modern Europe, in the wake of the Early Modern globalization, to become familiar with, and conduct investigations into, the Persian language. Europe-based scholars were surprised to notice that the Farsi language bore a striking number of parallels with the languages spoken in Europe. In particular, the alleged link between German – which included at that time not only High German, but also Low German, Dutch, and at times even English – and Persian was highlighted throughout the entire premodern period and even well into the nineteenth century. This contribution seeks to outline the history of this remarkably persistent idea and to discuss the linguistic arguments Western scholars used to substantiate these claims. It will show against what background Western authors compiled lexical parallels between both language groups and how they explored morphological similarities. Rather than casting new light on the history of the Iranian languages and dialects, the article will reveal how the Persian language was studied, received, and even appropriated in Early Modern Europe.

¹ Sections 2 and 3 of this article consist of abridged and rewritten parts of Van Hal (2007 and 2011). All translations are mine. I am indebted to an anonymous reviewer and to the editors for their comments on earlier drafts of this article.
² Preserved in The Hague, Royal Library of the Netherlands (Ms 133 B 8, N 75). See Van der Stighelen (1987: 239).
³ See the forthcoming edition by Larsen and Maiullo (2017). I would like to thank Anne Larsen for drawing my attention to this curious epigram (see Larsen [2016] for recent work on Anna Maria van Schurman).

Toon Van Hal, University of Leuven
DOI 10.1515/9783110455793-002

2 Raphelengius, Scaliger, and Lipsius: The launch of the theory in the last quarter of the sixteenth century

In the first book of his Historiae, the Greek historian Herodotus (fifth century BC) refers to a Persian tribe called the Germanioi (1.125). As both earlier and more recent scholars have pointed out,⁴ it is likely that Herodotus was hinting at the inhabitants of the satrapy Karmania.⁵ For many humanist scholars, however, it was tempting to suppose that Herodotus was referring to the Germani.⁶ Only a very small number of humanists voiced some occasional protest against this tendency to read so much into a mere similarity between two ethnonyms.⁷ Indeed, inferring genealogical relationships between two tribes on the basis of similar names was a widespread historical device in the Early Modern period.⁸ Pointing out the similarities between ethnonyms of peoples was, in the opinion of many a scholar, sufficient to establish the historical affinity or even identity of these peoples.

In view of all this, it is very possible that Herodotus’s brief remark underlies a very noteworthy and long-lasting theory that emerged in the Early Modern period. Indeed, some sixteenth-century scholars alluded to the alleged special link between Germans and Persians, although without addressing the issue of language. For Johannes Goropius Becanus (1519–1573), Herodotus’s testimony was sufficient to draw the far-reaching conclusion that the Persians originally spoke the same language as the Dutch.⁹ Although extremely well-versed in the classical languages and Hebrew, Goropius – who owes his enduring fame, or infamy, to his view that all languages, including Hebrew, stemmed from the Dutch (or “Cimbric”) language – lived during a period in which Persian was still hardly known in the West. This situation changed from the end of the sixteenth century onward once Shah ʿAbbas I (r. 1588–1629) came to power and opened Persia to the outside world.¹⁰

There are two main reasons why the first students of Persian would be inclined to think that Persian would be one of the so-called linguae orientales (~ Semitic languages, cf. Gruntfest 1995). First, Persian has many Arabic loanwords in its vocabulary. Second, the language was written mostly in Hebrew or Arabic characters. It is, however, interesting to observe how some of the greatest scholars of the sixteenth century developed an interest in Persian and in its parallels with German and Dutch.

⁴ See, e.g., Schmitt (1996) and Cluverius (1631: 30).
⁵ See Wiesehöfer (2006). Today, Iran has a province and a city that are still called Kerman.
⁶ This was also defended by modern scholars such as Hallenberg (1816: 129) and Neckel (1929).
⁷ “De cetero, eorum hic maxime notanda est parum felix coniectura, qui a Persarum gente Germanos ortos, ex Herodoto se probare posse arbitrantur. scilicet quia huic, in libr. 1, sub Persarum imperio populi recensentur Γερμάνιοι. quibus equidem duplex huius sententiae ratio: primum, quia multa Persarum, cum in sermone, tum in moribus, ac vivendi ratione, putant Germanis convenire; deinde, quia nomen idem” (Cluverius 1631: 30).
⁸ Some humanists had been eager to prove the common roots of the Armenians and Arameans, while others sought to show that the Dani and Daci were basically the same tribe. Much ink has been spilled over the question of whether the Gothi and the Getae were basically the same tribe or not.
⁹ See Goropius Becanus (1580 [Hermathena]: 226). For another example, see, e.g., Neander (1581: 31v).
¹⁰ See de Bruijn (1987, 1990) and Matthee (2009).

To the best of my knowledge, the first scholar hinting at the parallels between the Persian and German languages in a published piece of scholarship was Joseph Justus Scaliger (1540–1609), a French Huguenot scholar who would become world-famous for the breadth of his studies and the scope of classical languages he mastered. In 1579, he remarked in a side note in his edition of the works of the Roman poet Marcus Manilius (fl. first century AD) that Goropius Becanus should have studied Persian, which would have allowed him to discover that words such as fader, muder, brader, tuchther, and band surfaced both in Persian and in his own Dutch language (Scaliger 1579: 244). Scaliger was a fierce opponent of Goropius Becanus’s ideas, and it is therefore difficult to assess what his ultimate point was.

It is, however, Franciscus Raphelengius Sr. (1539–1597) who is in earlier publications credited with having launched the Persian-Germanic theory. This specialist in Oriental languages (see, e.g., Hamilton 1989) embraced the idea much more warmly than Scaliger did. Before running the new Leiden branch office of the publishing house of his father-in-law Christopher Plantin (ca. 1520–1589), Raphelengius was active as a proofreader at Plantin’s Antwerp office. There he contributed to the Biblia Polyglotta project undertaken by his father-in-law and the Spanish Orientalist Benedictus Arias Montanus (1527–1598). By studying a Persian translation of the Bible, Raphelengius noticed some striking parallels between the Persian and Dutch vocabulary. In 1584, he sent a letter to the professor Justus Lipsius (1547–1606), then at Leiden, in which he included a new list of lexical parallels – it was not the first one, but any previous lists communicated are, unfortunately, no longer extant. In this letter he also suggested that he regarded Goropius Becanus’s “Cimbric theory” as a plausible background against which the Persian-Dutch correspondences could be understood.
He also emphasized that parallels could also be drawn with Latin, Greek, and Oriental languages, which implies that he did not regard the Persian-Germanic connection as an exclusive one.¹¹ Although Raphelengius, who eventually became the first professor of Hebrew at the new university of Leiden, failed to publish his discovery, he appears to have been the most prolific proponent of the idea. It is plausible that he communicated it to the jurist Hugo Grotius (1583–1645), as one of Grotius’s texts written in his younger years encompasses a discussion of an extensive number of words shared by Germans and Persians, particularly words belonging to what is now sometimes styled “basic vocabulary”, such as kinship terms.¹² Raphelengius’s Leiden colleague Bonaventura Vulcanius (1538–1614), a professor of Greek, published in 1597 a work containing numerous language specimens (see Van Hal 2010c). This treatise also surveys the similarities between Persian and German, for which Vulcanius acknowledges a debt to Raphelengius (Vulcanius 1597: 87). Hence, Vulcanius can be credited with having published the first list of Persian-Germanic parallels.

Meanwhile, Lipsius had failed to keep his promise to Raphelengius to discuss the Persian-Germanic connections in the second volume of a miscellaneous book project entitled Electa (1585). In 1602, however, he addressed the issue in a very influential treatise, in which he firmly criticized Goropius Becanus’s arguments supporting the primacy of Dutch. The letter included a list of Persian-Dutch word comparisons (Lipsius 1602: 56), probably the second one ever published (see Deneire and Van Hal [2006] for further background). Finally, there are strong indications that Raphelengius also inspired Philips van Marnix van Sint-Aldegonde (1540–1598), an influential Protestant politician, to further explore the Persian-Germanic correspondences (see Schulte [1879: 331] and Van Hal [2011: 154‒155]).

Although Scaliger and Lipsius were both key figures in launching the theory, both scholars had raised large questions over the general validity of the exclusive connection. While not casting doubt on the similarities between Dutch and Persian, Lipsius remarked that for many Persian words, the Latin equivalents seemed to be closer than the Dutch ones. We will see, indeed, how a considerable number of later authors did not exclusively focus on the correspondences between Persian and German, but investigated Persian’s parallels with languages such as Greek and Latin as well.¹³ Lipsius’s general tenet was that the evidential value of languages, given the continuous changes they underwent throughout time, was much too slippery to be of any significance in historical enquiries: “whoever is looking for solidity in a topic that is fundamentally unstable, viz. language, makes a mistake”.¹⁴ In other words, he thought that this kind of scholarship was doomed to failure. In addition, Scaliger emphasized that it was very hazardous to assume linguistic kinship from a narrow empirical foundation.

¹¹ The letter was published by Nauwelaerts and Sué (1983: 84 09 23 R).
¹² This Parallelon rerumpublicarum liber tertius was likely written about 1602, but published only in 1801–1803. See Grotius (1801–1803: 62‒63).
Unlike Lipsius, however, Scaliger was very interested in exploring sound bases to compare languages (see Van Hal 2010b). When it comes to assessing the similarities between Persian and Germanic, Scaliger – who was actually the first to have the idea – seems to have been puzzled by the nature of this closeness. Toward the end of his life, he stated that “nothing differs more from German than Persian”.15 However, there are good reasons to assume that this judgment should not be taken at face value. Judging by his “table talks” (published posthumously without his agreement and therefore not 13 Jan van den Driessche’s (Joannes Drusius, 1550–1616) plan to publish a book on Greek borrowings from Persian did not materialize (Drusius 1622: 925). 14 “Errat enim qui in re instabili maxime, id est lingua, quaerit firmitatem” (Lipsius 1602: 55). 15 “Nihil tam dissimile alii rei, quam Teutonismus linguae Persicae” (Scaliger in the introduction to Pontanus [1606]).

The alleged Persian-Germanic connection


Figure 2: List of Persian words that are similar to words in other languages, in Scaliger’s hand (Collection Leiden University Libraries: Cod. Scal. 57, f. 51)

entirely reliable), he had stressed the lexical similarities between the two languages (Scaliger 1740: 111). In addition, the Leiden University Library holds a draft of a Persian lexicon initiated by Raphelengius and continued by Scaliger (Cod. Scal. 57, f. 25‒50). This draft is followed by another draft (f. 51), unmistakably in Scaliger’s hand, encompassing a list of Persian words that are similar to words in other languages, among which many comparisons between Dutch and Persian appear (see Figure 2). He offers examples such as شاخ shākh ‘branch’, Latin ramus ‘branch’, Dutch tak ‘branch’ and عود ‘wd ‘wood’, Latin lignum, Dutch hout ‘wood’. Relying on all sources known and available, it is difficult to draw another conclusion than that Scaliger remained hesitant regarding this thorny issue.

3 The various explanations for the Persian-Germanic connection

It is safe to posit that the critical attention Lipsius and Scaliger had paid to the Persian-Germanic connection in the early days after its discovery contributed considerably to its swift and wide dissemination, since both scholars were generally regarded as among the very best of their generation. This is not to say that there were no scholars who preferred to simply ignore the hypothesis. In


Toon Van Hal

the preface to his Lexicon heptaglotton (1669), the English Orientalist Edmundus Castellus (1606?–1685), for instance, classified Persian among the daughter languages of Hebrew without speculating on its possible connection with any European language. Remarkably, he left unexplained why Persian was given a grammatical and lexical description separate from those he provided for Hebrew, Chaldean, Syrian, Ethiopic, Samaritan, and Arabic.16 There was also a significant group of scholars who virulently contested this alleged connection (Droixhe [1989] surveys the arguments of a number of such adversaries). Nevertheless, it is hard to deny that, in many texts, the Persian-Germanic theory was, in one way or another, mentioned and addressed as though the alleged connection was a well-established fact. A first glance at the sources immediately reveals that champions of the theory cannot be regarded as one cohesive group. Scholars who acknowledged the special connection did not always share the same views on the underlying causes explaining the linguistic similarities. First of all, some scholars limited themselves to observing the parallels without pondering possible explanations.17 A large number of scholars were convinced that early contacts between German and Persian tribes had given way to lexical convergences. In a book entitled De lingua Belgica [On the Dutch language], the Dordrecht pastor Abraham Mylius (1563–1637) took care to show how Dutch colonizers had once subjected an impressive geographical area, even reaching Persia. The similarities between Dutch and several other languages, including Persian, served to underpin this nationalist-like scheme. In contrast, Bernard Furmer (1542–1616) did not trace the Persians back to the Germans, but the other way around.18 Instead of explaining the parallels in terms of borrowings, some scholars preferred to see them as remnants of the lingua Adamica. 
As a matter of fact, the book of Genesis recalls how the infelicitous idea of building a tower in Babel that would reach to Heaven urged God to confuse the original language Adam

16 As reflected in the full title of his work: Lexicon heptaglotton, Hebraicum, Chaldaicum, Syriacum, Samaritanum, Aethiopicum, Arabicum conjunctim, et Persicum separatim.
17 Vulcanius pointed out that Persian and Dutch are “somehow” cognate without clearly defining the nature and the extent of this affinity: “aliquam enim eius esse cum Teutonica affinitatem vel ex eo constat quod multa vocabula utrique linguae inter se sunt communia” (Vulcanius 1597: 87). John Greaves (Gravius, 1602–1652) rounds off his very systematic Persian grammar with a comparative word list and with the observation that many Persian words are similar to English ones, without advancing a clear explanation (Gravius 1649: 89 eqs; see also Droixhe [Forthcoming]).
18 “Germanos autem illos ex Persia ortos nomen suum huic orbis parti contulisse, plerique docti et ex lingua et ex moribus utrique nationi communibus acute coniciunt” (Furmerius 1609: 12).


had invented in paradise. Although this language was thus damaged beyond repair, scholars such as the Leiden geographer Philippus Cluverius (1580–1622) theorized that some bits and pieces of it were still detectable across a number of languages of the world. In other words, he regarded not only the similarities between Persian and German, but also those among Latin, Greek, the Indian languages, and Hebrew as merely accidental relics of the primeval language, which was – apart from these scarce vestiges – irrevocably lost.19 It was, however, the previously mentioned Elichmann who would lay the foundations for a significant breakthrough (see Van Hal 2010a). An expert in Persian, he was convinced of its special kinship with Germanic, even to the point of constructing poems that could be understood in both Dutch and Persian alike (see above). The Persian-Germanic connection would become the backbone of the Scythian framework Elichmann designed. Apart from German and Persian, languages such as Latin and Greek would have descended from the languages of the “Scythians”. Two Leiden professors would contribute considerably to the further elaboration and dissemination of this groundbreaking idea. While in his earlier years, he still explained the parallels as Persian borrowings from Greek (Salmasius 1629: 1130), Claude de Saumaise (1588–1653) was the first to launch Elichmann’s idea in print (Salmasius 1643) and even to offer some reconstructions of Scythian numbers. The second Leiden professor who followed in Elichmann’s footsteps was Marcus Zuerius van Boxhorn (1612–1653); see, in this context, especially Boxhornius (1720). His enduring fervor for the “Scythian case” stood in contrast to the more ephemeral (yet more influential) interest shown by his rival Saumaise.

4 Evolving views on the nature of the Persian-Germanic connection and on the way of comparing languages

Throughout the seventeenth and eighteenth centuries, the idea of a Persian-Germanic connection never sank into oblivion. Many allusions, often very concise, were made in major encyclopedias,20 dictionaries,21 historical, chorographical,

19 “Quamobrem nihil mirum, si diversissimae, longeque inter se remotissimae per universum terrae orbem gentes communia habeant etiam nunc nonnulla rerum vocabula” (Cluverius 1631: 30; see also p. 59).
20 For example, in Zedler’s Universal-Lexicon, the most extensive eighteenth-century encyclopedia (see Ludovici 1741: 661), as well as in the third edition of the Encyclopaedia Britannica (Anon. 1797: 564).
21 See, e.g., Eberhard (1800: 248).


and antiquarian works,22 and even in question-and-answer books tailored for a younger readership.23 Moreover, knowledge of the hypothesis extended beyond the exclusive circle of learned men and women of letters. A number of missionaries and merchants, who can be regarded as linguistic fieldworkers overseas, were obviously familiar with the Persian-Germanic hypothesis when composing Persian dictionaries and grammars. The French Discalced Carmelite Ange de Saint Joseph (1636–1697) converted souls to the Christian faith in Persia and Turkey between 1664 and 1679 (Walsh 1907). His 1684 Gazophylacium linguae Persarum is a Persian grammar and dictionary in Latin, Italian, and French, containing a small section devoted to the analogues between Persian and the European languages (Angelus a S. Joseph 1684: 5‒7; see Droixhe [Forthcoming]). His Protestant peers in India observed some correspondences between German and Hindustani, which demonstrates that other languages belonging to what is now called the Indo-Iranian family were compared with the Germanic languages, once they became known to Western explorers.24 Another example is Joan Josua Kettler (Ketelaar; 1659–1718). After committing a number of crimes in his homeland, this Elbing-born German fled to the Netherlands, where he succeeded in making a career in the East India Company (Vogel 1936). About 1700, he authored a Dutch-language Instruction of the Hindustani and Persian languages, in which he compared some Persian words with Latin and Dutch examples.25 It is interesting to see that none of these authors felt urged to ponder the reasons underlying these parallels. Needless to say, it would be both unfeasible and undesirable to offer a complete chronological survey of all sources in which the Persian-Germanic hypothesis is mentioned, addressed, and elaborated upon. 
Quite remarkably, however, such extensive overviews can be found in some Early Modern and Modern sources.26 By concentrating on lesser-known source texts, this section

22 See, e.g., Schotanus (1664: 5‒6), Sammes (1676: 423), and Rapin de Thoyras (1743: 26).
23 See, e.g., Anon. (1785: 9).
24 “Sonst finden sich in der Hindostanischen Sprache viele Wörter, die ganz mit den Teutschen überein kommen. Als zum Exempel: Hand oder had, weil das n nicht gehöret wird, die Hand; Mu, der Mund; Bocker, ein Bock; Kamerband, ein Gürtel oder Band um die Lenden, Bandà, binden; Binde-bando, biedet einen Bündel; Man, ein Mann; welches letztere im Composito viel im Gebrauch ist, als Beraman, ein braver oder grosser Mann; Dutchman, ein Feind, u.s.w.” (Schultze 1747: 713).
25 Ketelaar (1700) – I owe this reference to Anna Pytlowany, who is currently preparing an edition of Ketelaar’s manuscript.
26 For example, Eccardus (1711: 209‒211), Odhelius and Celsius (1723: 18‒19), and especially Dorn (1827: 91‒135), presenting first the views of the champions and subsequently the criticisms of the opponents. Droixhe (1978: 81‒83) and Helander (2004: 367‒369) offer more recent lists.


will therefore explore the general dynamics of the ongoing debate and examine the gradual refinement and sophistication of the linguistic arguments used either to buttress or to undermine the theory. Many scholars relied solely on the lexical similarities between both languages in order to advocate the mutual kinship. From Vulcanius and Lipsius onward, champions of the Persian-Germanic connection typically offered a small list of about twenty similar words, often adding that many additional words could be supplied without any difficulty. Table 1, which is representative of later lists, reproduces the early yet little-known list provided by the Leiden professor Paullus Merula (1558–1607).

Table 1: List of Persian-Germanic comparisons, offered in Merula and Merula (1627: 544); the English translation column has been added by the author

“Persian”   Dutch       Latin               [English translation]
Choda       God         Deus                ‘God’
Phedar      Vader       Pater               ‘father’
Madar       Moeder      Mater               ‘mother’
Berader     Broeder     Frater              ‘brother’
Dochtar     Dochter     Filia               ‘daughter’
Nam         Naem        Nomen               ‘name’
Dandan      Tanden      Dentes              ‘teeth’
Lab         Lip         Labium              ‘lip’
Drog        Bedrogh     Mendacium           ‘fraud’, ‘lie’
Star        Ster        Stella              ‘star’
Mus         Muys        Mus                 ‘mouse’
Casti       Casse       Cista               ‘box’
Band        Band        Vinculum            ‘band’
Must        Most        Mustum              ‘unfermented wine’
Nau         Nieu        Novus               ‘new’
Du          Du          Tu                  ‘you’
Begrijst    Beschreyt   Lacrymis oppletus   ‘filled with tears’
Grijft      Grijpt      Tene                ‘hold’
Murd        Vermoord    Obtruncatus est     ‘murdered’
Ses         Ses         Sex                 ‘six’
Ta          Daer toe    Usque ad            ‘until’

This way of comparing languages met with thorough criticism. Reviewing Simon Pelloutier’s Histoire des Celtes, in which the similarities between German and Persian were discussed in some detail (Pelloutier 1740: 86‒89), an anonymous reviewer remarked: “Nevertheless, a few words that are somehow similar do not prove the identity of two languages. Nor do they even


demonstrate a common origin. They only hint at a natural adoption of words, which easily pass from one language to another”.27 Such criticisms had also been aired by Richard Verstegan (1548?–1636?), an English antiquary who had been working in Antwerp since 1585 or so. In his Restitution of decayed intelligence in antiquities concerning the most noble and renovvmned English nation (1605), we read:

Surely it is an opinion of a very slender confirmation, for that in deed there is no affinitie at all between those two languages, and albeit there may some half a dozen or half a score woords be found in the Persian, that are broken German woords, as Choda, Phedar, Madar, Beradar, Dochtar, Star, Band, for God / Father / Mother / Brother / Daughter / Star / Band / what affinitie makes this, when all the rest is altogether different? [. . .] By this it may be seen espetially to such as have any knowledge in the Duytsh toung, that between that and this, heer is no neernesse of affinitie at all, but as much farnesse as needeth to be (Verstegan 1605: 26‒27, 29).

He added that an investigation by Anthony Sherley (1565–1635), who was ambassador at the Persian court, had clearly revealed that Latin was closer to Persian than Dutch was to Persian. Both critics thus highlighted the poor quantitative foundations on which the claimed Persian-Germanic connections rested. It is precisely against Verstegan that the late seventeenth-century antiquarian Aylett Sammes (1636?–1679?) reacts. Verstegan’s argument, this Essex-born scholar states,

would be true if the words alledged were far fetched, and we were forced to run through a whole Dictionary to find only a few, and those as distant in signification as the Heaven and Earth is from each other, but were so nigh Relations, as Father, Mother, Brother and Daughter, which are alwaies in Peoples mouths, are called by the same names in two Languages, it seemeth not to happen by chance. (Sammes 1676: 423)

This remark testifies to the growing awareness of the importance attached to “basic vocabulary” when comparing languages in order to establish genealogical relations (see Van Hal 2015). This is, however, not to say that the line of reasoning followed by all authors relied solely on the shared lexicon. It is indeed noteworthy to find that, even in the early stages of this argument, both Raphelengius and Lipsius highlighted some grammatical similarities. Nevertheless, it is most likely in the wake of

27 “Cependant quelques termes à peu près semblables ne prouvent pas l’identité de deux langues, ni même une commune origine, mais seulement une adoption naturelle de mots, qui passent aisément d’une langue dans une autre” (Anon. 1741: 238). The controversial Jesuit Pierre François Guyot-Desfontaines (1685–1745) might have authored this review, as he was the initiator of the Observations sur les écrits modernes series.


Johannes Elichmann that Salmasius observed some similarities in inflection, conjugation, and word composition. One can safely state that the importance attached to structural and grammatical parallels gradually increased over time. The Swedish authors of an academic dissertation dedicated to the Persian-Gothic parallels emphasized the methodological significance of taking into account grammatical structures by relying on the authority of William Wotton (1666–1727), who had highlighted the importance of grammatical correspondences in another context.28 Other authors still had called attention to similarities in phraseology and syntax.29 Finally, it is interesting to observe that some scholars argued that the similarities in language went hand in hand with parallels in customs (see, e.g., Wellern 1753: 331‒332; Furmerius 1609: 12). Despite the various explanations given for the Persian-Germanic hypothesis in the first half of the seventeenth century (see above, section 3), it is remarkable to find that a very large number of late seventeenth- and eighteenth-century authors deliberately regarded Boxhorn’s Scythian theory as the most plausible framework for explaining the similarities between the languages. As a young man, Hugo Grotius saw no other explanation for the similarities he had observed between Persian and German than by contact, colonization, and borrowing (see above, section 2). By the end of his life, the meanwhile renowned diplomat would also subscribe to the Scythian hypothesis (Grotius 1655: 8). In his sixteenth Prolegomenon, discussing the “Persian language and the Persian versions of this Bible”, the biblical scholar Brian Walton (1600–1661), editor of a renowned Polyglot Bible, also brings up the Persian-Germanic similarities. 
After discussing the solutions and answers given by other scholars, he presents Boxhorn’s Scythian thesis, “which he will embrace, until a more plausible explanation will be advanced”.30 Along with the above-mentioned Simon Pelloutier (Pelloutier 1740), the Frisian professor Campegius Vitringa the Elder (1658–1722) was a champion of the Scythian framework. He asked whether the contacts and commerce between Persians and Germans had been so intense that they could explain such striking linguistic similarities, as some other Early Modern authors were inclined to think. In this respect, Vitringa himself refers, for instance, to Mylius (1612), but Thomas Hyde (1636–1703) still adhered to the contact theory in 1700 (see Hyde 1700). Vitringa, in turn, was convinced “that both languages,

28 Odhelius and Celsius (1723: 14‒15, 24), referring to Wottonius (1715) (cf. Droixhe [1994] on this dissertation). See also the substantial grammar-based motivation offered by Henselius (1741: 437‒460).
29 See, e.g., Levinus Warner (1619–1665) and August Pfeiffer (1640–1698) in Warnerus (1644: 14) and Pfeifferius (1704: 690).
30 “Probabilis tamen mihi videtur Boxhornii sententia, quam amplectendam sentio, donec aliquid probabilius adferatur” (Walton 1673: 419‒420).


Persian as well as Germanic, were born from Scythian”.31 The great polyhistor Gottfried Wilhelm Leibniz (1646–1716), whose interest in the world’s languages was ancillary to his fascination with prehistory and early migrations, also mentioned the Scythian theory as a possible explanation of the Persian-Germanic parallels (see, e.g., Babin and van den Heuvel 2004: 365, 843, 877), although he remained somewhat doubtful of the significance of the kinship (see now especially Droixhe [Forthcoming]). A final example is the Dutch Orientalist Albert Schultens (1686–1750), who gained his fame by offering a systematic and grammar-based account of the Semitic – or in his terminology, “Oriental” – languages. Schultens lucidly underlined that the very structure of Persian indicated that this language could not belong to this Oriental language group. Before massively adopting Arabic words, Schultens claimed, Persian was unmistakably of a Scythian-European origin (Schultens 1761: 194). It is, however, important to emphasize that the Scythian language was not always regarded as a lost parent language of German and Persian. On a number of occasions, scholars equated Scythian with their own mother tongues, so as to grant their language a time-honored status. Gradually, more details on the Persian language became known. It was probably due to a lack of Persian data that the Dordrecht preacher Abraham Mylius could not deal with this language in depth, despite the prominent presence of Persian in the subtitle of his 1612 book (see Mylius 1612). Most scholars entirely focused on Modern Persian. One of the sole seventeenth-century exceptions was Boxhorn, who sought to rely on old Persian words mentioned by classical authors – an exercise continued later by the Utrecht scholar Hadrianus Relandus (1676–1718) (Relandus 1707). From the second half of the eighteenth century onward, the European Republic of Letters witnessed a new and substantial extension of the study of Asian linguistics. 
The publication of new dictionaries describing Iranian languages other than Farsi enabled scholars to further substantiate the links.32 It was also in this period that a hitherto unknown, extinct Iranian language, so-called Zend-Avestan, came to the surface. A first edition of the text corpus was published by Abraham Hyacinthe Anquetil-Duperron (1731–1805) (see especially App 2010). It was, however, the (re)discovery of Sanskrit that would mesmerize the majority of scholars from the 1780s onward, thus somewhat eclipsing the focus on Persian.33

31 “Ego igitur multo faciliorem puto dari posse hujus convenientiae rationem ex illa hypothesi, quod utraque illa Lingua tam Persica quam Germanica nata sit ex Scythica” (Vitringa 1712: 100). See also Huet (1722: 102), discussed by Droixhe (1980).
32 See, e.g., Garzoni’s lexicon and grammar of the language of the Kurds (Garzoni 1787), which was commented upon by Kinderling (1795: 95).
33 See especially Benes (2008: 72‒73), who also offers some examples of scholars who kept believing in the primacy of Persian.


At the turn of the nineteenth century, we see that many scholars are inclined to explain the new data offered by Sanskrit in the context of the age-old Persian-Germanic (Scythian) theory. In other words, it seems safe to state that the (re)discovery of Sanskrit did not immediately give way to a fundamental rethinking of contemporary ideas on linguistic kinship. In a work published as late as 1819, the scholar Joseph Cherade de Montbron (1768–1854) still explained the similarities between Persian and German by referring to the invading Tartars or Scythians (Ch[erade] de Montbron 1819: 588‒589), although he also knew about new developments in Indian philology.34 The theory’s ongoing impact still echoes in Franz Bopp’s (1791–1861) groundbreaking work Über das Conjugationssystem (Bopp 1816), whose fifth chapter is devoted to the “conjugation der persischen Sprache und der alten germanischen Mundarten” (see in this respect Hiersche 1985: 157).35 Without providing any references, Windfuhr (1979: 155) even asserts that the idea of a Persian-German affinity is still somehow present in the folk wisdom of some regions in both Iran and Germany. It was the general aim of this article to show that the study of Persian greatly stimulated European ideas on linguistic kinship from the end of the sixteenth century onward. Droixhe (1984) has rightly stressed that the idea of the Persian-Germanic connection is thus a remarkable instance of continuity in the history of (pre)comparative linguistics, which was in other respects very often characterized by discontinuity.

References

Aarsleff, Hans. 1982. From Locke to Saussure. Minneapolis: University of Minnesota Press.
Angelus a S. Joseph. 1684. Gazophylacium Linguae Persarum, Triplici linguarum Clavi Italicae, Latinae, Gallicae, nec non specialibus praeceptis ejusdem linguae reseratum. Amstelodami: Ex officina Jansonio-Waesbergiana.
Anon. 1741. Observations sur les écrits modernes. Paris: Chaubert.
Anon. 1785. Epitome historica scientiarum et artium. Ad usum studiosae iuventutis. Dresdae: Ex officina Waltheria.
Anon. 1797. Philology. Encyclopædia britannica, or, A dictionary of arts, sciences, and miscellaneous literature, vol. 14: 485–569.
App, Urs. 2010. The birth of orientalism. Philadelphia: University of Pennsylvania Press.
Babin, Malte-Ludolf & Gerd van den Heuvel (eds.). 2004. [Gottfried Wilhelm Leibniz]: Schriften und Briefe zur Geschichte. Hannover: Verlag Hahnsche Buchhandlung.

34 See, e.g., Kolbe (1819: 504‒507), Hallenberg (1816), and Van Hal (2012) for other examples.
35 For similar nineteenth-century references to the Persian-German connection, see also Vater (1815: 186) and especially Frank (1808: 11).


Benes, Tuska. 2008. In Babel’s shadow: Language, philology, and the nation in nineteenth-century Germany. Detroit: Wayne State University Press.
Bopp, Franz. 1816. Über das Conjugationssystem der Sanskritsprache in Vergleichung mit jenem der griechischen, lateinischen, persischen und germanischen Sprache [. . .]. Frankfurt am Main: in der Andreäischen Buchhandlung.
Boxhornius, Marcus Zuerius. 1720. Epistola de Persicis Curtio memoratis vocabulis, eorumque cum Germanicis cognatione notis instructa. Edited by Jo. Henr. Von Seelen. Lubeccae: apud Petrum Boeckmannum.
Bruijn, Johannes T. P. de. 1987. Iranian studies in the Netherlands. Iranian Studies 20 (2/4). 161–177.
Bruijn, Johannes T. P. de. 1990. De ontdekking van het Perzisch. Rede uitgesproken bij de aanvaarding van het ambt van bijzonder hoogleraar in de cultuurgeschiedenis van Iran sedert de opkomst van de Islam aan de Rijksuniversiteit te Leiden op 9 maart 1990. Leiden: Rijksuniversiteit Leiden.
Ch[erade] de Montbron, J[oseph]. 1819. Essais sur la littérature des Hébreux. Paris: Louis Janet.
Cluverius, Philippus. 1631. [1616]. Germaniae antiquae libri tres [. . .]. Lugduni Batavorum: Ex officina Elzeveriana.
Deneire, Tom & Toon Van Hal. 2006. Lipsius tegen Becanus. Over het Nederlands als oertaal. Editie, vertaling en interpretatie van zijn brief aan Hendrik Schotti (19 december 1598). Amersfoort: Florivallis.
Dorn, Boris Andreevich. 1827. Über die Verwandtschaft des persischen, germanischen und griechisch-lateinischen Sprachstammes. Hamburg: J. A. Meissner.
Droixhe, Daniel. 1978. La linguistique et l’appel de l’histoire (1600‒1800). Rationalisme et révolutions positivistes. Genève: Droz.
Droixhe, Daniel. 1980. Le prototype défiguré. L’idée scythique et la France gauloise (XVIIe‒XVIIIe siècles). In E. F. K. Koerner (ed.), Progress in linguistic historiography. Papers from the International Conference on the History of the Language Sciences (Ottawa, 28‒31 August 1978), 123‒137. Amsterdam: Benjamins.
Droixhe, Daniel. 1984. Avant-propos. Genèse du comparatisme indo-européen = Histoire, Épistémologie, Langage 6 (2). 5–15.
Droixhe, Daniel. 1989. Boxhorn’s bad reputation. A chapter in academic linguistics. In Klaus D. Dutz (ed.), Kurzbeiträge der IV. Internationalen Konferenz zur Geschichte der Sprachwissenschaften (ICHoLS IV). Trier, 24‒27. August 1987, 359–384. Münster: Nodus.
Droixhe, Daniel. 1994. En attendant Bopp. Une dissertation sur la convenance du perse et du gothique de 1723. In Reinhard Sternemann (ed.), Bopp-Symposium 1992 der Humboldt-Universität zu Berlin. Akten der Konferenz vom 24.3.‒26.3.1992 aus Anlaß von Franz Bopps zweihundertjährigem Geburtstag am 14.9.1991, 53–71. Heidelberg: Winter.
Droixhe, Daniel. Forthcoming. The Failure of the Germano-Persian Kinship.
Drusius, Johannes. 1622. Veterum interpretum Graecorum in totum vetus Testamentum fragmenta. Arnhemiae: apud Iohannem Ianssonium.
Eberhard, Johann August. 1800. Versuch einer allgemeinen deutschen Synonymik in einem kritisch-philosophischen Wörterbuche der sinnverwandten Wörter der hochdeutschen Mundart. Halle & Leipzig: J. G. Ruff.
Eccardus, Johannes Georgius. 1711. Historia studii etymologici linguae Germanicae hactenus impensi; ubi scriptores plerique recensentur et diiudicantur, qui in origines et antiquitates linguae Teutonica, Saxonicae, Belgicae, Danicae, Suecicae, Norwegicae et Islandicae veteris item Celticae, Gothicae, Francicae atque Anglo-Saxonicae inquisiverunt, aut libros studium nostrae linguae criticum promoventes alios ediderunt. Accedunt et quaedam de lingua Venedorum in Germania habitantium, tandemque proprium de lexico linguae gG aperitur. Hannoverae: apud Nicolaum Foersterum.
Frank, Othmar. 1808. Das Licht vom Orient. Nürnberg & Leipzig: Lechner & Besson.
Furmerius, Bernardus. 1609. Annalium Phrisicorum libri tres. Franecarae: excudebat Aegidius Radaeus.
Garzoni, Maurizio. 1787. Grammatica e vocabolario della lingua Kurda. Roma: Sacra Congregazione di Propaganda Fide.
Goropius Becanus, Johannes. 1580. Opera hactenus in lucem non edita, nempe Hermathena, Hieroglyphica, Vertumnus, Gallica, Francica, Hispanica. Antverpiae: excudebat Christophorus Plantinus.
Gravius, Johannes. 1649. Elementa linguae Persicae. London: Flesher.
Grotius, Hugo. 1655. Historia Gotthorum Vandalorum et Langobardorum; praemissa sunt eiusdem Prolegomena ubi regum Gotthorum ordo et chronologia, cum elogiis; accedunt nomina appellativa et verba Gotthica, Vandalica, Langobardica cum explicatione. Amstelodami: apud Ludovicum Elzevirium.
Grotius, Hugo. 1801–1803. Batavi, parallelon rerumpublicarum liber tertius: de moribus ingenioque populorum Atheniensium, Romanorum, Batavorum. Vergelijking der Gemeenebesten. Derde boek: over de zeden en den inborst der Athenienseren, Romeinen en Hollanderen. Edited by Johan Meerman. Haarlem: Loosjes.
Gruntfest, Yaakov. 1995. On the history of the classification of Semitic languages. In Kurt R. Jankowsky (ed.), History of linguistics 1993. Papers from the Sixth International Conference on the History of the Language Sciences (ICHolS VI), Washington D.C., 9‒14 August 1993, 147–156. Amsterdam & Philadelphia: Benjamins.
Hallenberg, Jonas. 1816. Disquisitio de nominibus in lingua Suiogothica, lucis et visus cultusque solaris in eadem lingua vestigiis. Additae hinc inde sunt generaliores de linguarum origine observationes. Stockholmiae: Apud Direct. Henr. A. Nordström.
Hamilton, Alastair. 1989. Nam tirones sumus. Franciscus Raphelengius’ Lexicon Arabico-Latinum (Leiden 1613). In Marcus de Schepper and Francine De Nave (eds.), Ex officina Plantiniana. Studia in memoriam Christophori Plantini (ca. 1520–1589), 523–556. Antwerpen: Vereeniging der Antwerpsche Bibliophielen.
Helander, Hans. 2004. Neo-Latin literature in Sweden in the period 1620‒1720. Stylistics, vocabulary and characteristic ideas. Uppsala: University Library.
Henselius, Godofredus. 1741. Synopsis universae philologiae, in qua miranda unitas et harmonia linguarum totius orbis terrarum occulta e literarum, syllabarum vocumque natura et recessibus eruitur. Norimbergae: In commissis apud heredes Homannianos.
Hiersche, Rolf. 1985. Zu Etymologie und Sprachvergleichung vor Bopp. In Hermann M. Ölberg & Gernot Schmidt (eds.), Sprachwissenschaftliche Forschungen. Festschrift für Johann Knobloch, 157–169. Innsbruck: Institut für Sprachwissenschaft.
Huet, Pierre-Daniel. 1722. Huetiana, ou, Pensées diverses de M. Huet [. . .]. Paris: Chez Jacques Estienne.
Hyde, Thomas. 1700. Historia religionis veterum Persarum, eorumque magorum [. . .]. Oxonii: Sheldon.
Ketelaar, Johan Josua. 1700. Instructie of onderwijsinghe der Hindoustanse en Persiaanse taalen. http://objects.library.uu.nl/reader/index.php?obj=1874-44896&lan=nl&_ga=1.31740217.641540806.1461772719#page//31/92/68/31926888009404834546760021760428007804.jpg/mode/1up (accessed 31 December 2016).


Kinderling, Johann Friedrich August. 1795. Über die Reinigkeit der Deutschen Sprache und die Beförderungsmittel derselben [. . .]. Berlin: Bey Friedrich Maurer.
Kolbe, Karl Wilhelm. 1819. Über den Wortreichtum der deutschen und französischen Sprache. Zweite, ganz umgearbeitete Ausgabe. Berlin: In der Realschulbuchhandlung.
Larsen, Anne R. 2016. Anna Maria van Schurman, “the star of Utrecht”: The educational vision and reception of a savante. Farnham & Burlington: Ashgate.
Larsen, Anne R. & Steve Maiullo (eds.). 2017. Anna Maria van Schurman: Letters and poems to her mentor and other members of her circle. Toronto: The Toronto Center for Renaissance and Reformation Studies.
Lipsius, Justus. 1602. Epistolarum selectarum centuria prima [-tertia] ad Belgas. Antverpiae: ex officina Plantiniana, apud Ioannem Moretum.
Ludovici, Carl Günther (ed.). 1741. Grosses vollständiges Universal-Lexicon aller Wissenschafften und Künste [. . .]. Leipzig & Halle: J. H. Zedler.
Matthee, Rudi. 2009. The Safavids under western eyes: Seventeenth-century European travelers to Iran. Journal of Early Modern History 13 (2). 137–171.
Merula, Paullus & Guilielmus Merula. 1627. Tiid-thresoor: Ofte kort ende bondich verhael van den standt der kercken ende de vvereltlicke regieringe: Vervatende een beschrijvinge van alle de gedencvvaerdichste geschiedenissen over den gantschen aertbodem [. . .] : Alles beginnende vande geboorte Iesu Christi, totten jare 1627. Tot Leyden: Jan Claesz. van Dorp.
Mylius, Abraham. 1612. Lingua Belgica sive de linguae illius communitate tum cum plerisque aliis, tum praesertim cum Latina, Graeca, Persica, deque communitatis illius causis, tum de linguae illius origine et latissima per nationes quamplurimas diffusione, ut et de ejus praestantia. Lugduni Batavorum: Ulricus Cornelii et G. Abrahami.
Nauwelaerts, Marcel A. M. & Sylvette Sué (eds.). 1983. Iusti Lipsi Epistolae. 2: 1584‒1587. Brussels: Koninklijke Academie voor Wetenschappen, Letteren en Schone Kunsten van België.
Neander, Michael. 1581. Bedencken Michaelis Neandri An einenn Guten Herrn und Freund, Wie ein Knabe zu leiten und zu unterweisen [. . .]. s.n.: s.n.
Neckel, Gustav. 1929. Germanen und Kelten: historisch-linguistisch-rassenkundliche Forschungen und Gedanken zur Geisteskrisis. Heidelberg: Winter.
Odhelius, Olaus & Olof Celsius. 1723. Dissertatio philologico-historica de convenientia linguae Persicae cum Gothica [. . .]. Upsaliae: Literis Wernerianis.
Pelloutier, Simon. 1740. Histoire des Celtes et particulièrement des Gaulois et des Germains depuis les temps fabuleux jusqu’à la prise de Rome par les Gaulois. La Haye: chez Isaac Beauregard.
Pfeifferius, Augustus. 1704. Opera omnia quae extant philologica. Ultrajecti: ex officina Guilielmi Broedelet.
Pontanus, Johannes Isacius. 1606. Itinerarium Galliae Narbonensis, cum duplici appendice id est universae fere Galliae descriptione Philologica ac Politica. Cui accedit glossarium Prisco-Gallicum seu de lingua Gallorum veteri Dissertatio. Lugduni Batavorum: ex officina Thomae Basson.
Rapin de Thoyras, Paul. 1743. The history of England, 3rd edn. London: John and Paul Knapton.
Relandus, Hadrianus. 1707. De reliquiis veteris linguae Persicae. In Dissertationum Miscellanearum partes tres, vol. 2, 97–266. Trajecti ad Rhenum: Guilielmus Broedelet.
Salmasius, Claudius. 1629. Plinianae exercitationes in Caii Julii Solini Polyhistora. Item Caii Julii Solini Polyhistor ex veteribus libris emendatus. Parisiis: apud Hieronymum Drouart.

The alleged Persian-Germanic connection


Salmasius, Claudius. 1643. De Hellenistica commentarius, controversiam de lingua Hellenistica decidens et plenissime pertractans originem et dialectos Graecae linguae. Lugd[uni] Batavor[um]: ex officina Elseviriorum. Sammes, Aylett. 1676. Britannia Antiqua Illustrata: Or, the antiquities of ancient Britain, derived from the Phœnicians. London: Tho. Roycroft. Scaliger, Josephus Justus. 1579. In Manilii libros astronomicon commentarius et castigationes. Lutetiae: apud Mamertum Patissonium. Scaliger, Josephus Justus. 1740. Scaligerana, Thuana, Perroniana, Pithoeana, et Colomesiana. Ou remarques historiques, critiques, morales, et littéraires de Jos. Scaliger, J. Aug. de Thou, le Cardinal du Perron, Fr. Pithou, et P. Colomie’s. Avec les notes de plusieurs savans. Amsterdam: Covens & Mortier. Schmitt, Rudiger. 1996. Zu den “Germánioi” bei Herodot. Historische Sprachforschung 109 (1). 45–52. [Schotanus], [Christianus]. 1664. Beschryvinge van de heerlyckheydt van Frieslandt tusschen ’t Flie end de Lauwers, met nieuwe caerten van t landschap in’t algemeen soo oud als nieuw. [Franeker]; [Amsterdam]: excudit Johannes Wellens; Jacob van Meurs. Schulte, Joseph Wilhelm. 1879. (2) Gothica Minora. Zeitschrift für deutsches Altertum und deutsche Literatur 23‒24. 318‒336. Schultens, Albert. 1761. Origines Hebraeae sive Hebraeae linguae antiquissima natura et indoles ex Arabiae penetralibus revocata, 2nd edn. Lugduni Batavorum: Apud Samuelem et Joannem Luchtmans, et Joannem le Mair. Schultze, Benjamin. 1747. [52. Continuation] Herrn Missionarii Schultzens zu Madras Diarium vom Jahr 1739. In Gotthilf August Francke (ed.), Der Königl. Dänischen Missionarien aus Ost-Indien eingesandter Ausführlichen Berichten, Von dem Werck ihres Amts unter den Heyden, 710–742. Halle: Waisenhaus. Van der Stighelen, Katlijne. 1987. Anna Maria van Schurman of “Hoe hooge dat een maeght kan in de konsten stijgen”. Leuven: Universitaire Pers Leuven. Van Hal, Toon. 2007. 
Joseph Scaliger, puzzled by the similarities of Persian and Dutch? Omslag. Bulletin van de Universiteitsbibliotheek Leiden en het Scaliger Instituut. 1–3. Van Hal, Toon. 2010a. On “the Scythian Theory”. Reconstructing the outlines of Johannes Elichmann’s (1601/1602‒1639) planned Archaeologia harmonica. Language & History 53 (2). 70–80. Van Hal, Toon. 2010b. “Quam enim periculosa sit ea via. . .”. Josephus Justus Scaliger’s views on linguistic kinship. Beiträge zur Geschichte der Sprachwissenschaft 20 (1). 111–140. Van Hal, Toon. 2010c. Vulcanius and his network of language lovers. De literis et lingua Getarum sive Gothorum (1597). In Hélène Cazes (ed.), Bonaventura Vulcanius, Works and Networks (Bruges 1538 – Leiden 1614), 387–401. Leiden & Boston: Brill. Van Hal, Toon. 2011. The earliest stages of Persian-German language comparison. In Gerda Hassler (ed.), History of linguistics 2008. Selected papers from the Eleventh International Conference on the History of the Language Sciences (ICHoLS XI), Potsdam, 28th August–2nd September 2008, 147–165. Amsterdam & Philadelphia: Benjamins. Van Hal, Toon. 2012. Linguistics ante litteram. Compiling and transmitting views on language diversity and relatedness before the nineteenth century. In Rens Bod, Jaap Maat, and Thijs Weststeijn (eds.), The making of the humanities. From early modern to modern disciplines, 37–53. Amsterdam: Amsterdam University Press. Van Hal, Toon. 2015. Friedrich Gedike on why and how to compare the world’s languages: A stepping stone between Gottfried Wilhelm Leibniz and Wilhelm von Humboldt? Beiträge zur Geschichte der Sprachwissenschaft 25 (1). 53–76.


Vater, Johann Severin. 1815. Linguarum totius orbis index alphabeticus: quarum grammaticae, lexica, collectiones vocabulorum recensetur, patria significatur, historia adumbatur. Berolini: In officina libraria Fr. Nicolai. Verstegan, Richard. 1605. A restitution of decayed intelligence in antiquities concerning the most noble and renovvmed English nation. Antwerpen: by Robert Bruney. Vitringa, Campegius. 1712. Observationum sacrarum libri sex, in quibus de rebus varii argumenti, & utilissimae investigationis, critice ac theologice, disseritur; Sacrorum imprimis librorum loca multa obscuriora nova vel clariore luce perfunduntur. Franequerae: ex officina Wibii Bleck. Vogel, Jean Philippe. 1936. Joan Josua Ketelaar of Elbing, author of the first Hindūstānī grammar. Bulletin of the School of Oriental and African Studies 8 (2‒3). 817–822. Vulcanius, Bonaventura. 1597. De literis et lingua Getarum sive Gothorum. Item de notis Lombardicis. Quibus accesserunt specimina variarum linguarum, quarum indicem pagina quae praefationem sequitur ostendit. Lugduni Batavorum: ex officina Plantiniana, apud Franciscum Raphelengium. Walsh, Thomas. 1907. Ange de Saint Joseph. The Catholic encyclopedia 1. http://www.newadvent. org/cathen/01476b.htm (accessed 31 December 2016). Walton, Brian. 1673. Biblicus apparatus chronologico-topographico-philologicus. [. . .]. Tiguri: Ex Typographeo Bodmeriano. Warnerus, Levinus. 1644. Proverbiorum et sententiarum Persicarum centuria collecta. Lugduni Batavorum: Joannis Maire. Wellern, M. J. G. 1753. Gesammelte Spuren von dem Namen Germanus außer Deutschland. In Das Neueste aus der anmuthigen Gelehrsamkeit, 325–345. Leipzig: Breitkopf. Wiesehöfer, Josef. 2006. Karmania. In Hubert Cancik and Helmuth Schneider (eds.), Der Neue Pauly. Brill Online. http://referenceworks.brillonline.com/entries/der-neue-pauly/karmaniae609260 (accessed 31 December 2016). Windfuhr, Gernot L. 1979. Persian grammar. History and state of its study. 
Den Haag, Paris & New York: Mouton. Wottonius, Guilielmus. 1715. Oratio dominica in diversas omnium fere gentium linguas versa et propriis cujusque linguae characteribus expressa. Una cum dissertationibus nonnullis de linguarum origine variisque ipsarum permutationibus. In Johannes Chamberlaynius (ed.), Dissertatio de confusione linguarum Babylonica, 37–75. Amstelaedami: Typis Guilielmi & Davidis Goerei.

Shinji Ido

2 Huihuiguan zazi: A New Persian glossary compiled in Ming China

Abstract: Huihuiguan zazi, a New Persian glossary compiled in China during the Ming period (1368–1644), has been largely neglected in the linguistic study of Persian despite its obvious importance as a source of data on the historical development of New Persian. In this article, all entries in one particular manuscript of huihuiguan zazi are tabulated and supplemented with translations and transcriptions, thus rendering the linguistic information contained in the glossary easily accessible to linguists.

Keywords: huihuiguan zazi, Chinese, Persian

Huihuiguan zazi (lit. ‘Huihuiguan¹ literacy primer’), a New Persian² glossary compiled in China during the Ming period (1368–1644), has been largely neglected in the linguistic study of Persian despite its obvious importance as a source of data on the historical development of New Persian.³ In this article, all entries in one particular manuscript of huihuiguan zazi (hereafter abbreviated as zazi) are tabulated and supplemented with translations and transcriptions, thus rendering the linguistic information contained in the glossary easily accessible to linguists. The entries are reproduced in typescript in Table 1.

Zazi comprises hundreds of New Persian lexical items and their equivalents in Chinese. It thus offers a unique insight into the historical lexicology of New Persian as well as that of Chinese. Besides its value as a source of lexicological data, zazi has a further value that derives from its script: it presents New Persian lexical items in Chinese script, which, unlike Arabic script, does not (or rather cannot) dispense with the representation of vowels. Zazi hence serves as a rare source of data on the historical vowel phonology of New Persian.

1 Huihuiguan ‘office for/of Islamic states’ was the division that was in charge of Persian translation within the Ming dynasty Siyiguan ‘Four Barbarians’ Office’.
2 In the present paper, the term “New Persian” is used somewhat loosely in general reference to the Persian language after the Islamic conquest (see Paul 2013 and Utas 2006: 244–245 for issues of periodization of New Persian), while the present-day varieties of New Persian such as Tajik and Afghan Dari are referred to by their respective names.
3 The reader is referred to Honda (1963) and Liu (2008) for more detailed philological explanations of the glossary.

Shinji Ido, Nagoya University
DOI 10.1515/9783110455793-003

Surviving manuscripts and printed copies of zazi are not uniform in their contents and formats (see Honda 1963: 2, 56–58). There are basically two types of zazi: type 1 zazi lists New Persian lexical items both in their original forms (i.e., in Arabic script) and in their transcribed forms (i.e., in Chinese script), whereas type 2 zazi dispenses with the former and is written entirely in Chinese script. The two types also differ in the entries they comprise (see Honda 1963: 2–3).

The manuscript whose entries are reproduced in this article is the Berlin Library Manuscript (hereafter Berlin Manuscript), which is a type 1 zazi manuscript. The manuscript contains a total of 773 entries, all of which are reproduced in Table 1. Table 1 is supplemented with four entries (531st to 534th entries in Table 1), which are absent from the Berlin Manuscript but are present in other copies of type 1 zazi.⁴

This article follows Ido (2015: 102–104) in identifying the variety of New Persian documented in type 1 zazi as an early fifteenth-century variety of New Persian that had currency in the Timurid court in Samarkand. Accordingly, the New Persian variety whose words are recorded in the Berlin Manuscript will be referred to simply as Timurid Persian in the remainder of this article. This article also follows Ido (2015: 107) in assuming that the Chinese-script transcription of Timurid Persian words in type 1 zazi is based on the Ming-period Chinese dialect of Beijing, where Siyiguan (lit. ‘Four barbarians’ office’), the bureau of translation responsible for the compilation of type 1 zazi, was situated.
The Berlin Manuscript divides, as do other copies of type 1 zazi, its entries into eighteen sections, whose headings may be translated into English as (the sections for) “astrology”, “geography”, “season/time”, “person”, “human”, “body”, “palace”, “birds and beasts”, “flowers and trees”, “utensils”, “clothing”, “eating and drinking”, “treasure”, “voice and countenance”, “literature and history”, “four quarters”, “amount/number”, and “currency”, respectively.⁵ Each entry in the Berlin Manuscript consists of a Timurid Persian lexical item written in Arabic script, its equivalent in Chinese, and the Chinese-script transcription of the Timurid Persian lexical item. See, for example, the entry for ﻣﻐﻮﻝ (91st entry in Table 1) reproduced below in typescript.

Timurid Persian: ﻣﻐﻮﻝ
Ming-period Beijing Chinese: 韃靼
Chinese-script transcription: 卯斡勒

4 The Berlin Manuscript is unique among various copies of type 1 zazi in comprising an appendix that contains 223 entries (Honda 1963: 2). The appendix is not reproduced in this paper. The Berlin Manuscript is available online on the website of the Staatsbibliothek zu Berlin (Siyiguan 2013 [1579]). This article concerns the section titled “6. Persisch (Suppl. Vol. 12; Texte Vol. 23)”. The appendix, on the other hand, is indexed as “12. Persisch (Suppl. zu Vol. 6)”.
5 The section headings are 天文門, 地理門, 時令門, 人物門, 人事門, 身體門, 宮室門, 鳥獸門, 花木門, 器用門, 衣服門, 飲食門, 珍寳門, 聲色門, 文史門, 方隅門, 數目門, and 通用門, respectively.

In this entry, the Timurid Persian word ﻣﻐﻮﻝ is followed by its Ming-period Beijing Chinese equivalent, namely 韃靼, which in turn is followed by 卯斡勒, a string of Chinese glyphs used as syllabic phonograms to jointly represent the Timurid Persian pronunciation of ﻣﻐﻮﻝ.

In Table 1, each entry in the Berlin Manuscript is assigned a unique number and is supplemented with the following: an English equivalent of the Chinese word, a reconstructed Ming-period Beijing pronunciation of the Chinese-script transcription, the present-day Standard Chinese pronunciation of the Chinese-script transcription, and the present-day Tajik word that appears to be cognate with (or, in the case of an entry word that is a loanword, the Tajik form of) the Timurid Persian word. The order in which these items appear in Table 1 is as follows.

1. Entry number: 91
2. Timurid Persian word: ﻣﻐﻮﻝ
3. Ming-period Beijing Chinese equivalent of 2: 韃靼
4. English equivalent of 3: ‘post-imperial Mongol(ian)’
5. Chinese-script transcription of 2: 卯斡勒
6. Reconstructed Ming-period Beijing Chinese pronunciation of 5: /muɒʊ.•.lɛ/
7. Present-day Standard Chinese pronunciation of 5: /mɑu.wo.lɤ/
8. Present-day Tajik word that corresponds to, or is cognate with, 2: муғул

In Table 1, these items are arranged horizontally, thus:

Column | 1. | 2. | 3. | 4. | 5. | 6. | 7. | 8.
Entry | 91 | ﻣﻐﻮﻝ | 韃靼 | post-imperial Mongol(ian) | 卯斡勒 | muɒʊ.•.lɛ | mɑu.wo.lɤ | муғул
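The eight-column layout described above can be pictured as a small record type. The following Python fragment is only an illustrative sketch: the class name `ZaziEntry`, its field names, and the helper `as_row` are hypothetical labels chosen here, not terms used in the glossary or in the literature cited. The values are those given above for the 91st entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZaziEntry:
    """One row of Table 1; the field order follows columns 1 to 8 (names are hypothetical)."""
    number: int              # column 1: entry number
    timurid_persian: str     # column 2: entry word in Arabic script
    chinese_equivalent: str  # column 3: Ming-period Beijing Chinese equivalent
    english_gloss: str       # column 4: English equivalent of column 3
    transcription: str       # column 5: Chinese-script transcription of column 2
    ming_reading: str        # column 6: reconstructed Ming-period Beijing reading
    standard_reading: str    # column 7: present-day Standard Chinese reading
    tajik: str               # column 8: present-day Tajik counterpart


# The 91st entry, exactly as it is presented in the text.
ENTRY_91 = ZaziEntry(
    number=91,
    timurid_persian="ﻣﻐﻮﻝ",
    chinese_equivalent="韃靼",
    english_gloss="post-imperial Mongol(ian)",
    transcription="卯斡勒",
    ming_reading="muɒʊ.•.lɛ",
    standard_reading="mɑu.wo.lɤ",
    tajik="муғул",
)


def as_row(entry: ZaziEntry) -> list[str]:
    """Return the eight cells of an entry in the horizontal order used in Table 1."""
    return [
        str(entry.number),
        entry.timurid_persian,
        entry.chinese_equivalent,
        entry.english_gloss,
        entry.transcription,
        entry.ming_reading,
        entry.standard_reading,
        entry.tajik,
    ]


print(" | ".join(as_row(ENTRY_91)))
```

Keeping the two pronunciation columns as dot-separated strings mirrors the notation of Table 1, so that syllables in columns 6 and 7 can be compared position by position.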

Notes on each of columns 1 to 8 follow.

1. Entry numbers in Table 1 correspond with those used by Honda (1963), and hence also with those used by Liu (2008), who adopts Honda’s numbering.

2. In preparing Table 1, care was taken not to “correct” the handwritten Timurid Persian words to the current New Persian orthography. In other words, I tried to replicate in Table 1 (as faithfully as the availability of letters and symbols in the font allowed) the handwritten Timurid Persian words as they appear in the manuscript.⁶ As a result, Table 1 preserves “deviant” spellings (which may reflect some phonological characteristics of Timurid Persian), misspellings, and genuine slips of the brush found in the manuscript. For instance, the 309th entry word in the Berlin Manuscript, namely ﺍﻧﺪﺍﻥ, is presented in Table 1 as ﺍﻧﺪﺍﻥ, though it is apparently cognate with present-day New Persian ﺩﻧﺪﺍﻥ dandān⁷ ‘tooth’. Similarly, the 732nd entry word in the Berlin Manuscript, namely ﮐﺪﺷﺘﻦ, is reproduced “as is” in Table 1, despite the spelling of present-day New Persian ﮔﺬﺷﺘﻦ gozaštan⁸ ‘to pass’.⁹ For the simple reason that we know little about Timurid Persian phonology, no phonological representations of Timurid Persian words are provided in Table 1.

3. Where discrepancies between the assumed meaning of a Timurid Persian lexical item and that of its Chinese equivalent are large, such as in the 396th entry where Timurid Persian ﺟﺎﻧﻮﺭ, an apparent cognate of contemporary New Persian ﺟﺎﻧﻮﺭ jān(e)var¹⁰ ‘animal’, and Chinese 鶯 ‘oriole’ are shown to be each other’s equivalents, they are mentioned in notes.¹¹

6 The reader is advised to consult the original manuscript (Siyiguan 2013 [1579]) to examine their forms in handwriting.
7 For simplicity, only the Iranian Persian romanization of ﺩﻧﺪﺍﻥ is provided here. The same word is romanized as dandaan in Sayd’s (2009: 153) Dari romanization. The Tajik equivalent of the word is дандон.
8 For simplicity, only the Iranian Persian romanization of ﮔﺬﺷﺘﻦ is provided here. The same word is romanized as gozashtan in Sayd’s (2009: 247) Dari romanization. The Tajik equivalent of the word is гузаштан.
9 These examples testify to orthographical differences between Timurid Persian and the New Persian of today, with one salient difference being the absence of the letter ﮒ in the former. This may not be entirely surprising; Kuroyanagi (1984: 98) observes that the invention of ﮒ postdates that of ﭺ, ﭖ, and ﮊ (which, according to Kuroyanagi, already existed in the late thirteenth century) by several centuries. While the Berlin Manuscript contains no instance of ﮒ, it does contain one instance of the letter ݣ in the 330th entry word of the manuscript, namely ﯕﺮﺩﻥ. Incidentally, the use of ݣ is less restricted in the two Beijing Library manuscripts reproduced in Beijing tushuguan guji chuban bianji zu (1987–[1994]: 465–572), in which there are multiple instances of the letter.
10 For simplicity, only the Iranian Persian romanization of ﺟﺎﻧﻮﺭ is provided here. The same word is romanized as jaan(a)war in Sayd’s (2009: 94) Dari romanization. The Tajik equivalent of the word is ҷон(а)вар.
11 Such discrepancies do not have to result from mistranslation. At least theoretically, they can be ascribed to “false friendship” in Timurid Persian and contemporary New Persian or that in Ming-period Beijing Chinese and present-day Standard Chinese, as there is no guarantee that ﺟﺎﻧﻮﺭ meant ‘animal’ among Timurids in Samarkand in the early fifteenth century.


4. For translating Chinese words into English, I consulted Tōdō and Kanō (2005) and Kleeman and Yu (2010). An effort was made to present the meanings that the Chinese words had in the early Ming period, during which type 1 zazi was likely first compiled (Ido 2015: 100–104). For example, the word 韃靼 in the example shown above, which is currently used primarily in reference to ‘Tatar’ in Standard Chinese, meant ‘post-imperial Mongol(ian)’ in Ming China, hence the English translation.¹²

5. Some Chinese-script transcriptions in the Berlin Manuscript contain obvious slips of the brush. Such scribal errors are found in entries 166, 175, 415, 522, 678, and 755, and are pointed out in notes. There is also one illegible glyph in the 270th entry, which is likewise pointed out in a note.

6. The reconstructed Ming-period Chinese pronunciations presented in this column are based on Lu’s (1988) reconstruction of the readings of the Chinese glyphs contained in dengyun tujing, a set of rhyme tables compiled in Ming China.¹³ However, a number of Chinese glyphs occur in zazi whose Ming-period readings are not retrievable from dengyun tujing. The symbol • in this column serves as a placeholder for such glyphs. Thus, the Ming-period Beijing Chinese pronunciation of 卯斡勒 is shown to be /muɒʊ.•.lɛ/ in this column because the reading of the glyph 斡 is not retrievable from dengyun tujing. The glyph 阿, which appears in column 5, is shown to have the reading of “ɔ/ɑ” in this column. This is because the glyph likely had multiple readings in Ming-period Beijing (as in fact it still does in Standard Chinese today) and no good reason seems to exist for limiting its reading in zazi to one of /ɔ/ and /ɑ/.¹⁴ Note that the reconstructed Ming-period Beijing Chinese pronunciations presented in this article are not beyond dispute, nor do they necessarily represent the pronunciations that the compiler(s) of type 1 zazi used. For instance, 法, 府, 番, and 夫, which typically appear in the Chinese-script transcriptions of Timurid Persian words comprising the letter ﻑ, are respectively pronounced /fa/, /fu/, /fan/, and /fu/ (i.e., with the voiceless labio-dental fricative) in today’s Standard Chinese, while the glyphs’ reconstructed Ming-period Beijing Chinese pronunciations presented in this column are /puɑ/, /piu/, /puiɛn/, and /piu/, respectively. This may suggest that the reconstruction needs revision, and/or that considerable variation exists within what I simply refer to as Ming-period Beijing Chinese in this article.¹⁵

7. The present-day Standard Chinese pronunciations of the Chinese-script transcriptions in the Berlin Manuscript are supplied in Table 1 primarily to allow an inference as to how the syllables shown with • in column 6 (see above) may have been pronounced in Ming-period Beijing.¹⁶ A number of glyphs have multiple readings in Standard Chinese, but in this column only one reading is assigned to each glyph. For example, I identify the reading of the glyph 都 not as dōu /tou/ but as dū /tu/ and that of 血 not as xuè /ɕɥe/ but as xiě /ɕje/ because the latter seem more in line with the reconstructed Ming-period Beijing Chinese readings of the glyphs.¹⁷

8. The Tajik data in this column are provided in the hope that they are useful in studying the phonology of Timurid Persian because Tajik is apparently more closely related to Timurid Persian than are other major varieties of contemporary New Persian such as Iranian Persian and Afghan Dari (in terms of the vowel system at any rate; see Ido [2015]). The symbol × used in this column indicates that I could not find a Tajik word that is in apparent correspondence with (or that is an apparent cognate of) the Timurid Persian entry in the published Tajik-language sources available to me.¹⁸ English translations are provided in notes for Tajik words whose meanings differ markedly from those presented in column 4.

12 Many of the English translations supplied in this column abstract away from the distinction between different parts of speech because many Chinese words are not readily amenable to that distinction in the absence of context (see Norman 1998: 157). For example, the glyph 差 in the 227th entry of type 1 zazi, for which only ‘send; differ’ is shown in Table 1, may mean any of ‘send’, ‘differ’, ‘different’, ‘difference’, and ‘somewhat’, depending on the context in which it occurs.
13 Lu’s reconstruction is used here with slight modifications made in accordance with Tōdō (1957: 104–108), Satoh (1981), and Ye (2001: 140–153). See Nagashima (1941: 23–24) for a more detailed philological explanation of dengyun tujing, which can be literally translated as ‘illustrated book of rhyme classification’.
14 An analysis of dengyun tujing reveals multiple readings of some glyphs that occur in column 5, but, in this column, with the exception of 阿 (indicated in this section as ɔ/ɑ), only one reading is assigned to each of the glyphs. The reader is referred to Ido (2015: 117–118) for an explanation of how particular readings are selected for the glyphs.
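The placeholder convention of column 6, and the use of column 7 to inspect the syllables left open there, can be illustrated with a short Python sketch. It is not the procedure used to compile Table 1: the function names and the miniature lookup table are hypothetical, and the lookup contains only the two readings that can be read off the example /muɒʊ.•.lɛ/ given for 卯斡勒 in note 6.

```python
PLACEHOLDER = "•"  # marks a glyph whose Ming-period reading is not retrievable

# Hypothetical miniature lookup, limited to the readings implied by the
# example 卯斡勒 = /muɒʊ.•.lɛ/; 斡 is deliberately absent.
MING_READINGS = {"卯": "muɒʊ", "勒": "lɛ"}


def ming_pronunciation(transcription: str, readings: dict[str, str]) -> str:
    """Join per-glyph readings with '.', inserting the placeholder for unknown glyphs."""
    return ".".join(readings.get(glyph, PLACEHOLDER) for glyph in transcription)


def unresolved_syllables(ming: str, standard: str) -> list[tuple[int, str]]:
    """Pair each placeholder position in column 6 with the column 7 syllable at that position."""
    ming_syllables = ming.split(".")
    standard_syllables = standard.split(".")
    if len(ming_syllables) != len(standard_syllables):
        raise ValueError("columns 6 and 7 must contain the same number of syllables")
    return [
        (position, std)
        for position, (old, std) in enumerate(zip(ming_syllables, standard_syllables), start=1)
        if old == PLACEHOLDER
    ]


column_6 = ming_pronunciation("卯斡勒", MING_READINGS)
print(column_6)
print(unresolved_syllables(column_6, "mɑu.wo.lɤ"))
```

For the 91st entry, the second function singles out the second syllable, for which only the present-day Standard Chinese reading is available as a hint. As note 6 cautions, such hints are not evidence of the Ming-period reading itself.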

15 Alternatively, it could suggest variation within Timurid Persian or that Timurid Persian ﻑ in zazi represents a bilabial plosive.
16 The IPA transcription in this section is based on Lin’s (2007: 123–135, 283–292) phonetic description of the pīnyīn romanization system.
17 Similarly, 恁 is read not nèn /nən/ but nín /nin/ while 塞 is read not sāi /sai/ but sè /sɤ/ in this section.
18 The Tajik spellings of many archaic words are retrieved from Šukurov et al. (1969a, 1969b).


Table 1: Huihuiguan zazi (the Berlin Library Manuscript)

The “astrology” section 天文門

No. | Timurid Persian | Chinese | English | Transcription | Ming-period Beijing | Standard Chinese | Tajik
1 | ﺁﺳﻤﺎﻥ | 天 | sky | 阿思媽恩 | ɔ/ɑ.sɿ.muɑ.ən | a.sɹ̩.ma.ən | осмон
2 | ﺁﻓﺘﺎﺏ | 日 | sun | 阿夫他卜 | ɔ/ɑ.piu.tʰɑ.• | a.fu.tʰa.pu | офтоб
3 | ﻣﺎه | 月 | moon | 媽黒 | muɑ.xɛ | ma.xei | моҳ
4 | ﺳﺘﺎﺭه | 星 | star | 洗他勒 | si.tʰɑ.lɛ | ɕi.tʰa.lɤ | ситора
5 | ﺍﺑﺮ | 雲 | cloud | 阿卜兒 | ɔ/ɑ.•.• | a.pu.əɹ | абр
6 | ﺑﺎﺩ | 風 | wind | 巴得 | puɑ.tɛ | pa.tɤ | бод
7 | ﺑﺎﺭﺍﻥ | 雨 | rain | 把剌恩 | puɑ.•.ən | pa.la.ən | борон
8 | ﺷﺒﻨﻢ | 露 | dew | 舍卜南 | •.•.nan | ʂɤ.pu.nan | шабнам
9 | ﭘﺸﮏ | 霜 | frost | 僕石克 | •.•.• | pʰu.ʂɹ̩.kʰɤ | пашк
10 | ﺑﺮﻑ | 雪 | snow | 百兒夫 | puɛ.•.piu | pai.əɹ.fu | барф
11 | ﺭﻋﺪ | 雷 | thunder | 勒阿得 | lɛ.ɔ/ɑ.tɛ | lɤ.a.tɤ | раъд
12 | ﺑﺮﻕ | 電 | lightning | 百兒革 | puɛ.•.kɛ | pai.əɹ.kɤ | барқ
13 | ﻗﻮﺱ ﻗﺰﺥ | 虹 | rainbow | 髙思古則黒 | kɒʊ.sɿ.ku.ʦɛ.xɛ | kɑu.sɹ̩.ku.ʦɤ.xei | қавси қузаҳ
14 | ﺑﻨﺎٜﺕ ﺍﻟﻨﻌﺶ | 斗 | Plough (Big Dipper) | 百納呑納阿石 | puɛ.nɑ.tʰun.nɑ.ɔ/ɑ.• | pai.na.tʰwən.na.a.ʂɹ̩ | Банотуннаъш
15 | ﺑﺨﺎﺭ | 煙 | smoke, vapour | 卜哈兒 | •.xɑ.• | pu.xa.əɹ | бухор
16 | ﻏﺒﺎﺭ | 霧 | fog, mist | 五巴兒 | •.puɑ.• | wu.pa.əɹ | ғубор
17 | ﻳﺦ | 氷 | ice | 夜黒 | iɛ.xɛ | je.xei | ях
18 | ﻳﺨﭽﻪ | 雹 | hail | 夜黒徹 | iɛ.xɛ.• | je.xei.tʂʰɤ | яхча
19 | ﺻﺎﻋﻘﻪ | 霆 | thunderbolt | 撒額革 | sɑ.•.kɛ | sa.ɤ.kɤ | соиқа
20 | ﺁﺗﺶ | 火 | fire | 阿忒石 | ɔ/ɑ.tʰɛ.• | a.tʰɤ.ʂɹ̩ | оташ
21 | ﻧﻮﺭ | 光 | light | 奴兒 | nu.• | nu.əɹ | нур
22 | ﺳﺎﯾﻪ | 影 | shadow | 撒夜 | sɑ.iɛ | sa.je | соя
23 | ﺭﻭﺷﻦ | 明 | clear, bright | 羅山 | luɔ.ʂan | lwo.ʂan | равшан
24 | ﺗﺎﺭﯾﮏ | 暗 | dark | 他列克 | tʰɑ.liɛ.• | tʰa.lje.kʰɤ | торик
25 | ﺑﺎﺩ ﺻﺒﺎ | 東風 | east wind; spring wind | 巴得塞巴 | puɑ.tɛ.sɛ.puɑ | pa.tɤ.sɤ.pa | боди сабо
26 | ﺑﺎﺩ ﺳﻤﻮﻡ | 薫風 | south wind; early summer breeze | 巴得塞木恩 | puɑ.tɛ.sɛ.•.ən | pa.tɤ.sɤ.mu.ən | боди самум¹⁹
27 | ﺑﺎﺩ ﺩﺑﻮﺭ | 金風 | autumn breeze | 巴得得卜兒 | puɑ.tɛ.tɛ.•.• | pa.tɤ.tɤ.pu.əɹ | боди дабур²⁰
28 | ﺑﺎﺩ ﺻﺎﯾﻢ | 朔風 | north wind | 巴得撒因 | puɑ.tɛ.sɑ.in | pa.tɤ.sa.jin | ×²¹
29 | ﺩﺍﺭه | 日運 | solar halo²² | 打勒 | tɑ.lɛ | ta.lɤ | ×
30 | ﻫﺎﻟﻪ | 月運 | lunar halo²³ | 哈勒 | xɑ.lɛ | xa.lɤ | ҳола
31 | ﺑﺪﺭ | 圓月 | full moon | 百得兒 | puɛ.tɛ.• | pai.tɤ.əɹ | бадр
32 | ﻣﺤﺎﻕ | 殘月 | morning moon | 母哈革 | mu.xɑ.kɛ | mu.xa.kɤ | муҳок / маҳоқ / миҳоқ
33 | ﮐﺴﻮﻑ | 日蝕 | solar eclipse | 苦蘇夫 | kʰu.su.piu | kʰu.su.fu | кусуф
34 | ﺧﺴﻮﻑ | 月蝕 | lunar eclipse | 虎蘇夫 | xu.su.piu | xu.su.fu | хусуф
35 | ﮊﺍﻟﻪ | 霖雨 | long-continued rain | 惹勒 | ʐɛ.lɛ | ɹɤ.lɤ | жола

19 Modern Tajik боди самум ‘hot, harmful desert wind’.
20 Modern Tajik боди дабур ‘west wind’.
21 Timurid Persian ﺑﺎﺩ ﺻﺎﯾﻢ would be written in Tajik as боди соим lit. ‘faster’s wind’, which, however, seems to lack direct semantic correspondence with 朔風 ‘north breeze’.
22 This is actually the meaning of 日暈 rìyùn, with which 日運 is homophonous (in present-day Standard Chinese).
23 This is actually the meaning of 月暈 yuèyùn, with which 月運 is homophonous (in present-day Standard Chinese).


No. | Timurid Persian | Chinese | English | Transcription | Ming-period Beijing | Standard Chinese | Tajik
36 | ﺛﺎﺑﺘﺎﺕ | 雜星 | miscellaneous stars | 撒必他忒 | sɑ.pi.tʰɑ.tʰɛ | sa.pi.tʰa.tʰɤ | собитот
37 | ﺳﻴﺎﺭﺍﺕ | 七政 | sun, moon, Mercury, Mars, Venus, Jupiter, and Saturn | 塞呀剌忒 | sɛ.•.•.tʰɛ | sɤ.ja.la.tʰɤ | саёрот²⁴
38 | ﺻﺒﺞ ﺻﺎﺩﻕ | 天暁 | dawn | 速卜黒撒的革 | •.•.xɛ.sɑ.ti.kɛ | su.pu.xei.sa.ti.kɤ | субҳи содиқ
39 | ﻫﻮﺍ | 天氣 | weather | 黒洼 | xɛ.• | xei.wa | ҳаво
40 | ﺍﻧﺠﻼ | 復圓 | fourth contact (astronomy) | 尹知剌 | •.•.• | jin.tʂɹ̩.la | инҷило²⁵

The “geography” section 地理門

No. | Timurid Persian | Chinese | English | Transcription | Ming-period Beijing | Standard Chinese | Tajik
41 | ﮐﻮه | 山 | mountain | 科黒 | kʰuɔ.xɛ | kʰɤ.xei | кӯҳ
42 | ﺟﻮﯼ | 河 | stream; river; Yellow River | 卓衣 | tʂuɔ.i | tʂwo.ji | ҷӯй²⁶
43 | ﺭﻭﺩ | 江 | (large) river; Yangtze River | 魯得 | lu.tɛ | lu.tɤ | рӯд²⁷
44 | ﺩﺭﯾﺎ | 海 | sea | 得兒呀 | tɛ.•.• | tɤ.əɹ.ja | дарё²⁸
45 | ﺧﺎﮎ | 土 | earth | 哈克 | xɑ.• | xa.kʰɤ | хок
46 | ﺯﻣﯿﻦ | 地 | ground | 則米尹 | ʦɛ.mi.• | ʦɤ.mi.jin | замин
47 | ﺁﺏ | 水 | water | 阿卜 | ɔ/ɑ.• | a.pu | об
48 | ﭼﺸﻤﻪ | 泉 | fountain; spring | 扯石黙 | tʂʰɛ.•.• | tʂʰɤ.ʂɹ̩.mwo | чашма
49 | ﺧﺎﻧﺒﺎﻟﻎ²⁹ | 京 | capital (city) | 罕巴力額 | xan.puɑ.•.• | xan.pa.li.ɤ | ×
50 | ﻣﻤﻠﮑﺖ | 國 | country; state | 滿剌克忒 | muan.•.•.tʰɛ | man.la.kʰɤ.tʰɤ | мамлакат
51 | ﺷﻬﺮ | 城 | city; town | 舍黒兒 | •.xɛ.• | ʂɤ.xei.əɹ | шаҳр
52 | ﻧﻮﺍﺣﯽ | 境 | border; region | 納洼黒 | nɑ.•.xɛ | na.wa.xei | навоҳӣ
53 | ﺭﻭﺳﺘﺎ | 村 | village | 羅思他 | luɔ.sɿ.tʰɑ | lwo.sɹ̩.tʰa | русто
54 | ﺑﯿﺎﺑﺎﻥ | 野 | wide empty plains; field | 比呀巴恩 | pi.•.puɑ.ən | pi.ja.pa.ən | биёбон³⁰
55 | ﺯﺭﺍﻋﺖ | 田 | field; farmland | 即剌額忒 | •.•.•.tʰɛ | ʨi.la.ɤ.tʰɤ | зироат³¹
56 | ﺑﺎﻍ | 園 | garden | 巴額 | puɑ.• | pa.ɤ | боғ
57 | ﮐﺮﺩ | 塵 | dust, dirt | 革兒得 | kɛ.•.tɛ | kɤ.əɹ.tɤ | гард
58 | ﺭﯾﮏ | 沙 | sand | 列克 | liɛ.• | lje.kʰɤ | рег
59 | ﺳﻨﮏ | 石 | stone | 桑克 | sɑŋ.• | saŋ.kʰɤ | санг
60 | ﺭﺍه | 路 | road | 剌黒 | •.xɛ | la.xei | роҳ / раҳ
61 | ﺑﺎﺯﺍﺭ | 市 | market | 把咱兒 | puɑ.•.• | pa.ʦa.əɹ | бозор
62 | ﭼﺎه | 井 | well | 叉黒 | tʂʰɑ.xɛ | tʂʰa.xei | чоҳ / чаҳ
63 | ﺗﻮﺍﺭه | 籬 | hedge | 忒洼勒 | tʰɛ.•.lɛ | tʰɤ.wa.lɤ | тавора / тувара
64 | ﺩﯾﻮﺍﺭ | 墻 | wall; fence | 迭洼兒 | •.•.• | tje.wa.əɹ | девор
65 | ﻋﻘﺒﻪ | 嶺 | ridge; mountain | 阿革百 | ɔ/ɑ.kɛ.puɛ | a.kɤ.pai | ақаба
66 | ﻏﺎﺭ | 洞 | cave | 阿兒 | ɔ/ɑ.• | a.əɹ | ғор
67 | ﺧﻮﺽ | 潭 | deep water/pool | 蒿子 | xɒʊ.ʦɿ | xɑu.ʦɹ̩ | ҳавз

24 Modern Tajik саёрот ‘planets’ appears in Odilov (1974: 82).
25 Modern Tajik инҷило ‘light/bright/manifestation’ does not specifically refer to fourth contact.
26 Modern Tajik ҷӯй means ‘brook; stream’ rather than ‘river’.
27 Modern Tajik рӯд ‘canal; river’ is not used in reference to large rivers.
28 Modern Tajik дарё ‘river’.
29 ﺧﺎﻧﺒﺎﻟﻎ (modern Beijing) is variously spelt Khan-baliq (Franke 1966: 57), Khānbaliḳ̊ (Barthold 1987: 898), Khanbaligh (Atwood 2004: 123), etc., in English.
30 Modern Tajik биёбон ‘desert’.
31 Modern Tajik зироат ‘agriculture’.


No. | Timurid Persian | Chinese | English | Transcription | Ming-period Beijing | Standard Chinese | Tajik
68 | ﭼﻮﻝ | 川 | river; plain | 搠勒 | •.lɛ | ʂwo.lɤ | чӯл³²
69 | ﺟﻮﯾﭽﻪ | 溝 | ditch | 卓衣徹 | tʂuɔ.i.• | tʂwo.ji.tʂʰɤ | ҷӯйча
70 | ﮐﺪﺭﮐﺎه | 渡 | cross | 古得兒噶黒 | ku.tɛ.•.•.xɛ | ku.tɤ.əɹ.ka.xei | гузаргоҳ
71 | ﻟﺐ ﺟﻮﯼ | 岸 | bank; shore | 勒比卓衣 | lɛ.pi.tʂuɔ.i | lɤ.pi.tʂwo.ji | лаби ҷӯй
72 | ﺩﻭﺗﺎه | 徑 | path | 堵他黒 | •.tʰɑ.xɛ | tu.tʰa.xei | дутоҳ³³
73 | ﻣﺰﺍﺭ | 墳 | grave | 黙咱兒 | •.•.• | mwo.ʦa.əɹ | мазор
74 | ﻣﻮﺝ | 潮 | tide | 卯知 | muɒʊ.• | mɑu.tʂɹ̩ | мавҷ³⁴
75 | ﺑﻴﺨﺎﻥ | 庄 | manor, village | 擺哈恩 | puai.xɑ.ən | pai.xa.ən | ×
76 | ﺟﻬﺎﻥ | 世 | world | 者哈恩 | tʂɛ.xɑ.ən | tʂɤ.xa.ən | ҷаҳон
77 | ﺟﻨﮑﻞ | 林 | forest | 展革勒 | tʂan.kɛ.lɛ | tʂan.kɤ.lɤ | ҷангал
78 | ﻣﻌﺪﻥ | 鑛 | ore; mineral deposit | 母阿定 | mu.ɔ/ɑ.tiŋ | mu.a.tjəŋ | маъдан / маъдин
79 | ﺣﻀﻴﺮ | 街 | street | 黒雖兒 | xɛ.suei.• | xei.swei.əɹ | ×
80 | ﮐﻞ | 泥 | mud | 吉勒 | •.lɛ | ʨi.lɤ | гил
81 | ﺗﺮ | 濕 | wet | 忒兒 | tʰɛ.• | tʰɤ.əɹ | тар
82 | ﺧﺸﮏ | 乾 | dry | 戸石克 | xu.•.• | xu.ʂɹ̩.kʰɤ | хушк
83 | ﻣﻐﺎﮎ | 深 | deep; depth | 母阿克 | mu.ɔ/ɑ.• | mu.a.kʰɤ | мағок
84 | ﭘﺎﯾﺎﺏ | 淺 | shallow | 呀卜 | pʰuɑ.•.• | pʰa.ja.pu | поёб
85 | ﺧﻨﺪﻕ | 城壕 | city moat | 罕得革 | xan.tɛ.kɛ | xan.tɤ.kɤ | хандақ
86 | ﻣﯿﺪﺍﻥ | 教塲 | square for training/reviewing troops | 買搭恩 | muai.tɑ.ən | mai.ta.ən | майдон³⁵
87 | ﺩﺭﻭﺍﺯه | 關廂 | area outside of a city gate | 得兒洼則 | tɛ.•.•.ʦɛ | tɤ.əɹ.wa.ʦɤ | дарвоза
88 | ﺩﻫﺎﻧﻪ | 關口 | pass; juncture | 得哈納 | tɛ.xɑ.nɑ | tɤ.xa.na | даҳона
89 | ﻣﺴﻠﻤﺎﻥ | 回回 | Islam(ic) | 母蘇里媽恩 | mu.su.•.muɑ.ən | mu.su.li.ma.ən | мусулмон
90 | ﺗﺮﮐﯽ | 髙昌 | Gaochang/Qara-hoja | 土兒期 | tʰu.•.• | tʰu.əɹ.ʨʰi | туркӣ³⁶
91 | ﻣﻐﻮﻝ³⁷ | 韃靼 | post-imperial Mongol(ian) | 卯斡勒 | muɒʊ.•.lɛ | mɑu.wo.lɤ | муғул
92 | ﺟﻮﺭﺟﯽ | 女直 | Jurchen | 卓兒知 | tʂuɔ.•.• | tʂwo.əɹ.tʂɹ̩ | ×
93 | ﺗﺒﺖ | 西畨 | (Kham) Tibetans | 土百忒 | tʰu.puɛ.tʰɛ | tʰu.pai.tʰɤ | Тибет³⁸
94 | ﻗﺮﻳﺎﻧﯽ | 雲南 | Yunnan | 古兒呀你 | ku.•.•.ni | ku.əɹ.ja.ni | ×
95 | ﮐﻨﺠﺎﻧﻔﻮ³⁹ | 陝西 | Shaanxi | 欽張夫 | kʰin.•.piu | ʨʰin.tʂɑŋ.fu | ×
96 | ﺗﻨﻐﻮﺕ⁴⁰ | 河西 | west of the Yellow River | 湯屋忒 | tʰɑŋ.•.tʰɛ | tʰɑŋ.wu.tʰɤ | ×

32 Modern Tajik чӯл ‘steppe’, which is a Mongolian loan; does not mean ‘river’.
33 Modern Tajik дутоҳ ‘bent; curved’.
34 Modern Tajik мавҷ ‘wave’.
35 Modern Tajik майдон ‘public square’.
36 Modern Tajik туркӣ ‘Turkic’ does not refer specifically to ‘Gaochang/Qara-hoja’.
37 韃靼 Dádá, which is currently used in reference to ‘Tatar’, was used in the Ming to refer to (post-imperial Mongolian) Northern Yuan, hence the entry word and Tajik муғул ‘Mongol’.
38 Present-day Tajik Тибет ‘Tibet’ is probably a loanword from Russian.
39 ﮐﻨﺠﺎﻧﻔﻮ has been identified in the literature as reflecting the pronunciation of 京兆府 Jīngzhàofǔ or 咸陽府 Xiányángfŭ (see Haw 2014: 7).
40 Haw (2014: 22) identifies “the Hexi 河西 region” as “[t]he Tangut region, that is, the former Xi Xia 西夏 state”.


The “season/time” section

時令門

97 98 99 100 101 102 103 104 105 106

‫ﺳﺎﻝ‬ ‫ﻣﺎه‬ ‫ﺭﻭﺯ‬ ‫ﺳﺎﻋﺖ‬ ‫ﺑﻬﺎﺭ‬ ‫ﺗﺎﺑﺴﺘﺎﻥ‬ ‫ﺗﯿﺮﻣﺎه‬ ‫ﺯﻣﺴﺘﺎﻥ‬ ‫ﺑﺎﻣﺪﺍﺩ‬ ‫ﺷﺒﺎﻧﮑﺎه‬

年

107

‫ﺍﺟﺘﻤﺎﻉ‬

朔

108

‫ﺍﺳﺘﻘﺒﺎﻝ‬

望

109 110 111 112

‫ﴎﻣﺎ‬ ‫ﮐﺮﻣﺎ‬ ‫ﺗﯿﺮه‬ ‫ﺻﺎﻑ‬

寒

113 114 115

‫ﻧ ﯿﻢ ﺷﺐ‬ ‫ﺳﺤﺮ‬ ‫ﭘﮑﺎه‬

子

116 117 118 119 120 121 122

‫ﺿﺨﻮه‬43 ‫ﺍﮊﺩﺭ‬ ‫ﭼﺎﺷﺘﮑﺎه‬ ‫ﺍﺳﺘﻮﺍ‬ ‫ﺑ ﯿﺸ ﯿ ﻦ‬ ‫ﺩﯾﮑﺮ‬ ‫ﺁﻓﺘﺎﺏ ﻓﺮﻭﺭﻓﺘﻦ‬

卯

123 124 125 126 127

‫ﺷﺎﻡ‬ ‫ﺧﻔﺘ ﻦ‬ ‫ﻣﻌﺘﺪﻝ‬ ‫ﻓﴪﺩﻥ‬ ‫ﺳﺎﻝ ﻋﴩﺕ‬

戌

128 129

‫ﺳﺎﻝ ﻗﺤﻄﯽ‬ ‫ﺁﻓﺖ ﺳﻤﺎﻭﯼ‬

歉年

130 131

‫ﺁﻓﺖ ﺧﺸﮏ‬ ‫ﺩﯾﻨﻪ‬

旱災

41 42 43 44

月日時春夏秋冬早晩

熱陰晴

丑寅

辰巳午未申酉

舍榜噶黒

sɑ.lɛ muɑ.xɛ luɔ.ʦɿ sɑ.•.tʰɛ •.xɑ.• tʰɑ.pi.sɿ.tʰɑ.ən tʰi.•.muɑ.xɛ •.mi.sɿ.tʰɑ.ən puɑŋ.tɑ.tɛ •.puɑŋ.•.xɛ

sa.lɤ ma.xei lwo.ʦɹ̩ sa.ɤ.tʰɤ pu.xa.əɹ tʰa.pi.sɹ̩.tʰa.ən tʰi.əɹ.ma.xei ʨi.mi.sɹ̩.tʰa.ən pɑŋ.ta.tɤ ʂɤ.pɑŋ.ka.xei

first day of a lunar month fifteenth day of a lunar month cold hot cloudy clear, fine (weather) 11 pm–1 am 1 am–3 am 3 am–5 am

以知體媽額

•.•.tʰi.muɑ.•

ji.tʂɹ̩.tʰi.ma.ɤ

сол моҳ / маҳ рӯз соат баҳор тобистон тирамоҳ зимистон бомдод шабонгоҳ / шабонгаҳ иҷтимоъ41

以思體革巴勒

•.sɿ.tʰi.kɛ.puɑ.lɛ

ji.sɹ̩.tʰi.kɤ.pa.lɤ

истиқбол42

塞兒媽

sɛ.•.muɑ kɛ.•.muɑ tʰi.lɛ sɑ.piu

sɤ.əɹ.ma kɤ.əɹ.ma tʰi.lɤ sa.fu

сармо гармо тира соф

nin.•.• sɛ.xɛ.• •.•.xɛ

nin.ʂɤ.pu sɤ.xei.əɹ pʰu.ka.xei

5 am–7 am 7 am–9 am 9 am–11 am 11 am–1 pm 1 pm–3 pm 3 pm–5 pm 5 pm–7 pm

祖黒斡

•.xɛ.• ɔ/ɑ.ʐʅ.tɛ.• tʂʰɑ.•.tʰɛ.•.xɛ •.sɿ.tʰi.• pʰiɛ.•.• ti.kɛ.• ɔ/ɑ.piu.tʰɑ.•.piu. luɔ.lɛ.piu.tʰan ʂɑ.ən xu.piu.tʰan mu.ɔ/ɑ.tʰɛ.ti.lɛ •.sɿ.•.tan sɑ.lɛ.•.•.lɛ.tʰɛ

ʦu.xei.wo a.ɹɹ̩.tɤ.əɹ tʂʰa.ʂɹ̩.tʰɤ.ka.xei ji.sɹ̩.tʰi.wa pʰje.ʂɹ̩.jin ti.kɤ.əɹ a.fu.tʰa.pu.fu. lwo.lɤ.fu.tʰan ʂa.ən xu.fu.tʰan mu.a.tʰɤ.ti.lɤ fei.sɹ̩.əɹ.tan sa.lɤ.ɤ.ʂɹ̩.lɤ.tʰɤ

нимшаб саҳар пагоҳ / пагаҳ × аждар чоштгоҳ истиво пешин дигар офтоб фурӯ рафтан шом хуфтан мӯътадил фусурдан соли ишрат

sɑ.lɛ.kɛ.xɛ.• ɔ/ɑ.puɑ.tʰi.sɛ.muɑ.•

sa.lɤ.kɤ.xei.tʰwei a.fa.tʰi.sɤ.ma.ɥy

ɔ/ɑ.puɑ.tʰi.xu.•.• ti.nɑ

a.fa.tʰi.xu.ʂɹ̩.kʰɤ ti.na

year month day hour spring summer autumn winter morning evening

撒勒媽黒羅子撒額忒卜哈兒他比思他恩體兒媽黒即米思他恩榜搭得

革兒媽體勒撒夫恁舍卜塞黒兒僕噶黒

阿日得兒叉石忒噶黒以思體洼撇石尹底革兒阿夫他卜府羅勒夫貪

温凍稔年

水災

昨日

7 pm–9 pm 9 pm–11 pm warm freezing year of bumper harvest lean year flood disaster

沙恩

drought yesterday

阿法梯戸石克

虎夫貪母阿忒的勒非思兒丹撒勒額石勒忒撒勒革黒推阿法梯塞媽迂

底納

41 Modern Tajik иҷтимоъ ‘society; gathering’.

42 Modern Tajik истиқбол ‘festive welcome; future; prospects’.

43 According to Steingass (2012: 800), ‫ ﺿﺤﻮﺓ‬means “[f]orenoon; luncheon-time”.

44 Modern Tajik офати самовӣ lit. ‘skyey/celestial disaster’.

соли қаҳтӣ офати самовӣ44 офати хушк дина

Huihuiguan zazi: A New Persian glossary compiled in Ming China

132 133

‫ﻓﺮﺩﺍ‬ ‫ﻫﺮﺭﻭﺯ‬

明日逐日

134 135 136 137

‫ﺟﻬﺎﺭ ﻓﺼﻞ‬ ‫ﺑﻨﺞ ﻋﻨﺎﴏ‬ ‫ﺳﺎﻝ ﺭﻭﻧﺪه‬ ‫ﺳﺎﻝ ﺁﯾﻨﺪه‬

四季五行45 去年来年

tomorrow day by day; everyday four seasons five elements last year next year


法兒搭哈兒羅子

puɑ.•.tɑ xɑ.•.luɔ.ʦɿ

fa.əɹ.ta xa.əɹ.lwo.ʦɹ̩

фардо ҳар рӯз

叉哈兒法思勒

tʂʰɑ.xɑ.•.puɑ.sɿ.lɛ pʰuan.•.•.nɑ.•.• sɑ.lɛ.lɛ.•.tɛ sɑ.lɛ.ɔ/ɑ.iɛn.tɛ

tʂʰa.xa.əɹ.fa.sɹ̩.lɤ pʰan.tʂɹ̩.ɤ.na.su.əɹ sa.lɤ.lɤ.wan.tɤ sa.lɤ.a.jɛn.tɤ

ч(аҳ)ор фасл панҷ аносир соли раванда соли оянда

pʰuɑ.tɛ.ʂɑ.xɛ ɔ.•.• •.ɑŋ.puɛ.•

pʰa.tɤ.ʂa.xei wo.ʨi.əɹ pʰwo.ɑŋ.pai.əɹ

xɛ.•.• ɔ/ɑ.mi.• ni.san.tɛ •.•.• lɛ.•.•.• •.tɛ

xei.ʨʰi.jin a.mi.əɹ ni.san.tɤ ji.li.tʂʰɹ̩ lɤ.ʂɹ̩.kʰɤ.əɹ tʂɤ.tɤ

подшоҳ вазир пайғамбор / пайғомбор ҳаким амир нависанда элчӣ46 лашкар ҷадд

•.tɛ.• muɑ.tɛ.• • tɑ.tɛ.• pi.•.ʦɛ.• puɑ.•.•.tɛ tuɔ.xɛ.tʰɛ.• xuɛ.• •.• xɑ.•.tɛ mi.xɛ.muɑ.ən •.sɿ.tʰɑ.tɛ ʂɑ.•.•.tɛ pʰi.• tʂʉ.•.ən tʰuɔ.in •.• kʰi.ʂɑ.•.•.ʦɿ

pʰwo.tɤ.əɹ ma.tɤ.əɹ ʦan ta.tɤ.əɹ pi.la.ʦɤ.əɹ fa.əɹ.ʦan.tɤ two.xei.tʰɤ.əɹ xwo.ʂɹ̩ ja.əɹ xa.wan.tɤ mi.xei.ma.ən wu.sɹ̩.tʰa.tɤ ʂa.ʨi.əɹ.tɤ pʰi.əɹ tʂu.wa.ən tʰwo.jin mu.ɤ ʨʰi.ʂa.wo.əɹ.ʦɹ̩

падар модар зан додар бародар фарзанд духтар хеш ёр хованд меҳмон устод шогирд пир ҷавон × муғ48 кашоварз

•.tɑ.kɛ.• tʰɛ.pi.tʰɛ.

sɑu.ta.kɤ.əɹ tʰɤ.pi.tʰɤ

савдогар табиб

潘知額納速兒撒勒勒灣得撒勒阿言得

The “person” section

人物門

138 139 140

‫ﺑﺎﺩﺷﺎه‬ ‫ﻭﺯﻳﺮ‬ ‫ﺑﯿﻐﺎﻣﱪ‬

君臣

141 142 143 144 145 146

‫ﺣﮑ ﻴ ﻢ‬ ‫ﺍﻣﯿﺮ‬ ‫ﻧﻮﯾﺴﻨﺪه‬ ‫ﺍﻳﻠﺠﯽ‬ ‫ﻟ ﺸﮑ ﺮ‬ ‫ﺟﺪ‬

賢

147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164

‫ﭘﺪ ﺭ‬ ‫ﻣﺎﺩﺭ‬ ‫ﺯﻥ‬ ‫ﺩﺍﺩﺭ‬ ‫ﺑﺮﺍﺫﺭ‬ ‫ﻓﺮﺯﻧﺪ‬ ‫ﺩﺧﱰ‬ ‫ﺧﻮﯾﺶ‬ ‫ﯾﺎ ﺭ‬ ‫ﺧﺎﻭﻧﺪ‬ ‫ﻣﻬﻤﺎﻥ‬ ‫ﺍﺳﺘ ﺎ ﺩ‬ ‫ﺷ ﺎﮐﺮ ﺩ‬ ‫ﺑﯿ ﺮ‬ ‫ﺟﻮﺍﻥ‬ ‫ﺗﻮﻳﻦ‬47 ‫ﻣﻎ‬ ‫ﮐﺸﺎﻭﺭﺯ‬

父母

165 166

‫ﺳﻮﺩﺍﮐﺮ‬ ‫ﻃﺒﻴ ﺐ‬

商

聖

官吏使軍祖

妻兄弟子女親朋主客師徒老少僧道農

醫

monarch liege; vassal saint; sage; master

得沙黒我即兒

wise; able person bureaucrat; official minor official envoy army ancestor; grandfather father mother wife elder brother younger brother child; son woman; daughter parent; relative companion; friend host; owner guest teacher; master apprentice; pupil old young; few monk Taoist agriculture; peasant trade; tradesman doctor

黒期尹

迫昻百兒

阿米兒你傘得以里赤勒石克兒折得迫得兒媽得兒簮打得兒比剌則兒法兒簮得朶黒忒兒或石呀兒哈灣得米黒媽恩五思他得沙吉兒得批兒主洼恩脱因木額起沙斡兒子嫂搭革兒忒比忒49

45 五行 wǔxíng refers to the five elements in an ancient Chinese doctrine. The elements, namely wood, fire, earth, metal, and water, are “allotted” respectively to spring, summer, midsummer, autumn, and winter.

46 Clauson’s (1972: 539) dictionary of pre-thirteenth-century Turkic has élçi: ‘ambassador’ as one of its entries.

47 Liu (2008: 93) considers ‫ ﺗﻮﻳﻦ‬to ultimately originate from Chinese 道人 dàoren ‘Taoist priest’.

48 Modern Tajik муғ ‘magus’.

49 This is probably a misspelt 忒比卜 (see Honda 1963: 9; Beijing tushuguan guji chuban bianji zu 1987–[1994]: 476, 528), whose reconstructed Ming-period Beijing Chinese pronunciation and Standard Chinese pronunciation would be /tʰɛ.pi.•/ and /tʰɤ.pi.pu/, respectively.


167 168 169 170

‫ﻓﺎﻝ ﮐﻮﯼ‬ ‫ﭘﺮ ﯼ‬ ‫ﻧﯿ ﮏ ﻣﺮ ﺩ‬ ‫ﺑﺖ‬

卜神

171 172 173 174

‫ﺩﯾﻮ‬ ‫ﺧﺮﻓﻪﻭﺭ‬ ‫ﺷﻮﯼ‬ ‫ﺁﺩﻣﯽ‬

鬼

175 176

‫ﺭ ﻋ ﻴﺖ‬ ‫ﻋﻤ ﮏ‬

民

177 178 179 180 181 182 183 184 185 186 187 188

‫ﺩﺍﺩﺭﺯﺍﺩه‬ ‫ﻋﻤ ﻪ‬ ‫ﻳ ﻨﮑ ﻪ‬ ‫ﮐﻨﯿﺰ ﮎ‬ ‫ﺟﺎﺭﻳﻪ‬ ‫ﻫﻤﺴﺎﯾﻪ‬ ‫ﻧﺒﯿﺮه‬ ‫ﺗﻮ‬ ‫ﻣﻦ‬ ‫ﻭﯼ‬ ‫ﻏﻼ ﻡ‬ ‫ﺩ ﺑﯿﺮ‬

姪

189 190 191 192 193 194 195 196 197 198 199 200 201 202

‫ﻣﺒﺎﺭﺯ‬ ‫ﺯ ﻧ ﻨﺪ ه‬ ‫ﻣﺎﻫﯽ ﮐﯿﺮ‬ ‫ﻃﺒﺎﺥ‬ ‫ﺻﻴﺎﺩ‬ ‫ﻧﻘﺎﺵ‬ ‫ﻣﻄﺮﺏ‬ ‫ﻓﺮ ﺍﺳﺖ‬ ‫ﺧﺘﺎﻳﯽ‬ ‫ﺷﺒﺎﻥ‬ ‫ﮐﻠﻪﺑﺎﻥ‬ ‫ﮐﺎﻭﺑﺎﻥ‬ ‫ﻓﯿﻠﺒﺎﻥ‬ ‫ﺩﺯﺩ‬

将軍樵人

仙佛

工夫人

叔

姑嫂婢妾隣孫你我他僕秀士

漁人厨役獵人画士樂人相士漢人53 牧羊牧馬牧牛牧象盗賊

fortune-telling deity; spirit hermit; wizard Buddha; statue of Buddha ghost; devil worker; craft(sman) husband person; human being people; subjects third among brothers; uncle nephew; niece mother-in-law; aunt elder brother’s wife maidservant concubine neighbour grandchild you I she/he; other; that servant man of knowledge and virtue general woodcutter fisherman cook hunter painter musician physiognomist Han shepherd sheep herd horses graze cattle tend elephants robber

fa.lɤ.kwo.ji pʰwo.li mje.kʰɤ.mwo.əɹ.tɤ pu.tʰɤ

фолгӯй пари некмард пут

• xɛ.•.puɑ.•.• ʂuɔ.i ɔ/ɑ.tɛ.mi

tjɑu xei.əɹ.fa.wo.əɹ ʂwo.ji a.tɤ.mi

дев ҳирфавор шӯй / шӯ одамӣ

lɛ.lɛ.iɛ.tʰɛ ɔ/ɑ.•.•

lɤ.lɤ.je.tʰɤ a.mwo.kʰɤ

раият амак

tɑ.tɛ.•.•.tɛ an.• •.kɛ •.ni.ʦɛ.• tʂɑ.•.iɛ xan.sɑ.iɛ nɑ.pi.lɛ tʰu muan uai •.•.ən tɛ.pi.•

ta.tɤ.əɹ.ʦa.tɤ an.mwo jɛn.kɤ kʰɤ.ni.ʦɤ.kʰɤ tʂa.əɹ.je xan.sa.je na.pi.lɤ tʰu man wai wu.la.ən tɤ.pi.əɹ

додарзода амма янга канизак ҷория ҳамсоя набера ту ман вай ғулом дабир

mu.puɑ.•.ʦɿ xiuɛ.ʦin.ʦɛ.nan.tɛ muɑ.xi.•.• tʰɛ.puɑ.xɛ sɛ.•.tɛ nɑ.•.• mu.tʰɛ.•.• puɑ.•.sɛ.tʰɛ xɛ.tʰɑ.i •.puɑ.ən •.lɛ.puɑ.ən kɒʊ.puɑ.ən •.lɛ.puɑ.ən tu.ʦɿ.tɛ

mu.pa.li.ʦɹ̩ ɕje.ʨin.ʦɤ.nan.tɤ ma.ɕi.ʨi.əɹ tʰɤ.pa.xei sɤ.ja.tɤ na.ka.ʂɹ̩ mu.tʰɤ.li.pu fa.la.sɤ.tʰɤ xei.tʰa.ji ʂu.pa.ən kʰɤ.lɤ.pa.ən kɑu.pa.ən fei.lɤ.pa.ən tu.ʦɹ̩.tɤ

мубориз51 ҳезум-зананда моҳигир таббох сайёд наққош мутриб фиросат52 хитоӣ чӯпон га(л)лабон говбон филбон дузд

法勒鍋衣迫里

puɑ.lɛ.kuɔ.i •.• 乜克黙兒得 miɛ.•.•.•.tɛ 卜忒 •.tʰɛ 刁黒兒法斡兒朔衣阿得密勒勒夜忒50 阿黙克打得兒咱得俺黙眼革克你則克鮓兒夜罕撒夜納必勒禿蠻歪五剌恩得必兒母巴力子血津則南得馬希几兒忒巴黒塞呀得納噶石母忒力卜法剌塞忒黒他衣鼠巴恩克勒巴恩稿巴恩非勒巴恩杜子得

50 This is probably a misspelt 勒額夜忒 (see Honda 1963: 9; Beijing tushuguan guji chuban bianji zu 1987–[1994]: 476, 529), whose reconstructed Ming-period Beijing Chinese pronunciation and Standard Chinese pronunciation would be /lɛ.•.iɛ.tʰɛ/ and /lɤ.ɤ.je.tʰɤ/, respectively.

51 Modern Tajik мубориз ‘fighter’.

52 Modern Tajik фиросат ‘comprehension; cleverness’.

53 Haw (2014: 22) writes that 漢人 Hànrén “included not just Chinese, but all the peoples who had been subjects of the Jin empire, including Jurchens and Khitans, among others” during the Yuan period. How or whether this is related to the placement of the entry for 漢人 in the “person” section, which consists mostly of occupation names and kinship terms, is unclear. Incidentally, 女直 Nǚzhēn ‘Jurchen’ is an entry (number 92) in the “geography” section.


The “human” section

人事門

203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223

‫ﺩﻭﻟﺖ‬ ‫ﻋﻤ ﺮ‬ ‫ﺷﺎﺩ‬ ‫ﻧﺸﺎﻁ‬ ‫ﺟﺪ‬ ‫ﮐﺎﻫﻠﯽ‬ ‫ﺩﻭﺳﺖ‬ ‫ﺭﺣﻢ‬ ‫ﺑﺮﺁﻣﺪﻥ‬ ‫ﺩﺭﺁﻣﺪﻥ‬ ‫ﺩﯾﺪﺏ‬ ‫ﺩﺍﻧﺴﺘﻦ‬ ‫ﺟﺴ ﺘ ﻦ‬ ‫ﺍﻧﺪﯾﺸﻪ‬ ‫ﮐﺎﺭ‬ ‫ﺁﻣﻮﺧﺘﻦ‬ ‫ﺧﺎﺹ‬ ‫ﺻﺪﻕ‬ ‫ﺍﻧﻌﺎﻡ‬ ‫ﺗﴩﻳﻒ‬ ‫ﻋﺮﺿﻪ‬

福

224

‫ﺧﻮﺍﺳﺘﻦ‬

討

225 226 227 228 229 230 231

‫ﻧﯿﺎﺯ‬ ‫ﻋﻨﺎﻳﺖ‬ ‫ﻓﺮﺳﺘﺎﺩﻥ‬ ‫ﺩﺍﺩﻥ‬ ‫ﺗﻮﺍﻧﮑﺮ‬ ‫ﻣﻬﱰ‬ ‫ﻓﻘﻴﺮ‬

拜望

232 233 234 235 236 237 238 239 240

‫ﮐﻬﱰ‬ ‫ﺍﻧﺴﺎﻧﻴﺖ‬ ‫ﻣﺮﻭﺕ‬ ‫ﺁﺩﺏ‬ ‫ﺧﺮﺩ‬ ‫ﻭﻓﺎ‬ ‫ﻇﺮﻳﻒ‬ ‫ﻋﺰﻳﺰﯼ‬ ‫ﺣﺮﮐﺎﺕ‬

賤仁

壽喜樂勤懶愛憐出入見知想事學專誠恩賞奏

差與富貴貪

義禮智信清濁動

good fortune longevity delight; happy joyous diligent lazy be fond of; cherish pity; sympathize go/come out enter see know seek think; miss matter learn; knowledge specific sincere favour reward present a memorial to an emperor; play (music) ask for; send punitive expedition make obeisance gaze; hope for send; differ give abundant noble poor; poverty; avaricious humble; cheap humanity; human justice; righteous propriety wisdom trust clear; clean turbid move

阿兒則

•.lɛ.tʰɛ •.•.• ʂɑ.tɛ ni.ʂɑ.tʰɛ •.tɛ •.xɛ.• tuɔ.sɿ.tʰɛ lɛ.xan puɛ.•.•.tan tɛ.•.•.tan ti.tan tɑ.ni.sɿ.tʰan tʂʉ.sɿ.tʰan an.•.• •.• ɔ/ɑ.mu.xɛ.tʰan xɑ.sɿ suei.tɛ.kɛ •.ɔ/ɑ.ən tʰɛ.•.•.piu ɔ/ɑ.•.ʦɛ

tɑu.lɤ.tʰɤ wu.mu.əɹ ʂa.tɤ ni.ʂa.tʰɤ tʂɹ̩.tɤ ka.xei.li two.sɹ̩.tʰɤ lɤ.xan pai.la.mwo.tan tɤ.la.mwo.tan ti.tan ta.ni.sɹ̩.tʰan tʂu.sɹ̩.tʰan an.tje.ʂɤ ka.əɹ a.mu.xei.tʰan xa.sɹ̩ swei.tɤ.kɤ jin.a.ən tʰɤ.ʂɹ̩.li.fu a.əɹ.ʦɤ

давлат умр шод нишот / нашот ҷидд коҳилӣ дӯст раҳм баромадан даромадан дидан донистан ҷустан андеша кор омӯхтан хос сидқ инъом ташриф арза

花思貪

xuɑ.sɿ.tʰan

xwa.sɹ̩.tʰan

хостан

你呀子額納夜忒

ni.•.ʦɿ •.nɑ.iɛ.tʰɛ puɑ.•.sɿ.ni.tan tɑ.tan tʰu.uɑŋ.kɛ.• mi.xɛ.tʰɛ.• puɑ.kɛ.•

ni.ja.ʦɹ̩ ɤ.na.je.tʰɤ fa.əɹ.sɹ̩.ni.tan ta.tan tʰu.wɑŋ.kɤ.əɹ mi.xei.tʰɤ.əɹ fa.kɤ.əɹ

ниёз иноят54 фиристодан додан тавонгар меҳтар фақир

kʰi.xɛ.tʰɛ.• in.sɑ.ni.iɛ.tʰɛ mu.lu.•.tʰɛ ɔ/ɑ.tɛ.• xɛ.lɛ.tɛ ɔ.puɑ ʦɛ.•.piu •.•.• xɛ.•.•.tʰɛ

ʨʰi.xei.tʰɤ.əɹ jin.sa.ni.je.tʰɤ mu.lu.wo.tʰɤ a.tɤ.pu xei.lɤ.tɤ wo.fa ʦɤ.li.fu ɤ.ʨi.ʨi xei.əɹ.ka.tʰɤ

кеҳтар инсоният мурувват одоб хирад вафо зариф азизӣ56 ҳаракат

倒勒忒兀木兒沙得你沙忒只得噶黒里多思忒勒罕百剌黙丹得剌黙丹底丹打你思貪主思貪俺迭舍噶兒阿母黒貪哈思遂得革尹阿恩忒石里夫

法兒思你丹55 打丹土往革兒米黒忒兒法革兒起黒忒兒因撒你夜忒母魯斡忒阿得卜黒勒得我法則里夫額即即黒兒噶忒

54 Modern Tajik иноят ‘thoughtfulness; assistance; favour’.

55 This is probably a misspelt 法兒思他丹 (see Honda 1963: 11; Beijing tushuguan guji chuban bianji zu 1987–[1994]: 480, 533), whose reconstructed Ming-period Beijing Chinese pronunciation and Standard Chinese pronunciation would be /puɑ.•.sɿ.tʰɑ.tan/ and /fa.əɹ.sɹ̩.tʰa.tan/, respectively.

56 Modern Tajik азизӣ ‘valuableness’.


241

‫ﺳﮑﻨﺎﺕ‬

静

242 243 244 245 246 247 248

‫ﮐﺮﻳﺴﺘﻦ‬ ‫ﺧﻨﺪﻳﺪﻥ‬ ‫ﻣﮑ ﺮ‬ ‫ﻋﺎﻡ‬ ‫ﻣﺰﻳﺪﻥ‬ ‫ﮐﻢ‬ ‫ﺻﻔﺖ‬

哭

249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282

‫ﭘ ﻨﺪ‬ ‫ﻣﻄﻴﻊ‬ ‫ﻳﺎﻏﯽ‬ ‫ﺣﮑ ﻢ‬ ‫ﺧﺪ ﮎ‬ ‫ﺧﺮﻳﺪﻥ‬ ‫ﻓﺮﻭﺧﺘﻦ‬ ‫ﺑﻴﺎ‬ ‫ﺑﺮﻭ‬ ‫ﻏﻀﺐ‬ ‫ﺗﻐﺮﻳﻢ‬ ‫ﺷﮑﺎﻳﺖ‬ ‫ﺗﻘﺎﺿﺎ‬ ‫ﻣﺴ ﺖ‬ ‫ﺑﻴﺪﺍﺭ‬ ‫ﻣﺎﻧﺪهﮐﯽ‬ ‫ﻋﻔﻮ‬ ‫ﺑﺎﺯﺩﺍﺷﺘﻦ‬ ‫ﺗﻌﻠﻴﻢ‬ ‫ﺑﺮﺧﺎﺳﺘﻦ‬ ‫ﺑﺎﺵ‬ ‫ﻣﻘﺒﻮﻝ‬ ‫ﺩﺳﺘﮑﻴﺮﯼ‬ ‫ﺳﻴﺎﺳﺖ‬ ‫ﮐﺸ ﺘ ﻦ‬ ‫ﺧﻮﺍﺏ ﺩﻳﺪﻥ‬ ‫ﺧﺴﺒﻴﺪﻥ‬ ‫ﺷﻨﺎﺧﺘﻦ‬ ‫ﻭﻋﺪه‬ ‫ﻃﻠﺒﻴﺪﻥ‬ ‫ﺧﻮﺍﻧﺪﻥ‬ ‫ﻧﻈﺮ‬ ‫ﺁه‬ ‫ﺗﻮﺍﻧﺎ‬

勸

笑詐愚添减誇

順逆斷疑買賣来去怒罰怨催醉醒勞赦留訓起居受救刑殺夢睡認許請讀觀嘆能

still; calm; motionless cry laugh deceive stupid add; append reduce; subtract to be proud of; exaggerate advise smooth, in order adverse; contrary judge; cut off doubt buy sell come go anger; get angry punish resent urge; hasten drunk awake; sober toil; fatigue amnesty remain; retain teach; train rise; raise reside; be; occupy accept; receive rescue punishment kill dream sleep recognize allow ask for read watch sigh; lament capable

塞克納忒

sɛ.•.nɑ.tʰɛ

sɤ.kʰɤ.na.tʰɤ

сукунат

己里思貪

•.•.sɿ.tʰan xan.ti.tan •.•.• ɔ/ɑ.ən •.•.tan kʰan suei.puɑ.tʰɛ

ʨi.li.sɹ̩.tʰan xan.ti.tan mwo.kʰɤ.əɹ a.ən mwo.ʨi.tan kʰan swei.fa.tʰɤ

гиристан хандидан макр ом мазидан кам сифат57

pʰuan.tɛ mu.•.• •.• xu.kʰun xɛ.tu.• xɛ.•.tan piu.luɔ.xɛ.tʰan pi.• •.lɒʊ •.ʦɛ.• tʰɛ.•.•.• •.•.iɛ.tʰɛ tʰɛ.•.• •.sɿ.tʰɛ piɛ.tɑ.• muɑ.ən.tɛ.• ɔ/ɑ.piu puɑ.ʦɿ.tɑ.•.tʰan tʰɛ.ɔ/ɑ.•.• puɛ.•.xɑ.sɿ.tʰan puɑ.• •.kɛ.□.lɛ tɛ.sɿ.tʰɛ.•.• si.•.sɛ.tʰɛ kʰu.•.tʰan xuɑ.•.ti.tan xu.sɿ.pi.tan •.nɑ.xɛ.tʰan ɔ.ɔ/ɑ.tɛ tʰɛ.lɛ.pi.tan xuɑ.ən.tan nɑ.ʦɛ.• ɔ/ɑ.xɛ tʰu.•.nɑ

pʰan.tɤ mu.tʰwei.ɤ ja.ɤ xu.kʰwən xei.tu.kʰɤ xei.li.tan fu.lwo.xei.tʰan pi.ja pu.lɑu ɤ.ʦɤ.pu tʰɤ.ɤ.li.jin ʂɹ̩.ka.je.tʰɤ tʰɤ.ka.ʦa mwo.sɹ̩.tʰɤ pje.ta.əɹ ma.ən.tɤ.ʨi a.fu pa.ʦɹ̩.ta.ʂɹ̩.tʰan tʰɤ.a.li.jin pai.əɹ.xa.sɹ̩.tʰan pa.ʂɹ̩ mwo.kɤ.□.lɤ tɤ.sɹ̩.tʰɤ.ʨi.li ɕi.ja.sɤ.tʰɤ kʰu.ʂɹ̩.tʰan xwa.pu.ti.tan xu.sɹ̩.pi.tan ʂɹ̩.na.xei.tʰan wo.a.tɤ tʰɤ.lɤ.pi.tan xwa.ən.tan na.ʦɤ.əɹ a.xei tʰu.wa.na

панд мутеъ ёғӣ ҳукм хазук / хадук /худук58 харидан фурӯхтан биё бирав / бурав ғазаб × шикоят тақозо маст бедор мондагӣ афв боздоштан таълим бархостан бош мақбул дастгирӣ сиёсат куштан хоб дидан хусбидан / хуспидан шинохтан ваъда60 талбидан / талабидан хондан назар оҳ тавоно

罕底丹黙克兒阿恩黙即丹堪髄法忒潘得母推額呀額戸坤黒杜克黒里丹府羅黒貪必呀卜勞額則卜忒額里尹石噶夜忒忒噶咱黙思忒別搭兒媽恩得几阿夫巴子打石貪忒阿里尹百兒哈思貪巴石黙革□59勒得思忒己里洗呀塞忒苦石貪花卜底丹虎思比丹石納黒貪我阿得忒勒比丹花恩丹納則兒阿黒土洼納

57 Modern Tajik сифат ‘quality; attribute; adjective’.

58 Modern Tajik хазук ‘distressed’.

59 The white square □ here represents an illegible Chinese glyph. Judging from Honda (1963: 12), the glyph is probably 卜, whose Standard Chinese pronunciation is /pu/.

60 Modern Tajik ваъда ‘promise’.


283 284 285 286 287 288 289 290

‫ﺷﺎﺩﺑﺎﺵ‬ ‫ﮐﻨ ﺎه‬ ‫ﮐﺮﺩﻥ‬ ‫ﺁﻭﻳﺨﺘﻦ‬ ‫ﺗﺠﺴﺲ‬ ‫ﭘﺸﻴﻤﺎﻥ‬ ‫ﺑﺎﺯﯼ‬ ‫ﺳﺎﺯﻭﺍﺭﯼ‬

謝罪

291 292 293 294 295 296 297 298 299 300

‫ﺁﻭﺍﺯ‬ ‫ﺗﻌﻠﻖ‬ ‫ﺗﻘ ﻀﻴﺮ‬ ‫ﺩﻻﻟﺖ‬ ‫ﺯﻳﺮﮎ‬ ‫ﻣﺸﻮﺭﺕ‬ ‫ﺯﻳﻨﻬﺎﺭ‬ ‫ﺍﺧﺘﻴﺎﺭ‬ ‫ﺧﺼﻮﻣﺖ‬ ‫ﻓﺴﻮﺱ‬

聲音

為掛査悔戲和睦

管束怠慢導引聰明商議叮嚀選擇爭競欺凌

thank guilt do; carry out hang investigate regret play concord; amity; harmonious sound keep control over neglect (of duty) guide clever; bright confer; discuss exhort choose contend; compete insult

ʂɑ.tɛ.puɑ.• ku.nɑ.xɛ •.•.tan ɔ/ɑ.iuɛ.xɛ.tʰan tʰɛ.•.•.sɿ •.•.muɑ.ən puɑ.• sɑ.ʦɿ.•.•

ʂa.tɤ.pa.ʂɹ̩ ku.na.xei kʰɤ.əɹ.tan a.ɥe.xei.tʰan tʰɤ.tʂɤ.su.sɹ̩ pʰwo.ʂɤ.ma.ən pa.ʨi sa.ʦɹ̩.wa.li

шодбош гуноҳ / гунаҳ кардан овехтан таҷассус пушаймон бозӣ созворӣ / созгорӣ

ɔ/ɑ.•.ʦɿ tʰɛ.an.•.kɛ tʰɛ.kɛ.suei.• tɛ.•.lɛ.tʰɛ •.lɛ.• •.•.lɛ.tʰɛ ʦin.xɑ.• •.xɛ.tʰi.•.• xu.su.•.tʰɛ piu.su.sɿ

a.wa.ʦɹ̩ tʰɤ.an.lu.kɤ tʰɤ.kɤ.swei.əɹ tɤ.la.lɤ.tʰɤ ʨi.lɤ.kʰɤ mwo.ʂu.lɤ.tʰɤ ʨin.xa.əɹ ji.xei.tʰi.ja.əɹ xu.su.mwo.tʰɤ fu.su.sɹ̩

овоз тааллуқ тақсир далолат зирак машварат зинҳор ихтиёр хусумат фусӯс / фусус

sɛ.• luɔ.i ɔ/ɑ.•.luɔ mu.i •.• kuɔ.• pi.ni tɛ.xɑ.ən •.tɑ.ən

sɤ.əɹ lwo.ji a.pu.lwo mu.ji tʂʰɤ.ʂən kwo.ʂɹ̩ pi.ni tɤ.xa.ən tan.ta.ən

сар рӯй абрӯ / абру мӯй / муй чашм гӯш бинӣ даҳан / даҳон дандон

法兒知

ʦɛ.puɑ.ən ti.lɛ •.kʰan tɛ.sɿ.tʰɛ pʰuɑ.i puɑ.•.pi.xɛ •.•.• puɑ.•.•

ʦɤ.pa.ən ti.lɤ ʂɹ̩.kʰan tɤ.sɹ̩.tʰɤ pʰa.ji fa.əɹ.pi.xei la.ɤ.əɹ fa.əɹ.tʂɹ̩

забон дил шикам / ишкам даст пой фарбеҳ (фарбӣ) лоғар фарҷ

則克兒

ʦɛ.•.•

ʦɤ.kʰɤ.əɹ

закар

乎衣扎恩

•.i •.ən sɛ.xun ɔ/ɑ.•.lɛ ni.iɛ.tʰɛ

xu.ji tʂa.ən sɤ.xwən a.mwo.lɤ ni.je.tʰɤ

хӯ(й) ҷон сухан / сахун амал ният

沙得巴石古納黒克兒丹阿月黒貪忒折速思迫舍媽恩巴即撒子洼力阿洼子忒俺路革忒革雖兒得剌勒忒即勒克黙束勒忒儘哈兒以黒體呀兒虎蘇黙忒府蘇思

The “body” section

身體門

301 302 303 304 305 306 307 308 309

‫ﴎ‬ ‫ﺭﻭﯼ‬ ‫ﺍﺑﺮﻭ‬ ‫ﻣﻮﯼ‬ ‫ﭼﺸ ﻢ‬ ‫ﮐﻮﺵ‬ ‫ﺑﻴﻨﯽ‬ ‫ﺩﻫﺎﻥ‬ ‫ﺍﻧﺪﺍﻥ‬

頭

310 311 312 313 314 315 316 317

‫ﺯﺑﺎﻥ‬ ‫ﺩﻝ‬ ‫ﺷﮑ ﻢ‬ ‫ﺩﺳﺖ‬ ‫ﭘﺎﯼ‬ ‫ﻓﺮﺑﻪ‬ ‫ﻻ ﻏﺮ‬ ‫ﻓﺮﺝ‬

舌心

318

‫ﺫﮐﺮ‬

陽

319 320 321 322 323

‫ﺧﻮﯼ‬ ‫ﺟﺎﻥ‬ ‫ﺳﺨ ﻦ‬ ‫ﻋﻤ ﻞ‬ ‫ﻧ ﻴﺖ‬

性命

面眉髪眼耳鼻口牙

腹手足肥痩陰

言行意

head face eyebrow hair eye ear nose mouth tooth; posterior tooth tongue heart belly hand leg fat thin; skinny yin; concealed; negative; femininity (as the opposite of masculinity) yang; open; positive; masculinity (as the opposite of femininity) character; nature life word; speech conduct; behavior aspiration

塞兒羅衣阿卜羅母衣徹深鍋石比你得哈恩膽搭恩則巴恩的勒石堪得思忒衣法兒必黒剌額兒

塞昏阿黙勒你夜忒


324 325 326 327 328 329 330 331 332 333

‫ﺻﻮﺭﺕ‬ ‫ﺟﮑ ﺮ‬ ‫ﺯﻫﺮه‬ ‫ﺍﺳﺘﺨﻮﺍﻥ‬ ‫ﮐ ﻮﺷﺖ‬ ‫ﺯﻧﺦ‬ ‫ﯕﺮﺩﻥ‬ ‫ﺷﺶ‬ ‫ﺳﻴﻨ ﻪ‬ ‫ﺗﺎﺭﮐﴪ‬

形肝

334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350

‫ﻣﺸ ﺖ‬ ‫ﮐﺘ ﻒ‬ ‫ﭘﺸ ﺖ‬ ‫ﺩ ﻳﺪ ه‬ ‫ﻣ ﻌﺪ ه‬ ‫ﺭﻭﺩه‬ ‫ﭘﯽ‬ ‫ﺭﻟ ﻒ‬ ‫ﺭﻳﺶ‬ ‫ﺍ ﻧﮑ ﺸ ﺖ‬ ‫ﺑﻴﺸﺎﻧﯽ‬ ‫ﺁ ﻓﺖ‬ ‫ﺁﺏ ﺩﻳﺪه‬ ‫ﺧﻮﻥ‬ ‫ﺗﻦ‬ ‫ﺧﻠﻘﻮﻡ‬ ‫ﺑﻴﻤﺎﺭ‬

拳肩

351 352

‫ﻗﻠﻤﻪ‬ ‫ﺻﻔ ﻪ‬

樓

353

‫ﮐﻮﺷﮏ‬

殿

354 355 356 357 358 359 360

‫ﻏﺮﻓﻪ‬ ‫ﺍﻧﺒﺎﺭ‬ ‫ﺧﺰﻳﻨﻪ‬ ‫ﻣ ﺴ ﺠﺪ‬ ‫ﺧﺎﻧﻪ‬ ‫ﺩﺭ‬ ‫ﺩﺭﻳﭽﻪ‬

閣

膽骨肉頬項肺胸頂

背晴61 胃腸筋鬢鬚指額灾淚血身咽喉疾病

shape; body liver gall bladder bone flesh cheek nape of the neck lung chest top; the crown (of the head/hat) fist shoulder back clear (see note 61) stomach bowel muscle sidelock beard finger forehead disaster tear blood body throat disease; illness

su.lɛ.tʰɛ tʂʅ.kɛ.• ʦɛ.xɛ.lɛ •.sɿ.tʰu.xɛ.•.ən kuɔ.•.tʰɛ ʦɛ.nɑ.xɛ kɛ.•.tan •.• si.nɑ tʰɑ.•.kʰi.sɛ.•

su.lɤ.tʰɤ tʂʰɹ̩.kɤ.əɹ ʦɤ.xei.lɤ wu.sɹ̩.tʰu.xei.wa.ən kwo.ʂɹ̩.tʰɤ ʦɤ.na.xei kɤ.əɹ.tan ʂu.ʂɹ̩ ɕi.na tʰa.li.ʨʰi.sɤ.əɹ

сурат ҷигар заҳра устухон гӯшт занах гардан шуш сина тораки сар

•.•.tʰɛ •.•.piu •.•.tʰɛ ti.tɛ mi.ɔ/ɑ.tɛ luɔ.tɛ pʰuɛ.i ʦɛ.lɛ.piu •.• an.•.•.tʰɛ pʰiɛ.ʂɑ.ni ɔ/ɑ.puɑ.tʰɛ ɔ/ɑ.•.ti.tɛ •.ən tʰan xu.lu.•.ən piɛ.muɑ.•

mu.ʂɹ̩.tʰɤ kʰɤ.tʰi.fu pʰu.ʂɹ̩.tʰɤ ti.tɤ mi.a.tɤ lwo.tɤ pʰai.ji ʦɤ.lɤ.fu li.ʂɹ̩ an.ku.ʂɹ̩.tʰɤ pʰje.ʂa.ni a.fa.tʰɤ a.pu.ti.tɤ xu.ən tʰan xu.lu.ku.ən pje.ma.əɹ

мушт китф / кифт пушт дида меъда рӯда пай зулф риш ангушт пешонӣ офат62 оби дида хун тан ҳалқум / ҳулқум бемор

塞法

kɛ.•.• sɛ.puɑ

kɤ.li.mwo sɤ.fa

× суффа

科石克

kʰuɔ.•.•

kʰɤ.ʂɹ̩.kʰɤ

кӯшк

五兒法

•.•.puɑ an.puɑ.• xɛ.•.nɑ •.sɿ.•.tɛ xɑ.nɑ tɛ.• tɛ.•.•

wu.əɹ.fa an.pa.əɹ xei.ʨi.na mwo.sɹ̩.tʂɹ̩.tɤ xa.na tɤ.əɹ tɤ.əɹ.tʂʰɤ

ғурфа анбор хазина масҷид хона дар дарича

蘇勒忒止革兒則黒勒五思土黒洼恩鍋石忒則納黒革兒丹束石洗納他里起塞兒木石忒克替夫僕石忒底得米阿得羅得拍衣則勒夫里石俺故石忒撇沙你阿法忒阿卜底得乎恩貪虎魯故恩別媽兒

The “palace” section

宮室門臺

倉庫寺房門窓

storied building raised large building; platform large building with a substantial base; hall pavilion granary; magazine storehouse temple house; room door; gate window

革里黙

俺巴兒黒即納黙思只得哈納得兒得兒徹

61 晴 qíng ‘clear’ here is probably a misspelt 睛 jīng ‘eye’.

62 Tajik офат does mean ‘disaster’, but seems out of place in the “body” section. This entry word may hence be a misspelt ‫ﺍﻓﺖ‬, which is current in modern Tajik (as well as in modern Uzbek) in the form of афт ‘face; appearance; facial expression’. Incidentally, афт can be identified as synonymous with modern Tajik рӯй ‘face’ (see entry 302) in certain contexts (see Muhammadiev 1975: 165).


361 362 363 364 365 366 367 368 369 370 371 372 373 374

‫ﺳﻘ ﻒ‬ ‫ﺳﺘﻮﻥ‬ ‫ﭘﻐﻨﻪ‬ ‫ﻣﻨﺎﺭه‬ ‫ﺗ ﻴﻢ‬ ‫ﺑ ﺎﻡ‬ ‫ﭘﻞ‬ ‫ﻣﺠﻮﺭ‬ ‫ﺧﺸ ﺖ‬ ‫ﺳﻔ ﺎ ﻝ‬ ‫ﴎﺍﯼ‬ ‫ﻭﺳ ﻪ‬ ‫ﺩﺍﺭﺍﻓﺰﻳﻦ‬ ‫ﺑﺎﺭﮐﺎه‬

梁柱

375

‫ﯾﺎﻡ ﺧﺎﻧﻪ‬

館驛

堦塔店簷橋廊磚瓦廰椽欄杆丹墀

376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401

巴兒噶黒

sɛ.kɛ.piu •.tʰu.ən •.•.nɑ mu.nɑ.lɛ tʰi.• puɑ.• •.lɛ mi.•.•.• xɛ.•.tʰɛ sɛ.puɑ.lɛ sɛ.•.i ɔ.sɛ tɑ.•.ɔ/ɑ.piu.•.• puɑ.•.•.xɛ

sɤ.kɤ.fu su.tʰu.ən pʰwo.ɤ.na mu.na.lɤ tʰi.jin pa.mu pʰu.lɤ mi.tʂɹ̩.wo.əɹ xei.ʂɹ̩.tʰɤ sɤ.fa.lɤ sɤ.la.ji wo.sɤ ta.əɹ.a.fu.ʨi.jin pa.əɹ.ka.xei

сақф сутун пағна минора тим63 бом пул × хишт / ғишт сафол сарой васса дорафзин / дарафзин боргоҳ

呀木哈納

•.•.xɑ.nɑ

ja.mu.xa.na

ёмхона

ɔ/ɑ.ʐʅ.tɛ.• •.lɑŋ.• •.• •.lɛ •.•.tʰu.• ɔ/ɑ.sɿ.• kɒʊ kuɔ.sɿ.puiɛn.tɛ •.•.• muɑ.xi •.ʦɿ puɛ.tʰɛ lu.puɑ.xɛ xɛ.•.kuɔ.• •.• sɛ.• puɑ.•.•.tʰu.•.• ɔ/ɑ.sɿ.•.• xan.tu.nɑ •.ʦɿ •.nu.•.• puɑ.• muɑ.• •.•.• ku.•.puɛ mu.•

a.ɹɹ̩.tɤ.əɹ pʰwo.lɑŋ.kʰɤ ʂɤ.əɹ fei.lɤ wu.ʂu.tʰu.əɹ a.sɹ̩.pu kɑu kwo.sɹ̩.fan.tɤ mu.əɹ.ɤ ma.ɕi ka.ʦɹ̩ pai.tʰɤ lu.pa.xei xei.əɹ.kwo.ʂɹ̩ xu.kʰɤ sɤ.kʰɤ fa.la.ʂɹ̩.tʰu.lu.kʰɤ a.sɹ̩.wo.əɹ xan.tu.na ɥy.ʦɹ̩ tʂa.nu.wo.əɹ pa.ʂɤ ma.əɹ mu.ʂɹ̩.kʰɤ ku.əɹ.pai mu.ʂɹ̩

аждар паланг шер фил уштур / шутур асп / асб гов / гав гӯсфанд / гусфанд мурғ моҳӣ қоз бат рӯбоҳ / рӯбаҳ / рубоҳ харгӯш / харгуш хук саг фароштурук × ҳамдуна / ҳамдӯна юз ҷонвар / ҷонавар65 боша мор мушк гурба муш

塞革夫速禿恩迫額納母納勒梯尹巴木僕勒米知斡兒黒石忒塞法勒塞剌衣我塞打兒阿夫即尹

The “birds and beasts” section

鳥獸門

‫ﺍَﮊﺩ ْﺭ‬ ‫ﭘﻠﻨﮏ‬ ‫ﺷﻴﺮ‬ ‫ﻓﻴﻞ‬ ‫ﺍﺷﱰ‬ ‫ﺍﺳﺐ‬ ‫ﮐﺎﻭ‬ ‫ﮐ ﻮ ﺳ ﻔ ﻨﺪ‬ ‫ﻣﺮ ﻍ‬ ‫ﻣﺎﻫﯽ‬ ‫ﻗﺎﺯ‬ ‫ﺑﻂ‬ ‫ﺭﻭﺑﺎه‬ ‫ﺧﺮﮐﻮﺵ‬ ‫ﺧ ﻮﮎ‬ ‫ﺳﮏ‬ ‫ﻓﺮ ﺍﺷﱰ ﮎ‬ ‫ﺍﺳﻮﺭ‬ ‫ﺣﻤﺪﻭﻧﻪ‬ ‫ﯾﻮﺯ‬ ‫ﺟﺎﻧﻮﺭ‬ ‫ﺑﺎﺷﻪ‬ ‫ﻣﺎﺭ‬ ‫ﻣﺸ ﮏ‬ ‫ﮐﺮﺑﻪ‬ ‫ﻣﻮﺵ‬

roof beam pillar; column step; stairs pagoda shop eave bridge corridor; porch brick tile hall rafter railing red steps leading to a palace posthouse inn


龍虎獅64 象駝馬牛羊雞魚鵞鴨狐兎猪犬燕鴈猴豹鶯鷂蛇麝猫

dragon tiger lion elephant camel horse cow sheep fowl; chicken fish goose duck fox rabbit pig dog swallow wild goose monkey leopard oriole sparrow hawk snake musk deer cat mouse; rat

阿日得兒迫郞克賒兒非勒五束土兒阿思卜髙果思番得木兒額馬希噶子百忒魯巴黒黒兒鍋石乎克塞克法剌石土路克阿思斡兒罕都納迂子扎奴斡兒巴舎媽兒木石克古兒百母石

63 Modern Tajik тим ‘large caravanserai; covered bazaar’ seems to diverge semantically from 店 ‘shop’. Honda (1963: 15) identifies Timurid Persian ‫ ﺗﻴﻢ‬as a loanword from Chinese.

64 According to Tōdō and Kanō (2005: 1132), 獅 shī ‘lion’ is a loanword from kodai perushago lit. ‘ancient Persian’.

65 Modern Tajik ҷон(а)вар ‘animal’.


402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424

‫ﻣﮑ ﺲ‬ ‫ﭘﺸ ﻪ‬ ‫ﮊ ﻣﺮه‬ ‫ﭘﺮﻭﺍﻧﻪ‬ ‫ﮐﺮﻡ‬ ‫ﻣﻮﺭﭼﻪ‬ ‫ﭘﺮﻳﺪﻥ‬ ‫ﺑﺎﻧﮏ‬ ‫ﭘ ﺸﻢ‬ ‫ﺑﺎﻝ‬ ‫ﺳ ﻨﺐ‬ ‫ﺟ ﻨﮑ ﺎ ﻝ‬ ‫ﻣﻨﻘﺎﺭ‬ ‫ﻓﻠﻮﺱ‬ ‫ﻇﺮﺍﻓﻪ‬ ‫ﺳ ﻴﻤ ﺮ ﻍ‬ ‫ﻃﺮ ﻃ ﯽ‬ ‫ﻟﺤ ﺎﻡ‬ ‫ﻃﺎﻭﺱ‬ ‫ﭼ ﻐﺰ‬ ‫ﭼﺮ ﭼﺮ ﯼ‬ ‫ﻭﺭﺗﻴﺞ‬ ‫ﻣﻠﺦ‬


蝿蚊蟬蛾蟲蟻飛鳴毛翅蹄爪嘴麟66 麒麟68 鳳凰鸚鵡鴛鴦孔雀蝦蟇翡翠鵪鶉蝗虫

‫ﺩ ﺭ ﺧﺖ‬ ‫ﭼﻮﺏ‬ ‫ﺗﻮﺕ‬ ‫ﺑ ﻴﺪ‬ ‫ﴎﻭ‬ ‫ﺍﺑﺨﻞ‬ ‫ﮐﻞ‬ ‫ﻋﻠﻒ‬ ‫ﺑﺎﺩﺭﻧﮏ‬ ‫ﻣﻴ ﻮه‬ ‫ﻣﺮﻭﺩ‬ ‫ﭼﺒﻐﺎﻥ‬ ‫ﺁﻟ ﻮ‬ ‫ﺯﺭﺩﺍﺭﻭ‬ ‫ﺷﻔﺘﺎﻟﻮ‬ ‫ﺍﺑﺎﺭ‬

黙革思迫舍日母勒迫兒洼納乞林抹兒徹迫里丹邦克迫深巴勒孫卜展噶勒敏噶兒府羅思祖剌法洗木兒額脱推魯哈木他屋思徹額子赤兒赤里我兒梯知黙勒黒

•.kɛ.sɿ •.• ʐʅ.mu.lɛ •.•.•.nɑ •.lin muɔ.•.• •.•.tan puɑŋ.• •.• puɑ.lɛ sun.• tʂan.•.lɛ min.•.• piu.luɔ.sɿ •.•.puɑ si.•.•.• tʰuɔ.• lu.xɑ.• tʰɑ.•.sɿ •.•.ʦɿ •.•.•.• ɔ.•.tʰi.• •.lɛ.xɛ

mwo.kɤ.sɹ̩ pʰwo.ʂɤ ɹɹ̩.mu.lɤ pʰwo.əɹ.wa.na ʨʰi.lin mwo.əɹ.tʂʰɤ pʰwo.li.tan pɑŋ.kʰɤ pʰwo.ʂən pa.lɤ swən.pu tʂan.ka.lɤ min.ka.əɹ fu.lwo.sɹ̩ ʦu.la.fa ɕi.mu.əɹ.ɤ tʰwo.tʰwei lu.xa.mu tʰa.wu.sɹ̩ tʂʰɤ.ɤ.ʦɹ̩ tʂʰɹ̩.əɹ.tʂʰɹ̩.li wo.əɹ.tʰi.tʂɹ̩ mwo.lɤ.xei

магас пашша × парвона кирм мӯрча / мурча паридан бонг пашм бол сунб / сум(м) чангол минқор фулус67 заррофа69 симурғ тӯтӣ × товус чағз чирчирӣ вартиш малах

tɛ.lɛ.xɛ.tʰɛ •.• tʰu.tʰɛ piɛ.tɛ •.lu •.•.xu.lɛ •.lɛ ɔ/ɑ.lɛ.piu puɑ.tɛ.lɑŋ.• miɛ.• mu.lu.tɛ •.•.ɔ/ɑ.ən ɔ/ɑ.lu ʦɛ.•.tɑ.lu •.piu.tʰɑ.lu ɔ/ɑ.nɑ.•

tɤ.lɤ.xei.tʰɤ ʂwo.pu tʰu.tʰɤ pje.tɤ su.lu wu.pu.xu.lɤ ku.lɤ a.lɤ.fu pa.tɤ.lɑŋ.kʰɤ mje.wo mu.lu.tɤ tʂʰɹ̩.pu.a.ən a.lu ʦɤ.əɹ.ta.lu ʂɤ.fu.tʰa.lu a.na.əɹ

дарахт чӯб тут бед сарв × гул алаф бодиринг / бодранг70 мева муруд / амруд × олу зардолу шафтолу анор

The “flowers and trees” section

花木門

425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440

fly mosquito cicada moth insect; worm ant fly cry hair; feather wing hoof claw; talon mouth; beak (female) unicorn kylin; unicorn phoenix parrot mandarin duck peacock frog; toad halcyon quail grasshopper; locust

樹木桑柳松栢花草瓜果梨棗李杏桃榴

plant; tree tree; wood mulberry willow pine tree cypress flower grass melon fruit pear jujube plum apricot peach pomegranate

得勒黒忒卜禿忒別得速魯五卜戸勒故勒阿勒夫巴得郞克滅斡母魯得赤卜阿恩阿魯則兒打魯舍夫他魯阿納兒

66 This 麟 lín is probably a mistakenly written 鱗 lín ‘fish scale’ (see Honda 1963: 17).

67 Modern Tajik фулус ‘fish scales’.

68 The Japanese word 麒麟 kirin (qílín in Chinese pīnyīn) is used in reference to ‘giraffe’.

69 Modern Tajik заррофа ‘giraffe’.

70 Modern Tajik бодиринг means not ‘melon’ but ‘cucumber’. Cucumber is called 黄瓜 huángguā lit. ‘yellow melon’ or 胡瓜 húguā lit. ‘melon introduced from the northern/western ethnic groups’ in modern Standard Chinese. (Note that both 黄瓜 and 胡瓜 contain the glyph 瓜 guā ‘melon’.)


441

‫ﺧ ﺘﻤ ﯽ‬

葵

442 443 444 445 446 447 448 449 450 451 452

‫ﺍﻗﺨﻮﺍﻥ‬ ‫ﺑﺎﺑﻮﻧﺞ‬ ‫ﻏﻨﭽ ﻪ‬ ‫ﻧﯽ‬ ‫ﮐﻞ ﻧﻴﻠﻮﻓﺮ‬ ‫ﺑﺎﺩﻧﺠﺎﻥ‬ ‫ﭘﻴﺎﺯ‬ ‫ﺯﻧﺠﺒﻴﻞ‬ ‫ﺳﻴﺮ‬ ‫ﺷ ﺎﻟ ﯽ‬ ‫ﮐ ﻨﺪ ﻡ‬

桂

453 454 455 456

‫ﻃﺮﺍﻭﺕ‬ ‫ﻣﺎ ﺵ‬ ‫ﻗﻠﻘﺎﺱ‬72 ‫ﮐ ﻨﺐ‬

新鮮豆

457 458 459 460 461 462 463 464 465 466

‫ﮐﻨﺪﻧﺎ‬ ‫ﺷﺎﺥ‬ ‫ﺑ ﺮﮎ‬ ‫ﻧﻴ ﺶ ﻧ ﯽ‬ ‫ﺑﻴ ﺦ‬ ‫ﺳﻠﻴﺤﻪ‬73 ‫ﺳﺒﺴﺖ‬74 ‫ﺍﻧﮑﻮﺭ‬ ‫ﻓﺨﺮ ﺏ‬ ‫ﮐﻼ ﺏ‬

韭

467 468 469 470 471 472 473 474 475 476 477

‫ﮐﻤﺎﻥ‬ ‫ﺗﻴﺮ‬ ‫ﺧﻮﺩ‬ ‫ﺟﻮﺷﻦ‬ ‫ﻧ ﻴﺰه‬ ‫ﮐﺎﺭﺩ‬ ‫ﺭﮐ ﺎ ﺏ‬ ‫ﺟﻨﺎﻕ‬ ‫ﻃﺒ ﻖ‬ ‫ﮐ ﺎﺳ ﻪ‬ ‫ﴏﺍﺣﯽ‬

菊蕋竹蓮茄葱薑蒜稻麥

芋麻

枝葉笋根牡丹苜蓿蒲萄浮萍薔薇

geraniums, hollyhocks, mallows, etc. fragrant olive chrysanthemum stamen bamboo lotus aubergine onion ginger garlic rice wheat, barley, oats, etc. fresh bean taro; potatoes hemp, flax, jute, etc. leek; chive branch leaf bamboo shoot root peony lucerne grape duckweed rose


xei.tʰɤ.mi

хатмӣ

•.kɛ.xɛ.•.ən puɑ.•.nɑ.• un.• 柰 nai 故勒你魯法兒 •.lɛ.ni.lu.puɑ.• 把廷扎恩 puɑ.•.•.ən 痞呀子 pʰi.•.ʦɿ 簮知必勒 •.•.pi.lɛ 西兒 •.• 沙里 ʂɑ.• 敢敦 kan.tun

wu.kɤ.xei.wa.ən pa.pu.na.tʂɹ̩ wən.tʂʰɤ nai ku.lɤ.ni.lu.fa.əɹ pa.tʰjəŋ.tʂa.ən pʰi.ja.ʦɹ̩ ʦan.tʂɹ̩.pi.lɤ ɕi.əɹ ʂa.li kan.twən

уқҳувон / ақҳавон ×71 ғунча най гули нилуфар бодинҷон / бодимҷон пиёз занҷабил сир шолӣ гандум

忒剌斡忒媽石

tʰɛ.•.•.tʰɛ muɑ.• ku.lɛ.•.sɿ •.nɑ.•

tʰɤ.la.wo.tʰɤ ma.ʂɹ̩ ku.lɤ.ka.sɹ̩ kʰɤ.na.pu

тароват мош × канаб

kan.tɛ.nɑ ʂɑ.xɛ puɛ.•.• ni.•.nai piɛ.xɛ sɛ.•.xɛ si.•.•.tʰɛ an.•.• puɑ.xɛ.•.• ku.•.•

kan.tɤ.na ʂa.xei pai.əɹ.kʰɤ ni.ʂɹ̩.nai pje.xei sɤ.li.xei ɕi.pu.ɕi.tʰɤ an.ku.əɹ fa.xei.lu.pu ku.la.pu

гандано шох барг неши най бех × × ангур × гулоб

•.muɑ.ən tʰi.• •.tɛ tʂuɔ.ʂan nai.ʦɛ •.•.tɛ •.•.• tʂʉ.nɑ.kɛ tʰɛ.puɛ.kɛ •.sɛ •.•.xɛ

kʰɤ.ma.ən tʰi.əɹ xu.tɤ tʂwo.ʂan nai.ʦɤ ka.əɹ.tɤ li.ka.pu tʂu.na.kɤ tʰɤ.pai.kɤ ka.sɤ su.la.xei

камон тир хӯд ҷавшан найза корд рикоб ҷаноғ табақ коса суроҳӣ

黒忒密

xɛ.tʰɛ.mi

五革黒洼恩巴卜納知穩徹

古勒噶思克納卜敢得納沙黒百兒克你石柰別黒塞里黒洗卜細忒俺姑兒法黒路卜古剌卜

The “utensils” section

器用門弓箭盔甲鎗刀鐙盤碗壺

bow arrow helmet armour spear knife; sword stirrup saddleflap plate; dish bowl; cup pot; bottle

克媽恩梯兒乎得卓山乃則噶兒得里噶卜主納革忒百革噶塞速剌黒

71 Modern Tajik бобуна means ‘camomile’.

72 According to Steingass (2012: 985), ‫ ﻗﻠﻘﺎﺱ‬means “[t]he root of a plant which is edible when cooked”.

73 According to Steingass (2012: 695), ‫ ﺳﻠﯿﺨﺔ‬means “[a] certain perfume; benzoin or balsam of the bān-tree before it is prepared”.

74 According to Steingass (2012: 652), ‫ ﺳﭙﺴﺖ‬has “[t]refoil, clover” as one of its meanings.


478 479 480 481 482 483 484 485 486 487 488 489

‫ﭼﻮﮐﯽ‬75 ‫ﺗﻤﻐﺎ‬ ‫ﺁﻳﻨﻪ‬ ‫ﻏﮋ ﮎ‬ ‫ﺷﻄﺮﻧﺞ‬ ‫ﻳﻮﯼ‬ ‫ﻧ ﻘﺸ ﻴ ﻦ‬ ‫ﻋﻠﻢ‬ ‫ﭼﱰ‬ ‫ﮐﻮﺯه‬ ‫ﮐﺸﺘﯽ‬ ‫ﮐﺮﺩﻭﻥ‬

筯印

490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516

‫ﺑﻮﺭﻳﺎ‬ ‫ﻧﺎﯼ‬ ‫ﺩﻫﻞ‬ ‫ﻗﺪﺡ‬ ‫ﻟﮑﺎﻡ‬ ‫ﺍﮐﺰ‬ ‫ﭘﺮﺩه‬ ‫ﺩﺭﻓﺶ‬ ‫ﺁﺳﻴﺎ‬ ‫ﮐﻮﺑﻪ‬ ‫ﺟﻮﺍﺯ‬ ‫ﭼﺮﺍﻍ‬ ‫ﺧﻢ‬ ‫ﺑﺎﺩﺑﺎﻥ‬ ‫ﺗﺎﺯﻳﺎﻧﻪ‬ ‫ﮐﻮﯼ‬ ‫ﻣﺴﻘﺎﺭ‬ ‫ﺷﻴﺮه‬77

席

‫ﻟﻄﻦ‬78 ‫ﺑﺎﺩﻭﻳﺰﻥ‬ ‫ﺩﻳﮏ‬ ‫ﺳﻔﻂ‬ ‫ﻓﺎﻧﻮﺱ‬ ‫ﺷﻤ ﻊ‬ ‫ﻣﺤﻔﻪ‬ ‫ﻣ ﺠﻤ ﺮ‬

鏡琴棋香畫旗傘瓶船車

笛鼓鍾轡鈎簾錐磨杵臼燈甕篷鞭毬笙卓櫈盆扇鍋箱燈籠蠟燭轎子香爐

chopsticks stamp; seal mirror zither chess perfume paint; drawing flag umbrella; parasol bottle boat vehicle; wheeled instrument/machine seat flute drum bell; goblet bridle hook curtain awl mill pestle mortar lamp urn awning; sail whip ball reed pipe table stool basin fan pot; cauldron box; chest lantern candle sedan chair incense burner

搠几貪阿阿衣納額日克舍忒藍知鉢衣納革石尹阿藍徹忒兒科則起石梯革兒都恩

鉢兒呀納衣堵戸勒革得黒魯噶木阿革子迫兒得堵路夫石阿洗呀科百主洼子赤剌額昏巴得巴恩他子呀納鍋衣母洗噶兒史勒散得里勒團巴得月簮迭克塞法忒法奴思舎黙額黙黒法米知黙兒

•.• tʰan.ɔ/ɑ ɔ/ɑ.i.nɑ •.ʐʅ.• •.tʰɛ.lan.• •.i nɑ.kɛ.•.• ɔ/ɑ.lan •.tʰɛ.• kʰuɔ.ʦɛ kʰi.•.tʰi kɛ.•.tu.ən

ʂwo.ʨi tʰan.a a.ji.na ɤ.ɹɹ̩.kʰɤ ʂɤ.tʰɤ.lan.tʂɹ̩ pwo.ji na.kɤ.ʂɹ̩.jin a.lan tʂʰɤ.tʰɤ.əɹ kʰɤ.ʦɤ ʨʰi.ʂɹ̩.tʰi kɤ.əɹ.tu.ən

× тамға оина ғижжак шатранҷ / сатранҷ бӯй / бӯ / бу нақшин алам чатр кӯза киштӣ гардун

•.•.• nɑ.i •.xu.lɛ kɛ.tɛ.xɛ lu.•.• ɔ/ɑ.kɛ.ʦɿ •.•.tɛ •.•.piu.• ɔ/ɑ.si.• kʰuɔ.puɛ tʂʉ.•.ʦɿ •.•.• xun puɑ.tɛ.puɑ.ən tʰɑ.ʦɿ.•.nɑ kuɔ.i mu.si.•.• ʂʅ.lɛ san.tɛ.• lɛ.• puɑ.tɛ.iuɛ.• •.• sɛ.puɑ.tʰɛ puɑ.nu.sɿ •.•.• •.xɛ.puɑ mi.•.•.•

pwo.əɹ.ja na.ji tu.xu.lɤ kɤ.tɤ.xei lu.ka.mu a.kɤ.ʦɹ̩ pʰwo.əɹ.tɤ tu.lu.fu.ʂɹ̩ a.ɕi.ja kʰɤ.pai tʂu.wa.ʦɹ̩ tʂʰɹ̩.la.ɤ xwən pa.tɤ.pa.ən tʰa.ʦɹ̩.ja.na kwo.ji mu.ɕi.ka.əɹ ʂɹ̩.lɤ san.tɤ.li lɤ.tʰwan pa.tɤ.ɥe.ʦan tje.kʰɤ sɤ.fa.tʰɤ fa.nu.sɹ̩ ʂɤ.mwo.ɤ mwo.xei.fa mi.tʂɹ̩.mwo.əɹ

бӯрё / бурё най / ной дуҳул / дӯл қадаҳ лигом / лагом окаҷ парда дарафш / дирафш осиёб / осиё кӯба76 ҷувоз чироғ / чароғ хум бодбон тозиёна гӯ(й) мусиқор × сандалӣ лаган бодбезан / бодбизан дег сабад фонус шамъ миҳаффа миҷмар

75 According to Jarring (1964: 76), čökɛ ‘chop-sticks’ exists in Turkic dialects spoken in the southern part of Xinjiang.

76 Modern Tajik кӯба ‘mallet; hammer’.

77 The meanings of ‫ ﺷﯿﺮه‬listed in Steingass (2012: 774) include “a tray with a leg to stand upon”.

78 ‫ ﻟﻄﻦ‬may be a misspelt ‫ﻟﮑﻦ‬, which, according to Steingass (2012: 1128), has the meaning of “a basin, bowl”.


The “clothing” section

衣服門

517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537

‫ﺟﺎﻣﻪ‬ ‫ﺗﺎﺝ‬ ‫ﻣﻮﺯه‬ ‫ﮐﻤ ﺮ‬ ‫ﮐﺘ ﯽ‬ ‫ﺗﻮﺍﺭ‬79 ‫ﺣﺮ ﻳﺮ‬ ‫ﻻﯼ‬ ‫ﮐﺮﺑﺎﺱ‬ ‫ﺗﻮﺭﻗﻮ‬82 ‫ﺍﺑﺮﻳﺸﻴﻢ‬ ‫ﺭﻳﺸﺘﻪ‬ ‫ﺍ ﺑﺮه‬ ‫ﺍﺳﱰ‬ 83 84 85 86

‫ﮐﺮﻳﺒﺎﻥ‬ ‫ﻧﻤﺪ‬ ‫ﻗﻀﺎﻏﻨﺪ‬


衣冠靴帯錦叚80 綾羅布絹絲線表裏襟袖綿帽領氊被

clothes; garment hat boot belt brocade satin81 thin silk net; silk gauze cloth plain silk raw silk; silk thread thread; string; line surface lining; inside front of a garment sleeve cotton hat; cap collar; neck felt quilt; blanket

扎黙他知抹則克黙兒克梯忒洼兒黒里兒剌衣克兒巴思土兒孤阿卜列石尹里石忒阿卜勒阿思忒兒打蠻阿思梯尹敏搭禿苦剌黒己里巴恩納黙得革咱安得

•.• tʰɑ.• muɔ.ʦɛ •.•.• •.tʰi tʰɛ.•.• xɛ.•.• •.i •.•.puɑ.sɿ tʰu.•.ku ɔ/ɑ.•.liɛ.•.• •.•.tʰɛ ɔ/ɑ.•.lɛ ɔ/ɑ.sɿ.tʰɛ.• tɑ.muan ɔ/ɑ.sɿ.tʰi.• min.tɑ.tʰu kʰu.•.xɛ •.•.puɑ.ən nɑ.•.tɛ kɛ.•.an.tɛ

tʂa.mwo tʰa.tʂɹ̩ mwo.ʦɤ kʰɤ.mwo.əɹ kʰɤ.tʰi tʰɤ.wa.əɹ xei.li.əɹ la.ji kʰɤ.əɹ.pa.sɹ̩ tʰu.əɹ.ku a.pu.lje.ʂɹ̩.jin li.ʂɹ̩.tʰɤ a.pu.lɤ a.sɹ̩.tʰɤ.əɹ ta.man a.sɹ̩.tʰi.jin min.ta.tʰu kʰu.la.xei ʨi.li.pa.ən na.mwo.tɤ kɤ.ʦa.an.tɤ

ҷома тоҷ мӯза камар × × ҳарир лой карбос × абрешим ришта абра астар доман остин × кулоҳ гиребон намад қазоган(д) / қазоған(д)

79 According to Steingass (2012: 332), ‫ ﺗﻮﺍﺭ‬means “[a] rope for tying on a load”.

80 Honda (1963: 20) identifies the glyph 叚 jiǎ ‘false; borrow’, which has as one of its alternative forms, as 段 duàn ‘step’.

81 This is not the meaning of 叚 jiǎ ‘false; borrow’, but that of 緞 duàn ‘satin’. I tentatively assume that 叚 here is misspelt for, or is meant to represent, 緞, because this particular entry is in the ‘clothing’ section, and also because 緞 appears in the place of 叚 in another copy of Huihuiguan zazi (Beijing tushuguan guji chuban bianji zu 1987–[1994]: 553).

82 Clauson’s dictionary of pre-thirteenth-century Turkic (1972: 539) has torku: ‘silk fabric’ as one of its entries, but Clauson suspects that it may be a loanword.

83 The entry numbered 531 in Honda (1963: 20) is absent in the Berlin Manuscript, hence the blank. The entry is ‫ ﺩﺍﻣﻦ‬in Beijing tushuguan guji chuban bianji zu (1987–[1994]: 499, 554), from which the Chinese translation 襟 and Chinese-script transcription 打蠻 in this row are retrieved.

84 The entry numbered 532 in Honda (1963: 20) is absent in the Berlin Manuscript, hence the blank. The entry appears as ‫ ﺁﺳﺘﻴﯽ‬and ‫ ﺁﺳﺘﻦ‬in different pages of Beijing tushuguan guji chuban bianji zu (1987–[1994]: 499, 554), from which the Chinese translation 袖 and Chinese-script transcription 阿思梯尹 in this row are retrieved.

85 The entry numbered 533 in Honda (1963: 20) is absent in the Berlin Manuscript, hence the blank. The entry is ‫ ﻣﻨﺪﺍﺗﻮ‬in Beijing tushuguan guji chuban bianji zu (1987–[1994]: 500, 554), from which the Chinese translation 綿 and Chinese-script transcription 敏搭禿 in this row are retrieved.

86 The entry numbered 534 in Honda (1963: 20) is absent in the Berlin Manuscript, hence the blank. The entry appears as ‫ ﮐﺪﺍ ه‬and ‫ ﮐﻼه‬in different pages of Beijing tushuguan guji chuban bianji zu (1987–[1994]: 500, 554), from which the Chinese translation 帽 and Chinese-script transcription 苦剌黒 in this row are retrieved.


538 539 540 541 542

‫ﭘ ﺴﱰ‬ ‫ﺑ ﺎﻟ ﺶ‬ ‫ﺟﻮﺍﻝ‬ ‫ﻓﻮﻃﻪ‬ ‫ﻧ ﻴﻤ ﺘ ﻨ ﻪ‬


褥枕袋手巾短衫

543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574

‫ﮐﻮﺷﺖ‬ ‫ﺁﺵ‬ ‫ﺁﺭﺩ‬ ‫ﺭ ﻭ ﻏﻦ‬ ‫ﻧﻤ ﮏ‬ ‫ﴍﺍﺏ‬ ‫ﴎﮐ ﻪ‬ ‫ﺷ ﮑﺮ‬ ‫ﻋ ﺴﻞ‬ ‫ﺩﺍﺭﻭ‬ ‫َﭼﺎ‬ ‫ﮐﺮﺳﻨ ﻪ‬ ‫ﺳ ﻴﺮ‬ ‫ﺗﺸ ﻨ ﻪ‬ ‫ﺁﺷﺎﻣﻴﺪﻥ‬ ‫ﺧﻮﺭﺩﻥ‬ ‫ﻣ ﺰه‬ ‫ﺷﻴﺮ ﻳ ﻦ‬ ‫ﺗﻠﺦ‬ ‫ﺗﺮ ﺵ‬ ‫ﺗ ﺮه‬ ‫ﺷﻮﺭ‬ ‫ﺳﻮﺧﺘﻦ‬ ‫ﺟﻮﺷﻴﺪﻥ‬ ‫ﺧ ﺎﻡ‬ ‫ﭘﺨﺘ ﻪ‬ ‫ﺷﻮﺭﺑﺎ‬ ‫ﻧﺎ ﻥ‬ ‫ﺭﻭﻏﻦ ﮐﺎﻭ‬ ‫ﺟﻐﺮﺍﺕ‬ ‫ﮐﻮﻣﻪ‬

575

‫ﺁﻣﻴﺨﺘﻦ‬

pʰi.sɹ̩.tʰɤ.əɹ pa.li.ʂɹ̩ tʂu.wa.lɤ fu.tʰɤ nin.tʰɤ.na

бистар болиш / болишт ҷувол фута / фӯта нимтана

主額剌忒科黙

tʰu.• kuɔ.•.tʰɛ ɔ/ɑ.• ɔ/ɑ.•.tɛ luɔ.an nɑ.•.• ʂɛ.•.• si.•.• ʂɛ.•.• ɔ/ɑ.sɛ.lɛ tɑ.lu pun.• ku.•.sɿ.nɑ siɛ.• tʰɛ.•.nɑ ɔ/ɑ.ʂɑ.mi.tan xuɔ.•.tan •.ʦɛ ʂʅ.•.• tʰɛ.lɛ.xɛ tʰu.•.• tʰɛ.•.lɛ ʂuɔ.• suɔ.xɛ.tʰan tʂuɔ.•.tan xɑ.ən •.xɛ.tʰɛ ʂuɔ.•.puɑ nɑ.ən luɔ.an.kɒʊ tʂʉ.•.•.tʰɛ kʰuɔ.•

tʰu.ʨi kwo.ʂɹ̩.tʰɤ a.ʂɹ̩ a.əɹ.tɤ lwo.an na.mwo.kʰɤ ʂɤ.la.pu ɕi.əɹ.kʰɤ ʂɤ.kʰɤ.əɹ a.sɤ.lɤ ta.lu pən.jin ku.əɹ.sɹ̩.na ɕje.əɹ tʰɤ.ʂɹ̩.na a.ʂa.mi.tan xwo.əɹ.tan mwo.ʦɤ ʂɹ̩.li.jin tʰɤ.lɤ.xei tʰu.lu.ʂɹ̩ tʰɤ.əɹ.lɤ ʂwo.əɹ swo.xei.tʰan tʂwo.ʂɹ̩.tan xa.ən pʰu.xei.tʰɤ ʂwo.əɹ.pa na.ən lwo.an.kɑu tʂu.ɤ.la.tʰɤ kʰɤ.mwo

× гӯшт ош орд равған намак шароб сирка / сирко шакар асал дору чой гурусна сер ташна ошомидан хӯрдан маз(з)а ширин талх турш тарра шӯр сӯхтан ҷӯшидан хом пухта шӯрбо нон равғани гов ҷурғот / ҷуғрот ×

阿滅黒貪

ɔ/ɑ.miɛ.xɛ.tʰan

a.mje.xei.tʰan

омехтан

痞思忒兒把力石主洼勒府忒恁忒納

pʰi.sɿ.tʰɛ.• puɑ.•.• tʂʉ.•.lɛ piu.tʰɛ nin.tʰɛ.nɑ

The “eating and drinking” section

飲食門

‫ﺗﻮﮐﯽ‬87

mattress pillow bag; pouch; sack towel short upper garment

米肉飯麪油鹽酒醋糖蜜藥茶饑飽渴飲喫味甜苦酸辣鹹焼煮生熟湯餅酥酪醤調和

husked rice meat; flesh cooked rice; meal flour; noodle oil salt alcoholic drink vinegar sugar honey medicine tea hungry full thirsty drink eat; consume taste sweet bitter sour peppery; pungent salty bake boil raw ripe soup round flat cake butter junket; curd (bean) sauce made by fermenting mix; blend

土几鍋石忒阿石阿兒得羅安納黙克捨剌卜洗兒克捨克兒阿塞勒打魯本音88 古兒思納兒忒石納阿沙米丹火兒丹黙則史里尹忒勒黒土路石忒兒勒朔兒鎖黒貪卓石丹哈恩僕黒忒朔兒巴納恩羅安髙

87 Clauson’s (1972: 478) pre-thirteenth-century Turkic dictionary has tögi: ‘crushed or cleaned cereal’ as one of its entries, while Doerfer (1965: 629–630) lists توگی ‘millet’ as a loanword in New Persian.
88 This is arguably not a transcription but a note to the reader as 本音 běn yīn means ‘this sound’ in Chinese. This is to say that the Timurid Persian word for “tea” had a similar pronunciation to that of 茶 chá ‘tea’ in Ming-period Beijing Chinese. 本音 also appears in the appendix section of the Berlin Manuscript (entry number 806 in Honda 1963: 29).

The “treasure” section 珍寳門

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
576 | زر | 金 | gold | 則兒 | ʦɛ.• | ʦɤ.əɹ | зар
577 | نقره | 銀 | silver | 奴革勒 | nu.kɛ.lɛ | nu.kɤ.lɤ | нуқра
578 | مروارید | 珠 | pearl | 黙兒洼里得 | •.•.•.•.tɛ | mwo.əɹ.wa.li.tɤ | марворид
579 | یشم | 玉 | jade | 夜深 | iɛ.• | je.ʂən | яшм
580 | مس | 銅 | copper | 密思 | mi.sɿ | mi.sɹ̩ | мис
581 | آهن | 鐵 | iron | 阿罕 | ɔ/ɑ.xan | a.xan | оҳан
582 | قاش [89] | 錢 | coin | 噶石 | •.• | ka.ʂɹ̩ | ×
583 | جیزی | 物 | thing | 赤則 | •.ʦɛ | tʂʰɹ̩.ʦɤ | чизе
584 | سرب | 鉛 | lead | 速兒卜 | •.•.• | su.əɹ.pu | сурб
585 | ارزیز | 錫 | tin | 阿兒即子 | ɔ/ɑ.•.•.ʦɿ | a.əɹ.ʨi.ʦɹ̩ | арзиз
586 | سبیده | 粉 | powder | 洗撇得 | si.pʰiɛ.tɛ | ɕi.pʰje.tɤ | сапеда
587 | رخت | 貨 | goods; property | 勒黒忒 | lɛ.xɛ.tʰɛ | lɤ.xei.tʰɤ | рахт
588 | جزع | 瑪瑙 | agate | 止則額 | tʂʅ.ʦɛ.• | tʂʰɹ̩.ʦɤ.ɤ | ҷазъ
589 | مرجان | 珊瑚 | coral | 黙兒扎恩 | •.•.•.ən | mwo.əɹ.tʂa.ən | марҷон
590 | بلور | 水晶 | crystal | 卜魯兒 | •.lu.• | pu.lu.əɹ | булӯр
591 | کهربای | 琥珀 | amber | 克黒兒巴衣 | •.xɛ.•.puɑ.i | kʰɤ.xei.əɹ.pa.ji | каҳрабо / каҳрабоӣ
592 | کوهر | 寳貝 | treasure | 稿黒兒 | kɒʊ.xɛ.• | kɑu.xei.əɹ | гавҳар
593 | حلبی [90] | 玻瓈 | glass | 黒勒必 | xɛ.lɛ.pi | xei.lɤ.pi | ×

The “voice and countenance” section 聲色門

594 | کبود | 青 | blue-green | 克卜得 | •.•.tɛ | kʰɤ.pu.tɤ | кабуд
595 | سرخ | 紅 | red | 速兒黒 | •.•.xɛ | su.əɹ.xei | сурх
596 | زرد | 黄 | yellow | 則兒得 | ʦɛ.•.tɛ | ʦɤ.əɹ.tɤ | зард
597 | سفید | 白 | white | 洗法得 | si.puɑ.tɛ | ɕi.fa.tɤ | сафед
598 | سیاه | 黒 | black | 洗呀黒 | si.•.xɛ | ɕi.ja.xei | сиёҳ
599 | نوک | 紫 | purple | 那克 | •.• | na.kʰɤ | × [91]
600 | رنکاری | 藍 | blue | 昝噶力 | ʦan.•.• | ʦan.ka.li | зангорӣ
601 | سبز | 綠 | green | 塞卜子 | sɛ.•.ʦɿ | sɤ.pu.ʦɹ̩ | сабз
602 | رنکین | 濃 | deep; thick | 郞几尹 | lɑŋ.•.• | lɑŋ.ʨi.jin | рангин
603 | بی رنک | 淡 | pale; light; thin | 別郞克 | piɛ.lɑŋ.• | pje.lɑŋ.kʰɤ | беранг
604 | رنک کردن | 染 | dye | 郞克克兒丹 | lɑŋ.•.•.•.tan | lɑŋ.kʰɤ.kʰɤ.əɹ.tan | ранг кардан
605 | رنک | 色 | color | 郞克 | lɑŋ.• | lɑŋ.kʰɤ | ранг
606 | جوزی | 茶褐 | dark brown | 爪即 | •.• | tʂɑu.ʨi | ҷавзӣ
607 | آل | 大紅 | bright red | 阿勒 | ɔ/ɑ.lɛ | a.lɤ | ол
608 | سبز روشن | 明綠 | bright green | 塞卜子羅山 | sɛ.•.ʦɿ.luɔ.ʂan | sɤ.pu.ʦɹ̩.lwo.ʂan | сабзи равшан
609 | سبز تلخ | 黒綠 | dark green | 塞卜子忒勒黒 | sɛ.•.ʦɿ.tʰɛ.lɛ.xɛ | sɤ.pu.ʦɹ̩.tʰɤ.lɤ.xei | сабзи талх
610 | فستقی | 柳青 | yellowish green | 非思忒革 | •.sɿ.tʰɛ.kɛ | fei.sɹ̩.tʰɤ.kɤ | пистақ(қ)ӣ / пистоқӣ / пистагӣ

89 قاش may be related to Turkic ka:ş ‘jade’ (Clauson 1972: 669).
90 According to Steingass (2012: 428), حلبی means “[b]elonging to a milch cow; native of Aleppo; white iron, tin-plate (modern colloquialism)”.
91 Modern Tajik нок ‘unclean; impure (musk, ambergris, etc.)’ may be related to this entry, though нок would be spelt ناک in Arabic script (Šukurov et al. 1969a: 865).

The “literature and history” section 文史門

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
611 | شعر | 詩 | poetry; poem | 舎額兒 | •.•.• | ʂɤ.ɤ.əɹ | шеър
612 | دفتر | 書 | script; writing | 得夫忒兒 | tɛ.piu.tʰɛ.• | tɤ.fu.tʰɤ.əɹ | дафтар
613 | عبارت | 文 | literary composition; literary language | 額巴勒忒 | •.puɑ.lɛ.tʰɛ | ɤ.pa.lɤ.tʰɤ | иборат / ибора [92]
614 | خط | 字 | letter; character | 黒忒 | xɛ.tʰɛ | xei.tʰɤ | хатт
615 | کاغذ | 紙 | paper | 噶額子 | •.•.ʦɿ | ka.ɤ.ʦɹ̩ | коғаз
616 | بکه [93] | 墨 | ink | 百克 | puɛ.• | pai.kʰɤ | ×
617 | قلم | 筆 | writing brush | 革藍 | kɛ.lan | kɤ.lan | қалам
618 | دوات | 硯 | inkstone | 得洼忒 | tɛ.•.tʰɛ | tɤ.wa.tʰɤ | давот
619 | قرآن | 經 | scripture | 古剌恩 | ku.•.ən | ku.la.ən | Қуръон
620 | تواریج | 史 | history | 土洼列黒 | tʰu.•.liɛ.xɛ | tʰu.wa.lje.xei | таърих
621 | سوره | 篇 | chapter; passus | 蘇勒 | su.lɛ | su.lɤ | сура
622 | دیباچه | 序 | preface | 底巴徹 | ti.puɑ.• | ti.pa.tʂʰɤ | дебоча
623 | خط تخقیق | 真字 | regular script | 黒忒忒黒革革 | xɛ.tʰɛ.tʰɛ.xɛ.kɛ.kɛ | xei.tʰɤ.tʰɤ.xei.kɤ.kɤ | хатти таҳқиқ
624 | خط مسوده | 草字 | cursive script | 黒忒母嫂斡得 | xɛ.tʰɛ.mu.•.•.tɛ | xei.tʰɤ.mu.sɑu.wo.tɤ | хатти мусаввада
625 | مخلوج | 行書 | semicursive script | 黙黒魯知 | •.xɛ.lu.• | mwo.xei.lu.tʂɹ̩ | × [94]
626 | خط کوفی | 篆字 | seal script | 黒忒科法 | xɛ.tʰɛ.kʰuɔ.puɑ | xei.tʰɤ.kʰɤ.fa | хатти кӯфӣ
627 | بیت | 詞曲 | ci and qu forms of poetry | 擺忒 | puai.tʰɛ | pai.tʰɤ | байт

The “four quarters” section 方隅門

628 | مشرق | 東 | east | 母石力革 | mu.•.•.kɛ | mu.ʂɹ̩.li.kɤ | машрик
629 | مغرب | 西 | west | 黙額力卜 | •.•.•.• | mwo.ɤ.li.pu | мағриб
630 | جنوب | 南 | south | 者奴卜 | tʂɛ.nu.• | tʂɤ.nu.pu | ҷануб
631 | شمال | 北 | north | 石媽勒 | •.muɑ.lɛ | ʂɹ̩.ma.lɤ | шимол
632 | زبر | 上 | up; above | 則百兒 | ʦɛ.puɛ.• | ʦɤ.pai.əɹ | забар
633 | زیر | 下 | below; down | 節兒 | •.• | ʨje.əɹ | зер
634 | چب | 左 | left | 徹卜 | •.• | tʂʰɤ.pu | чап
635 | راست | 右 | right | 剌思忒 | •.sɿ.tʰɛ | la.sɹ̩.tʰɤ | рост
636 | بیش | 前 | front | 撇石 | pʰiɛ.• | pʰje.ʂɹ̩ | пеш
637 | پس | 後 | back | 迫思 | •.sɿ | pʰwo.sɹ̩ | пас
638 | اندرون | 内 | inside | 俺得魯恩 | an.tɛ.lu.ən | an.tɤ.lu.ən | андарун
639 | بیررون | 外 | outside | 別魯恩 | piɛ.lu.ən | pje.lu.ən | берун / бурун
640 | میان | 中 | middle; centre | 米呀恩 | mi.•.ən | mi.ja.ən | миён
641 | کرانه | 邉 | edge; margin | 克剌納 | •.•.nɑ | kʰɤ.la.na | карона
642 | مربغ | 方 | square | 母藍百額 | mu.lan.puɛ.• | mu.lan.pai.ɤ | мураббаъ
643 | مدور | 圓 | circle; round | 母倒斡兒 | mu.•.•.• | mu.tɑu.wo.əɹ | мудаввар
644 | فراخ | 寛 | wide; tolerant | 法剌黒 | puɑ.•.xɛ | fa.la.xei | фарох
645 | تنک | 窄 | narrow | 湯克 | tʰɑŋ.• | tʰɑŋ.kʰɤ | танг

92 Modern Tajik ибора(т) ‘expression; statement’.
93 Honda (1963: 23) identifies this entry (بکه) as a loanword from Chinese. 墨 was /muɛ/ in the Ming-period Beijing Chinese according to Lu (1998: 69), and was, according to Tōdō and Kanō (2005: 381), /muək/ in Zhou, Qin, and Han, /m(b)uək/ in Sui and Tang, and /mo/ in Song, Yuan, and Ming. Pulleyblank (1991: 218) reconstructs the pronunciation of the same glyph in Late Middle Chinese and Early Middle Chinese as /muə̆k/ and /mək/, respectively.
94 The Arabic loanword махлӯъ ‘deposed; cashiered’ (مخلوع in Arabic script) exists in Tajik.

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
646 | کوشه | 角 | corner; horn | 科舍 | kʰuɔ.• | kʰɤ.ʂɤ | гӯша
647 | تک | 底 | bottom | 忒克 | tʰɛ.• | tʰɤ.kʰɤ | таг
648 | حرم | 家 | home | 黒藍 | xɛ.lan | xei.lan | ҳарам [95]
649 | یک در | 間 | space between; room | 夜克得兒 | iɛ.•.tɛ.• | je.kʰɤ.tɤ.əɹ | якдар
650 | اینجا | 這里 | here | 因扎 | in.• | jin.tʂa | инҷо
651 | انجا | 那里 | there | 昻扎 | ɑŋ.• | ɑŋ.tʂa | онҷо / унҷо

The “amount/number” section 數目門

652 | یک | 一 | one | 夜克 | iɛ.• | je.kʰɤ | як
653 | دو | 二 | two | 都 | tu | tu | ду
654 | سه | 三 | three | 㱔 | siɛ | ɕje | се
655 | جهار | 四 | four | 叉哈兒 | tʂʰɑ.xɑ.• | tʂʰa.xa.əɹ | чаҳор / чор
656 | بنج | 五 | five | 潘知 | pʰuan.• | pʰan.tʂɹ̩ | панҷ
657 | شش | 六 | six | 舎石 | •.• | ʂɤ.ʂɹ̩ | шаш / шиш
658 | هفت | 七 | seven | 哈夫忒 | xɑ.piu.tʰɛ | xa.fu.tʰɤ | ҳафт
659 | هشت | 八 | eight | 哈石忒 | xɑ.•.tʰɛ | xa.ʂɹ̩.tʰɤ | ҳашт
660 | نه | 九 | nine | 奴黒 | nu.xɛ | nu.xei | нӯҳ / нуҳ
661 | ده | 十 | ten | 得黒 | tɛ.xɛ | tɤ.xei | даҳ
662 | صد | 百 | hundred | 塞得 | sɛ.tɛ | sɤ.tɤ | сад
663 | هزار | 千 | thousand | 哈咱兒 | xɑ.•.• | xa.ʦa.əɹ | ҳазор
664 | طاق | 单 | single | 他革 | tʰɑ.kɛ | tʰa.kɤ | тоқ
665 | جفت | 雙 | two; double | 住夫忒 | tʂʉ.piu.tʰɛ | tʂu.fu.tʰɤ | ҷуфт
666 | تمن | 萬 | ten thousand | 土蠻 | tʰu.muan | tʰu.man | тӯмон / тумон
667 | سیر | 兩 | two; both | 㱔兒 | siɛ.• | ɕje.əɹ | сер [96]
668 | عدد | 數 | number | 阿得得 | ɔ/ɑ.tɛ.tɛ | a.tɤ.tɤ | адад
669 | ذره | 毫釐 | a minute amount; the least bit | 則兒勒 | ʦɛ.•.lɛ | ʦɤ.əɹ.lɤ | зарра

The “currency” section 通用門

670 | هست | 有 | exist | 哈思忒 | xɑ.sɿ.tʰɛ | xa.sɹ̩.tʰɤ | ҳаст
671 | نیست | 無 | nothing; nonexistent | 乜思忒 | miɛ.sɿ.tʰɛ | mje.sɹ̩.tʰɤ | нест
672 | برابر | 同 | same | 百剌百兒 | puɛ.•.puɛ.• | pai.la.pai.əɹ | баробар
673 | تفاوت | 異 | different | 忒法兀忒 | tʰɛ.puɑ.•.tʰɛ | tʰɤ.fa.wu.tʰɤ | тафовут
674 | بلی | 是 | be; right; yes | 百列 | puɛ.liɛ | pai.lje | бале
675 | فتنه | 非 | be not; wrong; non- | 非忒納 | •.tʰɛ.nɑ | fei.tʰɤ.na | фитна [97]
676 | کاواک | 虚 | false; empty | 噶洼克 | •.•.• | ka.wa.kʰɤ | ковок
677 | خقیقت | 實 | fact; full | 黒革革忒 | xɛ.kɛ.kɛ.tʰɛ | xei.kɤ.kɤ.tʰɤ | ҳақиқат
678 | آهسته | 緩 | slow | 阿思思忒 [98] | ɔ/ɑ.sɿ.sɿ.tʰɛ | a.sɹ̩.sɹ̩.tʰɤ | оҳиста
679 | تیز | 急 | rapid | 貼子 | tʰiɛ.ʦɿ | tʰje.ʦɹ̩ | тез
680 | دشوار | 難 | difficult | 堵石洼兒 | •.•.•.• | tu.ʂɹ̩.wa.əɹ | душвор

95 Modern Tajik ҳарам ‘forbidden (place); harem’.
96 Modern Tajik сер ‘full’. See entry number 556.
97 Modern Tajik фитна ‘discord’.
98 This is probably a misspelt 阿黒思忒 (see Honda 1963: 25; Beijing tushuguan guji chuban bianji zu 1987–[1994]: 510, 566), whose reconstructed Ming-period Beijing Chinese pronunciation and Standard Chinese pronunciation are /ɔ.xɛ.sɿ.tʰɛ/-/ɑ.xɛ.sɿ.tʰɛ/ and /a.xei.sɹ̩.tʰɤ/, respectively.

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
681 | آسان | 易 | easy | 阿撒恩 | ɔ/ɑ.sɑ.ən | a.sa.ən | осон
682 | دور | 遠 | far | 都兒 | tu.• | tu.əɹ | дур
683 | نزدیک | 近 | near | 納子底克 | nɑ.ʦɿ.ti.• | na.ʦɹ̩.ti.kʰɤ | наздик
684 | کشادن | 開 | open | 苦沙丹 | kʰu.ʂɑ.tan | kʰu.ʂa.tan | кушодан
685 | بستن | 閉 | close | 百思貪 | puɛ.sɿ.tʰan | pai.sɹ̩.tʰan | бастан
686 | درست | 精 | refined; energy; essence | 堵路思忒 | •.•.sɿ.tʰɛ | tu.lu.sɹ̩.tʰɤ | дуруст
687 | درشت | 粗 | coarse | 堵路石忒 | •.•.•.tʰɛ | tu.lu.ʂɹ̩.tʰɤ | дурушт
688 | دراز | 長 | long | 得剌子 | tɛ.•.ʦɿ | tɤ.la.ʦɹ̩ | дароз
689 | کوتاه | 短 | short | 科他黒 | kʰuɔ.tʰɑ.xɛ | kʰɤ.tʰa.xei | кӯтоҳ
690 | کلان | 大 | big | 克剌恩 | •.•.ən | kʰɤ.la.ən | калон
691 | خرد | 小 | small | 戸兒得 | xu.•.tɛ | xu.əɹ.tɤ | хурд
692 | بسیار | 多 | much; many | 比洗呀兒 | pi.si.•.• | pi.ɕi.ja.əɹ | бисёр
693 | کمتر | 寡 | few | 堪忒兒 | kʰan.tʰɛ.• | kʰan.tʰɤ.əɹ | камтар
694 | جون | 如 | be like; as | 初恩 | tʂʰu.ən | tʂʰu.ən | чун
695 | اکنون | 今 | now | 阿克奴恩 | ɔ/ɑ.•.nu.ən | a.kʰɤ.nu.ən | акнун
696 | اکر | 若 | if; be like | 阿革兒 | ɔ/ɑ.kɛ.• | a.kɤ.əɹ | агар
697 | کهنه | 舊 | former; old | 科黒納 | kʰuɔ.xɛ.nɑ | kʰɤ.xei.na | кӯҳна / куҳан
698 | تمامت | 完 | complete | 忒媽黙忒 | tʰɛ.muɑ.•.tʰɛ | tʰɤ.ma.mwo.tʰɤ | тамомат
699 | کمال | 全 | entire | 克媽勒 | •.muɑ.lɛ | kʰɤ.ma.lɤ | камол
700 | اقبال | 興 | interest | 以革巴勒 | •.kɛ.puɑ.lɛ | ji.kɤ.pa.lɤ | иқбол
701 | ادبار | 敗 | defeat; fail | 以得巴兒 | •.tɛ.puɑ.• | ji.tɤ.pa.əɹ | идбор
702 | خوش | 好 | good | 或石 | xuɛ.• | xwo.ʂɹ̩ | хуш
703 | بد | 歹 | bad | 百得 | puɛ.tɛ | pai.tɤ | бад
704 | یافتن | 得 | get | 呀夫貪 | •.piu.tʰan | ja.fu.tʰan | ёфтан
705 | نایافتن | 失 | lose | 納呀夫貪 | nɑ.•.piu.tʰan | na.ja.fu.tʰan | наёфтан
706 | کران | 重 | heavy | 革剌恩 | kɛ.•.ən | kɤ.la.ən | гарон / гирон
707 | سبک | 輕 | light | 塞卜克 | sɛ.•.• | sɤ.pu.kʰɤ | сабук
708 | فراغت | 閑 | idle; vacant | 法剌額忒 | puɑ.•.•.tʰɛ | fa.la.ɤ.tʰɤ | фароғат
709 | شتاب | 忙 | busy | 石他卜 | •.tʰɑ.• | ʂɹ̩.tʰa.pu | шитоб
710 | جیر | 善 | virtuous | □兒 | xai.• | xai.əɹ | хайр
711 | شر | 惡 | evil | 舎兒 | •.• | ʂɤ.əɹ | шар / шарр
712 | رونق | 榮 | glory | 勞納革 | lɒʊ.nɑ.kɛ | lɑu.na.kɤ | равнақ
713 | شرمنده | 辱 | disgrace | 捨兒滿得 | ʂɛ.•.muan.tɛ | ʂɤ.əɹ.man.tɤ | шарманда
714 | بالا رفتن | 升 | ascend; go up | 把剌勒夫貪 | puɑ.•.lɛ.piu.tʰan | pa.la.lɤ.fu.tʰan | боло рафтан
715 | فرو رفتن | 沉 | sink; fall | 府羅勒夫貪 | piu.luɔ.lɛ.piu.tʰan | fu.lwo.lɤ.fu.tʰan | фурӯ рафтан
716 | زشت | 醜 | ugly | 即石忒 | •.•.tʰɛ | ʨi.ʂɹ̩.tʰɤ | зишт
717 | خوب | 俊 | talented; handsome | 乎卜 | •.• | xu.pu | хуб
718 | قوی | 強 | strong | 改迂 | •.• | kai.ɥy | қавӣ
719 | ضعیف | 弱 | weak | 則額夫 | ʦɛ.•.piu | ʦɤ.ɤ.fu | заиф
720 | بها | 價 | price; value | 百哈 | puɛ.xɑ | pai.xa | баҳо
721 | پر | 滿 | full | 僕兒 | •.• | pʰu.əɹ | пур
722 | بایستن | 用 | use | 把衣思貪 | puɑ.i.sɿ.tʰan | pa.ji.sɹ̩.tʰan | боистан
723 | کفایت | 成 | accomplish; become; achievement | 起法夜忒 | kʰi.puɑ.iɛ.tʰɛ | ʨʰi.fa.je.tʰɤ | кифоят [99]
724 | بلند | 髙 | high | 百藍得 | puɛ.lan.tɛ | pai.lan.tɤ | баланд

99 Modern Tajik кифоят ‘sufficiency; contentment’.

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
725 | پستی | 低 | low | 迫思梯 | •.sɿ.tʰi | pʰwo.sɹ̩.tʰi | пастӣ
726 | راستی | 真 | true; genuine | 剌思梯 | •.sɿ.tʰi | la.sɹ̩.tʰi | ростӣ
727 | دورغ | 假 | false; fake | 朶羅額 | tuɔ.luɔ.• | two.lwo.ɤ | дурӯғ
728 | فریق | 分 | divide; division | 法里革 | puɑ.•.kɛ | fa.li.kɤ | фариқ
729 | جماعت | 聚 | assemble | 者媽額忒 | tʂɛ.muɑ.•.tʰɛ | tʂɤ.ma.ɤ.tʰɤ | ҷамоат
730 | میل | 私 | private | 買勒 | muai.lɛ | mai.lɤ | майл [100]
731 | عدل | 公 | public; fair | 阿得勒 | ɔ/ɑ.tɛ.lɛ | a.tɤ.lɤ | адл [101]
732 | کدشتن | 過 | pass | 古得石貪 | ku.tɛ.•.tʰan | ku.tɤ.ʂɹ̩.tʰan | гузаштан
733 | رنج | 傷 | wound; injury | 藍知 | lan.• | lan.tʂɹ̩ | ранҷ
734 | ورزیدن | 積 | accumulate | 我兒即丹 | ɔ.•.•.tan | wo.əɹ.ʨi.tan | варзидан
735 | مشفق | 孝 | filial (piety); mourning | 母石費革 | mu.•.•.kɛ | mu.ʂɹ̩.fei.kɤ | мушфиқ
736 | سطبر | 厚 | thick | 洗忒卜兒 | si.tʰɛ.•.• | ɕi.tʰɤ.pu.əɹ | ситабр
737 | تنک | 薄 | thin | 土奴克 | tʰu.nu.• | tʰu.nu.kʰɤ | тунук
738 | انتظار | 等 | wait | 尹體咱兒 | •.tʰi.•.• | jin.tʰi.ʦa.əɹ | интизор
739 | سبب | 因 | cause | 塞百卜 | sɛ.puɛ.• | sɤ.pai.pu | сабаб
740 | کدام | 誰 | who | 革搭恩 | kɛ.tɑ.ən | kɤ.ta.ən | кадом
741 | صدق [102] | 施 | bestow; grant; carry out | 塞得革 | sɛ.tɛ.kɛ | sɤ.tɤ.kɤ | садақа
742 | منع | 阻 | obstruct | 黙納額 | •.nɑ.• | mwo.na.ɤ | манъ
743 | رها | 放 | let go; discharge | 勒哈 | lɛ.xɑ | lɤ.xa | раҳо
744 | رسیدن | 至 | reach; arrive | 勒洗丹 | lɛ.si.tan | lɤ.ɕi.tan | расидан
745 | بدین | 此 | this | 百底尹 | puɛ.ti.• | pai.ti.jin | бадин [103]
746 | فهم | 省 | conscious; examine oneself | 法罕 | puɑ.xan | fa.xan | фаҳм
747 | نو | 新 | new | 腦 | • | nɑu | нав
748 | مجموع | 總 | put together; total | 黙知母額 | •.•.mu.• | mwo.tʂɹ̩.mu.ɤ | маҷмӯъ
749 | ولیکن | 然 | but; so | 我列欽 | ɔ.liɛ.kʰin | wo.lje.ʨʰin | валекин
750 | سلامت | 安 | safe; calm | 塞剌黙忒 | sɛ.•.•.tʰɛ | sɤ.la.mwo.tʰɤ | саломат
751 | کردانیدن | 轉 | turn | 革兒打你丹 | kɛ.•.tɑ.ni.tan | kɤ.əɹ.ta.ni.tan | гардонидан
752 | مقصود | 縁故 | reason | 黙革蘇得 | •.kɛ.su.tɛ | mwo.kɤ.su.tɤ | мақсуд
753 | جکونه | 怎生 | how | 初科納 | tʂʰu.kʰuɔ.nɑ | tʂʰu.kʰɤ.na | чӣ гуна
754 | غربده | 喧嘩 | uproar; hubbub | 額兒百得 | •.•.puɛ.tɛ | ɤ.əɹ.pai.tɤ | арбада
755 | سیاخت | 逰翫 | play; stroll | 洗呀塞忒 [104] | si.•.sɛ.tʰɛ | ɕi.ja.sɤ.tʰɤ | сиёсат / саёҳат
756 | آراستن | 齊整 | neat; in good order | 阿剌思貪 | ɔ/ɑ.•.sɿ.tʰan | a.la.sɹ̩.tʰan | оростан
757 | پاک | 乾浄 | clean | □克 | pʰuɑ.• | pʰa.kʰɤ | пок

100 Modern Tajik майл ‘inclination; liking’.
101 Modern Tajik адл ‘justice; truthfulness’.
102 This might be a misspelt صدقات whose meaning is “alms” according to Steingass (2012: 784).
103 Modern Tajik бадин ‘to, in, or with this’.
104 Judging from Beijing tushuguan guji chuban bianji zu (1987–[1994]: 514, 571), this 洗呀塞忒 may be a misspelt 洗呀黒忒, whose reconstructed Ming-period Beijing Chinese pronunciation and Standard Chinese pronunciation are /si.•.xɛ.tʰɛ/ and /ɕi.ja.xei.tʰɤ/, respectively. Both сиёсат ‘politics’ and саёҳат ‘travel’ are presented in the far-right cell of this row because the pronunciation of the former resembles the Chinese pronunciation of 洗呀塞忒 while the latter appears to be congruent with the reading of 洗呀黒忒 as well as with the spelling of the entry word.

No. | Entry word | Chinese translation | English gloss | Chinese-script transcription | Reconstructed Ming-period Beijing Chinese | Standard Chinese | Tajik
758 | پدرفتار | 舉保 | ? [105] | 迫得勒夫他兒 | •.tɛ.lɛ.piu.tʰɑ.• | pʰwo.tɤ.lɤ.fu.tʰa.əɹ | пазируфтор [106]
759 | بیکار | 無用 | useless | 別噶兒 | piɛ.•.• | pje.ka.əɹ | бекор
760 | خودمراد | 自由 | free | 或得母剌得 | xuɛ.tɛ.mu.•.tɛ | xwo.tɤ.mu.la.tɤ | ×
761 | کوج | 遷更 | ? [107] | 科知 | kʰuɔ.• | kʰɤ.tʂɹ̩ | кӯч
762 | باک نیست | 無妨 | would do no harm | 巴克乜思忒 | puɑ.•.miɛ.sɿ.tʰɛ | pa.kʰɤ.mje.sɹ̩.tʰɤ | бок нест
763 | پاسکونه | 顛倒 | turn upside down | □思科納 | pʰuɑ.sɿ.kʰuɔ.nɑ | pʰa.sɹ̩.kʰɤ.na | божгуна / бозгуна / бошгуна [108]
764 | ترتیب | 次序 | order; sequence | 忒兒梯卜 | tʰɛ.•.tʰi.• | tʰɤ.əɹ.tʰi.pu | тартиб
765 | عادت | 慣曽 | custom | 阿得忒 | ɔ/ɑ.tɛ.tʰɛ | a.tɤ.tʰɤ | одат
766 | بنهان | 遮蔵 | hide | 潘哈恩 | pʰuan.xɑ.ən | pʰan.xa.ən | пинҳон
767 | بهانه | 推辭 | decline | 百哈納 | puɛ.xɑ.nɑ | pai.xa.na | баҳона
768 | قناعت | 守分 | know one’s place | 革納額忒 | kɛ.nɑ.•.tʰɛ | kɤ.na.ɤ.tʰɤ | қаноат
769 | آرامیدن | 重 | steady | 阿剌米丹 | ɔ/ɑ.•.mi.tan | a.la.mi.tan | оромидан / орамидан
770 | کرچه | 雖是 | although | 革兒赤 | kɛ.•.• | kɤ.əɹ.tʂʰɹ̩ | гарчи
771 | غیرت | 發志 | stimulate aspiration | 矮勒忒 | ai.lɛ.tʰɛ | ai.lɤ.tʰɤ | ғайрат
772 | هنر | 本事 | capability | 虎納兒 | xu.nɑ.• | xu.na.əɹ | ҳунар
773 | دل کمادی | 用心 | intention; take care | 的勒古媽里 | ti.lɛ.ku.muɑ.• | ti.lɤ.ku.ma.li | дилгуморӣ
774 | خدمت | 侍奉 | tend to; look after | 黒得黙忒 | xɛ.tɛ.•.tʰɛ | xei.tɤ.mwo.tʰɤ | хидмат / хизмат
775 | تقدیم | 進貢 | pay tribute to | 忒革的尹 | tʰɛ.kɛ.ti.• | tʰɤ.kɤ.ti.jin | тақдим
776 | ابرالدهر | 永遠 | forever | 阿卜鈍得黒兒 | ɔ/ɑ.•.•.tɛ.xɛ.• | a.pu.twən.tɤ.xei.əɹ | абадуддаҳр
777 | آمان | 太平 | peaceful and tranquil | 阿媽恩 | ɔ/ɑ.muɑ.ən | a.ma.ən | амон

As is evident from Table 1, New Persian written in Chinese script provides different (more detailed) information about New Persian vowels than is obtainable from Arabic-script sources. Thanks to this information, we can see, for example, that Timurid Persian, unlike present-day Iranian Persian (see Miller 2012: 167), contrasts the two vowels corresponding with present-day Tajik /i/ and /e/, whose orthographic representations are ‹и›/‹ӣ› and ‹е›, respectively (Table 2).109

105 舉保 may mean ‘recommend’, judging from the meanings of the two glyphs that it comprises.
106 Modern Tajik пазируфтор means ‘accepter; surety’.
107 遷更 may mean ‘migrate’, judging from the meaning of the two glyphs that it comprises and the meaning of Modern Tajik кӯч ‘migration’, which, incidentally, is a Turkic loanword.
108 These would be spelled بازگونه, باژگونه, and باشگونه in the Perso-Arabic writing system.
109 Note that this correspondence is not absolute. For instance, نبیره (entry number 183), whose present-day Tajik cognate is набера, is transcribed as 納必勒 /nɑ.pi.lɛ/ in the Berlin Manuscript.


Table 2: A vowel contrast shared by Timurid Persian in Chinese-script transcription (/i/ with /iɛ/) and Tajik (/i/ with /e/)

No. | Entry word | Timurid Persian in Chinese-script transcription in zazi | Reconstructed reading | Tajik | Iranian Persian
4 | ستاره | 洗他勒 | si.tʰɑ.lɛ | ситора | setāre
332 | سینه | 洗納 | si.nɑ | сина | sine
46 | زمین | 則米尹 | ʦɛ.mi.• | замин | zamin
104 | زمستان | 即米思他恩 | •.mi.sɿ.tʰɑ.ən | зимистон | zemestān
102 | تابستان | 他比思他恩 | tʰɑ.pi.sɿ.tʰɑ.ən | тобистон | tābestān
275 | خسبیدن | 虎思比丹 | xu.sɿ.pi.tan | хусбидан | xosbidan
278 | طلبیدن | 忒勒比丹 | tʰɛ.lɛ.pi.tan | талбидан | talabidan
307 | بینی | 比你 | pi.ni | бинӣ | bini
692 | بسیار | 比洗呀兒 | pi.si.•.• | бисёр | besyār
556 | سیر | 㱔兒 | siɛ.• | сер | sir
654 | سه | 㱔 | siɛ | се | se
434 | میوه | 滅斡 | miɛ.• | мева | mive
575 | آمیختن | 阿滅黒貪 | ɔ/ɑ.miɛ.xɛ.tʰan | омехтан | āmixtan
350 | بیمار | 別媽兒 | piɛ.muɑ.• | бемор | bimār
428 | بید | 別得 | piɛ.tɛ | бед | bid
461 | بیخ | 別黒 | piɛ.xɛ | бех | bix
639 | بیررون | 別魯恩 | piɛ.lu.ən | берун | birun
759 | بیکار | 別噶兒 | piɛ.•.• | бекор | bikār
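The correspondence summarized in Table 2 can be checked mechanically. The following is a minimal sketch and not part of the original study: it takes a handful of rows from Table 2, plus the exception mentioned in footnote 109, and tests whether Tajik ‹и› lines up with a transcribed syllable ending in /i/ and Tajik ‹е› with one ending in /iɛ/. The function names, the row layout, and the choice of which syllable to compare are assumptions made for illustration only.

```python
# Illustrative sketch only: rows are (entry number, Tajik cognate,
# Tajik vowel under comparison, reconstructed syllable that transcribes it).
# The syllable chosen for each row is an assumption made for this example.
ROWS = [
    (332, "сина", "и", "si"),
    (307, "бинӣ", "и", "pi"),
    (46, "замин", "и", "mi"),
    (654, "се", "е", "siɛ"),
    (428, "бед", "е", "piɛ"),
    (461, "бех", "е", "piɛ"),
    (575, "омехтан", "е", "miɛ"),
    (183, "набера", "е", "pi"),  # exception noted in footnote 109
]


def expected_nucleus(tajik_vowel):
    """Nucleus predicted by the Table 2 correspondence."""
    return {"и": "i", "е": "iɛ"}[tajik_vowel]


def nucleus(syllable):
    """Classify a reconstructed syllable as ending in /iɛ/, /i/, or neither."""
    if syllable.endswith("iɛ"):
        return "iɛ"
    if syllable.endswith("i"):
        return "i"
    return None


def exceptions(rows):
    """Entry numbers whose transcription departs from the correspondence."""
    return [no for no, _, vowel, syl in rows
            if nucleus(syl) != expected_nucleus(vowel)]


print(exceptions(ROWS))  # [183]
```

On these rows only entry 183 departs from the pattern, which agrees with the caveat that the correspondence is not absolute.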

Table 1 also provides data on the nominal morphology of Timurid Persian. For instance, it reveals an apparent absence of (or vowel reduction in) the ezāfe marker in a number of entry words. Notice how خط ‘script writing’ is transcribed identically in entries 614 and 623 (Table 3).

Table 3: Presence and absence of the ezāfe marker in the Chinese-script transcription of zazi

No. | Entry word in zazi | Timurid Persian in Chinese-script transcription | Reconstructed reading | Tajik
345 | آفت | 阿法忒 | ɔ/ɑ.puɑ.tʰɛ | офат
130 | آفت خشک | 阿法梯戸石克 | ɔ/ɑ.puɑ.tʰi.xu.•.• | офати хушк
614 | خط | 黒忒 | xɛ.tʰɛ | хатт
623 | خط تخقیق | 黒忒忒黒革革 | xɛ.tʰɛ.tʰɛ.xɛ.kɛ.kɛ | хатти таҳқиқ
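The comparison made in Table 3 amounts to asking whether the transcription of a head noun survives unchanged at the start of the transcription of the phrase it heads. A minimal sketch of that test follows; it is an illustrative heuristic under that assumption, not a procedure described in the source, and the function name is hypothetical.

```python
# Illustrative heuristic only: if the phrase's transcription begins with the
# bare head's transcription, no ezāfe is reflected; if the head's last
# character changes (e.g. 忒 to 梯), the ezāfe appears to be transcribed.
def ezafe_reflected(head, phrase):
    """Return True if the head's transcription is altered inside the phrase."""
    return not phrase.startswith(head)


# Transcriptions of entries 345/130 and 614/623 as given in Table 3.
PAIRS = {
    "офат / офати хушк": ("阿法忒", "阿法梯戸石克"),
    "хатт / хатти таҳқиқ": ("黒忒", "黒忒忒黒革革"),
}

for label, (head, phrase) in PAIRS.items():
    print(label, ezafe_reflected(head, phrase))
```

For the first pair the check returns True (the final 忒 of the head is replaced by 梯), and for the second it returns False (黒忒 is carried over unchanged).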

In addition, the Chinese-script transcription of zazi allows an estimation of the positions of Timurid Persian vowels relative to one another. This has been attempted in a recently published paper (Ido 2015: 127–128), from which Figure 1 is reproduced with little modification.

In conclusion, Table 1, in which entries in zazi are augmented by reconstructed Ming-period Beijing Chinese readings, provides a wealth of linguistic information on the variety of New Persian that was current in the Timurid court in Samarkand about 600 years ago, and hence is particularly useful in the study of the historical development of New Persian.


Figure 1: Timurid Persian vowels as they are reflected in the Chinese-script transcription system used in type 1 zazi

References

Atwood, Christopher P. 2004. Encyclopedia of Mongolia and the Mongol empire. New York: Facts on File, Inc.
Barthold, W. 1987. Khānbaliḳ. In Martijn Theodoor Houtsma, Thomas Walker Arnold, René Basset & Richard Hartmann (eds.), E. J. Brill’s first encyclopaedia of Islam, 1913–1936 (photomechanical reprint edition), 898–899. Leiden: E. J. Brill.
Beijing tushuguan guji chuban bianji zu (ed.). 1987–[1994]. Beijing tushuguan guji zhenben congkan 6: huayi yiyu [Beijing Library collection of ancient books and rare editions 6: huayi yiyu]. Beijing: Shumu Wenxian Chubanshe.
Clauson, Gerald. 1972. An etymological dictionary of pre-thirteenth-century Turkish. Oxford: Clarendon Press.
Doerfer, Gerhard. 1965. Türkische und mongolische Elemente im Neupersischen, vol. II, Türkische Elemente im Neupersischen. Wiesbaden: Steiner.
Franke, Herbert. 1966. Sino-Western contacts under the Mongol empire. Journal of the Hong Kong Branch of the Royal Asiatic Society 6. 49–72.
Haw, Stephen G. 2014. The Persian language in Yuan-dynasty China: A reappraisal. East Asian history 39. 5–32.
Honda, Minobu. 1963. ‹Kaikaikan yakugo› ni tsuite [On huihuiguan yiyu]. Hokkaidō daigaku bungakubu kiyō 11. 1–73.
Ido, Shinji. 2015. New Persian vowels transcribed in Ming China. In Matteo De Chiara & Evelin Grassi (eds.), Iranian languages and literatures of Central Asia: From the 18th century to the present, 99–136. Paris: Association pour l’Avancement des Études Iraniennes.
Jarring, Gunnar. 1964. An Eastern Turki-English dialect dictionary. Lund: C. W. K. Gleerup.
Kleeman, Julie & Harry Yu (eds.). 2010. Oxford Chinese dictionary. Oxford: Oxford University Press.
Kuroyanagi, Tetsuo. 1984. Perushiago no hanashi [An account of the Persian language]. Tokyo: Daigaku Shorin.
Lin, Yen-Hwei. 2007. The sounds of Chinese. Cambridge: Cambridge University Press.
Liu, Yingsheng. 2008. «Huihuiguan zazi» yu «huihuiguan yiyu» yanjiu [A study on two medieval Sino-Persian clossaries (sic)]. Beijing: Renmin University of China Press.
Lu, Zhiwei. 1988. Ji Xu Xiao «zhongding sima wengong dengyun tujing» [A note on Xu Xiao’s zhongding sima wengong dengyun tujing]. In Lu Zhiwei jindai hanyu yinyun lunji, 54–84. Beijing: Commercial Press.
Miller, Corey. 2012. Variation in Persian vowel systems. Orientalia Suecana 61. 156–169.
Muhammadiev, Mardon. 1975. Luġati muxtasari sinonimhoi zaboni tojikī [A concise dictionary of Tajik synonyms]. Dushanbe: Maorif.
Nagashima, Eiichirō. 1941. Kinsei shinago toku ni hoppōgo keitō ni okeru on’inshi kenkyū shiryō ni tsuite (zoku) [A review of the sources for the study of the early modern Chinese phonological history with a focus on northern varieties of Chinese (part 2)]. Gengo kenkyū 9. 17–79.
Norman, Jerry. 1998. Chinese. Cambridge: Cambridge University Press.
Odilov, Nodir. 1974. Mirovozzrenie Džalaliddina Rumi [The world-view of Jalaluddin Rumi]. Dushanbe: Izdatel’stvo «Irfon».
Paul, Ludwig. 2013. Early New Persian. Encyclopædia Iranica. http://www.iranicaonline.org/articles/persian-language-1-early-new-persian (accessed 30 March 2015).
Pulleyblank, Edwin G. 1991. Lexicon of reconstructed pronunciation in early Middle Chinese, late Middle Chinese, and early Mandarin. Vancouver: UBC Press.
Satoh, Akira. 1981. Chūko tōkōsetsu nisshōji to pekingo kōgo’on [The literary and colloquial pronunciations in the Peking dialect: With focus on the ru-sheng words of the Ancient Chinese zeng and geng rime-groups]. Yokohama kokuritsu daigaku jinbun kiyō dai 2 rui gogaku bungaku 28. 43–64.
Sayd, Mustafa Ajan. 2009. Dari-English dictionary. Hyattsville: Dunwoody Press.
Siyiguan. 2013 [1579]. Hua-i-yi-yü. Digitalisierte Sammlungen der Staatsbibliothek zu Berlin. http://resolver.staatsbibliothek-berlin.de/SBB000103AF00000000 (accessed 3 August 2015).
Steingass, Francis Joseph. 2012. A comprehensive Persian-English dictionary (Nataraj edition). Springfield: Nataraj Books.
Šukurov, Muhammad Šarifovič, Vladimir Aleksandrovič Kapranov, Rahim Hošim & Nosirjon Asadovič Ma”sumī. 1969a. Farhangi zaboni tojikī (az asri X to ibtidoi asri XX) I [A dictionary of the Tajik language (from the tenth century to the beginning of the twentieth century) I]. Moscow: Našriëti «Sovetskaja Ènciklopedija».
Šukurov, Muhammad Šarifovič, Vladimir Aleksandrovič Kapranov, Rahim Hošim & Nosirjon Asadovič Ma”sumī. 1969b. Farhangi zaboni tojikī (az asri X to ibtidoi asri XX) II [A dictionary of the Tajik language (from the tenth century to the beginning of the twentieth century) II]. Moscow: Našriëti «Sovetskaja Ènciklopedija».
Tōdō, Akiyasu. 1957. Chūgokugo on’inron [Chinese phonology]. Tokyo: Kōnan Shoin.
Tōdō, Akiyasu & Yoshimitsu Kanō (eds.). 2005. Gakken shin kanwa daijiten [Gakken great dictionary of Sino-Japanese]. Tokyo: Gakushū Kenkyūsha.
Utas, Bo. 2006. A multiethnic origin of New Persian? In Lars Johanson & Christiane Bulut (eds.), Turkic-Iranian contact areas: Historical and linguistic aspects, 241–251. Wiesbaden: Harrassowitz Verlag.
Ye, Baokui. 2001. Mingqing guanhua yinxi [Ming-Qing Mandarin phonology]. Xiamen: Xiamen University Press.

Adriano V. Rossi

3 Glimpses of Balochi lexicography: Some iconyms for the landscape and their motivation

Abstract: The speakers of any language contribute, even if only to a small extent, to changing the lexicon that they have inherited as a whole. They are driven to do so by the need to name something new or to optimize the onomasiological salience of already existing words, so that the way they express concepts changes continuously. In order to avoid overloading the memory system, they are encouraged to recycle what already exists in the lexicon. Through a small set of associative strategies, people relate a concept that has already been verbalized to another one that has yet to be verbalized, producing lexical changes. Over time, however, the conceptual motivation that originated a particular designation becomes obscure to speakers. Large-scale lexical surveys help us to discover recurrent schemas of designating a concept and to recover the relevant motivation for each designation, i.e. its ‘iconym’ (the English term iconym has been used, e.g., by Joachim Grzega in his contributions to Onomasiology Online). In the general framework of cognitive onomasiology, I have been carrying out since the 1990s (at L’Orientale University, Naples) a project aimed at singling out the different ‘pathways’ through which natural physical concepts have been designated in the Iranian languages, in order to gain insight into the way Iranian-speaking peoples have perceived and conceptualized the physical environment that they have helped to change through their millennia-long activities. There are several types of associative relations on which lexical innovation relies; one of these is similarity. The best-known process based on similarity is metaphor, a process through which we speak of one concept in terms of another, and whose main lines are similarity of shape, similarity of spatial configuration, functional similarity, etc. Since human beings perceive their bodies as an interface between themselves and the surrounding world, the body-part lexicon overlaps at many points with those of other conceptual domains; first of all, with the lexicon used to describe the environment.

Adriano V. Rossi, University L’Orientale DOI 10.1515/9783110455793-004


Metaphorical mappings involving human (or animal) body parts as a source, and elements of the landscape as a target, are commonly found in most Iranian languages. The object of this paper is a selection of Balochi terms for parts of the human body, variously related to terms used to describe the landscape, which are studied from an etymological and areal perspective.

Keywords: linguistics, lexicography, body part terms, Iranian linguistics, Balochi

1 Generalities

The speakers of any language can, at any time, contribute to changes (however minor) in the lexicon they have inherited. They are driven to do so by the need to name something new or to optimize the onomasiological salience of existing words. In order to avoid overloading the memory system, they are encouraged to recycle existing words in the lexicon. Through a small set of associative strategies, people relate a concept that has already been verbalized to another one that has yet to be verbalized, producing lexical changes. Over time, however, the conceptual motivation that originated a particular designation becomes obscure to speakers. Large-scale lexical surveys aid us in discovering recurrent schemas of designating a concept and recovering the relevant motivation for each designation, i.e., its ‘iconym’.1 In the general framework of cognitive onomasiology, we have outlined with our research team at L’Orientale University2 a project aimed at singling out the different ‘pathways’ through which natural physical concepts have been designated in the Iranian languages, in order to gain insight into the way Iranian-speaking peoples – and particularly Balochi-speaking peoples – have perceived and conceptualized the physical environment that they have helped to change through their millennia-long activities. This research has been carried out since the 1990s within the frame of the Ethnolinguistics of the Iranian area project and the Comparative etymological Balochi dictionary project, both of which I direct and which are funded by the Italian Ministry of Education at L’Orientale University, Naples.3

To accomplish this work, many years ago we began gathering the relevant lexicon in the Iranian languages, using as sources mostly dictionaries and glossaries and, for a few languages, mostly Western Iranian (including Balochi), information provided by native speakers. The corpus produced so far contains several thousand words of remarkable interest, many hundreds of which refer to different dialects of Balochi.

1 Cf. Filippone (2006, 365). The English term iconym, first introduced by Alinei (1997), has been used in subsequent years, particularly by Joachim Grzega in his contributions to Onomasiology Online. Alinei’s original definition is as follows:
[B]ecause of the importance of the role of motivation in the genesis of words, I have recently proposed calling it iconym (from icon + ‘name’, with the derivations “iconymy”, “iconymic”, and “iconomastic”), in order to avoid using the much too ambiguous and generic term “motivation” (Alinei 1997c). Only a few linguists, unfortunately, have recently discussed some theoretical aspects of iconymy (e.g., Lakoff and Johnson (1980), Lakoff (1987) and Sanga (1997)). [. . .] This is exactly what iconyms do, by “representing”, as it were, whole concepts. Any new concept that in the process of social communication has become standardized, can thus be collapsed, by means of iconyms, into a new word, allowing us to enrich our knowledge, without changing our abstract, mental categories (Alinei 2003: 108‒109).
2 This research is carried out within the frame of the Ethnolinguistics of the Iranian area Project (no. 9710425417), also drawing on lexical material from the Archives of the Comparative Etymological Balochi Dictionary Project (no. MM10422399, hereinafter referred to as Archive), both directed by myself and funded by the Italian Ministry of Education at L’Orientale University, Naples.

1.1 Metaphorical mappings involving human/animal body parts as a source in Balochi

There are several types of associative relations on which lexical innovation relies; one of these is similarity. The best-known process based on similarity is that of metaphor, a process through which we speak of one concept in terms of another. Since human beings perceive their bodies as an interface between themselves and the surrounding world, the body-part lexicon overlaps at many points with those of other conceptual domains; first of all, with the lexicon used to describe the environment. Metaphorical mappings involving human (or animal) body parts as a source, and elements of the landscape as a target, are commonly found in most Iranian languages.4

3 Two previous studies conducted within this framework are Filippone (2006, 2010), to which the reader is referred. Since the introduction of this methodology in Iranian studies originates from joint research by Prof. Filippone and myself, practically every concept hinted at there (and in many other places) stems from a shared vision (even if not explicitly stated). It is consequently a pleasure for me to state here how much I am indebted to Prof. Filippone for her invaluable support in our common research (and in my life). Special thanks are also due to my former pupil, Dr. Gerardo Barbera, for important unpublished materials from the Bashkardi area.
4 This terminology is according to Lakoff (1987, 276). Conceptual Metaphor Theory, sometimes called Cognitive Metaphor Theory, was developed by researchers within the field of cognitive linguistics. Recent developments within this field are treated by Kövecses (2002, 2005) and Evans and Green (2006).

The body-part lexicon overlaps in many points with those of other conceptual domains. First of all, there is the lexicon used to describe the environment. Metaphorical mappings involving human (or animal) body parts as a source, and elements of the landscape as a target, are commonly found in most languages, including Balochi. In the framework of modern onomasiology, which operates in the light of cognitive linguistics, I concentrated on the “pathways” through which different concepts for parts of the landscape have been verbalized, going back (when possible) to the respective source concepts. This article will describe a small selection of Balochi terms for parts of the human body, variously related to terms used to describe the landscape, which will be analyzed from an etymological and areal perspective. The most common Iranian terms having similar usage, such as sar, pād, nyām, are not included in this article since they have been at least hinted at – even if frequently in a simplistic way5 – in the iconomastic studies in Iranological literature; a few relatively marginal terms – most of which are unknown even to scholars working in (Indo-)Iranian dialectology – will be briefly treated in order to give an idea of the methodological approach of the research.

2 Iconym: “parts of the body indicating the same relative position in the body as that of a single locational feature in a salient object of the landscape”

(1)

Bal. barbūnz ‘hillock’ [= sunṭ] (Mitha Khan Marri and Surat Khan 1970: s.v.) ♦ EastBal. barbūnz ‘peak, summit’ (Ahmedzai n.d.: s.v.), cf. Psht. wərb’uz, Wan. warbīz ‘muzzle, snout; spur of a hill’ (according to Morgenstierne [2003]: s.v. Psht. wərb’uz < *fra- + poza- ‘nose’), Prs. bar-pōz ‘the parts around the mouth’, Bal. būz, būnz ‘the animal’s pointed mouse’, with the same composition pattern as bar-dast ‘shoulder-blade’, bar-čānk ‘hand, fist, hilt of sword’, bar-gaṛ ‘hole, pit’, etc., either inner-Balochi, or Pre-Balochi. Razzaq, Buksh, and Farrell (2001: s.v.) consider sunṭ (q.v.) a synonym of barbūnz (East Balochi, from Mitha Khan Marri and Surat Khan 1970).

5 Surely not in the case of Wilhelm Eilers, who was an outstanding pioneer in this field of Iranology (cf. his research on the subject in the bibliographies contained in Eilers [1987, 1988]).

Glimpses of Balochi lexicography


Here the pathway seems to go from THE LOWER TERMINAL PART (or perhaps THE POINTED MOUSE) of an animal head to THE TOP OF A MOUNTAIN, if the projected function of the pointed mouse is perceived as a SPUR (the SPUR metaphor is widespread in the mountain lexicon of many different languages, independently of the iconymical history of the term spur in each language). The origin of the metaphor could be located outside Balochi if one accepts Morgenstierne’s (2003: s.v.) suggestion of a generalization of the iconym from THE PARTS AROUND THE MOUTH to WHAT IS NEAR A DOOR, MOUTH at least in Pashto (where some dictionaries give for wərb’uz ‘slave guarding a door’, ‘land in front or surrounding a gate’).

(2)

Bal. čūṭī ‘hair, down on the head of a baby’ (Mitha Khan Marri and Surat Khan 1970: s.v.); Bal. čūṭī ‘summit, peak (of a hill or mountain)’. Razzaq, Buksh, and Farrell (2001: s.v.) give as synonym sunṭ, ṭul, ‘peak of a mountain’ (Mayer 1909, s.v.); Br. čoṭī ‘top-knot, tuft. Crest, summit’ (thus Bray 1934: s.v.), but are all meanings really documented? Cf. Urdu čōṭī ‘a lock of hair left on the top of the head; crest of a bird; top; peak of a mountain etc.’; Si. čōṭī ‘peak of mountain’, ‘crest’, Sir. čoṭī ‘peak’ in Turner (1966: 266, no. 4883).

Four different bases are postulated by Turner (1966: 266, no. 4883) for this lexical family (possibly < Dravidian), but in any case Hindi/Urdu, Panjabi, Siraiki, Sindhi are rather homogeneous in preserving the vocalism -o- and meanings ranging from ‘topknot, crest’ to ‘top (of a tree)’, ‘peak’, etc.; Mayrhofer (1956, 3: 396) < Dravidian; Mayrhofer (1992, 1: 546) notes: “Nicht klar”, but remarks that in case of Indo-Iranian origin coḍa- ‘curl’ (epic +) should be primary as contrasted with cūḍa- ‘bulge on a brick’ (Śatapatha-Brāhmaṇa +) (“c° nur vor urspr. Diphthong lautgesetzlich, cū́ḍa- usw. ‘mit ū für o’, AiGr I2 Nachtr 14”). Here the pathway seems to go from ANY SALIENT/PROTRUDING FEATURE ON/FROM THE HEAD OF A HUMAN/ANIMAL to THE TOP SECTION OF A MOUNTAIN. In consideration of Psht. čoṭi ‘uncombed, disheveled’, it is reasonable to assume an Indo-Iranian pressure in the iconymic process of the Indo-Iranian frontier languages, having originated in an area in which the focus was on HUMAN HEAD, a metonymic process toward the mountain lexicon.

(3)

Bal. dīmag, dōmag CoBal. dūmmag ‘čammē ēkirr-o-akirr kuṭṭ itagēn hadd’ (Hashmi Baloch 2000: s.v., with the following example: pōnzē dūmmag = pōnzē piḍḍ ‘nasal septum’); IrBal. dumbag ‘tail’, also (politely) ‘bottom’, according to an Archive informant from Iranshahr, but note that Bal. dumbag only means ‘tail fat (of sheep)’ (thus correctly Elfenbein [1990: s.v.] and Hashmi Baloch
[2000: s.v.]); cf. Bal. dīm ‘back, hinder part’ (Dames 1891: s.v.); Mayer (1909: s.v.); and one Archive informant from Sibi (cf. Filippone 1996: 307); possible etymological connections of Bal. dīm with the group of Bal. dumb, homogeneously recorded as ‘tail’, are treated in Filippone (1996: 307‒308). Geographical meanings: ‘high place, ascent’ (Mitha Khan Marri and Surat Khan 1970: s.v.), also caṛhāī, buṛzaγ (Razzaq, Buksh, and Farrell 2001: s.v.: no English meaning but glossed as Bal. burzag, Urdu caṛhāī) – Archive informants: Turbat-1 dōmmag perhaps ‘foot of mountain’; Turbat-2 dūmmag ‘ridge of gwāš’ (foot, middle of a hill); ‘that part of gwāš having a šep (slope) at both sides’; Bālgitar/ Turbat dūmmag ‘mountain peak running to the plain’. Cf. Larestani domaga ‘starting part of a valley’. Eilers (1988: 291‒292) remarks that Dames (1913: 651a, 654, 657) connects the ethnonym Dōmkī to the toponym Dōmbak in Iranian Balochistan (with difficulties in explaining ō as against Bal. dum). In Balochi Race, Dames (1904: 54) connected the same ethnonym Dom(b)kī with the river Dumbak. If these ethnonyms/ toponyms have original short vowels, they might be connected with ‘tail’; in the place of settlement of the Dombkī, nothing contrasts the association ‘tail’ with ‘slope’. Also in the Pamir toponyms referring to mountain slopes containing dum ‘tail’ (dumzoj etc.) are known; cf. Junker (1930: 77–78, 96, 121). Here the pathway follows the common experience according to which if the mountain is conceived as a human/animal body, its CAUDAL SECTION is what lies at the foot of the mountain, i.e., its PIEDMONT SLOPE . The origin of the iconymic process may be pan-Iranian (cf. Larestani domaga ‘starting part of a valley’, in which ‘starting’ points to its lower layer), since Yaghnobi dumzoj quoted above is confirmed by Xromov (1975: 33 s.v.), but it seems isolated in East Iranian (all the remaining dum-toponyms quoted by Xromov are Tajik); cf. 
Ossetic dymæg/dumæg in the translated meaning of ‘kraj’, ‘konec’ as stated in Abaev (1958: s.v.).

(4) Bal. kaš(š), recorded as ‘armpit’ in Mayer (1909: s.v.) and Dames (1891: s.v.), mainly refers to the ‘side of the body’ or to the ‘lateral area just under the ribs’, as in Barker and Mengal (1969: s.v.), Elfenbein (1990: s.v.), Razzaq, Buksh, and Farrell (2001: s.v.). Notwithstanding its lexicographical attestations, it seems to be unknown among East Balochi speakers (Filippone [1996: 311 and n. 80]; Archive). Bal. kaš in the sense of ‘beside’ enters the series of locatives that prototypically refer to the human body sides and the area adjacent to them, i.e., pahnadā/bagalā/kašā. While pahnadā/bagalā are found almost everywhere in Balochi,
kašā belongs only to the Southern Balochi lexicon (perhaps also accepted by Western Balochi speakers, even if not actively used by them; cf. Filippone [1996: 190]). Bal. kaš(š) belongs to the lexical family of Av. kaša- ‘armpit’, and is commonly considered of Indo-Iranian origin, cf. Sanskrit (Atharva Veda-Saṁhitā) kákṣa- ‘armpit’ (Mayrhofer [1992: 288], Middle Iranian cognates in Bailey [1979: 56b]). In Balochi it is probably a loanword < Persian, as already stated by Geiger (1891: 453, no. 130). In view of RaBal. kač(č) ‘thigh’ (Rzehak and Naruyi 2007: s.v.); Bshk. kač ‘gluteus muscle’, kačak kert ‘to embrace’ (syn. baγal kert) (Barbera n.d.: s.v.), Sist. kač ‘thigh’, and its geographical projections: kačč ‘bank of river’ (Elfenbein 1990: s.v.); EastBal. kaččh ‘a piece of flat alluvial ground near the bank of a torrent below the rocks’ (Dames 1907: 120); EastBal. kaččh ‘cultivated land by the side of the river; an island’ (Mitha Khan Marri and Surat Khan 1970: s.v.); RaBal. kič ‘small pieces of land near the bank of a stream’ (Ahmedzai n.d.: s.v., = Brahui according to Ahmedzai, not in Bray [1934: s.v.]); IrBal. kačč ‘meadow’ (Archive informant from Iranshahr), a (rather old) connection between the kačč and kašš families is highly probable, and it is also possible that it originates in the epic Sanskrit variant kacchā- of classic Skt. kakṣā- ‘girdle’. Skt. (MBh) kaccha- ‘bank, shore, marshy ground’ is continued in Pa. kaccha- ‘marshy land’, Pkt. kaccha- ‘bank, flooded forest’. New Indo-Aryan (Turner 1966, no. 2618) has Si. kaco ‘low alluvial lands lying below a bank or hill or lately thrown up by river’; Sir. kachhā ‘land subject to inundation; alluvial low-lying land where tamarisk grows’, kachhī ‘the alluvial valley of the Indus’, and other derivatives with similar meanings. Eilers (1988: 297, 368 n.
226) had already remarked that in the Balochi area, a series of geographical names containing kač (e.g., Bābarkač, Kačhī, Nīlīkač, Rūdīān Kač, Kacha Dāman) refer to ‘depressions, lowlands’. Since Dzadr. kackay ‘terrain se trouvant à proximité d’un cours d’eau’ (Septfonds [1994, s.v.]) and Wan. kucaṇá ‘armpit’ (according to Morgenstierne [1930: 168; 2003, s.v.] < Khetrani kucəṇī́ ‘armpit’) contrast with Psht. kš́e, Wan. če ‘in’, a specialization of kač for the geographical meaning and kaš for the body/locative lexicon may have arisen in the Balochi area from an older distribution in which the IA kač-outcomes spreading westwards along the Ocean coast superseded the Ir. kaš-outcomes. The base denotes the SIDE OF A BODY, and in three-dimensional objects it refers to the two lateral surfaces (if an intrinsic axis is perceived) or to all the vertical axes (if there is none); in two-dimensional objects it generically indicates a relationship of proximity, with an emphasis on the localization of the object in the area ‘(partially) encircled’ by the ground.


3 Iconym: “parts of the body presenting functional similarity/similar shape to geographical features”

(5)

Bal. kump ‘hunchbacked/hump’ (Hashmi Baloch 2000: s.v.); Ahmedzai (n.d.: s.v.) kub ‘hunchbacked’; kubbī (a) ‘bent’; (b) ‘crookedness’; kubbō wang ‘person with a bent back’; Brahui kōmp ‘hump’ (according to Elfenbein 1983: 199 < Bal.); Bal. kumbīγ ‘truffle’ (type POT according to Morgenstierne [1973: 18], but probably type HUMP according to Rossi 2016: 217); Psht. kūp ‘crooked, bent in the back’; kūpaey, kūbaey ‘a hunchback’; kwab ‘hump’ – Indian words like Panǰ. kubb, Hind. kub ‘hump’, Panǰ., Sir. kubbā ‘hump-backed’ have influenced the Pashto forms according to Morgenstierne (1927: s.v.), while most recently Morgenstierne (2003: s.v.) separates the Iranian family of Psht. kwab (< Ir. *kaupa-) from IA *kubba- of Turner (1966, no. 3301); Parachi kūmbū ‘shoulder’ (as protruding from the body?) may (or may not) be connected. Geographically, we have Bal. kump ‘hillock’, also toponym indicating hillocks: kōp in dokop, gwarkop (from Makran Gazetteer) ‘place-names’ (Morgenstierne 1946‒1948: 289) – probably some unenlarged form of kōpag ‘shoulder’, according to Morgenstierne – toponyms from Makran: Sarbāz kopk [=/kōpk/?], Sarāvān Kupag [=/kūpag/] (Spooner 1971: 527), with the following annotation: “Names of Baluchi origin – or at least fully Baluchized. These are almost exclusively names of natural features, e.g. rivers, streams, rocks, mountains [. . .], and small areas. These can be seen to suggest the toponymy of a pastoral, nomadic people”).

Apparently Central-Iranian dialects have only the geographical metonymical projection:

(5) a. type g/qomb: Naini gom, gomb, gombu ‘hillock, heap of earth’; Behdinani qomb ‘clay vessel’ (for which see Rossi [2016]) and ‘raised ground’; Khunsari qombeli, gombeli ‘relief’; qombela ‘prominent, raised’;

b. type kope: Judeo-Isfahani kope ‘heap’; Judeo-Pers., Yazd koppo ‘heap’; Khunsari kopa ‘heap’; keppeli ‘prominent, raised’; Kermanshahi qopa ‘prominent, raised’, qomboli ‘prominent, raised’; Kurmanji qov ‘hump’; Sorani qubke (1) ‘protruding’, ‘dome’ [= kubk]; (2) ‘top of mountain’;

c. Fārs, Lori and Southeastern coast: Bandari gambel ‘hill’; Bakhtiari gomboluk ‘prominent, raised’; Davani kombor ‘peak, stone relief (hill or mountain)’; Jiroft-Kahnuji kombar ‘earth hill’.

Here the pathway proceeds from the conception of a prototypically BENT HUMAN SPINE to ANY FEATURE OF THE LANDSCAPE BEING BENT AND THEREFORE STICKING OUT FROM THE HORIZON LINE; this is based on the assumption that the prototypical human body is conceived as lying on the horizon line.

(6) Bal. mōl Psht. mowl ‘hump’ (Hanley 1981: s.v.). Cf. also Bakht. mol ‘hump of an oxen’; ‘round hill’, Kurdish Sor. milik ‘hunch of camel’, etc.; Kurd. Sul. mol ‘piled, heaped’; Naini koo-mol-kaǰa ‘mountain with crooked neck’; mol ‘neck’; Shushtari mol ‘hump’; Yazdi mol ‘neck’, the same in all Fârs dialects, etc. Apparently a different word is Bal. mōl ‘a particular Balochi fashion of binding turbans’ (Mitha Khan Marri and Surat Khan 1970: s.v.); mōl ‘a corner of a turban used to cover the face’ (Elfenbein 1990: s.v.); mōl ‘particular way of binding turbans’ (Razzaq, Buksh, and Farrell 2001: s.v.); cf. Psht. mōl ‘tip of a turban’; ‘way of wrapping a turban so it covers one’s ears’ (Pashtoon 2009: s.v.), ‘pan du turban avec lequel on se cache le nez et la bouche’ (Kabir and Akbar 1999: s.v.); Br. mōl ‘muffling of chin and ears against the cold’; Sir. mōl ‘a pad placed on the top of the head for carrying weights’; Si. mōṛu ‘cock’s comb’ both < Skt. mukuṭa- (also mauli-) ‘tiara, crest’, according to Turner (1966: no. 10144, Skt. < Drav.), “wohl drav”. Mayrhofer (2001: s.v.); cf. Burrow and Emeneau (1984: 437, no. 4888). Geographically, Bal. mōl ‘round hill’ Mayer (1909: s.v.), cf. Mol ‘a place’ [location uncertain; The Farhang-e joɣrāfiāi-ye Irān (Teheran, 1330, 7: 225) lists a place called Mol near Lar (a city often mentioned in the Balochi epics)] (Barker and Mengal 1969: 270); Molā name of a famous pass, and the Western Iranian words such as mil(e) in all Kurdish and Luri dialects for ‘pass’, ‘hillock’, cf. Mokri (1997: 8‒10). WIr. mōl /mīl < Ir. *mr ̥du-, SWIr. form ~ Av. mәrәzu- ‘vertebra’; cf. Christensen, Barr, and Henning (1939: 338); Eilers (1987: 14‒16, 1988: 371); Mayrhofer (1996: 334, with further literature). Notwithstanding the many difficulties raised by the proposal of Bailey (1979: 337b) to trace to the same Av. base also Skt. 
malhá- ‘mit Auswüchsen am Hals versehen’ (on which see Mayrhofer [1996: 334]), there may be some connection and/or semantic influence between the Iranian and the Indo-Aryan lexical families. If the bases collected here are really connected, the Balochi geographical usages/denominations would prove the antiquity of the metonymy from the HUMP type in Iranian. The pressure from the Indo-Aryan PAD type seems not to have produced geographical metaphors in Indo-Aryan. MPrs. kōf ‘mountain’, Av. kaofa- ‘hump (only in compounds); mountain’ is the Iranian parallel par excellence; see all New Iranian continuants of the HUMP type in Hasandust (2011: 343).


(7a) Bal. pūnz, pōnz RaBal. pōz ‘nose’; ‘protruding part of mountain or anything’ (Hashmi Baloch 2000: s.v.); pōz ‘nose’ (Elfenbein 1990: s.v.), pōnz ‘nose’ (Gilbertson 1925: s.v.). (7b) Bal. pūnzīg ‘heel’ (Elfenbein 1990: s.v.); RaBal. pū̆ nzuk ‘heel’ (Elfenbein 1990: s.v.); pō(n)zag ‘protruding part’ (Hashmi Baloch 2000: s.v.); also RaBal. pūnz ‘heel’ (Elfenbein 1990: s.v.); Co. pīnz, EHBal. pīz, pīδ; CoBal. pīnz ‘heel’ (Razzaq, Buksh, and Farrell 2001: s.v.); pēnz, phēnz, also pūnz ‘heel’ (Mitha and Surat 1970: s.v.). Archive informants: pūnzīk ‘heel’ (Kharan), pūnzuk (Panjgur-1), pū(n)z ‘heel’ (Turbat-1, Turbat-2, Karachi-1, Karachi-3, Karachi-4 [pīnz], Dashtiari [pīnz]), cf. CoBal. pādē pīnz ‘heel’ (Karachi-1, Oman), RaBal. pādpūnz ‘heel’ (Kharan-1, Kharan -2), RaBal. pādē būnz ‘heel’; pādpūnz ‘heel’ (Ahmedzai n.d.: s.v.). See Br. būz ‘snout, muzzle; kiss (vulgar); skirt of a hill’ (Rossi 1979: 122, no. F22); note Balochi forms with b° and the following Western Iranian ones: Prs. pā-bus ‘heel’; Az. boz ‘heel’; Gil. buz, buzi, pā-buz ‘heel’; Khor. buzak ‘bone of a horse leg’. Geographical meanings: RaBal. pūnz ‘boulder, rock’ (Elfenbein 1990, s.v.); pōz ‘protruding part of mountain or anything’ (Hashmi Baloch 2000: s.v.); RaBal. kōhe pōzag; syn. sunṭ (Hashmi Baloch 2000: s.v.); CoBal. pūzak (toponym) ‘crest of a mountain of the Makran Range south of Nikshahr’ (Pozdena 1978: 78). A Turbat informant (Archive) knows the geographical usage of kōhē pūnz, but does not know the exact meaning. According to the common opinion, pūnz, pōnz (palatalized in pīnz, pēnz in Coastal Balochi and Eastern Balochi) ‘heel’ – and their derivatives in -ag, -uk, -īk, -īg – are original Balochi developments (cf. Geiger [1890: 142, no. 306], with doubts of Morgenstierne [1927: 57, 1932: 49, 2003: 63], Benveniste [1955: 300]), while Bal. pōz, pūnz ‘nose’ is a borrowing < Prs. pōz ‘snout, beak’, also ‘mouth area’, with secondary nasalization (thus, e.g., Korn 2005: 216, cf. 
Korn 2005: 203; Geiger 1890: 142, no. 310). In any case, it seems hardly tenable (because of its isolation in Iranian, phonetic grounds, and semantic reasons) connecting *pauk- KISS (documented only by Khotanese) and Prs. pōz ‘snout, beak’ as assumed by Bailey (1979: 250b); cf. also Korn (2005: 203 n. 139), or hypothesizing *faź-, *fāź-, *fauź-: fuź- / *pauź- : puź- or *fi ̯auź- / *pi ̯auź- LOWER PART OF FACE to explain Prs. pōz ‘snout, beak’, as assumed by Rastorgueva and Èdel'man (2007: 49‒51); a connection of Prs. pōz, Bal. pō/ūnz ‘nose’, and Bal. pūnz ‘heel’ remains possible in view of their prototypical shape/function (as assumed here). The series of labels for ‘a broad surface of the body, the front, face and breast or the back’, already pointed out by Bailey (1967: 179‒180) and summarily treated by Rossi (1998: 407‒409), possibly specialized in some Iranian languages as a protruding body part, seems to


belong to a network of terms characterized by the amplitude of the attested forms, many of which open to geographical transfers. To the Balochi geographical metaphors one could add Kurm. poz ‘cape; headland’; Sarvestani puze ‘spur of mountain’; Larestani pūza ‘spur of mountain’; Psht. poza, Waz. pēza, Wan. pīza ‘nose’; ‘peak of a mountain’ (Morgenstierne 2003: s.v.). Here the pathway proceeds from ANY PROTRUDING PART OF THE HUMAN / ANIMAL BODY towards ANY FEATURE OF THE MOUNTAIN LANDSCAPE APPEARING AS PROTRUDING from the massif. If one arranges in a scale of protrusion Bal. mōl, pūnz, sunṭ as projected onto the landscape, the coefficient of roundness decreases and that of pointedness increases. (8) Bal. sunṭ, suṭ ‘beak’, ‘sting’ and ‘chin’ in Hashmi Baloch (2000: s.v.); RaBal. sunṭ (a) ‘trunk’, (b) ‘beak’ (Rzehak and Naruyi 2007: s.v.); MwBal. sunṭ ‘beak’ (Elfenbein 1963: s.v.), PrsBal. sunṭ (Spooner 1967: 68) ‘beak, bill’; sunṭ ‘beak, bill; sting (of a mosquito)’ (Elfenbein 1990: s.v.), also sunt; sunṭī ‘beaked, stinger’, sunṭīg ‘a fierce mosquito’ in S. W. Makrān (Sarawani); Mirjave sūnṭ ‘beak’ (‘animal’s mouth’) (Coletti 1981: s.v.); sunṭ ‘beak’ (Barker and Mengal 1969: 30); suṭ, suṭh, sunṭ ‘elephant’s trunk; snout; bank’; ‘hillock’ (s.v. barbūnz) (Mitha Khan Marri and Surat Khan 1972: s.v.); sunṭ ‘peak, summit; beak, bill; the trunk of elephant’ (Ahmedzai n.d.: s.v.); EastBal. sut ‘spur of a mountain’ (Gilbertson 1925: s.v.), EastBal. sut ‘spur of a mountain running down into a plain’ (Dames 1891: s.v.); sut ‘spur of mountain run to plain’ (Mayer 1909: s.v.); sunṭ ‘peak; summit’ (Razzaq, Buksh, and Farrell 2001: s.v.; give syn. ṭul); PrsBal. sunt ‘Bergsporn’ (Pozdena 1978: 78), in toponyms: Širuksunt ‘Bergsporn des Chahbahar Plateaus in Tiskupan); Bal. sunṭ ‘bottom of a hill sloping into a beak’ (Ata 1968: 142). 
Archive informants: sunṭ ‘lip’ (= lunṭ) (Iranshahr-1), ‘upper lip’ (Oman); ‘mouth and chin’ (Turbat-2); ‘chin’ (Karachi-1, also knows zanūk), Karachi-4 (=zanīk) ‘chin’ (the same as zanuk, but mostly referring to birds, considered impolite with reference to human beings) (Turbat-1); ‘mouth area’ (Dashtiari), ‘chin’ [Panjgur-1]; IrBal sonṭṭ ‘lip’; Turbat-2: sunṭ ‘top of mountain if not rounded’ (in this case it would be sar, ṭul). Compare Br. sunṭ ‘beak, muzzle; projecting corner; bottom of a hill sloping into a beak’ (Bray 1934: s.v.; Rossi 1979: 49, no. A357); cf. Sir. sunḍ, Si. sūnḍḍhi, Skt. śuṇḍā ‘trunk, proboscis’; Sist. sont ‘muzzle’ (with reference to human beings only when distorted). Mayrhofer (1996: 426) hesitates in attributing Skt. śuṇḑā- ‘trunk, proboscis’ to a common Indo-Iranian base. Tremblay (2005: 426) assumes a base *sundika‘fauces’ (> Khot. ṣuṃca- ‘beak’, Waxi šεndik (Lorimer 1902: s.v.), šənḍg ‘gums of mouth’ (Steblin-Kamenskij 1999: s.v.); NPers. šand ‘beak’, Ved. śuṇḑa- ‘tusk’) and adds (Tremblay 2005: 426 n. 28):


The following facts militate against a direct borrowing of the Sakan word from Indian: 1. The meanings diverge; 2. The word for ‘beak’ is attested in Persian; 3. It is enlarged by an -ika-suffix in Khotanese and Waxi; 4. The Khotanese word has ṣ, not ś. If the Iranian word were a borrowing from Indian, it must be a very early one. The Indian lexeme was later borrowed in Sogdian B šnth ‘trunk’, and through Dardic (Khowar šūn, Tir. šuṇḑ ‘lip’), in Šughni šand < *šundā, Parachi Pashto šū̆ ṇḑ ‘lip’.

As in the case of pūnz above, the pathway here also proceeds from ANY PROTRUDING PART OF THE HUMAN/ANIMAL BODY toward ANY FEATURE OF THE MOUNTAIN LANDSCAPE APPEARING AS PROTRUDING from the massif. The notion of CONNECTIVITY (from one part to another of the mountain slope as from one part to another of the mouth/nose area in the face/snout) seems residual in some scattered Balochi evidence; in any case no other Iranian language documents both the bodily and geographical meanings in living usage (NPers. šand ‘beak’ is doubtful). While in diachronic cognitive onomasiology the main strategies that exist in a language sample for conceptualizing and verbalizing a given concept are investigated, with the aim of explaining them against a cognitive background in terms of salient perceptions, prominence, etc., the iconymic (motivational) sequence reconstructed in the few examples commented on above is, by definition, consolidated, being as it were crystallized in the name itself, just as a fossil is embedded in the surrounding matter. If its chronological span reaches some point in ancient history, we can be sure of its relative antiquity. No one would devalue the potential of this approach to the reconstruction of the cultural landscape in an area of such intensive multilingualism as that of the Indo-Iranian Frontier languages.

Abbreviations

Av.     Avestan
Az.     Azari
Bal.    Balochi (CoastalBal., IranianBal./PrsBal., EastHillBal., MarwBal., RakhshaniBal.)
Br.     Brahui
Bshk.   Bashkardi
Drav.   Dravidian
Dzadr.  Dzadrani
Gil.    Gilaki
IA      Indo-Aryan
Ir.     Iranian
Khot.   Khotanese
Kurm.   Kurmanǰi
Khor.   Khorasani
Pa.     Pali
Panǰ.   Panǰabi
Pkt.    Prakrit
Prs.    Persian
Psht.   Pashto
Si.     Sindhi
Sir.    Siraiki
Sist.   Sistani
Skt.    Sanskrit
Sor.    Sorani
Sul.    Suleimani
Wan.    Wanetsi

References Abaev, Vasilij I. 1958–1989. Istoriko-ètimologičeskij slovar´ osetinskogo jazyka. 4 vols. (Moskva) & Leningrad: Izd. AN SSSR (vol. 1), Nauka (voll. 2-4) Ahmedzai, Agha Nasir Khan. n.d. Unpublished notes for a Balochi-Brahui-Urdu-English dictionary. Rakhshani dialect (including dialectal variants), lists more than 15,000 words and phrases. A copy of the manuscript is at the Archives of the Comparative Etymological Balochi Dictionary Project (compiled in the 1970s and 1980s). On the first page of the manuscript is the following annotation: “checked by the late Mir Gul Khan Naseer 7/9/83”. Alinei, Mario. 1997. Principi di teoria motivazionale (iconimia) e di lessicologia motivazionale (iconomastica). In Lessicologia e lessicografia, Atti del XX Convegno della Società Italiana di Glottologia (Chieti-Pescara, 12‒14 Ottobre 1995). Roma: Il Calamo. 9‒36. Ata Shad (ˁAṭā Šād). 1968. Balōčināma [in Urdu]. Lahore: Markazi Urdu Board. Bailey, Harold W. 1967. Prolexis to the book of Zambasta. Cambridge: Cambridge University Press. Bailey, Harold W. 1979. Dictionary of Khotan Saka. Cambridge: Cambridge University Press. Barbera, Gerardo. n.d. Unpublished notes for a Molkigâl Baškardi-English dictionary, 1,750 words. Barker, Muhammad A. & Aqil Khan Mengal. 1969. A course in Baluchi. Vols. 1–2. Montreal: Institute of Islamic Studies, McGill University. Benveniste, Emile. 1955. Études sur quelques textes sogdiens chrétiens (I). Journal asiatique 243. 298‒335. Bray, Denis. 1934. The Brāhūī Language, part II – The Brāhūī problem; part III – etymological vocabulary. Delhi: Manager of Publications. Burrow, Thomas & Murray B. Emeneau. 1984. A Dravidian etymological dictionary. Oxford: Clarendon Press.


Christensen, Arthur, Walter Henning & Kaj Barr. 1939. Iranische Dialektaufzeichnungen aus dem Nachlass von F. C. Andreas, Abhandlungen der Gesellschaft der Wissenschaften zu Gottingen, III/11. Berlin: Weidmannsche Verlagsbuchhandlung. Coletti, Alessandro. 1981. Baluchi of Mirjave (Iran). Roma: Edizioni A.C. Dames, Mansel L. 1891. A text book of the Balochi language, consisting of miscellaneous stories, legends, poems and a Balochi-English vocabulary. Lahore: Punjab Government Press. Part IV, 1‒117. Dames, Mansel L. 1904. A historical and ethnological sketch. London: The Royal Asiatic Society. Dames, Mansel L. 1907. Popular poetry of the Baloches. Vols. 1‒2. London: The Royal Asiatic Society. Dames, Mansel L. 1913. Balōčistān. In The encyclopædia of Islam: A dictionary of the geography, ethnography and biography of the Muhammadan peoples. Vol. 2: B‒D. Leiden: E. J. Brill. 625‒640. Eilers, Wilhelm. 1987. Iranische Ortsnamenstudien. Wien: Verlag der österreichischen Akademie der Wissenschaften. Eilers, Wilhelm. 1988. Der Name Demawend. Hildesheim, Zürich & New York: Olms Verlag. Elfenbein, Josef. 1963. A vocabulary of Marw Baluchi. Naples: Istituto Universitario Orientale. Elfenbein, Josef. 1983. A Brahui supplementary vocabulary. Indo-Iranian Journal 25. 191–209. Elfenbein, Josef. 1990. An anthology of classical and modern Baluchi literature. II: Glossary. Wiesbaden: Harrassowitz. Evans, Vyvyan & Melanie Green. 2006. Cognitive linguistics: An introduction. Edinburgh: Edinburgh University Press. Farhang-e joɣrāfiāi-ye Irān. 1328–1332.10 vols. Tehran. Filippone, Ela. 1996. Spatial models and locative expressions in Balochi. Naples: Istituto Universitario Orientale. Filippone, Ela. 2006. The body and the landscape. Metaphorical strategies in the lexicon of the Iranian languages. In Antonio Panaino & Riccardo Zipoli (eds.), Proceedings of the 5th conference of the Societas Iranologica Europaea held in Ravenna, 6–11 October 2003. Vol. II. 
Classical & contemporary Iranian studies. Milano: Mimesis. 365–389. Filippone, Ela. 2010. The fingers and their names in the Iranian languages. Wien: Verlag der österreichischen Akademie der Wissenschaften. Geiger, Wilhelm. 1890. Etymologie des Balūčī. Abhandlungen der Königlichen bayerischen Akademie der Wissenschaften, I. Cl. 19/1. 105‒153. Geiger, Wilhelm. 1891. Lautlehre des Balūčī. Abhandlungen der Königlichen bayerischen Akademie der Wissenschaften 19/2. 397‒464. Gilbertson, George W. 1925. English-Balochi colloquial dictionary. Hertford: Stephen Austin & Sons. Hanley, Barbara. 1981. Concise English-Pushto dictionary. Quetta: Afghan Publishers. Hasandust, Mohammad. 2011. A comparative-thematic dictionary of the Iranian languages and dialects. Tehrān: Farhangestān-e zabān va adab-e fārsi. Junker, Heinrich F. J. 1930. Arische Forschungen. Yaghnobi-Studien I: Die Sprachgeographische Gliederung des Yaghnōb-Tales. Abhandlungen der philologisch-historischen Klasse der sächsischen Akademie der Wissenschaften 41/2. Kabir, Habib & Wardag Akbar. 1999. Dictionnaire pashto-français. Paris: L’Asiathèque. Korn, Agnes. 2005. Towards a historical grammar of Balochi. Studies in historical phonology and vocabulary. Wiesbaden: Reichert.


Kövecses, Zoltán. 2002. Metaphor: A practical introduction. Oxford: Oxford University Press. Kövecses, Zoltán. 2005. Metaphor in culture: Universality and variation. Cambridge: Cambridge University Press. Lakoff, George. 1987. Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago. Lorimer, John G. 1902. Grammar and vocabulary of Waziri Pashto. Calcutta: Office of the Superintendent of Government Printing, India. Mayer, Thomas J. L. 1909. English-Biluchi dictionary. Lahore: Sheikh Mubarak Ali. Mayrhofer, Manfred. 1956‒1980. Kurzgefasstes etymologisches Wörterbuch des Altindischen. Vol. III, 1976. Heidelberg: Winter. Mayrhofer, Manfred. 1992‒2001. Etymologisches Wörterbuch des Altindoarischen. Vol. I, 1992; Vol. II, 1996; Vol. III, 2001. Winter: Heidelberg. Mitha Khan Marri (Miṭhā Xān Marrī) & Surat Khan (Sūrat Xān). 1970. Balochi-Urdu Lughat [in Urdu]. Quetta: Balochi Academy. Mokri, Mohammad. 1997. Le nom de « vallée » dans les toponymes iraniens. Paris-Louvain: Éditions Peeters. Morgenstierne, Georg. 1927. An etymological vocabulary of Pashto. Oslo: I komisjon hos Jacob Dybwad. Morgenstierne, Georg. 1930. The Wanetsi dialect of Pashto. Norsk tidsskrift for sprogvidenskap 4. 156‒175. Morgenstierne, Georg. 1946‒1948. Balochi miscellanea. Acta Orientalia 20. 253‒292. Morgenstierne, Georg. 1973. Irano-Dardica. Wiesbaden: Reichert. Morgenstierne, Georg. 2003. A new etymological vocabulary of Pashto. Compiled and edited by Josef Elfenbein, David N. MacKenzie & Nicholas Sims-Williams. Wiesbaden: Reichert. Pashtoon, Zeeya A. Pashto-English dictionary. Hyattsville: Dunwoody Press. Pozdena, Hans. 1978. Das Dashtiari-Gebiet in Persisch-Belutschistan. Eine regional-geographische Studie. Wien: Verlag A. Schendl. Rastorgueva, Vera S. & Džoj I. Èdel'man. 2007. Ètimologičeskij slovar' iranskix jazykov. Vol. III. Moskva: Izdatel'skaja firma « Vostočnaja literatura » RAN. Razzaq, Abdul, Mula Buksh & Tim Farrell. 2001. 
Unpublished manuscript Balochi Ganj. BalochiEnglish dictionary, 19,631 entries, fourth draft. Rossi, Adriano V. 1979. Iranian lexical elements in Brāhūī. Naples: Istituto Universitario Orientale. Rossi, Adriano V. 1998. Ossetic and Balochi in V. I. Abaev’s Slovar’. In Studia iranica et alanica. Festschrift for Prof. Vasilij Ivanovič Abaev on the Occasion of His 95th Birthday. Rome: IsIAO. 373–431. Rossi, Adriano V. 2016. Problemi di etimologia areale nel Mediterraneo orientale: gr. κύμβαχος nel suo retroterra asiatico. Linguarum varietas. An International Journal 5. 211–227. Rzehak, Lutz & Bedollah Naruyi (eds.). 2007. Abdurrahman Pahwal, Balochi Gâlband. Balochi, Pashto, Dari & English dictionary. Kabul: Al-Azhar Book Co. Hashmi Baloch, Sayad. 2000. Sayad Ganj. The first Balochi dictionary [in Balochi]. Karachi: Sayad Hashmi Academy. Septfonds, Daniel. 1994. Le Dzadrâni. Un parler pashto du Paktyâ (Afghanistan). Louvain-Paris: Peeters. Spooner, Brian. 1967. Notes on the Baluchī spoken in Persian Baluchistan. Iran 5. 51‒71. Spooner, Brian. 1971. Notes on the toponymy of Persian Makran. In C. E. Bosworth (ed.), Iran & Islam. In memory of the late Vladimir Minorsky, Edinburgh: Edinburgh University Press. 517‒533.


Steblin-Kamenskij, Ivan M. 1999. Ètimologičeskij slovar' vaxanskogo jazyka. Sankt-Peterburg: Peterburgskoe Vostokovedenie. Tremblay, Xavier. 2005. Irano-Tocharica et Tocharo-Iranica. Bulletin of the School of Oriental and African Studies 68 (3). 421‒449. Turner, Ralph L. 1966. A comparative dictionary of the Indo-Aryan languages. London: Oxford University Press. Xromov, Aleksandr L. 1975. Očerki po toponimii i mikrotoponimii Tadžikistana, 1. Dušanbe: Irfon.

Martin Schwartz

4 On some Iranian secret vocabularies, as evidenced by a fourteenth-century Persian manuscript

Abstract: A fourteenth-century manuscript in Tashkent gives, in the margins of five pages, an anonymous Persian treatise on secret forms of communication. The treatise comprises a vocabulary for an underworld Shi’ite argot; also a series of Arabic kennings used by a sect of worshippers of ’Ali; a description of an alphabetic-numerological code, with illustration of its use; and verses in the Shi’ite argot. With the exclusion of details on the alphabetic-numerological code, this article will discuss various aspects of the secret vocabularies provided, which are of interest for the broader linguistic and sociological history of Persian.

Keywords: Iranian argots, jargons, Persian codicology, Shi’ites, Judeo-Iranian, Jews, Loterai, Aramaic, Hebrew, Gypsies, Old Iranian, Middle Iranian, New Iranian, Iranian etymology, Persian lexicography, Proto-Indo-European, Banu Sasan, worship of ’Ali, logograms, Central Asia

Martin Schwartz, University of California‒Berkeley DOI 10.1515/9783110455793-005

1 Introduction

For my account of Iranian secret vocabularies I shall use, as a springboard to related bodies of data, a medieval Persian treatise called ‫ ﮐﺘﺎﺏ ﺳﺎﺳﯿﺎﻥ ﺑﮑﻤﺎﻝ‬ Ketāb-e Sāsīān bekamāl ‘The Book of Accomplished Grifters’ (henceforth KS). It is written on the margins of five pages of an unrelated text in a manuscript miscellany dated 1344 CE, kept in the Albiruni Center for Oriental Manuscripts, Tashkent State Institute of Oriental Studies. (I thank Professor Aftandil Erkinov of the National University of Uzbekistan, Tashkent, for kindly providing me with relevant photoscans upon my request.) The manuscript is listed in Sobranie vostochnyx rukopisej Akademii Nauk Uzbekskoj SSR (Catalogue of Oriental Manuscripts of the Uzbek Academy of Sciences of the Uzbek SSR), Vol. I (Tashkent 1952, 196–97); the relevant text is on folios 74–77. The Sāsīān of the title is the plural of Persian ‫ ﺳﺎﺳﯽ‬sāsī ‘grifter, beggar, parasite, rogue’ from ‫ ﺳﺎﺱ‬sās ‘bug, louse, flea’, cf. the name of the old Arab underworld, the Banū Sāsān ‘Sons of S.’, ennobled by association with the eponym of the Sasanian dynasts. KS in fact takes its name from what may now be described as the first part of a four-part tract on exclusionary forms of communication, of which at least the second and third parts were based on the anonymous author’s personal experience.

Here then, for the first time, is the treatise’s four-part scheme: Part One, in nine topical chapters (bāb), is devoted to an argot of Shiʿite Sāsīān, with each argot word glossed in Persian in a different color of ink. Part Two, again in different colors, gives glossed expressions of certain ʿAli worshippers. Part Three may here be merely described as an ‫ﺍﺑﺠﺪ‬ abjad ‘(alphabetic-numerological) code’ for private oral communication. Part Four continues verses from the end of Part One in the Shiʿite Sāsīān argot.

Only from Part One was linguistic material cited by Ivanow (1922), who gave random words he managed to note, having briefly borrowed the manuscript from an unscrupulous dealer in Qarshi. Troickaja (1948), who found the same manuscript archived in Tashkent, concentrated on correlating the Shiʿite argot of KS Part One with the Abdoltili argot of Uzbekistan’s Gypsies, mendicant preachers, and itinerant reciters. Schwartz (2014) provided an extensive set of citations of the argot terms and their glosses in Part One (and Part Four), mainly showing the origination of many of the argot words (including those of twentieth-century itinerants of Iran and Central Asia) from the earlier form(s) of the exclusionary jargon of Iranian Jews, the modern remnants of which vocabulary were collected by Yarshater (1977).

2 Vocabulary of the ‫ ﻋﻠﯽ ﺍﻟﻠﻪ‬ʿAlī Allāh

I shall now give examples of the expressions used by the ʿAlī-worshippers (ʿAlī Allāh) according to KS Part Two. Like the Old Germanic kennings, this exclusionary vocabulary consists of enigmatic metaphorical phrases, in this instance in Arabic, with Persian glosses:

(1) ‫ ﻋﺮﺵ ﺍﻟﺸﯿﻄﺎﻥ‬ʿarš al-šayṭān ‘Satan’s throne’ = ‫ ﮐﺎﻣﻪ‬kāme ‘desire’. Via association with kāme is juxtaposed:

(2) ‫ ﺍﺑﻮ *ﺍﻟﻬﺎﺭﺱ‬abū *al-hāris ‘father of the *grinder’ = ‫ ﺁﺏ ﮐﺎﻣﻪ‬āb kāme ‘a digestive liquid’ (whose initial preparation involved the grinding of a dried fermented bread [Hosseinzadeh et al. 2013]). The manuscript has al-hāriš or al-hādiš; for copyist’s errors in KS, cf. Part One ‫ ﻓﺘﻨﻪ‬fetne ‘sedition’ for *‫* ﺗﺸﻨﻪ‬tešne ‘thirsty’ (juxtaposed with gorosne ‘hungry’ and glossing argot ‫‘ ﺑﺮ ﻣﯿﺎ‬thirsty’ < ‘waterless’ < Aramaic bar + mayyā); cf. below on ‫ ﺳﺎﻭﺗﻪ‬sāūte ‘old person’ and ‫ ﺗﺎﺯ‬tāz ‘boy, catamite’. Arabic abū ‘father of-’, indicating association, occurs in further examples, the next three of which require disambiguation of the governed Arabic noun:

(3) ‫ ﺍﺑﻮ ﺍﻟﻌﺒﺎﺱ‬abū al-ʿabbās ‘father of the frowner’ (not ‘-of the lion’) = ‫ ﴎﮐﻪ‬serke ‘vinegar’.

(4) ‫ ﺍﺑﻮ ﺍﻟﻔﺮﺝ‬abū l-faraj f. ‘of the aperture’ (not ‘-of the vulva’) = ‫ ﮐﻠﯿﺪ‬kelīd ‘key’.

(5) ‫ ﺍﺑﻮ ﺍﻟﺠﺎﻣﻊ‬abū l-jāmiʿ f. ‘of the gathering’ (not ‘-of the mosque’) = ‫ ﺳﻔﺮه‬sofre ‘table(cloth)’.

(6) ‫ ﺷﻬﯿﺪ ﺑﻦ ﺷﻬﯿﺪ‬Šahīd bin Šahīd ‘Martyr, son of Martyr’ = ‫ ﺑﺮه‬barre ‘lamb’.

(7) ‫ ﻧﺒﺎﺕ ﺍﻟﻘﺶ‬nabāt al-qašš ‘vegetation of the straw’ = ‫ ﺧﺎﯾﻪ ﻣﺮﻍ‬xāye-ye morɣ ‘(chicken) egg’. The latter kenning may in part be backgrounded in the origin of nabāt ‘vegetation’ < ‘to germinate’, but the kenning is (also) visual. The final example is visual, and in fact cartoonish:

(8) ‫ ﺣﻤﯿﺪ ﺍﻟﮑﻮﺳﺞ‬Ḥamīd al-Kūsaj ‘Hamid the Swordfish’ = ‫ ﻧﺎﯼ ﺯﻥ‬nāy-zan ‘flute player’.

2.1 KS argot and Iranian Jewish jargon

As for the first and last parts of the Ketāb, in Schwartz (2014) I gave a long and varied account of how the Ketāb’s Shiʿite argot and the twentieth-century argots of itinerant groups of Iran and Central Asia in their vocabularies greatly reflect the exclusionary speech of Iranian Jews in its older phase, in which, inter alia, words of Jewish Aramaic origin predominated over words from Hebrew. I shall here revisit a small part of that material, but also bring out new observations, and introduce material bearing on Iranian (non-Jewish) aspects of the relevant vocabularies.

The term (‫< ﻟﻮﺗﺮﺍ)ﯼ‬lwtrʾ(y)> is attested in Persian literature from the tenth century onward for some kind of cryptic speech. The same term (vocalized lō̆terāʾī, lūtrāʾī) with variants has been used by Jews of Iran and Afghanistan for their exclusionary vocabulary, only to keep gentiles (and children) from understanding. As is true for the gentile argots, the morphology is that of ordinary Iranian speech. Thus KS has ‫< ﺑﻬﺰ‬bhz> (Pers. ‫ ﺑﺮﻭ‬borou) second person singular imperative of ‫< ﻫﺰﯾﺬﻥ‬hzyδn> hezīδan ‘to go’, parallel to the Jewish Loterāʾi (henceforth JLot.) of Nehavand be-hez ‘id.’, infinitive hezidan; similarly formed are KS ‫< ﺑﻨﻮﻧﺪ‬bnwnd> ‘give!’ (Pers. be-deh), JLot. Mashhad be-nund ‘id.’ Cf. further KS ‫< ﻧﺘﮑﻦ‬ntkn> ‘don’t make’ (Pers. ma-kon), JLot. Mashhad be-teken ‘make, fix!’ Note also, in the present tense, KS ‫< ﻣﯽ ﺩﻫﻠﻢ‬my dhlm> ‘I fear’ (Pers. ‫ ﻣﯽ ﺗﺮﺳﻢ‬mī-tarsam), occurring in a small group of verses appended to the first part of KS; cf. mī-dahlad ‘he fears’ in a regionally unspecified JLot., and Tajik Gypsy Jugigi argot me-dahlom ‘I fear’.


The above citations of the KS are among the Aramaic-originated forms that constitute the majority of verbs in that work’s Shiʿite argot. These verbs, and other words in the KS (and later itinerant argots), with their correspondents in JLot., support the derivation of the designation Loterāʾī, Lūtrā(ʿī), etc., from Heb. Lo-Tōrāh ‘not [in the chief language of the] Torah’, i.e., ‘not Hebrew’ = ‘Aramaic’. Further confirmation comes from the KS itself, which calls the language of the argot verses ‫ ﺯﺑﺎﻥ ﺳﻮﺭﯼ‬zabān-e sūrī ‘the Syrian language’, corresponding to the older Jewish designation for Aramaic. The words from Aramaic that passed from Jewish exclusionary vocabularies into the gentile argots inter alia show a selection from the various regional forms of JLot. For example, KS <bhz> ‘go!’ agrees in its aspiration with JLot. Nehavand hez- vs. JLot. Golpayegan ez- < Aram. ʿazl- ‘going’. Against both, for ‘to go’, JLot. Shiraz, which often differs from the more northerly JLot., has gāledan, also found in KS in synonymy with <hzyδn>, as ‫< ﮐﺎﻟﯿﺪﻥ‬kʾlydn> /gālīdan/, < Aram. galy- ‘going out’. KS , variants , ‘to make’, etc., corresponds to JLot. Mashhad teken-, Herat, Kabul, tikin- ‘to fix, make’, but JLot. Shiraz taːn ‘id.’ shows a different outcome of the Aram. etymon: taqqen ‘to establish, fix’ > JLot. Mashhad teken-, etc., with *q > k, but via *taɣen- or the like, > JLot. taːn. KS (inf. ‫ﻧﻮﻧﺪﯾﺪﻥ‬, Pers. ‫ ﺩﺍﺩﻥ‬dādan) ‘to give’ corresponds to nund- ‘to give’ in the aforementioned axis JLot. Mashhad-Herat-Kabul < OAram. *nudn- ‘a gift’, but JLot. Shiraz av-, Borujerd ab- ‘to give’ from Aram. haβ ‘give!’, of which a JLot. form, still with h-, yielded in Tajik Gypsy argots Jugigi hob-/how- and “Samarkand Lʸuli” hob-/hov- ‘to give’. KS ‫< ﺑﯿﺴﻪ‬bysh> ‘egg’ (Pers. ‫ ﺧﺎﯾﻪ ﻣﺮﻍ‬xāye-ye morɣ) from Heb. bēyṣāʰ ‘id.’ contrasts with JLot. Shiraz bika ‘id.’, whose k reflects an Old Aramaic phoneme written as qoφ in the early Achaemenid period (later /ʕ/).
The KS selection of regionally diverse JLot. words is again shown from words of Iranian origin. KS ‫< ﺟﻬﺴﺘﻦ‬jhstn> ‘to see’ (Pers. dīdan), i.e., /čehestan/ vel sim. (with -estan as in Pers. dānestan ‘to know’), ‫< ﺑﺠﻪ‬bjh> /be-čeh/ vel sim. ‘see!’ (Pers. ‫ ﺑﺒﯿﻦ‬bebin) corresponds with *čeh-, which underlies JLot. Kashan če-(V-)/čā-(C-) ‘know, see’, and goes back to Old Iranian čaiθ- (Old Avestan cōiθat̰) ‘to perceive’. The latter contrasts with its synonymous OIr. variant etymon čait-, reflected on one hand by JLot. Shiraz čed- ‘know, understand’, and on the other by JLot. Yazd čer- (for -r- < *-t-, cf. Judeo-Yazdi šer- < *šyuta- ‘gone’), also Kermani and further Isfahan ‘see, know, understand, recognize’. In the same semantic field we have, in the above noted easterly JLot. axis, Mashhad, Herat ruj- (and Golpayegan rej-) ‘to see, know’, Kabul ruč- ‘to look’, not reflected in the argots. Derivation from OIr. rauč as ‘to be illumined’ would phonologically parallel the derivation from OIr. *(-)hačaya- of the verbs JLot. Shiraz āj-, Herat huj- (*hoj-) ‘to come’, from an old middle voice form ‘to lead oneself’, cf. Avestan hācaiia- (Yasna 5.18, etc.) ‘to lead, direct, persuade’.


From a causative of the JLot. verb (via *‘make come’ > ‘bring forth, produce’) come North Iranian Gypsy argots ajon- ‘to make’ (and ‘to come’), Astarabad Gypsy “hedjonddan” ‘to make’. With the semantics of Av. hācaiia- ‘to persuade’ and upaŋhācaiia- ‘to come to agreement with somebody’, from *hāčaya- with preverb *abi (and *upari?) are derivable JLot. Nehavand viāj-, Mashhad velāj- ‘to sell, finish (a deal with someone)’.

2.2 KS argot and the Persian lexicon

We now turn to words attested in KS of which equivalent forms passed from argot into Persian, where they are attested in early lexica and poetry (note especially Qarīʿ al-Dahr, cited by Asadī). Examples (9) and (10) survive in Persian speech. Note that examples (14) and (15) are attributed to ‫ﺯﺑﺎﻥ ﺁﺳﯿﺎﻥ‬, i.e., zabān-e *sāsīān ‘language of the Sāsīān’. Of these two examples, (14) ṣʾbwth differs somewhat semantically from KS *sʾwth as well as formally, showing a different background of transmission of forms, which parallels the diversity in the Gypsy argot forms. For twentieth-century argots of Persophone itinerants, see Schwartz (2014) in the references.

(9) KS ‫< ﻧﻬﻮﺭ‬nhwr>: (a) ‘eye’ (Pers. ‫ ﭼﺸﻢ‬čašm). (b) ‘light’: ‫< ﻧﻬﻮﺭ ﺗﯿﮑﯿﻨﻪ‬nhwr tykynh> ‘(day)light-making’ (Arab. ‫ ﻓﺠﺮ ﺍﳌﻨﺼﻮﺭ‬fajr al-manṣūr ‘the victor’s dawn’) = Pers. ‫ ﭘﯿﺮﻭﺯﯼ‬pīrūzī ‘victory’; the starting point is probably pīrūzī (which is from OIr. pari-aujah-) wrongly associated with Pers. ‫ ﺭﻭﺯ‬rūz ‘day’. (c) ‘blind’ (Pers. ‫ ﮐﻮﺭ‬kūr ‘blind’). As to the twentieth-century argots of Iran and Central Asia: ‘eye’ is found for nuhur among the Gypsies and mendicant dervishes of eastern Iran, náhur among the Gypsies of Osof, Fars, and Kerman, as well as Gypsy Toshmal and Luti musicians, and for nhūr among the Arāk Gypsies. Both meanings, ‘eye’ and ‘blind’, are found among the Tajik Jugigi and “Samarkand Lʸuli” Gypsies. Finally nuhur means ‘blind’ in the argots of the Tajik Arabcha and Soghutrosh Gypsies, in Uzbek Abdoltili, and the Magati Gypsies. The co-occurrence of ‘eye, light’ and ‘blind’, the paradox of which suits an intentionally cryptic argot, is explained from Jewish usage: Aram. nəhōrā is ‘light, eyesight’; and saggī nəhōrā ‘having much light’ is a euphemism for ‘blind’. Close in meaning to the Aramaic are Pers. ‫ ﻧﻬﻮﺭ‬nohūr, Taj. nuhůr ‘eyesight, eye’.

(10) KS ‫< ﺷﯿﺪﺍ‬šydʾ> ‘insane’ (Pers. ‫ ﺩﯾﻮﺍﻧﻪ‬dīvāne ‘insane’). Pers. šeydā ‘crazed, infatuated, impassioned’ is well attested (Ferdousī, Daqīqī, Farrokhī, Asadī et al.). From Aram. šēdā ‘demon’, cf. Pers. dīvāne < ‫ ﺩﯾﻮ‬dīv ‘demon’. The Hebrew form, šēd, occurs throughout Jewish languages (including Yiddish šed ‘demon’). Note Judeo-Isfahani šezim ‘demons’ and most relevantly, JLot. Shiraz šedd ‘to catch disease’.

(11) KS ‫< ﺩﺥ‬dx> ‘good’ (Pers. ‫ ﻧﯿﮏ‬nīk) paired with ‫< ﺯﯾﻒ‬zyf> ‘bad’ (Pers. ‫ﺑﺪ‬ ‘bad’). A poem by Sūzanī of Samarkand (twelfth century) contrasts “Lūtrā” ‘fair’ and ‘vile, ugly’. Throughout Iranian and Central Asiatic Gypsy and mendicant argots dax (Kavoli dak) means ‘good’, and in Persian dervish, Uzbek Abdoltili, Tajik Jugigi, Chistoni, and Arabcha argots dax also means ‘clean, pure, right, correct’, the latter proceeding semantically from Aram. dəxē/daxyā ‘clear, pure, ritually correct’. Remarkably, Arāk Gypsy argot distinguishes dax ‘good, right’ from daxyā ‘pure’. KS <zyf>, etc., goes back to Aram. zayiφ (zēφ-) ‘ugly’.

(12) KS ‫< ﻫﺎﺩﻭﺭ‬hʾdwr> ‘a job, some work’ (Pers. ‫ ﮐﺎﺭ‬kār). Tenth-century Qaṣīda Sāsāniyya ‫ﻫﺎﺫﻭﺭ‬/‫< ﻫﺎﺩﻭﺭ‬hʾdwr/hʾδwr> ‘the circle of [fortune-tellers and their shills in a street crowd] around which people congregate’. In Gypsy argots of Iran, xodur, and of Central Asia, hodur means ‘beggar’, and in Magati argot ādur is ‘peddling’, the chief occupation of the Sheikh Momadi Afghan itinerants. Pers. ‫ ﻫﺎﺩﻭﺭﯼ‬hādūrī (Sanāʿī) ‘member of a society of persistent beggars’. Aram. hādōr ‘circle’, hādōrā ‘peddler’, root h-d-r ‘to go around’.

(13) KS ‫< ﻫﺎﺭ‬hʾr> ‘feces’ (Pers. ‫ ﮔﻮ‬gū). Pers. hār ‘id.’ (Sanāʿī). Central Asiatic Gypsy argots hor ‘id.’ From Aram. hārē ‘feces’.

(14) KS *‫*< ﺳﺎﻭﺗﻪ‬sʾwth> (ms. ‫< ﺳﺎﻭﻧﻪ‬sʾwnh>) ‘old (person)’ (Pers. ‫ ﭘﯿﺮ‬pīr). Arabcha argot sout, Jugigi argot sowut, Iranian Gypsy argot sobut ‘old man’. Asadī, Loɣat al-Fors, defines ‫< ﺻﺎﺑﻮﺗﻪ‬ṣʾbwth> as an old woman who has reached the age of seventy; cf. Borhān-e Qāṭeʾ ‘old woman’. In form <sʾwth> stands to sout, sowut as <ṣʾbwth> stands to sobut. The alleged reference of the latter alone to a woman may only be explained from the context of the poem by Qarīʿ al-Dahr, which Asadī cites (see 2.3). The Persian spelling with ṣād indicates oral transmission, in which the word was treated as though Arabic. Borhān-e Qāṭeʾ simplifies Asadī’s definition as ‘old woman’; the attribution to ‫< ﺯﻧﺪ ﻭﺍﺳﺖ‬znd wʾst>, as though to Zand and Avesta, probably proceeds from ‫< ﺍﺳﺘﺎ‬ʾstʾ>, a misreading of Asadī’s ‫ ﺁﺳﯿﺎﻥ‬Āsīān. The argot forms are from a cross of Aram. sēβūtā ‘old age’ and sāβā ‘(old) man’. For abstract used as adjective, cf. Heb. gālūt ‘(the Jewish) exile’ > JLot. Golpayegan gālut ‘miserable’, Djougi “galout” ‘bad’, Sam. Lʸuli argot gohlut ‘ill’, etc.; KS ‫< ﮐﺎﻟﻮﺕ‬kʾlwt> with contextually similar meaning in a verse. Further Heb. hăβalūt (abstract of heβel ‘vanity’) > JLot. Borujerd, etc., hevalut ‘bad’.

(15) KS ‫< ﺩﻧﻪ‬dnh> ‘woman’ (Pers. ‫ ﺯﻥ‬zan). Asadī, Loɣat al-Fors, defines as ‘noun for woman in the language of the Āsīān’ (nām-e zan be zabān-e Āsīān), and gives a verse for from Qarīʿ al-Dahr, with an argot phrase -ue zīf ‘an ugly woman’ (see above on zīf). Cf. Persian musicians duneh, etc., Kurdish Lūter-e Jāberī dānu, Pers. dervishes danew, deneb; Pers. Gypsies denew; Abdoltili danap; Tajik Gypsy argots danap, danam, danawak; Persian musicians metathetic nedew, nidu, etc. The argots of itinerants point to early derivation from Arab. ðanab ‘tail’, with phonetic change unusual in comparison with the conservatism usual in the KS for other forms from Arabic. For the argotic semantics ‘tail’ > ‘woman’, cf. the Tajik Gypsy forms in the next item. I now view the similarity of the forms nVdV to the Hebrew word niddāh ‘menstruant’ as coincidental. Dr. Ali Ashoury tells me (orally) that in Laki, dānu is used for bānu (< Pers. bānu) ‘lady, woman, wife’. I take this as a borrowing from Lūter-e Jāberi dānu, whose vocalism would reflect influence of the Persian word.

(16) KS ‫< ﻫﺮه‬hrh> ‘rump, buttocks’ (Pers. ‫ ﮐﻮﻥ‬kūn). Abū Dulaf’s tenth-century CE Qaṣīda Sāsāniyya has hurr ‘rump’, cf. Aram. ḥor ‘behind, posterior’, whereas the later argots of Iran and Central Asia agree as to final vowel with KS <hrh>. Pers. has ‫ ُﻫ ّﺮه‬horre(h) ‘rump, buttocks’. In Tajik Gypsy argots, hurra comes to mean ‘vulva’; thus “Samarkand Lʸuli” hurra, Jugigi ɣurra, and Chistoni ura ‘vulva’ (cf. for *h- “Sam. Lʸuli” muhůz, Jugigi muɣůz, KS ‫< ﻣﺎﻫﻮﺯ‬mʾhwz> ‘city’ < Aram. māḥōzā ‘id.’; “Sam. Lʸuli” hal, Jugigi ɣal ‘a piece’; Chistoni Tajik ar < har ‘every’).

(17) KS ‫< ﻧﺎﺯ‬nʾz> ‘boy’ (Pers. ɣolām ‘boy’) is found in the midst of a series with Persian gloss ‫ ﺩﺧﱰ‬doxtar ‘daughter, girl’ toward the beginning and for *<sʾwth> (Pers. pīr ‘[old] man’, [14] above) toward the end, which position supports the meaning ‘boy’ as per the gloss ɣolām. However, emending ‫ ﻧﺎﺯ‬to *‫ ﺗﺎﺯ‬tāz yields the attested word for ‘catamite’ (see 2.3 below), a meaning also common for Pers. ɣolām. As for the position of *tāz with its gloss ɣolām indicated above, note that the early Persian lexicographic tradition, alongside the usual explanation of tāz as ‘catamite, passive homosexual’, also gives for tāz the meaning ‘beardless young man’. The nuance ‘catamite’ is made likely by the fact that KS, Part Four, ends with a difficult argot verse, most likely scurrilous, featuring indefinite tāz-ī and definite tāz. The etymology is unclear (cf. Pers. tāze ‘fresh’ or less likely tāz *‘galloping’). I now fully doubt my earlier speculation that KS, Part One ‫ ﻧﺎﺯ‬is a miscopying of *‫* ﻧﺎﺭ‬nār < Heb. naʿar ‘boy’.


2.3 Classical Persian poets and transmission of argot

It is clear that it was particularly satirists (Qarīʿ al-Dahr, Sūzanī, Sanāʿī) who were cited in early Persian dictionaries as sources of argot words in scurrilous contexts. Qarīʿ al-Dahr (or Qarīʿ al-Fors), late fourth and early fifth centuries, cited by Asadī, Loɣat al-Fors, is especially relevant. In one verse we have the collocation of ‘woman’ and ‘ugly’ in ‘an ugly woman’. Furthermore, ṣābūte, horre, and tāz are collocated in the following verse:

‫ﻣﺮﺍ ﮐﻪ ﺳﺎﻝ ﺑﻪ ﺷﺶ ﻭ ﻫﻔﺘﺎﺩ ﺭﺳﯿﺪ ﻭ ﺭﻣﯿﺪ‬ ‫ﺩﻟﻢ ﺯ ُﺷ ّﻠﮥ ﺻﺎﺑﻮﺗﻪ ﻭ ﺯ ُﻫ ّﺮﮤ ﺗﺎﺯ‬

marā ke sāl be šeš o haftād rasīd o ramīd
delam ze šolle-ye ṣābūte (v)o ze horre-ye tāz

[My years having reached seventy and six
My heart shuns an old (one’s) con and a catamite’s cul]

The context explains Asadī et al. having ṣābūte mean ‘a septuagenarian woman’; rather, noun ‘old person’ or adjective ‘old’. (I thank Mahmoud Omidsalar for helpful discussion of the foregoing verse.)

2.4 Fossilized nouns with possessive suffixes as parallel to aramaeograms

In the group of 2.2, KS ‫< ﻫﺮه‬hrh>, Pers. ‫ ُﻫ ّﺮه‬horre(h) vs. Abū Dulaf hurr, Aram. ḥor, exemplifies a type of diversity within the history of argot nouns, which is inherited from JLot., and constitutes a noteworthy oral parallel to the variable fixation of stereotyped forms with possessive suffixes in logograms of the Aramaic-originated Middle Iranian scripts. In the instance of <hrh>, etc., the termination goes back to an Aramaic possessive suffixation of -eh ‘his’ as is found vestigially in JLot. Khomein ragle ‘foot’ < Aram. ragleh ‘his foot’. In the word for ‘behind’ > ‘rump’, the vestigial *-eh merged with the reflex of Middle Persian *-ag, i.e., Early New Persian -a, whereby -a in the Tajik Gypsy argot forms. Another KS word for a body part, ‫< ﻟﮑﺘﻪ‬lkth> ‘finger’ (Pers. ‫ ﺍﻧﮕﺸﺖ‬angošt) probably reflects a JLot. form based on an Aram. -eh possessive of a slang noun from root l-q-ṭ ‘to pick up’. The last word is mentioned (from Ivanow 1922) by G. Morgenstierne (1973:151) as “probably” belonging to a series of New Iranian words for ‘finger’ (Nushki Balochi aŋgul, etc.) which reflect *anguli-. Of these “Persian Gypsy” lekik is best taken with the Aramaic-originated KS argot word, while Makrani Balochi laŋkúk is a diminutive of a metathesis of aŋgul, and Kumzari linkit a dissimilation of a form like the Makrani Balochi.


KS ‫< ﺩﮐﻨﯽ‬dkny> ‘beard’ (Pers. rīš) would stand to Persian Gypsy dagnā ‘area of the beard, mouth, etc.’ as <hrh> is to Abū Dulaf hurr, but they represent respectively Aram. dagnī ‘my beard’ and dagnā ‘(the) beard’. Alternation reflecting fossilized nouns *‘my N’ alongside ‘his N’, from underlying Hebrew words, is shown by KS ‫< ﮐﯿﻤﻮﻟﻮ‬kymwlw> ‘camel’ (Pers. ‫ ﺍﺷﱰ‬oštor) = */gimōlō/ < Heb. gəmallō ‘his camel’ vs. JLot. Golpayegan gamelli ‘camel’ < Heb. gəmallī ‘my camel’. Aram./Heb. āβ(ī-) ‘father’ is reflected with fossilized possessive ‘thy’, *-x(ā), in KS ‫ﺍﺑﯿﮏ‬ ‘father’ (Pers. ‫ ﭘﺪﺭ‬pedar), JLot. Shiraz abeq ‘id.’, and with ‘my’, *-ī, in JLot. Khomein ābi ‘id.’. This situation is remarkably parallel to that found for local variation in Middle Iranian Aramaeograms, e.g., ‘son’: Pahlavi BRH in Persia < Aram. ‘his son’, Parthian and Sogdian BRY < Aram. ‘my son’. Also remarkable, again in the realm of body parts, is the aforementioned JLot. Khomein ragle ‘foot’ < Aram. ragleh ‘his foot’ as parallel to Pahl. LGLH ‘foot’.

2.5 The argots, East Iranian, and etymology

We may now examine argot words with an East Iranian perspective that are etymologically interesting.

(18) East Iranian forms are expectable a priori in consideration of the fact that among the cities for which KS provides argot names there are toponyms referring to Central Asia, and that the Arab Banū Sāsān poet Abū Dulaf had a career in the Central Asiatic court of the Samanids. The twelfth-century poet al-Ḥilli uses the term ‫< ﮐﻨﺖ‬knt> for ‘town’; the same word occurs in the KS, glossed as Pers. ‫< ﺷﺎﺭه‬šʾrh> = šahre ‘town’, and compares with Sogd. kanθ ‘city, town’, etc. KS ‫< ﺷﮑﺮه‬škrh> (with k = g) ‘cat’ (Pers. ‫ﮔﺮﺑﻪ‬ gorbe) and Jugigi argot šaɣur ‘cat’ respectively match Persian and Pamiric words for ‘porcupine, hedgehog’ cognate with Avestan sukurəna- (? < *sukwṛHna- ‘having prickly wool’ with root-stem *suk- ‘pricking, piercing’, to Av. sūkā, Pers. sūzan ‘needle’), the KS and Jugigi forms sharing the same semantic argotic change.

(19) Given the foregoing correlations, one may readily connect KS argot ‫ﺍﮐﻮﭼﯿﺪﻥ‬ ‘to take’ (Pers. ‫ ﮔﺮﻓﺘﻦ‬gereftan) with Tajik Gypsy Chistoni argot yakučidan ‘to take, to buy’ under *akōč-; cf. Sogd. ākōč ‘to hang up’, further Sogd. ptkōč ‘to catch fish’ and Pamir cognates involving entanglement and being caught (Cheung 2007: 249; I thank N. Sims-Williams for online discussion). Chistoni -a- cannot be from *-ā-, which would give *-o-; KS and Chistoni yakuč- would be reconciled by *akōč- with unstressed *ă- from *ākṓč-. The y of Chistoni yakuč- is explainable from an imperative *bi-akuč- > bi-yakuč-; cf. Oranskij (1983: 175), where yakuč is listed, and, independently, Chistoni yors- ‘to pass by’ is reconciled with Jugigi, etc., argot vars-, wars- ‘id.’ via a suggested *bi-w/vors- > *bi-yors- > yors-.

(20) Chistoni argot nā̊r- ‘to make, to do’ (Oranskij 1983: 169, 173) is relevant for the etymon of Proto-Indo-European *h2ner- ‘male’, for which Cheung (2007: 182‒183) denies that there is evidence for a PIE verbal root behind this noun. Clearly Latin neriōsus ‘strong’ and Old Irish nertum are denominal, as may be Vedic sūnara- ‘powerful, potent’, although these presuppose ‘strong, powerful’. While OPers. /hūnara-/ ‘ability, skill’ is semantically comparable with Parachi nar- ‘to be able’, nothing indicates that the latter is denominal, especially since a verbal root (H)nar seems to be reflected elsewhere in Iranian, in meanings, moreover, that cannot derive from ‘male’, and in some instances not even from ‘be strong’. A PIE primary verbal root emerges from a suggested diachronic ordering of the data with a view to semantic development, as illuminated by words with parallel developments: Ossetic nærs- ‘to swell up, become fat’, and nard ‘well fed’ would provide the semantic starting point; cf. PIE *t(e)uH ‘to swell’ > ‘be powerful, be able’. PIE *h2ner- ‘male, strong one’ would simply be a root-stem. From an Iranian verb nar- ‘to be strong, virile’, Ossetic has nart ‘heroism’, most easily derived from *narθra-, whose deverbal suffix *-θra- presupposes a verb root nar. From ‘be powerful’ as ‘have power over’ > ‘come to possess’ (cf. Iranian root xšay ‘rule, possess, be able’ and further Gr. krátos ‘power, rule’, kratéō ‘I hold’), one may account, with different preverbs, for Balochi gīnār- ‘to hold, take possession of’, Gabri afnūrdan, Yazdi pe-nar-t ‘to take’ and (via ‘take hold of’) Oss. avnal- ‘to touch’. Compare the semantically similar PIE *magh (*meh2gh) ‘be able’, whence Eng. might (verb and noun), may. The potential idea seen in the verbs might and may accounts for the meaning of the Lithuanian cognate magù magéti ‘to want, to like’, which parallels Lith. nóras ‘desire’. The Greek cognate of the Germanic and Slavic words, μῆχος ‘means, expedient’, gets us beyond Parachi nar- ‘to be able’ (whose non-denominative development from ‘be strong, potent’ should now be self-evident) to Chistoni argot nār- ‘to make’, for which cf. semantically Gr. μηχάνομαι mēkhánomai ‘devise, construct, bring about’. In Chistoni argot nār- we likely have local Central Asiatic preservation of an archaism, comparable to the preservation of early Iranian words in Jewish jargon. In the instance of nār-, the preservation has a place in the greater history of the PIE root h2ner ‘be strong’, etc.

Note: Throughout this article I have, for various reasons, transcribed Persian vowels as in Modern Iranian Literary Persian, rather than early Classical Persian. Thus I have, e.g., horre-ye rather than hurra‑yi, tāz-ī rather than tāz-ē, rūz rather than rōz, etc.

References

Cheung, Johnny. 2007. Etymological dictionary of the Iranian verb. Leiden Etymological Dictionary Series, 2. Leiden & Boston: Brill.
Hosseinzadeh, Ayda, Ali Mehdizadeh, Arman Zargaran & Moham(m)ad M. Zarshenas. 2013. Abkama, the first reported antibiotic on gastritis and infection throughout history. Pharmaceutical Historian, August. 39‒42. http://www.researchgate.net/publication/258999430 (accessed 2 January 2017).
Ivanow, W[ladimir]. 1922. An old Gypsy-Darwish jargon. Journal (and Proceedings) of the Asiatic Society of Bengal NS/10. 375‒383.
Morgenstierne, Georg. 1973. Notes on Balochi etymology. In Georg Morgenstierne, Irano-Dardica, 148‒165. Wiesbaden: L. Reichert.
Oranskij, Iosif Mixailovič. 1983. Tadžjikojazyčnie ètnografičeskie gruppy Gissarskoj doliny srednaja Azij. Ètnolingvističeskoe issledovanie. Moscow: Akademija Nauk SSSR.
Schwartz, Martin. 2014. Loterāi: Jewish jargon, Muslim argot. In Houman M. Sarshar (ed.), The Jews of Iran – The history, religion and culture of a community in the Islamic world, 33–57. London, New York: I. B. Tauris. Revised from 2012 article. http://www.iranicaonline.org/articles/loterai (accessed 2 January 2017).
Sobranie vostochnyx rukopisej Akademii Nauk Uzbekskoj SSR. Catalogue of Oriental Manuscripts of the Uzbek Academy of Sciences of the Uzbek SSR, Vol. I, 196–197. Tashkent 1952.
Troickaja, Anna Leonidovna. 1948. Abdoltili – cexa artistov i muzykantov srednej Azij. Sovetskaja Vostokovedenie 5. 251‒264.
Yarshater, Ehsan. 1977. The hybrid language of the Jewish community of Persia. JAOS 97(1). 1‒7. Cf. Yarshater, Ehsan, Encyclopaedia Iranica XV, Fasc. 2, pp. 156‒160.

Agnès Lenepveu-Hotz

5 Specialization of an ancient object marker in the New Persian of the fifteenth century¹

Abstract: In Early New Persian (tenth to eleventh centuries), there are two object markers: a postposition rā and a circumposition mar. . .rā, which appear to be equivalent. In some texts of the fifteenth century, mar. . .rā still exists, even if its use is sporadic. Nevertheless, the presence of this ancient object marker alongside the usual marker rā poses a question. Based on the occurrences found in a text written in 1484 and conserved in an autograph manuscript, this article aims to analyze and clarify the function of this circumposition in comparison with the postposition. We will see that adding the former preposition mar allows the author to avoid ambiguity in some uses of indirect object when rā tends to change its marking from an indirect and direct object to simply a direct object. The circumposition is also used to express other specific values, namely, external possession and focalization on an indirect object.

Keywords: New Persian, circumposition mar. . .rā, object marking, direct/indirect object, external possession, focalization.

1 Introduction

In the earliest Persian prose texts (tenth to eleventh centuries), authors used either the postposition rā or the circumposition mar. . .rā to mark the direct object (DO) or the indirect object (IO). Bossong (1985: 59) states that the morpheme mar generally fell out of use after the twelfth century, although, as we will see, it can still be found in some fifteenth-century texts. To my knowledge, no grammatical study has specifically addressed this circumposition. Salemann and Shukovski (1925: 27‒28) and Jensen (1931: 44) present mar. . .rā as an equivalent to the simple postposition rā. For Phillott (1919: 57), “it is generally redundant but occasionally restricts the meaning to the case in point”. Moreover, its use cannot be compared to that of Middle Persian, since mar did not exist in that stage of the language as a preposition. In Middle Persian mar was a noun, meaning ‘number, account’ (for this etymology, see Benveniste 1938: 459‒460), and it had not yet been grammaticalized.²

In this article, I will analyze mar. . .rā and its use in a work of the fifteenth century: the Rauzat al-ahbāb fī siyar al-nabī wa-’l-āl wa-’l-ashāb (RA), composed by Amīr Jamāl al-Husaynī al-Daštakī al-Šīrāzī in 1484. It was transmitted through an autograph manuscript dating from 1497‒1498 and preserved in the library Āstān-e Qods-e Razavi of Mashhad. This text details the life of the Prophet Muhammad and other prophets in a simple narrative prose. For these reasons it is relevant for a linguistic study.³

Throughout the one hundred pages I studied in RA,⁴ there are twelve occurrences of mar. . .rā, showing a sporadic use of the marker. Yet, these occurrences are numerous enough to justify some analysis, especially as it does not seem to be a stylistic effect or a purposefully archaistic feature. It is important to clarify that the small number of occurrences does not imply that the morpheme has no linguistic value. For example, in French, the conditional used to express the future in the past does not appear frequently: there are only ten occurrences in one hundred pages from Honoré de Balzac’s Le père Goriot (1835), for instance. No one interprets this to mean that this function of the conditional does not exist in French.

1 This article is based on a presentation I made at the Third International Conference on Iranian Linguistics, held in Paris on 11–13 September 2009. Since then, I have focused on the circumposition mar. . .rā in two ways: the difficulty of detecting its use, a problem linked to that of the transmission of Persian manuscripts (Lenepveu-Hotz 2014), and its use in poetry, with the example of Firdausī’s Shāhnāma (Lenepveu-Hotz 2016).

Agnès Lenepveu-Hotz, University of Strasbourg DOI 10.1515/9783110455793-006
The occurrences of the circumposition will be analyzed using the same perspective, that is to say that a small number of occurrences does not undermine the fact that mar. . .rā is used with a specific value. The presence of the circumposition mar. . .rā in the fifteenth century is puzzling because it had been on the verge of disappearance. First, a comparison of RA with other texts from the same period will allow a view on whether the 2 For the prepositions and postpositions in Middle Persian, see Durkin-Meisterernst (2014: 298– 359). Nevertheless, mar appears as a preposition in two occurrences on a Middle Persian papyrus dated from the seventh century and preserved in Berlin (Benveniste 1938: 459–460); it is used before the name of a day, implying a count in the calendar. 3 Occurrences in other works written in the fifteenth century will be used to confirm the analyses, but one must keep in mind that these other works could have been subjected to correction by the scribes (more likely a deletion than an addition), especially for this type of dialectal features (for this problem, see Lenepveu-Hotz 2014). Therefore, the main occurrences will be taken from RA. 4 These correspond to the first fifty pages (1a–26a) and the last fifty (298b–324b) of the 648page manuscript.

Specialization of an ancient object marker in the New Persian


morpheme mar was used in all of them (Section 2). Secondly, I shall examine the postposition rā on its own in RA (Section 3) in order to analyze the uses of the circumposition in comparison with the simple postposition. This will lead to an understanding of why the author sometimes chooses mar. . .rā instead of the more common rā. It will appear that mar. . .rā sometimes marks an IO in ambiguous cases (Section 4), while in others it expresses external possession (Section 5) and focalizes an IO (Section 6).

2 Dialectal specificity

To begin, I will discuss whether mar. . .rā occurs in all texts from this period or is a dialectal feature. Focusing on works from the fifteenth century, it is worth noting that mar. . .rā appears in some texts, but not in others. For the earliest Persian prose texts, Lazard (1963: 382‒384) shows that mar was a dialectal feature of Transoxiana and eastern Afghanistan, and was rarely employed in texts written in Herat and in Khorasan. It thus can be presumed that this was also the case in the fifteenth century. Let us look at several works from the fifteenth century, focusing primarily on the latter half:

– ‘Abd al-Razzāq Samarqandī (1413–1482): Matla‘-i Sa‘dayn.
– al-Daštakī al-Šīrāzī: Rauzat al-Ahbāb (RA), 1484.
– Davānī: ‘Arz-i sipāh-i Uzun Hasan, 1476.
– Hāfiz-i Abrū: Panj risāla-i tārīxī, 1414–1436.
– Jāmī: Bahāristān (Ba), 1487.
– Mīr Xwānd (1433–1498): Tārīx-i rauzat al-safā (RS).
– Xunjī: Tārīx-i ‘Ālam-ārā-i Amīnī (TA), 1490.
– Xunjī: Mihmān-nāma-i Buxārā (MB), 1508–1509.

All of the texts that feature the circumposition mar. . .rā were written in Herat5 (Table 1): in the one hundred pages that I selected from each of the above texts6 I extracted twelve occurrences in RA, six in Jāmī’s Bahāristān, four in the Tārīx-i rauzat al-safā of Mīr Xwānd, and three in the Mihmān-nāma-i Buxārā of Xunjī.

5 There are many texts written in Herat from the fifteenth century because of the importance of the court of the Timurid Sultān Husayn Bāyqarā.
6 The selected pages are noted under “Corpus” at the end of the article. In the sections to follow, statements about a certain text do not refer to the whole text but only to the pages indicated.

According to Xwānd-Amīr, Al-Husaynī al-Daštakī, the author of RA, lived in


Agnès Lenepveu-Hotz

Herat, taught in the Madrasa-i Sultānīya of this city, and preached in the Masjid-i jāmi‘ for several years.7 It is well known that the name Jāmī is associated with Herat as well, and Mīr Xwānd also spent most of his life in this city, under the protection and patronage of the vizier Mīr ‘Alī Šīr Navā’ī. The fact that there is no occurrence of mar. . .rā in the Matla‘-i Sa‘dayn by ‘Abd al-Razzāq Samarqandī, who was born in Herat, does not undermine this hypothesis. The author traveled frequently outside of the Herat area, particularly around India, and was eventually sent on an embassy to Gilan. Furthermore, if an author of Herat does not use this circumposition, this does not imply that it is not a dialectal feature of the area because an author could decide that a dialectal feature might hamper the universal success he hopes his work will achieve. Indeed, there is no occurrence of mar. . .rā in the three other texts written by authors who lived in other parts of Iran. Although Hāfiz-i Abrū was born in Herat, he was educated in Hamadan, traveled often, and died in Zanjan. As for Davānī, he spent all his life in Fars: born in Davan, near Kazarun, he worked as qadi of Fars, as a professor in a madrasa of Shiraz, and died in Fars. Xunjī also first lived in western Iran. He himself said that he was Xunjī by lineage, Šīrāzī “by birth and origin”, and Isfahānī by residence. However, in his Tārīx-i ‘Ālam-ārā-i Amīnī, there is one occurrence of mar, but it occurs only in one manuscript and this variant was not kept by the editor, who preferred the variant of the other manuscript: har, which suited the context better (va har muslim rā . . . ‘and for each Muslim . . . ’, 88). Moreover, the initial variants of the letters mim (ﻣ) and hā (ﻫ) can appear almost identical in old manuscripts.8 Regarding the Mihmān-nāma-i Buxārā, Xunjī began writing it in Bukhara and finished it in Herat (see Storey 1927–1977, vol. 1: 372), and this book presents several occurrences of mar.
Table 1: mar. . .rā in some texts from the fifteenth century

Region          Text              Occurrences
Herat           al-Daštakī        12
Herat           Jāmī              6
Herat           Mīr Xwānd         4
Herat           ‘Abd al-Razzāq    0
Herat           Xunjī (MB)        3
Outside Herat   Hāfiz-i Abrū      0
Outside Herat   Davānī            0
Outside Herat   Xunjī (TA)        0 (1)

Consequently, there is evidence suggesting that the circumposition mar. . .rā is a dialectal feature of Herat, and possibly of other regions of the eastern Iranian domain, but it seems that mar. . .rā does not appear farther west.

7 For more information about the author, see Newman (1994).
8 See also Lenepveu-Hotz (2014) about the difficulty of locating mar in manuscripts.
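The distribution in Table 1 can be regrouped by place of composition. The short Python sketch below only re-tallies the figures reported above; the dictionary layout and the helper names are illustrative assumptions, not part of the study, and the rejected variant reading in Xunjī's TA (given in parentheses in the table) is left out of the count.

```python
# Occurrences of mar...rā per text, copied from Table 1.
# The region labels follow the table headers; the data structure
# and function names are illustrative assumptions only.
table1 = {
    "Herat": {
        "al-Daštakī (RA)": 12,
        "Jāmī (Ba)": 6,
        "Mīr Xwānd (RS)": 4,
        "‘Abd al-Razzāq": 0,
        "Xunjī (MB)": 3,
    },
    "Outside Herat": {
        "Hāfiz-i Abrū": 0,
        "Davānī": 0,
        "Xunjī (TA)": 0,  # the single rejected manuscript variant is not counted
    },
}


def totals_by_region(table):
    """Sum the occurrences for each region."""
    return {region: sum(counts.values()) for region, counts in table.items()}


def texts_with_marker(table):
    """List the texts in which the circumposition occurs at least once."""
    return [text for counts in table.values()
            for text, n in counts.items() if n > 0]


region_totals = totals_by_region(table1)
attested = texts_with_marker(table1)
print(region_totals)
print(attested)
```

On these figures every attested occurrence falls in the Herat group, which is the pattern the paragraph above describes.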


If in the fifteenth century the circumposition had actually disappeared in other regions where it had previously been used (further study would verify this point concerning earlier centuries), the movement of mar. . .rā from Transoxiana and eastern Afghanistan to Herat may be linked to the transfer of the Timurid court from Transoxiana to Herat during the fifteenth century. However, this hypothesis would need to be confirmed: when Timurids appeared (fourteenth century), was the circumposition still employed in Transoxiana, and only in this area? If the displacement of mar. . .rā to Herat occurred prior to the arrival of the Timurids in Transoxiana, it must have been linked to another factor. To understand why this dialectal feature still appears in the fifteenth century, we will now turn to the use of rā in RA, i.e., to the question of whether rā is more of a marker of IO like in Early New Persian, or whether it has progressively become a marker of DO as in contemporary Persian. As we will see (Section 4), the result may have consequences on the interpretation of the use of the circumposition mar. . .rā.

3 Use of rā in the Rauzat al-Ahbāb

As Table 2 shows, rā has become predominantly a marker of DO: 49.1 percent of the occurrences mark the DO of a simple verb. Adding the occurrences of the object of a complex verb, the result is 80.2 percent. Simple and complex verbs are given separately because the instances of compound verbs can also be analyzed as simple verbs with a real object and an IO marked by rā, as Key (2008: 240) and Paul (2008: 335) demonstrate. However, one could assume that at this period the complex verbs are lexicalized, so the third column of Table 2 presents them together.

Table 2: The use of rā in the first twenty-five pages of RA

DO       Object of a complex verb   Total of DO   IO       Other uses
109      69                         178           39       5
49.1%    31.1%                      80.2%         17.6%    2.2%

These figures confirm the general evolution of rā, and RA appears as an intermediate stage between the tenth and twelfth centuries (51.2 percent DO / 45.5 percent IO) and the twentieth century (82.3 percent DO / 6.3 percent IO) (percentages taken from Lazard 1970: 384).
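The percentages in Table 2 follow directly from the raw counts. As a check, the following minimal Python sketch (variable and function names are illustrative) recomputes each share from the total number of occurrences. Note that plain rounding yields 2.3 percent for "other uses" where the table prints 2.2, possibly so that the printed shares sum to 100.

```python
# Raw counts from Table 2 (uses of rā in the first twenty-five pages of RA).
counts = {
    "DO": 109,
    "Object of a complex verb": 69,
    "IO": 39,
    "Other uses": 5,
}

total = sum(counts.values())  # all occurrences of rā counted in the table


def share(n, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * n / total, 1)


shares = {label: share(n, total) for label, n in counts.items()}

# "Total of DO" combines the first two categories, as in the table.
total_do = counts["DO"] + counts["Object of a complex verb"]
total_do_share = share(total_do, total)

for label, pct in shares.items():
    print(f"{label}: {counts[label]} ({pct}%)")
print(f"Total of DO: {total_do} ({total_do_share}%)")
```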


In the earliest Persian prose works, the circumposition mar. . .rā could be used for the different functions of rā, either DO or IO, apparently without modifying the meaning in comparison with rā alone.9 There is a DO in (1) and an IO in (2), both from the same text, the Hidāyat al-muta‘allimīn (HM), written in 980:

(1) mar zafān  rā bi-jumbān-ad10
    mar tongue rā VAFF-move.PRS-3SG
    ‘It moves the tongue’. (HM 61, 7–8)

(2) mar har  šaxs-ē      rā mazāj-ē           buv-ad
    mar each person-INDF rā constitution-INDF be.PRS-3SG
    ‘For everyone there is a constitution’. (HM 19, 14)

Contrary to the earliest state of New Persian, mar. . .rā always marks the IO in the occurrences found in RA as well as in the other works of the fifteenth century (Ba, RS, and MB). This is one of the reasons why I do not agree with the nuance of respect that Bahār ([1942] 1994: 1: 401) interpreted in mar. Why would an author want to mark respect merely with an IO? Moreover, there are sentences in which one expects a show of respect (with words like “God” or “the Prophet”) and rā alone is found; see examples (3) and (4) below. There are also occurrences with a commonplace word used with mar. . .rā.11

In addition, there are other conclusions to be drawn from this restriction to IO. First, the fact that the circumposition is not used for DO in the fifteenth century, contrary to its use for this purpose in the earliest texts, may imply that it specialized in particular functions. Secondly, the circumposition cannot be analyzed as an equivalent to the postposition alone: if this were the case, why would it have been employed only to mark the IO and not, like the postposition, to mark both IO and DO? The fact that in the fifteenth century rā tends to mark the DO more frequently than the IO suggests that the author chooses mar. . .rā when rā alone could be interpreted either as a DO (more typical in this state of language) or as an IO (the former primary function of rā).

9 Cf. Lazard (1963: 382) and Bossong (1985: 59). Regarding the difficulty in understanding its value, see Lenepveu-Hotz (2016).
10 In what follows, language material in Arabic characters is presented in transcription, not in transliteration.
11 See also notes 14 and 29.


4 mar. . .rā as a marker for the IO in ambiguous cases

In RA, mar. . .rā is employed with very different verbs, both simple and complex: once with the verbs āfarīdan ‘to create’, guftan ‘to say, to tell’, xwāstan ‘to want’, dilālat kardan ‘to indicate’, vāqa‘ šudan ‘to happen’, twice with du‘ā kardan ‘to pray’, and five times with būdan ‘to be’.12 Concerning the first four verbs, the use of mar. . .rā instead of rā alone seems meant to avoid an ambiguity in non-prototypical sentences between DO and IO, which are both potential functions of rā.13 The uses of the postposition and of the circumposition depend on the actancy of the verb – one-actant verbs or two-actant verbs (Section 4.1) – and on the animacy (Section 4.2). When the verb is a complex verb (e.g., dilālat kardan ‘make an indication’), the presence of the nominal element poses another question (Section 4.3). Even if there are few occurrences in RA, it is possible to distinguish the function of mar. . .rā as a marker that allows the disambiguation of some constructions, where rā alone could be understood as a DO marker. For each of these four verbs I will first examine the use of rā alone in order to demonstrate the role played by mar in the occurrences I found in RA.

4.1 mar. . .rā for IO and rā for DO

In RA, two verbs illustrate the general theory of disambiguation: āfarīdan ‘to create’ and xwāstan ‘to want’. For both, if we compare all the occurrences with and without mar. . .rā, it appears that the author employs the circumposition for IO in occurrences where the postposition alone could have been interpreted as a DO, the main value marked by rā alone. With āfarīdan, rā is employed only for a DO: to create something or someone, as in (3). However, in examples (4) and (5), the author wants to express both the thing created and for whom it was created.14 As noted by Paul (2008: 334, with other examples), “the co-occurrence of a direct and an indirect rā is avoided”.15 This seems also to apply to mar. . .rā.

(3) va  čūn  ādam rā biy-āfarīd
    and when PN   rā VAFF-create.PST.3SG
    ‘And when he created Adam’ (RA 319b, 20)

(4) xudāy-i ta‘ālā   mar ō  rā šahvat āfarīd
    God-EZ  Almighty mar he rā desire create.PST.3SG
    ‘God, the Almighty, created desire for him’ (RA 8a, 4)

(5) haqq-i   ta‘ālā   marā barāy-i tu  āfarīd-a
    truth-EZ Almighty I.rā for-EZ  you create-PP
    ‘God, the Almighty, created me for you’. (RA 8b, 14–15)

12 In the other texts, mar. . .rā is mainly used with the verb būdan ‘to be’: five times in Ba (out of six occurrences of the circumposition), once in RS, and the three occurrences in MB. The other occurrences of RS are used with dādan ‘to give’ and two complex verbs (sajda kardan ‘to prostrate’ and hāsil šudan ‘to be produced’), and in Ba, with dāštan ‘to have’.
13 Meunier and Samvelian (1997: 212) show that rā can avoid an ambiguity in contemporary Persian too: “Quant aux groupes nominaux ambigus, elle [la présence de rā] en sélectionne une lecture définie et/ou spécifique” [For the ambiguous nominal groups, it (the presence of rā) selects a definite and/or specific reading].
14 Again the nuance of respect cannot be seen in the use of mar. . .rā: Adam appears in both (3) and (4), and this poses the question of why he deserves respect in (4) and not in (3).

In all the occurrences of the verb “to create” in RA, the DO appears with rā if it is animate (eighteen times) and without rā if it is inanimate (six times). Paul (2003: 182) and Key (2008: 244) state that in Early New Persian the presence or the absence of rā is linked to the animacy of the object. To some extent, this was still the case in the fifteenth century. In (4), one could say that the author could have used rā alone because the DO is unmarked. However, because rā tends to express a DO with āfarīdan in this text, the author may have thought that the sentence would be clearer with mar. . .rā. In (5), the personal pronoun man ‘I’ must be marked by rā for it is an animate DO and the IO can only be built with a preposition, e.g., barāy ‘for’, and no longer with mar. . .rā.

Even if there are fewer occurrences of xwāstan ‘to want’ with an object (three with rā, one with mar. . .rā), the use of the circumposition is also a marker of disambiguation, as with āfarīdan. In the very similar sentences in examples (6) and (7), mar. . .rā marks the IO dōst ‘friend’; the DO sōxtan ‘the burning’ is unmarked in the first sentence, whereas the postposition alone marks the DO dōst in the second.

15 In Early New Persian (tenth to eleventh centuries) – thus a former stage of the language – Lazard (1963: 380–381) finds some examples of such a co-occurrence, but says that in the usual pattern the DO is built with rā and the IO with the preposition ba ‘to’, or the DO is unmarked and the IO is marked by rā.

(6) čūn  dōst   mar dōst   rā sōxt-an  xwāh-ad      zīst-an  ravā    nēst
    when friend mar friend rā burn-INF want.PRS-3SG live-INF allowed NEG.be.PRS.3SG
    ‘When a friend wants his friend to burn (lit. ‘wants the burning for [his] friend’), he shall not deserve to live’. (RA 19a, 17–18)

(7) čūn  dōst   dōst   rā xwāh-ad      sōxt-an  ravā    nēst
    when friend friend rā want.PRS-3SG burn-INF allowed NEG.be.PRS.3SG
    ‘When a friend cares for (his) friend (lit. ‘wants the friend’), he shall not deserve to burn’. (RA 19a, 18)16

Two other occurrences of xwāstan, in which the DO is marked only by rā (8), confirm the impression that rā is used for a DO while mar. . .rā is used for an IO.

(8) a. ibrāhīm vai rā dar vaqt-ē   xwāst        ki. . .
       PN      he  rā in  time-REL want.PST.3SG when
       ‘Ibrahim asked for (lit. ‘wanted’) him when . . . ’ (RA 20a, 2–3)

    b. tā          tu  ō  rā bi-xwāh-ē
       in order to you he rā VAFF-want.PRS-2SG
       ‘In order for you to want her’ (RA 308b, 15)

16 The context of (6) and (7) is the following: Nimrod has ordered that Ibrahim should be burned. Angels ask Ibrahim whether he needs help. In (6) he explains why he did not ask for help and the angel Jibril answers in (7).

4.2 The question of animacy

The question of animacy deals with the larger problem of the evolution of object marking. For āfarīdan ‘to create’ (Section 4.1), we saw that a DO is more often marked if it is animate. For example, in the Qābūsnāma, a text of the eleventh century, more than 80 percent of the human DO are marked with rā (Key 2008: 232). However, the DO tends to be marked progressively with rā if it is a definite DO, even if it is inanimate. The fifteenth century, and thus RA, which was written in this century, present an intermediary stage between marking dependent on animacy (as in the tenth and eleventh centuries) and marking dependent on definiteness (as in contemporary Persian). This is why mar. . .rā is employed in non-prototypical sentences with the verb guftan ‘to say, to tell’, again because the postposition rā alone could be ambiguous between a DO and an IO. In the unambiguous cases in which rā alone marks the IO (seven occurrences), i.e., the person to whom the subject says something, the marked noun or pronoun is always an animate one, such as in (9a).17 When rā marks a DO,18 it is with an inanimate noun related to speaking, like kalamāt ‘words’, suxan ‘words or utterances’, or javāb ‘answer’, i.e., for the contents of the speech (9b and four other occurrences). In the occurrence with mar. . .rā in (10), the IO qalb va rūh ‘heart and spirit’ also consists of inanimate nouns.19 Even though without mar, qalb va rūh would not have been interpreted as the DO of guftan, the author prefers to clarify the function of this unusual inanimate indirect object, since the IO is prototypically human.

(9) a. rōz-i  dīgar kāhin-ān  nimrōd rā guft-and:
       day-EZ other priest-PL PN     rā say.PST-3PL
       ‘The next day the priests said to Nimrod:’ (RA 16b, 8)

    b. īn   kalamāt rā bi-gōy-ad
       this word.PL rā VAFF-say.PRS-3SG
       ‘He says these words’ (RA 10b, 9)

(10) nafs-i  vai mar qalb  va  rūh    rā guft:
     soul-EZ he  mar heart and spirit rā say.PST.3SG
     ‘His soul said to (his) heart and (his) spirit:’ (RA 17a, 22)

17 In classical Persian it was very common for the IO of the verb guftan ‘to tell’ to be built with the postposition rā. Cf. Ovčinnikova (1956: 396) and Karimi (1990: 181). In RA, this structure is in competition with bā ‘with’ (twelve occurrences) and ba ‘to’ (four occurrences).
18 I do not discuss occurrences with a DO and an attribute of the object related to the meaning of naming, because in these sentences mar. . .rā is not employed – there is only the structure with rā alone. See for instance:
    ān-rā   duldul mē-guft-and
    this-rā PN     VAFF-say.PST-3PL
    ‘They called him Duldul’ (RA 320b, 3)
19 See Key (2008: 232): “The classification ‘animate’ included inanimate objects, body parts, places, and abstractions”. Other occurrences show that qalb ‘heart’ and rūh ‘spirit’ are treated as inanimate nouns. Indeed, we must remember that the notion of humanness is scalar (cf. Lazard 1982: 185–186).

4.3 IO vs. DO with a complex verb: The case of dilālat kardan ‘to indicate’

With complex verbs, one can assume that the situation of marking is different. When an object is marked with rā, it is difficult to interpret it as a DO or an IO because the nominal element of the complex verb can be interpreted as the first object of the verb. If we consider that in this stage of the language the complex verbs are lexicalized, the marked object can be either the DO of the complex verb or its IO. The uses of the postposition and the circumposition show us the same use of disambiguation as those we saw with simple verbs. The IO of dilālat kardan in (11) is marked by mar. . .rā while rā alone marks the DO of the verb in (12). This is especially important as the meaning of the verb is different: “indicate something about or related to someone” in the first example and “lead someone” in the second.

(11) ahādīs-i     sahīha  dilālat    bar subūt-i     īn   xasīsa
     hadith.PL-EZ true.PL expressing on  evidence-EZ this characteristic
     mē-kun-ad       mar ān   sarvar rā
     VAFF-do.PRS-3SG mar that lord   rā
     ‘The sound hadith indicate the evidence of this characteristic about the Prophet’ (RA 315b, 16)

(12) dilālat kun-am     šumā rā
     guiding do.PRS-1SG you  rā
     ‘I guide you’ (RA 310a, 11)

In fact, šumā rā in (12) is on the same level as subūt and not on that of mar ān sarvar rā in (11).20 These two sentences can be defined as “make an indication for you” in (12) and “make an indication of the evidence of . . .” in (11). Looking more closely at mar. . .rā in (11), one could also analyze this marking in a different way: as the expression of external possession.

20 So, even if the complex verb must be interpreted as a simple verb and its unmarked object, the distinction of the different level is the same.

5 Expression of external possession?

In many European languages the dative (or the IO) is also employed to express external possession,21 as shown by König and Haspelmath (1998). They note examples in German, Die Mutter wäscht dem Kind die Haare ‘The mother washes the hair of the child’, or in French, On lui a tiré les oreilles ‘He had his ears boxed’. The phenomenon concerns inalienable property: the possessor is animate, specifically human, and the possessed is a part of the body or clothes. The verb cannot be a state verb or a verb of perception but, according to the terminology of König and Haspelmath (1998: 533), it is a “verb of contact or change”. As we will see, it is not this exact form of external possession in our text because of the meaning of the verb and because the complement built with mar still depends on a noun rather than on the main verb. However, it functions like this structure, i.e., mar. . .rā serves as a substitute for the usual possessive pattern with the ezāfe (or with a personal suffix) in some examples. For instance, (11) is almost equivalent to īn xasīsa-i ān sarvar ‘this characteristic of the Prophet’. Because rā never marks external possession in the one hundred pages I studied in RA, mar. . .rā can also avoid ambiguity. In (13), if rā had been employed, the reader could have interpreted the meaning to be that Adam is the recipient of the calling whereas in actuality he is the beneficiary of the prostration. The circumposition clarifies that mar ādam rā is linked to sujūd ‘prostration’ and not to the main verb vāqa‘ šuda ‘took place’ (the context demonstrates that the recipient of the calling is Iblīs). Moreover, (13) does not exhibit the usual order for this type of existential sentence (Indirect Object-Subject-Verb),22 but the order Subject-Indirect Object-Verb.

(13) xatāb-i    sujūd       mar ādam rā bā   malāyika vāqa‘  šud-a
     calling-EZ prostration mar PN   rā with angel.PL placed become-PP
     ‘The calling for prostration before Adam took place with the angels (for Iblīs)’ (RA 7b, 19)

21 For the term external possession, see König and Haspelmath (1998: 526): “le possesseur n’a cependant pas besoin de faire partie du même constituant de phrase que le possédé, mais peut aussi sous certaines conditions former un constituant de phrase distinct” [the possessor does not necessarily need to be part of the same phrase as the possessed but under certain conditions it may also constitute a distinct phrase].
22 For example, in:
    ibrāhīm rā mu‘āraza bā   nīmrōd vaqt-ē   vāqa‘  šud           ki. . .
    PN      rā quarrel  with PN     time-REL placed become.PST.3SG when
    ‘A quarrel took place between Ibrahim and Nimrod (lit. ‘for Ibrahim with Nimrod’) when . . .’ (RA 18a, 7)

One can observe a similar phenomenon in other sentences.23 The structure with būdan ‘to be’ and the IO generally marks the possession, like the Latin mihi est type. However, in (14), the author compares Muhammad’s feet to pumice and does not say that Muhammad has no feet. In this case, we do not find the common expression of possession with būdan ‘to be’, which can be seen in other examples. In fact, with mar. . .rā and būdan, none of the five occurrences present the meaning ‘to have’, contrary to rā alone.24

(14)

     va  mar ō  rā pāy-hā  na-būd-a  ast        čunān ki qayqāb rā mē   bāš-ad
     and mar he rā foot-PL NEG-be-PP be.PRS.3SG as       pumice rā VAFF be.PRS-3SG
     ‘And his feet were not like pumice’ (RA 302b, 10–11)

Unlike external possession, the complement built with mar. . .rā does not function as the beneficiary or recipient of the main verb but maintains its relation to the noun. The possessor is not the experiencer of the action. König and Haspelmath (1998: 567) state that in external possession, “le possesseur est affecté par l’action qui concerne son possédé” [the possessor is affected by the action that concerns its possessed]. However, like external possession, “the possessor is treated as an additional argument of the clause” (Payne and Barshi 1999: 5).

23 Also in Ba: va dar tahniat-i fath-i vai mar hindūstān rā qasīda-ē dārad ‘and in congratulations for his victory against India there is a panegyric poem’ (Ba 127). Even if the phrase mar hindūstān rā ‘against India’ is syntactically independent, it is semantically linked to fath-i vai ‘his victory’.
24 This is not the case with the occurrences in the other texts, in which mar. . .rā is used with būdan ‘to be’ with the meaning of ‘to have’. However, this does not contradict my assertion, because in all of these occurrences, there is a specific use of this structure: all appear in sentences in which the subject or the object follows the verb, giving the impression that there is a value of focalization, as we will see in Section 6 with other examples.


In place of external possession, one could have considered examples (11) and (14) as left dislocation (14) or right dislocation (11), but in (13) the words built with mar. . .rā are placed between the subject and a complement. In addition, in dislocation, the complement is not always in a genitive relation with a noun from the sentence. Nevertheless, one could say that in many examples the placement of mar. . .rā depends on a dislocation that strengthens the use of the circumposition instead of the usual structure with the ezāfe. The complement marked by mar. . .rā cannot be a benefactive IO because of examples such as (13), in which Adam is not the beneficiary of the calling (we are sure of that meaning thanks to the context and the historical background), and examples such as (14), in which būdan ‘to be’ built with an IO does not have its usual possessive meaning. Therefore, it is more relevant to compare this structure with external possession. Moreover, even if the use of mar. . .rā is linked to the original meaning of rā ‘with regard to, for the sake of’, the pattern may be compared to one of external possession. We find the same use of this structure in the German and French examples quoted above: a dative or IO (mar ō rā, dem Kind, lui) represents the possessor of a noun phrase unmarked by a possessive marker (pāy-hā, die Haare, les oreilles and not pāy-hā-aš, seine Haare, ses oreilles). Therefore, this is not exactly the same pattern as in contemporary Persian, as seen in sentences like aqā kučik rā čeqadr xarj-e tahsil-eš kardam [Aqā Kučik, how much I spent on his studies!] given by Lazard (2006: 180): there is not only the postposition rā but also the possessive marker -eš. However, compared with the usual structure with the ezafe, the use of mar. . .rā to express possession may be a nuance of focalization, like rā in contemporary Persian.25

25 This was most likely the role of mar alone in Early New Persian (Lazard 1963: 450–451), and perhaps also of the circumposition (Lenepveu-Hotz 2016).

6 Focalization

The use of a morpheme as a marker of focalization is quite common in other languages. Pottier (1968: 92) states that we see or do not see the preposition a in Spanish according to the speaker’s purpose. Such an analysis can be applied to the uses of mar. . .rā. This is most likely the case for the occurrences in which the circumposition is employed to avoid ambiguity: the sentence in (10), for example, may be understood as ‘To (his) heart and (his) spirit, his soul said:’. There is further evidence of this role. For instance, in (13), with mar. . .rā, the author can focus on an unexpected prostration before Adam instead of the usual prostration before God. The focalization is also tangible in other examples, like in (15).26

(15)

     agar sajda       mar haqq  rā būd-ē,          iblīs takabbur na-kard-ē
     if   prostration mar truth rā be.PST.3SG-VAFF PN    pride    NEG-do.PST.3SG-VAFF
     ‘Had the prostration been for God, Iblīs would not have had a disdainful attitude’ (RA 7b, 10–11)

Again, the order of the words is unexpected in such an existential sentence and the author stressed the focalization by using mar. . .rā. Indeed, in this context, we understand that Iblīs agrees to prostrate himself only before God (mar haqq rā) but not before Adam as God commands the angels to do. This will have a considerable consequence: Iblīs becomes the devil and is sent to hell. God (haqq) is then the most important piece of information in the first clause and is focalized by the circumposition, itself strengthened thanks to the placement of the complement.27

Regarding du‘ā kardan ‘to pray’, even though the occurrences are not numerous enough to draw conclusions (two with mar. . .rā and only one with rā), one might conclude that the position of the object again plays a role in its marking. In fact, the two sentences with this verb and mar. . .rā do not contain the same order as the sentence with rā alone (16). There may be a marked order with mar. . .rā in (16a) and an unmarked order with the simple postposition in (16b).28

26 See another example: va mar allāh-i ta‘ālā rā muvahhid nabūda ‘And towards God, the Almighty, he was not monotheist’ (RA 17a, 7).
27 We see the same phenomenon with vāqa‘ šudan ‘to happen’; see (13) above: in this type of existential sentence, the order with rā is unmarked (Indirect Object-Subject-Verb) and marked with mar. . .rā (Subject-Indirect Object-Verb). An unusual order is also found in two other occurrences of mar. . .rā with būdan ‘to be’, in which the IO appears after the verb: va ādam qabla būd mar ān sajda rā ‘And Adam was the direction for that prostration’ (RA 7b, 8) and qasm-i avval ki vājibāt ast hikmat dar taxsīs-i ānhā ba hazrat ziyādatī zulfā va husūl-i darajāt-i ‘olā ast mar ō rā ‘The first part which is the compulsory religious prescriptions is for him the wisdom to devote these things to the Lord, to be much closer and reach the highest degrees (of spirituality)’ (RA 313b, 11).
28 The other texts also present an unusual order SVO in the occurrences with mar. . .rā; see note 24 and this example in RS: čirā sajda namēkunand mar xudāy rā ‘why they do not prostrate themselves before God’ (RS 432).

(16) a. Subject-Verb-mar Object rā
     b. Subject-Object rā-Verb

Furthermore, thanks to the study of the context of these three sentences, it is clear that in the two occurrences with mar. . .rā the author focuses on this complement. In (17), he opposes all the dead at the beginning of the prayer to the man who has died, mar mayyit rā, at the end of the prayer. In (18) ō is focalized because the sentence is aimed at someone who has just killed the man represented by this ō ‘he’.

(17) va  dar takbīr-āt-i  dīgar du‘ā   kard-ē          mar mayyit   rā ba maqfirat   va  rahmat
     and in  takbir-PL-EZ other prayer do.PST.3SG-VAFF mar deceased rā to absolution and mercy
     ‘And in other takbir, he would pray for the deceased for (his) absolution and mercy’ (RA 298b, 13)

(18) čirā du‘ā-i    barikat  na-kard-ē      mar ō  rā
     why  prayer-EZ blessing NEG-do.PST-2SG mar he rā
     ‘Why did you not pray for his blessing? (lit. ‘do the prayer of blessing for him’)’ (RA 313a, 2)

Contrary to (17) and (18), in the sentence with rā alone (19), there is no focalization in the context. This concerns any tribe Muhammad eats with and not one tribe in particular. Thus, in this sentence ān qaum ‘that tribe’ is not focalized.

(19) čūn  nazd-i  qaum-ē     ta‘ām xward-ē          ān   qaum  rā du‘ā   kard-ē
     when near-EZ tribe-INDF food  eat.PST.3SG-VAFF that tribe rā prayer do.PST.3SG-VAFF
     ‘When he would eat with a tribe, he would pray for that tribe’ (RA 303b, 17–18)29

29 This occurrence again contradicts the nuance of respect that mar could have expressed: even if the tribe is respected and honored (the Prophet would pray for them all), it is only marked with rā.

Specialization of an ancient object marker in the New Persian

One may argue that there are also some occurrences in which rā alone marks the focalization. Indeed, occurrences such as (20) can be found with the postposition rā alone, but here it is a focalized subject and no longer an IO.

(20)

ān    nūr    rā  dar  pīšānī-i     vai  būd
that  light  rā  in   forehead-EZ  he   be.PST.3SG

‘Regarding that light, it was on his forehead’ (RA 4b, 11)

Therefore, one can assume that the author focuses on an IO with mar. . .rā and on other functions with rā.30 There is a similar distinction in Wolof or Banda-Linda (Feuillet 2006: 627), where there are two different markers depending on the function of the focalized group: for example, Banda-Linda uses the morpheme kə̀ for the subject and the morpheme də̀ for the object. A parallel can also be made with contemporary Persian, since rā can focalize (or topicalize) different complements, but never the subject (Meunier and Samvelian 1997: 212–230).

7 Conclusion

To conclude, although there are few occurrences of mar. . .rā in the texts of the fifteenth century, it is evident that it is a dialectal feature of Herat, always employed with an IO. This circumposition is part of a continuum of the “zone objectale” [object zone] – according to Lazard’s expression (1994: 95; translated in 1998: 91) – in which there are different degrees of marking, as seen in (21). It depends on the nature of the object, its degree of animacy, and on its precise function, i.e., a DO or an IO.

(21)

unmarked         direct object   indirect object   indirect object
direct object    marked by rā    marked by rā      marked by mar. . .rā
←________________________________________________________→

30 The use of mar as emphasis was already noted (see, for example, most recently, Bābāsālār 2013: 184, with examples in poetry; and Moayedi and Lotfi 2013: 112). However, no study has highlighted the difference between the emphasis expressed by the circumposition and the emphasis expressed by the postposition alone.

As for the focalization, we can only speculate about the phenomenon because of the limited number of occurrences. However, it is a likely hypothesis, as with mar. . .rā the author can focus on an IO more intensely than when using rā alone or an ezāfe (in occurrences linked to external possession). The groups with other functions, like the subject, could be focalized with the postposition alone. Therefore, as in other languages, different markers may exist. Consequently, more than a survival of an archaism, there is a restriction, a specialization of the former morpheme mar to distinguish some uses of IO from the more general uses with rā alone. This phenomenon took place at a time in which the postposition rā tended to mark the DO more and more restrictively. Thus, the circumposition helps avoid ambiguity in non-prototypical sentences. It is not an exact rule that authors strictly observe but it is one way in which Persian has renewed itself in order to express the distinction between direct and indirect object, when the postposition rā was in a transitional stage between the old marking of indirect object and the new one of direct object.

8 Abbreviations in the glosses

EZ – ezāfe (linker between the head noun and its modifiers or the head noun and its possessor); INDF – indefinite; INF – infinitive; NEG – negation; PST – past; PL – plural; PN – proper noun; PP – past participle; PRS – present; REL – introducer of a relative clause; SG – singular; VAFF – verbal affixes (modal or aspectual).

Corpus

‘Abd al-Razzāq Samarqandī, Kamāl al-Dīn. Matla‘-i Sa‘dayn va majma‘-i bahrayn [The rising of the two stars of fortune and the confluence of the two seas], 33–133. ‘Abd al-Husayn Nawāī (ed.). Tehran: Mo’assesse-ye motāle‘āt va tahqiqāt-e farhangi, pažuhešgāh, 1993.
al-Axavainī al-Buxārī, Abū Bakr Rabī‘ b. Ahmad. Kitāb hidāyat al-muta‘allimīn [The guide book for students of medicine], 13–112. Jalāl Matini (ed.). Mashhad: Čāpxāne-ye dānešgāh-e Mashhad, 1965. Abbreviated HM.
al-Daštakī al-Šīrāzī, Amīr Jamāl al-Dīn ‘Atā’ Allāh b. Fazl Allāh al-Husaynī. Rauzat al-Ahbāb fī siyar al-Nabī wa-’l-Āl wa-’l-Ashāb [The garden of the friends in the footsteps of the Prophet, his family and his companions], 1a–26a, 298b–324b. Preserved in the library Āstān-e Qods-e Razavi of Mashhad (Nr. 4109). Abbreviated RA.
Davānī, Jalāl al-Dīn Muhammad. ‘Arz-i sipāh-i Uzun Hasan [Presentation of Uzun Hasan’s army], 41. Iraj Afshār (ed.). Tehran: Dāneškade-ye adabiyāt, 1956.
Hāfiz-i Abrū. Cinq opuscules de Hāfiz-i Abrū, 68. Félix Tauer (ed.). Prague: Editions de l’Académie tchécoslovaque des Sciences, 1959.
Jāmī, ‘Abd al-Rahmān b. Ahmad Nūr al-Dīn. Bahāristān va rasā’il [The spring garden and epistles], 65–169. ‘Alāxān Afsahzād, Mohammad Jān ‘Omarof, & Abu Bakr Zohur al-Din (eds.). Tehran: Mirās-e maktub, 2000.

Mīr Xwānd, Muhammad b. Xāvandšāh. Tārīx-i rauzat al-safā [The garden of purity], 419–519. Jamshid Kiyānfar (ed.). Tehran: Asātir, 2001.
al-Xunjī al-Isfahānī, Abul-Xayr Fazl Allāh b. Rūzbihān. Tārīx-i ‘Ālam-ārā-i Amīnī [The world-adorning history of Amini], 21–120. John E. Woods (ed.). London: Royal Asiatic Society, 1992.
al-Xunjī al-Isfahānī, Abul-Xayr Fazl Allāh b. Rūzbihān. Mihmān-nāma-i Buxārā [The guest-book of Bukhara], 1–50, 306–356. Manučehr Sotudeh (ed.). Tehran: Bongāh-e tarjome va našr-e ketāb, 1962.

References

Bābāsālār, Asghar. 2013. Kārbordhā-ye xāss-e rā dar barxi motun-e fārsi [Special uses of rā in some Persian texts]. Adab-e fārsi 3 (1). 181–196.
Bahār, Mohammad Taqi. 1994 [1942]. Sabkšenāsi. Tārix-e tatavvor-e nasr-e fārsi [Stylistics. History of the evolution of Persian prose], 3 vols. Tehran: Amir Kabir.
Benveniste, Emile. 1938. Sur un fragment d’un psautier syro-persan. Journal asiatique 230. 458–462.
Bossong, Georg. 1985. Empirische Universalienforschung. Differentielle Objektmarkierung in den neuiranischen Sprachen. Tübingen: Gunter Narr.
Durkin-Meisterernst, Desmond. 2014. Grammatik des Westmitteliranischen (Parthisch und Mittelpersisch). Vienna: Verlag der Österreichischen Akademie der Wissenschaften.
Feuillet, Jack. 2006. Introduction à la typologie linguistique. Paris: Honoré Champion.
Jensen, Hans. 1931. Neupersische Grammatik. Heidelberg: Carl Winter.
Karimi, Simin. 1990. Obliqueness, specificity, and discourse functions: rā in Persian. Linguistic Analysis 20 (3–4). 139–191.
Key, Gregory. 2008. Differential object marking in a Medieval Persian text. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, 227–247. Newcastle: Cambridge Scholars Publishing.
König, Ekkehard & Martin Haspelmath. 1998. Les constructions à possesseur externe dans les langues d’Europe. In Jack Feuillet (ed.), Actance et valence dans les langues de l’Europe, 525–606. Berlin and New York: Mouton de Gruyter.
Lazard, Gilbert. 1963. La langue des plus anciens monuments de la prose persane. Paris: Klincksieck.
Lazard, Gilbert. 1970. Etude quantitative de l’évolution d’un morphème: la postposition rā en persan. In David Cohen (ed.), Mélanges Marcel Cohen, 381–388. The Hague-Paris: Mouton.
Lazard, Gilbert. 1982. Le morphème rā en persan et les relations actancielles. Bulletin de la Société de Linguistique de Paris 77 (1). 177–208.
Lazard, Gilbert. 1994. L’actance. Paris: Presses Universitaires de France.
Lazard, Gilbert. 1998. Actancy [English translation of L’actance]. Berlin and New York: Mouton de Gruyter.
Lazard, Gilbert. 2006. Grammaire du persan contemporain. Tehran: Institut Français de Recherche en Iran.
Lenepveu-Hotz, Agnès. 2014. Manuscrits persans et linguistique diachronique: la loi du provisoire? De la difficulté à saisir un morphème dialectal, mar. In Nalini Balbir and Maria Szuppe (eds.), Lecteurs et copistes dans les traditions indiennes, iraniennes et centrasiatiques. Eurasian Studies XII (1/2). 101–120.

Lenepveu-Hotz, Agnès. 2016. L’emploi de mar. . . rā chez Firdausī: simple raison métrique ou cause linguistique? In Céline Redard (ed.), Des contrées avestiques à Mahabad, via Bisotun. Études offertes en hommage à Pierre Lecoq, 259–276. Neuchâtel and Paris: Recherches et publications.
Meunier, Annie & Pollet Samvelian. 1997. La postposition rā en persan. Ses liens avec la détermination et sa fonction discursive. Les cahiers de grammaire 22. 187–232.
Moayedi, Mona & Ahmad Reza Lotfi. 2013. Barresi-e sāxt-e do maf‘uli dar motun-e adab-e fārsi [Double object construction in the Persian literary texts]. Majalle-ye pažuhešhā-ye zabānšenāsi 5 (1). 101–119.
Newman, Andrew J. 1994. Daštakī, ‘Aṭā-Allāh, (d. 1506, 1511, or 1520), a scholar of Hadith in Khorasan in the late Timurid and early Safavid periods. Encyclopædia Iranica VII (1). 100. http://www.iranicaonline.org/articles/dastaki (accessed 22 January 2017).
Ovčinnikova, I. K. 1956. Ispol’zovanie posleloga rā v proizvedenijax tadžikskix i persidskix klassičeskix avtorov [The use of the postposition rā in the works of classical Persian and Tajik authors] (XI–XV vv.). Trudy Instituta Jazykoznanija (Moskva) 6. 392–408.
Paul, Ludwig. 2003. Early Judaeo-Persian in a historical perspective: The case of the prepositions be, u, pa(d), and the suffix rā. In Ludwig Paul (ed.), Persian origins: Early Judaeo-Persian and the emergence of New Persian, 177–194. Wiesbaden: Harrassowitz.
Paul, Ludwig. 2008. Some remarks on the Persian suffix -rā as a general and historical linguistic issue. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, 329–337. Newcastle: Cambridge Scholars Publishing.
Payne, Doris & Immanuel Barshi. 1999. External possession. Amsterdam and Philadelphia: John Benjamins.
Phillott, Douglas Craven. 1919. Higher Persian grammar. Calcutta: University Press.
Pottier, Bernard. 1968. L’emploi de la préposition a devant l’objet en espagnol. Bulletin de la Société de Linguistique de Paris 63 (1). 83–95.
Salemann, Carl & Valentin Shukovski. 1925. Persische grammatik mit literatur, chrestomathie und glossar. Berlin: Reuther und Reichard.
Storey, Charles. 1927–1977. Persian literature. A bio-bibliographical survey, 2 vols. London: Luzac and Co.

Lutz Rzehak

6 Fillers, emphasizers, and other adjuncts in spoken Dari and Pashto

Abstract: This article deals with spoken language or more precisely with spontaneous speech in colloquial Dari and Pashto. The focus is on stereotyped conversation fillers or “filled pauses” that often have been regarded as “disfluencies” since they constitute a delay in the flow of speech. Lexical items and other elements that are used to fill hesitation pauses are syntactically omissible and structurally dispensable. It is argued that many of these elements can be assigned pragmatic functions, such as signaling an upcoming focused word, or the speaker’s intention to plan or code his or her speech. The article studies which lexical items, in general, can be used as adjuncts of that kind in colloquial Dari and Pashto, how they are inserted in fluent speech, and which pragmatic functions they fulfill. Furthermore, the article examines to what extent the choice of a particular lexical item to be used as filler follows common patterns in a speech community, and to what extent the choice is based on individual preferences.

Keywords: Dari, Pashto, spoken language, fillers, communication strategies, pragmatics

1 Introduction

The following phrases were said during an interview conducted by the Federal Office for Migration of Switzerland with an asylum seeker who claimed that she came from Afghanistan. In this part of the conversation the asylum seeker speaks about her migration history.

Lutz Rzehak, Berlin Humboldt-University DOI 10.1515/9783110455793-007

(1)

Female Hazara from Helmand province, about thirty-two years old (I = interviewer, R = respondent)1

I: eh xō awwal bār-e awwal aft sāl bodin da irān bār-e dowwom čand sāl se sāl

Okay. First, the first time you were seven years in Iran and the second time how many years? Three years?

R: se sāl ah se sāl / aft sāl awwal / bād az aft sāl dige raft- se sāl čiz-ē kam se sāl afġānestān zendegi kadēm masalan elmand-u kābol / bād az u āmadēm se sāl da irān

Three years. Yes. Seven years first, then after seven years we went – we lived in Afghanistan for almost three years, for example, in Helmand and Kabul. After that we went to Iran for three years.

I: xō besyār xub

Okay, very well.

R: masalan koll-e az iyā mēša sēzda sāl

And [Or: Well,] all of it makes thirteen years.

I: sēzda sāl

Thirteen years.

In this part of the conversation the respondent uses the lexical item masalan twice. The first time this word is used in its original meaning and function as an adjunct for adducing examples when the speaker mentions some places of residence in Afghanistan. In this case masalan can be translated as ‘for example’, ‘for instance’. For the second usage of masalan such a meaning cannot be admitted and a translation as ‘for example’, ‘for instance’ would not make any sense. In this case the lexical item masalan can be given several meanings or functions. It can be understood as a coordinating or joining conjunction and one can translate it as ‘and’, correspondingly. But since masalan is used here immediately after a response by the interviewer, we may also assume that here masalan is used by the interviewed person as a discourse connective for retaking the turn of the conversation. Hence masalan can be given the meaning of ‘yeah’, ‘um’, ‘well’, ‘but’, or something similar in this case. From a syntactical and semantic point of view, the second masalan can be removed or discarded without compromising the core idea of the utterance. In written language the lexical item masalan would hardly ever be used in the way it appears the second time in this part of the conversation. Here its usage, undoubtedly, is a special feature of spoken language or, more precisely, of spontaneous speech as realized in the course of an oral face-to-face conversation.

1 Here and in all following cases no punctuation marks are used in the transcription of free speech. A slash marks noticeable prosodic breaks that are accompanied by a pause and last approximately up to half a second. Aborted lexical units are marked by a hyphen at the end. In this interview the interviewer uses the colloquial standard of Dari that is based on the old dialect of Kabul. Most linguistic features of the respondent can also be assigned to this variety. For a description of its phonology, morphology, and syntax, see Farhadi (1955).

Face-to-face communication follows special linguistic rules. Structural elements are the opening and closure of conversation, strategies for taking the turn, holding the turn, and yielding the turn of the conversation, procedures for the production of meaning and the assurance of understanding (paraphrase, repair, etc.). Typical traits of spoken language that are instrumental in the organization and contextualization of conversation are short, often incomplete sentences, mixing of sentence structures, and a more frequent use of discourse particles. Other traits are caused by the fact that in spoken language the language production process is much more compact and compressed as compared to written language. The language planning process includes planning the utterance with regard to what to say, retrieving the words and integrating them into a sentence, articulating the sentence, and monitoring the output.2 Features related to that are, among others, hesitation phenomena, pauses, speech errors, unexpected discontinuities in the expression of ideas, difficulties with word retrieval, and self-repair. Among these features of spoken communication, this article deals with stereotyped conversation fillers. Often fillers are considered as “filled” or “nonsilent pauses” or as “disfluencies” since they constitute a delay in the flow of speech.3 Fillers can be non-lexical items like “eh”, “em”, “hmm”, as well as lexical items such as masalan ‘for example’, ‘for instance’ in example (1) given above. It is a general feature of fillers that they are often syntactically omissible and structurally dispensable.
This study is based on the idea that often fillers are assigned pragmatic functions, such as signaling an upcoming focused word, or the speaker’s intention to plan or code his or her speech. In many cases they are associated with referential meaning or they are used for some additional highlighting. The focus of this article is on lexical fillers. The article studies which lexical elements, in general, can be used as fillers in colloquial Dari and Pashto and how they are inserted in fluent speech. Furthermore, the article will address to what extent the choice of a particular lexical item to be used as filler follows common patterns in a speech community, and to what extent the choice is based on individual preferences.

The main focus of this article is on Dari. For this study 105 sequences in Dari from thirty-one informants were analyzed. Speakers of Dari belong to a multilingual society with Pashto as another prevalent language and numerous phenomena of language contact.4 On these grounds twenty-nine sequences in Pashto from six informants were also included in the analysis for purposes of comparison. I take my material from recordings of colloquial speech that I have made in various parts of Afghanistan during the last twenty years and from recordings of interviews conducted by the Federal Office for Migration of Switzerland with migrants from Afghanistan. These interviews were made anonymous and kindly provided to me for research.5 These interviews are structured as dialogues between two native speakers (no interpreter was involved), which allows study of the mentioned phenomena as part of conventional discourse strategies.

2 See Bussmann (2006: 651‒652, 1115).
3 For disfluencies in spontaneous speech, see Brennan and Schober (2001: 274‒275) and Horne (2006: 265).

2 Fillers in spoken Dari

At first glance, it becomes evident that most fillers can be considered hesitation pauses that have been “filled” for planning the utterance. Many lexical items are used in that function:

Table 1: Lexical elements that are used as fillers in spoken Dari

Lexical item        Meaning                                        Number of occurrences in 105 sequences
ba hesāb            ‘so to speak’ (lit. ‘calculated’)              10
čiz                 ‘thing’                                        9
masalan             ‘for example’                                  9
če                  ‘what’                                         7
b-estelāh           ‘so to speak’, ‘quasi’                         7
maqsad              ‘what I mean’ (lit. ‘purpose’, ‘intention’)    5
xō                  ‘yet’, ‘but’; ‘after all’                      4
xolāsa              ‘in summary’, ‘briefly’                        3
ba ġoul-e ma’ruf    ‘as the word is’, ‘like they say’              3
ami                 ‘that’                                         2
kam-i               ‘a bit’                                        2
baxšeš begōyam      ‘I apologize for saying’                       2
diga                ‘else’                                         1
mesl-e              ‘like’, ‘as’                                   1
če begom            ‘what shall I say’                             1
rāst-eš             ‘in reality’, ‘in fact’                        1

4 On Pashto-Dari bilingualism in Afghanistan and contact-induced language phenomena in these languages, see Kiselova (1982) and Rzehak (2012). On the theory of language contact studies, see Riehl (2009) and Thomason and Kaufman (1988: 35‒64). 5 I thank the Federal Office for Migration of Switzerland for allowing me to use these interviews for this study.

2.1 Filled hesitation pauses

The most commonly used filler is ba hesāb (lit. ‘calculated’, ‘counted’) that in colloquial language is often used in the meaning of ‘quasi’, ‘so to speak’, ‘in a manner of speaking’, ‘in a way’, or ‘sort of’. It can also be used with ezāfa in the form of ba hesāb-e. However, when inserted into spontaneous speech as filler, none of the mentioned meanings, obviously, seems to be appropriate. Mostly it is used for filling a hesitation pause that was inserted for planning the following parts of the utterance.

(2)

Male Tajik from the city of Herat, thirty-nine years old6

mā amu manteγe-yi ke be hesāb zendegi mikadim kōče-ye dudālun be hesāb bein-e e pāasār-u be hesāb čārsu nāraside čārsu. . . unğā hamām-i bud hamām-e azizi. . . bād?? be hesāb-e be rū be rū az ū be hesāb-e čiz bud kūče-i bud dāxel-e az ū kūča mā bodim

That neighborhood where we so to speak lived [was] the Dudalunstreet, in a manner of speaking between Pa-ye Asar and Charsu, before Charsu. . . . There was a bathhouse, the Azizi bathhouse. . . . And so to speak across from it was quasi what, there was a street. We were in this street.

The second most common lexical item used for filling hesitation pauses is čiz ‘thing’. This lexical item can be considered a temporary placeholder, i.e., it must be replaced by something else when the utterance is completed. Usually the speaker has an idea about the syntactic structure of the utterance, but he or she is not yet certain about a particular word or expression. This unclear word or expression is temporarily replaced by čiz until the necessary word comes to his or her mind. In the following examples čiz is translated as ‘thing’ or ‘what’.

6 The sign /γ/ stands for a voiceless uvular fricative that is typical of the dialect of Herat. It corresponds to the voiceless uvular plosive /q/ and voiced uvular fricative /ġ/ of the colloquial standard of Dari. It is more voiced than /q/ but less voiced than /ġ/. The actual pronunciation of /γ/ can vary between a half-voiced uvular plosive and a voiced fricative, depending on the position within a word. See Ioannesyan (1999: 32).

(3)

Male Tajik from the city of Herat, thirty-three years old7

I: šomā xod-etān arōsi kadin

Are you married?

R: mā čiz kadom eh nekāh kadom diga waxt ba arōsi našod moškel dištom āmadom birun

I made the thing, the marriage ceremony but no time was left for the wedding. I had problems and I went abroad.

In this sequence čiz stands for nekāh ‘marriage ceremony’. In example (4) it stands for momtāz ‘talented’.

(4)

Male Hazara from Behsud province, fifteen years old8

I: rāğe ba ami maktab če yād-et ast če mitānin bogōyin az daure-ye maktab

What do you remember about that school, what can you tell about the school days?

R: darsā-ye mō ke pāyin bod az u diga kelāsā az senfā-ye bālātar yak nafar ke hamintur čiz-e kelās bod momtāz bod u-ra rawān můkadand da kelās-e az mō bar-e mō dars můgoft

In my lower classes a person from the higher classes who was the thing, [the most] talented [of his class] was sent to our class and he gave classes to us.

When we say that čiz is a placeholder, this does not necessarily mean that the syntactic structure must be completed in exactly the same way it was planned originally. In example (5) the usage of čiz shows that the speaker originally had intended to create another structure of his utterance, maybe with an adjective like saxtgir ‘strong’, but he changed his mind and gave a periphrastic description of his father’s behavior.

7 In this sequence the past stem of dāštan ‘have’ appears as dišt, which is also a characteristic feature of the dialect of Herat (my observation). Furthermore, in colloquial Dari for reasons of modesty, the personal pronoun for the first person singular ma(n) can be replaced by the pronoun for the first person plural mā (pluralis modestiae), in these cases the verb can also occur in the form of the first person plural or in the form of the first person singular as in example (3). 8 Example (4) was given in the dialect of the Hazaras (Hazāragi), where /ō/ corresponds to /ā/ of the colloquial standard. The sign /ů/ denotes a long and stable back vowel between /u/ and /ō/, which is characteristic of most varieties of Hazāragi. It is more closed than /ō/ and more open than /u/. It appears mainly in the verbal prefix of many verbs (cf. mi-/mē- in standard Dari). See Efimov (1965: 12).

(5)

Male Qizilbash from the city of Ghazni, eighteen years old

I: xub diga da xāna kodām kumak mikadi kār mikadi čiz mikadi

Well, did you help at home; did you work there or something like that?

R: ma xō xord-e xāna bodim xord-e xāna mēgan sag-e xāna as har če ke mēgoftan mēkadim ma ham xub padar-am bāz čiz bod namimānd ke ziyād birun berom

After all I was the youngest in the house. And people say that the youngest in a house is like the dog of the house. I did everything that they told me. Well, and my father was what, he didn’t allow me to go outside often.

In a similar function the interrogative če ‘what’ or sometimes the rhetorical question če mēgan ‘how is it said?’ can be used as placeholder.

(6)

Male Dari-speaking Pashtun from Paghman province, nineteen years old

I: xod-e maktab-etān da koğā bod

Where was your school?

R: maktab-e mā da če kōte-ye sangi amu taraf-e če / če mēgan da pēš-e dēwānbēgi

My school was what, in Kote-ye Sangi in the direction of what, how it is called, in front of Dewanbegi.

The literal meaning of a filling expression often falls aside. This can be seen by the fact that a particular expression may be pronounced in a strongly reduced form, like, for example, b-eslā (< ba estelāh ‘quasi’, ‘so to speak’) in example (7).

(7)

Hazara from Jaghori district of Ghazni province, over forty years old

I: zabān-e mādari-yetān če ast

What is your mother tongue?

R: zabān-e mādari-ye mā ami zabān-e azāragi b-eslā dari mitānim begōyim

My mother tongue is Hazaragi, so to speak, we can say Dari.

2.2 Fillers as connectors

In some cases it is difficult to define exactly in which function a particular expression is used as filler. Some lexical items are inserted not only for filling hesitation pauses but for connecting words or phrases or for expressing the meaning relationship between one utterance and the following one. In these functions, according to my data, the lexical items xolāsa ‘in summary’, ‘briefly’, b-estelāh ‘quasi’, ‘so to speak’, maqsad ‘what I mean’, or masalan ‘for example’, ‘for instance’ can be used. Thus in example (8) xolāsa can be considered not only a filled pause. From the point of view of semantics, it combines two clauses, but it could be displaced without doing serious harm to the idea of the utterance.

(8)

Male Hazara from Qarabagh province, sixteen years old

I: sabzimaidān če qesm ğāy ast

What kind of place is Sabzimaidan?

R: sabzimaidān az xod-e darwāze-ye tehrān yak du istgāh bālātar miša xolāsa unğā ke hast harruza bāzār a

It is one or two stations above the Teheran gate. What I mean, there bazaar is every day.

2.3 Fillers used for highlighting

Example (8) indicates another important pragmatic function that fillers can fulfill. They can be inserted for highlighting a particular element of an utterance. In example (8) the speaker might have said unğā harruza bāzār a ‘There bazaar is every day’. But he didn’t say that. Instead he said unğā ke hast harruza bāzār a. The commonly used expressions ke hast ‘which is’ (present tense) or ke bod ‘which was’ (past tense) are often inserted for highlighting a preceding element of an utterance. They can be translated as ‘as for . . .’ or ‘as far as . . . is concerned’, but such a translation is not compulsory, as examples (9) and (10) show.

(9)

Male Hazara from Jaghori district of Ghazni province, age unknown9

I: bāz nazr-e bibi goftin da u če as

Then you mentioned [the custom] nazr-e bibi. What is [in] that?

R: nazr-e bibi ke asta amčunān čiz a maqsad yak nazr můkona amu-ra amu nazr-a xānā ke nazr kada bāz ğam můša můxōra amu rōz xoš tēr můna diga

As for Nazr-e bibi, this is a similar thing. What I mean: People give offerings and these offerings – those houses that have given offerings gather and they spend that day well.

9 Notice the use of maqsad ‘what I mean’ as connector as described above.

(10)

Male Hazara from Jaghori district of Ghazni province, age unknown10

I: sāl-e nau goftin sāl-e nau če mikonan

You mentioned the New Year. What do people do at New Year?

R: sāl-e nau masalan amu id amu sāl-e nau ke asta diga ke nauruz můša nauruz můgōyi diga amu-rā nauruz-rā migirand

New year, for example, this festival, what concerns the new year, it becomes Nauruz, people call it Nauruz, and people celebrate this Nauruz.

2.4 Fillers as expression of uncertainty

Some lexical items that are inserted for filling hesitation pauses are clear symptoms of a speaker’s uncertainty and of his indecisiveness about what to say and how to say it. In this function a great variety of lexical items can be used, and some of them one would hardly expect to be fillers like, for example, dobāra ‘again’ in example (11).

(11)

Male Hazara from Jaghori district of Ghazni province, sixteen years old11

yā ṭrēkṭar kerāya můkadi yā masalan če gāw yā xar-mar kerāya můkadi balde če-ši balde u-ra ke dobāra čiz můkadi xarmanikōbi můkadi u-ra dobāra da bād midādi i čizā

Either one rented a tractor or, for example, one rented a cow or a donkey or so for what, for again doing, what, for threshing at the barn-floor and for what, winnowing and so.

2.5 Fillers as discourse markers

Often fillers are used as discourse markers in order to address the interlocutor directly and to maintain the conversation. These are often expressions of the Persian ta’ārof (Persian form of civility emphasizing deference and social rank) or rhetorical questions.

10 Notice that in the dialect of the Hazaras the verb form of the second person singular can be used for expressing impersonal forms, i.e., můgōyi stands here for ‘people say’, ‘one says’. 11 In this sequence the retroflex /ṭ/, the preposition balde ‘for’ (cf. standard barāye) and the enclitic pronoun for the third person singular and plural ‑ši (cf. standard ‑aš, ‑eš or ‑ašān, ‑ešān) must be mentioned as special features of the dialect of the Hazaras.

(12)

Male Hazara from Jaghori district of Ghazni province, twenty-one years old

I: če kešt-u kār mēša unjā

What is cultivated there?

R: unjā masalan amin gandom masalan kešt mikona diga baxšeš begōyim kačālu kešt mikonan piyāz kešt mikonan dehāt ai diga

There, for example, that wheat, for example, is cultivated, else, I apologize for saying, potatoes are cultivated, onion is cultivated; after all it is a rural place.

(13)

Male Hazara from Jaghori district of Ghazni province, over forty years old

I: kodām mazāmin-a mixāndin da maktab

Which subjects did you learn at school?

R: da maktab-e uskōl da maktab-e uskōl xō mā mazāmin-e moxtalef as wale mā az dari dāštim riyāzi dāštim xedmat-e šomā arz konom ke pašto dāštim qorān-e šarif dāštim

In the school? In the school we had different subjects. We had Dari, we had mathematics – and I make so bold as to say to you – we had Pashto, we had the Holy Koran.

(14)

Male Tajik from Qarabagh province, seventeen years old

mādar-am har birun ke miraft kati čādar miraft az xāter-i ke če mēgand če raqam bar-etān begom az xāter-i ke polis u iyā šak nakona sar-e mā

Whenever my mother went outside she wore a veil in order to – how it is said, how shall I say this to you – so that the police and these people wouldn’t have any doubt on us.

The rhetorical question fāmidi / pāmidi ‘Got that?’ is also often inserted into fluent speech as a discourse marker and for filling a hesitation pause. An Arab speaker, whom I interviewed in Mazar-e Sharif, used it twenty times within a speech sequence with a length of twenty minutes. When a speaker asks in every second sentence fāmidi ‘Got that?’ as this Arab did, he does not really expect an answer. This qualifies this rhetorical question as filler. An answer to this rhetorical question is neither expected nor given. The question is inserted to address the interlocutor and to produce a pause for planning the rest of the utterance. It is worthy of mention that the lexical item fāmidi ‘Got that?’ was used with a similar frequency by other Arab speakers whom I met in Balkh province. This qualifies this filler as a special feature of the Persian dialect of the Arabs of Balkh province.

Fillers, emphasizers, and other adjuncts in spoken Dari and Pashto

(15) Male Dari-speaking Arab from Balkh province, fifty years old
ma az kalime-ye xān bad mibarom / fāmidi / hič waxt ma ami andēwālāye xod-a xān namēgom hič

I don’t like the word Khan. Got that? I never call my friends Khan.

(16) Male Dari-speaking Arab from Balkh province, fifty years old
bače-ye gōsfand-a bara mēgim / bače-ye boz-a bozġāla / boz-e nar-a taka mēgim / fāmidi / boz-e nar-a taka mēgim / boz-e nār-ē ke mā xāyē-š-a kašida bāšim u-ra ke da u diga nari nakona u-ra bāz serka mēgim / fāmidi / boz nar as walē xāyē-š-a kašida ke diga i diga mardi nadāšta bāša / una serka

We call the young of a sheep bara, the young of a goat bozġāla. We call a male goat taka. Got that? We call a male goat taka. A male goat the testicles of which were extracted so that it would not cry anymore, we call it serka. Got that? It is a male goat but its testicles were extracted so that he would not have manhood. This is a serka.

In this respect religious formulas must also be mentioned. The expression wallāh ‘I swear by God!’ is often used at the very beginning of an utterance. It can be regarded as an expression that is inserted for retaking the turn of the conversation.

(17) Tajik from the district of Soltan-Saheb of Ghazni province, twenty-four years old
I: da afġānestān az kodām qesmateš hastin

Which part do you come from in Afghanistan?

R: wallāh mā az ġazni-stim woloswāli-ye soltān sāheb

I swear by God! I am from Ghazni, from the district Soltan-Saheb.

It is due to the semantics of this formula that wallāh ‘I swear by God!’ is often used when negative answers are given. In the culture of Afghanistan negative answers are regarded as something bad and people try to avoid such answers. The religious formula is evidently meant to underline the truthfulness of a statement that cannot be avoided.

(18) Male Hazara from Jaghori district of Ghazni province, age unknown
I: aw dāštin

Did you have water?

R: wallāh ziyād aw nadāštim čun xošksāli šoda

I swear by God! We didn’t have much water because there was a drought.

Lutz Rzehak

(19) Tajik from the city of Ghazni, twenty-four years old
I: diga zabānhā-rā ham balad hastin masalan englisi bāša paštō bāša

Do you understand other languages, for example, English or Pashto?

R: wallāh nē paštō-rā yād nadārom englisi-rā ham yād nadārom farz-e mesāl fārsi-rā mēfāmom

I swear by God! No, I do not know Pashto, I do not know English either, but I know, for instance, Farsi.

2.6 Filling longer hesitation pauses

If we admit that hesitation pauses are inserted for planning the rest of an utterance, it follows that longer pauses provide more time for planning. In order to fill longer hesitation pauses, filling words that are semantically close to each other can be combined. Typical combinations are taqriban hodud-e ‘approximately about’, masalan farz-e mesāl ‘for example, for instance’ or wallāh hodud-e ‘I swear by God! About’.

(20) Male Qizilbash from the city of Ghazni, eighteen years old
I: wa i maktab če qesm bod maktab-e majāni bod yā pul mipardāxtin

What was this school like? Was the school free of charge or did you pay?

R: maktab xō maktab-e afġāniyā bod amu nafar bod ke masalan mesāl diplōm gerefta bod az irān az maktab-e irāniyā ba mā dars midād

The school, after all the school was a school for Afghans. There was a person, for example, for instance, who got his diploma from Iran, from an Iranian school. He taught us.

Example (21) is of special interest because three fillers are combined with each other.

(21) Male Tajik from Ghazni province, twenty-four years old
I: če qadr waqt mēša ke āmadin

How long has it been since you arrived?

R: wallāh odud-e ya’ni az xāne-ye mā ke arakat kadim yak panj šaš māh mišawa

I swear by God! About, I mean, since I set out from my home it has been about five or six months.


2.7 Iranian influence

Some Afghans who spent some years in Iran have a special sympathy for the Persian language of Iran. Consequently they may also use some expressions for filling hesitation pauses that are not common in colloquial Dari but are common in the colloquial Persian of Iran. In example (22) this is the expression ba ġoul-e ma’ruf ‘as the word is’, ‘like they say’. According to Dari phonology, this expression should be pronounced ba qaul-e ma’ruf (with the consonant /q/ instead of /ġ/ and with the diphthong /au/ instead of /ou/), but even in this pronunciation the expression is not common in Dari. It is rather popular in Iran and must be considered an Iranian influence on the linguistic behavior of these speakers.

(22) Afghan Hazara, who was born in Iran and went to school there for five years, twenty-three years old
I: kār mikadin če qadr barāyetān midādan

You worked. How much did they pay you?

R: pul bastagi dāšt diga ba ġoul-e ma’ruf u zamān masalan ke mā kār můkadim bača bodom u zamān xeili kam bod masalan ruz-i panğsad tuman kār mikadom xub da kārxāna ba’d az in ke ba ġoule ma’ruf bozorg šodim xō bištar kār mikadom ruz se hazār tuman čār hazār tuman

The money depended, like they say. At that time, when I was working, I was a child. At that time the pay was very low, maybe 500 tomans per day. Well, in the factory, after I had, like they say, grown up, I worked more: 3,000 or 4,000 tomans per day.

3 Fillers in spoken Pashto

Dari and Pashto are very different languages from a historical-genetic point of view. Dari is a west-Iranian language with a rather analytical structure, while Pashto is an east-Iranian language with a synthetic structure. However, both languages coexist in the multilingual society of Afghanistan and many contact-induced linguistic phenomena are apparent. It is no surprise, therefore, that Dari and Pashto to a large extent share a common vocabulary in the field of fillers. Most Pashto fillers are used the same way as in Dari. Most of them are omissible from the point of view of semantics and grammar. The Dari filler ba hesāb ‘so to speak’ has the Pashto equivalent pə hisāb that is used similarly. The speaker in example (23) is bilingual. Pashto is his first language, but he also knows Dari on the level of a second native language. This may have influenced his lexical choice for filling hesitation pauses in Pashto.

Table 2: Lexical items used as fillers in Pashto with Dari equivalents

Pashto                   Dari equivalents   Meaning
pə hisāb, hisāb-kitāb    ba hesāb           ‘so to speak’ (lit. ‘calculated’)
misāl                    mesāl              ‘for instance’, ‘for example’
no                       diga               ‘then’, ‘so’, ‘hence’
če dəi / da              če (h)asta         ‘what is’
poh šwei                 fāmidi             ‘Got it?’
xo                       xō                 ‘yet’, ‘but’; ‘after all’
wallāh                   wallāh             ‘I swear by God!’

(23) Pashtun from Kunduz province, thirty-five years old
I: tāse yāstəi dǝ afġānistān dǝ kom qismat na

Which part do you come from in Afghanistan?

R: zə pǝ asl ki dǝ xānābād dǝ ahtāǧ yǝma pǝ qaum niyāzai yǝma pǝ hisāb wilāyat-e kunduz rāʣi dǝ xānābād munga pǝ hisāb dǝ ahtāǧ yāstu dǝ kunduz

I am by origin from Khanabad, from Ahtaj, by tribe I am Niyazai, so to say, this belongs to Kunduz province, from Khanabad, I am, so to say, from Ahtaj, from Kunduz.

(24) Pashto-speaking Pashtun from Khost province, seventeen years old
I: musāferi ʦənga teregi

How is your journey going?

R: musāferi xo wallāh ḍer pə muškilāt bānde teregi

My journey is after all – I swear by God! – with many difficulties.

A Pashto equivalent of the Dari expression fāmidi ‘Got that?’ can be seen in example (25). This speaker is also bilingual, but his knowledge of Dari as a second language has hardly influenced his lexical choice in Pashto because no Arabs can be found in the province of Kunduz where this speaker comes from. The expression poh še ‘Got it?’ is used by numerous speakers of Pashto from various places and it can be considered an original Pashto filling expression here:

(25) Pashtun from Kunduz province, fifteen years old
I: hamdəlta paidā šǝwai ye hamdǝlta loy šǝway ye hamdǝlta de žwand kǝṛai

Were you born there and did you grow up there? Did you live there?

R: ho dəlta paidā šǝm pǝ čārdaro ki paidā šǝwai yǝm poh še dǝ kunduz čārdaro ki u amǝlta paidā šǝwai yǝma poh še

Yes, I was born there. I was born in Chardara. Got that? In Chardara from Kunduz. There I was born. Got that?

The Pashto equivalents of the Dari expression ke asta ‘which is’, used for highlighting, are če dəi (masculine) or če da (feminine). As in Dari, they are inserted for highlighting a preceding element of an utterance. They need not be direct copies of Dari, although the speaker in example (26) comes from a district where Dari is widely used as a first language even by many Pashtuns. The expression če dəi (masculine) or če da (feminine) can also be encountered in the speech of numerous Pashtuns from other places with no knowledge of Dari.

(26) Pashtun from Surkhrod district of Nangarhar province, about twenty-four years old
I: dǝ tāso kəli ki ka na kǝli pǝ bāra ʦǝ mung ta mālumāt rākawe ʦǝ tārif kawe

As far as your village is concerned, what do you say about your village, what can you describe?

R: emunga kəlai če dǝi kalāgāni di / har saṛai xpǝla kālā lari

As for our village, there are fortresses. Everyone has his fortress.

A filler that can be considered an exclusive feature of Pashto against Dari is the expression ka na ‘isn’t it?’, ‘isn’t that so?’ which usually is placed after the word or expression it refers to. It can be considered a rhetorical question inserted for highlighting.

(27) Male Pashto-speaking Pashtun from Kunduz province, fifteen years old
I: yau dāktar dəi

Is there one doctor?

R: dāktarān ləg xo na di ka na ziyād di ka na

Doctors are not few, are they? There are many, aren’t they?

(28) Male Pashtun from Nangarhar province, seventeen years old12
I: ʦə karǝl kegi tāso ʣāy ki

What is cultivated at your place?

R: har ʦə kegi xo biyā ziyātar xalk mǝxki ʦo kāluna mǝxki taryāk ʦǝ kāwǝ ka na taryāk ziyāt kedǝ

Everything is grown there, but previously, some years ago, most people produced, what, a lot of opium, didn’t they; opium was cultivated a lot.

The popularity of the filling expression ka na ‘isn’t it?’, ‘isn’t that so?’ can be seen in the fact that both the interviewer and the interviewee use it in the following part of the same interview.

(29) Male Pashtun from Nangarhar province, seventeen years old
I: də kom kǝli yāstǝi

Which village do you come from?

R: šerzād markixēl

Sherzad Markikhel.

I: au markixēl če dəi ka na dā cǝ šai dǝi

And as for Markikhel? What is that?

R: markixēl ləka če šērzād yau ʣāy wu ka na misāl xabara yau qariya da

Markikhel was a place like Sherzad, wasn’t it? It is, so to speak, a village.

The fact that ka na is used not only for highlighting but also for filling a hesitation pause in order to plan the utterance becomes evident in example (30), in which ka na is added to the conjunction ʣəka ‘because’, which is really unusual.

(30) Male Pashtun from Paktia, sixteen years old
həlta xo ṭilifun nǝšta ʣǝka ka na hǝlta dāse hisāb da če žranda kār nakawi

There are no telephones because, evidently, the mills (for electricity) do not work.

4 Summary

The analysis of 105 speech sequences in Dari from thirty-one speakers and twenty-nine sequences in Pashto from six speakers brought to light a great variety of lexical items and expressions that in one way or another are used as fillers or emphasizers. Fillers in Dari and Pashto show many similarities due to the close proximity of the speech communities in the multilingual society of Afghanistan and mutual linguistic influences. Generally speaking, all speakers of Dari and Pashto can resort to these pools of lexical items and expressions when inserting fillers in their speech, but some preferences are evident. Some lexical items or expressions proved to be the individual preference of a particular speaker. This does not mean that other speakers would not use these lexical items and expressions at all, but these particular speakers used them strikingly more often than other speakers. The expression ba hesāb ‘so to speak’ was most often used by a thirty-nine-year-old Tajik from Herat. The expression b-estelāh ‘so to speak’, ‘quasi’ was used by a Hazara from Jaghori, over forty years old, much more often than by other speakers. The lexical item xolāsa ‘in summary’, ‘briefly’ was preferred over other lexical items by a sixteen-year-old Hazara from Qarabagh. The expression ba ġoul-e ma’rūf ‘as is known’ proved to be an exclusive feature of the language of Afghans who lived in Iran for a while and who show special sympathy for the Persian language of Iran. The rhetorical question fāmidi ‘Got it?’ was used by speakers who belong to the group of Persian-speaking Arabs in northern Afghanistan, but by no other speaker. It can be considered a dialect feature of the Persian dialect of the Arabs in northern Afghanistan. In Pashto, the most popular are constructions with ka na ‘isn’t it?’ and with če dəi / če da ‘as for . . .’, ‘as far as . . . is concerned’, which are used for highlighting a particular element of an utterance. Their structural and semantic Dari equivalent če (h)ast ‘which is’ is similarly popular in colloquial Dari. The interrogatives če (Dari) and ʦə (Pashto) ‘what’ are likewise used as placeholders.

The idea that fillers can be considered filled pauses for planning an utterance is not wrong. But the analysis shows that at the same time the described lexical elements can fulfill various functions from the point of view of pragmatics and discourse strategies. They can be used as discourse markers, as connectors, for highlighting an element of an utterance, or for covering up uncertainty and indecisiveness of a speaker regarding what to say and how to say something.

12 Notice that in example (28) the interrogative pronoun ʦə ‘what’ is used as a placeholder in a similar way as če in Dari.

References

Brennan, Susan E. & Michael F. Schober. 2001. How listeners compensate for disfluencies in spontaneous speech. Journal of Memory and Language 44 (2). 274–296.
Bussmann, Hadumod. 2006. Routledge dictionary of language and linguistics. Translated and edited by Gregory Trauth and Kerstin Kazzazi. London & New York: Routledge.
Efimov, Valentin Aleksandrovič. 1965. Yazyk afganskix xazara: yakaulangskiǐ dialekt [The language of the Afghan Hazaras: The dialect of Yakaulang]. Moskva: Nauka.
Farhadi, Abd-ul-Ghafûr. 1955. Le Persan parlé en Afghanistan. Grammaire du Kâboli. Paris: Centre National de la Recherche Scientifique.
Horne, Merle. 2006. The filler EH in Swedish. Lund University, Centre for Languages & Literature, Dept. of Linguistics & Phonetics, Working Papers 52. 65–68.
Ioannesyan, Julij Arkadʹevič. 1999. Geratskiǐ dialekt yazyka dari sovremennogo Afganistana [The dialect of Herat of the Dari language of modern Afghanistan]. Moskva: Vostochnaya literatura RAN.
Kiseleva, Lidiya Nikolaevna. 1982. Dvuyazyčie pašto-dari v Afganistane [Pashto-Dari bilingualism in Afghanistan]. Narody azii i afriki 6. 94–99.
Riehl, Claudia Maria. 2009. Sprachkontaktforschung. Eine Einführung. Tübingen: Gunter Narr Verlag.
Rzehak, Lutz. 2012. How to name universities? Or: Is there any linguistic problem in Afghanistan? ORIENT. Deutsche Zeitschrift für Politik, Wirtschaft und Kultur des Orients 53 (II). 84–90.
Thomason, Sarah Grey & Terrence Kaufman. 1988. Language contact, creolization and genetic linguistics. Berkeley, Los Angeles, & London: University of California Press.

Youli Ioannesyan

7 The historically unmotivated majhul vowel as a significant areal dialectological feature

Abstract: The article focuses on the majhul e /ê/ in modern Tajiki and most of the Afghan Persian (Dari) dialects, which occurs as a result of the historically short /i/ changing into /ê/ in a closed syllable when followed by pharyngeal consonants /h/ or /‘/. Because /ê/ in such cases does not go back to historically long /ē/ and is fully determined by environmental factors, I define it as “historically unmotivated”. However, since this phenomenon does not occur in Persian varieties, including the Khorasani Persian dialects of western Afghanistan, it should be regarded as an important distinguishing feature marking, along with other characteristics, the linguistic border between the Persian dialects of Iran (in a broader sense), on the one hand, and the Tajiki and Afghan-Persian varieties, on the other. The article is based on various published (mostly Russian) studies on Tajiki, Afghan Persian, and Iranian Persian and on the field materials I collected in Afghanistan.

Keywords: Persian, Afghan Persian and Tajiki dialectology, Iranian linguistics, Iranian studies, classification of Persian dialects

The three closely related languages – Iranian Persian, Afghan Persian (Dari), and Tajiki – form a vast continuum of varieties, stretching from western Iran to Afghanistan and Central Asia (Tajikistan, partly Uzbekistan).

It is not easy to draw a geographical border between the dialects of Persian proper, those of Afghan Persian, and those of Tajiki based on purely linguistic factors, as these varieties overlap and merge into one another. It is therefore reasonable to conceive of this whole area as a single linguistic continuum within which three major groups can be defined, namely, western (western and central Iran), central (north-eastern Iran and north-western Afghanistan), and eastern (central and northern Afghanistan, Tajikistan, and parts of Uzbekistan).1

What is implied by majhul vowels in Iranian linguistics is long /ē/ and /ō/ (with possible variations) going back to classical Persian. This article deals with the former vowel with regard to the abovementioned linguistic continuum. This historically long (majhul) /ē/ has become /i/ in modern Iranian Persian,2 but is preserved in Afghan Persian (and Tajiki) as a long /ē/: دیر dēr ‘late’.

In this article, I use bold for historic phonemes (in classical Persian). Slant brackets / / indicate Iranian/Afghan Persian and Tajiki phonemes, square brackets [ ] enclose particular phonetic realizations (allophones) of Iranian/Afghan Persian and Tajiki. It is also worth noting that /e/ in Tajiki (“е” in the Cyrillic-based original Tajiki script) always stands for majhul /ē/ (IPA /eː/). For the Herati dialect, I also need to distinguish between the close mid /e/ and open mid /ε/ and between these two and /ә/. The latter symbol in a final position implies a variant of æ and stands for a near open front unrounded vowel, but slightly reduced and, for this reason, with a less clear articulation.

One of the most remarkable features of Tajiki, distinguishing it from Iranian Persian, is the changing of the historic short and historic ma’ruf long vowels into majhul vowels in a closed syllable when followed by glottopharyngeal consonants [h] or [ʔ]. This phenomenon as a universal characteristic of Tajiki merits special attention. According to V. S. Rastorguyeva (1964: 29), the changing of /i/ (which in Tajiki combines respective short and long vowels) into [e] (phoneme /ē/) is most common, for it is typical of the literary language and of all the local varieties. As will be shown later, this shift also occurs in a vast majority of Afghan Persian dialects. This article focuses on the changing of historic short and long ĭ, ī into majhul /ē/. Because the latter in such cases does not go back to historic long ē and is fully determined by environmental factors, I define it as “historically unmotivated”. However, since this phenomenon does not occur in Persian varieties, including the Khorasani Persian dialects of western Afghanistan, it should be regarded as an important distinguishing feature marking, along with other characteristics, the linguistic border between the Persian dialects of Iran (in a broader sense), on the one hand, and the Tajiki and Afghan Persian varieties, on the other hand (or between the western and central groups of dialects, on the one hand, and the eastern, on the other hand).

Let us now consider a few examples from Tajiki dialects disregarding some possible minor qualitative differences in the articulation of [e] (/ē/) between certain varieties of this language3:

1 The central group consists of Khorasani (type) dialects. For more information on these three groups, see my articles: Ioannesyan (1995, 2007).
2 With the exception of some dialects.
3 While in some Tajiki dialects this vowel has a very close variant that sounds halfway between [i] and [e], it may be realized as a diphthongoid in some others (Sokolova et al. 1952: 164; Sokolova 1949: 10). This led researchers engaged in field studies to transcribe it differently. The existence of various phonetic forms does not, however, obscure the general picture, for all the variants of majhul /ē/, regardless of qualitative differences, represent the same phoneme, which occupies a distinct place in the vowel system of the respective dialect without merging with /i/. I changed the original Cyrillic script for Tajiki to Latin for the sake of convenience. However, whenever necessary, I will indicate the Cyrillic spelling of a word in literary Tajiki in a note.

Youli Ioannesyan, Institute of Oriental Manuscripts of the Russian Academy of Sciences
DOI 10.1515/9783110455793-008


Table 1: Samples from Tajiki dialects illustrating the “historically unmotivated” majhul

e(h) < ĭh
  ‘knot’: Badakhshani5 /gire/6; Kulabi /gәre/10; Qarataqi /gireh/12
  ‘fat’: Qarategini /farbe/14; Ashti /farbeh/16; Kassansayi /farbeh/; Leninabadi /farbeh/; Kanibadami /farbeh/
  ‘labor’: Badakhshani /menat/20; Kulabi /mehnat/21; Varzabi /mehnat/22; Farghanayi /mehnat/; Sokhi /mehnat/; Chusti /menat/
  ‘peasant’: Badakhshani /de(h)qon/23; Qarategini /dehqun/24; Darvazi /dekun/25; Varganza26 /deqon/27
  ‘better’: Kulabi /be(h)tar/⁄/be/28; Goroni29 /betar/30; Panjakenti /beh/31

e(h) < īh
  ‘true’: Badakhshani /saheh/7

e(ʔ), e(h)4 < ĭʔ
  ‘fortune’9: Darvazi /toleh/8; Panjakenti /toleh/11
  ‘subject’: Kulabi /tobeʔ/13
  ‘district’: Darvazi /tobeh/15
  (personal name): Kulabi /vose/17
  (personal name): Kulabi /neʔmatulo/18
  ‘evil-doer’: Kulabi /badfeʔl/19

4 Arabic /ʔ/ in borrowed words is regularly replaced with /h/ in some southern Tajiki varieties.
5 The Tajiki varieties of the Badakhshan area on the Tajiki side of the border between Tajikistan and Afghanistan.
6 Rosenfeld (1982: 80). Here and below in the column the word in literary Persian, in Arabic script: گره, in literary Tajiki (in Cyrillic script): гиреҳ.
7 Murvatov (1982: 140 n. 966). The word in literary Persian, in Arabic script: صحیح, in literary Tajiki: саҳеҳ.
8 Rosenfeld (1956: 201). Here and below in the column the word in literary Persian, in Arabic script: طالع, in literary Tajiki: толеъ.
9 The examples illustrating the shift e(h) < īh and e(h)

*pidar ‘father’, sepâh > *sipâh ‘army’, and sekke > *sikke ‘coin’.

4.3 Vowel harmony

The harmony of the penultimate vowel with the ultimate one in (mostly) two-syllable words is another feature seen in the phonology of spoken Persian, which is common but not regular. Changes such as sebil > sibil ‘moustache’ and kelid > kilid ‘key’ exist along with cases such as sefid > *sifid ‘white’. It seems that the only regular vowel harmony is the raising of o > u in words with the syllable structure CV-CVC, as in doruq > duruq ‘lie’, šoluq > šuluq ‘crowded’, dorud > durud ‘salute’, soqut > suqut ‘fall’, qorur > qurur ‘pride’, lozum > luzum ‘necessity’, forud > furud ‘landing’, and many other similar cases. The o > u shift also happens in a very few cases that have nothing to do with the vowel harmony noted above and that no specific phonological rule can account for, as in xânom > xânum ‘lady’, nâxon > nâxun ‘fingernail’, and xord > xurd ‘scattered’.
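The regular part of this pattern can be sketched as a small string rule. The following Python fragment is only an illustration, not the author's formalization: the function name, the vowel inventory, and the condition that the second vowel must be u (inferred from the listed examples) are assumptions of the sketch.

```python
import unicodedata

# Vowel symbols of the transcription used in the text (an assumption of this sketch).
VOWELS = set("aeoâiu")


def raise_o_before_u(word: str) -> str:
    """Return the spoken form of a CV-CVC word in which o is raised to u before u."""
    word = unicodedata.normalize("NFC", word)  # keep â and š as single characters
    if len(word) != 5:
        return word  # the rule is stated only for CV-CVC words
    c1, v1, c2, v2, c3 = word
    consonants_ok = all(ch not in VOWELS for ch in (c1, c2, c3))
    if consonants_ok and v1 == "o" and v2 == "u":
        return c1 + "u" + c2 + v2 + c3
    return word


for written in ["doruq", "šoluq", "soqut", "sefid", "xânom"]:
    print(written, ">", raise_o_before_u(written))
```

The sketch deliberately leaves sefid and xânom unchanged: the first shows that e > i harmony is not regular, and the second belongs to the isolated o > u cases that the rule does not cover.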

5 Deletions

5.1 Vowel deletions

There are several instances in spoken Persian in which vowels of written Persian are deleted. One example is the deletion of /o/ in the following verbs:

Spoken vs. written Persian: Is Persian diglossic?


mi-foruš-am > mi-fruš-am ‘I sell’, mi-gozar-ad > mi-gzar-e ‘it passes’, mi-gozar-ân-am > mi-gzar-un-am ‘I spend [the time]’, or in the noun âlominiyom > âlminiyom ‘aluminum’. Such a deletion leads to verbal roots with initial consonant clusters, which are not possible according to the phonology of standard Persian (also see below). Another example is the deletion of /e/ in the following verbs: mi-šenav-am > mi-šnav-am ‘I hear’, mi-šekan-am > mi-škan-am ‘I break’, mi-šenâs-am > mi-šnâs-am ‘I recognize’, mi-neš-ân-am > mi-nš-un-am ‘I settle’, mi-sepâr-am > mi-spâr-am ‘I trust’, mi-šemor-am > mi-šmor-am ‘I count’, and mi-ferest-am > mi-frest-am ‘I send’; along with the noun fâyede > fâyde ‘use’. Again, this is not absolute: verbs such as mi-neš-ân-am > mi-šun-am ‘I settle’ (mi-nš-un-am and mi-šun-am are variants of the same verb, one step apart in the change) and mi-keš-ân-am > mi-keš-un-am ‘I pull [somebody]’ are almost identical in terms of phonotactics but behave differently when used in spoken Persian. An example of the deletion of /a/ in a verb is andâxtan ‘to drop’: mi-andâz-am > mi-ndâz-am ‘I drop’. The vowels /a/ and /e/ are deleted as the initial vowels of the possessive enclitics when adjacent to the final vowel of the base noun, as in sandaliy=am > sandali=m ‘my chair’, pâ-hâ-y=ešân > pâ-hâ=šun ‘their feet’, and zânuy=at > zânu=t ‘your knee’, in which both the hiatus-breaking -y- and the initial vowel of the clitic are removed. A similar process takes place for the preposition barâye ‘for’ with the enclitics: barây=am > barâ=m ‘for me’, barây=at > barâ=t ‘for you’, barây=aš > barâ=š ‘for him/her’, etc.

5.2 Consonant deletions

Among the several consonant deletions, some are not regular, such as the deletion of /t/ in dust=at dâr-am > dus=et dâr-am ‘I love you’, or xâstegâr > xâs(s)egâr ‘suitor’. We may refer to some major ones, which seem to be more common, such as the deletion of the final consonants in the terminating consonant clusters: for example, čand > čan ‘how many’, past > pas ‘wicked’, češm > češ ‘eye’, mošt > moš ‘fist’, fekr > fek ‘thought’, and dast > das ‘hand’.4 This deletion is quite common, and the final consonant is heard only when a vowel follows it, such as dast-e rezâ ‘Reza’s hand’.
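The simplification of word-final clusters described here can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the function name and vowel inventory are hypothetical, every non-vowel symbol is treated as a consonant, and the rule is applied categorically although the text describes it only as "quite common".

```python
import unicodedata

# Vowel symbols of the transcription used in the text (an assumption of this sketch).
VOWELS = set("aeoâiu")


def simplify_final_cluster(form: str) -> str:
    """Drop the last consonant of a word-final two-consonant cluster."""
    form = unicodedata.normalize("NFC", form)  # keep č, š, â as single characters
    ends_in_cluster = (
        len(form) >= 3
        and form[-1] not in VOWELS
        and form[-2] not in VOWELS
    )
    # When a vowel follows (e.g. the ezafe in dast-e), the form does not end
    # in a cluster, so the consonant is kept.
    return form[:-1] if ends_in_cluster else form


for written in ["čand", "češm", "fekr", "dast", "dast-e"]:
    print(written, ">", simplify_final_cluster(written))
```

Forms such as qat’ or šam’, which the text treats separately under compensatory lengthening, are outside the scope of this sketch.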

4 Persian does not allow consonant clusters with more than two consonants. Therefore, the pronunciation of words with final clusters attached to another consonant will result in the deletion of the final consonant of the cluster, as in dastkeš > daskeš ‘gloves’, dastmâl > dasmâl ‘handkerchief’, asbsavâri > assavâri ‘horse riding’, past-fetrat > pasfetrat ‘wicked’.


Behrooz Mahmoodi-Bakhtiari

But in terms of the specific consonant deletions, we may refer to /h/ deletion in several places. The glottal /h/ is normally deleted after long vowels, such as siyâh > siyâ ‘black’, kolâh > kolâ ‘hat’, masih > masi ‘Christ’, and sotuh > sotu ‘restlessness’. Naturally, /h/ is pronounced when a vowel follows it, as in siyâh-i ‘darkness’ and kolâh-â ‘hats’. Apart from sobh > sob ‘morning’, in which the final /h/ is deleted after the consonant, we may see some other deletions of /h/ in the middle of the word, which might take place with or without the compensatory lengthening of the adjacent vowel. In words such as mašhad > mašad ‘Mashhad City’, and tashih > tasi ‘correction’, we do not recognize a compensatory lengthening, while in the pronunciation of words such as qahr > qa:r ‘anger’, behzâd > be:zâd ‘Behzad’, e’teqâd > e:teqâd ‘belief’, and šahr > ša:r ‘city’, the deletion of the glottal stop /’/ or glottal fricative /h/ results in the lengthening of the previous vowel. Some words also represent the deletion of a consonant with compensatory lengthening of the non-adjacent vowel, such as qat’ > qa:t ‘cut’, šam’ > ša:m ‘candle’, tanhâ > ta:nâ ‘alone’, and jam’ > ja:m ‘group’. An interesting case of /h/ deletion takes place in šâhzâde > šâzde ‘prince’, in which the three-syllable written Persian word is reduced to a two-syllable one, in which the deletion of the final /h/ of the first syllable has resulted in the deletion of the vowel of the second syllable too. Nor is /h/ the only consonant that may be deleted. The glottal stop /’/ is also deleted in words such as daf’e > dafe ‘turn’, na’lbeki > nalbeki ‘saucer’, ta’ârof > târof ‘offer’, and al’ân > alân ‘now’. A similar deletion may also be seen for /r/ in tašrif > tašif ‘presence’ and /l/ in aslan > asan ‘altogether’.
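The most regular of these processes, the loss of word-final /h/ after a long vowel, can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes that â, i, and u are the long vowels of the transcription; the word-medial deletions and compensatory lengthening discussed above are not modeled.

```python
import unicodedata

# Assumption of this sketch: â, i, u are the long vowels of the transcription.
LONG_VOWELS = set("âiu")


def drop_final_h(form: str) -> str:
    """Delete a word-final h that directly follows a long vowel."""
    form = unicodedata.normalize("NFC", form)  # keep â as a single character
    if len(form) >= 2 and form[-1] == "h" and form[-2] in LONG_VOWELS:
        return form[:-1]
    # h is kept when a vowel follows it (siyâh-i) or when it follows a
    # consonant (sobh, which the text treats as a separate case).
    return form


for written in ["siyâh", "kolâh", "masih", "sotuh", "siyâh-i", "sobh"]:
    print(written, ">", drop_final_h(written))
```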

5.3 Syllable deletions

Syllable deletions in spoken Persian are not regular at all, and normally take place in verbs. The scattered cases that we know include the deletion of /ne/ in words such as mi-nešin-am > mi-šin-am ‘I sit’, and na-nešast-am > na-šest-am ‘I did not sit’, along with the deletion of /go/ in mi-go-zâr-am > mi-zâr-am ‘I put’. The words xodâ hâfez and âqâ mirzâ, which turn to xodâfez ‘goodbye’ and âmirzâ ‘Mr Mirzâ’ in casual speech, are examples of syllable deletions in non-verbal words (nouns, adjectives, etc.) of spoken Persian.

5.4 Non-syllabic cluster deletions

There are also cases in which an adjacent vowel and consonant are deleted without having formed a syllable, such as in čahâr > čâr ‘four’, čehel > čel ‘forty’, pirâhan > piran ‘dress’, mo’allaq > mallaq ‘suspended’, golule > gulle ‘bullet’, and the verbs mi-âvar-am > mi-yâr-am ‘I bring’, and dar-bi-y-âvar > dar-bi-y-âr, or dar-âr ‘take out!’ The two verbs biyâvar ‘bring!’ and biyandâz ‘drop!’ turn to biyâr and bendâz respectively, and further exhibit irregular phonological changes.

6 Insertion

Since spoken Persian basically focuses on reducing the linguistic items, insertion of phonemes is not so common in it. However, there are several examples of consonant and vowel insertions, such as the vowel insertion in the words mehrbân > mehrabun ‘kind’, kârgar > kâregar ‘worker’, and pâsbân > pâsebân ‘police’. Examples of consonant insertion basically deal with the gemination of specific consonants, such as be-par > be-ppar ‘jump!’, be-pâ > be-ppâ ‘beware!’, do-tâ > do-ttâ ‘two items’, tipâ > tippâ ‘(angry) kick’, be-kan > be-kkan ‘remove!’.

7 Morphological alternations

7.1 Free morphemes

Spoken Persian has some specific morphemes that seem to be uniquely its own. These morphemes are not used in written Persian, and act differently in terms of being lexical or functional. For example, the copula ast ‘is’ shows itself as =e in un medâd=e ‘that is a pencil’, and -s(t) in in rezâ-s(t) ‘this is Reza’. Also, verbs such as vâysâdan ‘to stop, to stand up’ (cf. written Persian istâdan), pâ šodan ‘stand up’ (cf. written Persian boland šodan), or words such as var ‘direction’, bâše ‘all right, OK’, yavâš ‘slowly, softly’, gošne ‘hungry’, vâse ‘for’, and âre ‘yes’ are all specifically found within the lexicon of spoken Persian. Among the items mentioned, the verb vâysâdan (or vâstâdan) stands out as its verbal root is -st-, which is basically impossible in written Persian morphology. It should be kept in mind that this verb cannot be treated identically with spoken Persian verbs such as mi-škan-am ‘I break’, mi-šnav-am ‘I hear’, or mi-šnâs-am ‘I recognize’, as these verbs are the reduced forms of written Persian verbs, while vâstâdan is used on its own and has no written Persian equivalent. The conjugation of this verb is also interesting and unique:

Table 2: Conjugation of the verb istâdan in spoken Persian

Simple present                        vây-mi-st-am, vây-mi-st-i, vây-mi-st-e, . . .
Negative present                      vây-ne-mi-st-am, vây-ne-mi-st-i, vây-ne-mi-st-e, . . .
Simple past                           vây-st-âd-am, vây-st-âd-i, vây-st-âd-Ø, . . .
Negative past                         vây-na-st-âd-am, vây-na-st-âd-i, vây-na-st-âd-Ø, . . .
Present subjunctive                   vây-s(t)-am, vây-s(t)-i, vây-s(t)-e, . . .
Present durative                      dâr-am vây-mi-st-am, dâr-i vây-mi-st-i, dâr-e vây-mi-st-e, . . .
Imperative (positive and negative)    vâ(y)-st-â, vây-na-st-â / *vâ(y)-st-Ø, *vâ(y)-na-st-Ø

Another important feature of spoken Persian morphology is the addition of personal enclitics to prepositions such as az ‘from’, be ‘to’, and bâ ‘with’. This addition does not change the initial vowel of the clitic when the preposition ends in a consonant (az=am ‘of me’), but with be and bâ, which end in vowels, a vowel harmony process takes place and changes the vowel of the clitic: be-h=emun ‘to us’, bâ-h=âšun ‘with them’. This never happens in written Persian.

Table 3: Personal enclitics with prepositions

az    az=am, az=at, az=aš, az=amun, az=atun, az=ašun
be    be-h=em, be-h=et, be-h=eš, be-h=emun, be-h=etun, be-h=ešun
bâ    bâ-h=âm, bâ-h=ât, bâ-h=âš, bâ-h=âmun, bâ-h=âtun, bâ-h=âšun
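The pattern in Table 3 can be restated procedurally. The following Python sketch is only an illustration and not part of the original analysis: it assumes that the alternation can be summarized as "keep a after a consonant-final preposition; after a vowel-final preposition, insert a linking -h- and copy the preposition's final vowel". The names attach_enclitic, ENCLITIC_SKELETONS and VOWELS are hypothetical, and the rule is checked only against the three prepositions of Table 3.

```python
import unicodedata

# Illustrative sketch of the Table 3 pattern; all names are hypothetical.
VOWELS = set("aeiouâ")

# Consonantal skeletons of the spoken Persian personal enclitics.
ENCLITIC_SKELETONS = {
    "1SG": "m", "2SG": "t", "3SG": "š",
    "1PL": "mun", "2PL": "tun", "3PL": "šun",
}


def attach_enclitic(preposition: str, person: str) -> str:
    """Attach a personal enclitic to a preposition, in the notation of Table 3."""
    # Normalize so that 'â' and 'š' are single code points.
    preposition = unicodedata.normalize("NFC", preposition)
    skeleton = unicodedata.normalize("NFC", ENCLITIC_SKELETONS[person])
    final = preposition[-1]
    if final in VOWELS:
        # Assumed generalization: linking -h- plus a copy of the final vowel.
        return f"{preposition}-h={final}{skeleton}"
    # Consonant-final preposition: the clitic keeps its vowel a.
    return f"{preposition}=a{skeleton}"


for prep in ("az", "be", "bâ"):
    print(prep, [attach_enclitic(prep, p) for p in ENCLITIC_SKELETONS])
```

Treating the clitic as a consonantal skeleton plus a context-dependent vowel is a modelling convenience for this sketch; the text itself describes the process as vowel harmony on the clitic.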

Among the free functional morphemes, we may count tu ‘in’ (cf. written Persian dar, dâxel), dam ‘near, by’ (cf. written Persian kenâr), dar ‘in front of, before’ (cf. written Persian jelo), and vâse ‘for’ (cf. written Persian barâye), along with the question words ki ‘who’ (cf. written Persian ke) and či ‘what’ (cf. written Persian če) and their compounds: harki ‘everyone’, hičči ‘nothing’, hiški ‘no one’. The question words ki and či have their own distributional behavior, too, as they can be pluralized with the plural marker -â (written Persian -hâ), while ke and če cannot:

(4) *ke-hâ bâ mâ mi-ây-and? > ki-y-â bâ mâ mi-y-ân?
    ‘Which people will come with us?’

(5) *če-hâ râ mi-šav-ad dar havâpeymâ bord? > či-y-â ro mi-še bord tu havâpeymâ?
    ‘What items may be taken on board?’

Spoken vs. written Persian: Is Persian diglossic?

Another question word that may be pluralized in spoken Persian is key ‘when’, in the form keyâ:

(6) *key hâ âzâd-tar-i? > key-â âzâd-tar-i?
    ‘At which times are you freer?’

7.2 Bound morphemes

The verbal endings of spoken Persian (SP), especially in the present tenses, are different from those of written Persian (WP):

Table 4: Personal verb endings in WP and SP

Person                        1SG    2SG    3SG    1PL    2PL    3PL
WP                            -am    -i     -ad    -im    -id    -and
SP                            -am    -i     -e     -im    -in    -an
SP (for xâstan and âmadan)    -m     -y     -d     -ym    -yn    -n

As Table 4 shows, the verbs xâstan ‘to want’ and âmadan ‘to come’ have a conjugation of their own, since their spoken Persian stems end in a vowel. Table 5 shows the conjugation of these verbs in written Persian and spoken Persian:

Table 5: Conjugation of the verbs xâstan and âmadan in WP and SP

Person     1st                 2nd                 3rd
WP (SG)    mixâham/ miâyam     mixâhi/ miâyi       mixâhad/ miâyad
SP (SG)    mixâm/ miyâm        mixây/ miyây        mixâd/ miyâd
WP (PL)    mixâhim/ miâyim     mixâhid/ miâyid     mixâhand/ miâyand
SP (PL)    mixâym/ miyâym      mixâyn/ miyâyn      mixân/ miyân
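The relation between the ending sets of Table 4 and the forms of Table 5 can be made explicit with a small Python sketch. It is offered only as an illustration under stated assumptions: the segmentation into the prefix mi-, a stem, and an ending is assumed here, as are the stems xâh-/ây- for written Persian and xâ-/yâ- for spoken Persian, which the tables do not spell out. The names ENDINGS and conjugate_present are hypothetical.

```python
# Illustrative sketch of Tables 4 and 5; all names are hypothetical.

# Ending sets from Table 4, ordered 1SG, 2SG, 3SG, 1PL, 2PL, 3PL.
ENDINGS = {
    "WP": ["am", "i", "ad", "im", "id", "and"],
    "SP": ["am", "i", "e", "im", "in", "an"],
    # Special spoken set, given in the text only for xâstan and âmadan.
    "SP_VOWEL_STEM": ["m", "y", "d", "ym", "yn", "n"],
}


def conjugate_present(stem: str, ending_set: str) -> list[str]:
    """Build the six present forms as mi- + stem + ending."""
    return ["mi" + stem + ending for ending in ENDINGS[ending_set]]


# Assumed stems: written xâh-/ây-, spoken xâ-/yâ-.
print(conjugate_present("xâh", "WP"))
print(conjugate_present("xâ", "SP_VOWEL_STEM"))
print(conjugate_present("ây", "WP"))
print(conjugate_present("yâ", "SP_VOWEL_STEM"))
```

The ending set is passed explicitly rather than inferred from the stem, because the text attests the special set only for these two verbs.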

8 Affixes

Among the derivational affixes, the suffix -aki seems to be specific to spoken Persian. It is used to form adverbs such as zur-aki ‘forcefully’, piš-aki ‘in advance’, yavâš-aki ‘silently’, râst-aki ‘really’, moft-aki ‘for free’, dozd-aki ‘sneakily, in a sneaky manner’, doruq-aki ‘falsely’, šâns-aki ‘by chance’, xar-aki ‘foolishly’, zirzir-aki ‘covertly’, and holhol-aki ‘hurriedly’. It also attaches to some infinitives to make adverbs such as xâbidan-aki ‘in a lying manner’ and vâstâdan-aki ‘in a standing manner’. Another suffix is -u, which forms adjectives denoting a physical or moral characteristic from nouns or verb stems: sibil-u ‘moustache wearing’ (sebil/ sibil ‘moustache’), tars-u ‘cowardly’ (tarsidan ‘to be afraid’), šekam-u ‘having a big appetite’ (šekam ‘belly’).

The subjunctive prefix be-, which attaches to present roots, may be deleted in some spoken Persian sentences, such as age gom (be)-š-e či? ‘What if it gets lost?’ and bâyad kâr (be)-kon-e ‘he should work’. However, in sentences with compound verbs, when the nominal part of the verb is definite, the prefix must appear:

(7) mitarsam eštebâh bokonam/ konam. ‘I am afraid of making a mistake’.
    mitarsam hamin eštebâh ro bokonam/ *konam. ‘I am afraid of making the same mistake’.
    mitunin kâr bokonin/ konin. ‘You can work’.
    mitunin az in kârâ bokonin/ *konin. ‘You can do such things’.
    bâyad fekr bokonin/ konin. ‘You must think’.
    bâyad ye fekri bokonin/ *konin. ‘You must come up with an idea’.

9 Clitics

The personal enclitics that denote both possession and the direct object are slightly different in form in written Persian and spoken Persian:

Table 6: Possessive and direct object personal enclitics

Person    1SG    2SG    3SG    1PL       2PL       3PL
WP        -am    -at    -aš    -emân     -etân     -ešân
SP        -am    -et    -eš    -emun     -etun     -ešun

However, in terms of use, they act very differently. Perry (2003: 22) counts three separate features in the use of the enclitic in spoken Persian: as a subject marker (raft-eš ‘he went’), in combination with a preposition (be-h-eš goft ‘he said to him’), and as attached to a sentence constituent other than these or a verb (če-t-e ‘what’s wrong with you?’). It should be noted, however, that these usages are all to be found in earlier styles of Persian, notably classical poetry, as well as in modern informal colloquial (such as be-irân-at bord ‘took you to Iran’). As another example, direct objects of perfective verbs cannot be denoted by clitics in written Persian, while they can in spoken Persian. The question če kasi to râ zad-e ast ‘Who has beaten you?’ cannot be rewritten as *če kasi zade ast=at?, but ki zad-at=et? is acceptable in spoken Persian. The same holds true for the question kasi ke to râ na-did-e ast? > kasi ke nadid-at=et? ‘Nobody saw you, did he?’ Another difference is that spoken Persian clitics attach to distributive determiners such as hame ‘all’, har ‘every’, and har-do ‘both’, which does not happen in written Persian, as seen in the following examples:

(8) har-do-ye šomâ > *har-do-y=etân > har do=tun ‘both of you’5
    hame-ye mâ > *hame-ye=mân > ham=amun ‘all of us’

These clitics may also be used as secondary predicates in spoken Persian, but not in written Persian:

(9) man bad-axlâq=et ro ham dus dâram. ‘I love you (even when you are) bad-tempered’.
    qeyr-e rasmi=t ham qabule. ‘You are accepted (although) not official’.

10 Compounding

An interesting issue in the study of spoken Persian is how reduplication adds to the lexicon of this variety. Compounding, which appears here mostly as reduplication, forms adverbs or secondary predicates. Written Persian also has sentences such as u râ zende-zende suz-ân-d-and ‘They burnt him alive’. But there are some other reduplicative adverbs that do not seem to be used in written Persian, and that are perhaps among the properties of spoken Persian:

5 For more than two, the quantifier tâ is also added to the number: har se-tâ=tun ‘all three of you’.

(10) teflaki javun-javun mord. ‘Poor fellow, he died young’.
     čerâ čâyi=t ro dâq-dâq mi-xor-i? ‘Why do you drink your tea so hot?’
     dust o rafiqâ-ye bad šuxi-šuxi ‘amali=š kard-an. ‘Bad friends and pals gleefully made him an addict’.
     ba’zi-yâ alaki-alaki be hame jâ res-id-an. ‘Some people have got everywhere so easily’.
     âdam košt-e, râst-râst ham râh mi-r-e. ‘He has committed a crime, and is walking around so safely too’.

10.1 A specific adjective structure: Noun/adjective + bešo

When a Persian infinitive receives the adjective-forming suffix -i, the resulting adjective denotes the potential expressed by the verb, as in xord-an-i ‘edible’ (xord-an ‘to eat’) and did-an-i ‘worth seeing’ (did-an ‘to see’). Sometimes it denotes that the person in question is willing to do something, as in mehmân-hâ-ye mâ raftan-i nistand, mând-an-i hastand ‘Our guests do not seem to be willing to go, they would stay’ (they are not going-like, they are staying-like). In spoken Persian, the meaning of the adjective šodan-i ‘capable of becoming’ is expressed by the imperative verb bešo ‘become!’ together with a noun or adjective, mostly in negative sentences:

(11) in bačče âdam-bešo nist. (âdam ‘human’)
     ‘This child is not treatable’. (lit. ‘not capable of becoming a human’)
     in mâšin az avval ham dorost-bešo nabud. (dorost ‘correct’)
     ‘This car was not repairable from the beginning’.
     in virune dige xune-bešo nemiše. (xune ‘house’)
     ‘This wreckage will not be a house again’. (lit. ‘is not capable of becoming a house’)

11 Syntactic alternations

There are many syntactic differences between written Persian and spoken Persian. Apart from the progressive constructions proposed by Jeremiás (whose specific use in spoken Persian I accept), perhaps the first difference shows itself in agreement and definiteness. Written Persian does not allow adjectives to be pluralized, and adjectival phrases are pluralized by adding -hâ to the head noun, as in pesar-e xub, pesar-hâ-ye xub ‘good boy, good boys’. But the spoken Persian data show that adjectives may also be pluralized, provided that the Ezafe marker (-e) is not pronounced: doxtar-hâye xošgel > doxtar-xošgel=â ‘beautiful girls’ (xošgel ‘beautiful’), bačče hâye porru > bačče porru=â ‘rude kids’ (porru ‘rude’). The same holds for the definite marker -e, which can be added to adjectives in spoken Persian but not in written Persian: doxtar-xošgel-e ‘the beautiful girl’, piremard-badaxlâq-e ‘the bad-tempered old man’. In terms of definiteness, spoken Persian has a definite marker, stressed -é (-yé after -; -hé after other vowels), which may be appended to a singular noun: pesar-é umad ‘the boy (in question) came’, tule-hé mord ‘the puppy died’ (see Perry 2007: 981).

However, the greatest syntactic differences between written Persian and spoken Persian concern word order and movement. Often a preposition is deleted from a sentence, and its modifier is moved as a result. For example:

(12) mixâham be sinamâ beravam. > mixâm beram Ø sinamâ. ‘I want to go to the cinema’.
     u râ be esfahân ferestâdim. > ferestâdim=eš Ø esfahân. ‘We sent him to Esfahan’.
     be baqal=am biyâ. > biyâ Ø baqal=am. ‘Come into my arms’.
     hamin al’ân be xâne residam. > hamin al’ân residam Ø xune. ‘I arrived home right now’.

The movement of the modifier as a result of preposition deletion may not take place when other, parallel clauses follow the main clause with conjunctions:

(13) diruz be sinamâ raftim, be bâzâr ham raftim, be resturân ham raftim. > diruz raftim Ø sinamâ, Ø bâzâr-am raftim, Ø resturân-am raftim.
     ‘Yesterday, we went to the cinema, we went to the market too, and we went to a restaurant too’.

However, the complex (or reciprocating) conjunctions ‘ham . . . ham’ ‘both . . . and’ keep both the movements and the deletions:

(14) diruz ham raftim Ø sinamâ, ham raftim Ø bâzâr, ham raftim Ø resturân.
     ‘We went both to the cinema, and the market, and the restaurant’.

Although such a change seems to take place mostly with the dative preposition be, the destination is not always a concrete noun (15), and the verbs are not necessarily movement verbs (16). Examples for (15) are:

(15) be donbâl=at miâyam. > miyâm Ø donbâl=et. ‘I will come to take you’. (lit. ‘come after you’)
     be sorâq=aš raftam. > raftam Ø sorâq=eš. ‘I went to him’. (lit. ‘went to his situation’)

Examples for (16):

(16) with the verb andâxtan ‘to drop’:
     tofang=at râ ruye zamin biyandâz > tofang=et=o bendâz Ø zamin.6 ‘Drop your gun’. (lit. ‘drop your gun on the ground’)
     čerâ hame čiz râ be gardan=e man miandâzid? > čerâ hame čizo mindâzin Ø gardan=e man? ‘Why do you put all the blame on me?’ (lit. ‘Why do you drop all the blame on my neck?’)
     âxar to râ be zendân mi-andâzand. > âxar mi-ndâzan=et Ø zendun. ‘They will finally put (lit. ‘throw’) you in prison’.
     with the verb gozâštan ‘to put’:
     kolâh=aš râ bar sar=aš gozâšt. > kolâ=š=o gozâšt Ø sar=eš. ‘He put his hat on his head’.
     yek lahze xodat râ be jâye man be-gozâr. > ye lahze xodeto be-zâr Ø jâye man. ‘Put yourself in my place for a minute’.
     gâhi mive’i dar dahân=am migozâram. > gâhi ye mive’i mizâram Ø dahan=am. ‘I sometimes put a fruit in my mouth’.
     with the verb xordan ‘to eat, to hit’:
     be zamin xordam. > xordam Ø zamin. ‘I hit the ground’.
     with the verb gereftan ‘to hold’:
     čerâ tofang=at râ be tarafe man gerefte’i? > čerâ tofangeto gerefti Ø tarafe man? ‘Why have you pointed your gun at me?’

6 The preposition ruye ‘on’ is, of course, deleted only if the item is dropped on the ground (or, metaphorically, down). That is why there is no sentence such as *tofang=et ro bendâz Ø miz ‘drop your gun on the table’.

However, we should keep in mind that the verbs residan ‘reach’ and raftan ‘go’ do not allow such a change when they are used in idiomatic or metaphorical meanings:

(17) bel’axare be ârezu=yam residam. > *belaxare residam Ø ârezu=m. ‘I finally reached my desire’.
     sa’y kon xodat râ be kelâs beresâni. > *sa’y kon xodet ro beresuni Ø kelâs. ‘Try to reach (the level of) the class’.
     dâr o nadâr=aš be yaqmâ raft. > *dâr o nadâr=eš raft Ø yaqmâ. ‘All his belongings were plundered’. (lit. ‘went to nowhere’)
     har če badi karde bud be sar=aš âmad. > *har či badi karde bud umad Ø sar=eš. ‘He suffered all the wrongdoings he had done’. (lit. ‘all his wrongdoings came to his head’)

Directional adverbs can also lead to such movements and deletions:

(18) be bâlâ raftam/ be pâyin âmadam. > raftam Ø bâlâ/ umadam Ø pâyin. ‘I went up/ I came down’.
     be in taraf raftam/ be ân taraf raftam. > raftam Ø in var, raftam Ø un var. ‘I went in this direction/ I went in that direction’.
     be samte čap boro/ be samte râst biyâ. > boro Ø samte čap/ biyâ Ø samte râst. ‘Go to the left side/ come to the right side’.

The deletion of a preposition may also take place without any change in word order, especially when the verbs are static:

(19) če dar dastat ast? > či Ø dastete? ‘What are you holding?’ (lit. ‘What do you have in your hand?’)
     barâye šâm či dârim? > Ø šâm či dârim? ‘What do we have for dinner?’
     čerâ kafš be pâyat nist? > čerâ kafš Ø pât nist? ‘Why aren’t you wearing shoes?’ (lit. ‘Why are shoes not on your feet?’)

Apart from this, the use of clitics is an interesting syntactic issue in spoken Persian. Until recently, many of the Persian endings now known to be clitics were treated as affixes. In the light of newer linguistic findings, however, we know that Persian and most of the contemporary Western Iranian languages make use of enclitic pronouns to mark objects of verbs or possessors of nouns. The major clitics of Persian comprise the personal endings denoting possession, as in ketâb=am (book=first person possessive), and objects, as in zad-am=aš ‘I hit him’ (past of ‘hit’-first person singular=third person singular). The ezafe marker -e and the indefinite marker -i are the other enclitics of standard or written Persian. But there are some other clitics that are contracted forms of free functional morphemes and are used solely in spoken Persian: -â (the contracted form of the plural suffix -hâ), -am for ham ‘also’, and -o, which is an allomorph both of râ (the direct and specific object marker) and of the conjunction va ‘and’. Râ also has a contracted form, r, which acts as a clitic. In spoken Persian syntax, when a pronoun ends in a consonant, the clitic for râ is deleted. Compare:

(20) man ham miâyam > man=am miyâm. ‘I come too’.
     man râ ham bebarid > man=am bebarin. ‘Take me too’.
     ânhâ râ ham mibarim > unâ=r=am mibarim. ‘We will take them too’.

Another important issue in the syntax of clitics in spoken Persian is the fact that they may also be attached to some adjectives, something very unlikely in written Persian:

(21) nemixâham qiyâfeye šomâ xâ’en-hâ râ bebinam. > nemixâm qiyâfeye xâ’en=etun ro bebinam.
     ‘I do not want to see the faces of you traitors’.
     hâlam az didane qiyâfeye toye bišaraf be ham mixorad. > hâlam az didane qiyâfeye bišaraf=et be ham mixore.
     ‘Seeing the face of you wicked makes me sick’.

In the case of the emphatic constructions with xod ‘self’, personal clitics may even get attached to the adjectives:

(22) begozârid xod-e ân bišaraf ham bedânad. > bezârin xod-e bišaraf=eš ham bedune.
     ‘Let that wicked personally know it, too’.
     xod-e ân kesâfat ast. > xod-e kesâfat=eš-e.
     ‘It is the same him, the filthy’. (lit. ‘It’s exactly his filthy self’)

Other examples of specifically spoken Persian syntactic constructions involve special uses of the particle ke. When the particle ke follows the first constituent of an utterance, it acts as an indignant asseverative: to ke man=o košti ‘you (almost) killed me’, mâ ke bâ ham da’vâ nadârim ‘We (really) have nothing to fight over’. In the case of compound verbs, when ke is placed between the noun and the light verb, it indicates the time of the action and means ‘when’: vaqti ke bozorg šodi, mifahmi > bozorg ke šodi, mifahmi ‘You will know it when you grow up’; vaqti ke tamâm šod, nešânat midaham > tamum ke šod, nešunet midam ‘I will show it to you when it is finished’. It can also mean ‘of course’ when placed after the first constituent: pul ke dâštam, vali naxâstam xarj=eš konam ‘Of course I had money, but I didn’t want to spend it’; istgâhe âxar ke nist, vali bâyad piyâde šim ‘This is not, of course, the last station, but we have to get off’ (see also Bâteni 1355/1976; Perry 2007: 985; Mahmudi-Baxtiyâri and Tâjâbâdi 1392/2013).

Sadat-Tehrani (2003) refers to another use of this particle as the “indifference ke-construction”. Drawing on Jackendoff’s theory of “parallel architecture”, he shows that the indifference ke-construction has the following structure and denotes that the content of the verb is of no importance to the speaker:

Verb_i - ke - Verb_i

For this structure (which is not seen in written Persian), we may cite the following example:

(23) nayumad ke nayumad, xodemun mirim.
     ‘It is not important that he didn’t come. We go by ourselves’. (lit. ‘He didn’t come-that-he didn’t come’)

This construction may also denote that the action is totally terminated, and nothing can be done about it:

(24) raft ke raft.
     ‘(I don’t care) that he went’. Or ‘He went (and did not return)’. (lit. ‘He went that he went’)
     (See also Mahmudi-Baxtiyâri and Tâjâbâdi 2013.)

By analogy with Sadat-Tehrani’s indifference ke-construction, I would like to propose an indifference ham-construction, in which the coordinator ham acts like ke, but with a change in the stress pattern of the verbs. That is to say, in this construction the stress of the first verb falls on its last syllable, while it falls on the first syllable of the second one. The verbs are mostly in the simple past tense, as they are supposed to denote a type of conditional structure:

(25) raftì ham ràfti. ‘(It is not important) if you go’. (lit. ‘You went, you went too’.)
     xordìm ham xòrdim. ‘(There is no problem) even if we eat’. (lit. ‘We ate, we ate too’.)

To the list of such symmetrical structures we may add another one, similar in function to na tanhâ . . . balke . . . ‘not only . . . but also . . .’. This spoken Persian construction is as follows:

(26) S1-hičči ‘nothing’-S2 (with ham ‘too’)
     na tanhâ komak nemikonad, balke aziyyat ham mikonad. > komak ke nemikone hičči, aziyyat ham mikone.
     ‘He not only doesn’t help, but also bothers’.

Finally, let’s review some verbs that have additional meanings in spoken Persian and, naturally, different valences in terms of transitivity. Some verbs that are transitive in written Persian can also be used intransitively (with metaphorically different meanings). Some examples are:

(27) boridan ‘to give up, to get tired’ (lit. ‘to cut’):
     inqadr kâr kardam ke dige boridam. ‘I worked so much that I am worn out’.
     tamâm kardan ‘to die’ (lit. ‘to finish’):
     teflaki dišab tamum kard. ‘Poor fellow died last night’.
     mâlidan ‘to finish’ (lit. ‘to rub’):
     refâqate mâ dige mâlide. ‘Our friendship is totally over’.
     kešidan ‘to endure, to tolerate’ (lit. ‘to pull’):
     dige nemikešam. az injâ miram. ‘I cannot stand it any longer. I will leave here’.
     sâxtan ‘to get along with, to tolerate’ (lit. ‘to make’):
     čâre’i nist, bâyad besâzi. ‘There is no choice. You have to tolerate it’.
     kam âvardan ‘to run out of patience’ (lit. ‘to be little of something’):
     xeyli talâš kardam, ammâ dige kam âvordam. ‘I tried so much, but I ran out of patience/energy’.
     qâti kardan ‘to (suddenly) get so angry’ (lit. ‘to mix’):
     ye vaqt qâti mikonam o mizanam tu un dahanet hâ! ‘I may suddenly get angry, and hit you on your mouth, you!’
     javâb dâdan ‘to be of use’ (lit. ‘to answer’):
     sedâqat dige tu in zamune javâb nemide. ‘Honesty does not work out these days’.

In their spoken Persian use, these verbs are all intransitive, whereas in their common use they are transitive and have a different meaning. We conclude our syntactic examples with these items, which illustrate the interface of syntax and semantics in modern Persian.

12 Semantic Alternations

Semantic alternations are not among the issues proposed by Ferguson (1959) in his theory of diglossia. However, providing some examples of the different semantic readings of words and phrases in written Persian and spoken Persian can clarify the different uses of these two varieties even better. Persian words show considerable semantic differences when they are used in spoken Persian. It is not possible to list all of them here. However, the examples below may provide a general view of such differences:

(28)
a. The word tâze ‘new’ is also used as a conjunction in spoken Persian, meaning ‘as a matter of fact, also’: in pârk kolli vasileye bâzi dâre. Tâze estaxr ham dâre. ‘This park has a lot of play equipment. As a matter of fact, it has a pool, too’.
b. The adjective âxar ‘final’ (spoken Persian âxe) may also be used in spoken Persian as an exclamative marker to show complaint or surprise over something: âxe čerâ râsteš=o nemigi? ‘Why (on earth) don’t you tell the truth?’ It also means ‘because’ in spoken Persian: xunašun nemiram, âxe bâhâš qahram ‘I do not go to their house because I am not on speaking terms with him’.
c. The word digar (SP dige) ‘else’ has several uses in spoken Persian beyond its common written Persian one (dige či goft? ‘What else did he say?’), including that of an indignant asseverative marker: to dige harf nazan ‘you don’t speak at all’; dige nemitunam tahammol konam ‘I cannot tolerate it any longer’; qol midam dige dars bexunam ‘I promise, from now on I will study’; dige az man komak naxây-â! ‘Don’t you ever ask me for help!’; boro dige! ‘go then!’; mohandes mixâstan, man ham mohandesi xunde budam dige ‘They needed an engineer; as you know, I had studied engineering’; in dige češe?! ‘What the hell is wrong with this?!’
d. The word kolli ‘general’, with a shift of stress to the first syllable as kólli, means ‘so much, so many’ in spoken Persian: kólli âdam jam’ šodan ‘So many people have gathered’.
e. The structure ham ke šode means ‘even if’ in spoken Persian: piyâde ham ke šode, bâyad xodemun ro beresunim unjâ ‘We have to get there, even if (we go) on foot’.
f. The conjunction ham ‘also’ means ‘even if’ in spoken Persian too: pesare ra’is ham bâši, nemituni beri tu ‘even if you are the boss’s son, you may not enter there’; man be barâdaram ham komak nemikonam ‘I do not even help my brother’.
g. The adverb aslan (spoken Persian asan) ‘not at all’ means ‘altogether, at all’ in spoken Persian: qabl az nâhâr miyâd, albatte age asan biyâd ‘He will come before lunch, if he comes at all’.
h. The adverb hâlâ ‘now’ can act as a discourse marker in spoken Persian in order to change the topic: hâlâ man ye čizi goftam, to čerâ be del gerefti? ‘Well, I said something. Why are you so insulted?’ (lit. ‘Why do you take it to your heart?’).

i. Nâsalâmati (lit. ‘without health’) means ‘you seem to have forgotten that’: nâsalâmati to mard-e in xune hasti! ‘Don’t forget that you are the man of this house!’
j. The verb oftâdan ‘to fall’ means ‘to invite oneself’ in dišab xuneye rezâ oftâde budim ‘We had invited ourselves to Reza’s house’; and ‘go’ in biyoft jolo, râh=o nešun bede ‘go forward, and show the way’.
k. The adjective xâli ‘empty’ paradoxically means ‘full’ in spoken Persian when it follows a word in an Ezafe construction: kabâb kubide čarbi-ye xâliye ‘Minced kebab is full of fat’ (lit. ‘only fat’).
l. The word šekl ‘picture’ means ‘similar’ in spoken Persian: čeqadr šomâ šekl-e hamin ‘How similar you two are’; čerâ šekle gedâ hâ lebâs mipuši? ‘Why do you dress like beggars?’
m. The lexicalized verb nagu ‘don’t say’ means ‘while’ in spoken Persian: fekr kardam sâ’atam ro dozdidan, nagu tu daftaram jâ gozâšte budam ‘I thought my watch was stolen, while I had left it in my office’.
n. When the phrase yèk daf’e ‘once’ is lexicalized, the word yekdaf’è (spoken Persian yedafè) means ‘suddenly’: dâštim bâ ham harf mizadim ke yedafè zad zire gerye ‘We were talking when she suddenly burst into tears’.
o. The word qadr ‘value’ with the Ezafe marker (qadde) means ‘as big as’ or ‘as much as’: tu xiyâbun ye muš didam, qadde ye gorbe ‘I saw a rat on the street, as big as a cat’.
p. In spoken Persian, the two phrases mes(l)e inke ‘like, as’ and ma’lum-e ‘It is clear that’ are used to denote ‘apparently’ or ‘It seems that’: mese inke delet kotak mixâd ‘You are apparently asking for a beating’, ma’lume xodet badet nemiyâd beri unjâ ‘You do not seem to be reluctant to go there’.
q. The adverb ettefâqan ‘accidentally’ may also mean ‘as a matter of fact’ in spoken Persian: avval nemixâstam beram, ba’d fahmidam ke ettefâqan unjâ jâye mane ‘I did not want to go there at first, but later I realized that, as a matter of fact, it was the right place for me’.
r. The verb nemixâd (written Persian nemixâhad ‘he doesn’t want’) means ‘it is not necessary’ in spoken Persian too: mâ nemixâd qâtiye in qaziyye bešim ‘we do not need to get involved with this issue’; šomâ nemixâd kâr konin ‘You are not supposed to work’.

s. The word hame=aš (spoken Persian hamaš ‘all of it’) is used as an adverb of frequency in spoken Persian too, meaning ‘always’: hamaš qor mizane ‘he always nags’.
t. The phrase ân-ham (spoken Persian un-am ‘that one, too’) means ‘especially’ in spoken Persian: in kâr=â zešte, unam az yek ostâd-e dânešgâh ‘These deeds are wrong, especially from a university professor’.
u. The phrase haminjur ‘this way’ also means ‘consecutively, repeatedly’: haminjur harf mizad, saremun ro bord ‘he kept talking, he talked our heads off’.
v. The phrase doròst-e means ‘that’s right’, but the word dorostè is an adverb meaning ‘in one piece’: un mitune to ro dorostè qurt bede ‘he can swallow you in one gulp’.
w. Balke ‘but also’ may mean ‘maybe’ or ‘in the hopes that’: mixâd bere šahr, balke kâri peydâ kone ‘he wants to go to the town, maybe he can find a job’.
x. The lexicalized phrase umadim o (lit. ‘We came, and’) means ‘imagine’ or ‘suppose’: umadim o taraf kutâh nayumad, čekâr konim? ‘Imagine that the guy does not compromise, what should we do?’

13 Conclusion

I hope I have shown that the differences between spoken Persian and written Persian are not limited to some deletions in syntax or some sound shifts in phonology. I agree with Perry that Persian does not provide us with as many unique spoken lexical items as Arabic does, but I also keep in mind Jeremiás’s argument that the unavoidable differences between the two varieties of Persian cannot be accommodated within the study of the “dialects of Persian”, especially now that the number of speakers of spoken Persian has grown, and the variety that was once the dialect of the people of Tehran has become the spoken variety used by almost all Iranian Persian speakers. I have tried to show that, compared with many other known languages, Persian is diglossic in terms of the remarkable differences between its written and spoken forms. If we consider diglossia as a continuum, Persian may have a place on that continuum, although it may not necessarily be close to the classical diglossic languages.

Perry (2003: 26) concludes his article by saying that the real question to be answered is not “How diglossic is Persian?”, but “How did Persian avoid diglossia?” Given the above discussion, I dare to ask “To what extent is Persian diglossic?”

References

Bâteni, M. R. 1355/1976. Este’mâl-e ke, dige, âxe, hâ dar fârsi-ye goftâri [The use of ke-, dige, âxe, and hâ in spoken Persian]. Majalleye Dâneškadeye Adabiyyât va ‘Olum-e Ensâniye Dânešgâh-e Tehrân 93‒94. 257‒271.
Boyle, John A. 1952. Notes on the colloquial language of Persia as recorded in certain recent writings. Bulletin of the School of Oriental and African Studies, University of London 14 (3). 451‒462.
Coulmas, F. 2013. Sociolinguistics: The study of speakers’ choices, 2nd edn. Cambridge: Cambridge University Press.
Deyhim, G. 1368/1989. Gerâyeš hâ-ye âvâyi va vâji-ye fârsi-ye goftâri-ye tehrân [The phonetic and phonological trends of the spoken Persian of Tehran]. Majalleye Zabânšenâsi 6 (2). 97‒105.
Ferguson, Charles A. 1959. Diglossia. Word 15. 325‒340.
Ferguson, Charles A. Diglossia revisited. Southwest Journal of Linguistics 10. 34.
Fishman, J. 1967. Bilingualism with and without diglossia, diglossia with and without bilingualism. Journal of Social Issues 23 (2). 29‒38.
Frommer, Paul Robert. 1981. Post-verbal phenomena in colloquial Persian syntax. Los Angeles: University of Southern California dissertation.
Ghobadi, Chokofeh. 1996. La langue parlée reflétée dans les écrits. Studia Iranica 25 (1). 135‒158.
Henderson, Michael M. T. 1975. Diglossia in Kabul Persian phonology. Journal of the American Oriental Society 95 (4). 651‒654.
Hodge, C. T. 1957. Some aspects of Persian style. Language 33. 355‒369.
Jahangiri, Nader. 1980. A sociolinguistic study of Tehrani Persian. London: University of London dissertation.
Jeremiás, Éva. 1984. Diglossia in Persian. Acta Linguistica 34 (3‒4). 271‒287.
Kaye, A. 2009. Arabic. In B. Comrie (ed.), The world’s major languages, 560‒577. London: Routledge.
Mahmudi-Baxtiyâri, B. 1388/2009. Namâyešnâme-ye Ustâd Noruz-e Pineduz: Manba’i Mohem barâye Motâle’e-ye Fârsi-ye Goftâri-ye Dore-ye Qâjâr [The play Ustâd Noruz, the shoesmith: An important reference for the study of spoken Persian in the Qâjâr era]. Majalle-ye Pažuheš-e Olum-e Ensâni (Bu Ali Sinâ University of Hamedân) 25. 87‒111.
Mahmudi-Baxtiyâri, B. & F. Tâjâbâdi. 1392/2013. Čand kârbord-e kalâmiye digar az ke- dar fârsiye goftâri [Some other pragmatic functions of ke- in spoken Persian]. Motâle’âte Zabân va Guyešhâye Qarb-e Irân 2. 103‒116.
Perry, John. 2003. Persian as a homoglossic language. Iran Questions Et Connaissances in Cultures Et Sociétés Contemporaines III. 11‒28.
Perry, John. 2007. Persian morphology. In Alan S. Kaye (ed.), Morphologies of Asia and Africa, vol. 2, 975‒1019. Winona Lake, IN: Eisenbrauns.
Sadat-Tehrani, Nima. 2003. The indifference ke construction in modern Persian. Linguistica Atlantica 21. 137‒151.
Samareh, Yadollah. 1977. A course in colloquial Persian (for foreigners). Tehran: University of Tehran Press.
Schiffman, H. 1998. Diglossia as a sociolinguistic situation. In F. Coulmas (ed.), The handbook of sociolinguistics, 205‒216. Malden: Blackwell.
Schmidt, Richard Wilbur. 1974. Sociostylistic variation in spoken Egyptian Arabic: A re-examination of the concept of diglossia. Providence: Brown University dissertation.
Vahidiyân, Taqi. 1343/1964. Dastur-e Zabân-e ‘Âmiyâne-ye Fârsi [A grammar of spoken Persian]. Tehran: Amir-Kabir.

Lewis Gebhardt

11 Accounting for *yek ta in Persian

Abstract: Persian has been described as using numeral classifiers in expressions such as se ta ketab ‘three CL book’, i.e. ‘three books’. However, unlike classifiers in many languages, ta is not used with the numeral for ‘one’. Based on a theory of syntactic feature checking, I argue that Persian ta has some characteristics of a number marker but also has properties of numeral classifiers.

Keywords: Persian, number, classifiers

1 Introduction

Persian uses the numeral classifier ta in expressions that enumerate count nouns (Mahootian 1997: 195; Ghomeshi 2003: 55; Lambton 1974: 43–44). Example (1a) illustrates a classifier construction in Persian, paralleling example (1b) in Mandarin, another classifier language.¹ Note that ta is not used with mass nouns, as in (1c).

(1) a. do/se/dæh ta ketab²
       two/three/ten CL book
       ‘two/three/ten books’
    b. yi/liang ge xuesheng
       one/two CL student
       ‘one/two student(s)’ (Sonya Chen, p.c.)³
    c. *čænd ta čai
        how.many CL tea

1 The order numeral + CL + noun is crosslinguistically common but not the only one (Greenberg 1972; Simpson 2005; Aikhenvald 2000).
2 The Persian transcriptions are in broad IPA except: c = IPA ʧ, y = IPA j, š = IPA ʃ, e = IPA ɛ, ey = a diphthong.
3 Many thanks to Sonya Chen for data, judgments and comments.

Lewis Gebhardt, Northeastern Illinois University
DOI 10.1515/9783110455793-012

With regard to (1c), it’s important to note that the expression is ungrammatical on the intended reading of quantifying a mass volume of tea. However, čænd ta čai can be used to ask how many cups, or other units, of tea, where the intended quantification is over the number of cups. This is similar to the general ban on pluralizing mass nouns in English, where nonetheless one can ask “How many teas?” with the intended meaning of how many cups of tea.

While Persian speakers favor using ta in colloquial speech, it isn’t obligatory for grammaticality, and in writing and more formal speech ta is typically omitted. Even in colloquial Persian, ta is sometimes not used, and in some contexts, such as with very large numerals, the use of the classifier is awkward or ungrammatical, as in (2a), a fact consistent with the optional or obligatory absence of classifiers with large numerals in other languages (Aikhenvald 2000: 100). Ta is not used with quantifying words other than numerals, such as xeyli (2b) and ziyad, both meaning ‘many, a lot of’, and hær ‘each’ (2c), though it does appear in the expression čændta ‘how many’ in (2d).

(2) a. 6.022 × 10²³ molkul / *6.022 × 10²³ ta molkul
       6.022 × 10²³ molecule / 6.022 × 10²³ CL molecule
       ‘6.022 × 10²³ molecules’
    b. *xeyli ta ketab / xeyli ketab
        many CL book / many book
       ‘many books’
    c. *hær ta ketab / hær ketab
        each CL book / each book
       ‘each book’
    d. čændta ketab
       how.many book
       ‘how many books?’

In this regard, Persian contrasts with Mandarin, which allows a classifier with non-numeral quantifying elements.

(3) hen duo (ge) xuesheng
    very many CL student
    ‘many students’ (Sonya Chen, p.c.)

Persian doesn’t use classifiers with demonstratives on their own, as in (4a), though it’s acceptable to use a classifier with the demonstrative if a numeral appears, as in (4b). This contrasts, again, with Mandarin, which allows co-occurrence of classifier and demonstrative, as in (4c)⁴; Cantonese also allows a classifier and demonstrative, as in (4d).

(4) a. *un ta pærænde
        that CL bird
    b. un se ta pærænde
       that three CL bird
       ‘those three birds’
    c. zhei ben shu
       this CL book
       ‘this book’ (Cheng and Sybesma 1999: 510 n. 17)
    d. li1/go2 bun2 syu1 hai6 ngo5-ge3
       that/this CL book be 1SG-GE
       ‘that/this book is mine’ (Cheng and Sybesma 2012: 642)

One other point of contrast is that Persian ta never has a determiner-like function, though in Chinese languages classifiers can function like definite and/or indefinite articles (see Cheng and Sybesma [2005] for numerous examples and discussion).

Finally, there are said to be other classifiers in Persian that are less commonly used, such as nafar for people and jeld for books and volumes (Lambton 1974: 43‒44), although Gebhardt (2009) argues that those items are classifier modifiers rather than classifiers per se. The distinction is unimportant for the scope of this article.⁵

4 Cheng and Sybesma point out that zhei ‘this’ might be analyzed as a demonstrative plus classifier (Cheng and Sybesma 1999: 510 n. 17).
5 For the record, besides do ta ketab ‘two CL book’, Persian allows expressions like do jeld ketab ‘two “CL” book’, where jeld is used for books or volumes, apparently replacing ta; I gloss jeld as “CL” (in scare quotes because of the argument that jeld is not really a classifier). Persian also allows do ta jeld ketab ‘two CL “CL” book’. Gebhardt (2009) argues that in such cases ta is the closest thing to a classifier in Persian while jeld and some other words merely modify, by adjunction, ta itself, which in this case and others is not obligatory. Other languages lexicalize both the classifier function and the semantically descriptive function into a single item, as in Mandarin san zhang zhuozi ‘three CL.Furniture table’.

Summarizing, I focus on ta and assume, for simplicity of argumentation but at the slight risk of oversimplification, that ta is required for numerals greater than yek ‘one’ and is barred from appearing with yek, putting aside the expression čændta ‘how many’. It is the narrow purpose of this article to argue that ta isn’t exclusively a classifier and that it also has some properties of the category number, plural in particular. It will be shown that, being plural, ta is then barred from appearing with singular yek. In a broader picture, the idea that ta’s category is squishy between classifier and number is in contrast to theories that see classifier and number as distinct categories. However, at the same time those theories also assert or assume that classifier and number are, more or less, mutually exclusive categories in complementary distribution. Complementary distribution suggests that they might be two manifestations of a single more abstract category, and this is part of the approach here.

The article is organized as follows. Section 2 follows with background literature on classifiers. In section 3, I lay out syntactic assumptions and identify features involved in the syntactic operation of Merge regarding items in the determiner phrase, whereby syntactic features of a lexical or functional item license that item to combine with other elements. In section 4, I address the specific problem of accounting for why the numeral yek ‘one’ doesn’t occur with the classifier ta, and I briefly conclude and look forward in section 5.

2 Background

A well-known proposal to explain the use of classifiers is Chierchia’s (1998) nominal mapping parameter. Theoretically, Chierchia makes the common assumption that a determiner phrase, headed by D, dominates the noun phrase, overtly at least in some languages. Empirically, Chierchia points to attested differences between classifier languages like Mandarin and non-classifier languages like English and French. Generally, classifier languages lack articles and lack general plural marking (Chierchia 1998; also see below). Since classifier languages lack articles, in contrast to English-type languages with articles, it is claimed that nouns don’t require a determiner to be arguments and can appear bare in argument position, as in (5a). English-type languages, however, require an article in count singular contexts, as in (5b). Further, in classifier languages plural marking is not required in plural contexts, as in (5c), again in contrast to English-type languages where the plural must appear, as in (5d). In fact, to the degree that it exists at all, plural morphology is often highly restricted in classifier languages.

(5) a. gorbe xabid-e
       cat sleep.Past-3.SG
       ‘The cat is sleeping’
    b. *(the/a) cat is sleeping
    c. se ta gorbe
       three CL cat
       ‘three cats’
    d. three cat-*(s)

Chierchia notes that the nonappearance of plural and articles in classifier languages has a parallel in English mass nouns, which ordinarily don’t take plural, as in (6a), and don’t require an article to be a verbal argument, as in (6b).

(6) a. *We pumped airs into the tires
    b. (Some) air has been pumped into the tires

Following from earlier analyses of count and mass (Krifka [1995] and Gillon [1992], among others), Chierchia proposes that all Mandarin nouns are in some sense mass, while some English nouns like “cat” and “student” are count. Mass nouns, in Chierchia’s analysis, don’t need articles to be used as arguments and they can’t be pluralized. Further, since Mandarin nouns are mass, they require a classifier to make countable units, so the use of a word like Persian ta is similar to the use of an expression like “loaves (of)” with English mass nouns, as in the expression “three loaves of bread”.

This leads to Chierchia’s nominal mapping parameter, which states that languages parametrically set their nouns as mass or count and, correspondingly, arguments or predicates. If a language’s nouns are all mass, then they can appear as bare arguments, don’t have regular plural, and require a classifier in contexts of enumeration. If a language’s nouns are count, then they cannot appear bare in argument positions and require a plural in semantically plural contexts. In Chierchia’s view, Mandarin nouns are all mass arguments and French nouns are count predicates, while English-like languages have both count and mass nouns.

Borer (2005: 87‒107) presents a number of empirical and theoretical objections to Chierchia’s analysis. An important point of departure for her is that, contra Chierchia, languages cannot be cleanly cloven into those that have classifiers and no plural and those that have plural and no classifiers. Armenian, for example, has both. So it can’t be that languages set their nouns as mass or count across the board. Nonetheless, Borer observes, there is a complementary distribution within Armenian such that while an enumeration expression may have only a classifier, as in (7a), or only plural, as in (7b), or neither, as in (7c), a classifier cannot appear together with plural in the same construction, as in (7d); all mean or are intended to mean ‘two umbrellas’.

(7) a. yergu had hovanoc
       two CL umbrella
    b. yergu hovanoc-ner
       two umbrella-PL
    c. yergu hovanoc
       two umbrella
    d. *yergu had hovanoc-ner
        two CL umbrella-PL
    (adapted from Borer 2005: 117‒118)

Still, both Borer’s and Chierchia’s proposals depend partly on the idea that classifiers and number serve similar or parallel functions in unitizing a noun’s denotation into countable units, an approach earlier investigated by Doetjes (1997). Very briefly, Borer’s proposal is that a classifier phrase can be headed by either a classifier or number. The function of the head of the classifier phrase is to individuate a root, which is unspecified for mass or count. Since a classifier and number compete for the same head, their co-occurrence is ruled out in Armenian, as Borer predicts. While Chierchia’s analysis argues for the denotation of nouns across a language and Borer’s analysis focuses on particular constructions, both Chierchia and Borer assume a kind of mutual exclusivity of classifiers and number markers.

However, there is ample evidence that languages don’t come in pure types as Chierchia would suggest. In three studies of language samples, Greenberg (1972), Sanches and Slobin (1973), and T’sou (1976) show that, while individual languages may tend toward having classifiers or number, the complementarity is only a rough one, with many languages having both. The World Atlas of Language Structures (WALS) yields similar results (Dryer and Haspelmath 2013). Crossing WALS features 33A and 34A, we come up with the following distribution of languages that have classifiers and/or plural marking.

Table 1: Numbers of languages that have classifiers/nominal plural (from WALS)

                     No nominal plural    Nominal plural exists
No classifiers       8 (7.0%)             80 (70.2%)
Classifiers exist    4 (3.5%)             22 (19.3%)
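As a quick arithmetic check on Table 1, the following Python sketch recomputes each cell’s share of the sample from the raw counts. The dictionary keys and the function name are illustrative choices, not WALS terminology; the counts are those given in the table.

```python
# Counts as reported in Table 1; the key labels are illustrative.
counts = {
    ("no classifiers", "no nominal plural"): 8,
    ("classifiers exist", "no nominal plural"): 4,
    ("no classifiers", "nominal plural exists"): 80,
    ("classifiers exist", "nominal plural exists"): 22,
}

def shares(table):
    """Return each cell as a percentage of the whole sample, to one decimal."""
    total = sum(table.values())
    return {cell: round(100 * n / total, 1) for cell, n in table.items()}

total = sum(counts.values())
percentages = shares(counts)

print(total)  # 114
for (classifier, plural), pct in percentages.items():
    print(f"{classifier:18} | {plural:22} | {pct:5.1f}%")
```

The four cells sum to 114 languages, and the cell with both classifiers and nominal plural comes out at 19.3%, which matches the percentage printed in the table.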

Thus, about a fifth of languages in the WALS sample have both classifiers and plural, a fact that at least in a superficial sense contradicts predictions from Chierchia’s analysis. Borer’s analysis can accommodate the facts in Table 1, since for her the important restriction is that classifier and plural not appear in the same construction, although both can be available within a language. But troubling for her analysis is that languages with both can and do use them in the same construction, contrary to her prediction. Example (8) is from Paiwan, (9) from Itzaj Maya, (10) from Tariana, (11) from Akatec, and (12) from Jacaltec.

(8) ma-telu a vavayavavayan
    CL-three A girl.Redup
    ‘three girls’ (Tang 2004: 385)

(9) ka’=tuul im-mejen paal-oo’-ej
    2=CL.Animate 1S.A-small child-PL-Top
    ‘my two small children’ (Hofling 2000: 228)

(10) duha inaɾu ñham-epa kanapeɾi-pidana emi-peni
     Art.Fem woman two-NumCL.human give.birth-Rem.P.Rep youngster-PL
     ‘The woman gave birth to two children’ (Aikhenvald 2003: 94)

(11) kaa-(e)b’ poon
     two-NumCL plum
     ‘two small plums’ (Zavala 2000: 124)

(12) xwil ca-waŋ heb’ no’ winaj
     I.saw two-NumCL PL NounCL man
     ‘I saw two men’ (Craig 1977: 137)

Persian too can use both the classifier and the plural marker -ha, which also indicates definiteness/specificity.

(13) pænj ta gorbe-ha
     five CL cat-PL
     ‘the five cats’

Without the suffix -ha, the example in (13) would have an indefinite interpretation, ‘five cats’. Most Persian speakers I have consulted in previous research find sentences like (13) with both ta and -ha acceptable. Co-occurrence of the classifier and plural marker is also possible in other Persian languages, such as Tajik.

(14) a. бист нафар студент-он
        twenty CL student-PL
        ‘twenty students’ (Ido 2005: 37)
     b. дар мактаб-и мо 122 нафар пионер-он ва пионерка-гон ҳастанд
        in school-EZ us 122 CL pioneer.Masc-PL and pioneer.Fem-PL are
        ‘There are 122 pioneer boys and pioneer girls in our school’ (Perry 2005: 163)

Since the purported mutual exclusivity of classifiers and number is not absolute, either across languages or within languages, a proper treatment of their distribution vis-à-vis each other requires a more general theory. In the next section I propose a primarily syntactic account based on the syntactic checking of features that may bundle in various ways. Depending on the precise bundling of features, an item may be a “classifier”, “number”, or something in between.

3 Syntactic Assumptions

3.1 Merge

I assume that items in the lexicon are bundles of features and that these bundles are combined through the operation of Merge. Features may be semantically interpretable and contribute meaning, or they may be uninterpretable, in which case they serve purely for syntactic computation. Since an uninterpretable feature [u-F] can’t be interpreted at the interface of syntax and, especially, semantics, [u-F] acts after Merge as a probe that seeks a matching goal, [F], its interpretable partner. Following Match, [u-F] is checked by [F] and eliminated from the derivation. If uninterpretable features are not eliminated by the end of syntax, the derivation is said to crash, i.e., to result in an illicit structure. Checking is often assumed to be between a [u-F] probe and its c-commanded [F] goal. Finally, when a lexical item merges with another lexical item or with an already merged syntactic object (a phrase), the newly formed object is labeled, typically with the category or feature of the head of the merged item. (For summary overviews of Merge, see Chomsky [2008: 137‒147], among many others, and Hornstein [2009: ch. 3]. See Baker [2008], among others, on the possibility of the goal [F] c-commanding the probe [u-F].)

As an example, take the English article “the” and the noun “kitty”. Assume that the feature bundle for “the” comprises at least [u-N, Def], indicating that it contributes definiteness to the meaning of the merged expression and that it will probe for something with the feature [N], i.e., a noun. The item “kitty” is minimally [N, KITTY], where [N] is the interpretable feature indicating the nominal category of the item; [KITTY] is a semantically interpretable feature or set of features specifying felineness. The two items merge as in (15).

(15)

In (15), [u-N], which c-commands N, matches N, then checks with N and is eliminated from the derivation, the strikeout indicating that it has been checked, as in (16).

(16)

The bundle [u-N, Def] is a determiner and heads the merged phrase; it thus labels the new phrase, in traditional terms, a determiner phrase, or DP.⁶ When sent to the interfaces, this constituent will get its pronunciation, [ðəkʰɪti], and its semantic interpretation, something to the effect of “the particular kitty known to both speaker and hearer”.

6 It is often assumed that lexical roots are uncategorized in the lexicon, to be later categorized with a dedicated node such as n, v, etc., as in Distributed Morphology (e.g., Harley and Noyer 1999; Embick and Marantz 2008), or are categorized in the syntax as in Borer (2005). To streamline exposition, however, I will simply indicate nouns as having an inherent nominal feature [N].
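The checking procedure just described can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: features are plain strings, a probe [u-F] may be checked by any interpretable [F] inside the complement, and the names Item, Phrase, merge and converges are invented for the example, as is the verb entry with a [V] feature that is used to show a crash.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    """A lexical item: interpretable features plus uninterpretable probes."""
    form: str
    interpretable: frozenset
    probes: frozenset = frozenset()   # "N" here stands for [u-N]

@dataclass(frozen=True)
class Phrase:
    """The result of Merge, headed (and so labeled) by `head`."""
    head: Item
    complement: object                # an Item or another Phrase
    unchecked: frozenset              # probes on the head that found no goal

def goals(obj):
    """Interpretable features visible inside obj (a probe's c-command domain)."""
    if isinstance(obj, Item):
        return set(obj.interpretable)
    return set(obj.head.interpretable) | goals(obj.complement)

def merge(head, complement):
    """Merge head with complement; check each [u-F] against a matching [F]."""
    visible = goals(complement)
    unchecked = frozenset(f for f in head.probes if f not in visible)
    return Phrase(head, complement, unchecked)

def converges(obj):
    """True if no uninterpretable feature survives anywhere in the structure."""
    if isinstance(obj, Item):
        return not obj.probes
    return not obj.unchecked and converges(obj.complement)

the = Item("the", frozenset({"Def"}), frozenset({"N"}))   # [u-N, Def]
kitty = Item("kitty", frozenset({"N", "KITTY"}))          # [N, KITTY]
sleep = Item("sleep", frozenset({"V", "SLEEP"}))          # hypothetical verb entry

dp = merge(the, kitty)    # [u-N] finds [N]: checked and eliminated
bad = merge(the, sleep)   # no [N] below "the": [u-N] survives

print(converges(dp))      # True
print(converges(bad))     # False
```

The second call shows what a crash amounts to in this toy model: the probe on “the” finds no goal, so an uninterpretable feature remains at the end of the derivation.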

3.2 Some features of heads within DP

The proposed features that go into the bundles are similar to and overlap with those in a feature geometry proposed by Harley and Ritter (2002), where the features are hierarchically arranged. In (17), adapted from Harley and Ritter (2002: 486), a referring pronoun has at least a [participant] feature, which specifies first and/or second person, and an [individuation] feature, which specifies for number. The [participant] feature is further specified as [speaker] and/or [addressee] while [individuation] may be, simply put, [group], which is roughly plural, or [minimal], which is, more or less, singular. The features are arranged as in (17), and a concrete example is for the English pronoun “we” in (18).

(17)

(18)

Note that the presence of lower features entails the presence of their dominating features. Other features that aren’t present in (17) or (18) make reference to animacy and gender: a [feminine] feature would be present for “she”, for example. Harley and Ritter’s feature geometry is supposed to capture crosslinguistic facts as well. For example, languages are more likely to have singular and plural morphemes than a dual morpheme, and if a language does have dual, it will likely have a singular/plural distinction as well. Therefore, the specification of dual must be lower in the feature tree. This particular fact they finesse with [individuation] having both the subfeatures [minimal] and [group], together indicating that the pronominal form is the minimum plurality of twoness.

Here I adapt Harley and Ritter’s overall schema, though I present more features for other elements in the DP, focusing on number markers and classifiers. Again, at the head of the nominal projection, leaving out Distributed Morphology details, is a noun. Nouns have the category feature [N] and whatever semantic features are needed for interpretation. I assume a number phrase, headed by a morpheme that indicates singular/plural (and presumably dual) (Ritter 1991, 1992). Since number merges with nouns, it has an uninterpretable feature [u-N] along with its feature-geometric feature and value of [individuation: minimal] or [individuation: group]. So a plural like English -s is specified by the feature bundle in (19). I assume a null morpheme for the English singular (20).

(19) -s   [u-N, individuation: group]
(20) -Ø   [u-N, individuation: minimal]
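The entailment between lower and dominating features, and the two number morphemes in (19) and (20), can be sketched in Python as follows. The parent table encodes only the part of Harley and Ritter’s geometry described in the text, and the function names are illustrative. The bundle used here for “we” ([speaker] plus [group]) is an assumption about the pronoun’s specification, not a reproduction of the diagram in (18).

```python
# Each feature points to the node that dominates it in the geometry.
PARENT = {
    "speaker": "participant",
    "addressee": "participant",
    "group": "individuation",
    "minimal": "individuation",
}

def closure(features):
    """Add every dominating feature entailed by the given features."""
    full = set(features)
    for f in features:
        while f in PARENT:
            f = PARENT[f]
            full.add(f)
    return full

def number_value(features):
    """Read a number value off the daughters of [individuation]."""
    daughters = frozenset(closure(features) & {"minimal", "group"})
    return {
        frozenset(): "unspecified",
        frozenset({"minimal"}): "singular",
        frozenset({"group"}): "plural",
        frozenset({"minimal", "group"}): "dual",
    }[daughters]

WE = {"speaker", "group"}        # assumed specification of English "we"
PLURAL_S = {"u-N", "group"}      # English -s, as in (19)
NULL_SG = {"u-N", "minimal"}     # English null singular, as in (20)

print(sorted(closure(WE)))       # ['group', 'individuation', 'participant', 'speaker']
print(number_value(PLURAL_S))    # plural
print(number_value(NULL_SG))     # singular
print(number_value({"minimal", "group"}))  # dual
```

Because [group] and [minimal] each entail [individuation], listing only the daughter feature in a bundle loses no information, which is the abbreviation adopted in the bundles that follow.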

On top of number phrase are determiner phrases. Historically, before the introduction of a number phrase, Abney (1987), following from Brame (1981, 1982), Horrocks and Stavrou (1987), Szabolcsi (1981, 1984, 1987, 1994), and others, had proposed a functional projection headed by a determiner, whose complement was the NP, as in (21a). With the later incorporation of a number phrase, the DP took the form of (21b). (21) a.

b.

However, based on semantic facts and distributional criteria, several analyses have argued that there are at least two D levels. Going back to at least Bowers (1975), Jackendoff (1977), Milsark (1979), and Keenan (1987), it seemed that determiners were of at least two types. Among the differences between them is that it’s harder to have a determiner phrase headed by a definite article in existential constructions, as in (22c),⁷ although indefinites are possible in the same position (22a, 22b). Further, strong determiners can precede weak ones but not vice versa, as shown in (22d).

(22) a. There are three books on the table.
     b. There’s a book on the table.
     c. ?There are the three books on the table.
     d. the few kitties / *few the kitties

7 In (22), “there” is existential, not deictic. In fact, though restricted, the marked interpretation is possible.

In short, determiners such as “the” and “each” were deemed “strong” while “few”, numerals, and others were “weak”. The distinction between them requires the presence of at least two determiner positions, as in Zamparelli (1995). Gebhardt (2009) called the-type determiners strong quantifying determiners and numeral-type determiners weak quantifying determiners. Each kind occupied the head of a separate phrase such that one version of an expanded DP is as in (23). (23)

Since the focus of this article is number and classifiers, I limit the analysis to only as high as Weak Quantifier Phrase since that phrase’s head is what merges with Number P. Hence, my focus is indefinites. Since a numeral like “three” merges with a plural noun, the uninterpretable feature of “three” must therefore be [u-group], which checks with [group] on the plural marker. The numeral also provides interpretable quantification, which is indicated as [q]. The feature bundles we have so far are in (24a), and an example of iterated Merge is in (24b). Note that, since [group] and [minimal] entail the presence of [individuation], [individuation] is for simplicity omitted from the feature bundles. In (24b) only heads and complements are indicated; it’s assumed that Spec positions do not project.⁸

8 Baker (2003) argues that, in the simple case, the presence of Spec appears for verbs and is the distinguishing syntactic feature of a verb.

(24) a. nouns:     [N]
        plural:    [u-N, group]
        singular:  [u-N, minimal]
        numerals:  [q, u-group]
     b.

In (24b), -s has merged with “cat” and checked its uninterpretable [u-N] feature. Then, “three” has merged with NumP and in turn checked its uninterpretable [u-group] feature.

To the degree that number and classifiers are mutually exclusive, that mutual exclusivity suggests a commonality between the two categories. In Borer’s analysis, in fact, each potentially occupies the head of a classifier phrase, although since they compete for that position, only one of them may appear. In this article, number markers and classifiers share syntactic features and thus may be more or less alike. In those languages that discourage the co-occurrence of number and a classifier, the relevant features for each must account for this fact. However, it’s also in principle possible for a classifier to co-occur with a plural marker as long as there’s no clash in features and, as always, Merge eliminates any uninterpretable features.

The literature mentioned in section 2 also often argues that, in languages that have them, classifiers serve to somehow individuate the noun into countable units. And while classifiers may have functions other than just intermediaries between numerals and nouns, their canonical function is in numeral + noun constructions. The individuation function can be accommodated under Harley and Ritter’s [individuation]. And if we assume the general case that classifiers have noun complements, they are [u-N]. As suggested in (1b), the Mandarin classifier is insensitive to whether the noun is singular or plural; it has only the general [individuation] feature rather than one of its potential [minimal] or [group] daughters.

(25) classifier: [u-N, individuation]

A classifier takes an NP to yield a classifier phrase (CLP). In turn, the numeral head is a weak quantifying determiner that takes a CLP complement to yield WQP. The feature in the numeral that assures Merge with CLP is [u-individuation], which matches [individuation] in the classifier; the numeral also has the quantification feature [q]. Successive Merge is illustrated in (26b) for the Persian expression in (26a).

(26) a. do ta danešju
        two CL student
        ‘two students’

b.

We now have in place a basic set of computational features involved in iterated merge of items within DP for numeral + CL + noun constructions. The following section provides some refined details to account for additional facts, particularly the ungrammaticality of Persian yek ‘one’ and the classifier ta occurring together.
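The two derivations discussed in this section, English “three cat-s” in (24) and Persian do ta danešju in (26), can be sketched as iterated Merge in Python. The sketch makes simplifying assumptions that are not part of the analysis itself: heads are listed from highest to lowest, a probe is checked by any matching interpretable feature below it, and the output is a labeled bracketing in hierarchical rather than linear order. The lexicon uses the bundles in (24a) and (25), with do carrying [u-individuation] as stated for (26); the function name build is invented for the example.

```python
LEXICON = {
    # form: (label of the phrase it heads, feature bundle)
    "cat":     ("NP",   {"N"}),
    "-s":      ("NumP", {"u-N", "group"}),
    "three":   ("WQP",  {"q", "u-group"}),
    "danešju": ("NP",   {"N"}),
    "ta":      ("CLP",  {"u-N", "individuation"}),
    "do":      ("WQP",  {"q", "u-individuation"}),
}

def build(heads):
    """Merge heads from the lowest (last) to the highest (first).

    Returns a labeled bracketing and a list of probes left unchecked.
    """
    bracketing, visible, unchecked = "", set(), []
    for form in reversed(heads):
        label, bundle = LEXICON[form]
        probes = {f[2:] for f in bundle if f.startswith("u-")}
        # A probe with no matching interpretable feature below it survives.
        unchecked += [f"u-{p} on {form}" for p in sorted(probes - visible)]
        visible |= {f for f in bundle if not f.startswith("u-")}
        bracketing = f"[{label} {form} {bracketing}]" if bracketing else f"[{label} {form}]"
    return bracketing, unchecked

print(build(["three", "-s", "cat"]))
# ('[WQP three [NumP -s [NP cat]]]', [])
print(build(["do", "ta", "danešju"]))
# ('[WQP do [CLP ta [NP danešju]]]', [])
print(build(["three", "cat"]))
# ('[WQP three [NP cat]]', ['u-group on three'])
```

The last call corresponds to *three cat in (5d): without the plural head there is no [group] goal, so [u-group] on the numeral is left unchecked.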

4 Variant feature bundles and ruling out *yek ta

For the data investigated so far, I have proposed that the items in the categories below comprise the following feature bundles.

(27) nouns:      [N]
     plural:     [u-N, group]
     singular:   [u-N, minimal]
     numerals:   [q, u-group]
     classifier: [u-N, individuation]

First of all, languages vary in which of the items in (27) they have in their lexicons. English has nouns, numerals, plural, and singular (although singular in English has no phonetic content), but it lacks a classifier. Persian has nouns, numerals, a plural marker, and a classifier. Mandarin has nouns and classifiers, but no singular/plural distinction, at least not as in English.⁹ Like English, Persian singular has no pronunciation. Persian does have a plural marker, but it’s used for definite nouns.¹⁰

(28) gorbe-ha
     cat-PL
     ‘the cats’ / #‘cats’

Therefore, while it’s correct enough to say that both Persian and English have a plural marker, the Persian plural morpheme differs in feature content from English -s; their feature distinction is represented as in (29).

(29) a. Persian -ha: [u-N, group, Def]
     b. English -s:  [u-N, group]

That is, English and Persian have bundled available features in different ways. Unlike Mandarin ge, the Persian classifier ta is sensitive to cardinality, occurring with numerals other than yek ‘one’. Therefore, ta must contain the feature [group]. So for Persian we revise the feature makeup of the classifier in (27) as in (30). Plural numerals must therefore be [u-group].

(30) Persian ta:      [u-N, individuation, group]
     plural numerals: [q, u-group]

That is, ta has features of both a classifier and number. I assume that, as in English, Persian has a null singular marker that nonetheless has syntactic content. It seeks nouns and has the number feature of [minimal], as above.¹¹

9 There are Mandarin morphemes that contain plural meaning, such as -men, but these are restricted in semantics and are not generalized plural markers. The point is that in Mandarin plural marking is not obligatory for semantically interpreted plurality. Since the focus here is Persian, I overlook Mandarin plurals and adopt Chierchia’s characterization that Mandarin doesn’t have “true” plural marking (Chierchia 1998: 355).
10 More precisely, -ha is specific rather than definite (e.g., Karimi 1999: 8). Since the specific/definite difference isn’t important here, I will indicate the feature as [Def].
11 Persian can indicate singular indefiniteness with the numeral yek and/or an indefinite suffix -i: yek gorbe / yek-i gorbe ‘a cat’. A more complete syntactic account will accommodate the availability of these markers in the DP complex, but I put those aside here.

(31) Persian Ø singular: [u-N, minimal]

Persian numerals, but not other quantifying words like xeyli ‘much, many’, require a classifier. So they contain the computational feature [u-individuation]. And since numerals want not only an individuated noun but one that’s an absolute quantity (with a numeral) rather than a relative quantity (“few”, “many”, etc.), numerals are specified as [u-absolute], seeking an [absolute] feature in ta. Summarizing, we have the following Persian items with their feature makeup.

(32) nouns:          [N]
     null singular:  [u-N, minimal]
     -ha:            [u-N, group, Def]
     ta:             [u-N, individuation: absolute, group]
     yek:            [q, u-minimal, u-absolute]
     other numerals: [q, u-group, u-absolute]

The structure for an expression with a numeral other than yek (33a) is in (33b), with the probes’ uninterpretable features checked.

(33) a. do ta danešju
        two CL student
        ‘two students’
     b.

The structure of an expression with yek (34a) is in (34b), again with uninterpretable features checked.

(34) a. yek danešju
        one student
        ‘one student’
     b.

The ungrammaticality of yek occurring with ta (35a) is evident in (35b), with the feature [u-minimal] left unchecked.

(35) a. *yek ta danešju
         one CL student
     b.

Similarly, to the degree that ta is required with plural numerals in colloquial speech, the ungrammaticality of (36a) is illustrated in (36b), where [u-group] is unchecked.

(36) a. *do danešju
         two student
     b.
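The four judgments in (33) to (36) can be reproduced with a short Python sketch that checks, for each string, whether any probe is left without a matching goal. It rests on assumptions that should be kept in mind: the [absolute] and [u-absolute] features of (32) are left out, since the contrasts at issue turn on [minimal] and [group]; a numeral without ta is assumed to merge with a phrase headed by the null singular Ø; and all names in the code are illustrative.

```python
# Persian bundles after (32), with [absolute]/[u-absolute] omitted (assumption).
PERSIAN = {
    "danešju": {"N"},
    "Ø":       {"u-N", "minimal"},                 # null singular
    "ta":      {"u-N", "individuation", "group"},
    "yek":     {"q", "u-minimal"},
    "do":      {"q", "u-group"},
}

def unchecked_probes(heads, lexicon=PERSIAN):
    """Merge heads from lowest (last) to highest (first); list surviving probes."""
    visible, left = set(), []
    for form in reversed(heads):
        bundle = lexicon[form]
        for f in sorted(bundle):
            if f.startswith("u-") and f[2:] not in visible:
                left.append((form, f))
        visible |= {f for f in bundle if not f.startswith("u-")}
    return left

def grammatical(heads):
    """A derivation converges only if every probe has been checked."""
    return not unchecked_probes(heads)

JUDGMENTS = {
    "do ta danešju":  grammatical(["do", "ta", "danešju"]),    # (33)
    "yek danešju":    grammatical(["yek", "Ø", "danešju"]),    # (34)
    "yek ta danešju": grammatical(["yek", "ta", "danešju"]),   # (35)
    "do danešju":     grammatical(["do", "Ø", "danešju"]),     # (36)
}

for phrase, ok in JUDGMENTS.items():
    print(("" if ok else "*") + phrase)

print(unchecked_probes(["yek", "ta", "danešju"]))  # [('yek', 'u-minimal')]
print(unchecked_probes(["do", "Ø", "danešju"]))    # [('do', 'u-group')]
```

In this toy model the starred strings fail for the reasons given in the text: ta supplies [group] but not [minimal], and the null singular supplies [minimal] but not [group].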

5 Conclusion

The ungrammaticality of *yek ta is easily accounted for within a standard framework that holds items from the lexicon to be bundles of features, some of which are purely syntactic, serving computational purposes, and must be checked and eliminated by the end of the derivation. Yek ‘one’ is incompatible with the classifier ta because yek’s uninterpretable [u-minimal] is never checked since ta is [group] and not [minimal]. Since ta is [group], it is essentially, in Persian, a plural marker and not really a classifier in the sense that Mandarin ge, insensitive to cardinality, is a classifier. But since ta occurs only in the context of a numeral, neither is it precisely a plural marker like English -s, which is used in all plural contexts regardless of whether a numeral is present.

It may seem that ta belongs to a mixed category, but since lexical items are frequently difficult to categorize discretely, ta’s bicategorial nature should be no surprise. Squishy or overlapping categories are standardly treated as subcategorization details. The approach taken in this article treats the subcategorization via the features inside feature bundles, and it should be no surprise that a language can take presumably universally available features and bundle them in different ways for the particular items in its lexicon. Abstractly, for features [a], [b], [c], and [d] relevant for items in the determiner phrase, one language may lexicalize [a, b] and [c, d], another language may lexicalize [a] and [b, c, d], while a third language may lexicalize [a, b] and [c] while leaving feature [d] unattached to any phonetically realized morpheme.

The focus in this article is the incompatibility of yek with ta. Other issues remain, however, such as how plural specific -ha works out in the derivation, not to mention the indefinite marker -i. Quantifying determiners such as xeyli were introduced in passing as a point of reference, but details of how xeyli undergoes Merge must also be presented. Another issue is the differential case marker -ra, which only appears on specific direct objects, and what its features are and how its features check in the syntax. Finally, since the focus in this article is on Persian, I glossed over details of Merge in Mandarin, English, and other languages. However, the proposal makes clear predictions about features and how they are bundled across languages. Further research on Persian and other languages will corroborate or falsify the proposed theory.

References

Abney, Steven. 1987. The English noun phrase in its sentential aspect. Cambridge, MA: MIT dissertation.
Aikhenvald, Alexandra. 2000. Classifiers: A typology of noun categorization devices. Oxford: Oxford University Press.


Aikhenvald, Alexandra. 2003. A grammar of Tariana. Cambridge: Cambridge University Press.
Baker, Mark. 2003. Lexical categories: Verbs, nouns, and adjectives. Cambridge: Cambridge University Press.
Baker, Mark. 2008. The syntax of agreement and concord. Cambridge: Cambridge University Press.
Borer, Hagit. 2005. In name only. Oxford: Oxford University Press.
Bowers, John. 1975. Adjectives and adverbs in English. Foundations of Language 13. 529–562.
Brame, Michael. 1981. The general theory of binding and fusion. Linguistic Analysis 7 (3). 277–325.
Brame, Michael. 1982. The head-selector theory of lexical specification and the nonexistence of coarse categories. Linguistic Analysis 10 (4). 321–326.
Cheng, Lisa Lai-Shen & Rint Sybesma. 1999. Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30 (4). 509–542.
Cheng, Lisa Lai-Shen & Rint Sybesma. 2005. Classifiers in four varieties of Chinese. In Guglielmo Cinque & Richard Kayne (eds.), The Oxford handbook of comparative syntax, 259–292. Oxford: Oxford University Press.
Cheng, Lisa Lai-Shen & Rint Sybesma. 2012. Classifiers and DP. Linguistic Inquiry 43 (4). 634–650.
Chierchia, Gennaro. 1998. Reference to kinds across languages. Natural Language Semantics 6. 339–405.
Chomsky, Noam. 2001. Derivation by phase. In M. Kenstowicz (ed.), Ken Hale: A life in language, 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam. 2008. On phases. In Robert Freidin, Carlos P. Otero & Maria Luisa Zubizarreta (eds.), Foundational issues in linguistic theory: Essays in honor of Jean-Roger Vergnaud, 133–166. Cambridge, MA: MIT Press.
Craig, Collette Grinevald. 1977. The structure of Jacaltec. Austin: University of Texas Press.
Doetjes, Jenny. 1997. Quantifiers and selection: On the distribution of quantifying expressions in French, Dutch and English. Leiden: Leiden University dissertation.
Dryer, Matthew S. & Martin Haspelmath (eds.). 2013. The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info (accessed 28 August 2015).
Embick, David & Alec Marantz. 2008. Architecture and blocking. Linguistic Inquiry 39 (1). 1–53.
Gebhardt, Lewis. 2009. Numeral classifiers and the structure of DP. Evanston: Northwestern University dissertation.
Ghomeshi, Jila. 2003. Plural marking, indefiniteness, and the noun phrase. Studia Linguistica 57 (2). 47–74.
Gil, David. 2013. Numeral classifiers. In Matthew Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info (accessed 15 September 2014).
Gillon, B. 1992. Towards a common semantics for English count and mass nouns. Linguistics and Philosophy 15. 597–640.
Greenberg, Joseph. 1972. Numeral classifiers and substantive number: Problems in the genesis of a linguistic type. In Keith Denning & Suzanne Kemmer (eds.), On language: Selected writings of Joseph H. Greenberg, 166–198. Stanford, CA: Stanford University Press.
Harley, Heidi & Rolf Noyer. 1999. Distributed morphology. Glot International 4 (4). 3–9.
Harley, Heidi & Elizabeth Ritter. 2002. Person and number in pronouns: A feature-geometric analysis. Language 78 (3). 482–526.


Haspelmath, Martin. 2013. Occurrence of nominal plurality. In Matthew Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info (accessed 15 September 2014).
Hofling, Charles. 2000. Itzaj Maya grammar. Salt Lake City: University of Utah Press.
Hornstein, Norbert. 2009. A theory of syntax: Minimal operations and universal grammar. Cambridge: Cambridge University Press.
Horrocks, Geoffrey & Melita Stavrou. 1987. Bounding theory and Greek syntax: Evidence from wh-movement in NP. Journal of Linguistics 23. 79–108.
Ido, Shinji. 2005. Tajik. Muenchen: Lincom GmbH.
Jackendoff, Ray. 1977. X’-syntax: A study of phrase structure. Cambridge, MA: MIT Press.
Karimi, Simin. 1999. A note on parasitic gaps and specificity. Linguistic Inquiry 30. 704–713.
Keenan, Edward. 1987. A semantic definition of “indefinite NP”. In Eric J. Reuland & Alice G. B. ter Meulen (eds.), The representation of (in)definiteness, 286–317. Cambridge, MA: MIT Press.
Krifka, Manfred. 1995. Common nouns: A contrastive analysis of Chinese and English. In Greg Carlson & Jeffrey Pelletier (eds.), The generic book. Chicago: University of Chicago Press.
Lambton, Ann K. S. 1974. Persian grammar. Cambridge: Cambridge University Press.
Mahootian, Shahrzad. 1997. Persian. London: Routledge.
Milsark, Gary. 1979. Existential sentences in English. New York: Garland.
Perry, John R. 2005. A Tajik reference grammar. Leiden: Brill.
Ritter, Elizabeth. 1991. Two functional categories in noun phrases: Evidence from modern Hebrew. In Susan Rothstein (ed.), Syntax and semantics 25: Perspectives on phrase structure, 37–62. New York: Academic Press.
Ritter, Elizabeth. 1992. Cross-linguistic evidence for number phrase. Canadian Journal of Linguistics 37 (2). 197–218.
Sanches, Mary & Linda Slobin. 1973. Numeral classifiers and plural marking: An implicational universal, 1–22. Stanford, CA: Stanford University Press.
Simpson, Andrew. 2005. Classifiers and DP structure in Southeast Asia. In Guglielmo Cinque & Richard Kayne (eds.), The Oxford handbook of comparative syntax, 806–838. Oxford: Oxford University Press.
Szabolcsi, Anna. 1981. The possessive construction in Hungarian: A configurational category in a non-configurational language. Acta Linguistica Scientiarum Academiae Hungaricae 31. 261–289.
Szabolcsi, Anna. 1984. The possessor that ran away from home. The Linguistic Review 3. 89–102.
Szabolcsi, Anna. 1987. Functional categories in the noun phrase. In Istvan Kenesei (ed.), Approaches to Hungarian, vol. 2: 167–190. Szeged: Jate.
Szabolcsi, Anna. 1994. The noun phrase: The syntactic structure of Hungarian. In Ferenc Kiefer & Katalin Kiss (eds.), Syntax and semantics 27. 179–274. New York: Academic Press.
Tang, Chih-Chen Jane. 2004. Two types of classifier languages: A typological study of classification markers in the Paiwan noun phrase. Language and Linguistics 5 (2). 377–407.
T’sou, Benjamin. 1976. The structure of nominal classifier systems. In Philip Jenner, Stanley Starosta & Laurence Thompson (eds.), Austroasiatic Studies, vol. 2: 1215–1248. Honolulu: University of Hawaii Press.
Zamparelli, Roberto. 1995. Layers in the determiner phrase. Rochester, NY: University of Rochester dissertation.
Zavala, Roberto. 2000. In Gunter Senft (ed.), Systems of nominal classification, 114–146. Cambridge: Cambridge University Press.

Jila Ghomeshi

12 The associative plural and related constructions in Persian

Abstract: This article takes constructions in Persian that consist of a proper name followed by a third person plural pronoun, such as Kian ina ‘Kian and his family’, and identifies them as associative plural constructions. Their properties as associative plurals are consistent with what we know of this construction in general, but the article goes further to show that they bear a great resemblance to a general extender construction in Persian in which the third person plural ina can follow any type of constituent to mean something like “etc.” The associative plural also bears some resemblance, at least in meaning, to co-compounding in the language. These resemblances are formalized within the model of the hierarchical lexicon that we find in Construction Morphology.

Keywords: Persian, associative plurals, general extenders, co-compounds, Construction Morphology, hierarchical lexicon

I am grateful to Nima Sadat-Tehrani, Saeed Ghaniabadi, Sharareh Esmaeili, and Ladan Jebheh for transcribing parts of the Callfriend corpus during their time as students and research assistants in linguistics at the University of Manitoba. I have benefited immensely from conversations with Diane Massam and Saeed Ghaniabadi about number and Persian morphology over the years. I would like to thank Graeme Trousdale for an illuminating conversation during which he pointed me toward general extenders and Edith Moravcsik for her helpful comments on an earlier draft of this article. Finally, many thanks to the three anonymous reviewers of this article and the editors of this volume for their support and enthusiasm for Persian linguistics. All errors and omissions are my own.

Jila Ghomeshi, University of Manitoba
DOI 10.1515/9783110455793-013

1 Introduction

In this article I consider the associative plural construction in Modern Persian in the context of similar constructions across other languages. I first establish that the properties of this construction are quite straightforward, given what we know of associative plurals in general. I then use data from a corpus of spoken Persian to show a close connection between the associative plural and the use of general extenders both in form and meaning. I conclude from this that general extenders are a possible source of associatives, pending an appropriate diachronic study. Moreover, I show that the associative plural also bears a resemblance to co-compounding in Persian and argue that the existence of co-compounds in the language is significant in giving rise to the possibility of a grammaticalized associative plural.1 The discussion of these issues is framed in terms of Construction Grammar.

In Modern Standard conversational Persian, a proper name (PN) can combine with the third person plural pronoun ina ‘they’2 to form a compound meaning “[PN] and his or her family and close friends”. The resulting phrase, [PN ina], can be used for enquiring about people known to the speaker and addressee, for example:3,4

(1)

a. æz Babak ina che xæbær?
   from Babak 3PL what news
   ‘How’s Babak (and family)?’

1 The link between associative plurals, similative plurals, general extenders, and echo word formation is also found in Mauri (to appear). In a survey of sixty languages, Mauri identifies these among a range of linguistic strategies that are used to build ad hoc categories in discourse. She notes that more research is needed to better understand the “synchronic and diachronic patterns of multifunctionality” (p. 18) and is carrying out a research program on precisely this, something I was unaware of when I wrote the first draft of this article. I thank Edith Moravcsik for bringing Mauri’s work to my attention. I expect a further look at the creation of ad hoc categories in Persian is bound to yield interesting results.

2 The form in is also the proximate demonstrative ‘this’, which contrasts with the distal demonstrative an ‘that’. Both can bear the plural marker -ha (-a after consonants in colloquial speech):

      Proximate      Distal
SG    in ‘this’      un ‘that’ (an in formal pronunciation)
PL    ina ‘these’    una ‘those’ (anha in formal pronunciation)

Persian third person pronouns u ‘she/he’ and ishan ‘they’, which are used only for animates, have come to acquire formal/polite and/or honorific status and so the demonstratives in and ina are now used as the neutral third person forms for animates as well as inanimates. The distal demonstratives un and una can also be used pronominally, although for some they carry a less respectful connotation than in/ina when used in reference to human beings.

3 All naturally occurring examples are taken from CALLFRIEND Farsi (Canavan and Zipperlen 1996), a corpus of unscripted telephone conversations between native speakers of Farsi (Persian) placed inside the continental United States and Canada. See http://catalog.ldc.upenn.edu/LDC96S50 for more information about the corpus.

4 The abbreviations used in the glosses are given at the end of the article. Where data are taken from other sources, the abbreviations and glosses have been changed to conform to the system used in this article.

b. xæbær næ-dar-im do se hæfte-iye
   news NEG-have-1PL two three week-3SG.COP
   ‘We have no news, it’s been two or three weeks’.
   (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 4117:4:00)

This type of construction is quite common cross-linguistically and is known as the associative plural (APL) construction. In their survey of 237 languages, Daniel and Moravcsik (2013) show that a surprising 84 percent of them (that is, 199 languages) have APL constructions. The Persian construction bears almost all the hallmark features of associative plurals, as outlined by Moravcsik (2003) and Daniel and Moravcsik (2013) (see also Daniel [2000] and Corbett [2000]).5 First, with respect to their form, associative plurals commonly involve the same marker as the one used for additive plurals; however, there are languages in which a distinct form such as a plural pronoun is used. The example below from Mandarin shows that it, like Persian, uses a plural pronoun (cf. also the English John ‘n them, which has a reduced conjunction between the PN and the pronoun):

(2)

zhangsan tamen
Zhangsan they
‘Zhangsan and his group’ (Moravcsik 2003: 470)

Second, the nominal expression to which the APL marker is added is usually restricted to human referents.6 Moravcsik (2003: 472.G-1) expresses this as a scale and notes that in any given language, if a nominal on the scale can form an associative plural, all nouns to the left of it can also form associative plurals:

(3)

Proper Name < Definite Kin Noun < Definite Title Noun < Other Definite Human Noun
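The implicational reading of the scale in (3) can be sketched as a simple consistency check. The fragment below is a minimal illustration, not part of Moravcsik's formulation: the names `SCALE` and `respects_scale` are assumptions introduced here, and the Persian setting reflects the categories that this article illustrates with ina (proper names, kin terms, and title nouns).

```python
# Scale (3), ordered from left (most accessible) to right.
SCALE = [
    "proper name",
    "definite kin noun",
    "definite title noun",
    "other definite human noun",
]


def respects_scale(allowed: set) -> bool:
    """True if the allowed categories form an unbroken left edge of SCALE."""
    flags = [category in allowed for category in SCALE]
    # If a category is allowed, the category to its left must be allowed too.
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))


# Categories reported for Persian ina in this article.
persian = {"proper name", "definite kin noun", "definite title noun"}

# Hypothetical pattern that the scale rules out: titles without names or kin.
ruled_out = {"definite title noun"}

print(respects_scale(persian))
print(respects_scale(ruled_out))
```

Checking adjacent pairs is enough, because the implication chains from each position to everything on its left.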

Persian falls on the more permissive end of the scale in that not just proper names but also kin terms and title nouns can all be followed by ina:

5 More recent work that explores the syntax of associative plural marking within a formal generative framework includes Nakanishi and Ritter (2009) on Japanese, Görgülü (2011) on Turkish, and Forbes (2013) on Gitksan, a First Nations language of northwestern British Columbia.

6 As we will see in section 4, ina can be added to nouns with inanimate reference as well, but as a general extender, not as an associative marker. One of the points of this article is that these two uses should be kept distinct as some languages may have one but not the other.

(4)

a. færhad ina
   Farhad 3PL
   ‘Farhad and his family/close friends’

b. xahær-et ina
   sister-2SG.CLC 3PL
   ‘your sister and her family/close friends’

c. pæri xanom ina
   Pari lady 3PL
   ‘Pari (formal) and her family/close friends’

d. aqa-ye mohændes ina
   sir-EZ engineer 3PL
   ‘Mr. Engineer (honorific) and his family/close friends’

In addition to showing the kinds of terms that can form an associative plural in Persian, examples (4a), (4b), (4c), and (4d) also show that the APL construction in Persian is not a morphological process. The expression that combines with ina need not be a single word, but can be complex. It can be an inflected common noun that serves as a name (4b),7 or a title noun consisting of at least two words (4c and 4d).8 This sets the construction apart from compounding, even though it shares with compounds a binary form. It is more aptly described as a kind of syntactic juxtaposition. In terms of its semantics, the Persian APL construction is also in line with what is known about such constructions in general. Associative plurals refer to a set of individuals who form a conceptually coherent group. The group is ranked such that there is a prominent member who is identified by name, the “focal referent”, and unnamed associates who are typically other family members (Moravcsik 2003: 471‒473). The set of properties exhibited by the Persian APL construction that have been described above can be summarized as follows: (5)

The Associative Plural in Persian
The expression [X ina], where X is a proper name, kinship term, or title, refers to the person named and his or her family and close friends.

7 With kinship terms, there may be a preference for ina to follow possessed nominals, though whether this is true requires further investigation. Edith Moravcsik (personal communication) suggests that the same may be true for plural kinship terms in Hungarian.

8 For lack of a more accurate cover term comprising proper names, kinship terms, and titles, I will continue to characterize the Persian APL construction as targeting proper names.


In this article I consider the Persian associative plural construction within the context of the grammar of spoken Persian in general. In section 2 I discuss plural marking in Persian and show that the morphological plural is not incompatible with an associative meaning. This makes it all the more interesting why the associative plural is expressed via juxtaposition instead. In section 3 I present a brief survey of coordinate compounds in Persian in order to argue that the semantics of these types of compounds are closer in meaning to the associative than the morphological plural is. In section 4 I argue that the APL construction bears a significant resemblance to a type of general extender in Persian and hypothesize that this is the source from which it has grammaticalized. In section 5 I further discuss the ways in which the APL construction is similar to and different from the general extenders and coordinate compounds and formalize the relationships within the framework of Construction Grammar. Section 6 concludes the article. Throughout this article I use the term construction in the sense of Construction Grammar, that is, as a pairing of form with meaning that involves aspects of the meaning that cannot be attributed to the component parts (see Fried [2015] for an overview of this approach). Since my aim is to present work that is primarily descriptive, I don’t take a position on whether constructions are acquired via a process of categorization (Goldberg 2006) or whether constructions are made available by UG (Universal Grammar [Borer 2005]). I also set aside the exceedingly interesting diachronic questions regarding which came first, the more general or the more specific construction that I link together. Rather I intend this work to highlight the advantages of considering a particular construction in relation to similar phenomena in a given language.

2 Plural marking in Persian

Plural marking on common nouns in Persian is of interest to contemporary linguists in large part because it differs from what we expect of inflectional number marking in general, but also because Persian possesses both classifiers and number marking, making it typologically somewhat rare. When discussing plural marking in Persian, most linguists focus on the suffix -ha as it is the default marker, even though there are other ways of forming plurals. For instance, the suffix -an is used with some animate nouns (mærd ‘man’, mærd-an ‘men’), and words of Arabic origin may take their plural form according to Arabic rules (e.g., tæræf ‘side’, ætraf ‘sides’; šæxs ‘person’, æšxas ‘people’); see Lazard (1957, 1992) for more on these points.


Returning to -ha, it has been noted that it is a stress-affecting suffix, unlike others in the language that are clearly inflectional and do not affect stress placement (see Kahnemuyipour [2000, 2003], who argues that plural marking is derivational). Thus in the following examples, stress falls on the second syllable of ketab ‘book’ when it appears on its own (6a) or with a pronominal clitic possessor (6b), but the stress moves onto -ha when it is plural (6c). Example (6c) also shows that when the stem ends in a consonant, the plural suffix shows up as just -a in informal colloquial speech:

(6)

a. ketáb
   book
   ‘book’

b. ketáb-æm
   book+1SG.CLC
   ‘my book’

c. ketab-á
   book+PL
   ‘books’
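The stress facts in (6) can be restated as a toy rule. The sketch below is an assumption-laden illustration rather than an analysis: it takes stem-final stress as the default (as in ketáb), treats the suffix as a separate unit instead of resyllabifying the word, and uses the hypothetical names `STRESS_ATTRACTING`, `stressed_index`, and `mark`.

```python
# Suffixes that attract stress, following the description of plural -ha / -a.
STRESS_ATTRACTING = {"ha", "a"}


def stressed_index(stem_syllables, suffix=None):
    """Index of the stressed unit, counting the suffix as the last unit."""
    if suffix in STRESS_ATTRACTING:
        return len(stem_syllables)      # stress lands on the suffix
    return len(stem_syllables) - 1      # stress stays on the stem-final syllable


def mark(stem_syllables, suffix=None):
    """Return the form with the stressed unit written in capitals."""
    units = list(stem_syllables) + ([suffix] if suffix else [])
    target = stressed_index(stem_syllables, suffix)
    return "-".join(u.upper() if i == target else u for i, u in enumerate(units))


print(mark(["ke", "tab"]))          # (6a)
print(mark(["ke", "tab"], "æm"))    # (6b), clitic possessor
print(mark(["ke", "tab"], "a"))     # (6c), colloquial plural
```

The only lexical information the rule needs is whether a given suffix belongs to the stress-attracting set, which is the property that distinguishes -ha from the clitic in (6b).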

Apart from the fact that -ha doesn’t behave morphophonologically like a typical inflectional affix in Persian, it has unusual syntactic and semantic properties as well. First, it is not required on indefinite common nouns in order to obtain a plural reading in certain contexts, as shown in (7a). When a bare common noun is construed as definite, it is also interpreted as singular as shown in (7b).9 Example (7c) shows a bare noun in object position where it is also number-neutral. (See Ghomeshi [2008] for more on the construal of bare nouns and number neutrality in Persian.) (7)

a. ketab ru miz hæst
   book on table be.PRS.COP.3SG
   ‘There are/is books/a book on the table’.

b. ketab ru miz hæst
   book on table be.PRS.COP.3SG
   ‘The book is on the table’.

c. ketab xærid-æm
   book buy.PST+1SG
   ‘I bought books/some books/a book’.

9 As pointed out by an anonymous reviewer, the agreement facts correlate with definiteness in (7a) and (7b) such that plural agreement on the copula (hæst-ænd be.PRS.COP-3PL) is not possible in (7a) but is optional with a plural definite noun in (7b):

(i) ketab-ha ru miz hæst/hæst-ænd
    book-PL on table be.PRS.COP.3SG/be.PRS.COP-3PL
    ‘The books are on the table’.

The optionality of plural agreement in this case has to do with the animacy (or, more precisely, the lack thereof) of the subject (see Sedighi [2010] for more on this animacy effect).


Second, it is connected to definiteness but not necessarily so as examples (8b) and (8c) show (see Ghomeshi [2003]; Gebhardt [2008, 2009]; Ghaniabadi [2010] for various analyses of this property): (8)

a. se-ta ketab
   three-CL book
   ‘three books’

b. se-ta ketab-a
   three-CL book-PL
   ‘the three books’

c. ketab-a-ye jaleb-i
   book-PL-EZ interesting-INDEF
   ‘(some) interesting books’

Third, it can appear on mass nouns as well as count nouns, but without the kind of coerced reading that plural mass nouns in English receive (see Ghaniabadi [2012] for a formal account and Sharifian and Lotfi [2003, 2007] for a conceptual-functional one):

(9)

bærf-a ab=shod
snow-PL water=become.PST.3SG
‘The snow melted’ (meaning all the snow in a given context, not types or given quantities of snow)

Fourth, it can appear on constituents other than argument nominals. In (10a) we see plural marking on an adverb and in (10b) we see it on the non-verbal element within a complex predicate (see Hincha [1961] as well as Ghomeshi [2003] and Ghaniabadi [2010] where these facts are mentioned but not formally accounted for): (10)

a. (un) bala-ha
   that above-PL
   ‘up thereabouts’10

10 An anonymous reviewer has pointed out that the English translation illustrates what might be a similar use of English -s. The use of nominal morphology on non-nominal elements is an area that is left for future research.


b. dærd-ha=keshid-im
   pain-PL=pull.PST-1PL
   ‘We have suffered a lot’.

The set of properties that the Persian plural marker exhibits has led to a number of proposals on how to treat it within formal theory. It has been analyzed as an adjunct rather than a head (e.g., Ghaniabadi [2010] drawing on Wiltschko [2008]) and as an affix that bears a definiteness/specificity feature (e.g., Gebhardt [2009]; Ghaniabadi [2010, 2012]). For Cowper and Hall (2012) it is a morpheme that instantiates a meaning of “augmented assemblage”, which they abbreviate as “AGGLOM”. This feature differs from the one they abbreviate as “>1”, which contributes the meaning more typically associated with plural, namely “more than one referent” (cf. Hincha [1961], who also proposes that the semantics of the Persian plural marker is closer to augmentation than “greater than one”). Given that the distribution and meaning of -ha are wider than those of a grammaticalized inflectional plural like -s in English, it is within the realm of possibility that it could be the marker for associative plurals as well. This would make Persian more like Japanese, in which the same marker -tati is used for additive and associative plurals (Nakanishi and Ritter [2009]). Moreover, -ha can appear on proper names. Ghomeshi and Massam (2009) note that Persian proper names can be pluralized, as in English, to refer to groups of people with the same name:

(11)

a. qomeshi-a æksæræn æz shomal-e iran-æn
   Ghomeshi-PL mostly from north-EZ Iran-COP.3PL
   ‘Ghomeshis are usually from the north of Iran’.

b. æli-a in vær vais-æn bæqiye un vær
   Ali-PL this side stand.PRS-3PL the.rest that side
   ‘The Alis should stand on this side and the rest (of you) on the other side’.
   (Ghomeshi and Massam 2009: 87.40)

Significantly, plural proper names formed with -ha cannot refer to a family the way they can in English: (12)

a. Have you invited the Ghomeshis over recently?
b. The Smiths are arriving at noon.

This distinction between a group of members sharing the same name and a group of members linked to each other as parts of a whole is what Moravcsik (2003: 476‒477) calls type plurals vs. group plurals. Type plurals are taxonomic and membership in the set denoted by the plural term is based on similarity, i.e., set members are tokens of the same type (e.g., bread and cake, beef and pork). Group plurals are partonomic and membership in the set is based on a sense of cohesion, i.e., set members are parts of a whole (e.g., bread and butter, beef and potatoes). The examples above show that plural proper names in Persian can only be type plurals, not group plurals.11 Given that -ha on proper names yields a type, not group, reading and that the group reading is attained by adding the associative plural ina, the question arises as to whether -ha can ever have an associative or group reading. The use of plural marking on second person pronouns in Persian shows that it can. The use of personal pronouns in Persian is complicated by the social and cultural imperatives to be polite and to confer honorific status on one’s social superiors as well as on those of the same status who are not close or familiar. Thus, as in many European languages, the second person plural pronoun in Persian is often used as a polite singular while the second person singular pronoun is used when the addressee is familiar and of the same social status. As we see in (13a) and (13b) below, the shift to the polite singular shoma is often accompanied by the use of a more formal verb. In (13c) we see the use of -ha on shoma to denote a neutral second person plural: (13)

a. (to) bayæd bi-yay inja
   you.SG must SBJ-come.PRS.2SG here
   ‘You (SG) should come here’.

b. shoma bayæd tashrif=bi-yar-id inja
   you.PL must welcome=SBJ-bring.PRS.2PL here
   ‘You_honorific (SG) should come_honorific here’. (plural reading for subject also possible)

c. shoma-ha bayæd bi-yayd inja
   you.PL-PL must SBJ-come.PRS.2PL here
   ‘You (PL) should come here’.

11 Daniel and Moravcsik (2013) distinguish additive plurals, which are referentially homogeneous in that every member of the set is the same type, from associative plurals, which they call “referentially heterogeneous”. They add that the sets referred to by associative plurals exhibit cohesion – they are not random collections. This appears to be the same distinction Moravcsik (2003) is making with type vs. group plurals. I will continue to use her terms as the term additive is used for a type of co-compound in section 3.


Moravcsik (2003: 492‒493) argues that first and second person plural pronouns tend to be associative plurals, although they can also be non-associatively interpreted. In the case of (13c) above, if there is only a single addressee, the most likely reading is that the speaker is including known associates of the addressee, i.e., the addressee plus her or his friends or family. Clearly then, there is no obvious reason why the plural marker could not be used to mark associative plurals on proper names as well, yet the language employs what Daniel and Moravcsik (2013) call “periphrastic” means to signal the associative plural.12 In the next section we turn to compounding constructions, particularly of the coordinating type, to show that set extension is often expressed via compounding rather than through affixation. The availability of this strategy may explain how the APL construction in Persian has arisen.

3 Compounding, echo reduplication, and hedging your sets

We saw at the beginning of this article that an associative plural refers to a set made up of the individual named by the proper name and the associates of that individual, typically his or her family members. Moravcsik (2003) refers to this kind of set as a group plural in that the members form a cohesive whole, e.g., a family. She further categorizes it as a ranked set in that it contains one focal member, and a partially enumerated one in that the associates are not named. In this section we consider compound constructions that vary along one or more of these parameters of meaning.

Compounds are relevant because of their superficial resemblance to the APL construction and also because compounding is a productive and creative area in Persian and many related and contact languages. The types of compound we will consider here are called co-compounds or coordinating compounds (Wälchli [2005]; also known by the Sanskrit term dvandva compounds or as copulative compounds). These compounds are so named because their meaning involves the coordination of the meaning of their parts. They are found throughout the languages of central Eurasia and express a variety of semantic relations. For example, in Lezgian (spoken in the eastern Caucasus), the two members of the compound may appear as a pair (14a), may represent a larger class by virtue of being identifiable members of that class (14b), may be synonyms of each other (14c), or may involve one nonce member (14d):

(14) Lezgian (Haspelmath 1993: 108)

a. buba-dide     buba ‘father’    dide ‘mother’            ‘parents’
b. xeb-mal       xeb ‘sheep’      mal ‘cattle’             ‘domestic animals’
c. gaf-č’al      gaf ‘word’       č’al ‘word, language’    ‘talking’
d. ajal-kujal    ajal ‘child’     *kujal                   ‘child’

12 Ackema and Neeleman (2014) claim that the associative effect of plural marking with first and second person pronouns is linked to the person system, not the number system. If this is correct, then -ha may only be an additive plural in Persian. I leave this issue for further research as it does not directly affect the exploration of ina as an associative marker.

Wälchli (2005) provides further detail about each of the four types of co-compounds above as well as identifying many more. He calls the type in (14a) an additive co-compound and notes that these compounds refer to items that naturally occur in pairs or that together exhaustively represent a set:

Georgian Additive Co-compounds (Wälchli 2005: 137)
(15)

a. da-dzma      ‘sister-brother’
b. xel-p’exi    ‘hand-foot’

Wälchli calls the second type of compound, exemplified in (14b), a collective co-compound. In this case the compound denotes a collective of which the two listed elements are prototypical members. Chuvash Collective Co-compounds (Wälchli 2005: 141) (16)

a. sĕt-śu        lit. ‘milk-butter’        ‘dairy products’
b. erex-săra     lit. ‘vodka/wine-beer’    ‘alcoholic beverages’
c. xyr-čărăš     lit. ‘pine-spruce’        ‘conifers’

Collective co-compounds can also be viewed as hyperonymic in that they denote superordinate-level concepts. Arcodia, Grandi, and Wälchli (2010) contrast hyperonymic with hyponymic coordinating compounds such as “singer-songwriter” in English, which denote subordinate-level concepts (in this case, the intersection of the set of singers with the set of songwriters).13

13 See also Bauer (2010) for discussion of co-compounds in Germanic languages such as English, German, and Dutch. Such languages are notable for the striking lack of the kinds of co-compounds being discussed here.


Jila Ghomeshi

The examples in (14c) and (14d) show that it is possible to have co-compounds in which one of the elements is not meaningful, either by virtue of being synonymous with the other element (see also Singh [1982] on synonym compounds in Hindi), or by not being a recognizable word at all, a type that Wälchli (2005) calls imitative (see also Ourn and Haiman [2000] on “servant words” in Khmer). Examples of each of these types of co-compound (additive, hyperonymic, synonym, and imitative) can be found in Persian:

Persian (cf. Ghaniabadi et al. 2006; Stilo 2004; Shaki 1967)
(17) a. kot-šælvar      kot ‘coat’, šælvar ‘pants’     ‘suit’      additive
     b. kard-o čængal   kard ‘knife’, čængal ‘fork’    ‘cutlery’   hyperonymic
     c. dad-o færyad    dad ‘shout’, færyad ‘cry’      ‘brawl’     synonym
     d. pul-o pæle      pul ‘money’, *pæle             ‘wealth’    imitative

A significant point to be made about the compounds in (17) above is that most retain the coordinator -o ‘and’, which seems to be true of many types of compounds in Persian (Stilo 2004: 285‒286).14 Thus with the frequently occurring additive compound meaning ‘parents’, we can get the nouns madær ‘mother’ and pedær ‘father’ in either order, both with and without the coordinator: pedær(-o) madær and madær(-o) pedær. Consequently, the term compound here does not refer to a particular form. Rather, it has to do with whether the expression is lexicalized in the sense of being listed as a chunk in the lexicon with a somewhat idiomatic meaning. Moreover, just as there are expressions containing a coordinator that are indisputably lexicalized, so there are spontaneous coordinated expressions in which the coordinator is not present. Stilo (2004: 308‒309) points out that when a coordinator between two noun phrases is deleted, the resulting sequence takes on the stress pattern of a compound (stress indicated by small caps):

(18) a. emruz-færDA mi-r-e.
        today-tomorrow CONT-go.PRS.3SG
        ‘S/he’s going today or tomorrow’. (Stilo 2004: 308.127)
     b. pepsi-koKA be-færma-id.
        Pepsi-Coke IMP-command.PRS.2PL
        ‘Have a Pepsi or a Coke’. (Stilo 2004: 308.130)

14 An exception would be numeral compounds such as do-se ‘two or three’ (lit. ‘two-three’) and čar-panj ‘four or five’ (lit. ‘four-five’), which are treated phonologically as compounds but never take an overt coordinator, as noted by Stilo (2004: 286). They can also express ranges (e.g., hæft-hæsht-dæh ta ‘seven to ten’ [lit. ‘seven-eight-ten’ CL] or dæh-punzdæh ta ‘ten to fifteen’ [lit. ‘ten-fifteen’ CL], which sound better when followed by the default classifier ta). These are very common in everyday speech.

In the above cases, a disjunctive reading is more likely, but Stilo notes that it is not always clear whether a conjunctive or disjunctive reading is intended and in some cases the distinction is irrelevant:

(19) xahæ̀r-bæraDÆR dár-in?
     sister-brother have.PRS-2PL
     ‘Do you have brothers or sisters?’/‘Do you have a brother or a sister?’ (Stilo 2004: 308.131)

Note that in the likely event that xahær-bæradær is an additive compound like madær(-o) pedær ‘mother-father > parents’, the example in (19) could also be translated as ‘Do you have any siblings?’

Returning to co-compounds, Wälchli (2005: 167‒168) notes that many co-compounding languages also have compounds made up of echo-words. This phenomenon, also known as echo reduplication, involves the repetition of a base X with part of the base replaced by a fixed segment (see, for example, Lidz [2001]; Keane [2001]; Inkelas and Zoll [2005]). The resulting compound has the meaning “X and the like”, “X and related stuff”, or “X, etc.” Echo reduplication is found in Iranian (e.g., Persian), Indo-Iranian (e.g., Bengali, Hindi), Turkic (e.g., Turkish), and Dravidian (e.g., Kannada) languages and is thus considered an areal feature of South Asia:

(20) a. Bengali   bari ‘house’    bari ʈari ‘house, etc.’
     b. Hindi     aam ‘mango’     aam vaam ‘mangoes and such fruit’
     c. Kannada   kannu ‘eye’     kannu ginnu ‘eyes and so forth’
     d. Tamil     maaʈu ‘cow’     maaʈu kiiʈu ‘cattle in general’
     (Keane 2005: 240.1‒4)


In Persian the fixed segment is /m-/ and occasionally /p-/ (see Ghaniabadi [2008]; Ghaniabadi et al. [2006]):

Persian
(21) a. ketab ‘book’    ketab-metab ‘books and stuff/and the like/etc.’
     b. pul ‘money’     pul-mul ‘money and stuff/and the like/etc.’
     c. mive ‘fruit’    mive-pive ‘fruit and stuff/and the like/etc.’
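The pattern in (21) can be stated procedurally. The following Python sketch is only an illustration, not an analysis taken from the literature cited here: it assumes romanized input, treats everything before the first vowel as the onset, and replaces that onset with m-. The choice of p- whenever the base itself begins with m- is an assumption read off mive-pive in (21c); the text says only that /p-/ occurs occasionally. The function name and the vowel inventory are hypothetical.

```python
# Vowel symbols of the romanization used in the examples (an assumed inventory).
VOWELS = set("aeiouæ")


def echo_reduplicate(base: str, fixed: str = "m", fallback: str = "p") -> str:
    """Return the base followed by its echo, e.g. ketab -> ketab-metab."""
    # Locate the first vowel; everything before it is treated as the onset.
    first_vowel = next(
        (i for i, ch in enumerate(base) if ch in VOWELS), len(base)
    )
    onset, rest = base[:first_vowel], base[first_vowel:]
    # Assumption: use p- only when the base already begins with m-.
    segment = fallback if onset == fixed else fixed
    return f"{base}-{segment}{rest}"


for word in ("ketab", "pul", "mive", "kard"):
    print(word, "->", echo_reduplicate(word))
```

Run on the three bases in (21), the sketch returns the attested forms ketab-metab, pul-mul, and mive-pive; for kard it returns kard-mard, a form cited later in this section.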

There are several ways in which compounds formed by echo reduplication resemble the other types we have briefly reviewed. They share a similar form with imitative compounds in that they contain one “nonce” member, and they share a similar meaning with collective, or hyperonymic, compounds in that they extend the denotation of the base. Collective compounds evoke a superordinate category by naming two typical members (e.g., kard-o čængal ‘knife and fork → cutlery’), while echo words evoke a larger set of items like the named member (e.g., kard-mard ‘knives and stuff’). We can define both as set-extending constructions (set here meaning the denotation set) as follows:

(22) Coordinate compounding in Persian
Given X and Y, the compound [X Y] denotes the hypernym, i.e., a superordinate set including X and Y as typical members.

(23) Echo reduplication in Persian
Given X, the compound [X m/p-X] denotes a set including X and the like.

Note that the semantics of echo reduplication, which is usually characterized as “X and the like” or “X and related stuff” (see references given above), is somewhat vague. A more precise formulation is challenging, however, as the set evoked can vary from context to context. For example, in one situation, what is associated with X might be based on physical similarity (books and things that look like books), while in another, it might be the kinds of things typically scattered on a desk (books, pens, notepaper, etc.).

Given that echo reduplication can be characterized as a kind of set extension, we can now link it back to associative constructions. Indeed there are constructions that could easily fit one or the other definition. For instance, Wälchli (2005) cites examples from two Uralic languages spoken in Russia, Mordvin and Udmurt, in which echo-compounds involve pronominal “echoes” rather than phonologically modified copies of the base:


Mordvin (Wälchli 2005: 169)
(24) a. jam.t-mez.t’         ‘soup.PL-what.PL > soup and the like’15
     b. koŕon.nek-mez.ńek    ‘root.NEK-what:NEK = with roots and everything, with all its roots’

The only reason he does not consider these constructions to be associative plurals is that they bear the morphological markers of co-compounds (plural marking in [24a] and, in [24b], -ńek, a marker specific to generalizing co-compounds in Mordvin).16

Having drawn links between co-compounding, echo reduplication, and associative plurals, we should note a few distinctions, particularly between the latter two. First, echo-compounds are, as Wälchli (2005) notes, of a low informal register. They are unlikely to occur in written texts and have a careless or almost pejorative air. This is not true of the APL construction in Persian. It is similar to echo reduplication in that it is more characteristic of spoken than written language and it is informal, but it is neither “low” nor pejorative or dismissive. Second, echo-compounds are an instance of a type plural, where membership in the set they denote is based on similarity, and the reading is taxonomic. Associative plurals, as we have seen, are group plurals where membership in the set is based on cohesion, and the resulting reading is partonomic, according to Moravcsik (2003). Third, there is a strong tendency for echo-compounds to be based on nouns with inanimate reference while associative plurals are based on proper names, titles, and kinship terms, i.e., on terms referring to human beings.17

15 Note the resemblance here to the English “and whatnot”, which can also be seen as an associative construction. This construction is vividly defined on Urban dictionary as “A more hip hop way of saying ‘etc.’, or a verbal way of expressing ‘. . .’ It is said by those that have so much poppin’ that they don’t have the time or energy to explain what the ‘what not’ is.” (http://www.urbandictionary.com/define.php?term=whatnot, accessed 6 November 2015). The definition itself is astonishingly similar to one given one hundred years earlier by Poutsma (1916: 914, as cited in Cheshire 2007: 165) for “and (all that)” as a form that “sometimes stands for a vague etc., which the speaker is not prepared to specify in the hurry of the discourse”.

16 Daniel and Moravcsik (2013) also note the similarity between echo reduplication (which they call the similative plural with reference to Telugu) and the associative plural construction in that both pick out sets that do not require their members to be of the same type. Daniel and Moravcsik refer to this as referential heterogeneity.

17 Again, this claim excludes the use of ina as a general extender, which will be discussed in section 4. While (25b) shows that ina cannot be used as an associative plural with an inanimate noun such as mive, it can be used in the sense of “etc.” For this reason I have marked the example mive ina not with an asterisk, which would indicate outright ungrammaticality, but with two question marks. While I am arguing that associative plurals and general extender constructions are distinct, in some cases the meanings are similar enough to bleed into one another.


Persian
(25) a. mive ‘fruit’    mive-pive ‘fruit and stuff/and the like/etc.’
        kian ‘Kian’     ??kian-mian
     b. mive ‘fruit’    ??mive-ina
        kian ‘Kian’     kian-ina ‘Kian and his family/friends’

We have seen arguments for associative plurals being closer to compounds than to plural marked nouns. They fall in easily with the other types of compounds in Persian, many of which are “set-extending”. Rather than viewing associative plurals as simply one kind of compound construction, however, I will argue in the next section that they are connected to a more pervasive means of extension or generalization via coordination.

4 General extenders

The associative plural in Persian consists of a proper name and ina, which on its own is the neutral third person plural pronoun ‘they/them’ as well as the plural proximate demonstrative ‘these’ (see note 1). As expected, it is easy to find examples of associative plural and pronominal uses of ina in chunks of casual spoken discourse. What is unexpected, however, is the large number of occurrences of ina that do not fall into these two categories.

Let us consider the first eight minutes of one telephone conversation from the CALLFRIEND corpus (Canavan and Zipperlen 1996: FA_6345). The call is between two men and, apart from general enquiries about each other and family, the conversation revolves around a government shutdown in the state where speaker A lives, and the news speaker A has just received at the doctor’s office about his high cholesterol count. There are eighteen instances of ina in this chunk of conversation, seven produced by speaker A and eleven by speaker B. Among the eighteen occurrences of ina there are no cases where it functions as a pronoun meaning ‘they’ and exactly one instance of an associative plural:

(26) A: xune-ye færhad ina hæst-i, chikar mi-kon-æn?
        house-EZ Farhad 3PL be.COP-2SG what CONT-do.PRS-3PL
        ‘So you’re at Farhad (and family)’s house, what are they up to?’
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:4:10)

The rest of the occurrences of ina are what I will call, following Cheshire (2007), instances of its use as a general extender, along the lines of ‘. . . and stuff’ in English.18 I will first present a range of examples and then discuss this use further. In the following two examples ina is linked to a noun via the coordinator -o and extends the denotation of the noun to related things:

(27) A: væ muze-ha-o ina hæme tæ’til-e19
        and museum-PL-CONJ 3PL all closed-be.COP.3SG
        ‘And places like museums are all closed’ (here the relevant places are probably state-run as the topic of conversation is a ‘shutdown’)
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:4:10)

(28) B: fæqæt mi-tun-i mesinke ab-o ina bo-xor-i
        only CONT-be.able.PRS-2SG seems-like water-CONJ 3PL SBJ-eat.PRS-2SG
        ‘You can only drink water and stuff’. (with regard to the morning on a day you have to go for tests for which you are required to fast, presumably the other “stuff” in this case is clear liquids)
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:3:15)

The following example similarly shows the set-extending function of ina. We also see that ina can follow a fully inflected noun, in this case a noun bearing both plural marking and a possessive clitic, and one that has been scrambled out of a subordinate clause, in this case the complement of bebinin (the imperative form of ‘see’):

(29) A: xærj-a-tun ina be-bin-in che-qædr-e
        expense-PL-2PL.CLC 3PL IMP-see.PRS-2PL how-much-be.PRS.COP.3SG
        ‘See how much your expenses and stuff are’.
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 4099:9:20)

18 Cheshire in turn credits Overstreet (1999) for the term general extender, but notes that plenty of other terms have been used (see Cheshire [2007: 156] and references cited therein).

19 While not related to the topic of this article, this example also illustrates another feature of Persian whereby inanimate plural subjects do not trigger plural but rather singular agreement on the verb. See Sedighi (2010) for discussion.

In the examples so far, ina has extended the denotation of the noun that precedes it. However, there are examples like the one below where ina is linked to a preceding noun via the coordinator -o, but it is not clear what the ‘. . . and stuff’ has scope over: is it generalizing over diets and other things like that, or over activities like putting someone on a diet and other things like that?

(30) A: un-æm dige be-zar-æt-æm ru dayet-o ina
        that-as.well well SBJ-put.PRS-3SG.1SG.CLC on diet-CONJ 3PL
        ‘. . . and, you know, he’ll put me on a diet and stuff’ (with regard to an appointment he has to make to see a nutritionist)
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:2:10)

Apart from nouns, ina can also follow a verb or a verb phrase with or without a coordinator. As with nouns, the generalization may be over other actions (expressed by the verb) or other propositions (expressed by the clause). We consider three examples in turn. In the following example speaker A recounts how he first got the news of his high cholesterol count. Here the complex predicate test=kærdæn ‘to test’ (lit. ‘test=do’) is followed by ina, meaning that they ‘ran tests and stuff’:

(31)

A: šod, injuri ke chiz hæfte-ye piš ræft-æm-o
   happen.PST.3SG this.way that thing week-EZ last go.PST-1SG-CONJ
   bæ’d goft test=kærd-æn ina, divist-o-pænjah-st
   then say.PST.3SG test=do.PST-3PL 3PL two.hundred-CONJ-fifty-be.COP.3SG

‘This way that it happened (is), last week I went and they did tests and stuff (lit. ‘they tested and stuff’), then they said it’s 250’. (with regard to his high cholesterol news)
(CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:1:46)

In the following example ina is linked to the verbal complex avordæn=pa’in ‘to bring down’ via the coordinator -o ‘and’. The verb is transitive though the object is unexpressed and is construed as ‘cholesterol’. The generalization is not necessarily over actions performed to cholesterol but over the whole verb phrase ‘bring cholesterol down’:

(32)

A: mi-g-e bayæd bi-yar-i=pain-o ina,
   CONT-say.PRS-3SG must IMP-bring.PRS-2SG=down-CONJ 3PL
   ta be-bin-i chi piš=mi-ya-d
   until SBJ-see.PRS-2SG what near=CONT-come.PRS-3SG

‘He says you should bring it down and stuff so that you see what happens’. (talking about what his doctor recommends regarding his cholesterol) (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:6:10)


The third example below is similar in that the scope of ina is over the verb phrase “walk on a treadmill”:

(33) B: to estres-test=kærd-i?
        2SG stress-test=do.PST-2SG
        ‘Did you do a stress test?’
     A: are
        yeah
        ‘yeah’
     B: ke ru tered-mil bayæd ra=be-r-i ina
        that on treadmill must walk=SBJ-go.PRS-2SG 3PL
        ‘Where you have to walk on a treadmill and stuff . . .’
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:7:30)

Ina can also form compounds with numerals that refer to times to mean ‘around’ the time specified (cf. footnote 14 on numeral compounds). For example, in one conversation between an older woman (speaker A) and a male relative (speaker B), she asks if he’s coming over the next day and he replies that he is. She then asks several times what time he’ll be coming. He says it’ll be in the afternoon but he can’t be sure exactly when; it depends on what else is going on. Then he continues and says:

(34) B: . . . do ina mi-res-æm . . .
        . . . two 3PL CONT+arrive+1SG
        ‘. . . I’ll arrive around two . . .’
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 4099:0:33)

The use of ina is so pervasive that it can occur more than once during a single turn in conversation. Consider the following example in which speaker B is trying to explain good vs. bad cholesterol to speaker A. The first instance of ina follows mæqz ‘brain’ and means something like ‘brain and other organs’, while the second instance simply generalizes over the effects that bad cholesterol can have. In this example we also see that the coordinator is truly optional as it occurs in one case and not in the other:


(35) B: væli kolestrol-e bæd-et ke mæsælæn ru
        but cholesterol-EZ bad-2SG.CLC that as.if on
        artri-a hæst, ru artri-a-ye ræg-a-i ke
        artery-PL be.PRS.COP.3SG on artery-PL-EZ vein-PL-REL that
        xun mi-bær-e tu mæqz-o ina æsær=mi-zar-e ina
        blood CONT-take.PRS-3SG in brain-CONJ 3PL effect=CONT-put.PRS-3SG 3PL
        ‘But your bad cholesterol that is in the arteries . . . in the arteries of the veins that carry the blood to the brain and stuff, (it) has an effect and stuff’.
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6345:7:10)

As mentioned at the beginning of this section, this use of ina, which appears commonly in spoken discourse, has been identified as a general extender. General extenders are multifunctional constructions that may be put to a number of different uses at the same time. They have been described as implicating a category of which the named member is an exemplar. They often evoke a category that does not have a lexicalized name but rather has been created spontaneously. We can note that in this use they bear more than a passing resemblance to echo reduplication, discussed above. They also have pragmatic functions such as marking solidarity or rapport, hedging, and/or signaling politeness (see Cheshire [2007] and Parvaresh et al. [2012: 262‒263] for a review of the literature on these points).

Cheshire (2007) states that general extenders like “and stuff” in English are thought to have grammaticalized from longer expressions such as “and stuff like that”, citing Aijmer (2002) and Brinton (1996) in this regard. In her analysis of general extenders used by adolescent speakers of British English, she considers forms such as “and things like that”, “and all that lot”, “and all the things like that”, “and all this type of stuff” as well as disjunctive general extenders such as “or something like that”, “or whatever”, “or things like that”.
It is clear from her work and the work she references (see, for example, Erman [1995]; Overstreet and Yule [1997]) that the further along the grammaticalization path a general extender is, the shorter it is. She argues that “and that”, “and everything”, and “or something” lead the way in British English in contrast with “and stuff”, which is arguably the most grammaticalized general extender in American English.

In a similar study, Parvaresh et al. (2012) survey the use of general extenders by adult speakers both in their native Persian and in their non-native English discourse. Of the general extenders they count in the native Persian corpus starting with væ ‘and’, the most frequent by far is væ ina ‘and these’.20 Table 1 gives the top five of their twenty-one types of general extenders in terms of frequency. Together they represent 64.85 percent of the total number of occurrences of general extenders.

Table 1: Average frequency of general extenders in Persian (taken from Parvaresh et al. [2012: 264, Table 1])

Form                                                            Frequency   Percent
væ ina ‘and stuff’ (lit. ‘and these’)                           91          37.60
væ æz in hærf ha ‘and of such talks’ (lit. ‘and of these talks’)   21          8.67
væ in chiz ha ‘and such things’ (lit. ‘and of these things’)    17          7.02
væ nemidunæm æz in hærf ha ‘and I don’t know of such talks’     16          6.61
væ hæme chiz ‘and everything’                                   12          4.95
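The cumulative share quoted above can be checked against the five rows of Table 1. The short Python sketch below simply re-adds the published figures; the variable names are illustrative, and Decimal is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# (form, frequency, percent) for the five most frequent types, as listed in Table 1.
top_five = [
    ("væ ina", 91, Decimal("37.60")),
    ("væ æz in hærf ha", 21, Decimal("8.67")),
    ("væ in chiz ha", 17, Decimal("7.02")),
    ("væ nemidunæm æz in hærf ha", 16, Decimal("6.61")),
    ("væ hæme chiz", 12, Decimal("4.95")),
]

total_count = sum(freq for _, freq, _ in top_five)
total_percent = sum(pct for _, _, pct in top_five)

print(total_count)    # 157
print(total_percent)  # 64.85
```

The percentages add up to the 64.85 percent reported in the text, and the five forms together account for 157 tokens.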

It is clear, then, that Persian exhibits the same phenomenon as English: there are numerous expressions that can serve as general extenders, out of which there has emerged one concise, frequently used form that has arguably become grammaticalized. In Persian this expression is (-o) ina ‘and these’, which can be added to noun phrases, numerals, verbs, verb phrases, or clauses in order to generalize or extend the meaning of the category referred to (an object, a time, an action, a proposition).

(36) The general extender construction in Persian
The expression [X(-o) ina], where X is any constituent in an utterance, extends the denotation of that constituent to related things.

In the next section I will argue that, despite appearances, the associative plural in Persian does not fall under the general extender use of ina but constitutes a construction that is more specific in ways that tie it in with the co-compounds discussed in section 3.

20 Parvaresh et al. (2012) also consider general extenders starting with ya ‘or’ and call them disjunctive as opposed to adjunctive general extenders that begin with væ. It should be noted here that their væ corresponds to -o in this article. I did not find many instances of væ ‘and’ in the spoken corpus I used and indeed, according to Stilo (2004: 288), væ is more common in the formal written language and is rarely used in casual speech. I can only assume that Parvaresh et al. (2012) chose to represent their data in a more formal way for clarity so as to abstract away from the messiness that is the transcription of actual speech.


5 The place of the associative plural construction within the grammar of conversational Persian

In this section I will first show that the associative plural construction has properties that distinguish it from the general extender construction in Persian. I will then go on to outline how the relationships between general extenders, the associative plural, and coordinate compounding can be represented within a Construction Grammar framework. While there is much work on Construction Grammar (see Fillmore [1988], Goldberg [1995], [2006], and Croft [2001], to name but a few references), I will be drawing primarily on Booij’s (2010) Construction Morphology.

5.1 The distinct properties of the associative plural construction

We saw in section 4 that the most frequent kind of general extender in Persian is -o ina, which literally means ‘and these’ but functions more like “and stuff” in English. This extender can link to a variety of constituents ranging from nouns and noun phrases to verbs, verb phrases, and clauses. In contrast, the associative plural use of ina is restricted to proper names, kinship terms, and titles. Its meaning is also restricted in that it can only denote the family and close friends of the person named, while a general extender may evoke any type of category, depending on context and the meaning of the expressions involved.

A third difference between the two constructions has to do with the number of expressions that can be combined. The associative plural involves adding ina to exactly one nominal, though as mentioned in the introduction to the article, this nominal can be made up of more than one word. A general extender, in contrast, may follow two or more constituents that make up a list:21

Persian
(37) a. qælæm(-o) medad(-o) ina
        pen(-CONJ) pencil(-CONJ) 3PL
        ‘pens and pencils and stuff’
     b. *sima færhad ina
        Sima Farhad 3PL
        ‘Sima and Farhad and family’

21 According to Parvaresh et al. (2012), it has been proposed that general extenders are used to complete three-part lists and they cite Jefferson (1990) in this regard.


Fourth, while the general extender use of ina may involve the coordinator -o, the associative plural cannot. Indeed the presence of the coordinator following a proper name signals that ina should be interpreted as a general extender and not an associative. In the following example we see such a use where the speaker is listing her children or others who live with her. Importantly, the interpretation is additive rather than associative:

(38) A: ax susæn-jun mæn ælan umæd-æm dær dæftær-e
        oh Susan-dear 1SG now come.PST-1SG in office-EZ
        mæxsus-æm, væqti mi-xa-m, bita-o mæmmæd-o ina
        special-1SG.CLC when CONT-want.PRS-1SG Bita-CONJ Mohammad-CONJ 3PL
        chiz næ-baš-æn mi-a-m tu dæftær-e mæxsus-æm,
        thing NEG-be.SBJ.COP-3PL CONT-come.PRS-1SG in office-EZ special-1SG.CLC
        ke dær tualet-e
        that in toilet-be.COP.3SG
        ‘Susan dear I’ve come now to my special office, when I want, Bita and Mohammad and others not to be a bother [lit. ‘thing’] I come to my special office, which is in the bathroom’.
        (CALLFRIEND Farsi, Canavan and Zipperlen 1996: FA 6723:1:15)

The ways in which the associative plural and the general extender use of ina differ from each other are summarized in the table below:

Table 2: Associative vs. general extender ina in Persian

                              Associative ina          General extender ina
constituent ina attaches to   definite human nominal   any clause or nominal or verbal phrase
resulting meaning             ranked plural            related things or activities
number of conjuncts           exactly two              any number
coordinator -o                not permitted            optional
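The contrasts in Table 2 can be read as a rough decision procedure. The Python sketch below is a hypothetical illustration rather than part of the analysis: the class, field, and function names are invented, and the input is assumed to be annotated by hand for whether the host is a proper name, kinship term, or title. Table 2 counts ina itself as a conjunct, so "exactly two" corresponds to a single constituent before ina. The "resulting meaning" row is the output of the classification rather than an input to it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InaPhrase:
    hosts: List[str]             # constituents that precede ina
    has_coordinator: bool        # is -o present before ina?
    host_is_human_nominal: bool  # proper name, kinship term, or title (hand-annotated)


def classify_ina(phrase: InaPhrase) -> str:
    """Return the reading that the formal criteria of Table 2 jointly favour."""
    associative = (
        phrase.host_is_human_nominal
        and len(phrase.hosts) == 1       # "exactly two" conjuncts: one host plus ina
        and not phrase.has_coordinator   # -o is not permitted in the associative
    )
    return "associative plural" if associative else "general extender"


examples = {
    "færhad ina (26)": InaPhrase(["færhad"], False, True),
    "muze-ha-o ina (27)": InaPhrase(["muze-ha"], True, False),
    "do ina (34)": InaPhrase(["do"], False, False),
    "bita-o mæmmæd-o ina (38)": InaPhrase(["bita", "mæmmæd"], True, True),
}

for label, phrase in examples.items():
    print(label, "->", classify_ina(phrase))
```

Of the four corpus examples encoded here, only (26) comes out as an associative plural; (38) is classed as a general extender because it has two hosts and the coordinator, in line with the additive interpretation described above.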

Table 2 shows that there are aspects of the APL construction that are not predictable from the general extender construction. Nor is the meaning of the construction determined compositionally from the meaning of its parts. For these reasons, I consider the associative plural to be a constructional idiom, an expression that contains both lexically fixed and variable positions (see Booij [2010:13] and references cited therein), and to be lexically listed as such. In the next section I turn to consider the structure and organization of the lexicon and the place of the associative plural within it.


5.2 The associative plural in the hierarchical lexicon

Construction Grammar takes the lexicon to be a hierarchically organized network of generalizations over words as well as multi-word expressions. These generalizations can be formalized as schemas or rules (see Booij [2010: 4 and ch. 2] for arguments why it is preferable to view such generalizations as schemas rather than rules). An example of a schema for compounding is given below:

(39) $[[a]_{X_k}\,[b]_{N_i}]_{N_j} \leftrightarrow [\mathrm{SEM}_i\ \text{with relation}\ R\ \text{to}\ \mathrm{SEM}_k]_j$ (Booij 2010: 17.17)

The schema above is for right-headed nominal compounds in Germanic languages. Examples from English include words like “book shelf”, “pull tab”, “hard disk”, and “afterthought” (Booij 2010: 17). In each case, the right-hand member of the compound, a noun, is the head and it is preceded by a noun, verb, adjective, and preposition, respectively. Booij explains the formalism in the schema in (39) as follows: the lower case letters a and b are for arbitrary sound sequences, the variable X stands for the major lexical categories (N, A, V, P), and the lowercase variables i, j, k stand for the indices that link the phonological, syntactic, and semantic properties of words (cf. Jackendoff’s [2002] tripartite parallel architecture). The relation R remains unspecified in the semantic representation. It links the semantics (abbreviated SEM) of the head noun to the non-head in some way that is determined by the meaning of the individual words as well as the context.
Within Construction Morphology, as presented in Booij (2010), schemas for word formation processes (e.g., inflection and derivation) are not only lexically listed but are linked to one another, forming hierarchies.22 A given schema may be dominated by a more abstract general schema (one that generalizes over a number of related constructions), from which it inherits certain pieces of information, or it may dominate a sub-schema that specifies aspects of meaning or form that are specific to a certain subclass of words that participate in the construction. To give a concrete example of the latter, the schema for compounding given in (39) above has a sub-schema in Dutch whereby the word wereld ‘world’ can appear as the first constituent and function as an intensifier rather than contribute its literal meaning (wereld-vrouw ‘fantastic woman’, wereld-kans ‘great chance’; Booij [2010: 57]). This sub-schema can be expressed as a constructional idiom as follows:

(40) $[[\textit{wereld}]_{N}\,[x]_{N_i}]_{N_j} \leftrightarrow [\text{excellent}\ \mathrm{SEM}_i]_j$ (Booij 2010: 57.13)

22 While I am drawing primarily on Booij (2010), the notion of an inheritance hierarchy, i.e., a taxonomic network whereby generalizations captured by higher-level constructions are inherited by subordinate ones, runs through all varieties of Construction Grammar. See Boas (2013: 244‒246) and references cited therein for one discussion of such hierarchies and Fried (2015: 13‒14) for hierarchical inheritance vs. partial inheritance.

Turning to the type of compounding discussed in section 3, we can note that it is of a different type than the Germanic type represented in (39) above, expressing coordination rather than modification. Thus in Persian a compound like kot-šælvar (literally ‘coat-pant’) means ‘suit’, while in English a compound “coat pants” would likely be interpreted as “pants that go with a coat” or something along those lines. (Note that Persian has modificational type compounds as well.) The data presented in section 3 lend themselves to a hierarchical analysis whereby there is a general schema for coordinate compounds that dominates various types of sub-schemas (additive compounds, collective compounds, synonym compounds, etc.). This can be represented as in Figure 1 below, in which I have shown two possible sub-schemas below the general schema for coordinate compounds, one for hyperonymic co-compounds and the other for echo reduplication:23

Figure 1: A partial hierarchy for coordinate compounding

23 As noted at the end of section 3, echo reduplication is limited to a highly informal register. It is an interesting challenge, left for future work, to theorize how such information about register and style should be represented as part of lexical entries.

Another aspect of the hierarchical lexicon that bears on the phenomena discussed here has to do with the fact that lexical entries do not only encode relationships among words and express schemas for word formation; the lexicon also contains schemas for larger constructions such as constructional idioms. In other words, pieces of syntactic structure can be listed in the lexicon with their associated (constructional) meanings (see Jackendoff [2008], for example). Thus, using the same descriptive formalism, we can represent general extenders in Persian as constructional idioms with the associative plural as a sub-schema:

Figure 2: A partial hierarchy for general extenders

Ultimately these two partial hierarchies are linked, though the points of contact require further elaboration of both. The overarching schema is coordination, as it is a source for general extenders and is also believed to be a source for co-compounds in at least some languages (Arcodia, Grandi, and Wälchli 2010: 8).

6 Conclusion

Starting with the associative plural construction, this article has explored some of the ways in which the denotation of nominal expressions can be extended in Persian. With proper names, the addition of ina yields the meaning of the person named, plus his or her family and close friends. With common nouns, coordinate compounding evokes categories consisting of at least one of the members of the compound and related entities. Both types of construction exhibit referential heterogeneity (Daniel and Moravcsik 2013). That is, rather than denoting more than one entity of the same type, they denote sets of related entities. We have also seen how general extenders extend the denotation of a variety of constituents (nouns, verbs, verb phrases, etc.) in order to evoke a category of which the named constituent is an exemplar. Having explored the resemblance among general extenders, co-compounds, echo reduplication, and associative plurals, we note that not all languages have


as full a range of constructions as Persian does. English, for example, has a productive general extender construction but no associative plural.24 English also lacks a productive process of co-compounding. Arcodia, Grandi, and Wälchli (2010) claim that this type of compounding is “areally skewed” in that hyperonymic coordinate compounds are found in Eastern Eurasia (among other places) but not in Standard Average European languages, which have hyponymic coordinate compounds instead. (Hyponymic compounds have a referent that is a hyponym [subordinate] of the meaning of the parts.)

It is tempting to connect the lack of an associative plural that derives from a general extender construction in English to the lack of coordinate compounding in general. In other words, it might be the case that English lacks an associative plural because it lacks coordinate compounding. This is not to say that coordinate compounding itself gives rise to an associative plural, but that the step from general extender to associative plural is more easily made in a language that already has the means to stretch the denotation of a nominal to other related things and/or to create superordinate categories by coordination. If this hypothesis turns out to have some validity, it, in turn, supports the view of the lexicon put forward by Construction Grammarians. In a hierarchical lexicon, a densely populated network might give rise to new grammatical constructions more easily than a sparsely populated one.

The correlation between co-compounding and associative constructions remains to be explored further, as does a diachronic investigation of Persian to determine which uses of ina predate others. The results of these studies will be revealing for those interested in associatives, coordination, and grammaticalization, and Persian has much to contribute in this regard.

Table 3: Abbreviations

1     first person        IMP   imperative
2     second person       NEG   negative
3     third person        PL    plural
CL    classifier          PN    proper name
CLC   clitic              POSS  possessive
CONJ  conjunction         PRS   present
CONT  continuous          PST   past
COP   copula              SBJ   subjunctive
EZ    ezafe               SG    singular

24 As mentioned in the introduction, some speakers can say [PN “’n them”] to mean something like an associative plural but the “’n them” does not necessarily refer to family and “’n them” is not otherwise a common general extender. This example does point to the fact that associatives may arise from coordinating constructions, however.


References

Ackema, Peter & Ad Neeleman. 2014. On the associative effect in first and second person plural pronouns. Talk presented at MGRG meeting, University of Edinburgh, 17 December 2014.
Aijmer, Karin. 2002. English discourse particles: Evidence from a corpus. Amsterdam: John Benjamins.
Arcodia, Giorgio F., Nicola Grandi & Bernhard Wälchli. 2010. Coordination in compounding. In Sergio Scalise & Irene Vogel (eds.), Cross-disciplinary issues in compounding, 177–197. Amsterdam: Benjamins.
Bauer, Laurie. 2010. Co-compounds in Germanic. Journal of Germanic Linguistics 22 (3). 201–219.
Boas, Hans C. 2013. Cognitive construction grammar. In Thomas Hoffmann & Graeme Trousdale (eds.), The Oxford handbook of construction grammar, 233–254. Oxford: Oxford University Press.
Booij, Geert. 2010. Construction morphology. Oxford: Oxford University Press.
Borer, Hagit. 2005. Structuring sense. Vol. I. Oxford: Oxford University Press.
Brinton, Laurel. 1996. Pragmatic markers in English. Grammaticalization and discourse functions. Berlin: Mouton de Gruyter.
Canavan, Alexandra & George Zipperlen. 1996. CALLFRIEND Farsi LDC96S50. CD-ROM. Philadelphia: Linguistic Data Consortium.
Cheshire, Jenny. 2007. Discourse variation, grammaticalization and stuff like that. Journal of Sociolinguistics 11 (2). 155–193.
Corbett, Greville G. 2000. Number. Cambridge: Cambridge University Press.
Cowper, Elizabeth & Daniel Currie Hall. 2012. Aspects of individuation. In Diane Massam (ed.), Count and mass across languages, 27–53. Oxford: Oxford University Press.
Croft, William. 2001. Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Daniel, Michael & Edith Moravcsik. 2013. The associative plural. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/36 (accessed on 16 November 2015).
Daniel, Mixail. 2000. Tipologija associativnoj množestvennosti [The typology of associative plurals]. Moscow: Russian State University for Humanities.
Erman, Britt. 1995. Grammaticalization in progress: The case of or something. In Inger Moen, Hanne Gram Simonsen & Helga Lødrup (eds.), Papers from the XVth Scandinavian Conference of Linguistics, Oslo, January 13–15, 1995, 136–147. Oslo: Department of Linguistics, University of Oslo.
Fillmore, C. J. 1988. The mechanisms of “construction grammar”. In Shelley Axmaker, Annie Jaisser & Helen Singmaster (eds.), Proceedings of the fourteenth annual meeting of the Berkeley Linguistics Society, 35–55. Berkeley: Berkeley Linguistics Society, Inc.
Forbes, Clarissa. 2013. Associative plurality in the Gitksan nominal domain. In Shan Luo (ed.), Proceedings of the 2013 annual conference of the Canadian Linguistic Association. http://cla-acl.ca/actes-2013-proceedings/ (accessed 11 March 2017).
Fried, Mirjam. 2015. Construction grammar. In Tibor Kiss & Artemis Alexiadou (eds.), Syntax – theory and analysis. An international handbook, vol. 1: 974–1003. Berlin & Boston: De Gruyter Mouton.
Gebhardt, Lewis. 2008. Classifiers, plural and definiteness in Persian. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, 35–56. Newcastle upon Tyne: Cambridge Scholars Publishing.
Gebhardt, Lewis. 2009. Numeral classifiers and the structure of DP. Evanston: Northwestern University dissertation.
Ghaniabadi, Saeed. 2008. Optionality and variation: A stochastic OT analysis of M/p-echo reduplication in colloquial Persian. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, 57–84. Newcastle upon Tyne: Cambridge Scholars Publishing.
Ghaniabadi, Saeed. 2010. The empty noun construction in Persian. Winnipeg: University of Manitoba dissertation.
Ghaniabadi, Saeed. 2012. Plural marking beyond count nouns. In Diane Massam (ed.), Count and mass across languages, 112–128. Oxford: Oxford University Press.
Ghaniabadi, Saeed, Jila Ghomeshi & Nima Sadat-Tehrani. 2006. Reduplication in Persian: A morphological doubling approach. In Claire Gurski & Milica Radišić (eds.), Proceedings of the 2006 annual conference of the Canadian Linguistic Association. http://cla-acl.ca/actes-2006-proceedings/ (accessed 11 March 2017).
Ghomeshi, Jila. 2003. Plural marking, indefiniteness, and the noun phrase. Studia Linguistica 57 (2). 47–74.
Ghomeshi, Jila. 2008. Markedness and bare nouns in Persian. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, 85–112. Newcastle upon Tyne: Cambridge Scholars Publishing.
Ghomeshi, Jila & Diane Massam. 2009. The proper D connection. In Jila Ghomeshi, Ileana Paul & Martina Wiltschko (eds.), Determiners: Universals and variation, 67–95. Amsterdam & Philadelphia: John Benjamins Publishing Company.
Goldberg, Adele. 1995. Constructions: A construction grammar approach to argument structure. Chicago: Chicago University Press.
Goldberg, Adele. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Görgülü, Emrah. 2011. Plural marking in Turkish: Additive or associative? Working Papers of the Linguistics Circle of the University of Victoria 21 (1). 70–80.
Haspelmath, Martin. 1993. A grammar of Lezgian. Berlin & New York: Mouton de Gruyter.
Hincha, Georg. 1961. Beiträge zu einer Morphemlehre des Neupersischen. Der Islam 37 (1–3). 136–201.
Inkelas, Sharon & Cheryl Zoll. 2005. Reduplication. Cambridge: Cambridge University Press.
Jackendoff, Ray. 2002. Foundations of language. Oxford: Oxford University Press.
Jackendoff, Ray. 2008. Construction after construction and its theoretical challenge. Language 84 (1). 8–28.
Jefferson, Gail. 1990. List construction as a task and resource. In G. Psathas (ed.), Interaction competence, 63–72. Lanham: University Press of America.
Kahnemuyipour, Arsalan. 2000. On the derivationality of some inflectional affixes in Persian. Paper presented at the annual meeting of the Linguistic Society of America Conference, Chicago, 8 January 2000.
Kahnemuyipour, Arsalan. 2003. Syntactic categories and Persian stress. Natural Language & Linguistic Theory 21 (2). 333–379.
Keane, Elinor. 2001. Echo words in Tamil. Oxford: Merton College dissertation.
Keane, Elinor. 2005. Phrasal reduplication and dual description. In Bernhard Hurch (ed.), Studies in reduplication, 239–262. Berlin & New York: Mouton de Gruyter.
Lazard, Gilbert. 1957. Grammaire du persan contemporain. Paris: Klincksieck.
Lazard, Gilbert. 1992. A grammar of contemporary Persian. Costa Mesa: Mazda Publishers.
Lidz, Jeffrey. 2001. Echo reduplication in Kannada and the theory of word-formation. The Linguistic Review 18 (4). 375–394.
Mauri, Caterina. Forthcoming. Building and interpreting ad hoc categories: A linguistic analysis. In Joanna Blochowiak, Cristina Grisot, Stephanie Durrlemann & Christopher Laenzlinger (eds.), Formal models in the study of language. Springer International Publishing.
Moravcsik, Edith. 2003. A semantic analysis of associative plurals. Studies in Language 27 (3). 469–503.
Nakanishi, Kimiko & Elizabeth Ritter. 2009. Plurality in languages without a count-mass distinction. Handout of paper presented at the Mass-Count Workshop, University of Toronto, 7 February.
Ourn, Noeurng & John Haiman. 2000. Symmetrical compounds in Khmer. Studies in Language 24 (3). 483–514.
Overstreet, Maryann. 1999. Whales, candlelight, and stuff like that. New York: Oxford University Press.
Overstreet, Maryann & George Yule. 1997. On being inexplicit and stuff in contemporary American English. Journal of English Linguistics 25 (3). 250–258.
Parvaresh, Vahid, Manoochehr Tavangar, Abbas Eslami Rasekh & Dariush Izadi. 2012. About his friend, how good she is, and this and that: General extenders in native Persian and nonnative English discourse. Journal of Pragmatics 44 (3). 261–279.
Sedighi, Anousha. 2010. Agreement restrictions in Persian. Leiden: Leiden University Press.
Shaki, Mansour. 1967. Principles of Persian bound phraseology. Prague: The Oriental Institute in Academia, Publishing House of the Czechoslovak Academy of Sciences.
Sharifian, Farzad & Ahmad R. Lotfi. 2003. “Rices” and “waters”: The mass-count distinction in modern Persian. Anthropological Linguistics 45 (2). 226–244.
Sharifian, Farzad & Ahmad R. Lotfi. 2007. “When stones falls”: A conceptual-functional account of subject-verb agreement in Persian. Language Sciences 29 (6). 787–803.
Singh, Rajendra. 1982. On some “redundant compounds” in Modern Hindi. Lingua 56 (3–4). 345–351.
Stilo, Donald. 2004. Coordination in three Western Iranian languages: Vafsi, Persian and Gilaki. In Martin Haspelmath (ed.), Coordinating constructions, 269–330. Amsterdam & Philadelphia: John Benjamins Publishing Company.
Wälchli, Bernhard. 2005. Co-compounds and natural coordination. Oxford: Oxford University Press.
Wiltschko, Martina. 2008. The syntax of non-inflectional plural marking. Natural Language and Linguistic Theory 26 (3). 639–694.

Shahrzad Mahootian and Lewis Gebhardt

13 Revisiting the status of -eš in Persian1

Abstract: Persian clitics, while not as studied as clitics in some other languages, have generated more interest in recent years. One of the main questions linguists seek to answer about clitics is what category they belong to, both in Persian in particular and across languages in general. This article has three goals regarding the status of Persian clitics, in particular the third person singular -eš. First, we review some of the recent literature that argues for at least some clitics being agreement markers. Second, we present data to remind us that -eš still exhibits robust clitic properties. Third, we report on a survey of speakers’ judgments on the acceptability and meaning of -eš in various syntactic positions as well as introduce novel data that show clitics to be more elusive than previously thought.

Keywords: agreement, affixes, clitics, -eš, Persian

1 Introduction

Clitics have received increasing attention in recent decades. Some of the main problems in the study of clitics revolve around not only their syntactic and morphological behavior but also their categorial status. It is often difficult to tell whether a particular item is a clitic or an affix, since clitics typically exhibit properties of both categories, not to mention properties of independent words. In this article we summarize recent work in the study of clitics in Persian and further look into their distribution, particularly that of the third person singular marker -eš. Previous work has determined that items like -eš, while they may have been clitics in earlier Persian, are affixes, i.e., agreement markers, or are in the process of becoming so (Rasekh 2011, 2014). Although we do not dispute that -eš may be an affix in certain environments, we focus on data that suggest that -eš does have robust clitic properties. However, our main

1 We extend special thanks to Nastaran Malekshahi for her enthusiasm and assistance in collecting the data.

Shahrzad Mahootian and Lewis Gebhardt, Northeastern Illinois University
DOI 10.1515/9783110455793-014


contribution is presenting speaker judgments suggesting that the status of -eš and other clitics is more complex and subtle than previously thought. In the next section we review recent literature regarding clitics in Persian. In section 3, we present speaker judgments informally gathered from several native speakers, showing that the nature of -eš is even more elusive than previously thought. The data show that speakers vary significantly in their judgments over the grammaticality of sentences with clitics and in what those clitics mean. Nonetheless, a few preliminary generalizations emerge. Finally, we conclude with a look toward continuing with a larger sample of clitic data in Persian with a broader range of verbs, especially of compound verbs.

2 Background

Investigations into Persian clitics, as is the case with clitics cross-linguistically, typically show that the elements under study hover in status between that of agreement affixes and argument-denoting clitics. However, as a matter of terminology, following common practice in the literature, we refer to the items that are clearly agreement affixes as “affixes” and the elements that are not clearly affixes as “clitics”, even though members of the “clitic” category are not a uniform category of items with only clitic properties. Indeed, the very point of much of the literature is to determine precisely which, if any, category “clitics” belong to.

First, there is general agreement on the affixes. Persian has a set of them, each agreeing with the sentential subject in person and number, although in Persian, a pro-drop language, the subject need not appear overtly. The affixes are identical in present and past, except for third person singular.

Table 1: Persian verbal agreement markers2

Person  Present singular             Present plural                Past singular  Past plural
1       -æm                          -im                           -æm            -im
2       -i                           -id (formal), -in (informal)  -i             -id (formal), -in (informal)
3       -e (informal), -æd (formal)  -æn(d)                        -Ø             -æn(d)

In the present tense examples in (1a), the mi- prefix, which is mostly used in generic statements and progressives, is not relevant to the topic of the chapter.

2 Transcription is broad IPA with these exceptions: č = IPA ʧ, y = IPA j. In the glosses, Ez = ezafe linker.


The prefix is included because it is typically obligatory in the present. The examples in (1b) show the nearly identical past tense suffixes, the only difference being that in the past the third person singular is null. We assume these agreement markers without comment, except to note that their status as affixes is supported by two facts: they appear obligatorily on the verb, and in compound verbs they appear only on the verbal element.

(1) a. Present forms of the verb xordæn ‘to eat’

       (mæn) mi-xor-æm               (ma)    mi-xor-im
       1SG   Prog-eat-1SG            1PL     Prog-eat-1PL
       ‘I’m eating’                  ‘We’re eating’

       (to)  mi-xor-i                (šoma)  mi-xor-id
       2SG   Prog-eat-2SG            2PL     Prog-eat-2PL
       ‘You’re eating’               ‘You’re eating’

       (u)   mi-xor-e                (unha)  mi-xor-æn
       3SG   Prog-eat-3SG            they    Prog-eat-3PL
       ‘she/he is eating’            ‘They’re eating’3

    b. Past forms of the verb xordæn ‘to eat’

       (mæn) xord-æm                 (ma)    xord-im
       1SG   ate-1SG                 1PL     ate-1PL
       ‘I ate’                       ‘We ate’

       (to)  xord-i                  (šoma)  xord-id
       2SG   ate-2SG                 2PL     ate-2PL
       ‘You ate’                     ‘You ate’

       (u)   xord-Ø                  (unha)  xord-æn
       3SG   ate-3SG                 they    ate-3PL
       ‘she/he ate’                  ‘They ate’
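As a compact restatement of Table 1 and the forms in (1), the following Python sketch builds present and past forms from a stem plus an agreement suffix. The function and variable names are illustrative assumptions rather than part of the authors' analysis, and only the suffix variants that actually appear in (1) are modeled.

```python
# Agreement suffixes as they appear in example (1), keyed by (person, number).
# All names here (PRESENT_SUFFIXES, conjugate, ...) are illustrative assumptions.
PRESENT_SUFFIXES = {
    (1, "SG"): "æm", (2, "SG"): "i", (3, "SG"): "e",
    (1, "PL"): "im", (2, "PL"): "id", (3, "PL"): "æn",
}

# The past set is identical except that third person singular is null.
PAST_SUFFIXES = {**PRESENT_SUFFIXES, (3, "SG"): "Ø"}


def conjugate(stem: str, person: int, number: str, tense: str) -> str:
    """Return a hyphen-segmented verb form such as 'mi-xor-æm' or 'xord-Ø'."""
    if tense == "present":
        # The prefix mi- is typically obligatory in the present.
        return f"mi-{stem}-{PRESENT_SUFFIXES[(person, number)]}"
    if tense == "past":
        return f"{stem}-{PAST_SUFFIXES[(person, number)]}"
    raise ValueError(f"unknown tense: {tense!r}")


if __name__ == "__main__":
    for person, number in PRESENT_SUFFIXES:
        print(person, number,
              conjugate("xor", person, number, "present"),
              conjugate("xord", person, number, "past"))
```

The sketch takes the present stem (xor) and the past stem (xord) as given; it says nothing about how the two stems are related.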

The contrast between (2a) and (2b), with a compound verb, shows that the agreement suffix can only appear on the verb.

(2) a.  baz   kærd-i
        open  did-2SG
        ‘Did you open it?’

    b. *baz-i     kærd
        open-2SG  did

3 The verb doesn’t have to show plural agreement with nonhuman plural subjects.


The forms of the clitics are in Table 2.

Table 2: Persian clitics4

Person  Singular    Plural
1       -æm         -emun / -eman
2       -et / -æt   -etun / -etan
3       -eš / -æš   -ešun / -ešan

Clitic constructions parallel full DP constructions as possessives (3a, 3b), as objects of prepositions (3c, 3d), and as objects of verbs (3e, 3f). Note also in (3e, 3f) that clitic doubling is possible if the subject ma ‘we’ is pronounced.

(3) a. doxtær-æm
       daughter-1SG
       ‘my daughter’

    b. doxtær-e     mæn
       daughter-Ez  I
       ‘my daughter’

    c. næzdik-æm
       near-1SG
       ‘near me’

    d. næzdik-e  mæn
       near-Ez   I
       ‘near me’

    e. (ma)  did-im-eš
       we    saw-1PL-3SG
       ‘We saw him/her/it’

    f. (ma)  u-ra         did-im
       we    3SG-Def.Acc  saw-1PL
       ‘We saw him/her’

In compound verbs, which consist of a nonverbal element and a verbal or light-verb element, the clitic may appear on either the light verb (4a) or the nonverbal element (4b).

(4) a. komæk  kærd-im-eš
       help   did-1PL-3SG
       ‘We helped her/him’ (Mahootian 1997: 139)

    b. komæk-eš  kærd-im
       help-3SG  did-1PL
       ‘We helped her/him’

4 The first form in each pair is the colloquial form; the second is the formal form. Throughout this chapter we will be using the colloquial forms. The vowels in the clitics appear when the stem ends in a consonant. Also, we uniformly join the affixes/clitics to roots with a single hyphen.

As mentioned above, among the important issues in the study of clitics is to determine just what they are. Since clitics correspond to full-word forms but also have properties that overlap with those of affixes, a three-way series of tests must be appealed to in order to distinguish the three categories of affix, clitic,


and word (Zwicky 1985). Zwicky (1977, 1985) and Zwicky and Pullum (1983) proposed tests to distinguish clitics from other elements. The criteria for distinguishing clitics from affixes should be read as tendencies rather than strictly necessary or sufficient conditions. (Also see Klavans [1995] and Haspelmath and Sims [2010], among others, for discussion.)

– Clitics have more freedom in choosing a host, while affixes are highly selective in the category of host they attach to.
– Clitics have fuller paradigms than affixes, which often exhibit gaps.
– Clitics are quite regular whereas affixes show more morphophonological idiosyncrasies.
– Clitics are more semantically regular than affixes, which show more idiosyncrasies in their semantics.
– Syntactic rules don’t affect clitic groups but may affect affixed words.
– Clitics can attach to clitics but affixes don’t attach to clitics.

The more of the criteria that show clitichood for an item, the more clitic-like that item is. Items often don’t behave exclusively like clitics or affixes, and, hence, the boundary between what a clitic is and what an affix is remains fuzzy (Haspelmath and Sims 2010: 197–203).

The fuzziness holds in Persian, where in this chapter we are primarily interested in the clitic/affix boundary, particularly with regard to the third person singular -eš. Some earlier studies, e.g., Lazard (1957), Mahootian (1997), have classified the elements in Table 2, above, as clitics, partly because of their ability to attach to various hosts. Samvelian and Tseng (2010) refer to the items in Table 2 (although in variant forms) as pronominal (en)clitics but they also point to the difficulty in Persian and other languages in differentiating them from affixes. Despite the difficulty in distinguishing clitics from affixes, their conclusion, based on certain phonological effects and co-occurrence constraints, is that these Persian items are more like suffixes than like “independent syntactic elements” (Samvelian and Tseng 2010: 213).

Rasekh (2011, 2014) takes a historical perspective in analyzing the status of clitics in impersonal constructions. The impersonal construction, illustrated in (5), has been much studied because of its apparently nonstandard agreement, where the verb itself shows the null third person singular agreement and the nonverbal element seems to agree with the overt or covert subject, which might be a topic.

(5) xoš-æm        ‘amæd
    like-Enc.1SG  come.Past.3SG
    ‘I liked it’ (Rasekh 2014: 17)


Rasekh determines that the clitic -æm in (5) should be treated as an agreement affix, arguing that the clitics are in the process of grammaticalizing into subject agreement markers. The affixal status of the clitic in (5), according to Rasekh, contrasts with its status as a true clitic in other cases, such as -eš in (6).

(6) (mæn)  did-æm-eš
    (I)    see.Past-1SG.Su-Enc.3SG
    ‘I saw him/her’ (Rasekh 2014: 2)

In effect, Rasekh proposes that an older agreement paradigm is evolving into a new paradigm, as in Table 3, as the clitic -eš evolves into an agreement affix that fills the third person singular gap in the overt endings of the old paradigm.

Table 3: Verbal agreement paradigm’s change in Persian (Rasekh 2011: 25)

      Old Paradigm  New Paradigm
1SG   -æm           -æm
2SG   -i            -i
3SG   -Ø            -eš
1PL   -im           -im
2PL   -id           -id
3PL   -ænd          -ænd

Kazeminejad (2014) argues that in Persian pronominal complex predicates (i.e., Rasekh’s impersonal construction) the pronominal clitic is a phrasal affix and agreement marker (also see Karimi [1997] on complex predicates). In contrast to Rasekh, who calls the null agreement on the verb itself in a sentence like that in (5) a “default / zero morph” (Rasekh 2014: 1), Kazeminejad argues that the null agreement should be interpreted as ordinary third person singular agreement with the subject, which is the theme argument of a state or event, in the case of (5) a state of liking. The enclitic itself agrees with the topic. In the brief review of recent literature above, the consensus is that Persian clitics are or are evolving into agreement markers, at least in some constructions. In the next section we look at various data involving mostly -eš to further examine its status, with an eye toward reminding us of its still robust clitic properties.


3 More data on Persian -eš

In this section we recapitulate a number of facts about Persian clitics, focusing on the third person singular -eš, and then proceed to other observations. The item -eš is most arguably a clitic when it linearly follows an affix, as in (7), where -eš follows the agreement marker -im on the verb.

(7) ma  gorbe-ra     did-im-eš
    we  cat-Def.Acc  saw-1PL-3SG
    ‘We saw the cat’

It is also a clitic when it appears in a location that bars agreement. As mentioned above, in compound verbs the agreement affixes can only appear on the light verb. Thus, if -eš appears on the nonverbal element in a compound verb, it must be a clitic as in (8a), alternating with the construction where the clitic can appear after the affix on the verbal element (8b).

(8) a. mæn  be  færzad  nešan-eš      dad-æm
       I    to  Farzad  indicate-3SG  gave-1SG
       ‘I showed it to Farzad’

    b. mæn  be  færzad  nešan     dad-æm-eš
       I    to  Farzad  indicate  gave-1SG-3SG
       ‘I showed it to Farzad’

But often it is simply unclear whether -eš is a clitic or affix. In (9), the -eš could be agreeing with the singular third person subject or functioning as a clitic for the subject. That the overt subject maman appears isn’t conclusive evidence that -eš is an agreement marker since it is well established that clitics can double with overt arguments.

(9) maman  birun  ræft-eš
    mom    out    went-3SG
    ‘Mom went out’

If -eš is an agreement marker, then a clitic should be able to attach to its right, but this expectation is not borne out, as shown in (10a) and (10b). On the other hand, if -eš is a clitic we’d likewise predict (10a) and (10b) to be acceptable, since clitic sequences are typically permitted.

(10) a. *peyda  kærd-eš-emun
         find   did-3SG-1PL
         intended: ‘She/he found us’

     b. *bær      dašt-eš-ešun
         next to  had-3SG-3PL
         intended: ‘She/he took them’

Generally, it seems, -eš has a definite/specific interpretation; if -eš is a clitic, then it co-refers to, or is co-indexed with, a specific earlier mentioned referent. Example (7) illustrates -eš as co-referential with the definite gorbe-ra ‘the cat’ and both examples in (8) show that -eš refers to a previously introduced referent, translated as ‘it’. However, in asking several native Persian speakers about a variety of sentences (discussed below), we did find a few examples where -eš seemed to be acceptable with an indefinite reading.

(11) yek  gorbe  yek  muš    xord-eš
     a    cat    a    mouse  ate-3SG
     ‘A cat ate a mouse’

In (11), both arguments are indefinite, yet all four speakers that we consulted found the sentence acceptable, with a preference for interpreting -eš as referring to the subject. This fact could be evidence that -eš in (11) marks agreement, since there is no evidence elsewhere in Persian that (in)definiteness is a factor in agreement. This point is also discussed in Fuß (2005: 133). On the other hand, if -eš is an agreement marker, it should allow a clitic to follow, but -eš seems to resist having a clitic after it, as in (12a). In fact, -eš can’t be followed by itself (12b) or any of the other clitics (12c).

(12) a. *did-eš-æm
         saw-3SG-1SG
         intended: ‘She saw me’

     b. *did-eš-eš
         saw-3SG-3SG
         intended: ‘She/he saw him/her’

     c. *did-eš-*et/*emun/*etun/*ešun
         saw-3SG-2SG/1PL/2PL/3PL
         intended: ‘She saw you/us/you(PL)/them’

The data in (12a), (12b), and (12c) argue for -eš being a clitic.

Since Persian is a pro-drop language, it’s possible that the antecedent of a clitic may not be in the same clause, at least not overtly, and may have been


introduced in an earlier clause. In (13), -eš, which can be translated as ‘it’, appears at the end of the second clause, after the agreement marker -æm, while the shirt it’s referring to is mentioned in the first clause, in piran.

(13) in    piran-e    xeyli  čerk-e    emruz  bayæd  be-šur-æm-eš
     This  shirt-Def  very   dirty-is  today  must   Subjunctive-wash-1SG-3SG
     ‘This shirt is very dirty, I’ll have to wash it today’
     (http://persian.nmelrc.org/pvc/pvc.php?verb=shostan&snt, accessed 15 October 2015)

The examples so far suggest that -eš has both affixal properties and properties as an object clitic, but it is arguably a subject clitic as well. Of course the problem here is that if -eš is related to the subject, it might be an agreement marker, as previous literature suggests. In (9), for example, it’s difficult to tell whether the -eš simply agrees with the third person subject maman ‘Mom’ or is an argument clitic.

In compound verbs, a curious distribution results in the positions of the clitics. In transitive compound verbs, -eš can appear on either the verbal (14a) or nonverbal element (14b).

(14) a. peyda  kærd-æm-eš
        find   did-1SG-3SG
        ‘I found it’

     b. peyda-š    kærd-æm
        found-3SG  did-1SG
        ‘I found it’
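Footnote 4 notes that the clitic vowel appears only when the stem ends in a consonant, which is why (14b) has peyda-š rather than peyda-eš. A minimal Python sketch of that alternation follows. The vowel inventory and the function name are assumptions made for illustration, and the sketch ignores any further phonological detail.

```python
# Vowel symbols used in the chapter's broad transcription (assumed inventory).
VOWELS = set("aæeiou")


def attach_clitic(host: str, clitic: str) -> str:
    """Attach a clitic (given without its hyphen, e.g. 'eš') to a host.

    After a vowel-final host the clitic's initial vowel is dropped,
    following the generalization stated in footnote 4.
    """
    if host and clitic and host[-1] in VOWELS and clitic[0] in VOWELS:
        clitic = clitic[1:]
    return f"{host}-{clitic}"


if __name__ == "__main__":
    print(attach_clitic("peyda", "eš"))    # vowel-final host, as in (14b)
    print(attach_clitic("kærd-æm", "eš"))  # consonant-final host, as in (14a)
```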

As mentioned in the literature review in the previous section, in a subclass of compound verbs, the impersonal constructions, the -eš appears only on the nonverbal element as in (15a). Recall that this -eš is argued to be an agreement affix by Rasekh, obligatorily appearing on the nonverbal element and never the verbal element. However, we find that (15b) is acceptable, with the sentence-final -eš in some kind of relationship with færzad. If -eš is a third person agreement affix, at least in some contexts, it might be predicted that (15b) should be good, with -eš being Rasekh’s new-paradigm affix replacing -Ø. But if it’s not an agreement affix, then it is a subject clitic.

(15) a. xoš-eš    umæd
        like-3SG  came
        ‘She/he liked it’

     b. færzad  æz    to   xoš-eš     umæd-e
        Farzad  from  you  like-3.SG  came-3.SG
        ‘Farzad liked you’


Sentences like (15) need further investigation and for now we put them aside, accepting the generalization that impersonal verbs have obligatory agreement on the nonverbal element. Finally, in intransitives, excluding the impersonals, -eš appears only on the verbal element, optionally.

(16) a.  leila  rah  oftad(-eš)
         Leila  way  fell-3SG
         ‘Leila got going’

     b. *leila  ra(h)-(e)š  oftad
         Leila  way-3SG     fell
         intended: ‘Leila got going’

Let’s summarize the patterns. The -eš appears optionally on either element of a transitive compound verb. It is barred from the nonverbal element in intransitives and from the verbal element in experiencer intransitives. Finally, it is obligatory on the nonverbal element in experiencer intransitives. The facts appear in Table 4. In standard notation, an element in parentheses is optional and an asterisk outside the parentheses indicates that the element is obligatory.

Table 4: The appearance of -eš in different compound verb types

Verb type                 Nonverbal element  Verbal element
transitive                (-eš)              (-eš)
intransitive              *-eš               (-eš)
experiencer intransitive  *(-eš)             *-eš (but see example [15])
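The generalizations in Table 4 can also be written down as a small lookup, which makes it easy to check an individual example against them. The Python sketch below is only a transcription of the table under assumed labels ("optional", "barred", "obligatory", and the function name are not the authors' terms), and it checks one position at a time rather than making any claim about -eš appearing on both elements at once.

```python
# Distribution of -eš by compound-verb type, transcribed from Table 4.
# "optional" = (-eš), "barred" = *-eš, "obligatory" = *(-eš).
ES_DISTRIBUTION = {
    "transitive": {"nonverbal": "optional", "verbal": "optional"},
    "intransitive": {"nonverbal": "barred", "verbal": "optional"},
    # The chapter flags the verbal slot here as uncertain; see example (15).
    "experiencer intransitive": {"nonverbal": "obligatory", "verbal": "barred"},
}


def is_predicted_acceptable(verb_type: str, position: str, has_es: bool) -> bool:
    """Check one position (with or without -eš) against Table 4."""
    status = ES_DISTRIBUTION[verb_type][position]
    if status == "barred":
        return not has_es
    if status == "obligatory":
        return has_es
    return True  # optional: acceptable either way


if __name__ == "__main__":
    # (16a): -eš on the verbal element of an ordinary intransitive.
    print(is_predicted_acceptable("intransitive", "verbal", True))
    # (16b): -eš on the nonverbal element of an ordinary intransitive.
    print(is_predicted_acceptable("intransitive", "nonverbal", True))
```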

One potentially interesting asymmetry is the unacceptability of -eš on the nonverbal element of ordinary intransitive verbs compared with the unacceptability of -eš on the verbal element of the experiencer intransitives. Another asymmetry is that in transitive compound verbs, -eš optionally appears on the nonverbal element or on the verbal element, while for both kinds of intransitives there is some restriction on the appearance of -eš. Finally, only in the case of the experiencer intransitives is -eš required. Indeed this obligatory appearance is one reason for thinking -eš here is an agreement affix. But if optionality is taken as a diagnostic, all the other appearances of -eš could be analyzed as clitics. Recapitulating, there is mixed evidence as to whether -eš is a clitic or an affix.

Relying on data in the literature and our own judgments, we were nonetheless curious to get input from at least a small sample of nonlinguists. The judgments were gathered by asking four adult native speakers of Persian (three university students and one homemaker) about the use of -eš, through both grammaticality judgments and intuitions about meaning. The sentences were presented in written Persian and participants were asked to give an acceptability judgment and to state whether the -eš refers to the subject or the object. Because of the small sample size, the results are not conclusive, but we think they are suggestive. It should be noted that some of the responses were anomalous or nonsensical and we didn’t include them. As an example of an anomalous result, for one sentence participants said that -eš referred to an object, although the verb was intransitive. Further, although we included them, some judgments didn’t agree with ours, with participants claiming ungrammaticality for sentences that we considered acceptable. While of course judgments may vary, there were cases where we felt the participants’ judgments of unacceptability stemmed from lack of context, since we presented the sentences in isolation and in written rather than in spoken form, where in some cases intonation would have clarified acceptability or ambiguity. For example, in (xii) in Table 5, the speakers all judged the sentence acceptable and said that -eš referred to the object. However, an intonation shift makes it easy to interpret the clitic as referring to the subject. Similarly, a different intonation allows for the clitic to be interpreted as co-referential with the subject instead of the object. Also, in a number of cases participants had a consistent judgment of an unambiguous referent of -eš where we found ambiguity. This too must be more rigorously studied, but for now we suggest that some of the unambiguous readings for participants were really preferred readings, with another reading being possible. In Table 5, we summarize the results of our survey. In the left column is the sentence, with our intended meaning. In each of the other four columns is a speaker’s rating of acceptability and meaning.
In a few instances participants judged a sentence as unacceptable but commented on what -eš was anyway. On the top row of each cell of the responses are their acceptability judgments: G for grammatical, U for ungrammatical. The notation on the second row of each cell of speakers’ responses indicates what the speakers thought the referent of -eš was: S if referring to the subject and O if referring to the object. S/O indicates that the speaker thought it could refer to either. In sentences where -eš appeared twice, we indicated the left -eš as 1 and the second -eš as 2. So “1-O 2-S/O” means that the speaker thought the first -eš referred to the object and the second -eš could refer to either the subject or the object. Note that speaker 4 didn’t complete the questions. As can be seen in Table 5, the results are mixed, which perhaps isn’t surprising given common analyses of -eš as affix and/or clitic. Participants were entirely consistent on some sentences: for example, in (xvii) all four participants agreed that -eš referred to the object. However, for others there was variation on whether the sentence is grammatical and what the clitics referred to.


Shahrzad Mahootian and Lewis Gebhardt

Table 5: Speakers’ judgments on the use of -eš
(Each cell gives the acceptability judgment, then the meaning of -eš.)

Persian sentence and intended meaning               Speaker 1       Speaker 2       Speaker 3       Speaker 4
i. bidar-eš kærd-eš ‘She woke him up’               U 1-O 2-S/O     U 1-O 2-O       G 1-O 2-O
ii. gorbe yek muš xord-eš ‘The cat ate a mouse’     U S             U O             G O
iii. širin bidar-eš kærd-eš ‘Shirin woke him up’    U 1-O 2-S       U 1-O 2-O       G 1-O 2-S/O
iv. xord-æm-eš ‘I ate it’                           G O             G O             G O
v. gerɛft-ænd-et ‘They caught you’                  G O             G O             G O
vi. xanum-ha resid-ænd-ešun ‘The women arrived’     G S             G S             G S
vii. xord-id-eš ‘You ate it’                        G O             G O             G O
viii. ketab-ra xund-eš ‘She read the book’          G S             G O             G S/O
ix. yek gorbe yek muš xord-eš ‘A cat ate a mouse’   G S             G S             G S/O
x. leila ketab-ra xund-eš ‘Leila read the book’     G S/O           U S             G S/O
xi. peyda-š kærd-eš ‘She found it’                  G 1-O 2-S/O     U 1-O 2-O       G 1-S/O 2-S/O   G O-O
xii. kia xord-eš ‘Kia ate it’                       G O             G O             G O             G O
xiii. xabid-eš ‘She slept’                          U S             U ?             G S             G S
xiv. xab-eš bord ‘She fell asleep’                  G S             G S             G S             G S
xv. peyda-š kærd ‘She found it’                     G O             G O             G O             G O
xvi. gorbe muš xord-eš ‘The cat ate a mouse’        G S/O           G O             G S             U S
xvii. bidar-eš kærd ‘She woke him up’               G O             G O             G O             G O
xviii. xord-eš ‘She ate it’                         G O             G O             G O             G S/O
xix. bidar kærd-eš ‘She woke her up’                G O             G O             G O
xx. peyda kærd-eš ‘She found it’                    G O             U O             G O
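The degree of agreement among speakers in Table 5 can be made explicit by tallying the responses row by row. The following Python sketch is an illustration only: it transcribes five of the rows that have four responses, and the names TABLE5, tally, and is_unanimous are hypothetical helpers, not part of the original survey.

```python
from collections import Counter

# (acceptability, referent) per speaker, transcribed from Table 5.
# Only a few rows with four responses are included here.
TABLE5 = {
    "xii": [("G", "O"), ("G", "O"), ("G", "O"), ("G", "O")],
    "xiii": [("U", "S"), ("U", "?"), ("G", "S"), ("G", "S")],
    "xiv": [("G", "S"), ("G", "S"), ("G", "S"), ("G", "S")],
    "xvi": [("G", "S/O"), ("G", "O"), ("G", "S"), ("U", "S")],
    "xvii": [("G", "O"), ("G", "O"), ("G", "O"), ("G", "O")],
}


def tally(responses):
    """Count acceptability judgments and referent choices separately."""
    acceptability = Counter(a for a, _ in responses)
    referent = Counter(r for _, r in responses)
    return acceptability, referent


def is_unanimous(responses):
    """True if every speaker gave the same judgment and the same referent."""
    return len(set(responses)) == 1


for label, responses in TABLE5.items():
    acc, ref = tally(responses)
    verdict = "unanimous" if is_unanimous(responses) else "mixed"
    print(label, dict(acc), dict(ref), verdict)
```

With only four speakers such counts are descriptive at best; the sketch merely separates the rows on which the speakers converge, such as (xvii), from those on which they do not, such as (xvi).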


There was a tendency for speakers to see the -eš on the nonverbal element as connected to the object, while the clitic on the light verb was more open to a subject-oriented reading. This is further illustrated in the following contrast. In (17a) speakers interpreted -eš on the nonverbal element as referring to the object, færhad, while in (17b) they saw -eš on the verbal element as referring to the subject, širin.

(17) a. širin [færhad-o]ᵢ peyda-[š]ᵢ kærd
        Shirin Farhad-Def.Acc find-3SG did
        ‘Shirin found Farhad’

     b. širinᵢ færhad-o peyda kærd-[eš]ᵢ
        Shirin Farhad-Def.Acc find did-3SG
        ‘Shirin found Farhad’

4 Conclusion

In this chapter we noted a tendency in the recent literature to analyze clitics, at least in some constructions, as affixes. We lodged no major objection to the idea that Persian clitics may be becoming affixes, and certainly it is the case that, like clitics in many languages, Persian clitics are not clearly one category or another. Further, we presented data to remind us that there’s still a lot of cliticness in clitics. Taking a more neutral stance in collecting data from native speakers in a preliminary informal survey, we simply asked participants “Is -eš the subject or the object?”, leaving aside whether the clitics are more clitic-like or affix-like in any given sentence. Though the results are not conclusive, some patterns emerged, particularly a preference for -eš on the nonverbal element in compound verbs to be associated with the object and -eš on the verbal element to be associated with the subject. In further study, we will sample a wider range of ordinary verbs and compound verbs to test native speakers’ understanding of -eš, expecting that a larger sample will provide for stronger generalizations that will help to address more accurately the question of what -eš is in particular, and what clitics are more generally in Persian.


Arseniy Vydrin

14 ‘Difficult’ and ‘easy’ in Ossetic

Abstract: In this article, I will study the peculiarities of the Ossetic dedicated construction conveying the meanings ‘difficult/easy to accomplish’. By dedicated I mean that the construction expresses nothing but the intended meaning. The construction is not mentioned in Ossetic standard grammars and is unique among Iranian and Caucasian languages. The construction is close to passive constructions and to a modal construction of possibility; however, its origin and the time of its appearance in Ossetic are not clear.

Keywords: modality, grammaticalization, Ossetic, Iranian languages, Caucasian languages

1 Introduction

Though the meanings ‘easy/difficult to accomplish’ can be expressed by grammatical means, e.g., via syntactic constructions, these constructions are usually not dedicated ones and have other meanings. Cf. the so-called Tough Construction (Comrie and Matthews 1990), which besides easy and difficult can be used with other VALUE adjectives, e.g., This room is pleasant to sleep in. This article deals with a typologically rare grammaticalization of the modal meaning ‘easy/difficult to accomplish’, which is found in the modern Ossetic language (Eastern Iranian). In section 2, I give general information on the Ossetic language and clarify the sources of my data. Section 3 deals with grammatical ways of expressing the meaning ‘easy/difficult to accomplish’. In sections 4, 7, and 8, I describe the formation, morpho-syntactic properties, and semantics of the dedicated construction expressing the meaning ‘easy/difficult to accomplish’. In sections 5 and 6, I compare the standard passive constructions and complex verbs to the dedicated construction under discussion. Section 9 is an attempt to trace the origin of the construction. Section 10 shows that none of the other Iranian languages and almost none of the Caucasian languages geographically close to Ossetic have dedicated grammatical means to express the meaning ‘easy/difficult to accomplish’. Section 11 is the conclusion.

This research was carried out with the financial support of the fellowship of the president of the Russian Federation (MK-1920.2014.6) and the RFH grant no. 13-04-00342. I thank my consultants – Ossetic native speakers – for their patience. Arseniy Vydrin, Institute for Linguistic Studies of the Russian Academy of Sciences. DOI 10.1515/9783110455793-015

2 Ossetic language

Ossetic is an Eastern Iranian language that is mainly spoken in the Caucasus, in the Republic of North Ossetia-Alania, Russia, and in the Republic of South Ossetia. The total number of Ossetic native speakers in the world is around 529,000 (Ethnologue 2017). Ossetians living in Russia are usually bilingual (Russian and Ossetic). Ossetic has two major dialects: Iron and Digor. This article focuses on the Iron dialect, on which literary standard Ossetic is based. However, there are some published texts (fiction, poetry, newspapers) in Digor Ossetic as well (cf. our Digor Ossetic written corpus http://corpus-digor.ossetic-studies.org/en).

The Ossetic examples cited in the paper are taken from three sources: the Ossetic National Corpus, Ossetic oral texts, and my field data. The Ossetic National Corpus (ONC) is available online (http://corpus.ossetic-studies.org/en) and mainly consists of contemporary texts from fiction and the literary journal Max dug published in North Ossetia in 2000‒2012. All the texts have been automatically morphologically annotated in English and Russian. By the time of submission, ONC had about 5 million tokens. The examples from the Ossetic National Corpus are marked as ONC and have a reference to the source. The oral texts are also available online (http://www.ossetic-studies.org/en/texts/iron) and consist of Iron dialect texts recorded in different parts of North Ossetia in 2007‒2012. All the oral texts are transcribed, translated, and interlinearized in English and Russian. By the time of submission, the oral texts had about 50,000 words. The examples from the oral texts are marked as Oral text and have the name of the text and the sentence’s number. My field data was collected in North Ossetia from 2008 to 2010. Examples obtained from native speakers do not have any special marking.

3 “Difficult to accomplish” and “easy to accomplish” in Ossetic

Ossetic has several grammatical ways to convey the meanings ‘difficult to accomplish’ and ‘easy to accomplish’. In what follows, I will briefly discuss them.


3.1 Dative construction

The most usual way to express the meanings ‘difficult to accomplish’ and ‘easy to accomplish’ is to use the construction where the adjective ɜnson ‘easy’ or žən ‘difficult’ forms a predicate with the verb wɜvǝn ‘to be’ in third person singular or with the habitual verb of being vɜjjǝn. The lexical verb is used in the infinitive (-ǝn ending) and the Experiencer is marked by the dative. The construction is similar to English: “It is difficult / easy for someone to do something”. I will call it the dative construction. Compare the examples with žən ‘difficult’ (examples [1]¹ and [3]) and ɜnson ‘easy’ (example [2]) and with a transitive (example [1]) and an intransitive (example [3]) verb.

(1) nɜ tsɜ ɜrba-jjɜft-on ɜmɜ mǝn žǝn
    NEG 3PL.ENCL.GEN PREF-catch.up-PST.TR.1SG and 1SG.ENCL.DAT difficult
    u tsɜ nom zur-ǝn
    be.PRS.3SG POSS.3SG name say-INF
    ‘They had died before I met them and it is difficult for me to say their names’ (Oral text. Zangieva_Khabalova_2. 71.1).

(2) ɜmɜ iwǝrdǝgɜj mɜ-χisɜn dɜr tǝng ɜnson vɜjj-ǝ
    and from.one.side POSS.1SG-RFL.DAT FOC very easy be.HAB-PRS.3SG
    wǝj tǝχχɜj ɜmɜ ɜž warž-ǝn iron ɜvžag
    that.GEN POST and I love-PRS.1SG Ossetic language
    ‘On the one hand, it is very easy for me [to speak Ossetic] because I love Ossetic language’ (Oral text. Alagir school. 2.9).

(3) χɜrž žǝn u mɜl-ǝn, mе fšǝmɜr ɜmbаžǝg. . . ɜmbаžǝg. . .
    very difficult be.PRS.3SG die-INF POSS.1SG brother companion.in.arms companion.in.arms
    ‘It is very difficult to die, my brother companion-in-arms. . . companion-in-arms’ (ONC. Max dug, no. 7, 2003).

1 All the examples are transcribed in accordance with the modern standard (Iron) Ossetic pronunciation (for details see Dzahova 2009). In general, the most important phonetic peculiarities are connected to the pronunciation of the consonants (in Ossetic script) с, з, дз, and ц. In most of the cases, the letter с is pronounced as /š/, з as /ž/, дз as /z/, and ц as /s/. While transcribing the examples I used the IPA symbols. Clitics are used without the symbol = and follow the Ossetic orthography (sometimes they are written separately, sometimes with a hyphen).


Note that in the dative construction the auxiliary is obligatory in third person singular. The construction is not restricted to the meanings ‘difficult’ and ‘easy’; other VALUE adjectives also can be used in it. Compare the following example with the adjective ɜχson ‘pleasant’:

(4) Asimo-imɜ nəχaš kɜn-ən adɜjmag-ɜn ɜχson u
    Asimo-COM word do-NMLZ man-DAT pleasant be.PRS.3SG
    ‘It is a pleasure for a human being to talk with Asimo (robot’s name)’ (ONC. Max dug, 2003, no. 5, p. 152).

The lexical verb can be omitted in the dative construction. Compare:

(5) žǝn nǝn zǝ wǝdi, tǝng žǝn
    difficult 1PL.ENCL.DAT 3SG.ENCL.INESS be.PST.3SG very difficult
    nǝn zǝ-iw wǝdi, fɜlɜ wɜddɜr
    1PL.ENCL.DAT 3SG.ENCL.INESS-ITER be.PST.3SG but however
    ‘It was hard there, it was very hard for us [to live there], though (we don’t complain)’ (Oral text. Aguzarova Izeta. 33.4).

3.2 Facilitive-difficilitive construction

There is another construction in Ossetic, which also consists of the adjective ɜnson ‘easy’ or žən ‘difficult’, the auxiliary wɜvǝn ‘to be’ (or the habitual auxiliary of being vɜjjǝn), and a verbal derivate of a lexical verb. A Patient-like participant is marked by the nominative while an Agent-like participant² or Sole participant is obligatorily in the dative. When the construction is formed from a transitive verb, the auxiliary agrees in person and number with a Patient-like participant.³ Compare the examples (6) to (8) below: example (6) shows the use of the construction with the adjective ‘easy’, while in examples (7) and (8), the adjective ‘difficult’ is used. In examples (6) and (7), the construction is formed from a transitive verb, in example (8) from an intransitive.

(6) nɜl fǝš-ǝ dǝmɜg-аw ɜnson kɜrd-ɜn štǝ
    male sheep-GEN fat.tail-EQU easy cut-NMLZ be.PRS.3PL
    ‘It is easy to cut them, like a sheep’s fat tail’ (Abaev 1959: 112).

(7) asə fəš-tɜ nən žən ɜrs-aχš-ɜn štə
    this sheep-PL.NOM 1PL.ENCL.DAT difficult PREF-catch-NMLZ be.PRS.3PL
    ‘It is difficult for us to catch these sheep’ (lit. ‘these sheep are difficult for catching for us’).

(8) ɜnɜ dɜw mən žən sɜr-ɜn u
    without you.SG 1SG.ENCL.DAT difficult live-NMLZ be.PRS.3SG
    ‘It is difficult for me to live without you’ (ONC. Kokaev T. A. Nebesnyj ključ. Vladikavkaz, 2004, p. 91).

2 The term Agent-like participant is broader than A(gent), see example (31) where the unexpressed first participant of the verb ‘to forget’ is Experiencer rather than Agent. Agent-like participant means “the main participant, the ‘hero’ of the Situation, who is primarily responsible for the fact that this Situation takes place” (Kibrik 1997: 292). For this participant, Kibrik proposed the hyperrole Principal. Patient-like participant is “the most Effect(Patient)-like participant of a multi-participant event” (Kibrik 1997: 292). Kibrik uses the term Patientive for this participant.
3 The basic strategy of argument case marking is nominative for Sole participant of an intransitive verb and for the first participant of a transitive verb (Subject) and nominative/genitive for the second participant of a transitive verb (Object). The choice between the nominative and the genitive depends on animacy and definiteness (the genitive only if an Object is animate).

The construction differs from the dative construction morphologically, syntactically, and semantically. First of all, the construction is restricted to only two adjectives – ɜnson ‘easy’ or žən ‘difficult’ – and conveys only ‘easy to accomplish’ or ‘difficult to accomplish’ respectively; compare example (6) to (7) and (8). Hereinafter I will refer to it as facilitive-difficilitive construction,⁴ or FDC, from Latin facilis ‘easy’ and difficilis ‘difficult’. Secondly, in FDC, a lexical verb is obligatorily used in the non-finite form in -ɜn (the properties of this verbal form will be considered in section 4.1 below). Finally, the auxiliary in FDC has a full person and number paradigm and agrees in person and number with a Patient-like participant (if the lexical verb is transitive); compare examples (6) and (7) where the auxiliary is in third person plural; an Agent-like participant is obligatorily marked by the dative; cf. example (7). If the lexical verb is intransitive, the auxiliary is used in third person singular and a Sole participant is marked by the dative (example [8]). It is important to point out that, when FDC is formed from a transitive predicate, the auxiliary can agree only with a Patient-like participant; cf. (9) where the auxiliary agrees with the Recipient and the sentence is ungrammatical:

(9) *wǝdon ɜnson gɜrtam ratt-ɜn štǝ
    they easy bribe give-NMLZ be.PRS.3PL
    Intended meaning: ‘It is easy to give them a bribe’.

4 It could be split up into two constructions (the facilitive construction and the difficilitive construction); however, the use of different adjectives does not affect the morphology or syntax of the construction. I have decided to consider it as a single facilitive-difficilitive construction.

Also note that when FDC is formed from a ditransitive predicate with an overt Agent (e.g., ‘it was difficult for us to give her a great deal of assistance’), the Recipient (which is also marked by the dative in Ossetic) is obligatorily omitted; cf. the following, where example (10) allows the insertion of an Agent (example [10a]), but it does not allow the simultaneous expression of Agent and Recipient (example [10b]).

(10) sɜj žǝn dɜtt-ɜn dɜ, ud, ɜmɜ sɜj žǝnarγ dɜ, sаrd
     PRTCL difficult give-NMLZ be.PRS.2SG soul and PRTCL valuable be.PRS.2SG life
     ‘How is it difficult to give you, the soul (lit. ‘how are you difficult for giving, the soul’), and how are you valuable, the life!’ (ONC. Degoeva S. M. Pogasšyj luč solnca. Vladikavkaz: Ir, 2002).

     a. sɜj žǝn dɜtt-ɜn mǝn dɜ, mɜ ud
        PRTCL difficult give-NMLZ 1SG.ENCL.DAT be.PRS.2SG POSS.1SG soul
        ‘How it is difficult for me to give you, my soul!’

     b. *sɜj žǝn dɜtt-ɜn mǝn ɜn dɜ, ud
        PRTCL difficult give-NMLZ 1SG.ENCL.DAT 3SG.ENCL.DAT be.PRS.2SG soul
        Intended meaning: ‘How it is difficult for me to give you, the soul, to him!’ (lit. ‘how are you difficult for giving, the soul, for me to him’).

The omission of Recipient in this case is explained by the fact that its place is already occupied by the dative participant (Agent). In FDC formed from a transitive predicate, the auxiliary obligatorily agrees with a Patient-like participant in person but not always in number; cf. the following example where the auxiliary is in plural, while the Patient is in singular:

(11) žǝn dar-ɜn wǝdǝštǝ utɜppɜt sot
     difficult keep-NMLZ be.PST.3PL so.much posterity
     ‘It was difficult to keep so many descendants’ (ONC. Biragova L. H., Agkaceva L. T. Sbornik diktantov. Vladikavkaz: Ir, 2005, p. 127).


Disagreement in number is common in Ossetic if the subject is a collective noun (Ahvlediani 1969: 123). Though the dative construction is well known to Ossetic grammars, FDC is mentioned neither in Ossetic standard grammars (e.g., Abaev 1964) nor in dedicated studies of Ossetic modality (e.g., Tehov 1970) and the verb (e.g., Takazov 1992). In the following sections, I will discuss only FDC. In the next three sections, I will consider the FDC’s constituents, namely, the verbal form in -ɜn, the adjectives žən ‘difficult’ and ɜnson ‘easy’, and the auxiliary wɜvǝn ‘to be’. I will show that, though FDC is similar to the passive construction and complex predicates, it differs from both of them and has to be considered a separate construction (sections 5 and 6). In sections 7, 8, 9, and 10 I will discuss FDC’s peculiarities, semantics, and origin.
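The case-marking and agreement pattern of FDC described in this section can be summarized in a short Python sketch. It is a minimal illustration rather than an analysis: the names Participant and fdc_marking are hypothetical, and the number mismatch with collective nouns shown in (11) is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    person: int  # 1, 2 or 3
    number: str  # "SG" or "PL"


def fdc_marking(transitive: bool, patient: Optional[Participant] = None) -> dict:
    """Return case marking and auxiliary agreement for an FDC clause (sketch)."""
    if transitive:
        if patient is None:
            raise ValueError("a transitive FDC needs a Patient-like participant")
        # Patient-like participant: nominative, controls the auxiliary.
        # Agent-like participant: dative.
        return {
            "patient_case": "NOM",
            "agent_case": "DAT",
            "aux_agreement": (patient.person, patient.number),
        }
    # Intransitive: the Sole participant is dative, the auxiliary is 3SG.
    return {"sole_participant_case": "DAT", "aux_agreement": (3, "SG")}


# (7): 'these sheep' is a 3PL Patient, so the auxiliary is be.PRS.3PL
print(fdc_marking(True, Participant(3, "PL")))
# (8): 'to live' is intransitive, so the auxiliary is be.PRS.3SG
print(fdc_marking(False))
# (10): a 2SG Patient ('you, the soul') gives be.PRS.2SG
print(fdc_marking(True, Participant(2, "SG")))
```

The dative construction of section 3.1 would differ on exactly this point, since its auxiliary is fixed in third person singular regardless of the participants.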

4 FDC’s constituents

4.1 Verbal derivate in -ɜn

The verbal derivate in -ɜn, which is used in FDC, is formed from the present stem of a verb by the suffix -ɜn. The origin of the suffix is uncertain. There were some attempts to connect it to the Indo-Iranian verbal suffix -anā (Takazov 1992: 108–137); however, -ɜn cannot continue -anā for phonetic reasons. Thordarson proposes to derive -ɜn from the Old Iranian nominal suffix *-ana- (Thordarson 2009: 145‒146). The suffix -ana- was used in Old Aryan to derive verbal abstracts, “names of tools and places suitable or intended for some action” (Thordarson 2009: 146): Old Indian sam-áraṇa, Old Persian ham-arana ‘battle’ (ar- ‘to move’). The Ossetic dative case was also derived from the suffix *-ana. The dative case and the verbal suffix -ɜn express close functions: the meaning ‘to, for’, the direction or destination of the referent. “It is therefore natural to presume that at some stage in the history of the language the derivative suffix *-ana was grammaticalised as a case ending” (Thordarson 2009: 146).

The verbal derivate in -ɜn can be used as a head of a noun phrase (χiž-ɜn graze.PRS-NMLZ ‘pasture’) or an attribute (fəšš-ɜn zawma-tɜ write.PRS-NMLZ thing-PL.NOM ‘writing-materials’). Besides FDC, the derivate can be used only in the modal construction of participant-external possibility, which will be briefly discussed in section 9. In the examples, for convenience I will interlinearize the suffix -ɜn as NMLZ (nominalization); however, the nominalizing function of the suffix needs a separate study.


4.2 Nouns ɜnson and žən

The words ɜnson and žən used in FDC can function either as attributes (‘easy’, ‘difficult’) or nouns (‘easiness’, ‘difficulty’). The origin of žən ‘difficult(y)’ is connected to the Avestan zyā- ‘to damage’, cf. Sogdian *žīn- ‘heavy, painful’ (Abaev 1989: 322). ɜnson ‘easiness, easy’ is a parallel form of ɜnsoj ‘rest’, which originates from *ham-č(y)āna, č(y)ā- ‘rest’, ‘enjoy’ and the prefix ham- (Abaev 1958: 151–152). Besides FDC, ɜnson and žən can be used independently as attributes (12) or heads of a noun phrase (ɜnson ‘convenience’ and žən ‘difficulty’). The essential peculiarities of the Ossetic noun phrase are as follows (for more details, see Thordarson [2009]): an attribute and any dependent noun phrase precedes its head; an attribute has no case or number marking; possessive proclitics and possessive noun phrases precede the first attribute of the noun phrase; the noun phrase cannot be split by any external material, including clitics or particles; the only inflexional suffix used with adjectives is the comparative suffix -dɜr. The example below shows a noun phrase with the possessive noun phrase and the comparative suffix -dɜr used with the adjective:

(12) šǝrdon tа jɜ sard-ǝ [žǝn-dɜr bon-t-ɜm]NP ɜr-χɜccɜ. . .
     Syrdon CONTR POSS.3SG life-GEN difficult-COMPAR day-PL-ALL PREF-reached
     ‘Syrdon again has reached the most difficult days of his life’ (ONC. Džusojty N. G. Sljozy Syrdona. Vladikavkaz, 2004, p. 640).

According to ONC data, when used in FDC, ɜnson ‘easy’ or žən ‘difficult’ function as attributes to the verbal derivate in -ɜn and the word combination can be analyzed as a noun phrase.⁵ In FDC, ɜnson ‘easy’ or žən ‘difficult’ are always preposed to the verbal form in -ɜn and cannot be separated from it by other words, clitics, or particles. The second-position enclitic pronouns cannot be put between ɜnson ‘easy’ / žən ‘difficult’ and the verbal form in -ɜn; cf. the following example, where the enclitic is obligatorily used after the verbal form in -ɜn:

(13) ɜnson kuš-ɜn zə u?
     easy work-NMLZ 3SG.ENCL.INESS be.PRS.3SG
     ‘Is it easy (comfortable) to work in it (in the tails)?’ (ONC. Маx dug, no. 5, 2003, p. 130).

5 However, it should be pointed out that, when used in FDC, the head of the noun phrase cannot vary in case and number.


Used in FDC, ɜnson ‘easy’ or žən ‘difficult’ cannot have case or number affixes; however, they can be marked by the comparative suffix -dɜr; cf.:

(14) žɜχχ-ə ɜmɜ foš-ə kwəštɜg-tɜ χɜχbɜšt-ə
     land-GEN and cattle-GEN work-PL.NOM mountainous.terrain-INESS
     wɜldaj žən-dɜr kɜn-ɜn wəd-əštə
     particularly difficult-COMPAR do-NMLZ be-PST.3PL
     ‘It was more difficult to farm the land and to ranch the cattle, particularly in the mountainous terrain’ (ONC. Max dug, no. 8, 2004, p. 91).

It is worth mentioning that, used outside FDC with some verbal forms in -ɜn, ɜnson and žən can form set expressions, e.g., ɜnson-ɜmbar-ɜn ‘clear, understandable’ (easy-understand-NMLZ), žǝn-bǝχš-ɜn ‘something difficult to endure’ (difficult-endure-NMLZ). However, in FDC, ɜnson or žən do not form a compound with the verbal form in -ɜn, as they maintain their attributive functions (they can be marked by the comparative suffix -dɜr).

4.3 Auxiliary

The auxiliary wɜvǝn ‘to be’ can be used in FDC in all tenses and moods, including the imperative; see example (15). In third person singular of the present indicative, the form u is used – see example (13); note that in some other modal constructions the existential copula i / iš / j⁶ is used in this case; see section 9. The auxiliary wɜvǝn ‘to be’ can be used with verbal prefixes; see example (15).

(15) nɜ foš žəmɜdž-ə ɜnson dar-ɜn, ɜnson χɜšš-ɜn fɜ-wɜnt!
     POSS.1PL cattle winter-INESS easy keep-NMLZ easy keep-NMLZ PREF-be.IMP.3PL
     ‘Let our cattle ranch easily during the winter time!’ (ONC. Ajlarov I., Gadžinova R., Kcoeva R. Poslovicy. Vladikavkaz, 2005, p. 606).

6 The allomorph j is used with negation.

In FDC, besides the verb wɜvǝn ‘to be’, the habitual verb vɜjjǝn ‘to be’ can be used, which has only third person singular and third person plural forms of the present indicative; cf.:


(16) finnag gorɜt Kuopio-jǝ universitet-ǝ psiχolog-tɜ
     Finnish town Kuopio-GEN university-GEN psychologist-PL.NOM
     kwǝd š-bɜrɜg kod-t-oj, aftɜmɜj, dam, pessemist-t-ɜn nɜ,
     how PREF-known do-TR-PST.3PL so CIT pessimist-PL-DAT NEG
     fɜlɜ optimist-t-ɜn žǝn-dɜr sɜr-ɜn vɜjj-ǝ
     but optimist-PL-DAT difficult-COMPAR live-NMLZ be.HAB-PRS.3SG
     ‘The psychologists of the Kuopio university found out that the life is more difficult not to the pessimists but to the optimists’ (ONC. Max dug, no. 8, 2002, p. 163).

Negation particles always attach to the auxiliary in FDC; cf.:

(17) . . .wəj ɜnson ra-χat-ɜn nɜ wəzɜn
     that easy PREF-understand-NMLZ NEG be.FUT.3SG
     ‘It will be not easy to understand it’ (ONC. Bicoev G. Kh. Večernjaja zvezda. Vladikavkaz, 2003, p. 156).

The auxiliary can be either postposed (see example [17]) or preposed (see example [18]) to the noun phrase. In ONC, a preposed auxiliary is rarer than a postposed one.

(18) fɜlɜ ajnɜdž-ə šɜr-t-ə marγ-ɜn u ɜnson a-tɜχ-ɜn, adɜm,
     but rock-GEN top-PL-INESS bird-DAT be.PRS.3SG easy PREF-fly-NMLZ man
     gal-tɜ ɜmɜ wɜrdɜ-ttɜ sə kɜn-oj tɜχ-ən nɜ žon-əns!..
     ox-PL.NOM and cart-PL.NOM what do-CONJ.3PL fly-INF NEG know-PRS.3PL
     ‘But for a bird it is easy to fly (lit. ‘a bird easily flies = a bird can fly easily’) across a rock and what should men, oxen and carts do, they cannot fly!’ (ONC. Džusojty N. G. Sljozy Syrdona. Vladikavkaz, 2004, p. 472).

The auxiliary can be separated from the noun phrase with the verbal form in -ɜn; cf. the following examples in which the citative clitic and the interrogative adverb are placed between the verbal derivate in -ɜn and the auxiliary:

(19) žən ɜmbar-ɜn, dam, štə, Nafi-jə fəšt-ət-aw wɜžžau. . .
     difficult understand-NMLZ CIT be.PRS.3PL Nafi-GEN letter-PL-EQU hard
     ‘They say, it is difficult to understand them as the Nafi’s letters, it’s hard . . .’ (ONC. Маx dug, no. 9, 2003, p. 21).


(20) kɜd χwǝsaw iš wɜd rɜštzinad aftɜ žǝn ar-ɜn sɜmɜn u
     if God EXT then truth so difficult find-NMLZ why be.PRS.3SG
     ‘If God exists, why is it so difficult to find the truth’ (ONC. Aγnatǝ Gɜštɜn. Temǝrǝ kɜštɜr čǝžg. Vladikavkaz, 2013).

Also see example (30) where the subordinator stands between the combination of ɜnson + the verbal derivate in -ɜn and the auxiliary.

5 Voice and FDC

In the previous section, it has been mentioned that in FDC, the auxiliary agrees with a Patient-like participant, while an Agent-like participant is marked by the dative. One can note that FDC is close to passive constructions. Ossetic has a number of passive and modal passive constructions. In the paper, I will mention only the standard passive construction (for details, see Vydrin 2011). The standard passive construction is formed by the past participle of a lexical verb and one of the auxiliaries, sɜwǝn ‘to go’ or wɜvǝn ‘to be’ (or the habitual vɜjjǝn), which agrees with Patient in person and number; Agent is marked by the ablative; cf. the active construction (see example [21]) and the standard passive construction (see example [22]):

(21) kušdžǝ-tɜ χɜzаr аrɜžt-оj
     worker-PL.NOM house build-PST.TR.3PL
     ‘The workers have built the house’.

(22) χɜzаr аrɜžt u kušžǝ-t-ǝ аmаl-ɜj
     house built.PART.PST be.PRS.3SG worker-PL-GEN means-ABL
     ‘The house has been built by the workers’ (lit. ‘by the powers of workers’).

Besides different verbal forms (past participle vs. the form in -ɜn), the standard passive construction differs from FDC in the following: (a) case marking of Agent-like participant (the dative in FDC and the ablative in the passive construction); (b) transitivity of the lexical verb (FDC is used both with transitive and intransitive verbs, while the passive construction can be formed only from transitive verbs); (c) semantics (the standard passive construction does not convey any modal meaning, while FDC does).

Arseniy Vydrin

6 FDC and Ossetic complex predicates

As in all other modern Iranian languages, the bulk of the Ossetic verbal lexicon is formed by the so-called complex verbs: predicates consisting of an N-constituent (a noun, an adjective or a verbal derivate) and a V-constituent (a finite verb); N is always preposed to V; for example, š-aχwǝr kɜn-ǝn (PREF-study do-INF) ‘to teach’, ‘to study’. A detailed discussion of the morphological and syntactic peculiarities of Ossetic complex verbs lies beyond the scope of this article (cf. Grashchenkov 2010; Vydrin 2014: 43–48). According to my understanding of Ossetic complex verbs, their key features are as follows: the constituents cannot be separated from each other by other words or moved to another position; N cannot take most nominal inflection; the verbal negation particles cannot attach to the verb and are placed in front of the whole complex predicate; verbal prefixes are attached only to N.

Comparing these features of complex verbs with the syntactic features of FDC, one can note that the auxiliary and the combination of ‘easy’/‘difficult’ + a verbal derivate in -ɜn cannot be considered a complex predicate. The auxiliary in FDC can be separated from the adjective + derivate in -ɜn combination; it can also be used in front of it; the negative markers are attached only to the auxiliary, while the verbal prefixes can attach either to the auxiliary or to the verbal derivate in -ɜn (or to the nominal part of the verbal derivate in -ɜn in the case of a complex verb) (see section 4.3).
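The comparison made in this section can be laid out as a small feature table. The following Python sketch is only an illustration: the class name `Diagnostics`, its field names and the Boolean encoding are assumptions introduced here and are not part of the original analysis. The restriction on nominal inflection of N is left out because the text states no FDC counterpart for it.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Diagnostics:
    """Hypothetical Boolean encoding of the diagnostics discussed in the text."""
    separable: bool                # other words may intervene between the two parts
    verbal_part_first: bool        # the verb/auxiliary may precede the non-verbal part
    negation_on_verbal_part: bool  # negation attaches to the verb/auxiliary itself
    prefix_on_verbal_part: bool    # verbal prefixes may attach to the verb/auxiliary


# Values read off the prose description of complex verbs (N + V).
COMPLEX_VERB = Diagnostics(False, False, False, False)

# Values read off the prose description of FDC (adjective + form in -ɜn + auxiliary).
FDC = Diagnostics(True, True, True, True)


def differing(a: Diagnostics, b: Diagnostics) -> list[str]:
    """Return the names of the diagnostics on which the two constructions differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


if __name__ == "__main__":
    print(differing(COMPLEX_VERB, FDC))
```

Under this encoding the two constructions differ on every diagnostic listed, which is the reasoning behind not treating FDC as a complex predicate.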

7 The use of FDC

I have checked the compatibility of FDC with other voice, valence-increasing and modal constructions. Of these, FDC can be used only in the causative construction. The causative construction is formed by the auxiliary kɜnǝn ‘to do’ used together with the infinitive of the lexical verb; Causee is marked by the nominative / genitive⁷ (if the verb is intransitive) or by the dative (if the verb is transitive). For example:

(23)
mad jɜ čǝžg-ɜn š-kɜn-ǝn kod-t-a nog k’aba
mother POSS.3SG daughter-DAT PREF-do-INF do-TR-PST.3SG new dress
‘Mother forced her daughter to put on a new dress’ (Bagaev 1965: 340).

⁷ Nominative can be used only with inanimate Causee.

Compare the following example of FDC used in the causative construction:

(24)
zonəγ jɜ mid-bənat-ə a-šald ɜmɜ,
sledge POSS.3SG inside-place-INESS PREF-freeze.PST.3SG and
ɜvɜccɜgɜn, wəsə ɜnɜ-bon song-ɜn žən fe-nk’wəš-ən-gɜn-ɜn wəd
probably that without-strength hand-DAT difficult PREF-move-INF-do-NMLZ be.PST.3SG
‘The sledge got frozen to the earth and apparently for his/her weak hands it was difficult to move it’ (ONC. Beštaev G. G. Proizvedenija. 3 vols. Vladikavkaz, 2004, p. 449).

FDC can be used both in assertive and interrogative sentences. Examples of FDC used in assertive sentences have been given above. The following examples show its use in interrogative sentences. It is worth mentioning that it is only the verbal form in -ɜn that cannot function as the question focus (cf. possible translations of example [25]).

(25)

wədon žən nəv kɜn-ɜn štə?
they difficult paint do-NMLZ be.PRS.3PL
‘Is it difficult to paint them?’ (Is it difficult to paint THEM or somebody else? Is it DIFFICULT or easy to paint them? *Is it difficult to PAINT them or to find them?).

(26)

wəmɜn wədon žən š-nəv kɜn-ɜn štə?
that.DAT they difficult PREF-paint do-NMLZ be.PRS.3PL
‘Is it difficult for him to paint them?’

The construction can be used in contrastive sentences, for example:

(27)
asə lɜppu žən nəv kɜn-ɜn u wəsə lɜppu ta nɜ-w
this boy difficult paint do-NMLZ be.PRS.3SG that boy CONTR NEG-be.PRS.3SG
‘It is easy to paint this boy, but not that one’.

(28)
ɜnɜ dɜw mən žən sɜr-ɜn u, demɜ ta ɜnson
without you.SG.GEN 1SG.ENCL.DAT difficult live-NMLZ be.PRS.3SG you.SG.COM CONTR easy
‘It is difficult for me to live without you, but with you, it is easy’.

In FDC, any of the constituents can be omitted, i.e., Patient (example [10]), Agent (example [27]) or Sole participant (example [13]), the auxiliary (example [28]) and the verbal form in -ɜn (examples [27] and [28]). The construction can be used both in main and in subordinate clauses; cf. the following example, in which FDC is used in a purpose clause:

(29)
sɜmɜj wəj ɜnson-dɜr ra-mbul-ɜn wa wəj təχχɜj qɜw-ə təng birɜ arχaj-ən
for that easy-COMPAR PREF-win-NMLZ be.CONJ.3SG that POST need-PRS.3SG very many work-INF
‘One should train a lot to best him easily’.

8 Semantics

Semantically, FDC is used to convey the meaning ‘easy or difficult to accomplish’. Sometimes FDC can have a non-epistemic possibility meaning; cf. example (18). When formed from a transitive predicate, FDC conveys the properties of the Patient-like participant that cause difficulties for the accomplishment of the situation (difficilitive meaning) or make the accomplishment of the situation easy or possible (facilitive meaning). When formed from an intransitive predicate, the construction expresses a situation that is easy or difficult to accomplish for the Sole participant of the situation. An interesting point is that FDC’s Agent-like participant can be either animate or inanimate; cf. the example in which it is inanimate:

(30)
ɜmɜ ištə nisɜjag ɜfχɜr-ɜn nəχaš-ɜn dɜr ɜnson fe-žnɜt-gɜn-ɜn kwə wa
and something worthless insult-NMLZ word-DAT FOC easy PREF-fury-do-NMLZ if be.CONJ.3SG
‘(He was afraid that) some worthless, insulting words will easily throw him into a rage’ (lit. ‘he will be easily thrown into a rage by worthless, insulting words’) (ONC. Gusalov B. M. I vozdastsja každomu. Vladikavkaz, 2003, p. 102).

FDC can express both agentive and non-agentive situations; cf. the following examples with the non-agentive verbs roχ kɜnən ‘to forget’ and ulɜfən ‘to breathe’:

(31)
šəγdɜg žɜrdɜ-jə ɜnk’ar-ɜn-tɜ aftɜ ɜnson roχ-gɜn-ɜn ne štə
pure heart-GEN feel-NMLZ-PL.NOM so easy forgotten-do-NMLZ NEG be.PRS.3PL
‘The feelings arisen from a pure heart are not so easy to forget’ (ONC. Max dug, no. 9, 2003, p. 79).

(32)
ra-jqal dɜn, žən ulɜf-ɜn mən kɜj wəd, wəmɜ gɜšgɜ
PREF-awake be.PRS.1SG difficult breath-NMLZ 1SG.ENCL.DAT that be.PST.3SG that.ALL POST
‘I woke up because it was difficult to breathe’ (ONC. Max dug, no. 4, 2001, p. 131).

9 Origin

There is no diachronic evidence of FDC’s existence before the nineteenth century, as Ossetic was an unwritten language until the middle of the nineteenth century⁸ and there is very little data about Alanian (the ancestor of modern Ossetic). However, the origin of FDC is apparently connected to the construction of participant-external possibility.⁹ Both constructions consist of the verbal derivate in -ɜn and the auxiliary. The Agent-like participant is marked by the dative in both constructions. Also note that the verbal derivate in -ɜn is used only in these two modal constructions; cf. the following examples of the participant-external possibility construction formed from an intransitive (example [33]) and a transitive (example [34]) verb.

(33)
Baratašvili-jə qoməšdžən kurdiat-ɜn passivon romantizm-ə
Baratašvili-GEN powerful talent-DAT passive romanticism-GEN
k’wəndɜg fɜlgɜt-t-ə ba-sɜw-ɜn nɜ wəd
narrow frame-PL-INESS PREF-go-NMLZ NEG be.PST.3SG
‘Powerful talent of Baratašvili could not fit in the narrow frames of romanticism’ (ONC. Beštaev G. G. Proizvedenija. 3 vols. Vladikavkaz, 2004, p. 226).

⁸ The first book in Ossetic was published in 1798 using the Church Slavonic alphabet (translation of the Catechism). Since then there were several unsuccessful attempts to develop an Ossetic alphabet based on the Georgian alphabet (Khutsuri script). Only in the middle of the nineteenth century was the Ossetic alphabet based on Cyrillic letters developed.
⁹ Participant-external possibility is understood as one of the main meanings of non-epistemic possibility, which “refers to circumstances that are external to the participant, if any, engaged in the state of affairs” and that make this state of affairs possible (e.g., To get to the station, you can take bus 66) (van der Auwera and Plungian 1998: 80).

(34)
adɜm-ɜn adɜm mar-ɜn nɜ-j
man-DAT man kill-NMLZ NEG-EXT
‘A human being is not allowed to kill a human being’.

Semantically, the two constructions are also close to each other: FDC can convey non-epistemic possibility; cf. example (18). However, the constructions differ greatly in their morpho-syntax. Unlike FDC, where the auxiliary agrees with a Patient-like participant, the auxiliary in the participant-external possibility construction is always in the third person singular (in the present indicative the existential copula i [iš] is used). One can hardly imagine a direct evolution of FDC from the participant-external possibility construction. The semantic map of modality (van der Auwera and Plungian 1998) does not offer such a grammaticalization path either. It is not clear why FDC uses only two adjectives and does not allow other VALUE adjectives (e.g., ‘pleasant’, ‘long’ (time), ‘terrible’, etc.). There are two logically possible scenarios for FDC’s evolution. Either FDC is an old construction that used to be compatible with an open list of VALUE adjectives, and the list later shrank to only the two adjectives; or FDC appeared in Ossetic recently on the basis of the two adjectives, and the use of other VALUE adjectives may become possible later. Due to the lack of data on the Ossetic language before the nineteenth century, it is impossible to prove or disprove either of these scenarios.

10 Facilitive/difficilitive in other Caucasian and Iranian languages

According to available grammatical studies, besides Ossetic, no other Iranian languages are reported to have a dedicated marker or a dedicated construction to convey facilitive or difficilitive semantics. Among the Caucasian languages geographically close to Ossetic, only Adyghe (North West Caucasian) grammaticalizes the meaning ‘easy / difficult to accomplish’. In Adyghe, the suffix -ʁweṣ̂wǝ conveys facilitive meaning and the suffix -ʁwaje expresses difficilitive semantics (Rogava and Keraševa 1966: 297–298); cf.:

(35)
ar bwew ṣ̂e-ʁweṣ̂ǝ aš ɣe-pλe-ž’ǝ-ʁwaj-ep
3SG.ABS very do-FCL 3SG.ERG LOC-look.at-REFACTIVE-DFC-NEG
‘This is easy to do, this is not difficult to look at again’ (Rogava and Keraševa 1966: 298).

Adyghe facilitive and difficilitive suffixes originate in the combinations of the roots ʁwe ‘time’ + ṣ̂wǝ ‘good’ (facilitive) and ʁwe ‘time’ + je ‘evil’ (difficilitive) (Rogava and Keraševa 1966: 297). Note that in Kabardian, which is a close relative of Adyghe and which, unlike Adyghe, has been in close contact with Ossetic for a long time, facilitive and difficilitive suffixes are not attested (e.g., Abitov and Balkarov 1957). It is therefore reasonable to assume that Ossetic was not under areal influence in this respect and developed FDC independently. A separate study is needed to find out where else, beyond the Caucasian region, the grammaticalization of the meaning ‘easy / difficult to accomplish’ is possible. However, it seems that Ossetic FDC is a typologically rare phenomenon.

11 Conclusion

In this article, I have examined the peculiarities of the dedicated construction (facilitive-difficilitive construction or FDC) conveying the meanings ‘difficult to accomplish’ and ‘easy to accomplish’ in Ossetic. This construction is not described in standard Ossetic grammars. FDC resembles the Tough Construction, which is attested in well-studied European languages; cf. English John is tough (for Mary) to please; girls are tough to please (among many others see, e.g., Comrie and Matthews 1990; among recent studies see, for example, Hicks 2009), where the auxiliary agrees with a Patient-like participant and an Agent-like participant is marked by the preposition that is usually used for Beneficiary. However, unlike FDC, the Tough Construction is not restricted to adjectives with the meanings ‘easy’ and ‘difficult’ and allows the use of almost all VALUE adjectives, cf. This room is pleasant to sleep in. The narrow semantics of FDC determines its specificity. However, further research is needed to find out where else the facilitive and difficilitive meanings can be grammaticalized into a dedicated marker or construction.

12 Abbreviations

ABL – ablative; ABS – absolutive; ALL – allative; CIT – citative; COM – comitative; COMPAR – comparative; CONJ – conjunctive; CONTR – contrastive; DAT – dative; DFC – difficilitive; ENCL – enclitic; EQU – equative; ERG – ergative; EXT – existential; FCL – facilitive; FDC – facilitive-difficilitive construction; FOC – focus; FUT – future; GEN – genitive; HAB – habitual; IMP – imperative; INESS – inessive; INF – infinitive; ITER – iterative; LOC – locative; NEG – negation; NMLZ – nominalization; NOM – nominative; NP – noun phrase; ONC – Ossetic National Corpus; PL – plural; POSS – possessive; POST – postposition; PREF – prefix; PRS – present; PRTCL – particle; PST – past; RFL – reflexive; SG – singular; TR – transitive.

References

Abaev, Vasilij I. 1958. Istoriko-etimologičeskij slovar’ osetinskogo jazyka. T. 1. Moscow and Leningrad: Izdatel’stvo akademii nauk SSSR. (Etymological dictionary of the Ossetic language. Vol. 1. In Russian.)
Abaev, Vasilij I. 1959. Grammatičeskij očerk osetinskogo jazyka. Ordžonikidze: Severo-Osetinskoje knižnoje izdatel’stvo. (A grammatical sketch of Ossetic. In Russian.)
Abaev, Vasilij I. 1964. A grammatical sketch of Ossetic. The Hague: Mouton.
Abaev, Vasilij I. 1989. Istoriko-etimologičeskij slovar’ osetinskogo jazyka. T. 4. Leningrad: Nauka. (Etymological dictionary of the Ossetic language. Vol. 4. In Russian.)
Abitov, Muhob L. & B. H. Balkarov. 1957. Grammatika kabardino-čerkesskogo literaturnogo jazyka. Moscow: Izdatel’stvo akademii nauk SSSR. (A grammar of the literary Kabardian-Cherkess language. In Russian.)
Ahvlediani, Georgij S. (red.). 1969. Grammatika osetinskogo jazyka. T. 2. Sintaksis. Ordžonikidze: Naučno-issledovatel’skij institut pri sovete ministrov Severo-Osetinskoj ASSR. (A grammar of the Ossetic language. Vol. 2. Syntax. In Russian.)
Bagaev, Nikolaj K. 1965. Sovremennyj osetinskij jazyk. Čast’ 1 (fonetika i morfologija). Ordžonikidze: Severo-Osetinskoje knižnoje izdatel’stvo. (Modern Ossetic Language. Part 1. In Russian.)
Comrie, Bernard & Stephen Matthews. 1990. Prolegomena to a typology of Tough Movement. In William Croft, Keith Denning & Suzanne Kemmer (eds.), Studies in typology and diachrony, 44–58. Papers presented to Joseph H. Greenberg on his seventy-fifth birthday. Amsterdam and Philadelphia: John Benjamins.
Dzahova, Veronika T. 2009. Fonetičeskie harakteristiki fonologičeskoj sistemy sovremennogo osetinskogo (ironskogo) literaturnogo jazyka. Monografija. Vladikavkaz: Izdatel’stvo SOGPI. (Phonetic description of phonological system of modern standard [Iron] Ossetic. In Russian.)
Grashchenkov, Pavel V. 2010. Složnye predikaty v osetinskom jazyke. Bjulleten’ Obščestva vostokovedov RAN. Vyp. 17 [Special issue: Trudy mežinstitutskoj naučnoj konferencii “Vostokovednye čtenija 2008”: Moskva, 8-10 oktjabrja 2008 g.] Moscow: Učreždenie Rossijskoj akademii nauk “Institut vostokovedenija RAN”, pp. 115–130. (Complex predicates in Ossetian. In Russian.)
Hicks, Glyn. 2009. Tough constructions and their derivation. Linguistic Inquiry 40 (4). 535–566.
Kibrik, Aleksandr E. 1997. Beyond subject and object: Toward a comprehensive relational typology. Linguistic Typology 1 (3). 279‒346.
Rogava, Georgi V. & Zejnab I. Keraševa. 1966. Grammatika adygejskogo jazyka. Krasnodar and Majkop: Knižnoe izdatel’stvo Krasnodar. (A grammar of the Adyghe language. In Russian.)
Takazov, Harum A. 1992. Kategorija glagola v sovremennom osetinskom jazyke. Moscow: Institute of Linguistics of the Russian Academy of Sciences, doctoral dissertation. (Verb in the Ossetic language. Unpublished doctoral dissertation. In Russian.)
Tehov, Fidar D. 1970. Vyraženije modal’nosti v osetinskom jazyke. Tbilisi: Mecniereba. (Modality in the Ossetic language. In Russian.)
Thordarson, Fridrik. 2009. Ossetic grammatical studies. (Sitzungsberichte der philosophisch-historischen Klasse, Bd. 788, Veröffentlichungen zur Iranistik. Nr. 48). Wien: Verlag der Österreichischen Akademie der Wissenschaften.
van der Auwera, Johan & Vladimir Plungian. 1998. Modality’s semantic map. Linguistic Typology 2: 79–124.
Vydrin, Arsenij P. 2011. Sistema modal’nosti osetinskogo jazyka v sopostavitel’nom osveščenii. Saint Petersburg: Institute for Linguistic Studies of the Russian Academy of Sciences, unpublished dissertation. http://www.ossetic-studies.org/vydrin/index.php/publications/. (Modal system of Ossetic. In Russian.)
Vydrin, Arsenij P. 2014. Glagol v osetinskom jazyke. Vostokovedenie. Istoriko-filologičeskie issledovanija. Mežvuzovskij sbornik statej. Vyp. 30 (zaklučitel’nyj). Pamjati akad. M. N. Bogoljubova. Saint Petersburg, pp. 25–80. (Ossetic verb. In Russian.)

Z. A. Yusupova

15 Possessive construction in Kurdish

Abstract: The article contains an analysis of the possessive construction in Kurdish. This construction, employed with structural and semantic variations in all Kurdish dialects, is most widespread in southern Kurdish dialects, where it is formed with enclitic personal pronouns. The northern dialects, which do not have enclitic personal pronouns, use the oblique case forms of personal and sometimes reflexive pronouns for the construction. For the study of the southern Kurdish dialects, I use the divans (poetic anthologies) of eighteenth- and nineteenth-century Kurdish poets published in Iraqi Kurdistan, while the study of the northern dialects is based on the folklore texts recorded by O. Mann and I. Zuckermann. The standard romanized transcription of modern Kurdish linguistics is used.

Keywords: Kurdish studies, Kurdish linguistics, Kurdish dialects, Iranian languages

Table 1: Symbols used in this article and corresponding IPA symbols

Symbol used in this article    IPA symbol
C, c                           [ʤ]
Ç, ç                           [ʧ]
E, e                           [æ]
Ê, ê                           [e:]
ẍ                              [ɣ]
Î, î                           [i:]
J, j                           [ʒ]
L, l                           [l]
Ī                              Velar [l]
R, r                           [ɾ]
R̅, ȓ                           [r] (rolled front [r])
Ş, ş                           [ʃ]
Û, û                           [u:]
Y, y                           [j]
‘                              [ʔ]
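Table 1 can also be read as a lookup table. The sketch below is an illustration only: the names `SYMBOL_TO_IPA` and `to_ipa`, the decision to pass unlisted letters through unchanged, and the use of [ɫ] to stand for the entry the table calls “velar [l]” are assumptions made here. Only the lowercase forms (plus Ī and ‘) are listed; capital letters and R̅ are not handled.

```python
import unicodedata

# Transcription symbols from Table 1 mapped to the IPA values given there.
# Assumption: "ɫ" stands in for the table's "velar [l]" entry.
_TABLE_1 = {
    "c": "ʤ",
    "ç": "ʧ",
    "e": "æ",
    "ê": "e:",
    "ẍ": "ɣ",
    "î": "i:",
    "j": "ʒ",
    "l": "l",
    "Ī": "ɫ",
    "r": "ɾ",
    "ȓ": "r",
    "ş": "ʃ",
    "û": "u:",
    "y": "j",
    "‘": "ʔ",
}

# Normalise keys so that precomposed and decomposed spellings compare equal.
SYMBOL_TO_IPA = {unicodedata.normalize("NFC", k): v for k, v in _TABLE_1.items()}


def to_ipa(text: str) -> str:
    """Replace each symbol listed in Table 1 by its IPA value; keep the rest."""
    normalized = unicodedata.normalize("NFC", text)
    return "".join(SYMBOL_TO_IPA.get(ch, ch) for ch in normalized)


if __name__ == "__main__":
    print(to_ipa("diĪ"))
```

Letters that Table 1 does not list are assumed to have their usual values and are left as they are.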

Z. A. Yusupova, Institute of Oriental Manuscripts of the Russian Academy of Sciences
DOI 10.1515/9783110455793-016

As in other Iranian languages, the so-called possessive construction is widely used in Kurdish, but its grammatical meaning is still not adequately understood. Kurdological studies have described certain variants of that construction on the basis of separate Kurdish dialects; their authors perceived it either as a possessive construction as such or as a sentence with a thematically emphasized subject.¹ The newly reviewed sources, including texts of literary compositions in both southern Kurdish dialects (Gorani, Avramani, Sorani) and northern ones (Kurmanji, Zaza), show that the possessive construction, which varies across these dialects, may have quite a few meanings. As follows from the reviewed materials, besides expressing the idea of possessing per se, the construction was widely used to characterize the subject of a sentence, its condition, or a feature it might have. This article analyzes possessive constructions denoting the subject’s condition. In southern Kurdish dialects, the possessive construction is formed with enclitic personal pronouns (first, second, and third person singular and plural). In such cases, the subject is frequently expressed twice: with a noun (or a personal pronoun) usually opening the sentence, as well as with an enclitic pronoun, the position of the latter depending on the structure of the sentence. In this case, the verb takes the third person singular form. The available texts allow for distinguishing the following structural and semantic variants of the analyzed construction.

1 The most common version having a copula verb

Model 1, the most frequent one: the subject is expressed twice, once with a noun or a personal pronoun and once with an enclitic pronoun that duplicates it and has the corresponding form. The position of the enclitic may vary.

It may be merged with the object² to express a genitive relation (the most common case):

(1)
DiĪ naĪan-iş-en diĪ naĪan-iş-en (DS, 107)³
[The heart is moaning, the heart is moaning (lit. ‘the heart – its moaning // it has’);]
Nergis to dîde-t nabîna û kor-en (DS, 175)
[(Oh), narcissus, your blind eyes cannot see;]
Ce guĪzar buĪbuĪan axêz-şan-en (DS, 69)
[Nightingales are awakening in the garden;]

¹ See Tzukerman (1965: 161‒165, 1986: 184‒187); Eyubi and Smirnova (1968: 138‒140); Yusupova (1985: 112‒114, 1998: 121‒125).
² Here and below a more precise term would be “object of the state of being”.
³ The parentheses contain the universally accepted abbreviations of the quoted sources and their page numbers.

It may be merged with an indirect object (rarely):

(2)
Dirext û dar ce paîz-şan zwîrî-n (DS, 68)
[(All) forests and trees are offended by the fall;]
Ya ‘EwdaĪ Seîdî hawar le to-ş-en (DS, 62)
[(Oh), mountain Awdal, Saidi appeals to you for help;]

It may be merged with the instrumental-directive preposition pê:

(3)
Min çunke hîcran[î] mecnûnî pê-m-en (DR, 127)
[As I feel sad to be away, (like) Majnun (did). . . ;]
Min ce to fêşter meyl[î] to pê-m-en (XQ, 743)
[I love you more than you (love me).]

There is one case involving the locative preposition la:

(4)
Minîç kemê fam[î] Eyaz la-m-en (DM, 156)
[I have some of Ayaz’s wisdom as well.]

Model 2: the subject is expressed solely with the enclitic pronoun merged with the object or, less frequently, with a preposition:

(5)
Arezûy dîdar[î] dîn[î] to-şan-en (XQ, 774)
[They have the desire to see you;]
Ce dûrî[y] yaran bêqerarî-m-en (DS, 107)
[Due to my being away from my beloved, I am worried;]
Pê-m weş-en gêcaw[î] deryay xem (DM, 360)
[I like to whirl in the maelstrom of sadness.]

Model 3: the enclitic pronoun duplicates the indirect prepositional object expressed by words denoting parts of the subject (soul, body, heart, eyes, etc.). Its position may be the following.

It may be merged with the preposition:

(6)
Dîdey piȓ esrim xaw ce la-ş dûr-en (DS, 62)
[My eyes full of tears are sleepless (lit. ‘sleep is far from them’);]
Xatirim yend xar[î] meynet tê-ş-da . . . (DM, 262)
[My soul has so many thorns of grief . . . ;]

It may precede the preposition:

(7)
Ten zam[î] xedeng[î] mujey to-ş pê-we-n
[My body has wounds caused by the shots of your eyelashes.]

Model 4: the enclitic pronoun duplicates not only the indirect object but also the subject:⁴

(8)
Wer to bînaî-t perde-ş ha ne ser . . . (DM, 72)
[In case (there is) a cloth (of ignorance) over your eyes . . . ]

In this last example, as in the one with the preposition tê (see above), the copula has been omitted.

In the negative construction having a copula verb, the enclitic duplicating the subject may occupy the following positions.

It may be merged with the negative form of the copula:

(9)
Çe ko te bijnewu deng-it nîye-m goş (DS, 38)
[How can I listen to you, (if) I do not hear you (lit. ‘if your voice is away from my ears’);]
Çi ser zemîne nîyen-iş qerar (DS, 132)
[There is no rest for him on this Earth;]

It may be merged with the object or with the word defining it:

(10)
Xeşmit ‘ezîmen kerem bêsaman
Kes tewanay qehr[î] to-ş nîyen (DS, 222)
[Your rage is great, your generosity, limitless,
(But) nobody has the strength to be angry with you;]
Hîç kes ne hîç ca sebûrî-ş nîyen (DR, 67)
[Nobody has patience anywhere.]

It is of interest to compare these cases with those where the subject of a negative construction is solely expressed with an enclitic pronoun:

(11)
Ta ser çenî kes nîyen wefa-şan (DS, 85)
[They are never totally faithful to anybody;]
Ew dîdey mest-iş coyay xew nîyen (DS, 100)
[Her intoxicated eyes do not know sleep (lit. ‘do not seek sleep’).]

⁴ Here and below a more precise term would be “subject of the state of being”.

The texts also contain examples of sentences containing copulas, in which the negation is expressed with the particle ne preceding the object while the copula remains positive:

(12)
Seîdî tasey sext[î] dûrî[y] yar-iş-en
Ne aram, ne sebir, ne qerar-iş-en (DS, 61)
[Saidi suffers terribly (due to) his being away from his beloved one,
He has neither peace, nor patience, nor rest;]
Seîdî ne ȓo xurd, ne şew xaw-iş-en (DS, 101)
[Saidi neither eats during the day, nor sleeps during the night (lit. ‘neither food at daytime, nor sleep at night he has’).]
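The recurring host + enclitic + copula shape of the hyphenated forms in this section can be made explicit with a small segmentation sketch. It is illustrative only: the function `segment`, the output labels, and the enclitic inventory (read off the examples and their translations above, with no claim to completeness) are assumptions introduced here, not an analysis taken from the article.

```python
# Enclitic pronouns as they appear in the cited examples; the person/number
# labels are inferred from the translations and are an assumption of this sketch.
ENCLITICS = {
    "m": "1SG", "im": "1SG",
    "t": "2SG", "it": "2SG",
    "ş": "3SG", "iş": "3SG",
    "şan": "3PL",
}

# Copula shapes seen after a hyphen in the examples (e.g. -en, -n).
COPULAS = {"en", "n"}


def segment(form: str) -> dict:
    """Split a hyphenated form into host, enclitic label and copula (if any)."""
    parts = form.split("-")
    # A final -en / -n is treated as the copula.
    copula = parts.pop() if len(parts) > 1 and parts[-1] in COPULAS else None
    # What now stands last is treated as the enclitic if it is in the inventory.
    enclitic = None
    if len(parts) > 1 and parts[-1] in ENCLITICS:
        enclitic = ENCLITICS[parts.pop()]
    return {"host": "-".join(parts), "enclitic": enclitic, "copula": copula}


if __name__ == "__main__":
    for token in ("naĪan-iş-en", "pê-m-en", "dîde-t", "zwîrî-n"):
        print(token, segment(token))
```

The sketch only looks at hyphen boundaries as printed in the examples; it does not attempt to decide whether the host is an object, an indirect object or a preposition, which is the distinction the models above are based on.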

2 Constructions with various tenses of the verb bî ‘to be’

Model 1: the subject is duplicated by an enclitic pronoun in the following positions.

It may be merged with the object:

(13)
Toîç însaf-it bo henî wese wes . . . (DM, 26)
[You, be just, enough tormenting [me], enough!;]
Ez Uromon mekan-im bê, wuĪatim (DS, 13)
[Avraman was my refuge, my motherland;]
Eger to pîr û pîr-ê-t bo tikakar . . . (DS, 54)
[If you are a pir, and you [yourself] require a pir . . . ;]

It may be merged with the verb:

(14)
To bo-t ne diĪ hof[î] xudat (DS, 97)
[May in your heart there be fear (in front) of your God;]
Dirêẍ min bîa-m dû hezar dîde . . . (DS, 55)
[If I only had two thousand eyes!;]
Cew duma her wext Qeys bê-ş arezû . . . (DS, 95)
[After that, every time when Qays developed a desire . . . ]

It may be merged with the preposition:

(15)
Er fam pê-ş bîa diĪey mecnûn xo
Biryaş temay laĪ[î] şîrîn ȓaz[î] to (DS, 63)
[If my mad heart were conscientious,
It would reject the clandestine desire of your sweet lips.]

Model 2: the subject is expressed solely through an enclitic pronoun merged with the object:

(16)
Qiblem, ta seher guft-u-go-şan bê,
Daway hamşanî zuĪfî to-şan bê (DS, 34)
[Darling, till the (very) morning they chatted,
(they) Tried to compete with the smell of your locks;]
BeĪkim xeĪasî-m ne des dûrî-m bo (DM, 45)
[Maybe, (then) I will be saved from separation.]

The negative construction with the verb bî in the preterite can appear in two shapes.

With the verb having a negative form:

(17)
Axir dawa-şan tan û po-ş nebê,
Hîç kam boy ‘etrî zuĪfî to-ş nebê (DS, 35)
[Finally, (they understood) that their dispute was senseless,
(As) none (of them) had the smell of your locks;]
Seîdî te hîç-it çar nebê, ta ser çenît kes yar nebê (DS, 25)
[Saidi, you had no other way out, (as) nobody remained your friend till the end;]

With the verb having a positive form, while the negation is expressed with the particle ne ‘neither-nor’:

(18)
Ne be şew werar, ne be ȓo xaw-it bê (DS, 86)
[You had neither peace at night, nor sleep during daytime.]

3 Constructions having the verb hen ‘to have’

Model 1: the subject is denoted twice, with a noun (or a pronoun) and an enclitic pronoun:

(19)
Aşiqî be diĪ denaĪê meylî giryan-î heye (DN, 584)
[The loving one whose heart is moaning has a tendency to weep;]
Ez yarêwe cone-m hene (DS, 26)
[I have a beautiful sweetheart.]

Model 2: the subject is denoted solely with an enclitic pronoun:

(20)
Hey hay, cuanim, firêm hen mirad (DM, 4)
[Alas, my beauty, I have a lot of wishes!;]
‘Adetêk-î heye, hergîz le kesê napirsê (DN, 592)
[She has a rule. (She) is never curious about anybody.]

The negative construction having the verb hen, like the one with bî, is formed with nîye(n) or with the negative particle ne whenever the verb remains in the positive form:

(21)
Ax le geĪ ême, Hebîbe ser û peywend-î nîye (DN, 592)
[Woe to us, Habibe wants nothing to do with us!;]
Hêz[î] pay ȓeft-û-ama-şan nîyen (DM, 29)
[They have no (more) strength to move.]

Another construction has been noticed in the texts, in which the negative particle is accompanied by the particle niho, also expressing negation:

(22)
Ne fikr-iş hen ce law muĪkî, ne maĪî,
Niho bak-iş ce ferzend û ‘eyaĪa (DS, 151)
[He has thoughts neither about his property nor his house,
He has no fear about his kids or his wife.]

Thus, the primary purpose of the reviewed versions of the possessive construction in southern Kurdish dialects appears to be reporting features of the subject: indicating its physical or psychological condition, or its temporary or permanent characteristics. With regard to its structure, this construction follows the possessive model proper (“I a son I have”), also having two basic versions: (1) the subject can be expressed twice, once with a noun (or a pronoun) in the direct case (nominative), and once with a personal enclitic pronoun duplicating it; (2) the subject can be expressed solely with an enclitic pronoun. In other words, the main elements of the possessive construction are the enclitic pronouns, which form an indirect construction grammatically opposed to its direct counterpart; cf.: Êmeyç naĪe-man xo bêkar nîyen (DS, 157) [However, our / of ourselves moaning also brings about a result] versus Axir nek êmeyç dax ne diĪ-ênmê (DM, 410) [However, our hearts are sad as well (lit. ‘we are with sadness in our hearts’)]. In the first case, with the subject expressed with a personal pronoun of the first person plural, the verbal copula has the form of the third person singular, i.e., it corresponds with the object of the state of being, while in the second case, the copula reflects both the person and the number of the (grammatical) subject.

Northern Kurdish dialects have lost their enclitic pronouns,⁵ and therefore similar constructions use indirect forms of personal pronouns in order to denote the subject of the state of being:

(23)

Te dil bê xwab u bê xurd-e (MC, 335)
[Your heart (is) without sleep or food;]
Min dilberek wek duȓ heye (MC, 94)
[I have a sweetheart (who is) like a pearl . . . ;]
Me ji zulfên botan ȓeştir-e bext (MC, 320)
[My happiness is darker than the locks of Botan’s (beauties).]

Compare this also with the negative construction having the verb “to have”:

(24)
Bi xudê qet te di dilda ji xudê tirs nehin (MC, 535)
[By God, there is no fear of the Lord in your heart.]

Along with that, in Kurmanji, mostly in folklore texts, there is a construction where the subject of the state of being/possession expressed with a noun (or a pronoun) is duplicated with an indirect form of a personal pronoun or a reflexive-possessive pronoun xe/xwe linked with the object via an izafat:

(25)
Qewaz lê nihêrî, wekî Zuhre – halêwê t’une (Tz I, 163)
[The messenger sees that Zuhre is unwell (– she is in no mood);]
E’slan-padşa – wextekê c’yakî wî hebûye (Tz I, 163)
[Once upon a time, King Aslan had a mountain;]
Mervek – kurekî wî hebûye
[A man had a son.]

The construction having a reflexive pronoun duplicating the subject is typical for Khorasan Kurmanji:

(26)
Wan xorak’î xe jî tonnewun (Tz II, 185)
[They had no food either;]
Min sed sal omri xe heye (Tz II, 185)
[I am a hundred years old (lit. ‘a hundred years of my own life there are’).]

⁵ With the exception of the pronominal enclitic -ê corresponding to the indirect form of a personal pronoun third person singular. In southern dialects, this enclitic has lost its semantics.

Some occasions when reflexive pronouns duplicated the subject of the possessive construction have been found in the texts by O. Mann (1932) dealing with a Zaza subdialect (the Kor area): jû cinêy jû dostî xu bî (OM, 337) ‘A woman had a friend’; jû kenay xu bî (OM, 342) ‘He had a daughter’. As we see, in the second example, the subject is not lexically expressed. The possessive construction with enclitic pronouns acting as subjects has been noticed in the texts not solely with the verbs denoting being or having. It has also been attested with certain verbs denoting action, most of them intransitive (including compound denominative ones, also expressing the condition of the subject).

4 The constructions with intransitive verbs

Perfect tense (most commonly): (27)

To bê wefayî-t ce hed berşîyen (DM, 337) [Your inconstancy has crossed all limits;] Wat ce dîdey bed min pê-m neyawan (DS, 126) [He said, “(It) did not happen to me due to an evil eye”;] To çêş pê-t aman, hamderdî saĪan? (DW, 29) [What has happened to you (lit. ‘has reached you’), old friend?]

Simple past tense (rarely): (28)

Min diĪ-im we tîr[î] to êşa, ya ew? (DM, 405) [Was (it) my heart which was hurt with your arrow, or his?] Min ce dax[î] to bey tewr pê-m ama (DS, 80) [This is what happened to me due to my missing you;] Şîrîn cey guftar derûn-iş coşa (XQ, 322) [These words made Shirin’s soul boil;]

Present-future tense, indicative mood: (29)

Ta şem‘î ȓûy to nûr mewaro lê-ş Min kogay ‘umr-im aîr meşo lê-ş (DS, 63) [As long as the candle of your face radiates light, Flame will emanate from my soul (lit. ‘the source of my life’);] Pêşanî-ş piȓşing[î] nûr cê-ş mixêzo (XQ, 44) [Her forehead radiates bright light;] Mebo pok pî çilê min zêĪ u ceste-m (DS, 54) [With the help of this bough my heart and soul will get cleansed;]


Z. A. Yusupova

Present-future tense, subjunctive mood: (30)

Êmeyç ba sate bezmê-man cem bo (DM, 185) [Let us have some merry time;] To wexten we sî biyawo saĪ-it (DR, 115) [Soon you must be thirty years old.]

There are cases with the subject expressed solely with an enclitic pronoun: (31)

Mêşe-m derûn, kize gîon (DS, 25) [(Now) my soul is in pain, (and) my heart is moaning;] Hoş-iş pê nemend (XQ, 619) [He lost consciousness (lit. ‘he had no consciousness left’);] R̅ îşe-m berama (DM, 27) [“I have started growing a beard”.]

This possessive construction with verbs of action has also been found in the “Divan” by Malae Jeziri: (32)

Min hûn bi co têtin ji dil (MC, 266) [Blood is flowing from my heart like a river;] Me dil mabû di xeyalêda (MC, 544) [My heart stayed dwelling in dreams;] Ji agir min nefes hilbû (MC, 554) [The flame (of love) strangled my breath.]

As one can see, in all the listed passages, the subject is expressed with an indirect personal pronoun of first person singular. However, folklore texts may contain such possessive structure with state verbs, including not solely the subject of status, but also an indirect personal pronoun duplicating it: (33)

Siabend û Xecê ‘eşq-mih’beta wan ze‘f çû bal hev (Tz I, 161) [Siaband and Khadje fell in love with each other;] Zînê – şew-ro xurê wê bû girî û hêsr (Tz I, 161) [Zine did nothing but cried day and night;] Koçer – şev deng p-ê k’et, bar kirin (Tz I, 162) [At night, a rumor spread among the nomads, and they left.]

This last passage is peculiar, as the indirect object duplicating the subject is expressed with the enclitic pronoun -ê.


The language of the texts in Gorani dialect has yet another version of the possessive construction; there, it may have a transitive verb in the perfect form of third person singular with the subject expressed with a first person singular personal pronoun duplicated with an enclitic pronoun in the same form. This rare construction is meant to express the physical condition of the subject caused by another condition it had in the past: (34)

Min dûrîy azîz sakin-im senden (DC, 159) [Being away from my beloved one deprived me of peace;] Min derd[î] dûrî tewana-m berden (DM, 317) [The pain of separation deprived me of strength;] Min ye ‘eşq[î] to quwet lê-m senden (XQ, 387) [It was my love to you which deprived me of strength;] Min ye ‘eşq[î] to zebûn-im kerden (XQ, 243) [It was my love to you which made me unhappy;] BeĪam Me‘dûm heris[î] dîde-ş coş werden (DM, 56) [However, tears boiled in Ma’dum’s eyes.]

The following example, attested in Maulawi’s “Divan”, looks grammatically indicative, which proves the structural similarity between the possessive and passive constructions: (35)

Min dan[î] zîndegî-m kenden (DM, 16) [The tooth of my life has been pulled out (at me/my life tooth is pulled out)]

At the same time, the present construction is formally identical with the one having an active transitive verb in the past tense with an enclitic indication of the person. Thus, outside a definite context, this sentence may well mean ‘I have pulled out the tooth of my life’. Despite that structural similarity, these constructions were used to express different relations between the subject and the object. In a possessive construction, the subject enclitic acts as an indirect object (= logical subject), and the verb agrees with the object of possession or state of being; in the construction having a transitive verb in the past tense, the function performed by the enclitic is totally different. There, it expresses the person and number of the grammatical subject. In other words, enclitic pronouns are actually morphologized in the second construction, even though in some dialects the morphologization process has not yet been concluded. That circumstance has its proof in such features as the occasional use of enclitic subjects, the possibility to conjugate transitive verbs like intransitive


ones, etc.6 Nevertheless, the available material seems to attest that these constructions share a common origin, which makes it possible to trace the reviewed linguistic facts diachronically.

Sources

DC: Dîwanî Cefayî. Beẍda [Baghdad]: Çapxaney Şefîq, 1980.
DM: Dîwanî Mewlewî. Beẍda [Baghdad]: Çapxaney Elnecah, 1961.
DN: Dîwanî Nalî. Beẍda [Baghdad]: Çapxaney Koȓî Zanyarî Kurd, 1976.
DR: Dîwanî Rencûrî. Beẍda [Baghdad]: Çapxaney Afaq ‘Erebîye, 1983.
DS: Dîwanî Seîdî. Silêmanî: Çapxaney Koȓî Zanyarî Kurd, 1971.
DW: Hewramî Usman. Welî-Dêwane. Beẍda [Baghdad]: Çapxaney Koȓî Zanyarî Kurd, 1976.
MC: Dîwana Melaê Cizîrî. Beẍda [Baghdad]: Çapxaney Koȓî Zanyarî Kurd, 1977.
OM: Mann, O. Mundarten der Zâzâ, hauptsächlich aus Siwerek und Kor. Bearbeitet von Karl Hadank. Berlin: Verlag der Preussischen Academie der Wissenschaften, 1932.
XQ: Xanay Qubadî. Şîrîn û Xusrew. Beẍda [Baghdad]: Çapxaney Korî Zanyarî Kurd, 1975.
Tz I: Tzukerman (1965, 1).
Tz II: Tzukerman (1965, 2).

References

Eyubi, Karim Rahmanovitch & Irayida Anatolyevna Smirnova. 1968. Kurdsky dialect mukri [The Kurdish dialect of Mukri]. Leningrad: Nauka.
Tzukerman, Isaak Iosifovitch. 1965. K characteristike predlozheniya s tematicheskim podlezhashchim [On the characteristic of the sentence with a thematic subject]. Palestinsky sbornik 13 (76). 161‒165.
Tzukerman, Isaak Iosifovitch. 1986. Khorasancky kurmandzhi [Khorasani Kurmanji]. Moscow: Nauka.
Yusupova, Zare Alievna. 1985. Suleymaniysky dialect kurdskogo yazyka [The Suleymani Kurdish dialect]. Moscow: Nauka.
Yusupova, Zare Alievna. 1998. Kurdsky dialect gorani [The Gorani Kurdish dialect]. St. Petersburg: Nauka.

6 See Yusupova (1985: 116‒118).

Carina Jahani

16 To bring the distant near: On deixis in Iranian oral literature

Abstract: The purpose of this article is to study oral narratives in a number of Iranian languages with a particular focus on how the audience is brought inside the framework of the story. The oral narratives selected for this study are traditional folktales and legends in Koroshi Balochi, Sistani Balochi, Vafsi, and Gorani. Deictic devices locate an event and its participants in time and space and cannot be fully interpreted without reference to the context. They also bring coherence to the narrative. A deictic center is a point to which the deictic element is anchored. Deixis can be absolute, i.e., place the deictic center at the location and moment of utterance, but the speaker does not necessarily need to adopt his or her own time and location as the deictic center. It is also possible to detach the deictic center completely from not only the temporal and locational setting of the speech, but also from the real world, and to place it at a time and place that never existed or will exist inside an imaginary story (deictic shift). The four linguistic variants in this study show interesting variation when it comes to deictic shift. It is more common for spatial deixis to be shifted to the story than for tense to be anchored in the story. Koroshi Balochi, Sistani Balochi, and Vafsi present almost total spatial deictic shift, whereas in Gorani the deixis is occasionally moved outside the story. Gorani is the language that has the strongest tense anchoring inside the narrative, with almost exclusive use of the non-past tense. At the other extreme we find Sistani Balochi, which has no tense anchoring in the narrative (only past tense verb forms). Koroshi Balochi uses non-past tense for events in the story line and Vafsi changes between using non-past and past tense.

Keywords: oral literature, deixis, deictic shift, Balochi, Vafsi, Gorani

Carina Jahani, Uppsala University DOI 10.1515/9783110455793-017

1 Introduction

The purpose of this article is to study oral narratives in a number of Iranian languages with a particular focus on how deictic devices bring these stories close to the audience in space and time, or rather perhaps how the audience is brought inside the framework of the story. The oral narratives selected for this study are traditional folktales and legends. Nyberg (2004: 25) stresses the entertaining rather than moralizing function of oral narratives. Segal (1995: 62) finds that “narrative allows us to vicariously experience phenomena that would be too dangerous or costly to experience directly” and points out that its popularity is obvious since it is “emotionally involving, structurally appealing, and educational”. As with soap operas of the TV age, it is important to keep the audience’s attention, and various linguistic and extralinguistic means have been observed in Iranian oral narratives for this very purpose.

In his discussion of genres in Persian literature, Utas (2008: 229) describes the language of both oral and written literature as “normalized, conventionalized and consciously shaped to be remembered” and thereby fundamentally different from spoken language. In this he opposes Ong (1982: 11–14), who discards the concept of oral literature. Ong believes that only written words are of a lasting character, and that “oral tradition has no such residue or deposit”. The language of oral narratives described in this study is thus, according to Utas, to be regarded as a consciously shaped literary language, albeit not a written language, and by no means an ad hoc creation on each occasion of storytelling. It must therefore be assumed that such a language has its own rules, which are rather stable and possible to describe.

This study will investigate how different Iranian languages employ deictic strategies to bring the audience into the story, whether these strategies are employed in all the linguistic varieties under study, and if the strategies are similar or vary considerably from language to language. The focus here will be to investigate deictic pronouns and determiners as well as tense use in four different Iranian varieties, namely Koroshi Balochi, Sistani Balochi, Vafsi, and Gorani.
For all these varieties, there are publications with oral tales in transcription and translation into English. For three of the variants, one or more of the texts is fully glossed. There are also grammatical sketches and word lists available, which makes it possible to analyze the texts even when they are not fully glossed. The narratives selected consist of:
– four tales in Koroshi Balochi with full glossing and English translation published by Nourzaei et al. (2015: 123–209);
– nine tales in Sistani Balochi with full glossing and translation into English published by Barjasteh Delforooz (2010: 286–325, 336–391);
– seven tales in Vafsi with English translation published by Stilo (2004: 26–29, 32–57, 104–123);
– three tales in Gorani, all with English translation and one with full glossing published by Mahmoudveysi et al. (2012: 63–103).


The selected variants are all representatives of northwestern Iranian languages, although a strict division of western Iranian languages into a northwestern and a southwestern group has lately been questioned by Paul (2003: 71) and Korn (2005: 329–330). Koroshi Balochi is spoken by scattered populations throughout southern Iran, “from Hormozgan all the way to Khuzestan, and onto the Iranian plateau. [. . .] Three areas with significant concentrations of Korosh are Bandar Abbas, around Shiraz, and across the southern part of Fars Province” (Nourzaei et al. 2015: 21). Although Koroshi shares many features with Southern Balochi,1 it must be regarded as a distinct subgroup of Balochi with its own dialect division into Northern Koroshi and Southern Koroshi (Nourzaei et al. 2015: 25). The fieldwork for the monograph in which the tales were published was carried out between 2009 and 2014 by Maryam Nourzaei in and around Shiraz in northwestern Fars Province, Iran, and represents northern Koroshi (Nourzaei et al. 2015: 17–18). Sistani Balochi, which is a variant of Western Balochi, is spoken in Iranian Sistan as well as in adjacent parts of Afghanistan and in scattered pockets throughout northeastern Iran. It is also very similar to the Balochi dialect spoken in Turkmenistan. The fieldwork for the monograph in which the tales were published was carried out between 2000 and 2005 by Barjasteh Delforooz and comprised speakers from both the Iranian and the Afghan part of Sistan (Barjasteh Delforooz 2010: 26). Vafsi is, according to Stilo (2004: 1), “spoken in four villages in west central Iran: Vafs, Chehreqān, Gurchān, and Fark”. These villages are situated southwest of Tehran, between Saveh and Hamedan in Markazi Province, Iran. The folktales published by Stilo were collected by L. P. Elwell-Sutton in the village of Gurchān in 1958. Stilo himself conducted extensive fieldwork on Vafsi in the 1960s and 1970s. 
This volume was prepared between 1997 and 2000 (Stilo 2004: vii, 5–10).

Gorani, as used by Mahmoudveysi et al. (2012: 2–4), is a general term for a number of vernaculars spoken in pockets in Kermanshah Province, Iran (an area dominated by Southern Kurdish dialects), as well as in adjacent parts of Iraq. The dialect under study in this monograph, Gawraǰūyī, is one of several Gorani variants. Fieldwork for the present monograph was carried out in 2007 and 2008 by Mahmoudveysi in Gawraǰū, “a cluster of four hamlets in the Zimkān river valley” in Kermanshah Province, Iran (Mahmoudveysi et al. 2012: 1).

All the variants under study have a basic split between past and non-past (present-future) tense. They are all pro-drop and have agreement marking by means of person-marking suffixes on the verb in the non-ergative domain (intransitive verbs and transitive verbs in the non-past tense). Vafsi and Sistani Balochi exhibit a type of differential object marking (DOM) that is common in Iranian languages (Haig 2008: 157–158; Stilo 2004: 232; Barjasteh Delforooz 2010: 286–391), but Koroshi Balochi shows interesting divergences from the normal DOM system in Iranian languages (Nourzaei et al. 2015: 35–36). In the variant of Gorani described by Mahmoudveysi et al. (2012) there is no case marking. In Vafsi, the subject of intransitive verbs (S), the agent of transitive verbs (A) in the non-past system, and the patient (P) of transitive verbs in the past system take the direct case, whereas the A in the past system and the P in the non-past system take the oblique case. We thus have a split ergative construction in Vafsi, in line with the common tense-split ergativity found in Iranian languages (Stilo 2004: 232). In Koroshi Balochi, alignment is normally non-ergative, but the enclitic pronouns remain as agreement markers in the past tense of transitive verbs, in contrast with person-marking suffixes found on all intransitive verbs and on transitive verbs in the non-past system (Nourzaei et al. 2015: 83). The same system of two different sets of agreement markers is found in Gorani of Gawraǰū (Mahmoudveysi et al. 2012: 27–28). Sistani Balochi also has non-ergative alignment, and all verbs mainly have the same person-marking suffixes in both the non-past and the past tense (Barjasteh Delforooz 2010: 286–391).

1 For dialect divisions in Balochi, see Jahani and Korn (2009: 636–638).

2 Deixis

Deictic devices locate an event and its participants in time and space and cannot be fully interpreted without reference to the context. They also bring coherence to the narrative. Anderson and Keenan (1985: 259) define deictic expressions as “those linguistic elements whose interpretation in simple sentences makes essential reference to properties of the extralinguistic context of the utterance in which they occur”. They (Anderson and Keenan 1985: 259) recognize only three major categories of deixis: person deixis, spatial deixis, and temporal deixis. However, Fillmore (1997: 61) adds social deixis (“the social relationships on the part of the participants in the conversation”), and crucially for this article, discourse deixis, which deals with “the choice of lexical or grammatical elements which indicate or otherwise refer to some portion or aspect of the ongoing discourse” (Fillmore 1997: 103).

A deictic center is a point to which the deictic element is anchored. Deixis can be absolute, i.e., place the deictic center at the location and moment of utterance. Lyons (1977: 578–579) finds that the speaker does not necessarily


need to adopt his or her own time and location as the deictic center, as in e.g., “I am going abroad next week”. It is alternatively possible to adopt the spatiotemporal setting of, e.g., an addressee, as in “look left”, to be understood as left of the addressee. Lyons calls this phenomenon deictic projection.2 It is also possible for the deictic center not to be grounded in the actual speech situation at all, which in the case of oral literature would be when the story is actually told. In this case, deictic projection is realized as a movement of the deictic center “from the speaker to an imaginary observer in the story world” (Diessel 1999: 95). It is thus possible to detach the deictic center completely from not only the temporal and locational setting of the speech, but also from the real world, and to place it at a time and place that never existed or will exist, inside an imaginary story. Segal (1995: 14–15) calls this “deictic shift” and finds that:

this act of imagination was commented on over 2,000 years ago by Aristotle in his Poetics. He pointed out that poetry (tragedy, comedy, epic) was a mimetic art; its primary mode was to represent actions. The Greek word, mimesis, refers to imitation, or representation, or experience of that which is not literally present. [. . .] The deictic shift approach is consistent with phenomenological experience. When reading fictional text, most readers feel they are in the middle of the story, and they eagerly or hesitantly wait to see what will happen next. Readers get inside of stories and vicariously experience them. They feel happy when good things occur, worry when characters are in danger, feel sad, and may even cry, when misfortune strikes. While in the middle of a story, they are likely to use past tense verbs for events that have already occurred, and future tense for those that have not.

Deictic shift is applicable to written as well as oral narration. In oral narratives it is common for the narrator to begin outside the frame of the story, with the deictic center in the actual speech situation (Zubin and Hewitt 1995: 131), using an introduction such as “Once upon a time there was . . .”. Then the narrator has the option of moving the deictic center into the story by “decoupling the linguistic marking of deixis from the speech situation, and reorienting it to the major characters, the locations, and a fictive present time of the story world itself” (Zubin and Hewitt 1995: 131). In the following two sections, spatial, discourse, and temporal deixis in the selected corpus will be discussed in detail, with a particular emphasis on demonstrating how spatial and temporal deixis is anchored inside the story rather than in the real world.

2 Levinsohn (2015: 144–145) provides a set of questions to determine whether and to what extent a particular language allows deictic projection.


3 Spatial and discourse deixis in the analyzed corpus

In this section, spatial and discourse deixis in the present corpus will be analyzed. All the variants under study have a two-way contrast, proximal versus distal deixis for demonstrative determiners3 and pronouns. For deictic spatial adverbs, Sistani Balochi (Barjasteh Delforooz 2010: 138) and possibly also Koroshi Balochi (see n. 5), exhibit a three-way deixis with the locational adverbs “here, there1, there2”.

It should be noted that the transcription of language data in the sections below follows that of the works in which the stories were originally published. I have, however, removed marking of stress in those texts that had stress marking, since stress marking was found in only two of the corpuses. The translation mainly follows that of the original works, but minor adjustments have sometimes been made, particularly to demonstrate overtly the meaning of the deictic items.

Direct speech has by default its deictic anchoring inside the story and is therefore of less interest in the discussion of deictic shift. Even so, there are interesting conclusions to be drawn from the use of deictic devices in direct speech as well, and they are therefore marked and discussed in the examples in sections 3.1–3.4 below. All forms that contain a deictic element are marked in bold and provided in brackets in the original language.

3.1 Koroshi Balochi

Four Koroshi Balochi stories have been investigated. The findings in two of these will be discussed below. The other two stories show a similar picture. Demonstrative determiners and spatial adverbs found in Koroshi Balochi are given in Table 1 and demonstrative pronouns are given in Table 2 (see also Nourzaei et al. 2015: 49–50).4

3 In fact, the contrast involving determiners is a three-way one between the absence of a determiner (e.g., “lion”) and the presence of a proximal or a distal determiner (e.g., “this lion”, “that lion”).

4 Most of these forms can add an emphatic ham before the actual demonstrative (e.g., hamē ‘this very’, hamā ‘that very’, hamēšī ‘of this very’, hamēšān ‘these very’, hamīdān ‘right here’).


Table 1: Demonstrative determiners and spatial adverbs in Koroshi Balochi

| | Proximal | Distal |
| --- | --- | --- |
| Determiner | ē, ī ‘this’ | ā ‘that’ |
| Adverb | īŋa(r), īŋā, ēdā(n), edā(n), eda, īdān ‘here’; ēdānākō, ēdānakō, īdānākō, īdānakō ‘right here’ | āŋa, ādā, odān, ōdān5 ‘there’; ōdānākō ‘right there’ |

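Table 2: Demonstrative pronouns in Koroshi Balochi

| Deixis | Number | Nominative | Oblique |
| --- | --- | --- | --- |
| Proximal | SG | ī, ē, ēš | ēšī, īšī, ešī |
| Proximal | PL | ēšān, īšān, ešān, šān | Genitive: ēšānī, īšānī, ešānī; Object: ēšānā |
| Distal | SG | ā | āhī, āī, āšī |
| Distal | PL | āšān | Genitive: āšānī; Object: āšānā |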

I begin with a detailed investigation of spatial deixis in the story Goli and Ahmad, told by an experienced storyteller, Alamdar Samsanian, and published in Nourzaei et al. (2015: 130–146). In this story, the first four discourse units (see Nourzaei et al. 2015: 20) have the character of an introduction with the deictic center outside the story:

Well, once upon a time . . . Well, a woman, there was a woman, her name was Goli. [She]6 was actually very bad. [She] was actually very bad. [She] was giving her husband a hard time, you know.

5 There is not enough data to determine if odān/ōdān and ādā show two different degrees of distal deixis in this corpus, as they do in Sistani Balochi (see Barjasteh Delforooz 2010: 138).

6 Personal and demonstrative pronouns, as well as the adverb “there”, that are not found in the original text are given in square brackets. This also applies when there is an enclitic pronoun functioning as agent clitic in the ergative domain. However, non-canonical (dative) subjects expressed by enclitic pronouns are given without brackets (e.g., “to me there is a child”, meaning “I have a child”). In order to make the text easier to read, other words that have been supplied in the English translation are placed in square brackets only if they occur in a deictic expression. No brackets are provided in summarized text sections. Other additions are not marked here.


Then there is a tense change to the non-past tense7 (see also section 3), and the actual story starts. The next five discourse units take place on the very same day.

So, one day her husband says, “Hey, Goli, all these people (ī hāmmo mardom) are going to pick green herbs here [and] there (āŋa īŋa), come on, let the two of us go, too”. [She] says, “Very well, let’s go”. [They] go out into this wilderness (hamī sahrā), you know, [they] pick green herbs, like this (āŋa īŋa). The man goes and finds a well. The man finds a well. [He] says, “Hey, my wife, look into this well (ē čāhā), what is this (ē) [thing] that shines?” The woman comes, sir, to look into the well. The man pushes her in such a way that she (lit. ‘the woman’) falls into the well.

In this section, there are three proximal demonstrative determiners, one proximal demonstrative pronoun, and two occurrences of the phrase āŋa īŋa ‘here and there, like this’ (lit. ‘there here’), which consists of both a proximal and a distal spatial deictic adverb. Three of the demonstratives8 occur in direct speech, and one in the narration. The deictic center for the reported speech is, of course, the location of the speaker, but the deictic center for the narration is now inside the story, something that can be seen in the use of a proximal deictic device outside direct speech (i.e., this wilderness). Here we thus see that a deictic shift has taken place. In the next five discourse units the story evolves in the following way:

[She] falls into the well and [he] comes back home. After four, five days her husband says, “I shouldn’t have done like this, I shouldn’t have thrown this one (ēšī) into the well. [She] was my wife”. Anyhow, [he] becomes troubled, takes a rope and goes. [He] takes [it] and goes until [he] arrives at the well. [He] throws the rope into the well and says, “Hey, Goli, if you are alive, take hold of the rope, so that I can pull you up”. Well, [he] pulls up the rope, like this (hamītaw). [He] sees the rope is heavy.

Here, there is only one proximal demonstrative pronoun in direct speech, and one adverb with proximal deixis in the narration. Although the husband is now rather far away from the well where he threw his wife, he is still able to use proximal deixis when he is talking about her. It seems that proximal deixis is used here to make a minor participant (in this case Goli) temporarily salient, i.e., the current center of attention (Levinsohn 2015: 139–140). In the next eight discourse units the story takes an unexpected turn.

7 Note that in Nourzaei et al. (2015) the story is translated into the past tense, since this is the default tense for narration of past events in English. Here, on the contrary, the translation reflects the actual tense form of the verb. The demonstratives are also translated with the corresponding English proximal or distal demonstrative to reflect the original structure.

8 The phrase āŋa īŋa is not counted here.


[He] pulls and pulls until [he] suddenly sees that a dragon came up. A dragon came up. [He] becomes panicky and wants to let go of the rope, but the dragon says, “Don’t let go of the rope, I will give you whatever you want”. The man says, “Fine, can you get the king’s daughter for me?” The dragon says, “Yes, I will get [her] for you”. So then this one (ē) (i.e., the dragon) says, “Very well, tonight I will go and wrap myself around the neck of the king’s daughter. Then, no matter who comes (lit. ‘came’), I will not unwrap myself except for you. When you come (lit. ‘came’), I will unwrap myself. Then say, ‘Oh king, if you are going to marry off your daughter, give [her] to me, and I will open up this dragon (ē aždahā)’”. Well, in the evening the dragon goes and wraps itself around the neck of the king’s daughter.

There are two proximal demonstratives in this section: one is a pronoun in a narrative section, indicating that the dragon, rather than the man, becomes the center of attention, and the other is a determiner in direct speech. The next five discourse units take place the next morning.

In the morning, when [they] get up, [they] see that the dragon is indeed wrapped around the neck of the king’s daughter. Even though [they] bring all kinds of wise men, all kinds from here [and] there (āŋa īŋar), [it] doesn’t unwrap itself. [They] say, “So, is there anyone left?” Someone says, “There is one person, a poor fellow who is called Ahmad, that one (ā) is left”. The king says, “Well, go and bring that one (hamāhī), too!”

Here, in addition to the phrase “here and there” already discussed above, there are two distal demonstratives, both in direct speech. It seems that there is a need to establish a contrast between the people at the deictic center (the court) and Ahmad, who is in another place, so therefore the distal deictic is used. There is thus both a spatial and a mental distance between those who are speaking and Ahmad. The next four units tell how Ahmad succeeds in becoming the king’s son-in-law.

[They] go and when [they] bring Ahmad, Ahmad says to the king, “O king, if you are going to marry off this your daughter (ē ǰanekat), give [her] to me, so that I can unwrap this dragon (ē aždahā) from her neck”. Well, then the king has no choice. [He] says, “Very well, I will marry [her] off, I will give [her] to you”. Anyway, [he] marries her off, [he] gives her to Ahmad and then Ahmad goes and whispers something in the dragon’s ear and the dragon goes away.

There are two proximal demonstrative determiners here; both are used in direct speech and for entities that are close to the speaker. In the following six units there is yet another development in the story, when the dragon wraps itself around the neck of another king’s daughter.


When (lit. ‘here that’ ēdān ke) the dragon unwraps itself to go, well, it says, “Ahmad, I am leaving right away (lit. ‘[I] went’) but if [I] ever wrap myself around anybody’s neck, [you] should not come, you know! If [you] do, then [I] will get angry and eat you”. [He] says, “Well, no, I won’t come”. Anyhow, the dragon goes its way. [It] goes its way, [it] goes to another town and wraps itself around the neck of another king’s daughter. Even though [they] bring all the wise men from here [and] there (āŋa īŋar), sir, [it] doesn’t unwrap itself. [It] doesn’t unwrap itself until people say, “In such-and-such a town there is a person called Ahmad, [he] is the king’s son-in-law. That one (ā) can unwrap it”.

In this passage there is one distal demonstrative pronoun in direct speech. This distal demonstrative indicates a long distance from the present deictic center, i.e., Ahmad who is in another town. In addition, it brings out the contrast between the wise men who failed and Ahmad, who is assumed to be able to solve the problem. There is also one proximal spatial adverb in the conjunction that denotes a temporal relation between the subordinate and the main clause at the beginning of the passage, and one occurrence of the phrase “here and there” discussed above. In the following six units, Ahmad is approached to solve the problem. He is hesitant at first, because of the dragon’s previous threat. There are no demonstratives in this passage.

So [they] go to find Ahmad. But Ahmad, who is dead scared of the dragon, is not coming. Anyway, the king says, “No, you must go. [It] is improper”, and such things. [He] sends him away. Ahmad is coming but [he] is worried, you know, and says to himself, “What should I do? The dragon will eat me”, things like that. Then, in the middle of the road, suddenly [he] gets an idea. [He] gets an idea about what to say (lit. ‘what [I] should say’). [He] says, “Great!” [They] see that Ahmad, who was very worried before, is now laughing and happy. Someone says, “Ahmad, how are you feeling?” [He] says, “Never mind, let’s go, I will unwrap the dragon”.

In the following ten units the story is resolved. [They] go and go and when they arrive here on this side (hamī īŋare) of the court, you know, [they] see that indeed the dragon is wrapped around the neck of the king’s daughter. So, these (šān), go closer. When the dragon’s eye falls on Ahmad, [it] gets angry. [It] says, “Well, didn’t [I] tell you not to come?!” Ahmad says, “I didn’t come to say ‘unwrap yourself’ actually. [I] didn’t come to say ‘come loose, and go!’” The dragon says, “So what do you have to say?” Ahmad says, “I only have a message for you”. The dragon says, “What is [it]?” Ahmad says, “Goli has come out of the well and she is looking for you!” Sir, the dragon unwraps itself out of fear, and how [it] is running!

In this final part of the story, there is one proximal spatial adverb, one proximal demonstrative determiner, and one proximal demonstrative pronoun. Again, Ahmad and his people arrive close to the palace; they do not pass by it in relation to where they came from. Where Ahmad (the main protagonist) is, the deictic center is there as well. The final unit is the formal ending, which does not belong to the actual story; in it the deixis has shifted from inside the story back to the actual occasion of the narration, thus the distal demonstrative determiner. Now, may our enemy experience what Goli did and our friend what that Ahmad (hamā ahmad) did.

In the rest of the material from Koroshi Balochi a similar picture emerges. The longest story in the corpus, entitled The King’s Son and published in Nourzaei et al. (2015: 162–209), can be summarized as in Table 3:

Table 3: Deixis in The King’s Son

              Proximal   Distal
Determiner    41         8
Pronoun       26         2
Adverb        13         1
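
The counts in Table 3 can be turned into proportions to make the predominance of proximal forms explicit. The following is only a minimal sketch: it assumes that the table is read with categories as rows and proximal/distal as columns, and the names `counts`, `proximal_share`, `shares` and `overall_share` are illustrative, not part of the original study.

```python
# Counts transcribed from Table 3 (The King's Son, Koroshi Balochi).
# Assumption: rows are word classes, columns are proximal vs. distal.
counts = {
    "determiner": {"proximal": 41, "distal": 8},
    "pronoun": {"proximal": 26, "distal": 2},
    "adverb": {"proximal": 13, "distal": 1},
}


def proximal_share(row):
    """Return the fraction of proximal forms within one word class."""
    total = row["proximal"] + row["distal"]
    return row["proximal"] / total


# Share of proximal forms per word class.
shares = {category: proximal_share(row) for category, row in counts.items()}

# Share of proximal forms over all deictic items in the story.
total_proximal = sum(row["proximal"] for row in counts.values())
total_distal = sum(row["distal"] for row in counts.values())
overall_share = total_proximal / (total_proximal + total_distal)

for category, share in shares.items():
    print(f"{category}: {share:.1%} proximal")
print(f"overall: {overall_share:.1%} proximal")
```

On this reading, the story contains 80 proximal deictic items against 11 distal ones, i.e., roughly 88% proximal.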

There are many examples where proximal deixis is used for objects and persons physically remote from the deictic center, but still at the current center of attention. Unit 12 (Nourzaei et al. 2015: 165) is said about a foal that has not yet been born, but which is the current center of attention due to the magical power it will possess. The previous two units have described how it should be raised, and the following statement is then made: Then this foal (ē korrag) can provide you with whatever you may want.

In unit 112 (Nourzaei et al. 2015: 200), a woman is talking to her husband about her sisters, who are not physically with them, but who are the reason why she is sad, and thus are important in this context. She says, I know that you are not bald, I know who you are, these (īšān) are ridiculing me. Distal demonstratives can create a contrast between two different entities. In units 75–76 (Nourzaei et al. 2015: 188), the six elder sisters, marked with a distal demonstrative, do one thing, and their little sister, marked with a proximal demonstrative, does something else and unexpected, which attracts attention. In fact, they are all at the same location when the action takes place:

So, those six sisters of hers (ā šīš gāhārī), each one of them hits someone, one, for example, hits the vizier’s son in the chest. In short, each one hits a rich person, you know. [They] hit some boys, but this youngest girl (ī kassānoēn ǰanek) doesn’t throw her apple.

Two more instances of a distal demonstrative being used to create a contrast between two entities are found. One is in the reported speech of unit 98 (Nourzaei et al. 2015: 195) and the other is in the narration of unit 111 (Nourzaei et al. 2015: 200). Then [they] say well, these six sons-in-law (ī šīš dūmād) say, “Well, there is [another] one too, he has (lit. ‘there is to him’) a lame mule, [he] comes afterwards, give him the meat, that one (ā) should bring [it]”. . . . these six sisters of hers (ē šīš gāhārī) keep ridiculing that one (āī) (i.e., the youngest sister) . . .

Distal demonstratives are also found as a reactivating device. In unit 70 (Nourzaei et al. 2015: 186), the two entities that are marked with the distal demonstrative determiner are present at the current deictic center, but they are marked with the distal demonstrative for discourse deictic purposes, i.e., to reactivate a previously mentioned topic without changing the deictic center. The proximal demonstratives in units 69 and 70 (ē/ī bāġā, ī) (Nourzaei et al. 2015: 185–186) indicate the current center of attention. She sees, dear Lord! [There] is a rider on a horse in this garden (ī bāġā), [he] is riding around, it is as if [there] is an angel riding around in this garden (ē bāġā). After a while [he] stops, [he] stops. [She] sees that [he] gets off the horse. [She] goes towards [him] and this one (ī) came towards [her] too. [He] pulls that same aforementioned stomach (hamā komaokā) over his head. [She] sees that it is that very bald one (hamā kačal) who works in their garden.

Two distal demonstrative determiners are found in the very last unit of the story, unit 141 (Nourzaei et al. 2015: 209), where the spatial anchoring is no longer located within the story but is at the moment of narration: Now, may it happen to our friend like to that king’s son (čō hamā šāhay bačā) and to our enemy like to those six servants (čō hamā šīš nawkarā).

There are three more distal demonstrative determiners in the text, which will be discussed in section 4.1 because they are all connected to temporal adverbs.

3.2 Sistani Balochi

For Sistani Balochi nine stories have been included in this study (Barjasteh Delforooz 2010: 286–325, 336–391). The results for two of them will be presented here. The rest provide a similar picture. The demonstrative determiners and spatial adverbs found in the Sistani Balochi texts in this corpus are presented in Table 4, and the Sistani Balochi demonstrative pronouns in Table 5 (see also Barjasteh Delforooz 2010: 138, 150–151).9

Table 4: Demonstrative determiners and spatial adverbs in Sistani Balochi

             Proximal       Distal 1       Distal 2
Determiner   ē, ī ‘this’    ā ‘that’
Adverb       idā ‘here’     ōdā ‘there’    ā(d)dā ‘far away there’

Table 5: Demonstrative pronouns in Sistani Balochi

               Nominative      Oblique   Genitive        Object
Proximal  SG   ē, yē, ī, ēš              ēšī, ešī        ēšā, ešā, ēširā, ēšīrā
          PL   ē, ī, ēš        ēšān      ēšānī, ešānī    ēšānā
Distal    SG   ā                         ā(y)ī           ā(y)irā
          PL   ā               āwān      āwānī           āwānā

9 These forms can add an emphatic am before the actual demonstrative (e.g., amē ‘this very’, amā ‘that very’, amēšī ‘of this very’, amēšān ‘these very’, amidā ‘right here’).

One of the Sistani Balochi tales, The Story of the Lion and the Three Bulls, told by the experienced storyteller Paraddin Gorgej (Barjasteh Delforooz 2010: 378–383), is here presented in its entirety. Like those in Koroshi Balochi, the stories in Sistani Balochi exhibit deictic shift, and in the story below proximal deixis is overwhelmingly predominant, to such an extent that nouns are used more frequently with a determiner than without.

Sir, [they] say, there were three cows. All these three cows (ī har say gōk) were in unity (lit. ‘in one heart’), there was a black one, there was a light brown one, and there was a white one. These (ē) were in unity from that [old] time (amā waxtā). Wherever [they] grazed no beast of prey had any power against these (ešānī). If there was a lion, if there was a leopard, if there was a wolf, that attacked one [of them], all three attacked [it] and no beast of prey attacked these (ēšānī sarā), because they were of one heart and of one mind. What did a certain lion do? [It] was stalking these (ēšānī), [it] was lying in ambush for these (ēšānī). It said, “Unless I change the mind of each one of these (ēšānā), I won’t
be able to eat these (ī) (lit. ‘these won’t be eaten’)”. This lion (ē šīr) came and lied to these (ēšānā), told these (ēšān) a lie, “O fellows, you are this kind of [good] friends (ē rangēn rapēɣ), I am your fourth brother. I have seen a pasture in a place. [It] is very green (lit. ‘spring’). You . . . let’s go there (ōdā), I will take you there, you eat that grass (lit. ‘spring’) (ā bahārā). I will watch over you”, the lion said. [It] deceived (lit. ‘made donkey’) these (ēšānā) and took [them]. When [it] took [them], [it] said, “Now I will watch over you on this mound (ē dikkayay sarā), you eat this grass (ē bahārānā)”. [It] passed, that day (lit. ‘today’) passed and the next day (lit. ‘tomorrow’) passed, [it] whispered in the light brown cow’s ear and in the white one’s. [It] said, “O friend, your hair and my hair are the same colour, that black [one] (amā siyāhēn) is ill-matched (lit. ‘unripe’) among us. You, don’t help [it], I will eat that [one] (āyirā), a lot of grass will remain for you”. [It] confused these ones’ (ēšānī) minds and one of the cows said, “Fine”. So it seized the black one. Those (ā) didn’t help. When [they] didn’t help, [it] overpowered this (ešī) single one. [It] ate this one (ēšā) up. When [it] ate and finished it . . . , when [it] finished this (ešā) [it] whispered in the light brown cow’s ear and said, “Your hair and mine are the same colour, you, don’t help [it], when I eat this (amē) white one, all the grass will remain for you. Then I will watch over you here (idā), you can eat!” [It] said, “Fine”. When [it] seized this one (ēširā) too, that one (ā) (i.e., the light brown one) didn’t help. The lion ate this one (ēširā) (i.e., the white one) too. When [it] finished this one (ēšā) too, [it] said to that one (āyirā), “Now [I] alone am more powerful than you”. [It] ate that one (āyirā) too. In this manner (ē rangī), with this trick (ē sīyāsat), [it] destroyed these ones (ēšānā).

Proximal deixis must be seen as the default in this story. It is interesting to note that even in the final comment “in this manner, with this trick, it destroyed these ones”, where a shift out of the actual story might have been expected, proximal deixis remains. In addition to the proximal deictics, the story provides some interesting examples of distal deixis. The first three, “that (ā) old time”, “there (ōdā)”, and “that grass (ā bahārā)”, can easily be explained as indicating a distant time and a distant place, far from the present deictic center. The following five, though, all indicate entities that are present at the deictic center; however, a contrast needs to be established between the cow that is to be eaten and the other cow(s).

Another story in Sistani Balochi, Xarmizza (Barjasteh Delforooz 2010: 286–293), contains the following deictic items (Table 6):

Table 6: Deixis in Xarmizza

              Proximal   Distal
Determiner    20         —
Pronoun       30         1
Adverb        1          3

As in the previous Sistani Balochi tale, here, too, proximal deixis is completely predominant when compared to distal deixis. This time, however, the proximal
determiner is used less frequently than the simple noun.10 There are several examples of how a proximal deictic determiner or pronoun is used even if the entity referred to is not present at the deictic center, such as when the king in his palace refers to a dragon far away from the palace with proximal deixis (units 14–20, Barjasteh Delforooz 2010: 287). The use of the verb “to bring” clearly shows that the dragon is not present at the palace: [They] came back. One of them said, “Lord king, [it] is a dragon”. The king said, “Oh . . . this dragon (ē aždiyā) has something to say, [it] has a petition, who can bring this one’s (ēšī) petition to me?”

In another very interesting example, proximal deixis is used to refer to the palace, although the story has now moved from the palace to the mountain, where the dragon obviously lives (units 34–43, Barjasteh Delforooz 2010: 288– 289). However, the passage starts out with a distal deictic adverb indicating a movement from the former deictic center (the town, where the dragon has come to seek help for a problem) to the new deictic center (the mountain, where the dragon takes the carpenter who has come to help). The dragon gave the carpenter one hint after another and went onto the mountain. When [he] went there (ōdā), good heavens ohhh . . . this dragon (ē aždiyā) has a mother as well. This one’s (ēšī) mother, there are (lit. ‘aren’t’) this [kind of] wild mountain goats (amē kōhay pāčin) with big horns, it has caught one of these (amēšān) and these (amē) its horns have got stuck in the mother’s throat and [it] is short of breath, the dragon has come here (idā) and informed the king.

In this example, “has come here (idā)” refers to the previous deictic center, i.e., the town where the previous scene was located, rather than to the mountain, where the hero and the dragon have now gone. There are a few additional distal demonstratives to account for, apart from the one in the example above “there (ōdā)”. Two more occurrences of “there (ōdā)” are found in the text, one referring to a place that is not the current deictic center (Barjasteh Delforooz 2010: 287) and one at the very end of the story “the name xarmizza (‘melon’, lit. ‘donkey tasted’) remained from there (ōdā)” (Barjasteh Delforooz 2010: 293). The latter occurrence must be seen as an addition to the actual story where the narrator shifts the deictic center from within the story to the moment of narration to account for why this fruit is called what it is even now. The one distal demonstrative pronoun in this text (Barjasteh Delforooz 2010: 289) is found in a context where there is a need to create a contrast between two entities that both are present at the deictic center, namely the carpenter, who rescues the mother dragon and is the participant through whom the story will continue to develop, and the mother dragon herself, who has no further part to play in the story:

This one (ē) (i.e., the carpenter) sawed the wild goat’s horns and that one (ā) (i.e. the mother dragon) was rescued.

10 For example, the noun for “king” occurs once with the proximal determiner and six times with no determiner.

3.3 Vafsi

Seven stories have been investigated for Vafsi. The results from four of them are presented here. The other stories show similar results. The demonstrative determiners and adverbs found in the Vafsi texts are presented in Table 7 and the demonstrative pronouns in Table 8 (see also Stilo 2004: 225, 227).

Table 7: Demonstrative determiners and adverbs in Vafsi

             Proximal                Distal
Determiner   in ‘this’               an ‘that’
Adverb       indi, indiænæ ‘here’    andi, andiænæ ‘there’
             ena ‘this way’          ana ‘that way’

Table 8: Demonstrative pronouns in Vafsi

               Nominative   Oblique
Proximal  SG   in           tini, intine, intini
          PL   ine          tinan
Distal    SG   an           tani, tane, antane
          PL   ane          tanan

As in Koroshi and Sistani Balochi, deictic shift and an overwhelming predominance of the proximal demonstratives are found in the Vafsi narratives. This can clearly be seen in the tale entitled Moses, the holy hermit and the infidel chieftain (Stilo 2004: 110–115). This is the story in full:

11 The phrase hæzræt-e musa ‘his holiness Moses’ with the honorific hæzræt will be translated as only ‘Moses’ in the rest of the story.
12 There are two words translated ‘infidel’ in this story. In the general discussion at the beginning of the story the word kafær is used. Later on in the story the word gæbr is used as well, sometimes in combination with kafær.

Once his holiness Moses11 went to Mt. Sinai and asked, “Oh God, is a merciful infidel12 better (lit. ‘good’)? Or a merciless Muslim?” A voice proclaimed from the heavens, saying, “Of course, a merciful infidel is better than a merciless Muslim. Now go out from this mountain (in ku) and see for yourself”. When Moses went out of this mountain (in ku),
he went up to this top of a mountain (in ku-kællæ) and saw that a holy hermit – he wasn’t actually there (andi) himself – had put some bushes under this one’s (tini) cooking pot as firewood, and the fire had gone out. Moses said, “Well, since this [fire] under the pot (in zer-dizi) went out, I will light this [fire] under the pot (in zer-dizi). When this holy hermit (in abed) comes back, maybe [he] will give me a little of this lunch of his (in nahares), let me take the trouble to light the fire myself”. When Moses came to light this [fire] under the pot (in zer-dizi), the bushes got bumped and everything in the pot spilled out. Just as it spilled out, this holy hermit (in abedæ) came back and said, “Why did you spill the pot out?” [Moses] said, “Honestly, I wanted to make this (in) better, but [it] turned out bad to rekindle the fire under the pot”. “No, what business was [it] of yours?” And the hermit threw two or three punches at Moses and Moses threw two or three punches at the hermit, and through an act of the Lord the backs of these (tinan) two stuck together. After the backs of the two had stuck together, as [they] walked along, Moses carried this one (tini) on his back for a while and then the hermit carried Moses for a while. Until [they] arrived, that is to say, to a plain where [they] saw, wow, a hundred tents were pitched. These infidels (in gæbre)13 . . . [these] were the tents of infidels. [They] got up to these infidels (in gæbran) – their chieftain himself was 90 years old . . . was 100 years old and his wife 90 years old. These (tinan) hadn’t had any children. But the Lord had just given these (tinan) a newborn in swaddling clothes. 
[He] had just given these (tinan) a newborn when these (ine) two arrived and here (indi), so to say, were this one’s (tine) guests and this infidel chief (in sær-kærde-ye-gæbri, gæbr-e kafæri) said, “Well, how is it that you have got like this (æzin)?” [They] said, “Well, our story is like this (æzin), it happened like this (æzin). [We] got into a fight and we ended up like this (æzin)”. [He] said, “well, you . . . is there no cure for this (in)?” [He] said, “Yes, there is a cure”. Moses turned and said, “[There is] a cure. But [it] should be an infidel. [He] should be the chief of 100 tents. He should be 100 years old himself. His wife should be 90. [They] should have had no son, but the Lord should just have given that one (tani) a son. [They] should take that son (an lazey) and slit his throat (lit. ‘cut his head’), right between the two of us, then we will come apart from one another”. This infidel’s (in gæbri) wife had gone to the bathhouse and had put this baby (in zarru) to sleep in the cradle. This infidel (in gæbri) considered it for a bit and said, “Well, I didn’t have a child until now. And now, God has willed [it] (lit. ‘it pleases God’). Now, let me slit this one’s (tini) throat so these (in) can get apart from one another, I don’t want this child (in zarru) any more”. This infidel (in gæbr) goes and gets a pen knife and slits his son’s throat, right between Moses and the hermit, and these (ine) come apart from one another and start walking away. [They] start walking away and [he] quietly puts this newborn (in qondaq) back into the cradle and covers his head, but when the wife returns from the bathhouse, this infidel (in gæbr) says, “Oh God, what should [I] say in answer to my wife?” This one (in) starts to beseech the Lord. The wife comes back and says, “Husband, the baby hasn’t woken up, has he?” [He] says, “No, [he] hasn’t woken up yet. All of a sudden, [he] sees the baby crying. 
As the baby is crying, the woman comes and picks up the newborn to breast-feed [him] and this infidel (in gæbr) comes over to inspect this newborn’s (in qondaqi) throat. The woman asks him, “Husband, what are you looking at so closely?” [He] tells her, “Wife, to tell the truth, [it] happened like this (æzin). I slit this one’s (tine) throat and [it] happened like this (æzin)”. The wife says, “Husband, those two people (an do næfær) were people of very high standing. Run and bring those ones (tanan) back for a feast”. This one (in) runs, actually goes and brings these ones (tinan) back. [He] says, “[You] are people of very high standing”. [He] brings these ones (tinan) back and puts on (lit. ‘gives’) a feast. When [he] brings these ones (tinan) back, this one (ine) says, “I am Moses”. So this infidel (in gæbr) and the people in the hundred tents all become Muslims. When Moses goes back to Mt. Sinai God asks, “O Moses, is a merciful infidel better? Or a merciless Muslim?” [He] answers, “O God, a merciful infidel”. And [that] is the end. This (in) is the Vafsi for it.

13 The different case forms of the noun gæbr found here are direct singular (gæbr), oblique singular (gæbri), direct plural (gæbre), oblique plural (gæbran) (see Stilo 2004: 223).

Deictic shift applies to the whole story, even in the comment at the end “this (in) is the Vafsi for it”. In this story, the vast majority of spatial and discourse deictic lexical items thus denote proximal deixis and place the narrator and the audience inside the story. There are only five distal deictic items. The first one, “he wasn’t actually there (andi) himself”, is an explanatory comment to the audience outside the story. The two following ones, “the Lord should [just] have given that one (tani) a son” and “they should take that son (an lazey) and slit his throat”, occur in a description of an imagined situation. The final two deictic devices are “those two people (an do næfær) were people of very high standing. Run and bring those ones (tanan) back for a feast”. In this case, distal deixis is in fact used for persons who are far from the deictic center, but it could also fulfill the function of highlighting a climactic statement.

In the three stories The Needle Dirties Himself, The Molla and the Jew, and Shangol, Mangol and Dastegol (Stilo 2004: 49–57) there are no distal deictics whatsoever, except for the adverb ana ‘that way’ in the expression ena-o ana ‘this way and that way’ (unit 41, Stilo 2004: 56). There are, however, several instances of a proximal deictic demonstrative being used to denote an entity that is not present at the place where the speaker is, such as in unit 33 (Stilo 2004: 48–50), where the needle says “[I]’ll go tell this mouse (in muši) [and] he’ll come make a hole in your bed”. The presence of the verbs “to go” and “to come” shows that the mouse is not where the needle is, though it is the center of attention within the reported speech.

3.4 Gorani

Three texts in the Gawraǰūyī dialect of Gorani have been analyzed in this study. Spatial and discourse deictic demonstratives and adverbs in this dialect are presented in Tables 9 and 10 (see also Mahmoudveysi et al. 2012: 15, 17).

Table 9: Demonstrative determiners and adverbs in Gorani of Gawraǰū

             Proximal             Distal
Determiner   ī ‘this’             ā ‘that’
Adverb       īnā, īnahā ‘here’    āna, ānā ‘there’

Table 10: Demonstrative pronouns in Gorani of Gawraǰū

      Proximal              Distal
SG    īn(a), īnī            ān(a), ānī
PL    īnān(a), īn(ak)ānī    ānān(a), ānānī

The three stories Tītīla and Bībīla14 (Mahmoudveysi et al. 2012: 63–77), The Tale of Bizbal (Mahmoudveysi et al. 2012: 81–88), and Mard and Nāmard (Mahmoudveysi et al. 2012: 96–103) will be analyzed here. The text Mard and Nāmard, which contains the largest number of deictics, is presented in full. Parts of the story where there are no deictics are summarized in brackets.

14 The story Tītīla and Bībīla is a longer version of the same story as Shangol, Mangol and Dastegol in Vafsi, and The Tale of Bizbal is also structurally very similar to The Needle Dirties Himself in the Vafsi corpus.
15 The storyteller changes the three animals throughout the story. The bear turns into a leopard, and on one occasion a dog is also mentioned.
16 An alternative translation of īna here and below is ‘it is so/thus’ (personal communication, Denise Bailey).

Well, where should [we] begin, where should [we] hear [it], the story of two friends, two men. Both of them go looking for work. [They] are together; their names are Mard and Nāmard. Both of them make a contract together; one says, “Brother”. The other says, “Yes?” The first one says, “[We] will go find work to do and a town, a place, where we may earn a morsel of bread for our children, and [we] will come back again together”. [They] say, “All right”. From home, this one (ī(n)) wraps up bread and [other] victuals and ties [it] to his back. That one (ānī), simply brings bread and [other] victuals and ties [it] to his back. (They eat and then they fall asleep. Nāmard wakes up first. He takes off with all the foodstuffs. When Mard wakes up, he realises that he has been abandoned.) [He] goes a long way until he reaches the inside of a mill, a machine. [He] goes inside there (āna); [it] is old, nothing anymore, for example, [they] do not work in it anymore. Eh, [he] sits down inside there (āna), [he] hides himself. (A bear, a wolf, and a lion come.)15 One of them says, “[Here] is the scent of a human being!” One of them looks around and says, “There is nothing, believe [me], no, [there] is no human being in this place (ī dawray)”. [They] sit down and – like me now – they tell a story. This one (īn) (i.e., the wolf) says, “Brother”. [They] say, “Yes?” [It] says, “The king’s daughter has gone insane. Do [you] know what the cure for her is?” It is this (īna). The wolf speaks. These (īnakānī) say, “No”. The wolf says, “They tried all kinds of medicine and remedies, but there has not been a cure for her. The dog with the flock, if I only were a human being, [I] would have killed that dog (ā tūta) with the flock, [I] would have taken out its brain, [I] would have left it out in the sun, so [it] would have become dry. [I] would have ground [it], [I] would have brought [it], [I] would have steeped [it] like tea, [I] would have given [it] to the king’s daughter, so she becomes completely well again”. The man says: “Well, this (īna) is the first of the stories”.16 The wolf says, “As for me, I would eat,
be full with its meat, of the flock”. It is not my concern anymore, finally then, that one (āna), that one (āna) tells another story, thing. The lion answers, he says: “Have you seen this tree (ī dāra) outside this mill (az ī bar āsyāw)? [It] has become dry, this (īna) has not brought forth fruit for several years. If only I were a human being, if [I] could find a way for the tree to spread its roots. There were three royal vases in it. [They] are full of gold and precious stones. If only I were a human being, if [I] would have found [it], this tree (ī dāra) too would have then born fruit. This (īna) is the second story. The leopard answers, saying, “Inside the mill, whatever they did, it has not worked. You must find it. There are also two vases in it. If the owner would come for attending this mill (ī āsyāwa), he would put it to work, it would start to work”. This (īna) is all three stories. Brother, as for the man, Mard, [he] simply listens until the early morning becomes day. (Then he goes, finds the vases, puts them in a place where he can find them later, finds the flock, kills the dog and takes out its brain.) [He] takes [it] out in that same way (ā jür(a)) the wolf said, [he] puts it out in the sun, [it] becomes dry, and [he] grinds [it] and puts [it] into his bag. [He] sets off on his way, [he] goes. [He] goes, [he] reaches the city, where [he] sees that, yes this (īna) is [it]. The king whose daughter has become insane is from this city (ī šāray). Finally, [he] reaches there (ānā) and says, he goes and knocks on the door of the king’s house, and this one (īn) (i.e., a person at the door) says, “Who is [it]?” That one (ān) (i.e., another person) says, “Who is [it]?” And [Mard] says, “I have come to cure your daughter, [I] am a doctor”. These ones (īnān) in turn say, the people in the king’s house say, “So many medicines and doctors came and [they] brought remedies, and the doctor gave medicine; his medicine did not bring about healing. 
(Mard claims to be different and promises to cure the king’s daughter within a few days.) The king says, “What is this (īna)?” The servant says, “By God, a young man has come, [he] says, ‘I will cure his daughter’. Your highness, what do [you] command?” The king says, “Let [him] come upstairs, no problem, [he] is welcome, this one (īn) too, up like those doctors”. The king says, “Well, doctor”. (Mard introduces himself and asks what he will get if he cures the king’s daughter.) The king says, “My daughter, as a gift, [I] will give [her] to you, [I] will also give this crown and throne (ī tāǰ-u taxt) of mine to you”. Mard says, “No, may your crown and throne be a gift to yourself. But if [I] cure your daughter, then [I] want your daughter in marriage”. The king says, “So be [it], may [she] be a gift to you”. So [they] make a contract there (āna). This one (īn) (i.e., Mard) also goes, [he] goes a little way to attend to the girl. (He gives her the medicine, trying to remember what the wolf had said.) “After that, for example, then, anoint her back and these (īnān) (i.e., some other body parts) with it, put the medicine on it until she is well again”. (The girl gets well and is given to Mard in marriage. Then Nāmard turns up.) Nāmard says, “I recognized you. You are Mard, [you] are indeed Mard, what have [you] done that you reached this [high] position (ī pāya)? I wander about in this state (ī ǰüra) without purpose, [I] still have achieved nothing, nothing at all”. Mard says, “You are not a good man, [you] have proven yourself. We were friends, you yourself stole the bread and went your way. [You] did not wait right at that moment. I was so hungry, [I] ate earth. You man without a conscience! Nevertheless, now I will also give you this advice (ī řāwēza), listen! Me, from then on, this God (ī xwiyā) had mercy on me, [he] placed this much good (ī hamkay xayrša) in front of me (lit. ‘my mouth’). 
Go into the mill, to a corner high up, a leopard and a dog and a lion, [they] come back in the evening, [they] talk. Listen to their stories”. Nāmard says, “Fine!” Brother, this one (īnī) goes at once. They have a pipe for the
stove, Nāmard goes and just sits up on that stovepipe and makes himself very comfortable. In the evening, the wolf and the lion and the leopard return. [They] say, “[Here] is the scent of a human being!” [They] grab Nāmard by his leg, bring him down and tear him to pieces. A bouquet of flowers, a bouquet of narcissus, may I never see your death, never.

The proximal demonstrative determiner is not used as frequently in this text as in the Koroshi Balochi, Sistani Balochi, and Vafsi texts. However, it is still clear that we find deictic shift and that the deictic center is mostly inside the story. The four occurrences of the deictic adverb “there (āna, ānā)”, however, seem to break this pattern and move the deixis outside the story. The four occurrences of the distal demonstrative pronoun can be explained as establishing a contrast between Mard and Nāmard, between the first and the second animal who tell their stories, and between the first and the second person at the king’s door, while the two occurrences of the distal demonstrative determiner can be explained as indicating spatial and temporal distance. When the wolf tells the story, that dog (ā tūta) is not present, and that same way (ā jür(a)) refers back in the discourse to when the wolf had told its story. There is a formal ending to this story as well, but there are no deictic elements in it.

The other two stories Tītīla and Bībīla and The Tale of Bizbal contain the following deictic elements (Table 11):

Table 11: Deixis in Tītīla and Bībīla and The Tale of Bizbal

              Proximal   Distal
Determiner    12         5
Pronoun       8          3
Adverb        —          —

Distal deixis with a determiner is found four times in the expression “the other side” of the river (units 6, 7, 12, Mahmoudveysi et al. 2012: 63–64), thus indicating a contrast to the side of the river where the deictic center is located: [They] go to the Zimkān [river]; the flock goes to that (ā) [other] side.

A distal demonstrative pronoun is found once in connection with the main protagonists, where they are probably both contrasted with “the flock” and also highlighted (unit 4, Mahmoudveysi et al. 2012: 63): [They] are at home; the flock goes to the mountains and those (ānān) stay at home.

Another distal demonstrative pronoun (unit 75, Mahmoudveysi et al. 2012: 73) occurs after a dialogue, and refers back to the protagonist who was not the last speaker (the wolf), indicating that the wolf remains the current center of attention. Thus, the distal demonstrative creates a contrast between the wolf and the goat: The goat says, “Morning, at midday [there] will be war. [I] will come to the square and [we] will fight”. The wolf says, “All right”. That one (ānī) (i.e., the goat) comes, comes to Lālo Pāydar . . . (units 73–75, Mahmoudveysi et al. 2012: 73).

However, in another example (unit 40, Mahmoudveysi et al. 2012: 84), the distal demonstrative pronoun does refer back to the last speaker (Auntie Tahmineh). The effect is to direct attention back to the previous speaker (the cat), who is the main character of the story: So the cat goes and says to the mother of Čiman, the cat says, “Auntie Tahmineh, please send your daughter, [she] should dance for one hour”. That one (ānī) (i.e., Auntie Tahmineh) says, “Brother, she has no shoes, you must go and make shoes for her”. The cat says, “Sure”. (units 39–41, Mahmoudveysi et al. 2012: 84)

4 Temporal deixis in the analyzed corpus

The main means of anchoring a story in time is the tense form of the verbs. In the following subsections, the use of tense in Koroshi Balochi, Sistani Balochi, Vafsi, and Gorani narratives will be investigated. Interesting observations about temporal adverbs will also be noted.

4.1 Koroshi Balochi In Koroshi Balochi, “the default tense of narration of past events [. . .] is the nonpast form” (Nourzaei et al. 2015: 20). The introduction of a story, as well as background material in the story, are in the past tense, whereas all the events in the main story line are in the non-past tense. This can be observed in the story Goli and Ahmad, published in Nourzaei et al. (2015: 130–146), which was presented in full in section 3.1 above, as well as in the other stories in the corpus. The verb forms at the beginning of Goli and Ahmad are presented below as they are in the original text. (Note that the translation of the verb forms was somewhat modified in section 3.1 to give a more idiomatic translation into English.) Verb forms in direct speech have not been analyzed, since they are not part of the story line. In fact, it could be argued that verbs in subordinate nominal clauses after verbs of perception are not part of the story line either, but they have been included in the analysis because they exhibit an interesting variation in tense

use. The same pattern as in this extract is found in all the tales in Koroshi Balochi.17 Well, once upon a time (lit. ‘Well, there was [ad] one, there was [nayad] no one, except for God, there was [nayad] no one’) well, a woman, there has been (boda) a woman, her name has been (boda) Goli. Actually she has been (boda) very bad. Actually she has been (boda) very bad. She has been troubling her husband (azzīyate šūay makanā boda), you know. So, one day her husband says (ašī), “Hey, Goli, all these people are going to pick green herbs here and there, come on (byā), let the two of us go, too”. She says (ašī), “Very well, let’s go”. They go (arran) out into this wilderness, you know, they have been picking (mačenēn boda) green herbs, like this. The man goes (arra) and finds (pēdā akant) a well. The man finds (pēdā akant) a well. He says (ašī), “Hey, my wife, look into this well, what is this thing that shines?” The woman comes (akay), sir, to look (say kan) into the well. The man pushes her (lohe adā) in such a way that she goes (arra) into the well. She goes (arra) into the well and he comes (akay) back home. After four, five days her husband says (ašīt), “I shouldn’t have done like this, I shouldn’t have thrown her into the well. She was my wife”. Anyhow, he becomes (abī) troubled, takes (azo) a rope and goes (arra). He takes (azo) it and goes (arra) until he arrives (arasī) at the well. He throws (aprēnī) the rope into the well and says (ašī), “Hey, Goli, if you are alive, take hold of the rope, so that I can pull you up”. Well, he pulls (akašī) up the rope, like this. He sees (agennī) the rope is (en) heavy. He pulls (akašīd) and pulls (akašīd) until suddenly he sees (agennīt) that a dragon came (āk) up. A dragon came (āk) up. He becomes (abī) panicky to let go (wel dā) of the rope, but the dragon says (ašī), “Don’t let go of the rope, I will give you whatever you want”. 
The man says (ašī), “Fine, can you get the king’s daughter for me?” The dragon says (ašī), “Yes, I will get her for you”. Then the dragon says (ašī), “Very well, tonight I will go and wrap myself around the neck of the king’s daughter. Then, whoever came, I will not unwrap myself except for you. When you came, I will unwrap myself. Then say, ‘Oh king, if you marry off your daughter, give her to me, so that I may open up this dragon’”. Well, in the evening the dragon goes (arra) and wraps (apēčī) itself around the neck of the king’s daughter.

All the verbs in the story line are in the non-past tense. The only verbs found in the past tense are those that introduce the story in the first paragraph, and those that describe background imperfective (ongoing) events (they have been picking (mačenēn boda) green herbs). Once there is also a verb in the past tense in the subordinate nominal clause after a verb of perception (he sees (agennīt) that a dragon came (āk); i.e., ‘had come’). In another nominal clause, however, the non-past is used (he sees (agennī) the rope is (en) heavy).

17 Subject pronouns and other words that have been supplied for syntactic reasons or for the sake of better comprehension are not marked in this section, since the discussion here concerns the verbs.

For tense forms of the verbs, we thus have a deictic shift in Koroshi Balochi and the tense is anchored in the very story, which is portrayed as ongoing at the very moment by means of using the non-past tense. There are, however, three time adverbials in the corpus that modify this picture slightly, since they all contain a distal demonstrative determiner: that day (ā rōč) (unit 18, Nourzaei et al. 2015: 168), that day (ā rō) (unit 68, Nourzaei et al. 2015: 185), and that hour (ā sāhat) (unit 100, Nourzaei et al. 2015: 196).
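The tense distribution described in this section can be tallied mechanically once each verb form has been labeled by hand. The following Python sketch is only an illustration under stated assumptions: the verb forms and their tenses come from the Goli and Ahmad extract above, but the role labels (`intro`, `background`, `event`, `complement`), the data layout, and the function name `tense_by_role` are hypothetical and are not part of the original analysis.

```python
from collections import Counter, defaultdict

# Hand-labeled sample of verb forms from the Goli and Ahmad extract.
# Each entry is (verb form, discourse role, tense); the role labels are
# illustrative assumptions, the tenses follow the description in the text.
VERBS = [
    ("ad", "intro", "past"),
    ("boda", "intro", "past"),
    ("mačenēn boda", "background", "past"),
    ("ašī", "event", "non-past"),
    ("arran", "event", "non-past"),
    ("pēdā akant", "event", "non-past"),
    ("akay", "event", "non-past"),
    ("agennī", "event", "non-past"),
    ("āk", "complement", "past"),
    ("en", "complement", "non-past"),
]


def tense_by_role(verbs):
    """Count tenses separately for each discourse role."""
    counts = defaultdict(Counter)
    for _form, role, tense in verbs:
        counts[role][tense] += 1
    return {role: dict(tenses) for role, tenses in counts.items()}


summary = tense_by_role(VERBS)
for role, tenses in summary.items():
    print(role, tenses)
```

In this small sample, every event-line verb is non-past, while the past tense is confined to the introduction, the background, and one complement clause after a verb of perception, which matches the pattern reported for Koroshi Balochi.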

4.2 Sistani Balochi

The picture that emerges for Sistani Balochi is somewhat different. Here all the verb forms are in the past tense and therefore the story is temporally anchored at the time of narration rather than within the story itself. All verb forms in the narrative parts of the story (direct speech excluded), which is the same story as that in section 3.2, are given below. Only the evidential marker gušīt at the very beginning of the story is in the non-past tense, which is logical, since it refers to a non-past event (Barjasteh Delforooz 2010: 378–383). All the other verb forms are in the past tense.

Sir, they say (lit. ‘he says’) (gušīt), there were (atant) three cows. All these three cows were (atant) in unity (lit. ‘in one heart’), there was (at) a black one, there was (at) a light brown one, and there was (at) a white one. These were (atant) in unity from that old time. Wherever they grazed (čartant) no beast of prey had (nadāšt) any power over them. If there was (būtēn) a lion, if there was (būtēn) a leopard, if there was (būtēn) a wolf, that attacked (alma kurtēn) one of them, all three attacked (amlaa kurtant) it and no beast of prey attacked (lit. ‘went on’) (našut) them, because they were (atant) of one heart and of one mind. What did a certain lion do (kurt)? It was (būt) stalking them, it was (būt) lying in ambush for them. It said (guštī), “Unless I change the mind of each one of them, I won’t be able to eat them”. This lion came (āt) and lied (drōg ǰat) to them, told them a lie (drōgē ǰat), “O fellows, you are such good friends, I am your fourth brother. I have seen a pasture in a place. It is very green (lit. ‘spring’). You . . . let’s go there, I will take you there, you eat that grass, I will watch over you”, the lion said (gušt). It deceived (lit. ‘made donkey’) (xarē kurt) them and took (burt) them. When it took (burtē) them there, it said (guštī), “Now I will watch over you on this mound, you eat this grass”.
It passed (būt), that day passed (būt) and the next day passed (būt), it whispered (lit. ‘entered’) (putrit) in the light brown cow’s ear and in that of the white one. It said (guštī), “O friend, your hair and my hair are the same colour, that black one is ill-matched (lit. ‘unripe’) among us. You, don’t help it, I will eat it, a lot of grass will remain for you”. It confused (lit. ‘ruined’) (xarābē kurt) their minds and one of the cows said (guštī), “Fine”. So it seized (čalāpt) the black one. They (i.e., the other cows) didn’t help (kumak nakurtant). When they didn’t help (kumak nakurtant), it overpowered (lit. ‘was strong’) (zōr at) this single one. It ate (wārtē) this one up. When it ate (wārt) and finished it (alāsē ku) . . . , when it finished it (alāsē ku), it whispered (lit. ‘sneaked’) (putrit) in the light brown cow’s ear and said

(guštī), “Your hair and mine are the same colour, you, don’t help it, when I eat this white one, all the grass will remain for you. Then I will watch over you here, you can eat!” It said (guštī), “Fine”. When it seized (gipt) this one too, that one (i.e., the light brown one) didn’t help (komak nakurt). The lion ate (wārt) this one (i.e., the white one) too. When it finished it (alāsē kurt) this one too, it said (gu) to that one, “Now I alone am more powerful than you”. It ate (wārt) that one too. In this manner, with this trick, it destroyed (ziyānē kurt) them.

When it comes to time adverbials, unlike tense, they sometimes demonstrate proximal deixis, i.e., anchoring in the story (deictic shift). There are a number of instances of proximal time deictics in the corpus, such as the demonstrative deictics in unit 114 (Barjasteh Delforooz 2010: 343, units 113–119 are translated below) and unit 15b (Barjasteh Delforooz 2010: 358), which refer to time rather than to space: From the first day when this girl had been born, until this very [time] (amē) when she was mature and had reached puberty, except for her mother and father who knew that this was a girl, even the neighbours didn’t know that this was a girl. . . . the Baloch nomad, at this moment (lit. ‘here’) (idā) (i.e., at the moment when his camel was dying), experienced the feeling of desperation. . . .

Also the time adverbials “today, tomorrow, and the day after tomorrow” are used in these texts in narrative parts to refer to “the same day, the next day, the day after the next” (unit 10, Barjasteh Delforooz 2010: 343; units 42–43, Barjasteh Delforooz 2010: 380; unit 28, Barjasteh Delforooz 2010: 385; unit 50, Barjasteh Delforooz 2010: 387). Units 41–44 (Barjasteh Delforooz 2010: 380–381) read as follows: It passed, that day (lit. ‘today’) (mrōčī) passed and the next day (lit. ‘tomorrow’) (bāndā) passed, it whispered in the light brown cow’s ear and in that of the white one.

Another passage that would indicate temporal anchoring in the actual narration is units 72–73 (Barjasteh Delforooz 2010: 299) where the word šapī is translated “tonight”. However, in view of the fact that a parallel adverb sōbī has been translated “in the morning” (unit 119, Barjasteh Delforooz 2010: 374; unit 72, Barjasteh Delforooz 2010: 388), it seems likely that šapī can also be translated as “at night”18: “Surely I will die tonight (or ‘at night’) (šapī). Death did not come to him tonight (or ‘at night’) (šapī). . .”.

18 This translation, or the translation ‘this night’, was also suggested by one of my Baloch friends.

4.3 Vafsi

The verb forms in Vafsi exhibit yet another pattern. Most stories start out in the past tense, not only for the initial background description, but also for foreground events in the event line at the beginning of the story. In Tale B7 (Stilo 2004: 110–113), given in full above, there is one tense shift to the non-past tense, which takes place between units 24 and 27.19 In Tale A1 (Stilo 2004: 26–29), Tale A3 (Stilo 2004: 32–47), Tale A4 (Stilo 2004: 48–51), Tale A9 (Stilo 2004: 104–109), and Tale B8 (Stilo 2004: 116–123), there are several shifts between past and non-past tense. All the tales start out in the past and end in the non-past tense, except for A1, which ends in the past tense. Tale A5 (Stilo 2004: 52–53) is in the past tense from the beginning until the end. This is also true for Tale A6 (Stilo 2004: 54–57). The point where tense shift in Tale B7 occurs is presented here:

The infidel considered (molazæs bækærde) it for a bit and said (va), “Well, I didn’t have a child until now. And now, God has willed it (lit. ‘it pleases God’). Now, let me slit his throat so they can get apart from one another, I don’t want this child any more”. This infidel goes (ætari) and gets (ærgiri) a pen knife and slits (ærbirine) his son’s throat, right between Moses and the hermit, and they come apart (joda -rbuænd) from one another and start going (bæna -rkærende siæn).

In Vafsi, the same word is used for ‘tomorrow’ and ‘the next day’ (soæy) (Stilo 2004: 277). In Text A5 (unit 9, Stilo 2004: 52), a proximal demonstrative determiner is even added before soæy. An enclitic pronoun (=s) is added to the phrase soæy šo ‘tomorrow night’ to denote ‘the next night’ (unit 91, Stilo 2004: 40): The next day (lit. ‘this tomorrow’) (in soæy) the Jew came . . . Then it turned into the next night (lit. ‘it’s tomorrow night’) (soæy šos). On the next night (lit. ‘it’s tomorrow night’) (soæy šos), this vizier got up again.

It is also interesting to note the proximal demonstrative determiner in the temporal expression “at first” in Tale A9 (unit 31, Stilo 2004: 106): At first (lit. ‘this first’) (in ævvæl) they said no.

19 Units 25–26 are direct speech.

4.4 Gorani

In the three Gorani tales, when direct speech is excluded, the verbs are almost invariably in the non-past tense both in the introduction and in the event line. One or two past tense verb forms are, however, used for background information, e.g., in a subordinate adverbial clause (unit 13, Mahmoudveysi et al. 2012: 64) and in a relative clause (unit 64, Mahmoudveysi et al. 2012: 100). Units 2–8 of the tale Tītīla and Bībīla (Mahmoudveysi et al. 2012: 63–64), where both the introduction and verbs in the event line are found, read like this:

Tītīla and Bībīla, in the Kurdish language we say (mwāžām), “The lame goat and the lame ram”. There is (mawu) a goat and there is (maw(u)) a ram; they are (mawin) lame. They are (mawin) at home; the flock goes (mašu) to the mountains and they (i.e., the goat and the ram) stay (mamanin) at home. They . . . The front of the gate is (mawu) open; they go (mařawin) out of the courtyard and they say, “Let’s go, let’s reach the flock”. They go (mašin) to the Zimkān river; the flock goes (mašu) to that other side. From the side of the Zimkān river, the flock crosses (lit. ‘does’) (makarī) to that other side. The flock crosses (lit. ‘does’) (makarī) to that side. . . . Suddenly they (i.e., the goat and the ram) say (mwān), “Hey, brother!” The ram says (mwāy), “Yes?”

When it comes to time adverbials, Mahmoudveysi et al. (2012: 81 n. 43) note that the difference between “that/this evening” and “the following evening/night” is īšaw versus ī šaw. What is relevant to this discussion is that according to this definition, there is no difference between “that evening” and “this evening”. It is also interesting to note that the phrase “the following evening/night” contains the proximal demonstrative determiner (ī).

5 Summary and comparison with other Iranian languages

The four linguistic variants in this study show interesting variation when it comes to deictic shift (see Table 12). First of all, it is more common for spatial deixis to be shifted to the story than for tense to be anchored in the story. Koroshi Balochi, Sistani Balochi, and Vafsi present almost total spatial deictic shift, whereas in Gorani there are a few cases of the deictic adverb ‘there (āna, ānā)’, which move the deixis outside the story. It also seems that, at least in Koroshi Balochi, Sistani Balochi, and Vafsi, proximal deixis is the norm for repeatedly marking the current center of attention.20 Gorani is the language that has the strongest tense anchoring inside the narrative, with almost exclusive use of the non-past tense. At the other extreme we find Sistani Balochi, which has no tense anchoring in the narrative (only past tense verb forms), but does have some adverbials that move the deictic center into the story. Vafsi is ambivalent, with constant shifts between the past and non-past tenses. Normally the tales start out in the past tense and end in the non-past tense. As for Koroshi Balochi, tense use is consistent, with background material in the past tense and foreground material in the event line in the non-past tense. On the other hand, there are a few temporal adverbials with distal demonstrative determiners in Koroshi Balochi.

20 Thomas Jügel (personal communication) also notes the same phenomenon in German, and comments that distal pronouns are rarely used, their function being taken over by proximal pronouns and elements that are neutral with respect to deixis.

Table 12: Summary of spatial and temporal deixis in the four variants under study

Language           Spatial deixis inside narrative   Tense anchoring inside narrative21
Koroshi Balochi    +                                 +
Sistani Balochi    +                                 ‒
Vafsi              +                                 +/‒
Gorani             +/‒                               +

21 Here only the verbs in the event line are considered.

Studying deixis in Persian, Roberts (2009: 233) argues for “a bias or preference for proximal deixis over distal deixis”. Persian prefers, for example, to use the proximal deictics ‘now’, ‘here’, ‘this’, ‘yesterday’, ‘today’, ‘tomorrow’, etc., both with the speech moment as the deictic center (he will come tomorrow) and with a deictic center other than the speech moment (he told me that he would come the next day (lit. ‘tomorrow’)) (Roberts 2009: 235–241). Further studies are needed to determine to what extent the preference for proximal deixis in Persian should be attributed to deictic shift in this language. Deictic shift has also been reported for Talyshi folktales by Paul (2011: 93), who shows that the deictic center sometimes “is projected onto the narrative’s chief protagonists” and sometimes “onto the central locational reference point” of a certain episode in the narrative. Barjasteh Delforooz (2010: 146) reports the same phenomenon for Sistani Balochi. He finds that the spatial deictic center in the stories he analyzes in depth is normally where the main participant of the story is located, but that the deictic center sometimes moves ahead of the major participant to the scene where an important event or the climax of the story will take place. The point is that it stays inside the story, which means that we are dealing with deictic shift. Barjasteh Delforooz concludes that in his data, proximal deixis “is much more frequent than distal deixis” and that in fact it is even more frequent in Sistani Balochi than in Persian (Barjasteh Delforooz 2010: 159).

The distal demonstratives are mainly used for establishing a contrast between two different entities, for highlighting purposes, and for reactivation. The use of a distal demonstrative pronoun to establish contrast is also noted in Kumzari by Wal Anonby (2015: 67), who finds that “a secondary participant is referenced by the pronoun ān [. . .] in place of third person singular yē, to distinguish it from a primary participant”. Other features of the stories under study also contribute to the deictic shift of the listener from the present world to the world of the narrative. One of these is the total predominance of direct speech. Now and then skillful narrators also address the audience with phrases such as “listen brothers” (Barjasteh Delforooz 2010: 303, unit 1), “sir” (Nourzaei et al. 2015: 123, unit 2), or an even more personal address such as “dear doctor”, as found in Barjasteh Delforooz (2010: 391, unit 115). It is thus clear that there are numerous devices to make the oral narration lively and exciting by bringing the distant near, thereby giving the audience a fascinating experience of journeying to “a time and a land that is not”.22

22 See www.poemhunter.com/poem/the-land-that-is-not/ for a poem with the English title The Land That Is Not, originally written in Swedish by the poet Edith Södergran.

6 Acknowledgments

Sincere thanks to Denise Bailey, Thomas Jügel, Stephen H. Levinsohn, and an anonymous reviewer for useful comments on earlier versions of this article. All remaining errors and shortcomings are, of course, my own.

References

Anderson, Stephen R. & Edward Keenan. 1985. Deixis. In Timothy Shopen (ed.), Language typology and syntactic description, vol. III: 259–308. Cambridge: Cambridge University Press.
Barjasteh Delforooz, Behrooz. 2010. Discourse features in Balochi of Sistan. Oral narratives [Studia Iranica Upsaliensia, 15]. Revised version. Uppsala: Acta Universitatis Upsaliensis. http://uu.diva-portal.org/smash/record.jsf?pid=diva2:345413 (accessed 10 December 2015).
Diessel, Holger. 1999. Demonstratives. Form, function, and grammaticalization. Amsterdam and Philadelphia: John Benjamins.
Fillmore, Charles J. 1997. Lectures on deixis. Stanford, CA: CSLI Publications.
Haig, Geoffrey L. J. 2008. Alignment change in Iranian languages. A constructional grammar approach. Berlin & New York: Mouton de Gruyter.
Jahani, Carina & Agnes Korn. 2009. Balochi. In Gernot Windfuhr (ed.), The Iranian languages, 634–692. London & New York: Routledge.
Korn, Agnes. 2005. Towards a historical grammar of Balochi [Beiträge zur Iranistik, 26]. Wiesbaden: Reichert.
Levinsohn, Stephen H. 2015. Self-instruction materials on narrative discourse analysis. SIL International. http://www-01.sil.org/~levinsohns/narr.pdf (accessed 23 November 2015).
Lyons, John. 1977. Semantics. Vol. 2. Cambridge: Cambridge University Press.
Mahmoudveysi, Parvin, Denise Bailey, Ludwig Paul & Geoffrey Haig. 2012. The Gorani language of Gawraǰū, a village of west Iran [Beiträge zur Iranistik, 35]. Wiesbaden: Reichert.
Nourzaei, Maryam, Carina Jahani, Erik Anonby & Abbas Ali Ahangar. 2015. Koroshi. A corpus-based grammatical description [Studia Iranica Upsaliensia, 13]. Uppsala: Acta Universitatis Upsaliensis. http://uu.diva-portal.org/smash/record.jsf?pid=diva2%3A810250 (accessed 10 January 2016).
Nyberg, H. S. 2004. Muntlig tradition, skriftlig fixering och författarskap. Compiled and edited by Bo Utas. Uppsala: Kungl. Humanistiska Vetenskaps-Samfundet i Uppsala.
Ong, Walter. 1982. Orality and literacy. The technologization of the word. London & New York: Methuen.
Paul, Daniel. 2011. A glance at the deixis of nominal demonstratives in Iranian Taleshi. In Agnes Korn, Geoffrey Haig, Simin Karimi & Pollet Samvelian (eds.), Topics in Iranian linguistics, 89–102 [Beiträge zur Iranistik, 34]. Wiesbaden: Reichert.
Paul, Ludwig. 2003. The position of Balochi among western Iranian languages: The verbal system. In Carina Jahani & Agnes Korn (eds.), The Baloch and their neighbours: Ethnic and linguistic contact in Balochistan in historical and modern times, 61–71. Wiesbaden: Reichert.
Roberts, John R. 2009. A study of Persian discourse structure [Studia Iranica Upsaliensia, 12]. Uppsala: Acta Universitatis Upsaliensis. http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A285628 (accessed 28 October 2015).
Segal, Erwin M. 1995. A cognitive-phenomenological theory of fictional narrative. In Judith F. Duchan, Gail A. Bruder & Lynne E. Hewitt (eds.), Deixis in narrative: A cognitive science perspective, 61–78. Hillsdale, NJ, & Hove, UK: Lawrence Erlbaum Associates.
Stilo, Donald L. 2004. Vafsi folk tales. Supplied with folklorist notes and edited by Ulrich Marzolph [Beiträge zur Iranistik, 25]. Wiesbaden: Reichert.
Utas, Bo. 2008. Manuscript, text and literature. Collected essays on Middle and New Persian texts [Beiträge zur Iranistik, 29]. Edited by Carina Jahani and Dariush Kargar. Wiesbaden: Reichert.
Wal Anonby, Christina van der. 2015. A grammar of Kumzari. A mixed Perso-Arabian language of Oman. Leiden: Leiden University dissertation.
Zubin, David A. & Lynne E. Hewitt. 1995. The deictic center: A theory of deixis in narrative. In Judith F. Duchan, Gail A. Bruder & Lynne E. Hewitt (eds.), Deixis in narrative: A cognitive science perspective, 129–155. Hillsdale, NJ, & Hove, UK: Lawrence Erlbaum Associates.

Katarzyna Marszalek-Kowalewska

17 Extracting semantic similarity from Persian texts

Abstract: The study of semantics has always attracted researchers from the fields of philosophy, linguistics, and communication theory. Recent developments in corpus linguistics and in computational linguistics have enabled an empirical study of semantics on a large scale. The goal of this article is to present procedures for extracting semantic information from a corpus of Persian newspaper texts. The analysis – modeled on Mason’s (2006) study – focuses on applying the procedures for identifying semantic information to the Persian language. First, two procedures – collocational overlap and usage patterns – are introduced. Next, a comparative study of English loanwords in Persian and their native Persian counterparts is presented. The aim of this study is to examine whether these two – the loanword and its native counterpart approved by the Academy of Persian Language and Literature – are semantically similar.

Keywords: information retrieval, semantic similarity, Persian

1 Introduction

Semantics, the study of meaning, is a core concern that is widely discussed in philosophy, linguistics, and information theory. John Firth (1957: 190), in emphasizing the importance of semantics, wrote “[t]he study of meaning is a permanent interest of scholarship”. This article deals with semantics from a linguistic perspective. Since the study of meaning involves both linguistic and nonlinguistic worlds, it has been rather neglected (not to say ignored) in linguistic analyses for a long time. Linguists have claimed, for instance, that it is not possible to analyze semantics empirically, e.g., “word meanings are not among the phenomena which can be covered by empirical, predicative scientific theories” (Sampson 2001: 206). Although some of the critics were right in their beliefs, e.g., that meaning changes and is inseparably connected with speakers, the idea that meaning should not be analyzed empirically seems rather unconvincing, especially nowadays. Recent advances in corpus and in computational linguistics, and particularly in natural language processing, make it possible not only to analyze semantics empirically, but also to conduct those analyses on a scale that was not possible before. To name just a few examples, consider: Brown et al.’s (1991) study on word sense disambiguation; Fellbaum’s (1998) work on WordNet; Magnini and Cavigliá’s (2000) proposal on integrating subject field codes into WordNet; Danielsson’s (2001) study of the automatic identification of meaningful units in language; and Girju, Badulescu, and Moldovan’s (2003) research on semantic constraints in the automatic discovery of part-whole relations.

This article presents a study of the process for identifying certain semantic information in the Persian language; more precisely, it focuses on extracting semantic similarities. It applies two procedures for extracting semantic information proposed by Mason (2006). Mason’s work focuses on the English language, but here his procedures are applied to Persian. Firstly, two introductory sections are presented. The former describes general notions about corpus and about computational linguistics, with respect to a semantic analysis. The latter section focuses on Persian corpus linguistics and its contribution to the study of meaning. Then the two procedures mentioned above – collocational overlap and usage patterns – are introduced. The methodology and corpus details are also provided, and a comparative study of English loanwords in Persian, along with their native counterparts approved by the Academy of Persian Language and Literature (Farhangestāne Zabān va Adabiyāte Fārsi), is presented. Finally, this article concludes with a discussion of the results.

Katarzyna Marszalek-Kowalewska, Adam Mickiewicz University
DOI 10.1515/9783110455793-017

2 Extracting semantic information from a corpus Since this article deals with extracting semantic information from a corpus of Persian texts, certain terms connected to corpus and computational linguistics need to be defined. A corpus is usually defined as a systematic collection of naturally occurring texts. The field of corpus linguistics therefore enables the analysis of naturally occurring texts, on the basis of an analysis often carried out with the use of specialized software. Computational linguistics deals with the statistical or rule-based modeling of natural language from a computational perspective. Finally, the area of information retrieval (IR) can be briefly described as the science of finding objects in any media relevant to a user query. The literature and the examples of corpus-based semantic information retrieval are enormous. First of all, a corpus can be annotated with semantic information. Although adding semantic metadata to a corpus is a complex task, nowadays more and more corpora provide semantic information, including: CLEF Corpus (Roberts et al. 2007), GENIA (Kim et al. 2003), PropBank (Palmer, Gildea, and

Extracting semantic similarity from Persian texts

341

Kingsbury 2005), FrameNet (Baker, Fillmore, and Lowe 1998), Penn Discourse TreeBank (Prasad et al. 2008), and OntoNotes (Hovy et al. 2006). Computational semantics is the area of computational linguistics that focuses on linguistic meaning within a computational approach to the study of a natural language. It focuses on an approach where the meaning of words, phrases, and sentences can be computed systematically from the meaning of their syntactic constituents. This subdiscipline of computational linguistics is currently gaining more and more interest. Within the Association of Computational Linguistics, there is a Special Interest Group on Computational Semantics (SIGSEM), which hosts three conferences on computational semantics: International Workshop on Computational Semantics, *SEM Joint Conference on Lexical and Computational Semantics and Inference in Computational Semantics.1 Another example demonstrating the popularity of computational semantics is FraCaS,2 the Framework for Computational Semantics, which is a European Union project that deals with convergences in computational semantics. There are also numerous semantic web projects, tools, and platforms that deal with semantic data management. To name just a few, these are: Semantic Web, DataVersity, PreDose, Semantic Bookmarking Platform, GoNTogle, the Knowledge and Information Management Platform, SemTag, OntoMat, MnM,3 etc. One of the main foci in computational semantics is word-sense disambiguation (WSD). This is a process for identifying the meanings of a word in a manner that is determined by its context. In the area of computational linguistics and natural language processing, it becomes clear that semantic disambiguation at a lexical level is necessary for even a basic understanding of language (Stevenson and Wilks 2003). 
Early attempts to deal with the problem of ambiguity were made by Wilks (1972), with his preference semantics system; Small (1980), with word expert parsing; and Hirst (1987), with Polaroid Words. In the dictionary-based approaches, researchers have made use of machine-readable dictionaries for word-sense disambiguation, e.g., Lesk (1986) and McRoy (1992). Another way of dealing with the problem has been the connectionist approach. Examples of this include: Waltz and Pollack (1985), who independently developed a model based on psycholinguistic observations; and Ide and Veronis (1980), who constructed large neural networks in order to solve the problem of language

1 For more information on the activities of SIGSEM, see http://www.sigsem.org/w/. 2 For more information, see http://www.lt-world.org/kb/players-and-teams/projects/obj_62590. 3 For more information on the semantic web and semantic web platforms, see Reeve and Hyoil (2005) Survey on Semantic Annotation Platforms, and Kiryakov et al. (2004) Semantic annotation, indexing and retrieval.


Katarzyna Marszalek-Kowalewska

ambiguity. Currently, most approaches to word-sense disambiguation rely on statistical and machine learning methods. These include Yarowsky (1992) and Schütze (1992). In the area of information retrieval, information is stored in various thesauri and ontologies. The most popular database that allows the retrieval of semantic information is WordNet (Fellbaum 1998), a large lexical database of the English language, in which nouns, verbs, adjectives, and adverbs are grouped into synsets (sets of cognitive synonyms).4 The semantic information extracted via natural language processing, as well as via corpus and computational linguistic approaches, is useful in many areas, e.g., text summarization, advertisement targeting, biomedical text mining, etc. This very general description of the methods for identifying semantic information in the fields of corpus linguistics, computational linguistics, and natural language processing aims to provide a background for the possibilities of finding semantic information in texts. For a more detailed description, please refer to The Oxford handbook of computational linguistics (Mitkov 2003), the Handbook of natural language processing (Dale, Moisl, and Somers 2000), and Word sense disambiguation (Agirre and Edmonds 2007).

3 Persian corpus linguistics and semantic information retrieval

The Persian (also known as Farsi) language belongs to the Indo-Iranian branch of the Indo-European language family, and is spoken by more than 100 million people throughout the world. It is an official language of Iran, Afghanistan, and Tajikistan. Despite the large number of speakers, Persian has, unfortunately, been among the less resourced and least analyzed languages from a computational point of view. Although work on Persian natural language processing, corpora, and computational linguistics is growing in volume and sophistication, the language is still not as fully researched as, for example, English. One reason for this may be the fact that the Persian language poses certain problems for computational and corpus linguists (Megerdoomian 2010; Seraji, Megyesi, and Nivre 2012; Shamsfard 2011; Ghayoomi, Momtazi, and Bijankhan 2010; Ghayoomi and Momtazi 2009; Farajian 2011). Some of the main challenges in Persian text processing are:

4 For more information on WordNet, see https://wordnet.princeton.edu/.


(a) Unclear word boundaries and resulting unclear phrase boundaries: One of the most problematic issues in processing texts in Farsi is the fact that word boundaries are often not clear. First of all, there is the problem of pseudospaces. A pseudospace, or zero-width non-joiner (ZWNJ, \u200c in Unicode), indicates the boundary between words or between the parts of a compound. Users often type white spaces instead of pseudospaces, which causes words such as زبان شناسی zabānshenāsi ‘linguistics’ to be processed as the separate words زبان zabān ‘language’ and شناسی shenāsi ‘knowledge’, instead of as the single word ‘linguistics’. What is more, white spaces are sometimes omitted where they are required, e.g., توازماگرفت toazmāgereft ‘YouTookFromUs’. As a result, this phrase would be processed as one lexeme instead of as four separate items. Finally, the inconsistent use of white spaces in words with detachable morphemes creates another challenge. The range of possibilities resulting from such situations can be observed in the following table:

Table 1: Word boundary ambiguity (adapted from Ghayoomi, Momtazi, and Bijankhan 2010; examples mine)

Affix | Attached | Pseudospace | White space
می | میخرید | می‌خرید | می خرید
ترین | کوچکترین | کوچک‌ترین | کوچک ترین
ها | زنها | زن‌ها | زن ها
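The word-boundary variation in Table 1 can be illustrated with a small normalization sketch. It is only a rough heuristic under stated assumptions: the function name `join_detached_affixes` and the affix lists are hypothetical, the lists contain only the three affixes from Table 1, and real text would need context-sensitive handling, since a string such as می can also occur as an independent word.

```python
import re

ZWNJ = "\u200c"  # zero-width non-joiner, the Persian "pseudospace"

# Illustrative affix lists taken from Table 1 only (an assumption, not a full inventory).
PREFIXES = ["می"]           # verbal prefix mi-
SUFFIXES = ["ها", "ترین"]   # plural -hā, superlative -tarin


def join_detached_affixes(text: str) -> str:
    """Replace the white space between a listed affix and its stem with ZWNJ."""
    for prefix in PREFIXES:
        # The prefix stands as its own token and is followed by a stem.
        text = re.sub(rf"(?<!\S){prefix}\s+(?=\S)", prefix + ZWNJ, text)
    for suffix in SUFFIXES:
        # A stem is followed by the suffix standing as its own token.
        text = re.sub(rf"(?<=\S)\s+{suffix}(?!\S)", ZWNJ + suffix, text)
    return text


# After normalization, the white-space variant becomes a single token.
tokens = join_detached_affixes("زن ها").split()
```

Mapping the white-space variant onto the pseudospace variant (rather than onto the attached one) is a design choice of this sketch; the point is only that all three spellings should end up as one token type before frequencies are counted.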

(b) Diacritics: The following diacritic marks are used in Farsi:
– zabar (fathe) – used to mark the vowel [a] after a consonant, e.g., دَر dar ‘door’;
– zir (kasre) – used to mark the vowel [e] after a consonant, e.g., گِل gel ‘mud’;
– piš (zamme) – used to mark the vowel [o] after a consonant, e.g., سُر sor ‘slide’;
– sokun – used to mark that the consonant is not followed by any vowel, e.g., گنْج ganj ‘treasure’;
– tašdid – indicates a double consonant, e.g., ردّ radd ‘dismissal’;
– tanvin – used on a final alef to mark the final [an] of certain adverbs, e.g., بعداً ba’dan ‘later’;
– hamze – used to mark the ezafe construction after the vowel [e], e.g., xāne-ye man ‘my house’.
The fact that diacritics are usually not written poses several problems. To begin with, it causes a great deal of ambiguity, particularly in the case of homographic lexemes. It also distorts frequency statistics: after tokenization, homographs are counted as one word, so the real frequency of each of them is difficult to establish.


(c) The Ezafe construction. The Ezafe marker is a short vowel added between prepositions, adjectives, and nouns in a phrase in order to mark the relation between nouns and their modifiers, e.g., کتاب من ketāb-e man ‘my book’. The Ezafe marker is always pronounced, but whether it is written is arbitrary. When it is not written, it can lead to ambiguity, which causes problems in chunking as well as in the semantic and syntactic processing of a sentence.

(d) Complex tokens. This category refers to multi-element tokens that consist of a lexeme and an attached element belonging to a different lexical category or part of speech from the lexeme it is attached to. The range of possibilities resulting from such a situation can be observed in the following table:

Table 2: Complex tokens (adapted from Ghayoomi, Momtazi, and Bijankhan 2010)

Word | Type | White space | Pseudospace | Attached
به | Preposition | به شیوه | به‌شیوه | بشیوه
هم | Prefix | هم کلاس | هم‌کلاس | همکلاس
این | Determiner | این مرد | این‌مرد | اینمرد
آن | Determiner | آن قدر | آن‌قدر | آنقدر
را | Postposition | شرایط را | شرایط‌را | شرایطرا
که | Relativizer | چنان که | چنان‌که | چنانکه

(e) Encoding. Since Farsi is written in the Arabic script, some online materials mix Arabic and Persian character codes, i.e., alongside the Unicode characters for Farsi, Arabic or ASCII characters tend to be used as well. To be more precise, the letters ک and ی can be encoded as Persian characters (\u06a9 for ک and \u06cc for ی) or as Arabic characters (\u0643 for ك and \u064a or \u0649 for ي).

Despite the problems presented here, linguists dealing with the Persian language have taken steps to provide corpora, as well as tools for analyzing the language. There is a great variety of Persian corpora. In the area of monolingual corpora, the following examples should be cited: The Persian Linguistic Database,5 prepared under the supervision of Professor Seyyed Mostafa Assi at the Institute for Humanities and Cultural Studies; the Hamshahri Collection,6 with data from Hamshahri, the first online newspaper in Iran; The Peykareh Text Corpus (Bijankhan et al. 2011), designed and developed by Bijankhan; and finally the Bijankhan Corpus,7 the first linguistically annotated Persian corpus.

5 For more information, see http://pldb.ihcs.ac.ir/.
6 For more information, see http://ece.ut.ac.ir/dbrg/hamshahri/.
7 For more information, see http://ece.ut.ac.ir/dbrg/bijankhan/.
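The encoding problem described in (e) is commonly handled by mapping the Arabic code points onto their Persian counterparts before any counting is done. The following is a minimal sketch of that idea; the function name `normalize_persian` is hypothetical, and treating ALEF MAKSURA (\u0649) as a variant of Persian ی is an assumption that suits Persian text but would be wrong for Arabic text.

```python
from collections import Counter

# Arabic code points that commonly appear in Persian text, mapped to Persian ones.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u0643": "\u06a9",  # ARABIC LETTER KAF          -> ARABIC LETTER KEHEH (Persian kāf)
    "\u064a": "\u06cc",  # ARABIC LETTER YEH          -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06cc",  # ARABIC LETTER ALEF MAKSURA -> ARABIC LETTER FARSI YEH (assumption)
})


def normalize_persian(text: str) -> str:
    """Return text with Arabic kāf/ye variants replaced by the Persian code points."""
    return text.translate(ARABIC_TO_PERSIAN)


# The same word typed once with Arabic and once with Persian code points:
arabic_variant = "\u0643\u062a\u0627\u0628"   # ketāb with Arabic kāf
persian_variant = "\u06a9\u062a\u0627\u0628"  # ketāb with Persian kāf

raw_counts = Counter([arabic_variant, persian_variant])
normalized_counts = Counter(normalize_persian(w) for w in [arabic_variant, persian_variant])
```

Without normalization the two spellings are counted as two word types with frequency 1 each; after normalization they collapse into one type with frequency 2, which is what a frequency-based analysis needs.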


There are also a few examples of multilingual corpora. The Comparative Persian-English corpus (Hashemi et al. 2010), compiled by the Intelligent Systems Research Laboratory of Tehran University, consists of two news sources: Persian news from the Hamshahri news agency and English news from the BBC news agency. Other examples are the English-Persian Parallel Corpus (Farajian 2011), which belongs to the ELRA project, and the Persian 1984 corpus,8 which consists of a translation of George Orwell’s novel 1984 and belongs to the MULTEXT-East parallel corpus. The Shiraz corpus is a bilingual parallel-tagged corpus developed from a large Persian corpus of online material. There are also syntactically annotated corpora such as The Persian Treebank9 (PerTreeBank), The Persian Dependency Treebank10 (PerDT), and The Uppsala Persian Dependency Treebank.11 A few corpora have also been compiled and used for speech recognition, such as the Farsi Speech Database12 (FARSDAT), which was built in 1996 by the Research Centre of Intelligent Signal Processing (Ghayoomi, Momtazi, and Bijankhan 2010). Another speech corpus, the large Persian Speech Database, contains over 1,000 hours of speech recorded by one hundred native Persian speakers representing ten different dialects. There are also a few corpora of telephone speech. In particular, the Linguistic Data Consortium offers two telephone speech corpora: the OGI Multilingual Corpus, containing 175 calls, and CALLFRIEND, containing 109 calls. Another corpus, the Persian Telephone Database, comprises one hundred hours of dialogue between 200 native speakers of Persian (Ghayoomi, Momtazi, and Bijankhan 2010). In order to facilitate studies of the Persian language, certain computational tools have also been developed, including: Persian LG (Dehdari and Lonsdale 2008), PerStem (Jadidinejad, Mahmoudi, and Dehdari 2010), PersPred (Samvelian and Faghiri 2013), PerLex (Sagot et al. 2011), SteP-1 (Shamsfard, Jafari, and Ilbeygi 2010), etc. There have also been developments in the semantic resources available in Persian Natural Language Processing research. One such work is undoubtedly FarsNet,13 a lexical ontology for the Persian language. FarsNet aims to provide lexical, syntactic, and semantic information on words and phrases organized in sets of cognitive relations. It consists of the Persian WordNet and Persian Net of

8 For more information, see http://catalog.elra.info/product_info.php?products_id=1124.
9 For more information, see https://hpsg.fu-berlin.de/~ghayoomi/PTB.html.
10 For more information, see http://dadegan.ir/en/perdt.
11 For more information, see http://stp.lingfil.uu.se/~mojgan/UPDT.html.
12 For more information, see http://catalog.elra.info/product_info.php?products_id=18.
13 For more information, see http://dadegan.ir/catalog/farsnet.


Verb (PeVNet) frames and includes two types of connections: inner- and interlanguage relations. The former consists of the relations between different senses and synsets, e.g., synonymy, hyperonymy, hyponymy, meronymy, cause, and antonymy. The latter consists of equal-to and near-equal-to relations. The following lexical resources were used to construct FarsNet: two monolingual dictionaries (the Sokhan Dictionary and the Sadri Afshar Dictionary), two corpora (the Persian Linguistic Database and Peykareh), bilingual dictionaries, the Persian Thesaurus, and a dictionary of Persian synonyms and antonyms (Shamsfard, Fadaei, and Fekri 2010). Numerous research studies have been conducted on the FarsNet framework, including: Dehkharghani and Shamsfard (2009) on the mapping of Persian words to the WordNet synsets; Shamsfard et al. (2010) on extracting lexico-conceptual knowledge for the development of the Persian WordNet; Rouhizadeh, Yarmohammadi, and Shamsfard (2008) on WordNet for Persian verbs; Taheri and Shamsfard (2011) on mapping FarsNet to suggest an upper merged ontology, etc. There is also a growing body of published work on word-sense disambiguation in the Persian language. As examples, consider: Soltani (2010) on statistical word-sense disambiguation in Persian; Hamidi, Borji, and Ghidary (2007) on Persian word-sense disambiguation; Makki and Homayounpour (2008) on the word-sense disambiguation of Persian homographs; Miangah and Khalafi (2005) on the usage of a learner corpus in a machine translation system; and Makki and Homayounpour (2008) on using a decision list for Farsi word-sense disambiguation.

4 The analysis of semantic similarity: A corpus-based study

Similarity is a complex concept widely discussed in philosophy, linguistics, and information theory. So far, many different measures of semantic similarity between word pairs have been proposed. Some use lexical databases, some statistical or distributional techniques, and some a hybrid approach, which combines both statistical and lexical techniques. This section presents two methods of measuring semantic similarity proposed by Mason (2006): “collocational overlap” and “usage patterns”. Then a case study on semantic similarity between English loanwords in Persian and their native Persian equivalents will be presented. Before describing the procedure and the case study, however, we first describe the data that were analyzed.


In order to identify semantic information, a new corpus of modern Persian newspaper data was compiled. The compilation process consisted of collecting news articles from popular Iranian online newspapers (MehrNews and Hamshahri), which were preprocessed (e.g., by removing links, dates, etc.) and prepared for linguistic analysis. The corpus consists of 10 million tokens and has the following features:
– specialized: it contains data from one source: newspaper texts
– synchronic: it contains data from 2010
– written: it contains written texts
– monolingual: the collected data concern the Persian language
– full text: it contains full samples
– static: it is meant to represent the language at a particular point in time
To carry out the analysis, the WordSmith14 6.0 tool was used. It is a comprehensive program for corpus analysis developed by Mike Scott. It allows the extraction of word lists, keyword lists, and collocations, as well as the analysis of concordances, and it provides numerous cluster extraction functions and some statistical measures.

4.1 Collocational overlap

Collocation is one of the core concepts in linguistics and can be described with Firth’s (1957: 179) words as the “company a word keeps”. To be more precise, a collocation is a sequence of words that tend to co-occur with one another more often than would be expected by chance. Collocational overlap is based on the idea that “collocations are the words which occur within the context of a node word more significantly than expected by chance. Therefore collocates of a word describe its context in a condensed way, and through analysing the distribution of collocates across the environment of a number of words we can assess their semantic similarity” (Mason 2006: 228). The whole procedure starts with a choice of words15 to be investigated, i.e., the node word and the candidate words hypothesized to be similar to it. Then the collocates of the node and of the candidates are examined and the overlap of collocates is analyzed. This overlap was determined by the following formula:

$$ d(i, j) = 1 - \frac{2\,|c_i \cap c_j|}{|c_i| + |c_j|} $$

14 For more information, see http://www.lexically.net/wordsmith/. 15 It would also be possible to conduct the analysis without preidentified words using natural language processing techniques.


i.e., the distance between two words i and j is obtained by dividing twice the number of shared collocates by the sum of the sizes of the two words’ collocate sets and subtracting the result from 1. This yields a distance value, with 1.0 being the maximum distance and 0.0 being identity. On the basis of data analysis (Mason 2006: 229), it was concluded that two words are classified as semantically similar when the overlap distance between them is less than 0.8; if the overlap distance is 0.8 or more, they are not classified as semantically similar. As an example, consider رایانه rāyāne ‘computer’ and its collocational overlap with the following words:

Table 3: Collocational overlap

Distance between رایانه and | Distance | Qualified as similar (YES if < 0.8, otherwise NO)
کامپیوتر ‘computer’ (English loanword) | 0.5 | YES
ماشین ‘machine’ | 0.36 | YES
فنی ‘technological’ | 0.79 | YES
اسباب ‘device’ | 0.87 | NO
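The overlap distance and the 0.8 threshold described above can be sketched in a few lines. The function names and the toy collocate sets below are hypothetical and serve only as an illustration; the sketch assumes that the collocates of each word have already been extracted.

```python
def overlap_distance(collocates_i, collocates_j):
    """d(i, j) = 1 - 2 * |ci & cj| / (|ci| + |cj|) for two collections of collocates."""
    ci, cj = set(collocates_i), set(collocates_j)
    if not ci and not cj:
        raise ValueError("at least one word must have collocates")
    return 1 - 2 * len(ci & cj) / (len(ci) + len(cj))


def is_similar(distance, threshold=0.8):
    """Words count as semantically similar when the distance is below the threshold."""
    return distance < threshold


# Toy data (hypothetical English glosses, not taken from the corpus):
node_collocates = {"software", "network", "screen", "virus"}
candidate_collocates = {"screen", "virus", "keyboard", "mouse"}

d = overlap_distance(node_collocates, candidate_collocates)  # 1 - 2*2/(4+4)
```

Two identical collocate sets give a distance of 0.0 and two disjoint sets give 1.0, matching the interpretation of the scale given in the text.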

Words that qualify as similar are then collected in one set. Another set is made from all the collocates of these candidates. Then, for each candidate, the co-occurrence frequencies of each word from the collocate set are counted. The counts (as well as n) are entered into a contingency table that is then analyzed using Correspondence Analysis (for more details, see Mason [2006: 230]). One very important aspect that needs to be remembered here is that the result will depend on the corpus used in the analysis, e.g., whether it is a reference corpus (designed to provide a comprehensive reflection of a language) or a specialized one (compiled to reflect a very specific part of a language). What is more, the test used for extracting collocates also leads to different results: for example, the Mutual Information test, which was used here, measures the strength of association between words, whereas the t-score measures the confidence with which we can assert that there is an association.
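The difference between the two association measures mentioned above can be made concrete with their common textbook definitions. This is a sketch under assumptions: the formulas below are the simple versions without a span correction, so they are not necessarily identical to what WordSmith computes, and all frequencies are invented for illustration.

```python
import math


def expected_cooccurrence(freq_node, freq_collocate, corpus_size):
    """Co-occurrence frequency expected if the two words were independent."""
    return freq_node * freq_collocate / corpus_size


def mutual_information(observed, freq_node, freq_collocate, corpus_size):
    """MI = log2(observed / expected): strength of the association."""
    expected = expected_cooccurrence(freq_node, freq_collocate, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, freq_node, freq_collocate, corpus_size):
    """t = (observed - expected) / sqrt(observed): confidence in the association."""
    expected = expected_cooccurrence(freq_node, freq_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


N = 1_000_000  # hypothetical corpus size in tokens

# A frequent pair: 20 co-occurrences of words with frequencies 100 and 200.
mi_frequent = mutual_information(20, 100, 200, N)
t_frequent = t_score(20, 100, 200, N)

# A rare pair: two words that each occur twice, always together.
mi_rare = mutual_information(2, 2, 2, N)
t_rare = t_score(2, 2, 2, N)
```

With these invented numbers the rare pair receives the higher MI value but the lower t-score, which illustrates why the choice of test changes the list of collocates and, in turn, the overlap distance.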

4.2 Usage patterns

This method is based on the assumption that similar entities have more properties in common. The starting point here is a ready set of triples, the output of grammatical usage pattern extraction, each consisting of two words and the relation between


them. For each of these words there is a set of significant arguments, i.e., words that have similar sets of partners in one or more relationships. These arguments define the shared environment of the investigated words. More importantly, they provide a link to other words that themselves have one or more of those arguments and that are therefore classified as semantically related candidates (Mason 2006). The set of arguments of each candidate word is prepared and then compared with the set of arguments of the target word. We may illustrate this with the example of the NN relation of کامپیوتر kāmpyuter ‘computer’ (English loanword). The results are آزمایش āzmāyesh ‘test’ and خواص khavās ‘properties’, based on the argument مدل model ‘model’. Another set of substitutes would be اپل apl ‘apple’ and اطلاعات etelāat ‘information’, based on the argument سیستم sistem ‘system’. Grammatical relations provide information on lexical co-occurrence within syntactic relations. The set of words with which a word co-occurs within syntactic relations strongly constrains its semantic properties.
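The lookup of related candidates through shared arguments can be sketched as follows. The triples use English glosses of the example above plus one invented distractor; the function name `related_candidates` is hypothetical, and the sketch leaves out the significance filtering of arguments that Mason's method applies.

```python
from collections import defaultdict

# (word, relation, argument) triples; glosses of the example in the text,
# plus one unrelated triple added for contrast (an assumption).
TRIPLES = [
    ("computer", "NN", "model"),
    ("test", "NN", "model"),
    ("properties", "NN", "model"),
    ("computer", "NN", "system"),
    ("apple", "NN", "system"),
    ("information", "NN", "system"),
    ("weather", "NN", "forecast"),
]


def related_candidates(triples, target):
    """Map each candidate word to the (relation, argument) pairs it shares with target."""
    words_by_argument = defaultdict(set)
    for word, relation, argument in triples:
        words_by_argument[(relation, argument)].add(word)

    candidates = defaultdict(set)
    for key, words in words_by_argument.items():
        if target in words:
            for word in words - {target}:
                candidates[word].add(key)
    return dict(candidates)


shared = related_candidates(TRIPLES, "computer")
```

The more (relation, argument) pairs a candidate shares with the target word, the larger the shared environment, so the size of each value set can serve as a simple ranking criterion.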

4.3 Comparative analysis of English loanwords and their Persian counterparts

The application of measuring semantic similarity will be presented as a comparative study between English loanwords and their Persian equivalents. Thus, the study focuses on English loanwords present in Persian and their Persian counterparts approved by the Academy of Language and Literature. The reason

Figure 1: A sample of A Collection of Terms Approved by the Academy of Persian Language and Literature.


behind such an analysis was the Iranian language policy toward loanwords, which can be characterized by linguistic purism. The beginnings of Iranian language policy can be traced back to the tenth century and the times of Abu Ali Ibn Sina. However, since this article deals with the modern Persian language, we will describe briefly only the most recent period of the purist movement (for more on Iranian language policy, see Marszałek-Kowalewska [2010]). In 1991, the Third Supreme Council of the Iranian Revolution established the Farhangestāne Zabān va Adabiyāte Fārsi (Persian Academy of Language and Literature). The academy members, language experts, and professors are responsible for studying grammar, orthography, manuscripts, and various Iranian dialects. The policy of the Third Academy is as follows:
– In coining and choosing a new word, Persian phonetic rules and learned speakers’ way of talking and Islamic points of views should be regarded as the main criterion.
– Phonetic rules should be followed according to the Persian way of talking.
– New words should follow the Persian grammatical rules for coining nouns, adjectives, verbs, and so on.
– New words should be chosen or coined out of the most common or frequent words that have been used since AD 250.
– New words can be chosen from among the most frequent and common Arabic words as used in Persian.
– New words can be chosen from the Middle and Old Persian stages of the language.
– There should be only one equivalent in Persian for any of the Latin words, particularly technical ones.
– It is not necessary to adapt or create new Persian words for those Latin words that have been used internationally and globally (Farhangestan-e Zaban [2001] as quoted in Monajemi 2011: 5).
Therefore, the decision was taken to compare English loanwords and their Farsi counterparts. The following section presents the research procedure using the example of COMPUTER.
Both an English loanword ‫ ﮐﺎﻣﭙﯿﻮﺗﺮ‬and its Persian equivalent ‫ ﺭﺍﯾﺎﻧﻪ‬are used by native speakers. Firstly, let’s consider the raw frequency in the corpus. As can be seen, the native word is much more frequent; it appears 799 times in the corpus while the English loanword appears only eighty times.

Table 4: Raw frequency of loanwords in a corpus

 | English loanword | Persian counterpart
Lexeme | کامپیوتر | رایانه
Frequency | 80 | 799

The first part of the analysis consists of comparing n-grams. An n-gram is a sequence of words that appear consecutively in a text. It can consist of two words (bi-gram), three words (tri-gram), four words (4-gram), etc. N-grams are easy to define and extract, and they provide useful reflections of lexical, semantic, and syntactic relations. The following tables present the n-gram frequency, the loanword n-grams, and the native word n-grams, respectively. Only sequences of words that appeared at least five times in the corpus were counted as n-grams. There were eighteen loanword n-grams and fifty-eight native n-grams:

Table 5: N-gram frequency

English loanword n-gram number (minimal n-gram frequency=5) | Farsi counterpart n-gram number (minimal n-gram frequency=5)
18 | 58
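The extraction step described above, i.e., collecting all word sequences that contain the node word and occur at least five times, can be sketched as follows. The function name, the maximum length of five words, and the toy token list are assumptions made for illustration; the study itself used WordSmith for this step.

```python
from collections import Counter


def ngrams_with_node(tokens, node, max_n=5, min_freq=5):
    """Count all n-grams (2 <= n <= max_n) containing node; keep those with freq >= min_freq."""
    counts = Counter()
    for n in range(2, max_n + 1):
        for start in range(len(tokens) - n + 1):
            gram = tuple(tokens[start:start + n])
            if node in gram:
                counts[gram] += 1
    return {gram: freq for gram, freq in counts.items() if freq >= min_freq}


# Toy corpus (English placeholder tokens): the same five-token sentence repeated five times.
toy_tokens = ("the computer is fast . " * 5).split()

bigrams = ngrams_with_node(toy_tokens, "computer", max_n=2)
trigrams_and_bigrams = ngrams_with_node(toy_tokens, "computer", max_n=3)
```

In the toy corpus the trigram that spans a sentence boundary occurs only four times and is therefore dropped by the frequency threshold, while the two bigrams around the node word are kept.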

Table 6: Loanword n-grams

N | Translation | N-gram | Freq. | Length
1 | computer and | کامپیوتر و | 31 | 2
2 | and computer | و کامپیوتر | 17 | 2
3 | computer in | کامپیوتر در | 16 | 2
4 | electronic and computer | الکترونیک و کامپیوتر | 8 | 3
5 | computer engineering | مهندسی کامپیوتر | 8 | 2
6 | computer + direct object | کامپیوتر را | 7 | 2
7 | computer sciences | علوم کامپیوتر | 7 | 2
8 | use of the computer | استفاده از کامپیوتر | 7 | 3
9 | one/a computer | یک کامپیوتر | 7 | 2
10 | Internet and computer | اینترنت و کامپیوتر | 6 | 3
11 | computer electronics | الکترونیک کامپیوتر | 6 | 2
12 | from computer | از کامپیوتر | 6 | 2
13 | personal computer | کامپیوتر شخصی | 6 | 2
14 | computer dimension | ابعاد کامپیوتر | 6 | 2
15 | computer protecting products | محصولات حفاظت کننده کامپیوتر | 5 | 4
16 | computer protection | حفاظت کننده از کامپیوتر | 5 | 4
17 | computer electronics and | الکترونیک کامپیوتر و | 5 | 3
18 | with computer | با کامپیوتر | 5 | 2

Table 7: Persian word n-grams

N | Translation | N-gram | Freq. | Length
1 | a computer | رایانه ای | 291 | 2
2 | some computers | رایانه های | 109 | 2
3 | computers | رایانه ها | 65 | 2
4 | cloud computing | ابر رایانه | 60 | 2
5 | computer games | بازیهای رایانه | 51 | 2
6 | this computer | این رایانه | 49 | 2
7 | from computer | از رایانه | 31 | 2
8 | computer crime | جرایم رایانه | 28 | 2
9 | to computer | به رایانه | 26 | 2
10 | use of the computer | استفاده از رایانه | 24 | 3
11 | corporate computer | صنفی رایانه | 22 | 2
12 | tablet PC | لوح رایانه | 22 | 2
13 | computer and | رایانه و | 22 | 2
14 | association of computing | نظام صنفی رایانه | 22 | 3
15 | and computer | و رایانه | 21 | 2
16 | computer trade organization | سازمان نظام صنفی رایانه | 20 | 4
17 | in computer | در رایانه | 20 | 2
18 | computer + direct object | رایانه را | 19 | 3
19 | fighting with computer crime | مبارزه با جرایم رایانه | 18 | 4
20 | with computer crime | با جرایم رایانه | 16 | 3
21 | from computers | از رایانه های | 15 | 2
22 | national computer | رایانه ملی | 15 | 2
23 | national foundation of computer games | بنیاد ملی بازیهای رایانه | 14 | 4
24 | in computer | روی رایانه | 14 | 2
25 | national computer games | ملی بازیهای رایانه | 14 | 3
26 | national supercomputer | ابر رایانه ملی | 13 | 3
27 | computer game | بازی رایانه | 13 | 2
28 | personal computers | رایانه های شخصی | 13 | 2
29 | PCs | رایانه های خانگی | 12 | 2
30 | first computer | اولین رایانه | 12 | 2
31 | computer is | رایانه ای است | 11 | 2
32 | world computer | رایانه جهان | 11 | 2
33 | with computer | با رایانه | 11 | 2
34 | computer virus | ویروس رایانه | 10 | 2
35 | on computer | بر روی رایانه | 10 | 3
36 | thousands of computers | هزار رایانه | 8 | 2
37 | special computer | ویژه رایانه | 8 | 2
38 | computer that | رایانه که | 8 | 2
39 | personal computer | رایانه شخصی | 8 | 2
40 | foreign computer | رایانه ای خارجی | 8 | 2
41 | production of national supercomputer | تولید ابر رایانه ملی | 8 | 4
42 | infected computers | رایانه های آلوده | 7 | 2
43 | computer industry | صنعت رایانه | 7 | 2
44 | portable computers | رایانه های قابل | 7 | 2
45 | world computers | رایانه های دنیا | 7 | 2
46 | first supercomputer | اولین ابر رایانه | 6 | 3
47 | computer bug/worm | کرم رایانه | 6 | 2
48 | computer system | سیستم رایانه | 6 | 2
49 | PC | طریق رایانه | 6 | 2
50 | home computer | رایانه خانگی | 6 | 2
51 | company computer | رایانه شرکت | 6 | 2
52 | father of computer | پدر رایانه | 5 | 2
53 | telecom computing | رایانه شرکت مخابرات | 5 | 3
54 | oldest computer | قدیمی ترین رایانه | 5 | 3
55 | computer information | اطلاعات رایانه | 5 | 2
56 | computer fraud and scums | رایانه ای جعل و کلاهبرداری | 5 | 5
57 | computer crime hoax | با جرایم رایانه ای جعل | 5 | 4
58 | computer for | رایانه ای برای | 5 | 2

There are eight common n-grams (italicized in the original tables) between the English loanword and its Persian native counterpart, and it is the Persian word that has more n-grams overall. Yet, when looking at Figure 2, which presents the relative n-gram frequency, the difference is not that big, and in fact it is the English loanword that has more n-grams relative to its frequency in the corpus:

Figure 2: Relative n-grams frequency with respect to common n-grams


In order to establish the semantic similarity between the loanword and its Farsi counterpart based on their n-grams, the following formula (collocational overlap measure) was used:

$$ d(i, j) = 1 - \frac{2\,|c_i \cap c_j|}{|c_i| + |c_j|} $$

N-gram overlap = 0.79
Semantic similarity: YES

With eight common n-grams, eighteen loanword n-grams, and fifty-eight native n-grams, the overlap distance is 1 − (2 × 8)/(18 + 58) ≈ 0.79. Given that semantically similar lexemes have an overlap distance of less than 0.8, the two lexemes in question can be classified as similar in terms of n-grams. To answer the question whether the difference between the numbers of n-grams is statistically significant, the chi-square test was applied. The chi-square value is calculated from the formula:

$$ \chi^2(df, N) = \sum \frac{(f_o - f_e)^2}{f_e} $$

In this example χ²(1, 879) = 21.37, p < 0.01, which indicates that there is a significant difference between the number of n-grams of the loanword and that of its native counterpart. Another level of analysis consists of studying the phrasemes, or multiword expressions, of both lexemes. A phraseme can be described here very generally as a conventional, repeated multiword unit.16 Tables 8 and 9 present loanword and Persian phrasemes respectively:

Table 8: Phrasemes with loanword

N | Translation | Phraseme
1 | computer engineering | مهندسی کامپیوتر
2 | computer sciences | علوم کامپیوتر
3 | personal computer | کامپیوتر شخصی
4 | computer dimension | ابعاد کامپیوتر
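The chi-square statistic used for the n-gram counts can be computed directly from observed and expected frequencies. The sketch below is an illustration under a loudly stated assumption: the chapter does not say how the contingency table was laid out, so the 2 × 2 table here (number of n-grams versus remaining occurrences, per lexeme, with N = 80 + 799 = 879) is only one plausible reading. The function name `pearson_chi_square` is hypothetical.

```python
def pearson_chi_square(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # f_e from the marginals
            chi2 += (observed - expected) ** 2 / expected  # (f_o - f_e)^2 / f_e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# ASSUMED layout: rows = lexeme (loanword, native word),
# columns = (number of n-grams, remaining occurrences of the lexeme).
ngram_table = [
    [18, 80 - 18],
    [58, 799 - 58],
]

chi2, df = pearson_chi_square(ngram_table)

CRITICAL_VALUE_P01_DF1 = 6.635  # chi-square critical value for p = 0.01 with df = 1
significant = chi2 > CRITICAL_VALUE_P01_DF1
```

Under this assumed layout the statistic comes out close to, though not identical with, the reported value of 21.37, and it lies well above the critical value for p = 0.01, so the conclusion of a significant difference is the same.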

16 For more on phrasemes, see, e.g., Dobrovol’skij & Piirainen (2005).


Table 9: Phrasemes with Farsi word

N | Translation | Phraseme
1 | supercomputer | ابر رایانه
2 | cloud computing | رایانه ابری
3 | computer crime | جرایم رایانه
4 | corporate computer | صنفی رایانه
5 | tablet PC | لوح رایانه
6 | computer game | بازی رایانه
7 | PCs | رایانه های خانگی
8 | computer virus | ویروس رایانه
9 | personal computer | رایانه شخصی
10 | infected computers | رایانه های آلوده
11 | computer industry | صنعت رایانه
12 | portable computers | رایانه های قابل
13 | computer bug/worm | کرم رایانه
14 | father of computer | پدر رایانه

In order to establish the semantic similarity between the loanword and its Farsi counterpart, this time based on their phrasemes, the formula of the collocational overlap was used:

$$ d(i, j) = 1 - \frac{2\,|c_i \cap c_j|}{|c_i| + |c_j|} $$

Phraseme overlap = 0.89
Semantic similarity: NO

With only one common phraseme, four loanword phrasemes, and fourteen native phrasemes, the overlap distance is 1 − (2 × 1)/(4 + 14) ≈ 0.89. Given that semantically similar lexemes have an overlap distance of less than 0.8, the two lexemes in question cannot be classified as similar in terms of phrasemes. Again, the chi-square value was calculated to answer the question whether the difference between the numbers of phrasemes was statistically significant. The chi-square value is calculated from the following formula:

$$ \chi^2(df, N) = \sum \frac{(f_o - f_e)^2}{f_e} $$

In this example χ²(1, 879) = 3.62, p > 0.05, which indicates that there is no statistically significant difference between loanword and Persian phrasemes, since a chi-square value whose probability is more than 0.05 (i.e., more than 5 percent) is not considered statistically significant. The above example shows that in terms of n-grams, this particular loanword and its native Persian counterpart can be classified as similar. Yet, the analysis of their phrasemes gives a contradictory result. Let’s now consider other words that hypothetically could be considered as similar:

Table 10: Semantic similarity based on phrasemes

Lexeme translation | Synonymic lexemes | Collocational overlap | Classified as semantically similar
strategy | استراتژی / راهبرد | 1 | NO
system | سیستم / نظام | 1 | NO
candidate | نامزد / کاندیدا | 0.8 | NO
model | الگو / مدل | 1 | NO
touristic | توریستی / گردشگر | 0.82 | NO
symbol | سمبل / نماد | 1 | NO
computer | رایانه / کامپیوتر | 0.89 | NO
game | بازی / گیم | 1 | NO
digital | رقمی / دیجیتال | 1 | NO

Tables 10 and 11 present the results of analyzing the semantic similarity of hypothetically similar words. As can be seen, none of the pairs can be classified as semantically similar in terms of phrasemes, and only five can be classified as similar in terms of n-grams. Therefore, the study reveals that words that at first glance could be treated as similar cannot be treated as such, at least not in terms of the phrasemes they form. Of course, it needs to be remembered that corpus results always depend on the type of corpus used. In this case, the results are confined to a corpus whose data represent the newspaper domain.


Table 11: Semantic similarity based on n-grams Lexeme translation

Synonymic lexemes

Collocational overlap

Classified as semantically similar

strategy

‫ﺍﺳﱰﺍﺗﮋﯼ‬ ‫ﺭﺍﻫﱪﺩ‬ ‫ﺳﯿﺴﺘﻢ‬ ‫ﻧﻈﺎﻡ‬ ‫ﻧﺎﻣﺰﺩ‬ ‫ﮐﺎﻧﺪﯾﺪﺍ‬ ‫ﺍﻟﮕﻮ‬ ‫ﻣﺪﻝ‬ ‫ﺗﻮﺭﯾﺴﺘﯽ‬ ‫ﮔﺮﺩﺷﮕﺮ‬ ‫ﺳﻤﺒﻞ‬ ‫ﻧﻤﺎﺩ‬ ‫ﺭﺍﯾﺎﻧﻪ‬ ‫ﮐﺎﻣﭙﯿﻮﺗﺮ‬ ‫ﺑﺎﺯﯼ‬ ‫ﮔﯿﻢ‬ ‫ﺭﻗﻤﯽ‬ ‫ﺩﯾﺠﯿﺘﺎﻝ‬

0.83

NO

0.77

YES

0.64

YES

0.86

NO

0.68

YES

0.76

YES

0.79

YES

0.98

NO

0.82

NO

system candidate model touristic symbol computer game digital
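The YES/NO decisions in Tables 10 and 11 can be reproduced from the overlap values with a simple threshold rule. The sketch below is a hedged illustration, not the author's procedure: the names `classify`, `TABLE_10` and `TABLE_11` are hypothetical, and the strict comparison (overlap below 0.8) is an assumption inferred from the tables, where the candidate pair with an overlap of 0.8 is marked NO. The text itself says "at most 0.8", so the tabulated 0.8 may simply be a rounded value.

```python
# Assumed cut-off: the text gives 0.8; the strict "<" is inferred from Table 10.
THRESHOLD = 0.8


def classify(overlap, threshold=THRESHOLD):
    """Return 'YES' if the overlap value falls below the threshold, else 'NO'."""
    return "YES" if overlap < threshold else "NO"


# Overlap values copied from Table 10 (phrasemes) and Table 11 (n-grams).
TABLE_10 = {
    "strategy": 1.0, "system": 1.0, "candidate": 0.8, "model": 1.0,
    "touristic": 0.82, "symbol": 1.0, "computer": 0.89, "game": 1.0,
    "digital": 1.0,
}
TABLE_11 = {
    "strategy": 0.83, "system": 0.77, "candidate": 0.64, "model": 0.86,
    "touristic": 0.68, "symbol": 0.76, "computer": 0.79, "game": 0.98,
    "digital": 0.82,
}

for name, table in (("phrasemes", TABLE_10), ("n-grams", TABLE_11)):
    similar = [lexeme for lexeme, value in table.items() if classify(value) == "YES"]
    print(f"{name}: {len(similar)} similar pair(s): {similar}")
```

Under this assumption the rule marks no pair as similar for phrasemes and five pairs as similar for n-grams, in line with the two tables.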

5 Conclusion

Dr. Samuel Johnson, when emphasizing the importance of semantics in the preface to his pioneering dictionary in 1755, wrote: “It is not sufficient that a word is found, unless it be so combined as that its meaning is apparently determined by the track and tenor of the sentence”. The core of studying meaning is therefore to analyze words in context.

In this article, certain procedures for extracting semantic information from texts have been discussed and illustrated with examples from Persian. The loanword-native equivalent analysis aimed at outlining semantic differences and similarities between the two with respect to their n-grams and phrasemes. The analysis revealed that words that at first glance seem to be similar cannot necessarily be treated as such in terms of their collocates or phrasemes. Consider, for example, the computer case. The English loanword and the native Persian word appear to favor different collocates, e.g., مهندسی کامپیوتر mohandesi kāmpiuter ‘computer engineering’ or علوم کامپیوتر elme kāmpiuter ‘computer sciences’ in the case of the loanword and رایانه ابری rāyāne abri ‘cloud computing’ or جرایم رایانه jarāme rāyāne ‘computer crime’ in the case of the native word. The fact that hypothetically similar words tend to form different phrasemes, and that these phrasemes can be identified in a corpus, is particularly interesting. This finding seems to be of great importance to translators and students of Persian, who often struggle with the lexical richness of the Persian language. Although this article covers only a small part of the extensive study of meaning, it provides an empirical analysis and thereby attempts to contribute to the acquisition of meaning – “the Holy Grail of lexical acquisition” (Manning and Schütze 2000: 294).

References

Agirre, Eneko & Philip Edmonds. 2007. Word sense disambiguation: Algorithms and applications. Dordrecht: Springer.
Baker, Collin F., Charles J. Fillmore & John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, 86‒90. Montreal: Association for Computational Linguistics.
Bijankhan, Mahmood, Sheykhzadegan Javad, Bahrani Mohammad & Masood Ghayoomi. 2011. Lessons from building a Persian written corpus: Peykare. In Language Resources and Evaluation, 45 (2). 143‒164. Secaucus: Springer.
Brown, Peter, Stephen Della Pietra, Vincent Della Pietra & Robert Mercer. 1991. Word sense disambiguation using statistical methods. In Proceedings of the 29th annual meeting of the Association for Computational Linguistics, 264‒270. Stroudsburg: Association for Computational Linguistics.
Dale, Robert, Hermann Moisl & Harold Somers. 2000. Handbook of natural language processing. New York: Marcel Dekker.
Danielsson, Pernilla. 2001. The automatic identification of meaningful units in language. Göteborg: Göteborg University dissertation.
Dehdari, Jon & Deryle Lonsdale. 2008. A link grammar parser for Persian. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian linguistics, vol. 1: 19–34. Cambridge: Cambridge Scholars Press.
Dehkharghani, Rahim & Mehrnoush Shamsfard. 2009. Mapping Persian words to WordNet Synsets. The International Journal of Interactive Multimedia and Artificial Intelligence 1 (2). 6‒13.
Dobrovol’skij, Dmitri O. & Elisabeth Piirainen. 2005. Figurative language: Cross-cultural and cross-linguistic perspectives. Amsterdam: Elsevier.
Farajian, Mohammad Amin. 2011. PEN: Parallel English-Persian news corpus. In Proceedings of 2011 International Conference on Artificial Intelligence (ICAI’11), 523–528. Las Vegas: ICAT.
Fellbaum, Christiane (ed.). 1998. WordNet: An electronic lexical database and some of its applications. Cambridge: MIT Press.
Firth, John. 1957. A synopsis of linguistic theory, 1930‒1955. In John Firth (ed.), Studies in linguistic analysis. Special volume of the Philological Society, 1‒32. Oxford: Blackwell.


Ghayoomi, Masood & Saeedeh Momtazi. 2009. Challenges in developing Persian corpora from online resources. In Min Zhang, Haizhou Li, Kim-Teng Lua & Minghui Dong (eds.), Proceedings of 2009 IEEE International Conference on Asian Language Processing, 108‒113. Los Alamitos: IEEE Computer Society Press.
Ghayoomi, Masood, Saeedeh Momtazi & Mahmood Bijankhan. 2010. A study of corpus development for Persian. International Journal on Asian Language Processing 20 (1). 17‒33.
Girju, Roxana, Adriana Badulescu & Dan Moldovan. 2003. Learning semantic constraints for the automatic discovery of part-whole relations. In Proceedings of the Association for Computational Linguistics – Human Language Technologies, 1‒8. Stroudsburg: Association for Computational Linguistics.
Hamidi, Mandana, Ali Borji & Saeed Shiry Ghidary. 2007. Persian word sense disambiguation. In Proceeding of 15th Iranian Conference of Electrical and Electronics Engineers, 114–118. Teheran: ICEE.
Hashemi, Homa Baradaran. 2010. Creating a Persian-English comparable corpus. In Maristella Agosti, Nicola Ferro, Carol Peters, Maarten de Rijke & Alan F. Smeaton (eds.), Proceedings of the 2010 International Conference on Multilingual and Multimodal Information Access Evaluation: Cross-language evaluation forum lecture notes in computer science, 27‒39. Padua: Springer.
Hirst, Graeme. 1987. Semantic interpretation and the resolution of ambiguity. Cambridge: Cambridge University Press.
Hovy, Eduard, Mitchell Marcus, Martha Palmer, Lance Ramshaw & Ralph Weischedel. 2006. OntoNotes: The 90% solution. In Proceedings of the Human Language Technology Conference of the NAACL, 57‒60. New York: Association for Computational Linguistics.
Ide, Nancy & Jean Veronis. 1990. Word sense disambiguation with very large neural networks extracted from machine readable dictionaries. In Hans Karlgren (ed.), Proceedings of the 13th Conference on Computational linguistics, 389‒404. Helsinki: Association for Computational Linguistics.
Jadidinejad, Amir Hossein, Fariborz Mahmoudi & Jon Dehdari. 2010. Evaluation of Perstem: A simple and efficient stemming algorithm for Persian. In Carol Peters, Giorgio Di Nunzio, Mikko Kurimo, Thomas Mandl, Djamel Mostefa, Anselmo Peñas & Giovanna Roda (eds.), Multilingual information access evaluation I. Text retrieval experiments, lecture notes in computer science, 98–101. Heidelberg: Springer.
Kim, Jin-Dong, Tomoko Ohta, Yuka Tateisi & Jun’ichi Tsujii. 2003. GENIA corpus – a semantically annotated corpus for bio-textmining. In Bioinformatics, 180‒182. Oxford: Oxford University Press.
Kiryakov, Atanas, Popov Borislav, Terziev Ivan, Manov Dimitar & Damyan Ognyanoff. 2004. Semantic annotation, indexing, and retrieval. In Web Semantics: Science, Services and Agents on the World Wide Web 2 (1). 49–79. Amsterdam: Elsevier.
Lesk, Michael. 1986. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Virginia DeBuys (ed.), Proceedings of the 5th Annual International Conference on Systems Documentation, 24‒26. New York: ACM.
Magnini, Bernardo & Gabriela Cavagliá. 2000. Integrating subject field codes into WordNet. In Proceedings of LREC-2000, Second International Conference on Language Resources and Evaluation, 1413‒1418. Athens: European Language Resources Association.
Makki, Raheleh & Mohammad Mehdi Homayounpour. 2008. Word sense disambiguation of Farsi homographs using thesaurus and corpus. In Advances in natural language processing, Lecture notes in computer science, 315‒323. Gothenburg: Springer.


Makki, Raheleh & Mohammad Mehdi Homayounpour. 2008. Using decision list for Farsi word sense disambiguation. In Proceedings of 2nd Joint Congress on Fuzzy and Intelligent Systems. Teheran: Malek Ashtar University.
Manning, Chris & Hinrich Schütze. 2000. Foundations of statistical natural language processing. Cambridge: MIT Press.
Marszałek-Kowalewska, Katarzyna. 2010. Iranian language policy: A case of linguistic purism. In Investigationes Linguisticae, 89‒103. Poznan: Institute of Linguistics UAM.
Mason, Olivier. 2006. The automatic extraction of linguistic information from text corpora. Birmingham: University of Birmingham dissertation.
McRoy, Susan. 1992. Using multiple knowledge sources for word sense discrimination. In Computational linguistics, 1‒30. Cambridge: MIT Press.
Megerdoomian, Karine. 2010. Developing a Persian part-of-speech tagger. In Zabane Farsi va Rayane [Persian language and computers]. Tehran: SAMT Publishers.
Miangah, Mosavi & A. Delavar Khalafi. 2005. Word sense disambiguation using target language corpus in a machine translation system. In Literary and linguistic computing, 237‒249. Oxford: Oxford University Press.
Mitkov, Ruslan (ed.). 2003. The Oxford handbook of computational linguistics. Oxford: Oxford University Press.
Monajemi, Ebrahim. 2011. Can ethnic minority languages survive in the context of global development? http://www.sil.org/asia/ldc/parallel_papers/ebrahim_monajemi.pdf (accessed 5 January 2011).
Palmer, Martha, Dan Gildea & Paul Kingsbury. 2005. The proposition bank: A corpus annotated with semantic roles. In Computational Linguistics Journal 31 (1). 71–106.
Prasad, Rashmi, Nikhil Dinesh, Alan Lee, Eleni Miltsakaki, Livio Robaldo, Aravind Joshi & Bonnie Webber. 2008. The Penn Discourse Treebank 2.0. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC), 2961–2968. Marrakech: European Language Resource Association.
Reeve, Lawrence & Han Hyoil. 2005. Survey of semantic annotation platforms. In Lorrie M. Liebrock (ed.), SAC’05: Proceedings of the 2005 ACM Symposium on Applied Computing, 1634–1638. New York: ACM.
Roberts, Angus, Robert Gaizauskas, Mark Hepple, Neil Davis, George Demetriou, Yikun Guo, Jay (Subbarao) Kola, Ian Roberts, Andrea Setzer, Archana Tapuria & Bill Wheeldin. 2007. The CLEF corpus: Semantic annotation of clinical text. In Proceedings of the AMIA Knowledge Centre, 625–629. Chicago: American Medical Informatics Association.
Rouhizadeh, Masoud, Mahsa A Yarmohammadi & Mehrnoush Shamsfard. 2008. Building a WordNet for Persian verbs. In Attila Tanács, Dóra Csendes, Veronika Vincze, Christiane Fellbaum & Piek Vossen (eds.), Proceedings of The Fourth Global WordNet Conference, 406‒412. Szeged: University of Szeged.
Sagot, Benoît, Géraldine Walther, Pegah Faghiri & Pollet Samvelian. 2011. Développement de ressources pour le persan : le nouveau lexique morphologique PerLex 2 et l’étiqueteur morphosyntaxique MEltfa. In Proceedings of TALN 2011, Conference on Natural Language Processing. Montpellier: TALN.
Sampson, Geoffrey. 2001. Empirical linguistics. London & New York: Continuum.
Samvelian, Pollet & Pegah Faghiri. 2013. Introducing PersPred, a syntactic and semantic database for Persian complex predicates. In Valia Kordoni, Carlos Ramisch & Aline Villavicencio (eds.), Proceedings of the 9th Workshop on Multiword Expressions, NAACL-HLT 2013, 11‒20. Atlanta: Association for Computational Linguistics.


Schütze, Hinrich. 1992. Dimensions of meaning. In Robert Werner (ed.), Proceedings of the 1992 The Association for Computing Machinery (ACM) and the IEEE Computer Society conference on Supercomputing, 787‒796. Los Alamitos: IEEE Computer Society Press.
Seraji, Mojgan, Beáta Megyesi & Joakim Nivre. 2012. A basic language resource kit for Persian. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Ugur Dogan, Bente Maegaard, Joseph Mariani, Jan Odijk & Stelios Piperidis (eds.), Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), 2245‒2252. Istanbul: European Language Resource Association.
Shamsfard, Mehrnoush. 2011. Challenges and open problems in Persian text processing. In Proceedings of Language Technology Conference 11, 65‒69. Poznań: LTC.
Shamsfard, Mehrnoush, Hakimeh Fadaei & Elham Fekri. 2010. Extracting lexico-conceptual knowledge for developing Persian WordNet. In Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner & Daniel Tapias (eds.), Proceedings of Language Resources and Evaluation (LREC 2010), 3798–3802. Malta: European Language Resource Association.
Shamsfard, Mehrnoush, Akbar Hesabi, Hakimeh Fadaei, Niloofar Mansoory, Ali Famian, Somayeh Bagherbeigi, Elham Fekri, Maliheh Monshizadeh & S. Mostafa Assi. 2010. Semi automatic development of FarsNet: The Persian WordNet. In Proceedings of 5th Global WordNet Conference. Mumbai: GWA.
Shamsfard, Mehrnoush, Hoda Sadat Jafari & Mahdi Ilbeygi. 2010. STeP-1: A set of fundamental tools for Persian text processing. In Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner & Daniel Tapias (eds.), Proceedings of Language Resources and Evaluation (LREC 2010), 859–865. Malta: European Language Resource Association.
Small, Steven. 1980. Word expert parsing: A theory of distributed word-based natural language understanding. College Park: University of Maryland dissertation.
Soltani, Mehdi. 2010. A statistical approach on Persian word sense disambiguation. In Informatics and Systems (INFOS), 1‒6. Cairo: IEEE Computer Society Press.
Stevenson, Mark & Yorick Wilks. 2003. Word sense disambiguation. In Ruslan Mitkov (ed.), Oxford handbook of computational linguistics, 249‒265. Oxford: Oxford University Press.
Taheri, Aynaz & Mehrnoush Shamsfard. 2011. Mapping FarsNet to suggested upper merged ontology. In Mohamed Mohamed Salem, Khaled Shaalan, Farhad Oroumchian, Azadeh Shakery & Halim Khelalfa (eds.), Proceedings of Asia Information Retrieval Societies Conference 2011, 604‒613. Dubai: Springer.
Waltz, David & Jordan Pollack. 1985. Massively parallel parsing: A strongly interactive model of natural language interpretation. In Cognitive Science, 51‒74. Norwood: Ablex Publishing.
Wilks, Yorick. 1972. Grammar, meaning and the machine analysis of language. London & Boston: Routledge.
Yarowsky, David. 1992. Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. In Proceedings of COLING-92, 454‒460. Stroudsburg: Association for Computational Linguistics.

List of contributors

Mohammad Abdolhosseini
Isfahan University, Isfahan, Iran
mabdolhosseini@gmail.com

Shadi Dini
Drexel University, School of Education
One Drexel Plaza, 3141 Chestnut Street, Philadelphia PA, 19104, USA
shadi.dini@drexel.edu

Zohreh R. Eslami
Texas A&M University
College Station, Texas 77845, USA
zeslami@tamu.edu

Lewis Gebhardt
Department of Linguistics, Northeastern Illinois University
5500 North St. Louis Avenue, Chicago, IL 60625, USA
l-gebhardt@neiu.edu

Jila Ghomeshi
Department of Linguistics, University of Manitoba
Winnipeg, MB, R3T 5V5, Canada
Jila.Ghomeshi@umanitoba.ca

Shinji Ido
Graduate School of Humanities, Nagoya University
Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
ido@nagoya-u.jp

Youli Ioannesyan
Institute of Oriental Manuscripts of the Russian Academy of Sciences
Dvortsovaya emb., 18, Saint Petersburg, 191186, Russia
youli19@gmail.com

Carina Jahani
Department of Linguistics and Philology, Uppsala University
Box 635, SE-751 26 Uppsala, Sweden
carina.jahani@lingfil.uu.se

Alireza Korangy
Societas Philologica Persica
438 West 37th Street, Room 5H, New York, NY 10018, USA
korangy@gmail.com

Agnès Lenepveu-Hotz
Department of Persian Studies, University of Strasbourg
22 rue René Descartes, 67084 Strasbourg, France
a.hotz@unistra.fr

Behrooz Mahmoodi-Bakhtiari
University of Tehran, College of Fine Arts, Department of Performing Arts
Enghelab Avenue, Tehran, Iran
mbakhtiari@ut.ac.ir

Shahrzad Mahootian
Department of Linguistics, Northeastern Illinois University
5500 North St. Louis Avenue, Chicago, IL 60625, USA
s-mahootian@neiu.edu

Katarzyna Marszalek-Kowalewska
Institute of Linguistics, Adam Mickiewicz University
al Niepodleglości 4, 61-874 Poznań
k.marszalek.kowalewska@gmail.com

Corey Miller
The MITRE Corporation
7515 Colshire Drive, McLean, VA 22102, USA
camiller@mitre.org

Lutz Rzehak
Berlin Humboldt-University, Central-Asian Seminar
Unter den Linden 6, 10099 Berlin, Germany
lutz.rzehak@gmx.de

Adriano V. Rossi
Department of Asian, African and Mediterranean Studies
University L’Orientale – Piazza S. Domenico M. 12, 80134 Naples, Italy
arossi@unior.it

Hooman Saeli
University of Tennessee, Knoxville
2344 Dunford Hall, Knoxville, TN 37996, USA
hsaeli@utk.edu

Martin Schwartz
Department of Near Eastern Studies, University of California, Berkeley
250 Barrows Hall, Berkeley, CA 94720, USA
martz@berkeley.edu

Toon Van Hal
Department of Linguistics, University of Leuven
Blijde-Inkomststraat 21, pb 3008, BE-3000 Leuven, Belgium
toon.vanhal@kuleuven.be

Arseniy Vydrin
Institute for Linguistic Studies of the Russian Academy of Sciences
Tuckhov pereulok, 9, Saint Petersburg, 199053, Russia
senjacom@gmail.com

Z. A. Yusupova
Institute of Oriental Manuscripts of the Russian Academy of Sciences
Dvortsovaya emb., 18, Saint Petersburg, 191186, Russia
youli19@gmail.com
