Showing posts with label phonology. Show all posts

Tuesday, December 17, 2013

Grammar and Wordbuilding

There are a couple of additional considerations that we need to talk about before starting to construct words from our list of morphemes. Just how do those morphemes come together to form words? This is a question with a lot of complicated answers, so we're going to stick to our Indo-European inspirations and ignore a lot of possible variations and intricacies. Even within that limitation, though, there's more than one way to build names. So we need to answer a few basic questions about the grammar of our naming language in order to proceed. We don't need an entire grammar, but we need to know a few things before moving on.

Plurals

In most languages there is some provision for what's called grammatical number. This generally applies to nouns and pronouns but in some languages verbs and adjectives must agree with the number of the noun they are related to.

English, like most major languages, recognizes singular (nouns of the number one) and plural (more than one). Grammatically this is signified by tacking an -s onto the end of the noun or some variation thereof. So singular coin becomes plural coins, but singular loaf becomes plural loaves. Generalizing this rule, we might say that plurality is denoted by a suffix -s, except where the word ends in /f/, in which case the sound changes to /v/ and the suffix becomes -es. (English also inherits weird pluralization rules for many words from other languages, but we're ignoring that.)

Denoting number can be much more complicated than this. In Latin the plural form depends on the declension and case of the noun, and there are many possible combinations. Some languages, including Arabic, recognize a third, dual number. In still others the distinction is not between singular (one) and plural (more than one) but collective (any number) and singulative (only one.)

For Cythric we are going with a simple rule. If the noun or pronoun ends in a vowel, plurality is indicated by the suffix -ga. If it ends in any consonant other than n, the suffix is merely -a. If the singular has final n, the suffix is -da. (This does imply that if our nouns have cases indicated by inflection, that inflection is something other than a suffix. Which is fine, and which we may or may not ever care about, and even then there would be ways around it.)

Thus singular eddra "song" becomes plural eddraga, "songs." Singular hearhm "helm" becomes plural hearhma "helms." And singular firhn "hall" becomes plural firhnda, "halls."
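The plural rule is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function name pluralize and the set of vowel letters (taken from the spellings used in these posts) are assumptions, not part of the rule as stated.

```python
# Vowel letters of the Cythric orthography (an assumed set, including the ligature).
VOWELS = set("aeiouyæ")

def pluralize(noun: str) -> str:
    """Apply the three-way plural rule to a singular noun."""
    last = noun[-1].lower()
    if last in VOWELS:
        return noun + "ga"   # final vowel: -ga
    if last == "n":
        return noun + "da"   # final n: -da
    return noun + "a"        # any other final consonant: -a

for word in ("eddra", "hearhm", "firhn"):
    print(word, "->", pluralize(word))
```

Run on the three examples above, this gives eddraga, hearhma and firhnda.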

Articles

Grammatical articles specify the definiteness of a noun. In English we have four articles: the definite articles /ðʌ/ and /ði/, both spelled the, and the indefinite articles /eɪ/, spelled a and /æn/, spelled an. The indefinite article indicates some example of the noun, a sword or an apple. The apple indicates some specific apple.

Articles can be differentiated by case, gender and so on; English just happens to have a fairly simple set (but not as simple as it seems!) Welsh has a definite article but no indefinite one, while German has articles that differ by gender, number and case. Latin has no articles at all; you have to infer definiteness from context.

In Cythric I want a fairly simple system, so we'll again use Welsh as the model and just have a definite article, y for nouns beginning with consonants, and yd for nouns beginning with a vowel or with n.
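As a rough sketch, the choice between y and yd can be written as a one-line test on the first letter of the noun. The helper name with_article and the vowel-letter set are assumptions made for the example.

```python
VOWELS = set("aeiouyæ")  # assumed vowel letters of the orthography

def with_article(noun: str) -> str:
    """Prefix the definite article: 'yd' before a vowel or n, otherwise 'y'."""
    first = noun[0].lower()
    article = "yd" if first in VOWELS or first == "n" else "y"
    return f"{article} {noun}"

print(with_article("chaurn"))  # y chaurn
print(with_article("eddra"))   # yd eddra
```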

Word Order

Different languages have different word order. For some languages, like English, word order is very important: "The man climbed the hill" makes perfect sense but "the hill climbed the man" is gibberish. This is because in English the subject of the verb precedes the verb while the object follows it. In other words, English uses SVO word order, and varying from that order requires awkward contortions like "the hill was climbed by the man."

Other languages use different word order: Latin, for example, uses a default SOV order, but because the subject and object are indicated by the case those nouns are in, word order is very flexible. Most Romance languages are SVO like English, while Welsh and Arabic are VSO. The great majority of human languages are SOV, SVO or VSO, but examples exist of VOS, OVS and OSV languages. And as always, this can get more complicated; German and Dutch use something called V2 word order which is kind of SVO most of the time but not really.

We're going to say that Cythric uses VSO word order. We don't really care too much at this point since the kind of phrases we're constructing won't often include verbs, but later on that may be something we need. Right now we do care about the internal structure of nominal phrases, in particular where the adjective falls. We're going to place it before the noun, like English... but outside the noun's relationship with its article, if any. So a phrase like "the black shield" would be dyw y chaurn, literally "black the shield." Which has a cool poetic rhythm to it.

Genitives

The genitive is one of a number of noun cases. It indicates possession, composition or origination. Different languages denote cases differently. Latin, for example, has a total of seven cases, of which several partially collapsed into one another. Russian has six and like Latin indicates case by inflection. Finnish has an astonishing fifteen cases. German has four cases and (at least in the abstract) kind of an ideal case system in my opinion. The nominative is for the subject of the verb, the accusative is for the direct object, the dative is for the indirect object and the genitive represents possession or composition. Most languages have cases that overlap with these.

There are two primary ways to denote the genitive in English. One is by adding an -'s or -' to the end of the genitive noun and placing it before the noun to which it is related. The other is to follow the noun to which the genitive noun is being related with of, and to then place the genitive noun after that. So you have constructions like Marcus' gloves or the gloves of Marcus. In both cases Marcus is the genitive noun, related by possession to the gloves. The genitive is either Marcus' or of Marcus.

In Latin the genitive form is derived from the nominative ("naming") form by inflection. So Marcus would become Marci, and the word for gloves (caestus) could be in the nominative or in some other case. Again, this is something that can become very complicated, but we're not going to move too far afield. For Cythric we want something simple and not too alien — and Welsh provides a fine model that we can use to illustrate a different option than those described above.

We will stipulate that genitive relationships in Cythric are denoted by apposition. That is, by the placement of one word with regard to the other. In Welsh the two nouns are placed together with the possessor coming second. So our manipular example would be denoted by the equivalent the gloves Marcus. Note the inclusion of the definite article the. In Cythric it works the same way. So "the king's hall" translates to y firhn wyrhn, literally "the hall king." Note that due to the lack of an indefinite article, just firhn wyrhn would translate implicitly to "a king's hall."
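The ordering rules so far (adjective, then article, then head noun, then possessor) can be sketched as a small phrase builder. The function names are hypothetical, and allowing an adjective and a possessor in the same phrase is an assumption; the posts only show each on its own.

```python
VOWELS = set("aeiouyæ")  # assumed vowel letters of the orthography

def article_for(noun: str) -> str:
    """Return 'yd' before a vowel or n, otherwise 'y'."""
    first = noun[0].lower()
    return "yd" if first in VOWELS or first == "n" else "y"

def noun_phrase(noun, adjective=None, possessor=None, definite=True):
    """Assemble a phrase in the order: adjective, article, head noun, possessor."""
    parts = []
    if adjective:
        parts.append(adjective)          # adjective sits outside the article
    if definite:
        parts.append(article_for(noun))
    parts.append(noun)
    if possessor:
        parts.append(possessor)          # genitive by apposition, possessor second
    return " ".join(parts)

print(noun_phrase("chaurn", adjective="dyw"))                    # dyw y chaurn
print(noun_phrase("firhn", possessor="wyrhn"))                   # y firhn wyrhn
print(noun_phrase("firhn", possessor="wyrhn", definite=False))   # firhn wyrhn
```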


Compounding

In a sense our example phrase y firhn wyrhn is an ideal example, because it's very natural to assume that if it were the name of a place, the article might drop, leaving firhn wyrhn. From there it similarly follows that the word might be compounded into Firhnwyrhn... perhaps the name of a town that was once the seat of an old Cythric king.

Even using our very basic grammar rules and very limited lexicon we can come up with lots of possible names:

  • Ylanwraga, "the Stones," might be an ancient ring of standing stones once used for druidic rituals. Now they lie deep in the wilderness and are used for sacrifices held by a nefarious cult.
  • Frhalim, the high valley. A farming hamlet high in the foothills. Recently plagued by a series of hauntings, possibly originating in nearby ancient barrows.
  • Sorhmtharn, Hawk's Ford, a small but bustling town at a major river crossing.
  • Eddrachaurn, the Shield's Song. An ancient lay of the deeds and death of Gesge, a tribal hero of old.
  • Fearmhlen, the Strong Hall. Seat of Earl Thermge.
  • Gethraga, the Spear Lands. An old name for a fallen kingdom.
  • Hlyndahot, the Hound of Ill Omen. A feared beast that awakes once every thirteen years to devour the blood of the innocent. Note assimilation of /t/ to /d/ following /n/.
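Mechanically, most of the names above are just elements run together and capitalized. Here is a minimal sketch of that step. The function name compound is hypothetical, and applying the /t/ to /d/ change everywhere a t follows n (but not where the t belongs to the digraph th) is an assumed reading of the note on Hlyndahot.

```python
import re

def compound(*elements: str, assimilate: bool = True) -> str:
    """Join word elements into a single capitalized name."""
    name = "".join(e.lower() for e in elements)
    if assimilate:
        # Assumed rule: t becomes d after n, unless the t is part of 'th'.
        name = re.sub(r"nt(?!h)", "nd", name)
    return name.capitalize()

print(compound("firhn", "wyrhn"))     # Firhnwyrhn
print(compound("sorhm", "tharn"))     # Sorhmtharn
print(compound("hlynta", "hot"))      # Hlyndahot
print(compound("y", "lanwra", "ga"))  # Ylanwraga
```

The lookahead keeps a sequence like firhn plus tharn intact, since the th there spells a single sound.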

With this, our first basic naming language is assembled and with some additions to its lexicon we'll be able to build many more names from it. But all these names are what we might call names in the Old Forms. That is, names as the ancient Cythric people of the isles would have spoken them. Not all names in the Eastern Isles have such an ancient provenance. The next time we touch on Cythric we'll have a look at how that will be reflected.

Monday, December 16, 2013

The Cythric Word List

If the previous posts on phonetics and that kind of buzzkill stuff were not your cup of tea, you can start your name-building process here, with this post. If you did follow along with the last few articles you'll know that we have the first two components of what we'll need to develop the Cythric languages into a complete tool we can use for naming; the phonology and phonological rules of its ancestor language. Now we start making words.

There are basically three approaches to this:
  • You can create a set of random tables to build random syllables and assemble them into words. A somewhat slapdash example of this can be found in the alien language tables for Vilani, Zhodani, etc. from GDW's classic Traveller line. The results of their implementation are... less than satisfactory.
  • You can use an online utility or program to build your words for you. As with the previous random-ish method, you're going to end up with a lot of junky words that you'll need to sift through.
  • You can make words that match your rules by hand. This works best over a long period but I caution against sitting down and writing up a hundred words in one sitting; you'll go stale fast.
What I'm going to do is hand-craft a few words (word elements, really) and supplement that using gen. How many words do I need? Well, that depends. Ideally, you'd want probably a few hundred, and really there is no number too large, but a lexicon of many hundreds of words is probably overkill for most people needing a simple naming language. For place names, you probably need a few dozen.

Start by observing how place names are built. Place names in England are best for this because their construction is most apparent. Sometimes this is obvious, as in Beaconsfield from Beacon's Field, for example, or Whitchurch from White Church. In other cases it's less obvious, as in Aylesbury, from Aegel's Fort. It makes perfect sense once you know a burgh is a fort and that in Anglo-Saxon (Old English) the letter g is sometimes pronounced as /j/, the y of English yes.

Half an hour with a map and Wikipedia should give you a sense of the pattern of place names... and a good idea of the words you'll need. Here is an exhaustive list of such elements from English place names. By the way, this is the exact method used by Ed Greenwood to construct many of the names in the Forgotten Realms. It's a method that works whether you render the names in plain English or translate them into some other language, although it does presuppose both a roughly medieval western European culture and a roughly Indo-European language. Names in the Japanese pattern, for example, are very differently constructed, so for something more exotic you may want to range further afield.

So you'll want words for places like river, camp, hill, bay, port, field, mountain, farm, fort, crossing, woods, meadow. And adjectives like rich, great, wide, tall, hale, and number and color words. With, say, ten or twenty from each group you should have sufficient fodder to start building names. Assign words from your word list to the meanings you've assembled.

Here's my sample starting list of thirty words:

element    type         meaning
dyw        adj          black
ethyrhn    adj          hallowed
fearm      adj          strong
frapa      adj          green
frha       adj          high
ges        adj          bold
reag       adj          old
hot        adj          ill-omened
shrapa     adj          fertile, verdant
therm      adj          grim
firhn      n (place)    hall
hlen       n (place)    tower
lanwra     n (place)    stone
lim        n (place)    river
llan       n (place)    valley
llyn       n (place)    farm
osyrhn     n (place)    river mouth
sirhn      n (place)    hill
sorhm      n (place)    ford
thra       n (place)    land
ashri      n (thing)    flower
chaurn     n (thing)    shield
eddra      n (thing)    song
ge         n (thing)    spear
hearhm     n (thing)    helm
hlynta     n (thing)    hound
tharn      n (thing)    bird of prey
thorn      n (thing)    hammer
warhn      n (thing)    chief, headman
wyrhn      n (thing)    king

This should be enough to get started with, and I made sure to include some elements that could be used in masculine or feminine personal names as well. Now we are almost ready to commence the actual world building, once we know how words can be put together.
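If you'd rather keep the list somewhere more useful than a text file, a nested dictionary is enough. The sketch below stores a handful of the elements above and lists every adjective plus place pairing as a candidate name. The structure, the function name candidate_names, and the choice to simply run the two elements together are all assumptions made for illustration.

```python
from itertools import product

# A small subset of the word list, keyed by type and then by element.
LEXICON = {
    "adj": {"dyw": "black", "fearm": "strong", "frha": "high", "reag": "old"},
    "place": {"firhn": "hall", "hlen": "tower", "lim": "river", "sorhm": "ford"},
}

def candidate_names(lexicon):
    """Yield (name, gloss) pairs for every adjective + place combination."""
    pairs = product(lexicon["adj"].items(), lexicon["place"].items())
    for (adj, adj_gloss), (place, place_gloss) in pairs:
        yield (adj + place).capitalize(), f"{adj_gloss} {place_gloss}"

names = list(candidate_names(LEXICON))
for name, gloss in names[:4]:
    print(f"{name}: {gloss}")
```

Four adjectives and four places already give sixteen candidates, most of which you would throw away; the point is only to have raw material to pick from.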

Wednesday, December 11, 2013

The Phonology and Morphology of Cythric


In talking about phonology I'm going to go full language geek and use the IPA, or International Phonetic Alphabet. At least a cursory knowledge of this is essential to any but the most superficial language builder, as is a passing understanding of phonetics itself. I recommend J. C. Catford's A Practical Introduction to Phonetics for reference on the subject; it's not a dry textbook but an instruction manual, complete with exercises, that teaches you the different sounds of language using your own yap as the classroom. Great book. For now, though, feel free to ignore all that and just pay attention to the pronunciation and orthography that I describe below, and I'll try to at least mention anything else.

Briefly, phonemes are the sounds of a language. These aren't the same as letters used to write the language; English uses a pedestrian 26-letter Latin alphabet but employs something like (depending on dialect) 35 different phonemes. So the letter a, for example, is used for three or more different sounds depending on who is speaking. The IPA, on the other hand, is an alphabet that uses a unique character for every possible phoneme (at least nominally.) Some IPA characters are also regular letters, so when we write them we enclose them in slashes, thus: /p/ is the phoneme that we find at the beginning of the English word pill, for example. My own orthographic representations of the sounds, the way I will depict them in writing, will be denoted in bold. So, for example, we have the sound /k/ in Proto-Cythric but for the sake of flavor we will render it as c.

There are also allophones, which are variations of a phoneme as actually used in a language. If you carefully compare the initial /k/ sounds at the beginning of the English words cool and calm, you'll find that while they can both be represented by the phoneme /k/ and the letter c, they are actually slightly different sounds, allophones of /k/. But we're not going to worry about phones or allophones at this point; instead we'll concentrate on making everything understandable to speakers of English; phonologically at least, Cythric isn't terribly different.

As previously mentioned I'd like Cythric names to sound lightly Celtic with hints of Basque and Anglo-Saxon, but we'll see how close I can come to that goal. First we will develop the Cythric language and then derive from it two daughter languages, Islander and Kethwyrn, of which the former will be subject to heavy outside influences. The first stop on this road is to decide what sounds are used in the mother tongue.

Consonants

Cythric has six stops: labial /p/ and /b/, alveolar /t/ and /d/ and velar /k/ and /g/, represented orthographically by p, b, t, d, c and g, respectively. Note that both c and g are always hard as in call and guard, never soft as in cell or gender. Although there are many deep and subtle variations in the way these sounds are pronounced in different languages, we can consider them functionally equivalent to the corresponding English sound; in some form these six phonemes are found in almost every human language.

Cythric also has a voiceless labio-dental fricative /f/ and a voiceless alveolar fricative /s/, pronounced much as they are in English and represented by f and s. However, their voiced equivalents /v/ and /z/ respectively are not used in Cythric. There is also a voiceless dental fricative /θ/, represented by th, which is the initial sound of English thing, and a voiced equivalent /ð/, indicated by dd, the voiced sound of th from English this.

Cythric employs a voiceless postalveolar fricative /ʃ/ (the sh of English shift), represented by sh. Also present are the voiceless velar fricative /x/ (the ch in German Bach), which we will represent orthographically by ch, and a voiceless glottal fricative /h/ (as in English). Lastly there is the voiced labio-velar approximant /w/, represented by w and pronounced as in English wake; some dialects of English have an unvoiced version, in for example which; in Cythric it is always voiced.

We will also include three nasal consonants: /m/ and /n/ (m and n,) similar to their equivalents in English, and /ŋ/, the ng sound in English ring. Note that this is not the ng of English finger. We'll see later whether this particular combination will occur in the Cythric tongues, but if it does we will use the orthography ngg.

The sounds we think of as "l" and "r" sounds are called lateral and rhotic sounds, respectively, and are collectively referred to as liquids. Even in English they are kind of a mess, but getting them right will go a long way toward giving Proto-Cythric the flavor I'm looking for.

Cythric's three lateral consonants are the voiced alveolar lateral approximant /l/, which is the clear l of English and will be denoted orthographically by l; a voiced velarized alveolar lateral approximant /ɫ/ (the dark l of English) denoted by hl; and a voiceless alveolar lateral fricative /ɬ/, which is the ll of Welsh and which is denoted the same way here. Also included is a rolling /r/, not the English r but the trill of Spanish perro. This will be denoted by rh. There is also a tapped /ɾ/ which I will denote by r. The standard /ɹ/ of American English is not present in Proto-Cythric.

It may not seem like it, but that's a fairly simple system of consonants (compare it to Hindi's) with only a few sounds (/x/, /r/ and /ɬ/) not found in English.

Vowels

The vowel system of Cythric is fairly straightforward, with six vowels.
  • A near close, near front unrounded /ɪ/, the i of English pit, represented by y.
  • An open-mid back unrounded vowel /ʌ/, the u of English dust, represented by u.
  • An open mid front unrounded /ɛ/, the e of English press, represented by e.
  • The open back unrounded /ɑ/, the a of English father, represented by a.
  • A close front unrounded vowel /i/, the ea of English neat, represented by i.
  • An open mid back rounded vowel /ɔ/, the ou of Midwestern American English thought, represented by o.

In addition, there are three diphthongs:
  • An /ɑʊ/, the ow of English prow and represented by au.
  • The /eɪ/, the a of English phase, represented by the digraph ae or the ligature æ.
  • An /ɪə/, the ea of English near, represented by ea.
When vowels are long they will be doubled, as in shaan, "lake." The a, o and u vowels have long versions aa, oo and uu, and when these are found initially they are preceded by a breathing, denoted by h, much as in Ancient Greek. Also note that Cythric will have a regular stress on the penultimate syllable. Daughter languages of Cythric will have evolved phonologies, of course, but those will be easy to develop as variations on this base.
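Since every sound has exactly one spelling, the orthography can be sketched as a lookup table from IPA to letters. The table below follows the spellings given above, with two assumptions: /ŋ/ is spelled ng (implied but not stated outright), and /eɪ/ uses the digraph ae rather than the ligature. Long vowels and the initial breathing are left out to keep it short, and the function name to_orthography is hypothetical.

```python
SPELLING = {
    # stops
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "c", "g": "g",
    # fricatives and the approximant
    "f": "f", "s": "s", "θ": "th", "ð": "dd", "ʃ": "sh", "x": "ch",
    "h": "h", "w": "w",
    # nasals (the spelling of ŋ is an assumption)
    "m": "m", "n": "n", "ŋ": "ng",
    # laterals and rhotics
    "l": "l", "ɫ": "hl", "ɬ": "ll", "r": "rh", "ɾ": "r",
    # vowels
    "ɪ": "y", "ʌ": "u", "ɛ": "e", "ɑ": "a", "i": "i", "ɔ": "o",
    # diphthongs
    "ɑʊ": "au", "eɪ": "ae", "ɪə": "ea",
}

def to_orthography(ipa: str) -> str:
    """Spell an IPA string, trying two-character symbols before single ones."""
    out = []
    i = 0
    while i < len(ipa):
        for size in (2, 1):
            chunk = ipa[i:i + size]
            if chunk in SPELLING:
                out.append(SPELLING[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no spelling for {ipa[i]!r}")
    return "".join(out)

print(to_orthography("kɪθɾik"))  # cythric
```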

Phonological Constraints and Basic Wordbuilding

That panic-inducing phrase just means the ways in which the sounds of a language can be put together to form morphemes, which are language components that make up or are themselves words. Roots and affixes are both types of morphemes, for example; the English root bake and the suffix -ed are distinct morphemes and as such carry meaning with them.

Different languages have different rules for putting sounds together to form morphemes. To English speakers the word ngai seems weird and perhaps even unpronounceable. That's because in English the /ŋ/ isn't allowed initially. In other languages, though, initial /ŋ/ is perfectly normal; Ngai is a common Cantonese-derived surname.

Designing this seems like a chore especially given the advice in Rosenfelder, which involves learning the code linguists use to describe such rules. It's actually fairly simple: start with a small sampling of words, preferably words of different numbers of syllables. Half a dozen words that you feel sound characteristic of your language should be enough.

Then, divide your phonemes into categories. This can be as simple as Consonants and Vowels, but it can get more involved if you'd like. For Cythric I've used six categories:
  • Stops (S): /p/, /t/, /k/, /b/, /d/ and /g/
  • Fricatives and Approximants (F): /f/, /s/, /θ/, /ð/, /ʃ/, /x/, /w/ and /h/
  • Rhotics (R): /r/ and /ɾ/
  • Laterals (L): /l/, /ɬ/ and /ɫ/
  • Nasals (N): /n/, /m/ and /ŋ/
  • Vowels and diphthongs (V): /i/, /ɛ/, /ɑ/, /ɔ/, /ʌ/, /ɪ/, /ɑʊ/, /eɪ/ and /ɪə/
Now take your short word list and dissect the words, syllable by syllable and sound by sound. For Cythric we'll use as an example the name of the language, cythric, literally the only word of the language that had been determined before this article was written. Cythric has two syllables, /kɪθ/ and /ɾik/. Disassembling each of these syllables we get stop-vowel-fricative and rhotic-vowel-stop. So right there from that one example we learn that SVF and RVS syllables are allowed in Proto-Cythric.
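The dissection step can be automated once the categories are written down. The sketch below maps each phoneme to its category letter and reads off the shape of a syllable given in IPA. The names CATEGORIES and syllable_shape are hypothetical, and splitting the word into syllables is still done by hand.

```python
CATEGORIES = {
    "S": ["p", "t", "k", "b", "d", "g"],
    "F": ["f", "s", "θ", "ð", "ʃ", "x", "w", "h"],
    "R": ["r", "ɾ"],
    "L": ["l", "ɬ", "ɫ"],
    "N": ["n", "m", "ŋ"],
    "V": ["i", "ɛ", "ɑ", "ɔ", "ʌ", "ɪ", "ɑʊ", "eɪ", "ɪə"],
}
# Invert the table: phoneme -> category letter.
CLASS_OF = {ph: cat for cat, phs in CATEGORIES.items() for ph in phs}

def syllable_shape(syllable: str) -> str:
    """Return the category pattern of one IPA syllable, e.g. 'SVF'."""
    shape = []
    i = 0
    while i < len(syllable):
        # Diphthongs are two characters long, so try those first.
        for size in (2, 1):
            chunk = syllable[i:i + size]
            if chunk in CLASS_OF:
                shape.append(CLASS_OF[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"unknown phoneme {syllable[i]!r}")
    return "".join(shape)

print([syllable_shape(s) for s in ("kɪθ", "ɾik")])  # ['SVF', 'RVS']
```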

A second word is cethwyrn, the name of a daughter language of Cythric that is spoken by some tribal peoples of the Isles. Cethwyrn breaks down to /kɛθ/ and /wɪɾn/ and from there to SVF and FVRN. That's one more permissible syllable form. You can see that from a very small but varied sample of words you can extract a number of allowable combinations, and more can be added at will. If you like, you can even start to generalize them using the standard linguist's notation that Rosenfelder discusses, but once we have, say, six to ten such combos we've gone as far with this as we need to. For Proto-Cythric I have eleven. From here we can start generating word lists. And that's where we'll pick up next time.
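As a preview of what generating from these shapes might look like, here is a rough sketch using only the three syllable forms extracted so far (SVF, RVS and FVRN), not the full set of eleven. The spellings follow the orthography described earlier, with ng for /ŋ/ as an assumption, and the function names and seed are arbitrary. Expect plenty of junk in the output; the sifting still has to be done by hand.

```python
import random

# Orthographic spellings for each category (ng for /ŋ/ is an assumption).
ORTHO = {
    "S": ["p", "t", "c", "b", "d", "g"],
    "F": ["f", "s", "th", "dd", "sh", "ch", "w", "h"],
    "R": ["rh", "r"],
    "L": ["l", "ll", "hl"],
    "N": ["n", "m", "ng"],
    "V": ["i", "e", "a", "o", "u", "y", "au", "ae", "ea"],
}
# Only the syllable shapes derived from cythric and cethwyrn.
TEMPLATES = ["SVF", "RVS", "FVRN"]

def random_syllable(rng: random.Random) -> str:
    """Pick a template, then one spelling for each category letter in it."""
    template = rng.choice(TEMPLATES)
    return "".join(rng.choice(ORTHO[cat]) for cat in template)

def random_word(rng: random.Random, syllables: int = 2) -> str:
    """Concatenate the requested number of random syllables."""
    return "".join(random_syllable(rng) for _ in range(syllables))

rng = random.Random(2013)  # fixed seed so a run can be repeated
print([random_word(rng) for _ in range(5)])
```

Note that running syllables together can produce ambiguous spellings (a syllable ending in t followed by one starting with h, say), which is one more reason to review the results rather than trust them.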