Z-SAMPA

From KneeQuickie

Z-SAMPA is a phonetic transcription system based on the IPA; every IPA symbol has a corresponding symbol in Z-Sampa, so conversion between the two systems is easy. Z-Sampa also extends the range of the IPA, and many of its symbols have no direct IPA equivalent; some of these, but not all, have equivalents in the Extended IPA chart.

Z-Sampa was developed by the members of the ZBB for the benefit of conlangers. It is largely the work of Nuntar, building on previous work by Circéus and finlay as well as on the Kirshenbaum and CXS systems, and incorporating many helpful suggestions made by others. Because it is aimed at conlangers, its mission is to make it possible and easy to represent any sound that might be wanted in a conlang, regardless of whether it exists in natural languages.

Z-Sampa is an extension of X-Sampa, and is fully backwards compatible: any X-Sampa string will have precisely the same meaning in Z-Sampa. There is therefore no possibility of confusion between the two systems. Z-Sampa also offers alternative representations for several X-Sampa symbols; in particular, its creators believe that the Z-Sampa systems for representing clicks and tones are hugely superior to those of X-Sampa.

Finally, Z-Sampa uses only ASCII characters, so that it can be used in any context; however, where some IPA symbols are available (typically the more common ones, such as æ), we recommend using them in place of the corresponding Z-Sampa symbols.

As in X-Sampa, Z-Sampa strings are uniquely parsable and therefore do not need to be written with spaces; however, spaces can be inserted between words for clarity if desired.
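As a rough illustration of why spaces are optional, the sketch below splits an unspaced string into symbol-sized units. It assumes a simplified rule that is not stated on this page: each unit is one character, optionally followed by a backslash and then by any backticks. The name split_symbols is hypothetical, and underscore diacritics, angle-bracket notation and the two-part symbol |\|\ are deliberately left unhandled.

```python
import re

# Simplified assumption (not an official grammar): a unit is one character,
# then an optional backslash, then any number of backticks.
UNIT = re.compile(r"[^\\`]\\?`*")


def split_symbols(text):
    """Split an unspaced string into units under the rule above."""
    return UNIT.findall(text)


# The Python literals below stand for the strings R\rB\ and r\`4.
print(split_symbols("R\\rB\\"))
print(split_symbols("r\\`4"))
```

Because a backslash or backtick always attaches to the character before it, the split never depends on where a writer chose to put spaces. A full parser would also need the diacritic chart to tell an underscore diacritic from a tie bar.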

Chart of all Z-Sampa symbols

Pulmonic consonants

When two symbols occupy a slot, the first is unvoiced and the second voiced. When only one symbol occupies a slot, it is voiced, with the exception of the epiglottal and glottal plosives and the percussives. All voiced segments can be made unvoiced with the diacritic _0.

Bilabial Labiodental Bidental Dental Alveolar Postalveolar Retroflex Alveolo-palatal Palatal
Plosive p b p_d b_d t_d d_d t d t_- d_- t` d` t`_m d`_m c J\
Nasal m F etc. n etc. n` etc. J
Trill B\ B\_d r c\` c\
Tap or flap 4 r`
Tap #\ #\`
Flap W\ b\ d\ d\`
Spirant fricative p\ B f v h_t\ h\_t\ T D T_a\ D_a\ T` D` C j\
Sibilant fricative s z S Z s` z` s\ z\ T\ D\
Lateral fricative K K\ K` K\` C\ 6\
Lateral + central fricative S\ Z\
Nareal fricative m_; F_; n_; n`_; J_;
Percussive w\ w\_d t\
Lateral approximant l l` L
Approximant P\ P or v\ r\ r\` j
Labiopalatal Velar Labiovelar Uvular Pharyngeal Epiglottal Glottal
Plosive k g k_p g_b q G\ >\ ?
Nasal N N_m N\
Trill $\ R\ %\
Tap or flap ^\ 4\ 9\
Fricative 8\ H_r x G W w_r X R X\ ?\ H\ <\ h h\
Lateral fricative F\ V\ q\ Q\
Nareal fricative N_; N\_;
Lateral approximant L\ Y\
Approximant H M\ w y\ e\

Note: the symbols /R/, /?\/ and /<\/ may all represent either a fricative or an approximant; the symbols /y\/ and /e\/ have been added in case one wants to specify the approximant. All epiglottal symbols may be used for pharyngeals if there is no separate symbol for the relevant pharyngeal consonant – though, except for /e\/, these are believed to be impossible.

Vowels

When two vowel symbols occupy one slot, the first is unrounded and the second rounded.

front front central back back
tense lax lax tense
Close tense i  y 1  } M  u
alternatives: i\ u\
Close lax I  Y I\ U\ m\ U
Close-mid tense e  2 @\ 8 7  o
Mid lax E\ 2\   @ 7\ o\
Open-mid tense E  9 3  3\ V  O
Open lax { {\   6 A\
Open tense a  & a\ &\ A  Q

8-bit vowel transcription

This is an alternative system for representing vowels in contexts where 8-bit characters are available. It is fully compatible with Z-Sampa, as it does not reuse any Z-Sampa symbols with different meanings, and is much simpler to remember as it follows a regular pattern.

front front central back back
tense lax lax tense
Close tense i ü î û ï u
Close lax I Ü Î Û Ï U
Close-mid tense e ö ê ô ë o
Mid lax  @
Open-mid tense E Ö Ê Ô Ë O
Open lax ä  â
Open tense a å Â A Å

Note that not all Z-Sampa vowels have equivalents in this system.

Clicks

Z-Sampa retains the X-Sampa click symbols, but in addition has a click diacritic _! that allows a greater variety of modified clicks to be made easily:

unvoiced voiced nasalized
Bilabial O\ or p_! b_! m_!
Dental / laminal alveolar =\ or t_! d_! n_!
Apical (post)alveolar / retroflex !\ or t`_! d`_! n`_!
Laminal postalveolar / palatal |\ or c_! J\_! J_!
Alveolar lateral |\|\ or t_l_! d_l_! n_l_! (similarly, palatal lateral click c_l_! etc.)
Velar k\ or k_! g_! N_!
Sublaminal lower alveolar percussive click ;
Alveolar and sublaminal click ;\
Forward released lateral click +\

The combined diacritics [_l_!] may be written as the single diacritic [_7].

Click modifications may also be shown with a tie-bar (or, preferably, the right parenthesis; see below), thus [k=\)] for unvoiced velar posterior closure, [N\=\)] for nasal uvular posterior closure, etc. If no posterior closure is indicated, [k] is assumed, except in the case of the velar click, which can only have a uvular posterior closure. (The IPA, however, officially considers the velar click impossible in any case.)

Other phonemes

Sublaminal tap g\
Velarized alveolar lateral approximant 5
Strongly velarized alveolar lateral approximant 5\
Velopharyngeal fricative f\
Alveolar lateral flap l\
Retroflex lateral flap l\`
Simultaneous [S] and [x] x\
Sound with no available symbol *\

Diacritics

The following diacritics are normally written straight after the symbol they modify (e.g. [n=] for syllabic [n]), although all of them can optionally be joined to the preceding symbol with an underscore:

Syllabic =
Nasal ~
Retroflex/rhoticity `
Long :
Half-long :\
Palatalized '

All other diacritics are joined to the symbol with an underscore _, e.g. [t_d] for dental [t].

_a Apical _(v Initial partial voicing
_a\ Alveolar/labioalveolar _v) Final partial voicing
_A Advanced tongue root _V Pre-voicing
_B Extra low tone _V\ Post-voicing
_c Less rounded _w Labialized
_C Labial spreading _w\ Pre-labialized
_d Dental _W Exolabial
_d\ Dentolabial _W\ Endolabial
_e Velarized or pharyngealized _x Mid-centralized
_E Lower dental or linguolabial _x\ Tense
_f Whistled articulation _X Extra-short
_f\ Velopharyngeal friction _y Weak articulation
_F Falling tone _Y Strong articulation
_G Velarized _0 Voiceless or slack voice
_h Aspirated _0\ Partial devoicing
_h\ Preaspirated _(0 Initial partial devoicing
_H High tone _0) Final partial devoicing
_H\ Harsh voice _7 Lateral click
_j Palatalized _8 Whispery phonation
_k Creaky voice _9 Faucalized voice
_l Lateral release _! Click
_l\ Monolateral _" Centralized
_L Low tone _%\ Strident
_m Laminal _& Open rounded
_M Mid tone _+ Advanced
_n Nasal release _- Retracted
_n\ Pre-nasalized _/ Rising
_N Linguolabial _; Nasal escape
_N\ Interlinguolabial _< Implosive
_o Lowered _<\ Ingressive airflow
_O More rounded _=\ Unaspirated
_P Labiodentalised _> Ejective
_P\ Labialized (as distinct from labiovelarized) _>\ Egressive airflow
_q Retracted tongue root _? Glottalized
_r Raised _?\ Pharyngealized
_R Rising tone _\ Falling
_t Breathy voice _\\ Reiterated articulation
_t\ Interdental/bidental _^ Non-syllabic
_v Voiced or stiff voice _} No audible release
_v\ Partial voicing _~\ Denasal

Tones

As an alternative to the tonal diacritics above, Z-Sampa supports tone numbers enclosed in angle brackets, using 5 = _T = very high, 4 = _H = high, etc. Thus [a<254>] is equivalent to [a_L_T_H]. As in X-Sampa, numerical diacritics _1 to _6 are reserved for language-specific tone contours.
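The correspondence between tone numbers and tone diacritics can be sketched as a small conversion routine. This is only an illustration: the function name expand_tone_numbers is hypothetical, the values for 5 and 4 come from the text and 2 from its example, and the values for 3 (_M) and 1 (_B) are an assumption filled in from the "etc." and the diacritic chart.

```python
import re

# 5 and 4 are stated in the text; 2 follows from the example a<254>.
# 3 and 1 are assumptions based on the tone diacritics in the chart.
TONE_DIACRITICS = {"5": "_T", "4": "_H", "3": "_M", "2": "_L", "1": "_B"}

# Only angle brackets containing nothing but the digits 1 to 5 are touched,
# so notation such as <f> or <..> passes through unchanged.
TONE_NUMBERS = re.compile(r"<([1-5]+)>")


def expand_tone_numbers(text):
    """Rewrite each <digits> group as a sequence of tone diacritics."""
    def replace(match):
        return "".join(TONE_DIACRITICS[digit] for digit in match.group(1))
    return TONE_NUMBERS.sub(replace, text)


print(expand_tone_numbers("a<254>"))
```

Under these assumptions the call turns a<254> into a_L_T_H, matching the equivalence given above.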

Sliding articulation

The dollar sign, $, is used to represent the ExtIPA "sliding articulation" marker. It is placed in between two segments, thus: [s$S] means sliding from [s] to [S].

The plus symbol

The plus sign, +, may be used as a generic "superscript" marker; that is, the immediately following symbol (or two or more symbols enclosed in parentheses) is to be interpreted as a modification. The following uses are particularly noteworthy:

  • +h after a segment shows aspiration; before, shows preaspiration.
  • +h\ is an alternative for breathy voice, and +? for glottalization or (before a segment) glottal onset.
  • +j, +G, +P etc. are alternative ways to show palatalization, velarization, labiodentalization etc. All of these can be placed before a segment to show that the secondary articulation precedes the primary.
  • +m, +n etc. (use the nasal at the same point of articulation as the primary segment) show nasalization, or prenasalization if placed before a segment.
  • +s, +S etc. (use the fricative with the same voicing and point of articulation as the primary segment) can be an alternative way to show an affricate as distinct from a stop + fricative cluster; and placing the plus before the stop component (+ts, +dz etc.) can be used to show that the fricative component carries the emphasis.
  • +i, +u etc. (any vowel) can be used for diphthongization, with the superscript symbol representing the non-syllabic element. Also, +@ can represent an epenthetic schwa.

If there is likely to be any ambiguity about whether a "superscript" symbol belongs with the following or preceding segment, the tie bar can be used to specify: [t_+jp] to mean palatalized [t] followed by [p], for instance, against [t+j_p] to mean [t] followed by pre-palatalized [p].

Connected speech

<.> Short pause
<..> Medium pause
<...> Long pause
<f> Loud
<ff> Louder
<p> Quiet
<pp> Quieter
<allegro> or <alg> Fast
<lento> or <len> Slow
<crescendo> or <crs> Getting louder
<diminuendo> or <dim> Getting quieter
<accelerando> or <acl> Getting faster
<rallentando> or <rall> Getting slower

(Other musical terms may be similarly used.)
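For anyone processing transcriptions automatically, the abbreviated tags in the list above could be normalised to their full forms with a lookup table. The sketch below is one possible way to do that; the function name expand_speech_tags is hypothetical, and only the six abbreviations listed on this page are covered.

```python
import re

# Abbreviations and full forms exactly as listed above.
FULL_FORMS = {
    "alg": "allegro",
    "len": "lento",
    "crs": "crescendo",
    "dim": "diminuendo",
    "acl": "accelerando",
    "rall": "rallentando",
}

# Lower-case words in angle brackets; tags such as <F> or <254> do not match.
TAG = re.compile(r"<([a-z]+)>")


def expand_speech_tags(text):
    """Replace abbreviated tags with full forms; leave other tags alone."""
    def replace(match):
        word = match.group(1)
        return "<" + FULL_FORMS.get(word, word) + ">"
    return TAG.sub(replace, text)


print(expand_speech_tags("<alg>ta<rall>"))
```

Tags that are not in the table, such as <f> or <pp>, are returned unchanged.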

Other symbols

<F> Global fall
<R> Global rise
< Begin nonsegmental notation
> End nonsegmental notation
^ Upstep
! Downstep
| Minor foot group
|| Major intonation group
- Separator
-\ Linking mark
* Undefined escape character (“conjunctor”)
" Primary stress
% or , Secondary stress
. Syllable break
# Word break
0 (zero) No phoneme

Parentheses

Z-Sampa supports the use of a single right parenthesis to show that the preceding two sounds are an affricate, diphthong or coarticulation; this is an alternative to placing an underscore between them. (For example, [ts)] is an alternative way to write [t_s].)

Triphthongs or coarticulation of three sounds may be written with two right parentheses; thus [R\rB\))] would be a coarticulated bilabial-alveolar-uvular trill, as opposed to [R\rB\)], which would be a uvular trill followed by a bilabial-alveolar coarticulate.

The left parenthesis is only used together with the right parenthesis, and shows that the sounds in between them behave as a single segment. This can be used in phonemic transcription for combinations such as /st/ or /nd/ that are not affricates or coarticulates. It is also just as acceptable to use both parentheses together to enclose a coarticulate as it is to mark it with just one. (For triphthongs or coarticulates of three sounds, this may be more aesthetic and easier to read than using the double right parenthesis.)

Left and right parentheses can also be used together to show that a diacritic acts on both or all of the sounds enclosed: for instance, (nt)_d is a neater way of writing n_dt_d. With some diacritics the two could even be distinguished; one could write, for instance, (tk)_> to mean sequential [t] and [k] sharing a single glottal closure, as opposed to t_>k_>, consecutive ejectives with separate glottal closures.
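The simple case, where (nt)_d is just shorthand for n_dt_d, can be sketched as a mechanical expansion. The code below is an illustration under stated assumptions rather than part of Z-Sampa itself: the name distribute_diacritic is hypothetical, a symbol is assumed to be one character plus an optional backslash and optional backticks, and only a single underscore diacritic directly after the closing parenthesis is handled.

```python
import re

# Assumed shape of a plain symbol: one character, an optional backslash,
# then any backticks. Parentheses and underscores never start a symbol.
SYMBOL = r"[^\\`()_]\\?`*"

# A parenthesised run of plain symbols followed by one underscore diacritic.
GROUP = re.compile(r"\(((?:%s)+)\)(_[^\\`()_]\\?)" % SYMBOL)


def distribute_diacritic(text):
    """Copy a diacritic written after (...) onto each enclosed symbol."""
    def replace(match):
        symbols = re.findall(SYMBOL, match.group(1))
        return "".join(symbol + match.group(2) for symbol in symbols)
    return GROUP.sub(replace, text)


print(distribute_diacritic("(nt)_d"))
```

As noted above, the grouped and expanded forms are not always interchangeable: (tk)_> and t_>k_> can be given different meanings, so an expansion like this should only be applied where that distinction is not being made.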

Informally, left and right parentheses together are often used to show that a phoneme is optional in a certain word, such as word-final /r\/ when comparing dialects of English. This is not standard Z-SAMPA, but is not discouraged.

Another use for left and right parentheses together is to show alternatives: [(g/G)] would mean that a phoneme can be realised as either [g] or [G]. This is also not standard, but is perfectly acceptable as it does not lead to any ambiguity.

Unused symbols

The following symbols are currently unused. (I'm making this list for my own purposes, should I want to add to Z-Sampa, and for the aid of anyone who wants to suggest additions.)

  • n\

Frequent Z-Sampa mistakes

This is a brief list of the most commonly made errors when using Z-Sampa.

  • The primary stress mark is a double quote ", not an apostrophe '. It goes before the stressed syllable: ["sIl.@.b5=]. This applies to the secondary stress mark (the percentage sign % or, preferably, the comma ,) as well.
  • [!\] and [|\] are clicks, although the click diacritic _! is preferred. Without the backslash, [!] and [|] are not clicks: they mark downstep and a minor foot group respectively.
  • The underscore, or its alternative the right parenthesis, can be used to join the two components of an affricate, coarticulation or diphthong. It is not used to join other combinations of consonants that happen to behave as though they were single segments, such as /st/ or /nd/ or /tr/.
  • Nor is the "linking mark" -\ used for this; its function is to show the absence of a break in intonation. The left parenthesis has been added to fulfil the grouping function; see Parentheses above.
  • Note also that the underscore is not necessary in this context and can be omitted unless a language actually distinguishes (for instance) /t_s/ from /ts/.
  • The separator (hyphen -) can be used between two segments that might be interpreted as an affricate, coarticulation or diphthong to show that they are in fact separate.
  • The retroflex marker ` is not really a diacritic but an integral part of the symbol t`, d` or whatever. Any diacritics must therefore come after the grave accent.
  • Similarly, although this is not quite so important, diacritics changing the place or manner of articulation of a sound should come before those indicating secondary modifications such as aspiration or palatalization.
  • Easily confused: the tilde ~ (with no underscore) is the diacritic for nasalisation; _n denotes nasal release and _; nasal escape.
  • Easily confused: _o is the diacritic for lowered; _O (capital o) denotes more rounding and _0 (zero) voicelessness.
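The first mistake in the list, an apostrophe used as a stress mark, is easy to check for mechanically. The sketch below rests on a heuristic that is an assumption, not a rule of Z-Sampa: since the apostrophe is the palatalization diacritic, it should always follow a symbol, so an apostrophe at the very start of a string or straight after a break, space or opening bracket is probably a misplaced stress mark. The name find_suspect_stress_marks is hypothetical.

```python
import re

# Heuristic: an apostrophe at the start of the text, or directly after
# a syllable break (.), word break (#), whitespace, [ or /, is suspect.
SUSPECT_APOSTROPHE = re.compile(r"(?:^|(?<=[.#\s\[/]))'")


def find_suspect_stress_marks(text):
    """Return the positions of apostrophes that look like stress marks."""
    return [match.start() for match in SUSPECT_APOSTROPHE.finditer(text)]


print(find_suspect_stress_marks("'sIl.@.b5="))
print(find_suspect_stress_marks('"sIl.@.b5='))
```

An apostrophe that follows a symbol, as in t'a, is treated as palatalization and is not reported.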