INSTRUCTIONS FOR QAVAID Version 0 (in French)
Dated 3/2/93

1. What files contain

QAVAID reads and writes text files that may be created and edited with Word 4, using the "save as pure text" option. QAVAID does not check file types, so it is up to you to use names that distinguish the three file types.

1.1 Data files

They contain examples to be processed by QAVAID. Examples are separated by numbering them between square brackets, not by starting a new paragraph. For instance, if you type:

dha dha tr kt dha
tr kt dha dha

QAVAID will read one single example: "dha dha tr kt dha tr kt dha dha"

So you should type:

[1] dha dha tr kt dha
[2] tr kt dha dha

You may type examples either in spaced format, "dha dha tr kt dha", or in compact format, "dhadhatrktdha". In spaced format, spaces or line feeds delimit bols, and QAVAID automatically records that "dha", "tr", "kt" are the basic alphabet. In compact format, QAVAID must have been instructed beforehand how to make sense of the input sentence. This is done by keyboard encoding. In Bol Processor BP1 it was possible to map every key of the keyboard to a particular bol. You can do the same with QAVAID. The "STANDARD KEYBOARD" that was eventually defined for Bol Processor experiments on tabla is stored on the disk.

Unfortunately, in the present version, keyboard encoding is awkward to use for typing items: the display you get is the list of keys, not the string of bols. So keyboard encoding is mostly used to read data files that have been typed without spaces. Reading those files is rather slow, but it has the advantage that QAVAID checks that all bols belong to the alphabet predefined by keyboard encoding. Without keyboard encoding, if you type "dhha", QAVAID will not warn you that it is creating a new bol.

1.2 Grammar files

These files must not be edited. They contain Prolog clauses understood by the Prolog interpreter. They are created by QAVAID when processing examples. They store intermediate results.
You may reload them to process more examples.

Grammar files may also contain a number of examples and negative examples ("contrexemples") waiting to be processed. You will often be asked whether you want to process these or delete them.

A lot of additional knowledge is stored in grammar files: for instance, your answers to QAVAID's questions regarding segmentation, recognized "words", the display mode, possible generalisations, etc.

1.3 Keyboard encoding

These files also must not be edited. They contain the keyboard encoding described in §1.1.

2. How to start, how to stop

Use QAVAID under System 6.0.7 (or below) without MultiFinder. It works perfectly on a 2-Mbyte Macintosh, although 4 Mbytes might allow more storage. You need Prolog II, version MPW 2.4, which was distributed by PrologIA Co (Marseille) and is no longer a commercial product.

I plan to use Prolog II+, a compiled version of Prolog, which should be able to run exactly the same software at a much higher speed.

Your folder should contain Prolog II, the "initial" file, "LEARN" and "LEARN.SUITE". To start, double-click "LEARN". The current date will be displayed and later you get the Prolog prompt ">" with a flashing vertical bar. This takes a long time because Prolog first needs to load "LEARN" and "LEARN.SUITE". At this point you may either enter Prolog instructions (understood by the Prolog shell or the current worlds) or select a menu item with the mouse. The rightmost menu, "Exec", is the specific one you will use. The only Prolog instruction you will use is:

> quit;

to exit the program. Note that a Prolog instruction always ends with a semicolon.

If you type Command-q, you'll see a few buttons asking you about "Sauvegarde" [saving]. Normally select "Aucune" [none]. ("Etat Binaire" would save the binary state of the program, which makes it possible to reload the current situation.)

Apart from the "Exec" menu you may use the "Stop" instruction, or Command-s, to stop any process. Of course, results are unpredictable then.
It is advised to reinitialize QAVAID.

Reinitializing QAVAID: select "initialisation" in the "Exec" menu. It may ask you "Effacer les donnees en memoire? O/N" [Erase data in memory?]; just type "o".

Note: you may type either "o" or "y" to mean "Yes". Don't confuse "o" with zero. Many menus expect you to type a number, and in those cases you type the digit zero.

3. Analyzing data from a disk file

3.1 Creating or upgrading the basic grammar

Start QAVAID, do "initialisation", then select "phrases". You get the menu:

(0) Quitter
(1) Saisie de phrases au clavier [Entering items on keyboard]
(2) Saisie d'un fichier sur disque [Entering items from disk]
Choix: [type "2"]
Codes clavier: INACTIFS [Means you're not using keyboard encoding]
OK? O/N [Y]

Now you are prompted to find an input data file. Choose for instance "da.demo.12/6/92". QAVAID reads and displays the items:

[1] tr kt dha ti dha dha...
[2] ...
...
[10] ...

then it says:

10 exemples
0 contrexemples
(0) Quitter
(1) Completer le noyau [Update basic grammar]
(2) Sauvegarder les exemples [Save examples]
(7) Revoir un par un les exemples et contrexemples [Review examples and negative examples one by one]
(8) Effacer [Erase all entered data]
[The reason why it jumps from (2) to (7) is that this menu is not shown entirely here.]
Choix: [type "1"]
Mode ('chunk' or 'chop'): [Type: chunk]

[In 'chunk' mode, QAVAID does not accept words shorter than two bols, except for the predefined symbol "-". In 'chop' mode it does not check word length. 'Chop' mode may be used in all cases, but it will force you to answer more questions, as QAVAID may propose a segmentation containing one-unit words.]

Mise a jour de la grammaire [Updating current grammar. Incidentally, current grammar was empty here.]

Now you get the familiar QAVAID questioning. Each item is analyzed and you are prompted to answer questions regarding segmentation and specific paths ("chemin specifique").
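The word-length rule that separates the two modes can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not QAVAID's Prolog code: the function names are hypothetical, and the sketch assumes a segmentation is written as bols separated by spaces, with slashes between words.

```python
def parse_segmentation(text):
    """Turn 'tr kt / dha ti' into [['tr', 'kt'], ['dha', 'ti']].

    Assumed notation: bols separated by spaces, words separated by slashes.
    """
    return [part.split() for part in text.split("/")]


def accepted(words, mode):
    """Apply the word-length check of the given mode to a list of words."""
    if mode == "chop":
        # 'chop' mode does not check word length.
        return True
    if mode == "chunk":
        # 'chunk' mode needs at least two bols per word,
        # except for the predefined symbol "-".
        return all(len(word) >= 2 or word == ["-"] for word in words)
    raise ValueError("mode must be 'chunk' or 'chop'")


if __name__ == "__main__":
    proposal = parse_segmentation("tr kt / dha")
    print(accepted(proposal, "chunk"))  # one-bol word "dha" is refused
    print(accepted(proposal, "chop"))   # no length check in 'chop' mode
```

Under this reading, 'chop' mode never rejects a proposal on length grounds, which is why it leads to more segmentation questions.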
In a segmentation question you just see the item with a slash marking the proposed segmentation, and you get the menu:

O)ui N)on E)ssayer R)etour sur choix precedent

"o" or "y" validates the segmentation. This decision is permanent, and QAVAID will store all the information it can deduce from your statement. "n" prompts QAVAID to propose another segmentation. "E" means a provisional "yes". "R" asks QAVAID to backtrack on preceding choices. For instance, if you answered "E" to a preceding question, QAVAID first tries "yes", but on backtracking it will try "no", and this last choice will be definitive.

"Chemin specifique" [specific path] is indicated between two slashes. This is the part of the automaton which QAVAID is building specifically for the current item. In fact, any item contains three sections: (1) a prefix that is already recognised by the automaton (because other items started identically), (2) the specific path, (3) a suffix that is already known by the automaton (because other items ended in the same way). It seems to me that the specific path is more or less the new information that the musician is presenting in the item.

When the dialog stops, QAVAID has recorded a grammar that recognises exactly the sample set. It is wise to save it (along with the sample set). Select "sauvegarde".

If you want to continue with other items, you may restart the "phrases" process with another data file. Of course, if some examples are the same as ones that have already been processed, QAVAID will say something like "deja connu" [already known] and ignore them.

3.2 Displaying the grammar and producing items

Select "affichage-grammaire".

Select "production". You may get:

Traiter les exemples et contrexemples? O/N [Process examples and negative examples?]

This would mean that you have examples and/or negative examples in the input buffer, and QAVAID proposes to process them or delete them before it starts producing new items.
The reason is that when producing items you'll have the option to store those items as positive or negative examples in order to modify your grammar accordingly. Generally the answer to that question is "n". You get:

(0) Quitter
(1) Dans l'ordre [Ordered production]
(2) Au hasard [Random production]

If you select (1), QAVAID will produce the entire set that the automaton recognizes, in some order that is fixed, although generally it is not the order in which items were entered. At this point you get exactly 10 items because the grammar has not yet been generalized. If you select (2), QAVAID will produce items in a random order (possibly the same one several times).

Now you may get the question "Oublier les exemples recemment entres? O/N" [Forget recently entered examples?] if you still have examples in the input buffer. If you plan to declare some productions as "negative examples", you should answer "n". The reason is that when QAVAID uses a negative example to update the grammar it must eventually recheck the entire set of positive examples. This is why sample sets are stored along with grammars.

Then you get the question:

Montrer segmentation? O/N [Show segmentation?]

Here you may answer "y". Segmentation will be marked by slashes. The display will show that the segmentation is still rather incomplete: you will immediately find that some new splits would be possible on the basis of the known vocabulary. See §3.4.

For each item produced, QAVAID will prompt you to say whether it is correct. If you say "n" it will be placed in the "contrexemples" buffer. Later you can update the grammar on the basis of all stored negative examples, by answering "y" to the question:

Modifier le noyau a l'aide des contrexemples ? O/N [Modify the basic grammar using the negative examples?]

3.3 Changing the display

You may want a Bol Processor 1 type display, with indications of laya and sections. Select "format-affichage". You get:

(0) Quitter
(1) Laya
(2) Divisions

First select (1) and enter the bol density, e.g. "4".

Then select (2) and enter divisions as a list, e.g.
2,3,1

Then select (0).

The division list indicates the number of matras on each line of the display.

3.4 Segmentation

Select "segmentation". If segmentation has already been done you may undo it by selecting "desegmentation" instead.

3.5 Generalisation

Check this feature on a grammar that has already been saved. Select "initialisation" then "reprise" and load "gr.dtdgndtrkt.1v.J." This is almost the grammar described in our CHum paper. It has been segmented.

Select "generalisation". QAVAID first looks for possible generalisations in the current grammar. This may take a long time, so the list of possible generalisations will also be saved with the grammar. Then you get the question:

Induction a partir de la derniere phrase entree? O/N

which asks whether you want to consider the latest entered item as a basis for generalisation. I do not yet remember exactly what this does. If you answer "y" then this particular type of generalisation will be appended to the other types, provided your current file remembers which item was entered last. So the answer does not really matter. Now you get:

(0) Quitter
(1) Generer des variations au hasard
(2) Permutations
(3) Substitutions
(4) Fusion evidente
(5) Fusion generale
[(6) Induction a partir de derniere phrase entree]

The normal process here is (1) [random production] because it allows the musician to assess a great variety of productions. Other processes will display all new variations in order, so each one may be close to the next. Whatever process is chosen, QAVAID will proceed as follows: given a possible generalisation (which may be chosen randomly in procedure (1)), QAVAID produces a new variation that resembles a known one. Then it displays both and asks you to assess the new one. New variations are thus stored in a special buffer, either as positive or negative instances. After some time you may have checked all possible new variations for a particular generalisation. Then QAVAID offers to validate the generalisation.
You may do so even if there are a few negative examples: those "contrexemples" will be used to update the grammar once it has been generalised. Be aware, however, that processing negative examples is very slow because all positive examples must be rechecked afterwards. Also, processing negative examples may produce ambiguity (see §4.4). Therefore, only validate those generalisations which yielded very few negative examples. Another possibility that is offered to you is not to validate the generalisation but to use all the variations it produced as positive examples to update the grammar. This is safe if there are many negative instances and you really produced all possible variations.

Note that you are offered the option to validate a generalisation at any time, even if you have not assessed all the variations it produces. This is a risk you take. If later your grammar produces bad variations you will need to process them as negative examples.

The order in which generalisations are listed in the above menu reflects increasing generalisation power. "Fusion" (the merging of two states on the automaton) is likely to produce large sets of new variations. Option (4) [Evident merge] is preferable because it examines merges that are most likely to be proper. (See QAVAID paper in CHum.) Permutations and substitutions are equally safe techniques. When choosing generalisations randomly QAVAID knows which generalisations are more likely and it will select them more often.

When you accept a generalisation, QAVAID modifies the grammar (by merging variables / states and constructing new paths) and rebuilds the set of possible generalisations. If you had examined other generalisations these may no longer be eligible. However, examples assessed good or bad while studying them will be stored in the input buffer.
Those no longer need to be assessed, which may shorten the assessment process, and in the end they may be processed to update the current grammar.

Suppose you've selected option (4). You get:

Couples d'expressions a examiner:
[1] ==> fusion de XG avec XE [merging XG with XE]
[2] ==> fusion de XH avec XE
[3] ==> fusion de XG avec XH

This may make sense if you have drawn the automaton. Choose which merge you want to study first. You get:

Valider immediatement? O/N [Validate immediately?]

If you are looking at the automaton it may be evident that the merge is correct, so answer "y". If not, answer "n" and get:

>> tidhatrktdha...
devient: [becomes]
>> tidha...
(0) Quitter (1) Accepter (2) Valider generalisation
(3) Contrexemple (4) Changer sens substitution

The first variation is one that is known by the grammar. You are asked to assess the second one, which is new. If you select (1) you accept it; (3) will reject it. Again, if you select (2) you immediately validate the generalisation (at your own risk). Option (4) produces another set of variations by changing the direction of substitutions. For instance, if you replaced "trkt" with "tidha", you will now try to replace "tidha" with "trkt".

Of course, you will not examine all variations in one single session. You may quit the process. You are then asked whether to validate the generalisation. If you say "n", you get:

Eliminer cette generalisation ? (les exemples valides seront conserves) O/N

Here you may say "y", meaning that you want to forget about that generalisation. Variations assessed correct are put in the input buffer and you get the message:

Modifier la grammaire pour accepter les exemples valides dans toutes les generalisations? O/N [Update grammar to accept examples assessed correct in all generalisations?]

It may be reasonable to do so if you plan to rethink generalisations, segmentation, etc.

Now, save by selecting "sauvegarde" and quit QAVAID. The next day you reload the file ("reprise") and select "generalisation".
When trying a particular generalisation, QAVAID will ask you:

Oublier toutes les phrases validees dans cette generalisation? O/N [Forget all items assessed correct in this generalization?]

Generally you answer "n" because you don't want to lose yesterday's results. But in the meantime you may have come to think that yesterday's informant was not reliable, and decide to throw away all decisions. In that case answer "y".

4. Other features

4.1 Vocabulary

Selecting "vocabulaire" displays the vocabulary of chunks of bols currently recognised by QAVAID.

4.2 Recognized splits

Selecting "cesure" displays the list of correct and incorrect splits inferred by QAVAID on the basis of your statements about segmentation.

4.3 Renaming variables

This permits you to change the first character of variables designating states on the automaton.

4.4 Ambiguity

If you have processed negative examples you may start this process to modify the grammar so that each item is produced in one single manner. This process may take a very long time, but it is entirely automatic.
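
As a closing illustration, the data-file conventions of §1.1 can be sketched in Python. This is not QAVAID's code: the function names are hypothetical, text before the first bracketed number is simply dropped, and compact format is decoded by greedy longest match against a predefined alphabet. That last choice is an assumption standing in for keyboard encoding, whose actual mechanism is not described here.

```python
import re


def split_examples(text):
    """Split a data file into examples using the [n] markers.

    Line breaks do not separate examples; without any marker the
    whole text is read as one single example.
    """
    parts = re.split(r"\[\d+\]", text)
    if len(parts) > 1:
        parts = parts[1:]  # assumption: ignore text before the first marker
    return [" ".join(part.split()) for part in parts if part.strip()]


def read_spaced(example):
    """Spaced format: spaces or line feeds delimit bols."""
    return example.split()


def read_compact(example, alphabet):
    """Compact format: decode against a predefined alphabet.

    Greedy longest match is an assumption of this sketch. An unknown
    bol raises ValueError instead of silently creating a new bol.
    """
    by_length = sorted(alphabet, key=len, reverse=True)
    bols, i = [], 0
    while i < len(example):
        for bol in by_length:
            if example.startswith(bol, i):
                bols.append(bol)
                i += len(bol)
                break
        else:
            raise ValueError(f"no known bol at position {i}: {example[i:]!r}")
    return bols


if __name__ == "__main__":
    print(split_examples("dha dha tr kt dha\ntr kt dha dha"))
    print(split_examples("[1] dha dha tr kt dha\n[2] tr kt dha dha"))
    print(read_compact("dhadhatrktdha", ["dha", "tr", "kt"]))
```

With the alphabet "dha", "tr", "kt", the mistyped "dhha" is rejected in compact format, whereas read_spaced would accept it as a new bol without warning. A greedy match can fail on alphabets where one bol is a prefix of another; a fuller version would backtrack.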