Features in next version of Parley

Andreas Xavier andxav at zoho.com
Sun Aug 31 16:01:36 UTC 2014


In this response, I will describe the method in greater detail and then address your specific
questions in the body of your email.

The method was developed as throwaway code to determine what data is needed in the
file format to do useful estimation and scheduling. It does re-implement some existing
Parley functionality. It fixes Leitner's implicit estimation problem with short training
intervals, late training, and extra training.

The system is a series of independent stages that run in sequence. The stages are the
estimator, the scheduler, the game/method selector, the game/method, and training data storage.
It is encapsulated well enough that you can swap out stages, game/methods, or grammar.
The encapsulation depends on two things: the trainable item and the game/method.

A trainable item is a word/phrase/picture/etc. from the database that is a target of some
game/method. Trainable items have training data stored. Native language words
and mnemonics are not trainable items. A whole set of inflections is a trainable item.
The individual conjugations might also be trainable if we have a game/method that
does individual conjugations. Individual gendered words are trainable, but the set of
all gendered words probably isn't. The individual active games decide what is trainable.

Training data is a list of events, and a new entry is added at each training event. Each entry
is either of the form (trainable-id, type (listen/read/speak/write), datetime, success/failure), or,
for a user self-assessment, (trainable-id, type (listen/read/speak/write), datetime, self-assessment).
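
To make the two record shapes concrete, here is a minimal Python sketch. The class and field
names, the string ids, and the integer self-assessment scale are illustrative assumptions, not
Parley's actual file format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Mode(Enum):
    LISTEN = "listen"
    READ = "read"
    SPEAK = "speak"
    WRITE = "write"


@dataclass(frozen=True)
class TrainingEvent:
    """One entry in the training data (hypothetical layout)."""
    trainable_id: str
    mode: Mode
    when: datetime
    success: Optional[bool] = None         # set for scored events
    self_assessment: Optional[int] = None  # set for self-assessed events; scale is assumed

    def __post_init__(self):
        # An event carries either a success/failure flag or a self-assessment, never both.
        if (self.success is None) == (self.self_assessment is None):
            raise ValueError("set exactly one of success / self_assessment")


history = [
    TrainingEvent("verb:gehen:present", Mode.WRITE,
                  datetime(2014, 8, 31, 16, 0, tzinfo=timezone.utc), success=True),
    TrainingEvent("verb:gehen:present", Mode.SPEAK,
                  datetime(2014, 8, 31, 16, 5, tzinfo=timezone.utc), self_assessment=3),
]
```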

A key part of the encapsulation is the game/method. Each game/method object does two things.
It runs one test of the user. It also answers the question, "Can this game/method train this item?"
The game/method is the only part that understands whether a trainable item expands into 6 conjugations,
3 antonyms, a prepositional phrase or something else. The game/method is also the only
part that knows what mode of speech it tests and what gui front end is
used (flash card/multiple choice/written word, etc.).
All of the other stages just treat trainable items as opaque.
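
The two responsibilities of a game/method could be sketched as an interface like the one below.
Everything here is hypothetical: the names GameMethod, can_train and run, the `ask` callback that
stands in for the gui front end, and the flash card example are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timezone


class GameMethod(ABC):
    """Hypothetical interface: the two things a game/method object does."""

    mode = "read"  # which of listen/read/speak/write this game tests

    @abstractmethod
    def can_train(self, item):
        """Answer the question: can this game/method train this item?"""

    @abstractmethod
    def run(self, item, ask):
        """Run one test and return (item, mode, datetime, success) tuples."""


class FlashCard(GameMethod):
    mode = "read"

    def __init__(self, cards):
        self.cards = cards  # maps trainable item -> expected answer

    def can_train(self, item):
        return item in self.cards

    def run(self, item, ask):
        # `ask` stands in for the gui: it shows the prompt and returns the reply.
        reply = ask(item)
        success = reply.strip().lower() == self.cards[item].lower()
        return [(item, self.mode, datetime.now(timezone.utc), success)]


game = FlashCard({"der Hund": "the dog"})
results = game.run("der Hund", ask=lambda prompt: "The dog")
```

The other stages only ever call these two methods, which is what lets them treat the item as opaque.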

Here is a more detailed explanation of the pipeline.

Estimator
input: training data
output: list of (trainable item , estimated time constant)

For each trainable item, this estimates the current time constant from the training data.
You could plug in the Leitner method here, except that it unduly rewards early practice. This
method has a roll off for short training intervals.
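
The exact estimation formula is not given in this mail, so the following is only a sketch of what
a roll off for short intervals could look like. The update rule, the 4 hour starting value, and the
halving on failure are all assumptions made for illustration. The idea is that a success earns
credit in proportion to how much of the current time constant had elapsed, so rapid repetition
barely moves the estimate. Late training is not modelled here.

```python
from datetime import datetime, timedelta


def estimate_time_constant(events, initial=timedelta(hours=4)):
    """events: chronological list of (datetime, success) for one trainable item.

    Hypothetical update rule, not the formula used in the code described above.
    """
    tau = initial
    previous = None
    for when, success in events:
        if previous is not None:
            gap = when - previous
            if success:
                # Roll off: a success after a short gap earns only partial credit.
                credit = min(1.0, gap / tau)
                tau = tau * (1.0 + credit)
            else:
                tau = max(initial, tau / 2)
        previous = when
    return tau


t0 = datetime(2014, 8, 31, 8, 0)
spaced = [(t0, True), (t0 + timedelta(hours=4), True)]
crammed = [(t0, True), (t0 + timedelta(minutes=1), True)]
spaced_tau = estimate_time_constant(spaced)
crammed_tau = estimate_time_constant(crammed)
```

With these assumed numbers, the spaced success doubles the time constant while the success one
minute later leaves it close to its starting value.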

Scheduler
input: list of (trainable item , estimated time constant)
output: list of trainable items to be trained now

This is simple. It looks at the number of times that the user wants to train, the
number of trainable items pending and any new untrained items. It makes a list.
It could be scheduling individual items in continuous mode, or a block of items.
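
A block-mode scheduler along these lines might look like the sketch below. The definition of
"pending" (time since last training is at least the time constant), the extra `last_trained`
argument, and filling leftover slots with new items are assumptions. The preference for longer
time constants when there is not enough practice time follows the answer to the question about
missed training days.

```python
from datetime import datetime, timedelta


def schedule(estimates, last_trained, now, session_size, new_items=()):
    """Return the trainable items to train now (hypothetical block-mode scheduler).

    estimates:    list of (item, time constant as timedelta)
    last_trained: dict mapping item -> datetime of its last training event
    """
    pending = [(tau, item) for item, tau in estimates
               if now - last_trained[item] >= tau]
    # Longer time constants first: they clear more time in the future schedule.
    pending.sort(key=lambda pair: pair[0], reverse=True)
    chosen = [item for _, item in pending[:session_size]]
    # Any remaining slots go to new, untrained items.
    chosen.extend(list(new_items)[:session_size - len(chosen)])
    return chosen


now = datetime(2014, 8, 31)
estimates = [("a", timedelta(days=1)), ("b", timedelta(days=7)), ("c", timedelta(days=3))]
last_trained = {"a": datetime(2014, 8, 30), "b": datetime(2014, 8, 20), "c": datetime(2014, 8, 30)}
full_session = schedule(estimates, last_trained, now, session_size=3, new_items=["d", "e"])
short_session = schedule(estimates, last_trained, now, session_size=1, new_items=["d", "e"])
```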

Game Selector
input: list of trainable items
output: list of (game, trainable item)

For each trainable item, this asks all of the active game/methods, "Can you train this item?".
It then chooses a game for that item from the list of game/methods that can train it.
An active game/method is one that is registered and that the student or the lesson has
selected.
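
The selection step can be sketched in a few lines. The KindGame class and the "kind:word" item ids
are stand-ins invented for this example, and picking the first capable game is an assumed policy.
A real selector could just as well pick at random or by user preference.

```python
class KindGame:
    """Stand-in game/method that can train items of one kind (hypothetical)."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind

    def can_train(self, item):
        return item.startswith(self.kind + ":")


def select_games(items, active_games):
    """Return a list of (game, trainable item) pairs."""
    pairs = []
    for item in items:
        # Ask every active game/method: "Can you train this item?"
        capable = [game for game in active_games if game.can_train(item)]
        if capable:
            pairs.append((capable[0], item))  # assumed policy: first capable game
    return pairs


conjugation = KindGame("conjugation", "verb")
flash_card = KindGame("flash card", "noun")
pairs = select_games(["verb:gehen", "noun:Hund", "picture:tree"],
                     [conjugation, flash_card])
chosen = [(game.name, item) for game, item in pairs]
```

In this sketch an item that no active game can train is simply left out of the result.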

Game/Method
input: one trainable item, one user/student, database
output: list of (trainable item, type (listen/read/speak/write), datetime, success/failure)

The game/method pulls any additional items it needs from the database, chooses
a gui, runs the training with the user and then returns a list of all of the items trained
and their results.

For example, a conjugation trainer might receive one trainable item (an infinitive, present tense).
It tests 6 present tense conjugations. The student gets 5 right and 1 wrong, so the
game/method returns 7 results: 5 right conjugations, 1 wrong conjugation, and the
original trainable item marked wrong.
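
The bookkeeping in this example can be written out as follows. The item ids, the German verb, and
the function name are made up for illustration, and the mode and datetime fields are left out to
keep the sketch short.

```python
def run_conjugation_game(item, forms, answers):
    """Score each conjugation, then the parent item (hypothetical helper).

    forms:   dict mapping person -> correct conjugation
    answers: dict mapping person -> what the student typed
    Returns a list of (trainable id, success) pairs.
    """
    results = [(item + ":" + person, answers.get(person) == expected)
               for person, expected in forms.items()]
    # The original item only counts as right if every conjugation was right.
    results.append((item, all(ok for _, ok in results)))
    return results


forms = {"ich": "gehe", "du": "gehst", "er": "geht",
         "wir": "gehen", "ihr": "geht", "sie": "gehen"}
answers = dict(forms, ihr="gehen")  # the student gets one form wrong
results = run_conjugation_game("gehen:present", forms, answers)
right = sum(1 for _, ok in results if ok)
```

Here `results` has 7 entries: 5 right conjugations, 1 wrong conjugation, and the original item
marked wrong.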

Data Storage
input: list of (trainable item, type (listen/read/speak/write), datetime, success/failure)
output: training data

how: Save to database.

> There are some things I don't understand in your description below. For instance how you calculate the optimal intervals for an individual? Or do you?

The optimal interval is per user, per word and per mode. This version doesn't calculate the optimal intervals,
but I think there will be enough information in the data to do a better job by plugging in a better scheduler later.
The optimal interval also depends on many things: whether the student's goal is to learn quickly no matter how much
time is spent studying or to learn efficiently, and how cognizant they are that incorrect answers reinforced
just before you forget them are more effective than regular, boring, correct sameness.

> How is the algorithm affected if you don't train for some days even though it is scheduled?

The only stage affected by the amount of practice is the scheduler. If the user doesn't have time to
practice all pending words, the scheduler prefers the words with longer time constants
because they clear more time in the future schedule. The result is that if a user consistently
under-practices, they will first achieve mastery of a small subset from the beginning of the
lessons. Eventually, if they are understudying by only a small amount, they will master the
whole file.

>How is it handled when you introduce new words in a collection that is already trained some time? Etc.

If new words show up in the collection, the scheduler picks them up as new untrained items, the
game selector asks the game/methods, "Can you train this word?", and training proceeds from there.

Cheers Andreas
