[Parley-devel] Features in next version of Parley

Inge Wallin inge at lysator.liu.se
Tue Sep 9 22:38:11 UTC 2014


Hi Andreas,

This is beautiful!  Did you come up with this yourself or did you read about 
it anywhere?

The description below was quite understandable but there are still a few 
things that I don't understand. See comments inside.

	-Inge

 
On Sunday, August 31, 2014 09:01:36 AM Andreas Xavier wrote:
> Hello,
> 
> In this response, I will describe the method in greater detail and then
> address your specific questions in the body of your email.
> 
> The method was developed as throw-away code to determine what data is
> needed in the file format to do useful estimation and scheduling. It does
> re-implement some existing Parley functionality. It fixes Leitner's
> implicit estimation problem with short training intervals, late training,
> and extra training.
> 
> The system is a series of stages that are independent and run in
> sequence. The stages are the estimator, scheduler, game/method selector,
> game/method and training data storage. It is encapsulated well enough
> that you can swap out stages, game/methods or grammar. The encapsulation
> depends on two things: the trainable item and the game/method.
> 
> A trainable item is a word/phrase/picture/etc. from the database that is
> a target of some game/method. Trainable items have training data stored.
> Native language words and mnemonics are not trainable items. A whole set
> of inflections is a trainable item. The individual conjugations might
> also be trainable if we have a game/method that does individual
> conjugations. Individual gendered words are trainable, but the set of all
> gendered words probably isn't. The individual active games decide what is
> trainable.
> 
> Training data is added at each training event and is a list of events. An
> event is either of the form (trainable-id, type (listen/read/speak/write),
> datetime, success/failure), or a user self-assessment (trainable-id, type
> (listen/read/speak/write), datetime, self-assessment).
> 
> A key part of the encapsulation is the game/method. Each game/method
> object does two things. It runs one test of the user, and it answers the
> question, "Can this game/method train this item?" The game/method is the
> only part that understands whether a trainable item expands into 6
> conjugations, 3 antonyms, a prepositional phrase or something else. The
> game/method is the only part that knows what mode of speech it tests and
> what gui front end is used (flash card/multiple choice/written word) etc.
> All of the other stages just treat trainable items as opaque.
> 
> Here is a more detailed explanation of the pipeline
> 
> Estimator
> input: training data
> output: list of (trainable item, estimated time constant)
> how:
> 
> For each trainable item this estimates the current time constant from the
> training data. You could plug in the Leitner method here, except that it
> unduly rewards early practice. This method has a roll-off for short
> training intervals.

This is the part that I have the most problems with. I assume that the 
"estimated time constant" is the point in time that is optimal for 
reinforcement of this particular item.  Right?

But *how* is this constant calculated? You say that you use the forgetting 
curve, but it's not at all obvious how this is done.
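To make the question concrete, here is a minimal sketch of one way such an 
estimator could work, assuming an exponential forgetting curve 
R(t) = exp(-t / tau). The event format and the GROW/SHRINK factors are 
invented for illustration; this is not a claim about your actual code.

```python
from datetime import datetime, timedelta

# Sketch only: estimate a per-item "time constant" tau for an assumed
# exponential forgetting curve R(t) = exp(-t / tau).
GROW = 2.0    # how much tau may grow after a successful recall
SHRINK = 0.5  # how much tau shrinks after a failure

def estimate_tau(events, initial_tau_hours=4.0):
    """events: list of (datetime, success) pairs, oldest first."""
    tau = initial_tau_hours
    prev_time = None
    for when, success in events:
        if prev_time is not None:
            interval = (when - prev_time).total_seconds() / 3600.0
            if success:
                # Roll-off for short intervals: recalling an item very
                # soon after the last practice proves little, so the
                # reward is capped by the actual elapsed interval.
                tau = min(tau * GROW, max(tau, interval * GROW))
            else:
                tau *= SHRINK
        prev_time = when
    return tau
```

With a rule like this, practicing an item again after only a few minutes 
leaves tau essentially unchanged, which would be one way to get the 
roll-off for short training intervals you mention.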

> Scheduler
> input: list of (trainable item, estimated time constant)
> output: list of trainable items to be trained now
> how:
> 
> This is simple. It looks at the number of times that the user wants to
> train, the number of trainable items pending and any new untrained items.
> It makes a list. It could be scheduling individual items in continuous
> mode, or a block of items.

I assume that the strategy for picking the things to train now is fixed, 
i.e. that the list of trainable items is sorted in some way? But how is 
this done, i.e. how is the numerical priority calculated from the 
estimated time constant per item and the current time? You give some 
clues below but those are not enough for me to understand the details.
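For instance, one simple rule I could imagine, purely my assumption: an 
item is due once the time since its last practice exceeds its estimated 
time constant, and overdue items are ranked by how overdue they are.

```python
# Hypothetical priority rule: rank due items by elapsed / tau.
def schedule(items, budget):
    """items: list of (trainable_id, hours_since_practice, tau_hours).
    Returns at most `budget` ids, most overdue first."""
    due = [(elapsed / tau, item_id)
           for item_id, elapsed, tau in items
           if elapsed >= tau]       # skip items that are not yet due
    due.sort(reverse=True)          # largest overdue ratio first
    return [item_id for _, item_id in due[:budget]]
```

Is it something along these lines, or does the priority use more of the 
training history?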

> Game Selector
> input: list of trainable items
> output: list of (game, trainable item)
> how:
> 
> For each trainable item this asks all of the active game/methods, "Can
> you train this item?". It then chooses a game for that item from the list
> of game/methods that can train it. An active game/method is one that is
> registered and that the student or the lesson has selected.

Shouldn't this also take into account the type of training that the user 
wants to do (read/listen/speak/write)? 

In the future Parley could have checkboxes for (at least) those 4 types and 
we could schedule sessions with, say, writing using the written answer 
widget, reading using the flashcard widget and speaking using some method 
from Artikulate.
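The selector stage as I understand it could be sketched like this; the 
Game classes and the can_train() interface are hypothetical, they only 
illustrate the "Can you train this item?" hand-shake:

```python
class FlashcardGame:
    def can_train(self, item):
        return True                      # flashcards can show any item

class ConjugationGame:
    def can_train(self, item):
        return item.get("is_verb", False)

def select_games(items, active_games):
    """Pair each trainable item with one active game that accepts it."""
    pairs = []
    for item in items:
        candidates = [g for g in active_games if g.can_train(item)]
        if candidates:
            # Take the first willing game; a real selector might
            # rotate or randomize among the candidates.
            pairs.append((candidates[0], item))
    return pairs
```

A type filter (read/listen/speak/write) would then just be one more 
question in the same hand-shake.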

> Game/Method

You keep mentioning game...  I suppose you have something in mind 
here? Do you have any specific gamifications that you are thinking of?

> input: one trainable item, one user/student, database
> output: list of (trainable item, type (listen/read/speak/write), datetime,
> success/failure)
> how:
> 
> The game/method pulls any additional items it needs from the database,
> chooses a gui, runs the training with the user and then returns a list of
> all of the items trained and their results.
> 
> For example, a conjugation trainer might receive one trainable item (an
> infinitive, present tense). It tests 6 present tense conjugations. The
> student gets 5 right and 1 wrong, so the game/method returns 7 trainable
> items: 5 right and 1 wrong conjugation, plus the original trainable item
> marked wrong.

Hmm, why return the original marked wrong?
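If I read the example right, the returned event list could be built like 
this; the ids and the "write" type are invented, and the rule "the parent 
counts as a success only when every conjugation was right" is my reading 
of your description:

```python
from datetime import datetime

def conjugation_results(infinitive_id, conj_results, now):
    """conj_results: list of (conjugation_id, success).  Returns one
    event per conjugation plus one event for the parent item."""
    events = [(cid, "write", now, ok) for cid, ok in conj_results]
    parent_ok = all(ok for _, ok in conj_results)
    events.append((infinitive_id, "write", now, parent_ok))
    return events
```

That would explain marking the original wrong, but please correct me if 
the parent's result is computed differently.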

> Data Storage
> input: list of (trainable item, type (listen/read/speak/write), datetime,
> success/failure)
> output: training data
> 
> how: Save to database.

This is going to become massive over time!  We need a real database here. 
Luckily this work has already been started with Amarvir's GSoC work.

> > There are some things I don't understand in your description below. For
> > instance, how do you calculate the optimal intervals for an individual?
> > Or do you?
> The optimal interval is per user, per word and per mode. This version
> doesn't calculate the optimal intervals, but I think there will be enough
> information in the data to do a better job by plugging in a better
> scheduler later. The optimal interval also depends on many things:
> whether the student's goals are to learn quickly no matter how much time
> is spent studying, or to learn efficiently; and how cognizant they are
> that incorrect answers reinforced just before you forget them are more
> effective than regular correct boring sameness.

Haha, I think we have a convincing problem here. :)  I myself would be 
hesitant to use a method where I get lots of answers wrong if I didn't trust 
it a lot.

But more importantly, you still haven't told me how the optimal interval is 
calculated.

> > How is the algorithm affected if you don't train for some days even
> > though it is scheduled?
> The only stage affected by the amount of practice is the scheduler. If
> the user doesn't have time to practice all pending words, the scheduler
> prefers the longer time constant words because they clear more time in
> the future schedule. The result is that if a user consistently
> under-practices, they will first achieve mastery of a small subset from
> the beginning of the lessons. Eventually, if they are understudying by
> just a small amount, they will master the whole file.

This is very good.  Just the way you would want it. (And incidentally also 
the way that Parley currently works.)

> >How is it handled when you introduce new words in a collection that is
> >already trained some time? Etc.
> If new words show up in the collection, then the scheduler will ask the
> game/method, "Can you train this word?" and proceed from there.
> 
> Cheers Andreas
> 
> 
> _______________________________________________
> Parley-devel mailing list
> Parley-devel at kde.org
> https://mail.kde.org/mailman/listinfo/parley-devel

