Previous tests used a collection of input files, together with a suite of programs to do word-based analysis.
There are several problems with this approach.
So I went back to the drawing board. Earlier this year I developed the concept of chained bigrams, which lets us create input texts with the correct character and bigram frequencies. For the purposes of this exercise, I created three input texts:
Each layout was evaluated on KLAnext, my fork of Patrick Gillespie's Keyboard Layout Analyzer. My version fixes various bugs and has a modified scoring system.
I then calculate each layout's score relative to the best score on each test, as a percentage.
Then the weighted average for "Fingers" is ((4 × English) + Code + Proglish) / 6.
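The two steps above (relative scores, then the weighted "Fingers" average) can be sketched in Python. This is only an illustration, not KLAnext's code: the layout names and raw scores are made up, and it assumes that a higher raw score is better, so "best" means "highest".

```python
def relative_scores(raw):
    """Convert raw scores to percentages of the best score.

    Assumption: the best score is the highest one.
    """
    best = max(raw.values())
    return {layout: 100.0 * score / best for layout, score in raw.items()}


def fingers_score(english, code, proglish):
    """Weighted average from the text: English counts four times."""
    return (4 * english + code + proglish) / 6


# Hypothetical raw scores for two made-up layouts on each test.
raw_by_test = {
    "English": {"LayoutA": 60.0, "LayoutB": 48.0},
    "Code": {"LayoutA": 45.0, "LayoutB": 50.0},
    "Proglish": {"LayoutA": 52.0, "LayoutB": 52.0},
}

# Percentage of the best score, computed separately for each test.
relative = {test: relative_scores(raw) for test, raw in raw_by_test.items()}

# Combine the three relative scores into one "Fingers" score per layout.
fingers = {
    layout: fingers_score(
        relative["English"][layout],
        relative["Code"][layout],
        relative["Proglish"][layout],
    )
    for layout in raw_by_test["English"]
}

for layout, score in fingers.items():
    print(f"{layout}: {score:.2f}")
```

Because English carries four of the six parts, a layout that wins only the Code test still ends up well behind one that wins the English test.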
For the Words tests, I previously took an average of these tests:
For this cycle, I initially took out some of those and replaced them with
Further testing and thought eventually led to two changes:
The Linux word list contains a great many words, many of which you are unlikely to type on a regular basis (e.g. extraterritorialities, acoustoelectrically, acetylsalicylates, hippopotomonstrosesquipedalian, pneumonoultramicroscopicsilicovolcanoconiosis, Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch). So using common words is a more realistic approach.
For the word tests, I weight each word by its number of unique letters rather than simply counting words. Counting words treats long words the same as short ones, while counting every letter lets "kindnesses" score higher than "kindness", even though both use the same set of letters.
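A small Python sketch shows how the three ways of counting differ. The `unique_letters` helper is a hypothetical name for illustration, and ignoring case and non-letter characters is my assumption rather than something the tests are known to do.

```python
def unique_letters(word):
    """Number of distinct letters in a word (case-insensitive, letters only)."""
    return len({ch for ch in word.lower() if ch.isalpha()})


words = ["kindness", "kindnesses"]

for word in words:
    # Word count gives every word 1; letter count grows with repeats;
    # the unique-letter count depends only on which letters are used.
    print(word, 1, len(word), unique_letters(word))
```

Both words get the same unique-letter weight, while a plain letter count would give "kindnesses" two extra points for repeating letters it already uses.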
I removed the Home Key Words test (popularised by Maltron) because layouts with home keys on the thumb have an unfair advantage. The test also penalises layouts that keep O, U and/or H off the home keys because their designers judged other factors to be more important.
Each layout's score is then converted to a percentage of the highest score, as its "Words" score.
Finally, the overall average is ((2 × Fingers) + Words) / 3.
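As a quick worked example of the last two steps, here is a Python sketch with made-up numbers. The layout names and scores are hypothetical, and the function names are mine.

```python
def percent_of_highest(scores):
    """Express each score as a percentage of the highest score."""
    highest = max(scores.values())
    return {layout: 100.0 * score / highest for layout, score in scores.items()}


def overall_score(fingers, words):
    """Overall average from the text: Fingers counts twice, Words once."""
    return (2 * fingers + words) / 3


# Hypothetical averaged word-test scores and "Fingers" scores.
word_averages = {"LayoutA": 40.0, "LayoutB": 50.0}
fingers = {"LayoutA": 98.0, "LayoutB": 86.0}

words = percent_of_highest(word_averages)
overall = {name: overall_score(fingers[name], words[name]) for name in fingers}

for name, score in overall.items():
    print(f"{name}: {score:.2f}")
```

With these numbers LayoutA still comes out ahead overall despite its lower Words score, because Fingers carries twice the weight.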