author yonatanzunger <30514250+yonatanzunger@users.noreply.github.com> 2021-01-11 01:21:44 -0800
committer GitHub <noreply@github.com> 2021-01-11 01:21:44 -0800
commit 554b937d21f8c50515c22498f4f46df0b3ae6569
tree ed202d5c70dc0c0d538e0bb4e9c2f0fd3ec3804c
parent b113888ec55e456ffcff2d6b04ad29309d01b325
[Keymap] Redo the accent implementation in melody96:zunger. (#11000)
The previous implementation generated accents in NFKD -- e.g., i followed by fn+e would generate í, which is actually an ordinary i followed by U+0301 COMBINING ACUTE ACCENT.

Unfortunately, it turns out that a bunch of websites and apps (especially European ones written in languages that use these a lot) were very poorly written, and will misparse and/or crash if presented with Unicode NFKD. They require and expect NFKC, with characters like í (U+00ED LATIN SMALL I WITH ACUTE) that look visually identical -- and are in fact normalization-equivalent -- but have to be encoded differently.

The new accent implementation handles this in a very flexible way.

Many new comments added as well, as it's also clear that this is going to need a bit more expansion before it becomes a true polyglot keymap.

Co-authored-by: Yonatan Zunger <zunger@desiderata.lan>
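To make the encoding difference concrete, here is a minimal host-side sketch (not part of the keymap) that encodes both spellings of í as UTF-8 and compares them byte for byte. The helper names `utf8_encode_bmp`, `utf8_encode_seq`, and `naive_equal` are hypothetical, introduced only for this illustration. For these particular characters the canonical forms (NFC/NFD) coincide with the compatibility forms (NFKC/NFKD) named in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one BMP code point (U+0000..U+FFFF) as UTF-8. Returns the number of
 * bytes written to out, which must have room for at least 3 bytes.
 * Sketch only: surrogate code points are not rejected. */
size_t utf8_encode_bmp(uint16_t cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Encode a sequence of code points; returns the total byte length. */
size_t utf8_encode_seq(const uint16_t *cps, size_t n, unsigned char *out) {
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        len += utf8_encode_bmp(cps[i], out + len);
    }
    return len;
}

/* Byte-wise equality with no normalization step: the kind of comparison
 * that treats the two spellings of the same letter as different strings. */
int naive_equal(const uint16_t *a, size_t na, const uint16_t *b, size_t nb) {
    unsigned char ba[16], bb[16];
    assert(na <= 5 && nb <= 5); /* 5 code points * 3 bytes fits in 16 */
    size_t la = utf8_encode_seq(a, na, ba);
    size_t lb = utf8_encode_seq(b, nb, bb);
    return la == lb && memcmp(ba, bb, la) == 0;
}
```

Calling `naive_equal` on `{0x00ED}` and `{0x0069, 0x0301}` reports a mismatch even though both render as í, which is the failure mode the commit works around by emitting the precomposed character whenever one exists.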
-rw-r--r-- keyboards/melody96/keymaps/zunger/keymap.c 268
1 files changed, 245 insertions, 23 deletions
diff --git a/keyboards/melody96/keymaps/zunger/keymap.c b/keyboards/melody96/keymaps/zunger/keymap.c
index d396de683..d0d2698b7 100644
--- a/keyboards/melody96/keymaps/zunger/keymap.c
+++ b/keyboards/melody96/keymaps/zunger/keymap.c
@@ -14,6 +14,83 @@
14 * along with this program. If not, see <http://www.gnu.org/licenses/>. 14 * along with this program. If not, see <http://www.gnu.org/licenses/>.
15 */ 15 */
16#include QMK_KEYBOARD_H 16#include QMK_KEYBOARD_H
17#include <assert.h>
18
19// This keymap is designed to make it easy to type in a wide variety of languages, as well as
20// generate mathematical symbols (à la Space Cadet).
21//
22// LAYER MAGIC (aka, typing in many alphabets)
23// This keyboard has three "base" layers: QWERTY, GREEK, and CADET. The GREEK and CADET layers
24// are actually full of Unicode points, and so which point they generate depends on things like
25// whether the shift key is down. To handle this, each of those layers is actually *two* layers, one
26// with and one without shift. In our main loop, we manage modifier state detection, as well as
27// layer switch detection, and pick the right layer on the fly.
28// Layers are selected with a combination of three keys. The "Greek" and "Cadet" keys act like
29// modifiers: When held down, they transiently select the indicated base layer. The "Layer Lock" key
30// locks the value of the base layer at whatever is currently held; so e.g., if you hold Greek +
31// Layer Lock, you'll stay in Greek mode until you hit Layer Lock again without any of the mods
32// held.
33// TODO: This system of layer selection is nice for math, but it's not very nice for actually
34// typing in multiple languages. It seems like a better plan will be to reserve one key for each
35// base layer -- maybe fn + F(n) -- which can either be held as a modifier or tapped to switch
36// layers. That will open up adding some more languages, like Yiddish, but to do this effectively
37// we'll need to find a good UI with which to show the currently selected layer. Need to check what
38// the melody96 has in the way of outputs (LEDs, sound, etc).
39//
40// ACCENT MAGIC (aka, typing conveniently in Romance languages)
41// We want to support easy typing of diacritical marks. We can't rely on the host OS for this,
42// because (e.g.) on MacOS, to make any of the other stuff work, we need to be using the Unicode
43// input method at the OS level, which breaks all the normal accent stuff on that end. So we do it
44// ourselves. Accents can actually be invoked in two different ways: one fast and very compatible,
45// one very versatile but with occasional compatibility problems.
46//
47// THE MAIN WAY: You can hit one of the "accent request" key patterns immediately *before* typing
48// a letter to be accented. It will emit the corresponding accented Unicode. For example, you can
49// hit fn-e to request an acute accent, followed by i, and it will output í, U+00ED LATIN SMALL
50// LETTER I WITH ACUTE. These "combined characters" are in Unicode normal form C (NFC), which is
51// important because many European websites and apps, in particular, tend to behave very badly
52// (misunderstanding and/or crashing) when presented with characters in other forms! The catch is
53// that this only works for the various combinations of letters and accents found in the Latin-1
54// supplement block of Unicode -- basically, things you need for Western European languages.
55//
56// (NB: If you make an accent request followed by a letter which can't take the corresponding
57// accent, it will output the uncombined form of the accent followed by whatever you typed; so
58// e.g., if you hit fn-e followed by f, it will output ´f, U+00B4 ACUTE ACCENT followed by an
59// ordinary f. This is very similar to the default behavior of MacOS.)
60//
61// THE FLEXIBLE WAY: If you hit the accent request with a shift -- e.g., fn-shift-e -- it will
62// instead immediately output the corresponding *combining* Unicode accent mark, which will modify
63// the *previous* character you typed. For example, if you type i followed by fn-shift-e, it will
64// generate í. But don't be fooled by visual similarity: unlike the previous example, this one is
65// an ordinary i followed by U+0301 COMBINING ACUTE ACCENT. It's actually *two symbols*, and this
66// is Unicode normal form D (NFD). Unlike NFC, there are NFD representations of far more
67// combinations of letters and accents, and it's easy to add more of these if you need. (The NFC
68// representation of such combinations is identical to their NFD representation.)
69//
70// Programs that try to compare Unicode strings *should* first normalize them by converting them
71// all into one normal form or another, and there are functions in every programming language to
72// do this -- e.g., JavaScript's string.normalize() -- but lots of programmers fail to understand
73// this, and so write code that massively freaks out when it encounters the wrong form.
74//
75// The current accent request codes are modeled on the ones in MacOS.
76//
77// fn+` Grave accent (`)
78// fn+e Acute accent (´)
79// fn+i Circumflex (^)
80// fn+u Diaeresis / umlaut / trema (¨)
81// fn+c Cedilla (¸)
82// fn+n Tilde (˜)
83//
84// Together, these functions make for a nice "polyglot" keyboard: one that can easily type in a wide
85// variety of languages, which is very useful for people who, well, need to type in a bunch of
86// languages.
87//
88// The major TODOs are:
89// - Update the layer selection logic (and add visible layer cues);
90// - Factor the code below so that the data layers are more clearly separated from the code logic,
91// so that other users of this keymap can easily add whichever alphabets they need without
92// having to deeply understand the implementation.
93
17 94
18enum custom_keycodes { 95enum custom_keycodes {
19 // We provide special layer management keys: 96 // We provide special layer management keys:
@@ -32,6 +109,16 @@ enum custom_keycodes {
32 KC_GREEK = SAFE_RANGE, 109 KC_GREEK = SAFE_RANGE,
33 KC_CADET, 110 KC_CADET,
34 KC_LAYER_LOCK, 111 KC_LAYER_LOCK,
112
113 // These are the keycodes generated by the various "accent request" keystrokes.
114 KC_ACCENT_START,
115 KC_CGRV = KC_ACCENT_START, // Grave accent
116 KC_CAGU, // Acute accent
117 KC_CDIA, // Diaeresis / umlaut / trema
118 KC_CCIR, // Circumflex
119 KC_CCED, // Cedilla
120 KC_CTIL, // Tilde
121 KC_ACCENT_END,
35}; 122};
36 123
37enum layers_keymap { 124enum layers_keymap {
@@ -49,21 +136,6 @@ enum layers_keymap {
49#define MO_FN MO(_FUNCTION) 136#define MO_FN MO(_FUNCTION)
50#define KC_LLCK KC_LAYER_LOCK 137#define KC_LLCK KC_LAYER_LOCK
51 138
52// TODO: To generalize this, we want some #defines that let us specify how each key on the base
53// layer should map to the four special layers, and then use that plus the base layer definition to
54// autogenerate the keymaps for the other layers.
55// TODO: It would also be nice to be able to put the actual code points in here, rather than
56// numbers.
57
58// Accent marks
59#define CMB_GRV H(0300)
60#define CMB_AGU H(0301)
61#define CMB_DIA H(0308)
62#define CMB_CIR H(0302)
63#define CMB_MAC H(0304)
64#define CMB_CED H(0327)
65#define CMB_TIL H(0303)
66
67 139
68const uint16_t PROGMEM keymaps[][MATRIX_ROWS][MATRIX_COLS] = { 140const uint16_t PROGMEM keymaps[][MATRIX_ROWS][MATRIX_COLS] = {
69 // NB: Using GESC for escape in the QWERTY layer as a temporary hack because I messed up the 141 // NB: Using GESC for escape in the QWERTY layer as a temporary hack because I messed up the
@@ -164,14 +236,119 @@ const uint16_t PROGMEM keymaps[][MATRIX_ROWS][MATRIX_COLS] = {
164 // Function layer is mostly for keyboard meta-control operations, but also contains the combining 236 // Function layer is mostly for keyboard meta-control operations, but also contains the combining
165 // accent marks. These are deliberately placed to match where the analogous controls go on Mac OS. 237 // accent marks. These are deliberately placed to match where the analogous controls go on Mac OS.
166 [_FUNCTION] = LAYOUT_hotswap( 238 [_FUNCTION] = LAYOUT_hotswap(
167 CMB_GRV, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, KC_MUTE, KC_VOLD, KC_VOLU, _______, _______, RESET, 239 KC_CGRV, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, KC_MUTE, KC_VOLD, KC_VOLU, _______, _______, RESET,
168 CMB_GRV, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, 240 KC_CGRV, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______,
169 _______, _______, _______, CMB_AGU, _______, _______, _______, CMB_DIA, CMB_CIR, CMB_MAC, _______, _______, _______, _______, _______, _______, _______, 241 _______, _______, _______, KC_CAGU, _______, _______, _______, KC_CDIA, KC_CCIR, _______, _______, _______, _______, _______, _______, _______, _______,
170 _______, _______, _______, UC_M_OS, UC_M_LN, UC_M_WI, UC_M_BS, UC_M_WC, _______, _______, _______, _______, _______, _______, _______, _______, _______, 242 _______, _______, _______, UC_M_OS, UC_M_LN, UC_M_WI, UC_M_BS, UC_M_WC, _______, _______, _______, _______, _______, _______, _______, _______, _______,
171 _______, _______, _______, CMB_CED, _______, _______, CMB_TIL, _______, _______, _______, _______, _______, _______, _______, _______, _______, 243 _______, _______, _______, KC_CCED, _______, _______, KC_CTIL, _______, _______, _______, _______, _______, _______, _______, _______, _______,
172 _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______), 244 _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______, _______),
173}; 245};
174 246
247////////////////////////////////////////////////////////////////////////////////////////////////////
248// Accent implementation
249//
250// In the body of process_record_user, we store an "accent_request", which is the accent keycode if
251// one was just selected, or zero otherwise. When the *next* key is hit, we look up whether the
252// accent request plus that next keycode (plus the state of the shift key) together amount to an
253// interesting combined (NFC) character, and if so, emit it; otherwise, we emit the accent as a
254// separate character and then process the next key normally. The resulting UI behavior is similar
255// to that of the combining accent keys in MacOS.
256//
257// We store two arrays, one for when shift is held and one for when it isn't. Each is
258// two-dimensional, indexed first by the next keycode struck and then by the accent requested. The
259// outer dimension has KC_Z + 1 as its upper bound, so that we can save memory by only coding
260// alphabetic keys. The contents are either Unicode code points, or zero to indicate that we don't
261// have a point for this combination.
262
263#define KC_NUM_ACCENTS (KC_ACCENT_END - KC_ACCENT_START)
264#define KC_NUM_SLOTS (KC_Z + 1)
265
266const uint16_t PROGMEM unshifted_accents[KC_NUM_SLOTS][KC_NUM_ACCENTS] = {
267 // KC_CGRV, KC_CAGU, KC_CDIA, KC_CCIR, KC_CCED, KC_CTIL
268 [KC_A] = { 0x00e0, 0x00e1, 0x00e4, 0x00e2, 0, 0x00e3 },
269 [KC_E] = { 0x00e8, 0x00e9, 0x00eb, 0x00ea, 0, 0 },
270 [KC_I] = { 0x00ec, 0x00ed, 0x00ef, 0x00ee, 0, 0 },
271 [KC_O] = { 0x00f2, 0x00f3, 0x00f6, 0x00f4, 0, 0x00f5 },
272 [KC_U] = { 0x00f9, 0x00fa, 0x00fc, 0x00fb, 0, 0 },
273 [KC_Y] = { 0, 0, 0x00ff, 0, 0, 0 },
274 [KC_N] = { 0, 0, 0, 0, 0, 0x00f1 },
275 [KC_C] = { 0, 0, 0, 0, 0x00e7, 0 },
276};
277
278const uint16_t PROGMEM shifted_accents[KC_NUM_SLOTS][KC_NUM_ACCENTS] = {
279 // KC_CGRV, KC_CAGU, KC_CDIA, KC_CCIR, KC_CCED, KC_CTIL
280 [KC_A] = { 0x00c0, 0x00c1, 0x00c4, 0x00c2, 0, 0x00c3 },
281 [KC_E] = { 0x00c8, 0x00c9, 0x00cb, 0x00ca, 0, 0 },
282 [KC_I] = { 0x00cc, 0x00cd, 0x00cf, 0x00ce, 0, 0 },
283 [KC_O] = { 0x00d2, 0x00d3, 0x00d6, 0x00d4, 0, 0x00d5 },
284 [KC_U] = { 0x00d9, 0x00da, 0x00dc, 0x00db, 0, 0 },
285 [KC_Y] = { 0, 0, 0x0178, 0, 0, 0 }, // U+0178 is Y with diaeresis; U+00DF is sharp s
286 [KC_N] = { 0, 0, 0, 0, 0, 0x00d1 },
287 [KC_C] = { 0, 0, 0, 0, 0x00c7, 0 },
288};
289
290// The uncombined (spacing) and combining forms of the accents, for when we want to emit them as
291// single characters.
292const uint16_t PROGMEM uncombined_accents[KC_NUM_ACCENTS] = {
293 [KC_CGRV - KC_ACCENT_START] = 0x0060,
294 [KC_CAGU - KC_ACCENT_START] = 0x00b4,
295 [KC_CDIA - KC_ACCENT_START] = 0x00a8,
296 [KC_CCIR - KC_ACCENT_START] = 0x005e,
297 [KC_CCED - KC_ACCENT_START] = 0x00b8,
298 [KC_CTIL - KC_ACCENT_START] = 0x02dc,
299};
300
301const uint16_t PROGMEM combined_accents[KC_NUM_ACCENTS] = {
302 [KC_CGRV - KC_ACCENT_START] = 0x0300,
303 [KC_CAGU - KC_ACCENT_START] = 0x0301,
304 [KC_CDIA - KC_ACCENT_START] = 0x0308,
305 [KC_CCIR - KC_ACCENT_START] = 0x0302,
306 [KC_CCED - KC_ACCENT_START] = 0x0327,
307 [KC_CTIL - KC_ACCENT_START] = 0x0303,
308};
309
310// This function manages keypresses that happen after an accent has been selected by an earlier
311// keypress.
312// Args:
313// accent_key: The accent key which was earlier selected. This must be in the range
314// [KC_ACCENT_START, KC_ACCENT_END).
315// keycode: The keycode which was just pressed.
316// is_shifted: The current shift state (as set by a combination of shift and caps lock)
317// force_no_accent: If true, we're in a situation where we want to force there to be no
318// accent combination -- if e.g. we're in a non-QWERTY layer, or if other modifier keys
319// are held.
320//
321// Returns true if the keycode has been completely handled by this function (and so should not be
322// processed further by process_record_user) or false otherwise.
323bool process_key_after_accent(
324 uint16_t accent_key,
325 uint16_t keycode,
326 bool is_shifted,
327 bool force_no_accent
328) {
329 assert(accent_key >= KC_ACCENT_START);
330 assert(accent_key < KC_ACCENT_END);
331 const int accent_index = accent_key - KC_ACCENT_START;
332
333 // If the keycode is past KC_Z (beyond the end of the tables), or force_no_accent is set, skip the
334 // table lookup. Keycodes below KC_A index all-zero rows, so they fall through to the same path.
335 if (keycode <= KC_Z && !force_no_accent) {
336 // Pick the correct array. Because this is progmem, we're going to need to do the
337 // two-dimensional array indexing by hand, and so we just cast it to a single-dimensional array.
338 const uint16_t *points = (const uint16_t*)(is_shifted ? shifted_accents : unshifted_accents);
339 const uint16_t code_point = pgm_read_word_near(points + KC_NUM_ACCENTS * keycode + accent_index);
340 if (code_point) {
341 register_unicode(code_point);
342 return true;
343 }
344 }
345
346 // If we get here, there was no accent match. Emit the accent as its own character, and then let
347 // the caller figure out what to do next.
348 register_unicode(pgm_read_word_near(uncombined_accents + accent_index));
349 return false;
350}
351
175// Layer bitfields. 352// Layer bitfields.
176#define GREEK_LAYER (1UL << _GREEK) 353#define GREEK_LAYER (1UL << _GREEK)
177#define SHIFTGREEK_LAYER (1UL << _SHIFTGREEK) 354#define SHIFTGREEK_LAYER (1UL << _SHIFTGREEK)
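The table lookup in this hunk can be tried on a host machine with a cut-down sketch like the one below. It is an illustration under assumptions, not the keymap's code: the `DEMO_`-prefixed names are hypothetical, the tables hold only a few rows and three accents, and the data is read with a plain pointer dereference, where the firmware reads it from PROGMEM with `pgm_read_word_near`. The key values are the USB HID usage IDs that QMK uses for letter keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for a few QMK letter keycodes (USB HID usage IDs). */
enum {
    DEMO_KC_E = 0x08,
    DEMO_KC_F = 0x09,
    DEMO_KC_I = 0x0C,
    DEMO_KC_Y = 0x1C,
    DEMO_KC_Z = 0x1D
};

/* Accent columns, in the same relative order as the real tables. */
enum demo_accent { DEMO_GRAVE, DEMO_ACUTE, DEMO_DIAERESIS, DEMO_NUM_ACCENTS };

#define DEMO_NUM_SLOTS (DEMO_KC_Z + 1)

/* Rows not named by a designated initializer are zero-filled, which is what
 * lets "0" mean "no precomposed character for this combination". */
static const uint16_t demo_unshifted[DEMO_NUM_SLOTS][DEMO_NUM_ACCENTS] = {
    [DEMO_KC_E] = {0x00e8, 0x00e9, 0x00eb},
    [DEMO_KC_I] = {0x00ec, 0x00ed, 0x00ef},
    [DEMO_KC_Y] = {0, 0, 0x00ff},
};

static const uint16_t demo_shifted[DEMO_NUM_SLOTS][DEMO_NUM_ACCENTS] = {
    [DEMO_KC_E] = {0x00c8, 0x00c9, 0x00cb},
    [DEMO_KC_I] = {0x00cc, 0x00cd, 0x00cf},
};

/* Returns the precomposed code point, or 0 if the tables have none. */
uint16_t demo_lookup(uint16_t keycode, int accent, bool shifted) {
    assert(accent >= 0 && accent < DEMO_NUM_ACCENTS);
    if (keycode > DEMO_KC_Z) {
        return 0; /* past the end of the tables */
    }
    /* Same flat row-major indexing as the keymap: row = keycode, column = accent. */
    const uint16_t *flat = (const uint16_t *)(shifted ? demo_shifted : demo_unshifted);
    return flat[DEMO_NUM_ACCENTS * keycode + accent];
}
```

A zero result corresponds to the fallback path in `process_key_after_accent`, where the spacing accent is emitted on its own and the key is then handled normally.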
@@ -185,6 +362,8 @@ bool process_record_user(uint16_t keycode, keyrecord_t *record) {
185 // get_mods or the like, because this function is called *before* that's updated! 362 // get_mods or the like, because this function is called *before* that's updated!
186 static bool shift_held = false; 363 static bool shift_held = false;
187 static bool alt_held = false; 364 static bool alt_held = false;
365 static bool ctrl_held = false;
366 static bool super_held = false;
188 static bool greek_held = false; 367 static bool greek_held = false;
189 static bool cadet_held = false; 368 static bool cadet_held = false;
190 369
@@ -192,18 +371,36 @@ bool process_record_user(uint16_t keycode, keyrecord_t *record) {
192 static bool shift_lock = false; 371 static bool shift_lock = false;
193 static int layer_lock = _QWERTY; 372 static int layer_lock = _QWERTY;
194 373
195 // Process any modifier key presses. 374 // The accent request, or zero if there isn't one.
375 static uint16_t accent_request = 0;
376
377 // If this is set to true, don't trigger any handling of pending accent requests. That's what we
378 // want to do if e.g. the user just hit the shift key or something.
379 bool ignore_accent_change = !record->event.pressed;
380
381 // Step 1: Process any modifier key state changes, so we can maintain that state.
196 if (keycode == KC_LSHIFT || keycode == KC_RSHIFT) { 382 if (keycode == KC_LSHIFT || keycode == KC_RSHIFT) {
197 shift_held = record->event.pressed; 383 shift_held = record->event.pressed;
384 ignore_accent_change = true;
198 } else if (keycode == KC_LALT || keycode == KC_RALT) { 385 } else if (keycode == KC_LALT || keycode == KC_RALT) {
199 alt_held = record->event.pressed; 386 alt_held = record->event.pressed;
387 ignore_accent_change = true;
388 } else if (keycode == KC_LCTRL || keycode == KC_RCTRL) {
389 ctrl_held = record->event.pressed;
390 ignore_accent_change = true;
391 } else if (keycode == KC_LGUI || keycode == KC_RGUI) {
392 super_held = record->event.pressed;
393 ignore_accent_change = true;
200 } else if (keycode == KC_GREEK) { 394 } else if (keycode == KC_GREEK) {
201 greek_held = record->event.pressed; 395 greek_held = record->event.pressed;
396 ignore_accent_change = true;
202 } else if (keycode == KC_CADET) { 397 } else if (keycode == KC_CADET) {
203 cadet_held = record->event.pressed; 398 cadet_held = record->event.pressed;
399 ignore_accent_change = true;
204 } 400 }
205 401
206 // Now let's transform these into the "cadet request" and "greek request." 402 // Step 2: Figure out which layer we're supposed to be in, by transforming all the prior stuff
403 // into layer requests.
207 const bool greek_request = (greek_held && !alt_held); 404 const bool greek_request = (greek_held && !alt_held);
208 const bool cadet_request = (cadet_held || (greek_held && alt_held)); 405 const bool cadet_request = (cadet_held || (greek_held && alt_held));
209 406
@@ -260,8 +457,33 @@ bool process_record_user(uint16_t keycode, keyrecord_t *record) {
260 layer_state_set(new_layer_state); 457 layer_state_set(new_layer_state);
261 } 458 }
262 459
263 // TODO: We can update LED states based on shift_lock (caps), layer_lock (layer lock), and 460 // Step 3: Handle accents. If there's a pending accent request, process it. If what the user just
264 // base_layer (base layer). 461 // hit creates a new accent request, update the pending state for the next keypress.
462 if (!ignore_accent_change && accent_request && record->event.pressed) {
463 // Only do the accent stuff if we're in the QWERTY layer and we aren't modifying something.
464 const bool force_no_accent = (
465 actual_layer != _QWERTY ||
466 ctrl_held ||
467 super_held ||
468 alt_held
469 );
470 const uint16_t old_accent = accent_request;
471 accent_request = 0;
472 if (process_key_after_accent(old_accent, keycode, shifted, force_no_accent)) {
473 return false;
474 }
475 }
476
477 // And if a new accent request just arrived, update accent_request.
478 if (keycode >= KC_ACCENT_START && keycode < KC_ACCENT_END && record->event.pressed) {
479 if (shifted) {
480 // Shift + accent request generates the combining accent key, and leaves accent_request alone.
481 register_unicode(pgm_read_word_near(combined_accents + keycode - KC_ACCENT_START));
482 return false;
483 } else {
484 accent_request = keycode;
485 }
486 }
265 487
266 return true; 488 return true;
267} 489}
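The pending-accent flow added in Step 3 behaves like a small dead-key state machine. The following host-side sketch models only the acute accent and a handful of lowercase letters, and it collects output in a buffer where the firmware sends it through `register_unicode`. All names here (`demo_state`, `demo_feed`, and so on) are hypothetical, and shift handling for letters is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds; the real keymap works on QMK keycodes. */
typedef enum { DEMO_EV_LETTER, DEMO_EV_ACUTE_REQUEST } demo_event_kind;

typedef struct {
    bool pending_acute; /* plays the role of accent_request */
    uint16_t out[16];   /* code points "typed" so far */
    size_t out_len;
} demo_state;

static void demo_emit(demo_state *s, uint16_t cp) {
    if (s->out_len < sizeof s->out / sizeof s->out[0]) {
        s->out[s->out_len++] = cp;
    }
}

/* Precomposed acute forms for a few lowercase letters; 0 means none here. */
static uint16_t demo_acute(char letter) {
    switch (letter) {
        case 'a': return 0x00e1;
        case 'e': return 0x00e9;
        case 'i': return 0x00ed;
        case 'o': return 0x00f3;
        case 'u': return 0x00fa;
        default:  return 0;
    }
}

/* Feed one key press. `letter` is ignored for accent requests; `shifted`
 * only matters for accent requests in this sketch. */
void demo_feed(demo_state *s, demo_event_kind kind, char letter, bool shifted) {
    assert(s != NULL);
    if (s->pending_acute) {
        /* Consume the pending request before looking at the new key. */
        s->pending_acute = false;
        uint16_t cp = (kind == DEMO_EV_LETTER) ? demo_acute(letter) : 0;
        if (cp) {
            demo_emit(s, cp); /* precomposed character, key fully handled */
            return;
        }
        demo_emit(s, 0x00b4); /* no match: spacing acute, then fall through */
    }
    if (kind == DEMO_EV_ACUTE_REQUEST) {
        if (shifted) {
            demo_emit(s, 0x0301); /* combining acute, modifies previous char */
        } else {
            s->pending_acute = true; /* wait for the next key */
        }
        return;
    }
    demo_emit(s, (uint16_t)letter); /* ordinary key */
}
```

Feeding an acute request followed by `i` yields U+00ED, an acute request followed by `f` yields U+00B4 and then `f`, and `i` followed by a shifted acute request yields `i` and then U+0301, matching the two input methods described in the keymap comments.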