Here’s to the alphabet

I’ve been on GitHub for 17 years now, but never bothered to upload or contribute anything. This changed last weekend, or at least, I decided to upload one of my many projects. 😅

This one was quite a wild ride actually. I wasn’t sure if I needed the algorithm to begin with, but I thought, well, “nice to have.” In the beginning it looked super straightforward and easy, then for a long time it looked super complicated, and in the end, the final version looks so simple, it almost looks like it was jotted down in a minute.

Four days, and I guess 40 hours of work later, here’s my repo:

https://github.com/alfons/PinyinAbcSort

PinyinAbcSort – Sort Hànyǔ Pīnyīn in alphabetical order (fast)

Description:

This project implements sorting Pīnyīn words into alphabetical word order, based on the rules outlined by John DeFrancis in ABC Chinese-English Dictionary, Page xiii, Reader’s Guide, I. Arrangement of Entries.

The sorting algorithm compares words letter by letter, not syllable by syllable. This approach reflects the fact that Hànyǔ Pīnyīn is written using the Latin alphabet, which is the key insight and algorithm design choice behind this implementation.

The ordering rules are:

  1. Alphabetical order: Base characters (a–z), compared letter by letter
  2. u before ü, U before Ü
  3. Tones: 0 < 1 < 2 < 3 < 4
  4. Case: lowercase and mixed-case before uppercase
  5. Separators: apostrophe < hyphen < space
  6. Since no rules for numbers 0–9 were given, they were added first. All other characters are appended according to their Unicode value.
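The rules above can be sketched as a multi-level sort key. This is only a minimal illustration in Python, not the code from the repository: the function name `pinyin_sort_key` is made up here, and the priority of the levels (base letters first, then ü, tones, case, separators) is my assumption based on the order in which the rules are listed. Characters other than letters, digits and the three separators are simply ignored in this sketch.

```python
import unicodedata

# Combining marks mapped to tone numbers: macron, acute, caron, grave.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}
DIAERESIS = "\u0308"
# Rule 5: apostrophe < hyphen < space (straight and curly apostrophe alike).
SEPARATORS = {"'": 0, "\u2019": 0, "-": 1, " ": 2}

def pinyin_sort_key(word):
    """Hypothetical sort key; the level priority is an assumption."""
    base, umlauts, tones, cases, seps = [], [], [], [], []
    # NFD splits e.g. "ǜ" into "u" + diaeresis + grave accent.
    for ch in unicodedata.normalize("NFD", word):
        if ch in TONE_MARKS and tones:
            tones[-1] = TONE_MARKS[ch]      # tone belongs to the previous letter
        elif ch == DIAERESIS and umlauts:
            umlauts[-1] = 1                 # u before ü
        elif ch in SEPARATORS:
            seps.append(SEPARATORS[ch])
        elif ch.isalnum():
            base.append(ch.lower())         # digits sort before a-z
            umlauts.append(0)
            tones.append(0)                 # tone 0 until a mark follows
            cases.append(1 if ch.isupper() else 0)
    return ("".join(base), tuple(umlauts), tuple(tones),
            tuple(cases), tuple(seps))

words = ["bù", "lü", "bu", "Li", "bǔ", "lu", "li", "bú", "bū"]
print(sorted(words, key=pinyin_sort_key))
# ['bu', 'bū', 'bú', 'bǔ', 'bù', 'li', 'Li', 'lu', 'lü']
```

Because the base letters form the first level of the key, two words are compared letter by letter across syllable boundaries, and tones only break ties between otherwise identical spellings.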

Credits:

  • John DeFrancis (1911-2009): Original Pīnyīn alphabetical word order, in passionate acknowledgment of the advocates of writing reform Lù Zhuāngzhāng (陆璋章, 1854–1928), Lǔ Xùn (鲁迅, 1881–1936), Máo Dùn (Shěn Yànbīng, 茅盾, 沈雁冰, 1896–1981), Wáng Lì (王力, 1900–1988), Lù Shūxiāng (吕叔湘, 1904–1998), and Zhōu Yǒuguāng (周有光, 1905–2017).
  • Mark Swofford of Banqiao, Taiwan: summarised the rules on the internet, and pointed out where to find them.
  • Alfons Grabher: Idea, concept, prompting, testing, and driving the development of pinyinAbcSort.
  • Grok (xAI), ChatGPT 4o: Coding the implementation with flair and precision.

How does the hip rotate…?

Here’s a question I’ve heard in Feldenkrais contexts many, many times, and just today someone asked me again:

How does the hip rotate as you are going down?

20 years ago I was wondering about that myself. I was in a Feldenkrais session, and about to go from standing to sitting, when the Feldenkrais-trained physical therapist kept saying that my pelvis needs to rotate. I asked, “What do you mean by this?” But he just kept repeating the same phrase.

When I asked a 4th time, I saw some desperation creep into his face, so I let it go. For that session. But my jaws were already set. It wasn’t an easy question. Even with my background in medical engineering and biomechanics, it took me a couple of years to realise that I need to apply this knowledge, which was all theory, to my somatic movement practice, which was all experience, feeling, and sensing.

I needed to learn that noticing and naming need to go together. We are humans, we give names to living beings, and even to things. And also to thoughts and processes, to everything, really. This is one of our strengths, to name things.

So let me have my take on this. We need to answer two (2) questions. First:

 1. Just where is the hip?

With all that talk about how the hips should move, everyone seems to assume that we all know what everyone is talking about. But what is it, the hips? We need to answer:

  • Is it the area between the knees and the shoulders?
  • Is it the two soft, rounded portions located in the lower behind regions?
  • Or is it the posterior superior iliac spine, a palpable bony prominence at the back of the ilium?

What is it? What are we talking about? We need to know!

 2. Rotate… in relation to what?

Numbers don’t make any sense as long as we don’t define what is zero (0). The number 300 doesn’t mean anything, as long as we don’t define what 1 or nothing is. 300 could be a lot, it could be nothing.

First of all, shortly after the Mesopotamians, the Egyptians invented writing about 5000 years ago, by chiseling logographic symbols into temple walls, painting them, and inking them on papyrus. That was about 2000 years before any Chinese etched anything into any bone. And the traders used that new invention, writing, to count goats and debts. And on that base our glorious Western alphabet came into existence. And a few thousand years later, some bright people, just a few hundred kilometers away from the great pyramids, defined basic maths.

And now we count rotation by degrees, from 0 to 360. But we need to know: What is rotating? And in relation to what?

  • Is it the pelvic girdle rotating in relation to the shoulder girdle?
  • Is it the femur of the right leg rotating inwards, in relation to the rest of the body?
  • Or is it the pelvis (and the entire body) rotating around a stable, well planted right leg?
  • Or a combination of all of that?

If we don’t define what we mean by hip, if we don’t define what is rotating, and in relation to what, then it’s just like asking, “How long is a piece of string?” or “How tall is a stretch of time?”

For example, when you’re going down from standing straight to sitting back down onto a chair, then your hip… what is it… your pelvis, your pelvis is probably not rotating. Your pelvis is probably stable, stiff and fixed, and held tight, to support your upper body. There will be movement in your hip joints though, articulation, and rotation. And there will be movement with your eyes, in your neck, chest, in your ankles and the arches of your feet…

But this blog post is not about describing a movement. This blog post is to answer HOW TO ANSWER that question. It’s easy to answer in science, engineering and physics, but it’s not so obvious when it comes to ourselves. We’re often blind to ourselves. Maybe, we’re not allowed to feel ourselves, or maybe we’ve been told:

“Thou shalt not be aware,” as in the title of a book by Alice Miller. But we need to be aware. We need to know 1. what is moving, and 2. in relation to what. And then we can look at the movement, and at ourselves. And then we know. And this feels good. So good. So soothing. Healing. Closure.

Status update and things learned

I’ve been completely obsessed with work for the past few months, writing a book about Chinese grammar using romanisation (Pīnyīn), and writing software to be able to actually write in Chinese Pīnyīn. I was working to the point of madness. Only with a short one week holiday in Taiwan to sort out a visa issue, debounce, and catch up with good friends.

Turns out, both my new book and new software are far from finished. What could have finished, or “ended” to use the right term, is my health. So I take a step back, take the time to write a blog post, and summarise what I’ve learned:

1. Going to bed early is important

As Dr. Neal Barnard once pointed out in his book about hormonal health, going to bed at 10pm seems right for him. Because if he sets his mind on going to bed at 11:30pm, then the next thing you know, it’s half past midnight and your whole next day is messed up.

2. Eating early is important

My grandmother used to have her last meal of the day at around 5:00pm, for which she usually had more of a snack, not a meal. Usually that was a slice of old, dark rye bread, with butter on it so thin… well as a kid I always doubted it was worth the effort to go and fetch the butter from the pantry and also the work it took to clean the butter knife. Butter usually sticks to the knife and you need a sponge with liquid soap, and then you need to clean the sponge properly or your hands will be all sticky and smell like you touched a cow. As a kid I was thinking a lot about that.

In hindsight, and from decades of personal experiments, I find a small meal at 4pm helps my sleep and recovery the most.

3. Good things come to those who wait

Growing a book and a piece of software so fast, adding to them for 4 to 10 hours every day, naturally creates a large, messy code base. And now, of course, I ran into software performance issues. How could I not. The profiler says that my software will perform poorly on slower systems.

And for the book I need to go through all chapters AGAIN and clean up the mess I left behind, too. Making it better, and better.

Maybe it’s also a form of obsessing over details, or trying to achieve perfection. Just for example, I went through 26 iterations for the app’s icon. On the upside, the names for the software and the book came to me naturally, and I’m really happy, excited, and satisfied with them.

Trying to achieve perfection? I say, what else do we have in life, if not the pursuit of happiness, striving for meaning, beauty, harmony, and satisfaction with our creations? Or is this the talk of middle age? To me it really feels good to think about a difficult, structural problem, and sort it out, make the solution beautiful. And there’s the age old question: What have the Romans REALLY ever done for us?

There’s more. But I think this is a good time to end this blog post, or diary page. I’m very happy I didn’t use ChatGPT, or any LLM, for this blog post, for anything, not even for a spell-check. THIS feels amazing, too. It’s all me. My achievements, my mistakes, my own expression of beauty, of coherent, accomplished thinking and writing.

Wish you a great day, and take good care of your health!

New book by Alfons

Testing line heights, fonts and trim sizes. My new book on Hànyǔ Pīnyīn is getting along well. It will still take some time, though. So many things to do. Still getting up around 6-7 am, a bit of somatic movement practice, and then working every free minute until late, around 11-12 pm.

However, creating and working on a self-chosen, self-defined project, as it presents itself, as it is brought to life, organic unfolding—it gives life meaning and purpose, I can definitely feel that. And I do enjoy that feeling.

Input efficiency for novel Pinyin input scheme

Ok, now it’s two days later. Two days with hardly any sleep and constant staring at my computer screen. Which is beautiful, btw, this nano texture screen on my MacBook, I love it so much.

And a few walks in between, to think about how to fix the terrible problems I kept running into. Well… sometimes we need to slow down, so that solutions may come to us. I guess solutions, in their very nature, they also don’t like running targets.

I’ve been working on the numbers based input helper for about 3-4 months now, and I think it’s beautiful, too. I mean my frontend, which I can’t show you just yet.

It’s just… this entire idea of typing text using numbers seems flawed to me. I guess in this sense I’m pretty old school. In my mind and heart numbers and words are from two different realms. Nevertheless, it’s the most efficient input method.

But since I want to type text without using numbers I had to come up with an input scheme that can do just that.

This is why I developed my Twin Tone input scheme. The idea behind this scheme is that there are no double vowels (twins) in the Chinese language, at least not in the romanised script called Pinyin.

This input scheme allows for writing Chinese (using Pinyin) without having to type numbers. There’s only one drawback: it’s terribly inefficient. A lot more characters have to be typed than result in actual text (depending on the text, there’s close to a 50% overhead).

That’s why I was trying hard to find an optimised, streamlined scheme. And to solve this problem, I used statistics. I had to find out which letter combinations are permissible for using in a Pinyin typing helper scheme.

The result of this work was my optimised Twin Tone input scheme. But the original design was very hard to implement from a software developer’s perspective. Therefore I created a much simpler variant, which I named Twin Tone Variant (TTV). It’s a lot simpler, and also much easier to learn from a user’s perspective.

Turns out, this works very, very well. I’m very happy with it. And despite the slightly larger overhead, it’s more convenient to use than the numbers based input, since with TTV-Input the fingers can stay on the home row of the keyboard.

So.

That’s my contribution from a teacher’s and inventor’s perspective. I blurred the input letter combination since it was so much work to find this scheme. Maybe in the future billions of people will use it. I probably should get it patented. But I have no idea how.

If you’re a business angel, investor, or startup incubator, I would like to team up with you; with me being Educational Director or Head of Innovation of a new Startup focusing on Chinese language related projects, all without Chinese characters.

Furthermore, I would like to start a Publishing company in Taiwan (for one, to have a legal basis for the work that follows, and secondly, to get temporary residency and a work permit for Taiwan.) But I’m open to other countries as well. Any help, a point in the right direction, or suggestions for partnering up are very welcome. 🙏😊

Designing Chinese Pinyin input methods

Over the past few months I’ve typed out 20 stories (Peppa Pig, Xiǎo Zhū Pèiqí) in Chinese language with Pīnyīn. This can act as a base for some research.

First of all, the word count: 10,024 words, 53,115 characters, 11,842 vowels with diacritics. Questions arise. Answers are possible. For example:

Which tones are the most frequently used in the Chinese language? In vernacular modern Standard Chinese, that is, since that is what the Peppa Pig stories are written in. Any Chinese teacher will say Chinese has 4 tones, and “Nǐ hǎo” is as common as “Zàijiàn”, but is it really?

Finally, with Pinyin I can answer this question. Here is a screenshot of a software I designed to do the counting:

Clearly, the most common tone-marked tone in Chinese language is À, the 4th tone on the letter A. This is surprising to me, and relevant for Chinese Pinyin input method designs.
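The counting itself is not complicated. Here is a minimal sketch in Python of how such a count could be done; it is not the software from the screenshot, and the function name `count_dvowels` is made up for this illustration. It assumes the text uses the four standard Pinyin tone marks (macron, acute, caron, grave) and counts upper- and lowercase letters together.

```python
from collections import Counter
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

def count_dvowels(text):
    """Count every tone-marked vowel in a Pinyin text (case-insensitive)."""
    counts = Counter()
    # NFC makes sure each marked vowel is a single character.
    for ch in unicodedata.normalize("NFC", text):
        # NFD splits that character into base letter + combining marks.
        if TONE_MARKS & set(unicodedata.normalize("NFD", ch)):
            counts[ch.lower()] += 1
    return counts

sample = "Nǐ hǎo! Zàijiàn!"
print(count_dvowels(sample).most_common())
# [('à', 2), ('ǐ', 1), ('ǎ', 1)]
```

Run over a longer text, `most_common()` gives the ranking of tone-marked vowels directly.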

For example, after a lengthy calculation and reasoning, I found the following: By making à the default letter instead of a, there could be approximately 2.35% fewer keystrokes necessary to type such text in Chinese Pinyin. For the above text that would be exactly 1,356 fewer keystrokes.

While at it, the complete vowel depository looks like this:

It seems as if in vernacular (modern spoken) Chinese, the most common vowel is the letter I, by far, which came as a surprise to me, too.

If we look at the total word repository from the dictionary CC-CEDICT, we get a similar picture in many regards, but quite different counts for the neutral letters A, E and U. However, keep in mind that this is not spoken Chinese, but a repository, and thus more an academic exercise than a look into the soul of the Chinese language:

So far I have created two systems for Chinese Pinyin input (with two more in the making):

🍏 Method 1. Number marked Pinyin Input

Chinese Pinyin is written with diacritics. However, even modern computers notoriously lack convenient ways to input diacritics. Therefore many input systems have emerged. One of the most common ones is “number marked Pinyin input” which will automatically change the numbers 1-4 into the respective diacritics as we type.

For example, typing dang4 (or da4ng) will produce dàng.
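To make the idea concrete, here is a minimal sketch in Python of such a conversion. It is not my actual typing helper; the function names are made up for this illustration, and the placement rule is the common textbook one (a or e takes the mark, in "ou" the o takes it, otherwise the last vowel of the cluster), which I assume here. It works on finished text rather than live keystrokes.

```python
import re
import unicodedata

# Tone digit -> combining mark: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}
VOWELS = "aeiouüAEIOUÜ"

def _mark_syllable(match):
    letters, digit = match.group(1), match.group(2)
    clusters = list(re.finditer(f"[{VOWELS}]+", letters))
    if not clusters:
        return match.group(0)       # no vowel to mark: leave input untouched
    cluster = clusters[-1]          # the vowel cluster closest to the digit
    vowels = cluster.group(0).lower()
    if "a" in vowels:
        offset = vowels.index("a")
    elif "e" in vowels:
        offset = vowels.index("e")
    elif "ou" in vowels:
        offset = vowels.index("o")
    else:
        offset = len(vowels) - 1    # otherwise the last vowel takes the mark
    pos = cluster.start() + offset
    marked = unicodedata.normalize("NFC", letters[pos] + TONE_MARKS[digit])
    return letters[:pos] + marked + letters[pos + 1:]

def numbers_to_diacritics(text):
    """Replace tone digits 1-4 that follow letters with diacritics."""
    return re.sub(r"([A-Za-zÜü]+)([1-4])", _mark_syllable, text)

print(numbers_to_diacritics("dang4"))              # dàng
print(numbers_to_diacritics("da4ng"))              # dàng
print(numbers_to_diacritics("Bu4yong4 dan1xin1"))  # Bùyòng dānxīn
```

Both spellings, digit at the end of the syllable or digit right after the vowel, end up with the same result, because the mark always goes to the vowel cluster directly before the digit.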

I’ve spent quite some time to write my own software to do so, and in quite a few variations, too. Here’s a screenshot of one of the Proof of Concepts I’ve created:

When typing with numbers we need one extra keystroke to produce a vowel with diacritic. This means:

  • A definition: I will call vowels with diacritics, such as “ā, á, ǎ, à”, dvowels – the word vowel with a “d” in front, d for diacritic.
  • For 11,842 dvowels we need twice as many keystrokes as for vowels without diacritics, in total 23,684 keystrokes.
  • This means, when typing, for example, 10,024 words with 53,115 characters (as in the Peppa Pig texts above), we will not have to type 53,115 keystrokes in total, like in a language without diacritics, such as English, but 64,957 keystrokes in total.
  • The math for this is: 53,115 characters – 11,842 vowels + 23,684 dvowel keystrokes = 64,957.
  • How much is that in percent? 11,842 / 53,115 * 100 = 22.295%

To write Pinyin Chinese with diacritics, with the number marked Pinyin input method, we have a keystroke overhead of 22.30%.
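For anyone who wants to re-check the arithmetic, here is a small sketch in Python. The function name `keystroke_overhead` and its parameters are made up for this illustration; the numbers are the ones from the Peppa Pig texts above.

```python
def keystroke_overhead(total_chars, dvowels, keys_per_dvowel):
    """Return (total keystrokes, overhead in percent).

    total_chars:     characters in the finished text
    dvowels:         how many of them are vowels with diacritics
    keys_per_dvowel: keystrokes needed to produce one dvowel
    """
    total_keystrokes = total_chars - dvowels + dvowels * keys_per_dvowel
    overhead_percent = (total_keystrokes - total_chars) / total_chars * 100
    return total_keystrokes, overhead_percent

# Number marked input: vowel key + number key = 2 keystrokes per dvowel.
total, percent = keystroke_overhead(53_115, 11_842, keys_per_dvowel=2)
print(total, round(percent, 2))   # 64957 22.3
```

The same function can be reused for any other input scheme by changing `keys_per_dvowel`.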

This is unavoidable and the bare minimum. It is the most direct way to input Pinyin without predictive typing methods, where a computer will suggest possible words in advance and allow the user to choose a word, rather than type it.

The qualm I have with number marked input is twofold:

Qualm 1) Reaching up for the number row is inconvenient, especially for longer typing. We humans can get used to most any inconvenience, but still it is inconvenient, or let’s say, not ideal in terms of ergonomics.

However, there’s plenty of room for optimism, I mean optimization. For example, with the stats above it became clear that a 4th tone a (à), is by far the most typed vowel with diacritic. This means, improvements can be made. For example, one idea I had yesterday:

Problem: When using number 4 to produce the 4th tone, we need to use the left hand for both keystrokes: key a and key 4, which are both on the left side on the keyboard.

Solution: It would be easier and faster if we also could use, for example, key 8 to produce the 4th tone, too. In this way we can use both hands for typing, and be faster, and more comfortable.

I’ve made a proof of concept for that, too, and it seems to indeed be quite an improvement in the typing experience. To be observed.

Qualm 2) The bigger problem I have with using numbers to type text is the thing in itself, why do I have to use numbers to write text? This doesn’t make sense to me. I understand that it is for reasons of convenience, especially when it comes to predictive text and assisted typing. But coming from a humanist background, it seems unacceptable to me to have to use numbers to write down words.

This bigger, psychological, “I stand by my principles” problem I will try to solve with inventing input methods that don’t need numbers to input text.

🍏 Method 2. Twin Tone Pinyin Input, TT-Input

If you’ve never heard of this input method before: I’ve made this one up. It’s my own invention. It’s something new.

Years of frustration, months of intense work and research, and 16 hour days of complete obsession led me to produce a robust, alternative system that reliably produces Pinyin text without the need to use numbers.

I’ve also created a proof of concept, a working software implementation. I’ve had many different attempts and angles at this problem, and ran all of them against the 102,000+ words from the CC-CEDICT dictionary. However, most of the algorithms produced an error rate of a minimum of 5%, which is unacceptable.

But this, TT-Input, this one is flawless:

Twin Tone Input (TT-Input) works by repeating a vowel twice and then typing a marker to produce the diacritic: “r” for á (2nd tone), “v” for ǎ (3rd tone), “f” for à (4th tone), or repeating the vowel one more time for the 1st tone.

For example, typing aaa gives ā, aar gives á, aav gives ǎ, aaf gives à.

aaa → ā 
aar → á 
aav → ǎ 
aaf → à 
Buufyoofng daaanxiiin → Bùyòng dānxīn
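
The examples above can be reproduced with a few lines of Python. This is only a sketch of the basic TT-Input rule as described in this post (doubled vowel plus r, v, f, or the vowel once more), not my working implementation and not the optimised variant; the names `TT_PATTERN` and `tt_to_pinyin` are made up for this illustration.

```python
import re
import unicodedata

# Marker letter -> combining mark: r = acute, v = caron, f = grave.
MARKERS = {"r": "\u0301", "v": "\u030c", "f": "\u0300"}
MACRON = "\u0304"  # 1st tone: the vowel typed a third time

# A doubled vowel followed by r, v, f, or the same vowel once more.
TT_PATTERN = re.compile(r"([aeiouü])\1([rvf]|\1)", re.IGNORECASE)

def _to_dvowel(match):
    vowel, marker = match.group(1), match.group(2).lower()
    mark = MARKERS.get(marker, MACRON)  # not r/v/f means a third vowel
    return unicodedata.normalize("NFC", vowel + mark)

def tt_to_pinyin(typed):
    """Turn TT-Input keystrokes into Pinyin with diacritics."""
    return TT_PATTERN.sub(_to_dvowel, typed)

print(tt_to_pinyin("aaa aar aav aaf"))        # ā á ǎ à
print(tt_to_pinyin("Buufyoofng daaanxiiin"))  # Bùyòng dānxīn
```

Since Pinyin has no double vowels, a doubled vowel in the typed text can only be the start of a tone sequence, which is what keeps the pattern unambiguous.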

This means, just like in Telex for typing Vietnamese, we don’t need to reach up to the numbers row to produce a diacritic. Instead we use letters to type letters.

But now for the overhead. How much worse is it in terms of keystrokes?

  • With TT-input, we need 3 keystrokes to type one dvowel.
  • For 11,842 dvowels we need a total of 35,526 keystrokes.
  • This means when typing 10,024 words with 53,115 characters (as in the Peppa Pig texts I’ve transcribed), we will need to type 76,799 keystrokes in total.
  • Here’s the math: 53,115 characters – 11,842 vowels + 35,526 dvowel keystrokes = 76,799 keystrokes in total.
  • The percentage for that is: 11,842 * 2 / 53,115 * 100 = 44.59

The “TT-Pinyin Input” keystroke overhead is 44.59%. As opposed to 22.3% with number marked input.

Which means: In publishing, a standard manuscript page is often 250–300 words, which translates to 1,500–2,000 characters (including spaces). Let’s say 275 words and 1,750 characters, including whitespace. To type 1 standard manuscript page, an average typist with ~300 KPM (60 WPM) needs:

  • 5 mins 50 s (1,750 keystrokes) – English (No Diacritics)
  • 7 mins 8 s (2,140 keystrokes) – Chinese (Number-Marked Pinyin)
  • 8 mins 26 s (2,530 keystrokes) – Chinese (TT-Input)
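
These typing times follow from the overhead percentages and the typing speed. Here is a small Python sketch for re-checking them; the function names are made up for this illustration, and it assumes a constant 300 keystrokes per minute and a page of 1,750 characters, as above.

```python
def keystrokes_for_page(page_chars, overhead_percent):
    """Keystrokes for one page, given the keystroke overhead in percent."""
    return round(page_chars * (1 + overhead_percent / 100))

def typing_time(keystrokes, kpm=300):
    """Return (minutes, seconds) at a constant keystrokes-per-minute rate."""
    seconds = round(keystrokes * 60 / kpm)
    return divmod(seconds, 60)

PAGE_CHARS = 1_750
for label, overhead in [("English", 0.0),
                        ("Number-Marked Pinyin", 22.30),
                        ("TT-Input", 44.59)]:
    keys = keystrokes_for_page(PAGE_CHARS, overhead)
    minutes, seconds = typing_time(keys)
    print(f"{label}: {keys} keystrokes, {minutes} min {seconds} s")
# English: 1750 keystrokes, 5 min 50 s
# Number-Marked Pinyin: 2140 keystrokes, 7 min 8 s
# TT-Input: 2530 keystrokes, 8 min 26 s
```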

To conclude, for today:

The 22.30% overhead for number-marked Pinyin input is the minimum possible for producing diacritics, if we disregard predictive text and assisted typing algorithms.

Opening keys 4–9 for number-marked input could be a game-changer. Assigning tones to the right-hand number row allows left-hand vowel input + right-hand tone selection, creating a parallel, two-handed input flow. This could speed things up significantly.

The 44.59% overhead for my latest invention, Twin Tone (TT) Pinyin Input, is double that of number-marked input. While TT-Input improves ergonomics by keeping the fingers close to the home row, the extra keystrokes could slow down typing overall, and also take a toll on the finger joints through the extra stress. On the other hand, the shorter travel distances for the fingers could improve comfort and rhythm, so it might be worth investigating this input method further.

Obsession – the only way to live

“Being obsessed over doing something, and being busy with it day and night, is the only way to live,” my best friend commented, while peering over my shoulder last June. This was the time when I decided to improve my handwriting…

…and filled hundreds and hundreds of pages with calligraphy and lettering styles, using the practice sheets I had designed myself, with lines angled at 17 degrees, and a new book on handwriting (which I’ve not finished). Furthermore, I went to countless bookshops and stationery stores, spent days inspecting the shelves of libraries, including the beautiful, BEAUTIFUL, National Library in Vienna, Austria… to dig up official documents and research papers on the official Austrian handwriting style…

…for almost half a year I didn’t do much else, day and night. After all, my forsaken home country is one of the few countries that has such a thing: its own handwriting model, the Austrian Schulschrift, last revised by the Austrian Ministry of Education in 1995. And why not, why not learn to model my own handwriting using the Austrian model?

…maybe because even the Austrian government keeps saying that it’s just a model, and we shouldn’t aim to copy it perfectly. They are right in that sense, perfection is unattainable… for the Os are unlike the German egg-shaped Os… in the Austrian model they are perfectly elliptical, and that’s something humans just can’t draw.

My friend George, who made that comment, he himself is living the way of passionately being obsessed over something and doing only that, day and night, too—but not as a teacher, like myself, but as a musician. Next to sleeping, there’s hardly anything else he and his fellow musicians do, other than practicing, playing and creating music.

We don’t have much money, but we have purpose. Our obsession with what we think of as our work fills us with that rare, highly sought after, ethereal substance called “purpose.” Purpose, an invisible good that seems to nourish us spiritually, fills our lives with meaning, and aligns all our actions to that purpose. And at least I am lucky enough to have patrons who support me and make my deep dives into my projects possible, my deep dives into whatever seems to be the most important thing to do in this world… and is somehow related to my main profession, being a teacher of Somatic Education, with a background in studying the lessons of Moshé Feldenkrais.

This obsessing over topics doesn’t always yield movement lessons, in the sense of being a teacher of Somatic Education. But for once I can say…

…for once I can say that I have completely and fully understood what Moshé Feldenkrais meant by saying,

“You learn the official handwriting style first,
and only then you develop your own.”

Up until now this quote was only a toothless paper-tiger to me, a bland quote that has been repeated up and down in any Feldenkrais training ever done, cited without passion, quoted not knowing the obsession and the striving for perfection, joy, and purpose. Maybe that’s why—up until last June—this quote never resonated with me, was never convincing. Baseless, emotionless, irrelevant.

But now I understand. Now I can talk from experience. I had to become 50 years old to undo the damage done to me in public schooling, and learn how to use body, mind, hand and pen, to put ink to paper, beautifully, and with great satisfaction.


Here’s a video from my “beginning days” of handwriting: https://www.youtube.com/watch?v=7eNLy-YgsKg

Since October last year I have a new obsession:

Romanised Chinese writing and spelling, with the Latin alphabet, to the official standard of the Chinese government, called Hànyǔ Pīnyīn.

It started as a feeling, a desire to do something with lasting effect. Something that’s like putting my foot down, making a dent in history, leaving a legacy, doing something revolutionary, game-changing, joyful and amazing…

…and quickly, within days after my decision to turn Pīnyīn, the romanized Chinese writing system, into something great in my life, my interest in it turned into my new obsession, leading up to the 12 hour workdays I’m having these days, which always feel too short.

I’m working on a web application as a typing helper (to actually be able to write Pīnyīn), with a new type of dictionary interface. And a Mac app. I’ve given up on the iPhone app, for now. And now I’m doing research on Pīnyīn input methods, devising new ones, comparing robustness, ease of use and efficiency. And with these tools I’ve already transcribed a few thousand words, every day a few more.

Furthermore, right now I’m writing a book about Pīnyīn spelling according to the Basic rules of the Chinese phonetic alphabet orthography (GB/T 16159-2012), a standard by the Chinese Ministry of Education and the Chinese State Language Commission.

I wake up, usually around 6:30am, I do half an hour of Feldenkrais-inspired exercises (without which I would probably break apart), shave and shower, and then, for the rest of the day…

…I work every minute I have a chance to. And I go to bed at midnight. Monday through Sunday. And besides that, I try to keep up with my work as a teacher of Feldenkrais. This has been most of my days for the past 3 months already.

“Yes, you’re on the right track,” I hear my friend George saying in my mind; and off he goes to his own practice session.