Because I choose to.

Write a Diary with Sphinx

Jun 22, 2018 13:49 @ Palo Alto

My favorite lightweight markup language is reStructuredText [1]. Unlike XML/HTML, a lightweight markup language is more human-readable while still preserving some basic ability to express hypertext content. This article (and the rest of the blog) is written in reStructuredText. Though not as famous as Markdown [2], reStructuredText is, in my opinion, more feature-rich and extensible. The official Python documentation [3] is written in it, with the help of Sphinx [4], a documentation generator powered by the lower-level reST engine, docutils [5].

Usually documentation generators are used by code projects to generate accompanying references and tutorials. However, thanks to the versatility of Sphinx, we can also keep our personal diaries with it! Here I'm gonna show how one can keep multiple diaries at the same time in a single Sphinx project.

Quickstart

The first thing to do is to make sure Sphinx is installed on your machine. On Archlinux, it is simply:

Then we can make use of the quick start tool to create a template documentation project:

This will start a wizard that asks several questions about your preferences for the project. Assuming you mostly follow the default settings, the project directory diaries/ will end up like this:

```
>> cd diaries/
** 02:28:50 /t/diaries ymf@Pixelbook **
>> ls -l
total 16
drwxr-xr-x 2 ymf ymf   40 Jun 22 02:28 _build
-rw-r--r-- 1 ymf ymf 4723 Jun 22 02:28 conf.py
-rw-r--r-- 1 ymf ymf  449 Jun 22 02:28 index.rst
-rw-r--r-- 1 ymf ymf  607 Jun 22 02:28 Makefile
drwxr-xr-x 2 ymf ymf   40 Jun 22 02:28 _static
drwxr-xr-x 2 ymf ymf   40 Jun 22 02:28 _templates
```

Here, we can include our reST files in index.rst, each keeping track of one diary. In index.rst:

As you might have guessed, we should accordingly add two reST files life.rst and research.rst to our project directory. In life.rst:

In research.rst:
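As a rough sketch of what the three files might contain, the Python helper below writes them out. It assumes a toctree in index.rst that lists `life` and `research`, and one section per diary entry. The titles (other than "My Life") and the entries themselves are made up for illustration, and `write_diaries` is a hypothetical name.

```python
from pathlib import Path

# Hypothetical contents: apart from "My Life", the titles and entries are
# invented for illustration.
INDEX_RST = """\
My Diaries
==========

.. toctree::
   :maxdepth: 1

   life
   research
"""

LIFE_RST = """\
My Life
=======

Jun 21, 2018
------------

Set up a Sphinx project for my diaries.

Jun 22, 2018
------------

Wrote the first real entry.
"""

RESEARCH_RST = """\
My Research
===========

Jun 22, 2018
------------

Read a paper and took some notes.
"""


def write_diaries(project_dir):
    """Write index.rst, life.rst and research.rst into project_dir.

    Note that this overwrites an existing index.rst.
    """
    project_dir = Path(project_dir)
    files = {
        "index.rst": INDEX_RST,
        "life.rst": LIFE_RST,
        "research.rst": RESEARCH_RST,
    }
    for name, text in files.items():
        (project_dir / name).write_text(text, encoding="utf-8")
    return sorted(files)
```

Calling `write_diaries("diaries")` would replace the index.rst created by the quick start tool, so it is probably best tried on a fresh project.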

Finally, try to build your diaries:

This command generates HTML output in the _build/html directory. Open it in a browser to see the compiled diaries:

The link of "My Life" leads to the diary content:

Hacking

So far so good. Using sections as diary entries is not a bad idea, because there won't be too many entries for a browser to handle: imagine you write 10 entries per day and keep doing so for 100 years. The total number of sections will only be 365,000 (10 × 365 × 100), which modern browsers can easily render (I suppose). Scalability won't be a real issue for a diary. If it ever is, just partition your diaries into multiple files, by year, for example.

However, the inconvenience of keeping diaries this way lies in the order of sections. One tends to append new writing at the end of the reST file instead of prepending it to the beginning, but docutils renders sections in their order of appearance in the reST source. That makes sense for an article with only a few sections, but not quite so for a diary that "misuses" sections. Naturally, we'd like to see our latest entry at the top of the page instead of scrolling all the way to the end to find it. Luckily, there is a way to reorder the sections both in the text and in the table of contents. Add the following code to the end of your conf.py:

The sections are reversed as expected with the hack:

[1]	http://docutils.sourceforge.net/rst.html

[2]	https://daringfireball.net/projects/markdown/

[3]	https://docs.python.org/3/

[4]	http://www.sphinx-doc.org/en/master/

[5]	http://docutils.sourceforge.net/


Archlinux on Pixelbook

May 13, 2018 11:19 @ Ithaca

It's been a while since I last paid attention to the laptop market. To my slight disappointment, hardware specs haven't improved as much as I expected. Ultrabooks still mostly ship in two typical combinations: 8G RAM with an i5, or 16G RAM with an i7. I don't need an i7 because I value portability over performance. I have my nice Linux desktop for performance-demanding tasks, and I would like a skinny laptop that is friendly to both surfing and coding. Moreover, I've been waiting for a "dream" machine that can properly serve as both a laptop and a tablet. Pixelbook seems just right.

I got one with 8G RAM, an i5 CPU, and a 128G SSD for $880 during a sale. The hardware is worth the price, considering the decent spec at a weight of 2.4lb, and I also personally like the look, which won't let any haxor down. It runs the heavily shielded Chrome OS. After some searching I realized that I would need to give up the security guarantee by switching to developer mode in order to run a decent Linux distro alongside it. Doing so makes the laptop emit a loud beep at boot time unless one presses Ctrl+D to skip it, and all data will be erased if one presses the space bar and the Enter key before the beep. Just as I was resigning myself to this plan of using crouton, Google held its 2018 I/O event and announced crostini, a way to run other Linux distros inside sandboxed lxc containers, one day before I got my Pixelbook.

The crostini environment currently works, but with hacky scripts. The sommelier [1] program emulates an X11/Wayland server that in turn forwards the rendering commands to the Chrome OS host, crossing the container boundary. The good part of the story is that, with Linux container technology, unlike a traditional VM with fully emulated hardware, the sandboxed Linux system runs with very little overhead. The provided distro, Debian Stretch, is only slightly adapted from the one given by Linux Containers [2]. This gives me some hope of figuring out how to apply the same changes and, in theory, make containers of any Linux distro.

I have used Archlinux since my conversion from Gentoo two years ago. It is easy to set up and bare-bones enough to customize. It has very up-to-date binary packages, and its rolling updates do not break as frequently as in its early years. Archlinux also uses systemd, as Debian does, so it shouldn't be hard to make it work in crostini.

So far, I have managed to run Archlinux as a container with working network, console, and Wayland display. I fixed the network issue using the lxc profile modification found here [3], and converted the Debian packages to make Wayland work. In theory, X11 should work as well, but due to some bug inside sommelier, I'm currently unable to run any X11-only program (see the picture below).

To get Archlinux working on your Pixelbook, try the following steps (in termina prompt):

Adjust the profile for networking:

Create and run the container:

Enter the root shell:

Or use console to login:

You can use bazel to build the Debian packages provided here [4], and convert them to Archlinux packages using debtap [5]. The conversion isn't perfect, so don't forget to create a symlink from /usr/bin/sommelier to /opt/google/cros-containers/bin/sommelier.
To avoid annoying retry messages, disable the ttys with systemctl disable getty@lxc-tty1, and do the same for the other five.
For those slackers, here [6] is the link to the converted packages.
Finally, a picture is worth a thousand words:

/images/pixelbook-with-archlinux.thumbnail.png

It shows a running Archlinux with a Wayland terminal emulator, proper fonts, Rust, Texlive, etc. Here is another picture showing X11 support with urxvt, fcitx IME and open file dialog of evince:

/images/pixelbook-with-archlinux2.thumbnail.png

Troubleshooting

If X11 programs refuse to start and complain about keycodes, try to comment out the following lines in /usr/share/X11/xkb/keycodes/evdev:
```
...
<I255> = 255;   // #define KEY_RFKILL              247

//<I372> = 372;   // #define KEY_FAVORITES           364
//<I374> = 374;   // #define KEY_KEYBOARD            366
...
```
This is because sommelier does not support the keycodes that exist only in the latest version of xkeyboard-config.

[1]	https://chromium.googlesource.com/chromiumos/containers/sommelier

[2]	https://linuxcontainers.org/

[3]	https://github.com/lxc/lxd/issues/4071

[4]	https://chromium.googlesource.com/chromiumos/containers/cros-container-guest-tools/

[5]	https://aur.archlinux.org/packages/debtap/

[6]	https://tedyin.com/archive/cros-archlinux/


A Brief Intro to Input Method Framework, Linux IME, and XIM

Jun 27, 2017 23:06 @ Ithaca

There are times when one needs an input method editor (IME). For CJK users, supporting Unicode and the wide characters of Chinese, Japanese and Korean is not enough, since that only covers the display of their native languages, not a way to input them. Westerners, especially those who can type their characters and words directly on a standard keyboard, may not understand the need for such an input facility, which could be the reason why CJK support is usually added as an extra feature late in the life of a software system.

Briefly speaking, imagine that English had more than 26 letters, far more than that. What would happen? Imagine a language with tens of thousands of basic letters (characters or, typographically, glyphs). How would you design the input stack of a computer system to let users type efficiently? Since we cannot introduce a "super" keyboard with thousands of keys, a better way is to "spell" each character with a series of keystrokes. Loosely speaking, doing this in English would be like spending some time pressing keys to get an "a" in the end, or pressing more than five keys (probably 15 or more) to have "linux" show up in your text editor. This way, we only incur logarithmic time complexity to index a character in the CJK space (think of looking up a word in an English dictionary by tracing its leading letters).

More good news: with very basic statistical methods or more advanced NLP, this way of typing can be fairly efficient even though the same key sequence may yield multiple candidates. The ambiguity comes from the fact that many mainstream input methods for Asian languages use the Latin alphabet (in Japanese this is called "Romaji", literally "Roman letters") to represent the pronunciation of a character. In some languages, for example Chinese, different characters or words are often spelled with the same sequence of letters. For example, both 「元音」("vowel") and 「原因」("reason") are spelled "yuan yin" in pinyin, the romanization scheme standardized by the government of mainland China. Another scheme, zhuyin (or Mandarin Phonetic Symbols), advocated in Taiwan, is used by users there.
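To make the candidate ambiguity concrete, here is a toy sketch of the "very basic statistical" approach: one spelling maps to several words, and the candidates are ordered by how often each word is used. The table and its usage counts are made up for illustration, and `lookup` is a hypothetical name, not part of any real IME.

```python
# Hypothetical toy dictionary: pinyin spelling -> (word, made-up usage count).
CANDIDATES = {
    "yuan yin": [("元音", 120), ("原因", 4500)],
    "yuan": [("元", 900), ("原", 800), ("园", 300)],
}


def lookup(spelling, table=CANDIDATES):
    """Return the candidate words for a spelling, most frequent first."""
    entries = table.get(spelling, [])
    ranked = sorted(entries, key=lambda pair: pair[1], reverse=True)
    return [word for word, _count in ranked]
```

With these made-up counts, `lookup("yuan yin")` puts 「原因」 before 「元音」, so the more common word is the first candidate offered to the user.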

Scripts for Adding Bookmarks to a PDF

Dec 26, 2015 17:26 @ Singapore

As I gradually get used to reading more and more e-books, I find that reading a scanned PDF could be 100x more efficient if it had bookmarks that let you jump between chapters. It could be even more pleasant than reading the original paper book, since the table of contents can always stay on one side of the screen, which makes it more accessible.

However, the truth is that scanned PDFs usually don't come with such detailed bookmarks. It is a painstaking task to add bookmarks one after another through the graphical UI of a PDF editor. A more desirable solution is to add these bookmarks with a script. Luckily, with the help of the Adobe pdfmark Reference and this article, it is fairly easy to achieve:
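A minimal Python sketch of such a generator, not the original script: it assumes `toc` is a nested list of (title, page, children) tuples, which is a made-up layout, and emits one `/OUT pdfmark` line per bookmark. The sample titles and page numbers are placeholders.

```python
# Hypothetical table of contents: (title, page, children) tuples.
toc = [
    ("Preface", 1, []),
    ("Chapter 1", 5, [
        ("Section 1.1", 6, []),
        ("Section 1.2", 12, []),
    ]),
]


def pdf_string(text):
    """Encode text as a PDF string for use in a pdfmark."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII titles: hex string in UTF-16BE with a byte order mark.
        return "<FEFF" + text.encode("utf-16-be").hex().upper() + ">"
    # Escape backslashes first, then parentheses.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def pdfmarks(entries):
    """Yield one pdfmark line per bookmark, parents before children."""
    for title, page, children in entries:
        # /Count is the number of immediate child bookmarks.
        count = "/Count {} ".format(len(children)) if children else ""
        yield "[{}/Title {} /Page {} /OUT pdfmark".format(
            count, pdf_string(title), page)
        yield from pdfmarks(children)
```

Printing `"\n".join(pdfmarks(toc))` into a file gives the bookmark description; page numbers here count physical pages of the PDF, starting from 1.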

The code above generates the bookmark description according to the pdfmark reference. Finally, we can concatenate the original PDF file with the generated bookmark file:
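One common way to do the concatenation is with Ghostscript's pdfwrite device; whether the original post used exactly this invocation is an assumption. The sketch below only builds and runs the command, and the function names and file names are hypothetical.

```python
import subprocess


def gs_command(src_pdf, marks_file, dst_pdf):
    """Build the Ghostscript arguments that rewrite src_pdf with bookmarks."""
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
        "-sOutputFile=" + dst_pdf,
        # The PDF comes first, then the file holding the pdfmark lines.
        src_pdf, marks_file,
    ]


def add_bookmarks(src_pdf, marks_file, dst_pdf):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(gs_command(src_pdf, marks_file, dst_pdf), check=True)
```

For example, `add_bookmarks("book.pdf", "bookmarks.txt", "book-with-toc.pdf")` would write a new file and leave the scanned original untouched, provided `gs` is installed.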

Note that you need to change the variable toc in the above script to one which describes your table of contents.


Application for PhD

Oct 19, 2015 22:25 @ Singapore

I'm applying for the CS PhD programs starting from Fall 2016. Click here for my CV. I've made my decision: Cornell. Thanks for your attention!
Accounts:
- GitHub: https://github.com/Determinant/
- Stackoverflow: http://stackoverflow.com/users/544806/determinant
Project Mirror: http://tedyin.com/cgit/
Interesting Links:
- http://hackingdistributed.com/2015/10/19/six-degrees-of-alan-turing/
This post will be supplemented with more details.
UPDATE: my last paper has been accepted by ICASSP 2016!
