Skip to main content

3 posts tagged with "project"

View All Tags

Semaphore - part 1

· 3 min read
Jimmy Chu
Site Author

I have been studying about zero-knowledge proof in the past eight weeks. It is hard to grasp how a software project could integrate zero-knowledge proof. Then, through some community chat groups, I came across Privacy & Scaling Explorations group and one of their projects, Semaphore. Semaphore is a very educational project. In this project, I see how ZKP is being applied (in a fashion I understand) and observe excellent software engineering inside.

Semaphore allows users belonging to a certain group to vote without revealing which member they are, aka anonymous voting. There is also a mechanism to prevent users from double-voting, which is done by posting some "residues" generated by the zero-knowledge proof protocol on-chain.

But what is great about the project is that it is very beginner-friendly, and I say this with great respect. The whole project is architected very nicely. It can even be viewed as a model project for how software and packages are structured in modern-day dApps. I have learned a lot from their software structure and encapsulation.

  • The core software project: Semaphore protocol, with a nice structure of:

    • contracts package - the smart contract that verify the proof generated offchain.

    • proof package - the off-chain logics that generate the ZKP with wasm code generated by an arithmetic circuit.

    • circuits package - contains the arithmetic circuit written in circom that generates and verifies proof.

    • hardhat package - a package that allows developers to deploy the core semaphore contract via hardhat without downloading the semaphore project.

  • A template to see how the protocol can be incorporated into other projects. It consists of:

  • The project leverages the Merkle tree structure to verify that a member belongs to a group, with all the tree leaves being the users' identity commitment. The user will generate a Merkle proof to show that he belongs to the group. The core team implements the Merkle tree structure separately as a Lean Incremental Merkle Tree (with a summary note).

  • The project has excellent documentation and website as well, including as two different packages inside the source code.

Studying the project taught me a lot of optimization tricks I can apply in the zk-battleship codebase. I also saw how they used to save on-chain gas costs, particularly using emitted events to store voting results (the user feedback messages) instead of on-chain storage to reduce gas costs further. This is something that never crossed my mind before.

There are two feature enhancements I can help implement on the current Semaphore basis to further my engineering understanding in zk-app development.

  1. Users can later prove they have voted without revealing their voting option, which can become proof of participation for them.

  2. The voting result is hidden during the voting phase until it ends. In this way, the choices made by late voters won't be affected by the given voting result so far.

Cangjie & Chu Bong Foo - part 1

· 4 min read
Jimmy Chu
Site Author

On one occasion in Taiwan, people around me were amazed that we Hong Kong people type Chinese using the Cangjie (倉頡) method. I realize it is uncommon for people to learn how to deconstruct Chinese characters and type them with the Cangjie encoding. Further thinking about it, it is a brilliant idea that Chinese characters can be deconstructed this way. It makes me wonder how the inventor came up with this input methodology. So I dug deeper into its history and found an inspiring story.

The architect Chu Bong Foo (朱邦復), during the 1960s, was working in a publishing house in Brazil. He saw that the publishing house could finish an English book publication in 3 to 5 days. From his youth experience, he knew it would take about two to three months to get a Chinese book published, converting from the author's manuscript to a publishing form. This is because, at that time, there was no systematic way to index Chinese characters. English words are composed of 26 alphabets, and how to index them is well-established. However, there is no systematic way of indexing Chinese characters, and there are over 14k+ most frequently used characters and 40k+ frequently used characters.

So Chu Bong Foo first studied the most frequently used characters, analyzing their shape and deconstructing them into 24 basic shapes (radicals) and 76 auxiliary shapes, often rotated or transposed versions of the basic shapes. Then, he used the following character roots to represent them.

Philosophical Group

  • A: 日 - Sun
  • B: 月 - Moon
  • C: 金 - Gold
  • D: 木 - Wood
  • E: 水 - Water
  • F: 火 - Fire
  • G: 土 - Earth

Stroke Group

  • H: 竹 - apostrophe / slant
  • I: 戈 - dot
  • J: 十 - cruciform
  • K: 大 - cross
  • L: 中 - vertical
  • M: 一 - horizontal
  • N: 弓 - hook

Body Parts Group

  • O: 人 - person
  • P: 心 - heart
  • Q: 手 - hand
  • R: 口 - mouth

Character Shape Group

  • S: 尸 - corpse
  • T: 廿 - twin
  • U: 山 - mountain
  • V: 女 - female
  • W: 田 - field
  • Y: 卜 - fortune telling

Collision / Difficult Group

  • X: 難 - difficult
  • Z: 重 - unknown/collision

Each Chinese character can be represented by a sequence of one to five character roots above, and the duplication rate (given a valid sequence returning more than one character) is less than 10%. Cangjie encoding is the most efficient way of inputting Chinese characters.

With such a system, publication houses can now input Chinese characters much like an English word, and Chinese characters can be sorted in a particular order and indexed efficiently. This has increased the Chinese publication process.

Later in Mr. Chu Bong Foo's life, he was also involved in the following projects:

  • Chinese integration in modern computer systems, back in the 1980s. When the computer itself is not too popularized yet.

  • Launched the first Chinese e-book system in 2001, when the first version of Kindle Reader was launched six years later in 2007.

  • Launched an AI system that takes a Chinese poem text as input and generates a 3D animation, all without human intervention. The process involves natural language processing, Chinese character interpretation, and generative AI, which we talk about today, all back in 2011! For details, refer to his announcement post at that time and the animation 記承天寺 (original download link, or the Youtube video).

Mr. Chu Bong Foo is truly a visionary.

References

Advent of Code 2021

· 6 min read
Jimmy Chu
Site Author

In the past three months I have been working on solving Advent of Code 2021 puzzles. I probably have spent 80+ hours in solving all 25 puzzles. Yes, that is pretty much a whole two weeks of full-time work cumulatively, and that is with the help of this subreddit forum and a lot of online resources 😳.

My solutions are here, with a corresponding website.

For those who don't know, Advent of Code are a series of programming puzzles released every year before Christmas, from Dec 1st till the Christmas day 25th to be exact, one problem a day. The author Eric Wastl has been doing this since 2015, and the programming puzzles have since used in company training, university coursework, and practice problems.

Coming out from the rabbit hole of solving these puzzles, there are three problems that stand out from the rest, in my opinion. So I want to document them down and reflect on what I have learned from them.

Spoiler Alert

For those who want to solve the puzzles yourselves, please stop reading here 😄

Day 19: Beacon Scanner

The original problem. In solving this puzzle, the first issue is to clarify what the 24 orientations are. The way I eventually visualize it is with a Blu Tack and three piece of papers.

So we can have X, Y, Z axes like that, and we can rotate them to have three different axes arrangement:

image

Now we can have axes pointing at different directions as follows:

image

With eight different set of axes directions and each direction with three axes arrangement, we get a total of 24 orientations.

I then learn that we can rotate a coordinate XYZ based on these orientations by matrix transformation(multiplication). The 24 matrics are listed here at the end.

Now with all beacon coordinates relative to a scanner location, and the scanner orientation be one of the 24 orientations, how are we going to solve the problem? At this point I am out of my wit and look for clues in the subreddit forum. The key is by comparing the distance of beacons between different scanner report.

Yes, Of course! That is the only invariant we have here! And I recall what Bezos have famously said "What is not going to change?".

Indeed, if a scanner have scanned the same set of beacons as another scanner, the distance between any pair within that beacon set are not going to change, no matter where the scanner is (its x, y, z offset), and what orientation the scanner faces (the matrix transformation).

With this insight we proceed to the next problem. How could we determine if two scanners have scanned the same set of twelve beacons, as specified by the problem?

Here we need to get to a concept in graph theory - complete graph. It is a set of nodes with each node have edges connect to every other nodes. So if two scanners have scanned the same set of 12 beacons, we should be able to build a complete graph from these 12 beacons (nodes) and get all of its edge length matches. Now how many edge lengths should we be counting here? It is 12 chooses 2, so 66. So if we build an edge table (with all edge length between any two beacons) for the two scanner reports, and find there are 66 entries in the two tables matches, we can claim there is a very high likelihood that the two scanners have scanned the same set of twelve beacons.

Once we identify the set of 12 beacons, we can determine the orientation of the second scanner. Here we always use the first scanner as the baseline, meaning it is located at (0, 0, 0) with the [[1, 0, 0], [0, 1, 0], [0, 0, 1]] orientation matrix.) We apply all 24 orientations to four beacon coordinate and see which one will result to having the same vector displacement of the first scanner, then we can determine the location and orientation of the second scanner.

Using this methodology we can map out all beacon and scanner locations and the orientation of scanners iteratively.

Day 22: Reactor Reboot

The original problem. I learned two concepts in solving this puzzle. The first one is how to compute the area covered by multiple sets, without double counting, when they overlap with each others.

For instance, the following:

image

We know the total of area are:

Area A + Area B - Area A^B

Now with the following:

image

Area A + Area B + Area C - Area A^B - Area A^C - Area B^C + Area A^B^C

So with N set, following the pattern, it is:

Area A + Area B + .... Area N - Area A^B - Area A^C - .. Area A^N - Area B^C - ... + Area A^B^C + Area A^B^D + .... - ...

I then learned that this is the Inclusion-exclusion principle.

I stumbled upon another sub-problem when solving this puzzle, that is, given a set of items, efficiently generate all its combination and compute its n choose k combinations. I thought it will be fun to code that, so using the algorithm from Dr. James McCaffrey, I coded the algorithm and also the front end Combination App (source code) that you can play with.

Day 23: Amphipod

The original problem. This is a fun puzzle. I learn from sub-reddit that this is basically a shortest path problem and can be solved with Dijkstra's algorithm. So it is a refresher on coding that algorithm and encoding various game states into keys of nodes, with edge extended from the node as all its next possible moves, with the edge length as the move cost.

But in implementing Dijkstra's algorithm, with a large amount of possible states and edges, how do you efficiently get the next shortest path to try? This is when I learn the data structure about Indexed Priority Queue (IPQ) from this Youtube video.

The overall structure is shown as follow:

image

You use an index to represent its nodes' keys and corresponding edge costs. Then during the heap operations (swimming up or down), we just move the index around. So here is the implementation of indexed priority queue with test cases.


That's it on Advent of Code 2021. Overall it is a fun and intellectually challenging experience, and I am glad to finish all 2021 problems 😉.