Systolic Blood Pressure

At the end of this semester (18726), we finally got our PCB/die shot back!

Master Siyuan, may the force be with you.

Debian 12 "Bookworm"

Debian 12 “Bookworm” was released last month (June 2023)!

Development

Debian 12 comes with recent development tools, such as

  • GCC 12.2
  • Clang 14.0
  • Linux 6.1
  • Python 3.11
  • Vim 9.0
  • Java 17.0
  • PHP 8.2
  • Node 18.13

Desktop

GNOME 43.4 is shipped with Debian 12. Compared to the GNOME desktop in Debian 11, it has built-in power management features, a nicer-looking notification area (which resembles the one in Windows 11), and better support for the dark theme. One major change is that the virtual desktops are now arranged horizontally.

Upgrade

My upgrade process was extremely smooth. I just needed to change the repository and let apt upgrade, and after a reboot the system became Debian 12 (>w<). The only noticeable changes are:

  • TLP is replaced by the GNOME power management utility. Honestly I think TLP is more powerful and provides more control, but GNOME seems to be sufficient for laptops in most cases.
  • unattended-upgrades is replaced by GNOME Software. I really don’t like GNOME Software (it is way less reliable than unattended-upgrades), so I will just run updates manually.

Overall it’s a nice system upgrade. Debian 12 is as reliable as any other Debian release.

Functional Programming is Terrible

Functional programming is somewhat popular at CMU. There is even a course about it. However, functional programming is the worst programming paradigm. Going through the marketing points of the course:

  • FP isn’t better for program verification: Verification is almost trivial without mutable state, in almost all programming languages. Analyzing control flow, etc. isn’t hard as long as the syntax is well-defined. State is the challenge for automatic verification, and functional programs are supposed to be immutable and stateless. But you can’t avoid state in real life. State is what makes software useful. If you write a program without mutable state, it is essentially one huge expression plus some adapter code. You can’t even implement dynamic programming without some kind of state for memoization.
  • FP makes parallelism harder: It is true that making more things immutable can reduce the possibility of race conditions. But all common languages have const or final, which control it better. And other languages have better parallelism ecosystems. C++ has CUDA/SIMD intrinsics, Go has goroutines, Java has Hadoop, and there are many others.
  • Abstraction: It’s way easier to implement proper abstraction with OOP languages. OOP is mostly just about abstraction/encapsulation, and it does really well at that. OCaml has better abstraction by learning from OOP, but the syntax is still somewhat counterintuitive.
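
As a small illustration of the memoization point above, here is a minimal C++ sketch of top-down dynamic programming. The function name fib and the use of std::unordered_map as the memo table are choices made for this example only; the point is just that the table is mutable state that gets updated between calls.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Top-down dynamic programming: the memo table is explicit mutable state.
std::uint64_t fib(unsigned n, std::unordered_map<unsigned, std::uint64_t>& memo) {
    assert(n <= 93);  // fib(94) no longer fits into 64 bits
    if (n < 2) {
        return n;
    }
    if (auto it = memo.find(n); it != memo.end()) {
        return it->second;  // already computed, reuse the stored result
    }
    const std::uint64_t value = fib(n - 1, memo) + fib(n - 2, memo);
    memo[n] = value;  // the state update that makes this linear instead of exponential
    return value;
}

// Convenience wrapper that owns the table for a single query.
std::uint64_t fib(unsigned n) {
    std::unordered_map<unsigned, std::uint64_t> memo;
    return fib(n, memo);
}
```

Calling fib(50) returns 12586269025 after about 50 table insertions; without the table the same recursion would take billions of calls.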

And most FP languages use terrible language constructs,

  • ADT: Algebraic data types are a very inefficient way to represent data. ADT is often used recursively (if not, it is as harmless as std::variant). And it can be huge: there are often programs that store trees in ADT. The data have to be allocated on the heap (compilers can’t know the size in advance) and all nodes are tagged (to store the variant). Now some seemingly innocuous data structures can be very inefficient.
  • Pattern matching: Pattern matching has to be sequential because of the pattern matching priority from top to bottom, which makes it no more useful than a long list of if/else. And unfortunately you can’t avoid pattern matching because of ADT.
  • FP significantly decreases code readability and maintainability: Common features like loops are missing, so you have to use lambda functions instead. This is much more difficult to read than something like range-based for in most languages. You have to write a function just to print “Hello, world!” 5 times, and good luck with your stack usage. It becomes worse with higher-order functions, which just waste programmer effort on understanding the code and the control flow. Most languages have had human-readable control flow since as early as Fortran! Don’t overuse lambda functions, in any language.
  • Code reuse: There is essentially no polymorphism in FP languages. The type system is at most something like C++20 concepts and broken generics, that’s NOT polymorphism.
  • FP makes complexity analysis harder: Do you want to analyze the time complexity of a huge mess of lambdas, function arguments, closures and expressions?
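
To make the ADT bullet above concrete, here is a rough C++ sketch of a recursive tagged tree built on std::variant. The names Leaf, Branch, Tree, make_branch and sum are made up for this example, and real FP runtimes lay out their data differently; the sketch only shows where the tag and the heap allocations come from.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

struct Leaf {
    int value;
};
struct Branch;  // defined below; the variant only stores a pointer to it

// The variant keeps a tag (its index) next to either a Leaf or an owning pointer.
using Tree = std::variant<Leaf, std::unique_ptr<Branch>>;

struct Branch {
    Tree left;
    Tree right;
};

// Every interior node costs one heap allocation, because the size of a
// recursive type cannot be known in advance.
Tree make_branch(Tree left, Tree right) {
    return std::make_unique<Branch>(Branch{std::move(left), std::move(right)});
}

// "Pattern matching" by hand: test the tag, then recurse into the children.
int sum(const Tree& tree) {
    if (const Leaf* leaf = std::get_if<Leaf>(&tree)) {
        return leaf->value;
    }
    const std::unique_ptr<Branch>& branch = std::get<std::unique_ptr<Branch>>(tree);
    assert(branch != nullptr);
    return sum(branch->left) + sum(branch->right);
}
```

For example, sum(make_branch(Leaf{1}, make_branch(Leaf{2}, Leaf{3}))) evaluates to 6 and needs two heap allocations for three integers.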

Very Slow Computer

Very Slow Computer is a project in build18 2023. It has an adder/comparator built from NAND gates. And it can run programs for 15251’s “register machine” model.
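
For readers who have not built an adder from NAND gates before, here is a small software sketch of the idea in C++. It is not the project's actual circuit: the 9-gate full adder is the textbook construction, and the names nand_gate, full_adder, add8 and check_against_builtin as well as the 8-bit width are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// The only primitive gate used below.
constexpr bool nand_gate(bool a, bool b) { return !(a && b); }

struct FullAdderOut {
    bool sum;
    bool carry;
};

// Textbook full adder made of 9 NAND gates.
constexpr FullAdderOut full_adder(bool a, bool b, bool carry_in) {
    const bool n1 = nand_gate(a, b);
    const bool n2 = nand_gate(a, n1);
    const bool n3 = nand_gate(b, n1);
    const bool half = nand_gate(n2, n3);  // a XOR b
    const bool n4 = nand_gate(half, carry_in);
    const bool n5 = nand_gate(half, n4);
    const bool n6 = nand_gate(carry_in, n4);
    return {nand_gate(n5, n6),   // (a XOR b) XOR carry_in
            nand_gate(n1, n4)};  // (a AND b) OR ((a XOR b) AND carry_in)
}

struct AddResult {
    std::uint8_t sum;
    bool carry;
};

// Ripple-carry adder: chain eight full adders, lowest bit first.
constexpr AddResult add8(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = 0;
    bool carry = false;
    for (int bit = 0; bit < 8; ++bit) {
        const FullAdderOut out = full_adder((a >> bit) & 1, (b >> bit) & 1, carry);
        if (out.sum) {
            sum = static_cast<std::uint8_t>(sum | (1u << bit));
        }
        carry = out.carry;
    }
    return {sum, carry};
}

// Exhaustively compare the gate-level adder with built-in addition.
void check_against_builtin() {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            const AddResult r = add8(static_cast<std::uint8_t>(a), static_cast<std::uint8_t>(b));
            assert(r.sum == ((a + b) & 0xFFu));
            assert(r.carry == ((a + b) > 0xFFu));
        }
    }
}
```

The carry chain is also what makes such a design slow in hardware: each bit has to wait for the carry from the bit below it.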

TNP: Introduction

TNP (Tensor Native Processor) is an experimental processor design to run AI computation in a very efficient way. Primarily, TNP relies on a systolic array to accelerate matrix operations (Google’s TPU also uses systolic arrays). Because we want to maximize systolic array performance, we make everything in the project design “tensor native”:

  • 2-diagonal cache read/write - Keep the systolic array busy.
  • Matrix-shaped registers - Each matrix register has the same size as the systolic array. Now we can read/write rows, columns and diagonals for each matrix register.
  • Local memory for each core (NUMA) - Maximize memory bandwidth, and avoid shared, last level cache.
  • Vector-add ALU inside matrix core - Adding two vectors turns out to be a very common case during matrix multiplication, so we include it in the same core.
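
The matrix-shaped register idea above can be modelled in software roughly as follows. This is only an illustrative sketch, not the SystemVerilog design: the class name MatrixRegister, its method names, and in particular the choice of wrapped diagonals (element i of diagonal d sits at row i, column (i + d) mod N) are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Software model of an N x N matrix register with row, column and
// (wrapped) diagonal access.
template <typename T, std::size_t N>
class MatrixRegister {
public:
    using Vector = std::array<T, N>;

    void write_row(std::size_t row, const Vector& values) {
        assert(row < N);
        cells_[row] = values;
    }

    Vector read_row(std::size_t row) const {
        assert(row < N);
        return cells_[row];
    }

    Vector read_col(std::size_t col) const {
        assert(col < N);
        Vector out{};
        for (std::size_t i = 0; i < N; ++i) {
            out[i] = cells_[i][col];
        }
        return out;
    }

    // Diagonal d takes element (i, (i + d) mod N) from every row i, so one
    // read touches each row and each column exactly once.
    Vector read_diag(std::size_t diag) const {
        assert(diag < N);
        Vector out{};
        for (std::size_t i = 0; i < N; ++i) {
            out[i] = cells_[i][(i + diag) % N];
        }
        return out;
    }

private:
    std::array<Vector, N> cells_{};
};
```

With a 3x3 register holding the rows {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, read_col(1) gives {2, 5, 8}, read_diag(0) gives {1, 5, 9} and read_diag(1) gives {2, 6, 7}.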

And we want the compiler to have maximum control of underlying hardware, since the compiler is aware of the global computation structure:

  • Software managed cache (instead of LRU) - More intelligent register resource allocation.
  • Message-passing interface between cores - Let compilers explicitly indicate inter-core communication.
  • Deterministic execution time - This helps us predict the execution time of any code fragment, which enables a feedback loop between code generation and execution time evaluation right at compile time.

We implemented the hardware part (matrix cores, vector cores, switch) in SystemVerilog and the software part (assembler, compiler, ONNX interface) in C++.

This project was initially a 15418/15618 course project (project poster).

Hello, modules!

Modules are a very exciting part of C++20. Similar to Go modules, C++20 modules can save the compiler from repeatedly scanning header files. This will make C++ code build faster and become more modular (independent of #include order). And the Hello World program becomes easier to write XD

This program is compiled using cl.exe 19.33 (MSVC v143, Visual Studio 2022). GCC and Clang don’t support importing the standard library yet, and MSVC support is still experimental. But the future of C++ looks great!

I learned C++98 around 2014 (due to outdated C++ compilers in National Olympiad of Informatics: we were still using Ubuntu 10.10 (a non-LTS version, EOL in 2011) in competitions then because it still worked). Since C++98, C++ has evolved significantly and now helps programmers write modern, efficient code. The largest steps were probably C++11 and C++20. Some impressive new features are:

  • auto - Automatic type deduction, which saves us from typing something like std::vector<std::vector<int> >::const_iterator manually.
  • constexpr and consteval - Compile time evaluation. As C++ users, we want extreme performance!
  • enum class and std::variant - More structured enums and unions. Finally they are not just integer values.
  • Range-based for - I guess it’s syntactic sugar, but it can be life-saving!
  • Smart pointers - RAII is the right way to handle memory (as well as other resources) in an efficient way.
  • concepts - Makes generic programming more fun and reliable.
  • T&& - Rvalue reference. Together with copy elision, we can make data flow more efficiently by removing redundant copy operations. (Which type is auto&&?)
  • Lambda function [] () {} - Sounds useful with STL algorithms.
  • Ranges - More powerful “iterators”!
  • Lots of cool stuff in STL (std::thread, std::atomic, std::bind, std::chrono, and many others).
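
As for the auto&& question in the list above: auto&& is a forwarding reference, so the deduced type depends on the value category of the initializer. A minimal sketch (the function and variable names are made up for this example):

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Shows what auto&& deduces for different initializers and returns the
// final value of the local variable it writes through.
int forwarding_reference_demo() {
    int lvalue = 1;
    const int const_lvalue = 2;

    auto&& from_lvalue = lvalue;             // deduces int&
    auto&& from_const = const_lvalue;        // deduces const int&
    auto&& from_prvalue = 3;                 // deduces int&& (the temporary's lifetime is extended)
    auto&& from_xvalue = std::move(lvalue);  // deduces int&&

    static_assert(std::is_same_v<decltype(from_lvalue), int&>);
    static_assert(std::is_same_v<decltype(from_const), const int&>);
    static_assert(std::is_same_v<decltype(from_prvalue), int&&>);
    static_assert(std::is_same_v<decltype(from_xvalue), int&&>);

    from_lvalue = 10;  // writes through the reference into lvalue
    assert(from_xvalue == 10);  // std::move did not copy; it still refers to lvalue
    assert(from_const == 2 && from_prvalue == 3);
    return lvalue;
}
```

This is why for (auto&& x : range) works for any range: it binds to lvalue elements, const elements and proxy temporaries alike.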

Christmas 2022

Merry Christmas!

All stores near CMU are closed, so I bought some “Instant Ramen”!

PulseAudio Issue

Recently I upgraded to Linux kernel 5.19 (using Debian backports) and updated the UEFI firmware. First, there was a kernel error message about an unstable tsc clocksource, so I added tsc=nowatchdog to the kernel parameters (I also added iommu=pt). (This is a bug in Lenovo’s firmware.) However, the audio output from the 3.5mm audio jack then contained some noise. It was solved after I changed

load-module module-udev-detect

to

load-module module-udev-detect tsched=0

in /etc/pulse/default.pa, but I still don’t know why it works.

Update: My microphone stopped working recently, and it worked again after I reverted this change…

CMU Course List

CMU Course List (Project Candela) is released (including next semester’s courses)! You are welcome to post comments/tags/pages on courses!