Content tagged programming

Intro

Following a piece of advice from a friend, I decided to buy this new domain name and start writing down all the cool things I do. I have written a bit in a bunch of other places before and have other quasi-failed blogs, so I actually already have a bit of content to bootstrap this one.

There are plenty of blogware options out there, but, as a programmer, I like the ones that keep the content in a version control system intended for software. From the alternatives available in this department, I decided to go for c()λeslaw. It's kind of similar to Jekyll, which I have used before, and it's written in Common Lisp, which is typically a good sign.

There is very little documentation on the Internet on how to use it, but it's not hard to figure out after reading the code. This post is a brief summary of what I have done to create this website and convert the content from Jekyll.

Site Structure

The first thing you need to do is create a .coleslawrc file describing the layout of the site, the theme used to render the final HTML, and other such things. There's a good example here and you can get the full picture by reading the source. :) I like to change the separator with (:separator "---"), so that --- distinguishes the metadata from the content section in source files. This makes things look the Jekyll way. The static-pages plugin makes it possible to create content other than blog posts and indices.

Coleslaw will search the repo for files ending with .post (and .page if the static-pages plugin is enabled) and run them through the renderer selected in the page's metadata section. It will generate the indices automatically and copy verbatim everything it finds in the static directory.

You can create your own theme following the rules described here or choose something from the built-in options. I built the theme you see here more or less from scratch using Bootstrap and the live customizer to tweak the colors. It was a fairly easy and pleasant exercise.

In the end, the resulting directory structure looks roughly like this:

==> find
./.coleslawrc
./about.page
./pictures.page
./talks.page
./posts/0027-blogging-with-coleslaw.post
...
./static/pictures/pic_0001.jpg
...
./static/scripts/jquery.min.js
./static/images/profile.jpg
./static/images/favicon.png
...
./plugins/markdown.lisp
./plugins/preprocessor.lisp
./plugins/deploy-rsync.lisp
./themes/jany-st/base.tmpl
./themes/jany-st/index.tmpl
./themes/jany-st/post.tmpl
./themes/jany-st/js/bootstrap.min.js
./themes/jany-st/css/bootstrap.min.css
./themes/jany-st/css/custom.css
./themes/jany-st/css/syntax.css

The first few lines of the post you are reading right now look like this:

---
title: Blogging with Coleslaw
date: 2015-12-07
tags: blogging, lisp, programming, linux, sbcl
format: md
---

Intro
-----

Following a piece of advice from a friend, I decided to buy this new domain name

Patches

Coleslaw and the packages it depends on work pretty well to begin with, but I made a couple of improvements to make them fit my particular tastes better:

  1. Some themes and plugins are site-specific and cannot be generalized. There is very little point in keeping them in the coleslaw source tree when they really belong with the site content. I submitted patches to make it possible to define themes and plugins in the content repo. See PR-98 and PR-101.
  2. I like to have the HTML files named in a certain way in the resulting web site, so it's convenient for me to be able to specify lambdas in .coleslawrc mapping the content metadata to file names. I made a pull request to allow that (PR-100), but Brit, the maintainer of coleslaw, has different ideas on how to approach this problem.
  3. I think pygments has no real competition when it comes to coloring source code, so I made changes to 3bmd - the markdown rendering library used by coleslaw - allowing it to use pygments. See PR-24.
  4. It's nice to be able to control how the rendered HTML tables look. In order to do that, you need to be able to specify the css class for the table. See PR-25.

Customization

3bmd makes it fairly easy to customize how the final HTML is rendered. For instance, you can change the resulting markup for images by defining an :around method on print-tagged-element. I want the images on this web site to have frames and captions, so I did this:

(defmethod print-tagged-element :around ((tag (eql :image)) stream rest)
  (setf rest (cdr (first rest)))
  (let ((fmt (concatenate 'string
                          "<div class=\"center-wrapper\">"
                          "  <div class=\"img\">"
                          "    <img src=\"~a\" ~@[alt=\"~a\"~] />"
                          "    <div class=\"img-caption\">~@[~a~]</div>"
                          "  </div>"
                          "</div>"))
        ;; Render the image label to a string; it is used both as the alt
        ;; text and as the caption
        (caption (with-output-to-string (s)
                   (mapc (lambda (a) (print-element a s))
                         (getf rest :label)))))
    (format stream
            fmt
            (getf rest :source)
            caption
            caption)))

Being able to use $config.domain and other variables in the markdown makes it possible to define relative paths to images and other resources. This comes in handy if you want to test the web site using different locations. In order to achieve this, you can define an :around method on render-text in the following way:

(defmethod render-text :around (text format)
  (let ((processed
         (reduce #'funcall
                 (list
                  #'process-embeds
                  (lambda (text)
                    (regex-replace-all "{\\\$config.domain}"
                                       text
                                       (domain *config*)))
                  (lambda (text)
                    (regex-replace-all "{\\\$config.repo-dir}"
                                       text
                                       (namestring (repo-dir *config*))))
                  text)
                 :from-end t)))
    (call-next-method processed format)))

Deployment

I use DreamHost for my web hosting and want to use sbcl as the lisp implementation. Unfortunately, all of my attempts to run sbcl there ended up with error messages like this one:

mmap: wanted 1040384 bytes at 0x20000000, actually mapped at 0x3cfc6467000
ensure_space: failed to validate 1040384 bytes at 0x20000000
(hint: Try "ulimit -a"; maybe you should increase memory limits.)

After some investigation, it turned out that DreamHost uses grsecurity kernel patches and, it seems, their implementation of ASLR (Address Space Layout Randomization) does not respect the ADDR_NO_RANDOMIZE personality that sbcl sets at startup. They still allow memory to be mapped at a specific location, which sbcl requires, if the MAP_FIXED flag is passed to mmap. The patch fixing this problem was fairly simple once I figured out what was going on. It looks like it will be included in sbcl 1.3.2. Until then, you will have to recompile the sources yourself.
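To make the difference concrete, here is a minimal C++ sketch of the two ways of asking mmap for a specific address. It is not the actual sbcl patch, and map_at is a hypothetical helper name chosen for illustration. Without MAP_FIXED the address is only a hint, which the kernel is free to ignore (as in the error message above); with MAP_FIXED the call either maps exactly there or fails.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper, not part of sbcl: map `len` bytes of anonymous memory
// at `addr`.
//
// fixed == false: `addr` is only a hint, the kernel may choose another address.
// fixed == true:  MAP_FIXED forces the address. Beware that MAP_FIXED silently
//                 replaces any mapping that already exists in that range.
//
// Returns the address actually mapped, or nullptr on failure.
void *map_at(void *addr, std::size_t len, bool fixed)
{
  assert(len > 0);
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
  if (fixed)
    flags |= MAP_FIXED;
  void *mem = mmap(addr, len, PROT_READ | PROT_WRITE, flags, -1, 0);
  return mem == MAP_FAILED ? nullptr : mem;
}
```

Calling map_at with fixed set to false and comparing the result against the requested address is a quick way to check whether a given host honors the hint.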

Let's see if we get a speedup if we compile the code. The snippets below list the contents of col1.lisp and col2.lisp respectively:

;; col1.lisp
(require 'coleslaw)
(coleslaw:main "/path/to/repo/")
(uiop:quit)

;; col2.lisp
(require 'coleslaw)
(defun main () (coleslaw:main (nth 1 *posix-argv*)))
(sb-ext:save-lisp-and-die "coleslaw.x" :toplevel #'main :executable t)

And this is what you get:

]==> time sbcl --noinform --load col1.lisp
sbcl --noinform --load col1.lisp  6.39s user 1.05s system 97% cpu 7.609 total

]==> sbcl --noinform --load col2.lisp
[undoing binding stack and other enclosing state... done]
[saving current Lisp image into coleslaw.x:
writing 4944 bytes from the read-only space at 0x20000000
writing 3168 bytes from the static space at 0x20100000
writing 85229568 bytes from the dynamic space at 0x1000000000
done]

]==> time ./coleslaw.x /path/to/repo/
./coleslaw.x /path/to/repo/  3.37s user 0.74s system 95% cpu 4.304 total

]==> du -sh ./coleslaw.x
83M     ./coleslaw.x

The compiled code runs almost twice as fast, but the executable weighs 83M!

I wrote the following post-receive hook in order to have the site rendered automatically every time I push new content to the master branch of the repo.

#!/bin/sh
# Clone the pushed repository into a scratch directory
CLONE_DIR=$(mktemp -d)

echo "Cloning the repository..."
git clone "$PWD" "$CLONE_DIR" > /dev/null || exit 1

# Render the site only when the master branch has been updated
while read oldrev newrev refname; do
  if [ "$refname" = "refs/heads/master" ]; then
    echo "Running coleslaw..."
    coleslaw.x "$CLONE_DIR/" > /dev/null
  fi
done

rm -rf "$CLONE_DIR"

Conclusion

Building this web site was quite an instructive experience, especially since it was my first non-toy project done in Common Lisp. It showed me how easy it is to use and hack on CL projects and how handy Quicklisp is. There are plenty of good libraries around and, if they have areas in which they are lacking, it's quite a bit of fun to fill the gaps. The library environment is definitely not as mature as that of Python or Ruby, so new users may find it difficult, but, overall, I think it's worth spending the time to get comfortable with Common Lisp. I finally feel emotionally prepared to go through Peter Norvig's Paradigms of Artificial Intelligence Programming. :)

Generally, LWN runs top quality articles. I always read them with pleasure and they are good enough to make me a paid subscriber. Every now and then, though, they publish something great even by their standards. I read this and was amazed. I had not realized that it is this easy to create and run a simple virtual machine. I typed the code in and played with it for a couple of hours. You can get the file that actually compiles (C++14) and runs here.

//------------------------------------------------------------------------------
// Based on: https://lwn.net/Articles/658511/
//------------------------------------------------------------------------------

#include <iostream>
#include <iomanip>
#include <cstdint>
#include <cstring>
#include <cerrno>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

//------------------------------------------------------------------------------
// Error handling macro
//------------------------------------------------------------------------------
#define RUN(var, command) \
  var = command;          \
  if (var == -1) {        \
    std::cerr << #command ": " << strerror(errno) << std::endl; \
    return 1;             \
  }

//------------------------------------------------------------------------------
// The code to be run inside of the virtual machine
//------------------------------------------------------------------------------
const uint8_t code[] = {
  0xba, 0xf8, 0x03, /* mov $0x3f8, %dx */
  0x00, 0xd8,       /* add %bl, %al */
  0x04, '0',        /* add $'0', %al */
  0xee,             /* out %al, (%dx) */
  0xb0, '\n',       /* mov $'\n', %al */
  0xee,             /* out %al, (%dx) */
  0xf4              /* hlt */
};

int main(int argc, char **argv)
{
  using namespace std;

  //----------------------------------------------------------------------------
  // Open KVM
  //----------------------------------------------------------------------------
  int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
  if (kvm == -1) {
    cerr << "Unable to open /dev/kvm: " << strerror(errno) << endl;
    return 1;
  }

  //----------------------------------------------------------------------------
  // Check the version of the API, we need 12
  //----------------------------------------------------------------------------
  int ret;
  RUN(ret, ioctl(kvm, KVM_GET_API_VERSION, 0));

  if (ret != 12) {
    cerr << "KVM_GET_API_VERSION " << ret << ", expected 12" << endl;
    return 1;
  }

  //----------------------------------------------------------------------------
  // Check if the extension required to set up guest memory is present
  //----------------------------------------------------------------------------
  RUN(ret, ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_USER_MEMORY));

  if (!ret) {
    cerr << "KVM_CAP_USER_MEM extension is not available" << endl;
    return 1;
  }

  //----------------------------------------------------------------------------
  // Set up a virtual machine
  //----------------------------------------------------------------------------
  int vmfd;
  RUN(vmfd, ioctl(kvm, KVM_CREATE_VM, (unsigned long)0));

  //----------------------------------------------------------------------------
  // Get some page-aligned memory and copy the code to it
  //----------------------------------------------------------------------------
  void *mem = mmap(0, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                   -1, 0);
  if (mem == MAP_FAILED) {
    cerr << "Failed to get a page of memory: " << strerror(errno) << endl;
    return 1;
  }
  memcpy(mem, code, sizeof(code));

  //----------------------------------------------------------------------------
  // Tell the virtual machine about this region
  //----------------------------------------------------------------------------
  kvm_userspace_memory_region region;
  memset(&region, 0, sizeof(region));
  region.slot            = 0;
  region.guest_phys_addr = 0x1000;
  region.memory_size     = 0x1000;
  region.userspace_addr  = (uint64_t)mem;

  RUN(ret, ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &region));

  //----------------------------------------------------------------------------
  // Create a virtual CPU #0
  //----------------------------------------------------------------------------
  int vcpufd;
  RUN(vcpufd, ioctl(vmfd, KVM_CREATE_VCPU, (unsigned long)0));

  //----------------------------------------------------------------------------
  // Allocate memory for kvm_run data structure
  //----------------------------------------------------------------------------
  int vcpu_run_size;
  RUN(vcpu_run_size, ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0));

  kvm_run *run;
  run = (kvm_run *)mmap(0, vcpu_run_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                        vcpufd, 0);
  if (run == MAP_FAILED) {
    cerr << "Allocating VCPU run struct failed: " << strerror(errno) << endl;
    return 1;
  }

  //----------------------------------------------------------------------------
  // Set up the special registers of the VCPU #0
  //----------------------------------------------------------------------------
  kvm_sregs sregs;
  RUN(ret, ioctl(vcpufd, KVM_GET_SREGS, &sregs));
  sregs.cs.base = 0;
  sregs.cs.selector = 0;
  RUN(ret, ioctl(vcpufd, KVM_SET_SREGS, &sregs));

  //----------------------------------------------------------------------------
  // Set up the standard registers
  //----------------------------------------------------------------------------
  kvm_regs regs;
  memset(&regs, 0, sizeof(regs));
  regs.rip    = 0x1000;
  regs.rax    = 2;
  regs.rbx    = 2;
  regs.rflags = 0x2; // starting the VM will fail with this not set, x86
                     // architecture requirement
  RUN(ret, ioctl(vcpufd, KVM_SET_REGS, &regs));

  //----------------------------------------------------------------------------
  // Run the VCPU #0
  //----------------------------------------------------------------------------
  while (1) {
    RUN(ret, ioctl(vcpufd, KVM_RUN, 0));
    switch (run->exit_reason) {

      //------------------------------------------------------------------------
      // HLT instruction - we're done
      //------------------------------------------------------------------------
      case KVM_EXIT_HLT:
        cerr << "KVM_EXIT_HLT" << endl;
        return 0;

      //------------------------------------------------------------------------
      // Simulate an IO port at 0x3f8
      //------------------------------------------------------------------------
      case KVM_EXIT_IO:
        if (run->io.direction == KVM_EXIT_IO_OUT &&
            run->io.size      == 1 &&
            run->io.port      == 0x3f8 &&
            run->io.count     == 1)
          cout << *(((char *)run) + run->io.data_offset) << flush;
        else
          cerr << "Unhandled KVM_EXIT_IO" << endl;
        break;

      //------------------------------------------------------------------------
      // Underlying virtualization mechanism can't start the VM
      //------------------------------------------------------------------------
      case KVM_EXIT_FAIL_ENTRY:
        cerr << "KVM_EXIT_FAIL_ENTRY: hardware_entry_failure_reason = 0x";
        cerr << hex;
        cerr << (unsigned long long)run->fail_entry.hardware_entry_failure_reason;
        cerr << endl;
        return 1;

      //------------------------------------------------------------------------
      // Error from the KVM subsystem
      //------------------------------------------------------------------------
      case KVM_EXIT_INTERNAL_ERROR:
        cerr << "KVM_EXIT_INTERNAL_ERROR: suberror = 0x" << hex;
        cerr << run->internal.suberror << endl;
        return 1;
    }
  }

  //----------------------------------------------------------------------------
  // Cleanup
  //----------------------------------------------------------------------------
  munmap(mem, 0x1000);
  munmap(run, vcpu_run_size);
  close(vcpufd);
  close(vmfd);
  close(kvm);

  return 0;
}

When writing software, I have always assumed that I could trust the underlying platform, at least to some basic extent. For instance, when writing a multi-threaded program running on Linux, it is not unreasonable to think that the POSIX thread synchronization mechanisms are actually, you know, thread-safe. As it turns out, that's not quite true. We learned this in a rather painful way: a heavily loaded production system crashed every now and then. I ended up having to implement my own semaphores.
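For illustration only, one common way to build a counting semaphore out of a mutex and a condition variable is sketched below. This is an assumption about what such a replacement could look like, not the implementation that went into the production system, and the class and method names are made up for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// A counting semaphore built from a mutex and a condition variable.
// Illustrative sketch; the names are hypothetical.
class Semaphore
{
  public:
    explicit Semaphore(int initial = 0): count_(initial)
    {
      assert(initial >= 0);
    }

    // Increment the counter and wake one waiter, if any. The notification is
    // sent while the mutex is still held, so post() never touches the object
    // after a woken waiter has had a chance to proceed and destroy it.
    void post()
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++count_;
      cond_.notify_one();
    }

    // Block until the counter is positive, then decrement it.
    void wait()
    {
      std::unique_lock<std::mutex> lock(mutex_);
      cond_.wait(lock, [this] { return count_ > 0; });
      --count_;
    }

    // Decrement the counter if it is positive; never blocks.
    bool try_wait()
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (count_ == 0)
        return false;
      --count_;
      return true;
    }

  private:
    std::mutex              mutex_;
    std::condition_variable cond_;
    int                     count_;
};
```

The design trades some speed for simplicity: every operation takes the mutex, so its correctness rests on the mutex and condition variable alone.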

Video Link

Pretty interesting talk on how to prevent squirrels from stealing bird food using Python and computer vision.

Steps to recognize a squirrel in a picture:

  • Subtract background.
    • Compute average value of each pixel over time to build a background profile.
  • Detect blobs.
  • Discriminate blobs. Distinguish squirrels from other things. The author used support vector machines with the following features:
    • Blob size
    • Color histograms
    • Entropy detection (squirrel tail)
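The background subtraction step can be sketched roughly as follows. The talk used Python; this sketch is in C++ to match the other listings on this page, and the class name, the flat grayscale frame layout, and the threshold value are all assumptions made for the example rather than details from the talk.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-pixel background profile: the average value of each pixel over time.
// Frames are flat arrays of 8-bit grayscale pixels (an assumption).
class BackgroundModel
{
  public:
    explicit BackgroundModel(std::size_t pixels):
      mean_(pixels, 0.0), frames_(0) {}

    // Fold one frame into the running average of every pixel.
    void update(const std::vector<std::uint8_t> &frame)
    {
      assert(frame.size() == mean_.size());
      ++frames_;
      for (std::size_t i = 0; i < frame.size(); ++i)
        mean_[i] += (frame[i] - mean_[i]) / frames_;
    }

    // Mark the pixels that differ from the background by more than
    // `threshold`; connected groups of marked pixels are the candidate blobs.
    std::vector<bool> foreground(const std::vector<std::uint8_t> &frame,
                                 double threshold) const
    {
      assert(frame.size() == mean_.size());
      std::vector<bool> mask(frame.size());
      for (std::size_t i = 0; i < frame.size(); ++i)
        mask[i] = std::fabs(frame[i] - mean_[i]) > threshold;
      return mask;
    }

  private:
    std::vector<double> mean_;
    std::size_t         frames_;
};
```

Blob detection and the classifier would then work on the mask this produces.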

Other interesting stuff mentioned:

Video Link

Way too long for the amount of useful content presented, but still quite OK.

Highlights: