The purpose of this website is to be my Feynman Notebook of sorts. If you haven't heard of the idea before, check this page out. I intend to describe here some of the technical things I learn.
It would not be a good about page if I didn't introduce myself. My name is Lukasz Janyst. Even though I am a relative software engineering rookie, I have managed to learn a thing or two about the following topics:
Massive storage systems: I designed and built the namespace module for EOS, the Tier-0 storage system for the Large Hadron Collider's data. Last time I checked, its five instances at CERN held 200 million files and 30 petabytes of data. By now, it's likely double or triple that because of the recent intense data taking.
C/C++: Back in antiquity, when clang's support for C++ was still in the planning phase, I wrote the proof-of-concept prototype for Cling. Quite surprisingly, my name is still in some of the source files, even though the code has changed a lot. I also did some work on C++ introspection, extending existing systems (CINT and Reflex) to handle data model evolution rules.
Network programming: I designed and built the new XRootD client (XrdCl) library and command-line utilities. XRootD is the base technology for the global data federations of the LHC experiments. The client handles the last stage of LHC data taking and most of the data analysis I/O traffic worldwide. That amounts to tens of GB/s around the clock at CERN alone. CASTOR and EOS also use it for internal data transfers, such as pool rebalancing, parity reconstruction, and tape transfers. XRootD is a very generic technology. LSST, for instance, uses it to shard MySQL databases.
Data crunching: I made various improvements to the file formats used to store the LHC data. I wrote the code that made it possible to store collections of polymorphic pointers in column-wise mode. This approach reduces the on-disk size and read time by ~30% compared to object-wise mode. I proposed and implemented a semi-automatic way of handling data model evolution. It's useful when you need to change your data model, but already have petabytes stored using the old one.
Software engineering: I was in charge of releasing XRootD to external and internal partners. To ensure quality, I created an automated build and testing system. The packages it produced were made available to other projects for incremental integration.
My current semi-professional interests (a.k.a. skills in development) include:
GPGPU: I play with implementing classical parallel algorithms in CUDA.
Lisp: I am currently going through Paul Graham's On Lisp and Peter Norvig's Paradigms of Artificial Intelligence Programming.
Deep learning: I am working my way through the Deep Reinforcement Learning course from UC Berkeley.
Outside of work, I like: