Hagen Fritsch

Protein Design

[Image: Protein Design Cycle]

In my efforts towards world domination, I discovered that the key technology to master is biology. For a recent seminar I had to dive quite deep into protein design, trying to figure out how it works and what the fundamental concepts of this bioinformatics technique are.

So here are the slides, which I hope provide some insight into this fantastic technique and its promises.

The slides were actually done in HTML5 using html5slides, but that project does not yet use all the power HTML5 has to offer, so the slides remain pretty basic and needed a lot of layout overhead.

lxml-based BeautifulSoup loader

With ElementSoup there is already a tool that lets you create an etree document using the more fault-tolerant BeautifulSoup parser. The opposite direction, however (i.e. creating a BeautifulSoup document using the lxml parser), was not yet possible.
In my experience, BeautifulSoup’s API is much more intuitive and useful, especially for quick scraping and data manipulation tasks. So the only reason to use lxml in the first place is that its parser is much quicker and consumes less memory.
Recently I had a workflow built for BeautifulSoup-based documents, but found that BeautifulSoup was too slow to parse my several-MB document. So here is lxmlsouper, a tool that uses lxml to parse the document and creates the BeautifulSoup DOM from it, which is at least way quicker than the native way.

Notes: feel free to swap the etree implementation for whatever you like best. Also, this does not emulate the BeautifulSoup API on top of etree, but uses the etree data to create a BeautifulSoup document from scratch, copying everything.
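To illustrate the "copy everything" idea, here is a minimal sketch. It is not the code from lxmlsouper.py: it assumes the standard library's xml.etree.ElementTree as the etree implementation, and it uses plain (name, attrs, children) tuples as a hypothetical stand-in for BeautifulSoup's tag objects. The point it shows is that etree keeps text in `.text` and `.tail`, while a BeautifulSoup-style DOM keeps text nodes in the list of children, so the copy has to interleave them.

```python
import xml.etree.ElementTree as ET


def copy_element(elem):
    """Copy an etree element into a (name, attrs, children) tuple.

    The tuple is only a stand-in for a BeautifulSoup tag: children is a
    list that mixes strings (text nodes) and nested tuples (child tags).
    """
    children = []
    if elem.text:
        # Text before the first child element
        children.append(elem.text)
    for child in elem:
        children.append(copy_element(child))
        if child.tail:
            # Text that follows the child, still inside this element
            children.append(child.tail)
    return (elem.tag, dict(elem.attrib), children)


root = ET.fromstring('<p class="x">Hello <b>big</b> world</p>')
tree = copy_element(root)
print(tree)
```

The real tool would create BeautifulSoup objects instead of tuples, but the traversal order would presumably look similar.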

Files: lxmlsouper.py

Usage:
import io
import lxmlsouper
# io.open decodes the file to unicode text and closes it afterwards
with io.open("bigfile.html", encoding="utf8") as f:
    data = f.read()
soup = lxmlsouper.fastSoupLoader(data)

Goodbye studiVZ

[Image: Goodbye studiVZ logo]

Our beloved studiVZ has long been nothing more than a big graveyard full of data corpses. More and more people are rightly deciding to finally bury their own data corpse and delete their account. The bitter part, however, is that you have to leave behind all the things that have accumulated there over time, be it messages, wall posts, photo albums or your friends' group lists.
Well, not quite: there is a studiVZ plugin for the generic POP3 wrapper freepops, which lets you download all your messages and your own wall with your favorite email program. Great stuff!
It would be even better, though, if you could build yourself an archive of your data there. That is exactly what I have done: I wrote a script that downloads the data you have stored there and parses it in a rudimentary way. The result is a structured JSON file with all the important information you left there. On request, it can also download your friends' wall posts, photo tags and photo albums in addition to their profiles, which takes somewhat longer and generates a considerable pile of data.

To make leaving completely risk-free, it is a good idea to send everyone on your friends list whom you still know somehow a message with your new contact details.

If you want to use the script yourself, that is very easy:

$ git clone git://github.com/rumpeltux/vz-backup.git
$ cd vz-backup
$ ./studivz email password

Then sit back and wait. If errors occur, I am of course interested in a diagnosis that is as precise as possible :)
In the end you get a zip file with all the scraped HTML pages (which is also reused if you repeat the run after an error, so that the pages do not have to be downloaded again), as well as an .img-list file with the URLs of all images. These still have to be downloaded separately, e.g. with wget:

$ mkdir images; cd images
$ sort -u ../*.img-list | wget -i -
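If `sort` is not at hand, the de-duplication step can be sketched in Python as well. This is only an illustration: it assumes the .img-list files contain one URL per line (which the `wget -i -` call above implies), and the helper name `unique_urls` is made up for this example.

```python
import glob


def unique_urls(pattern="*.img-list"):
    """Return the sorted, de-duplicated URLs from all matching list files."""
    urls = set()
    for path in glob.glob(pattern):
        with open(path) as f:
            # One URL per line is assumed; blank lines are skipped
            urls.update(line.strip() for line in f if line.strip())
    return sorted(urls)


if __name__ == "__main__":
    # Same input as the shell pipeline, run from inside the images directory
    print("\n".join(unique_urls("../*.img-list")))
```

The printed list can be saved to a file and passed to `wget -i`.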

For everyone who finds this too complicated, there is also a web service that does it for you. Just go to http://studivz.irgendwo.org/goodbye/. Depending on the server load, the process can take more or less time, and you will receive an email with a link to your archive.

What I am still wishing for:

  • At the moment you cannot do much with the data. It is “there”, but it is not presented nicely. If someone finds the energy to build a viewer page for the JSON data, ideally HTML-only, that would be great!
  • If someone writes a Diaspora import tool, that would be great too. Then this could be turned into a kind of migration service.

So, I hope the tool is useful to you and that you can now finally say “Goodbye” to studiVZ as well :)

Side-Channel-Analysis on RFID-Tags

[Image: Framework Illustration]

For the last six months I’ve been playing around with the idea of running differential power analysis attacks against ciphers in RFID smartcards. This became the topic of my master’s thesis, but since I lack analogue expertise, someone else was responsible for the measurement setup, which did not work out. Consequently that part failed and I had to produce something without an adequate measurement setup. Unfortunately, the attainable results were not good enough to launch an actual attack, but they were sufficient to illustrate the power profile of the targeted cards. The actual result is a revised topic and a bunch of software, which I’ll release soon (bug me if not!) or have already released:

  • libdpa: a library for DPA preprocessing
  • a Mifare DESFire implementation for the proxmark3

Furthermore, there is the presentation of the thesis (SVG), which illustrates process and design graphically rather than in words. The words are handled by the official document:
Design of a Framework for Side-Channel Attacks on RFID-Tags (PDF).
In it I elaborate on the basics of RFID and side-channel attacks, give an overview of the DESFire protocol (parts of which had to be reverse-engineered for the thesis) and present the results of the experiments conducted, concluding that SCA on RFID tags seems very likely to be possible with slightly improved measurement setups.

Building a color world map

I’m trying to use the data of xkcd’s color survey to produce a kind of political map of the HSV color world. This is getting way more complicated than I initially thought ;)

Here is the initial plain world map of HSV color space (mapped to 2D, i.e. not including greyscale):
[Image: 2d view of hsv colorspace]

Well, this mapping comes with a problem, best illustrated by this graph of the survey’s data density mapped into HSV space (the survey data is distributed just fine in RGB, but when mapping RGB to this crippled HSV, things get weird). White means lots of data, black means none:
[Image: density when converting from RGB to HSV]
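
The effect can be reproduced without the survey data. The following is a rough sketch, not the script used for the image above: it samples the RGB cube on an even grid as a stand-in for the survey data, converts each sample with the standard library's `colorsys`, and counts samples per cell. Projecting onto hue × saturation and dropping value is an assumption made for this example; the bin counts are arbitrary too.

```python
import colorsys

HUE_BINS, SAT_BINS = 12, 8  # arbitrary grid resolution for this sketch


def hsv_density(step=17):
    """Count evenly spaced RGB samples per (hue, saturation) cell."""
    grid = [[0] * SAT_BINS for _ in range(HUE_BINS)]
    levels = range(0, 256, step)  # 16 levels per channel for step=17
    for r in levels:
        for g in levels:
            for b in levels:
                if r == g == b:
                    continue  # greys have no hue and are left out of the 2D map
                h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                hi = min(int(h * HUE_BINS), HUE_BINS - 1)
                si = min(int(s * SAT_BINS), SAT_BINS - 1)
                grid[hi][si] += 1
    return grid


grid = hsv_density()
for row in grid:
    print(" ".join("%4d" % n for n in row))
```

Although the input is perfectly even in RGB, the counts per cell differ a lot, which is the kind of unevenness the density image shows.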

So here are a couple of things that suck as well when playing around with this: noisy data, spelling (xkcd has a list), RGB, gimp, HSV, python (occasionally) and most of all: colors (that’s not true, colors rock!). Things that rock too: python, PIL (sometimes), PNG, mogrify.

Anyway, I’m not completely there yet, but some of my intermediate results are already pretty cool, so have a look.
[Image: initial color speckles]
I actually forgot how I produced this map, but I guess it shows the top-40 color names people used at each point of the HSV space, each drawn in the color that best fits the name. The dots are slightly bigger because I applied a 3×3 Gaussian filter to the input data.

A lot more coding and quite some magic got me the borders for the continents, so I hereby present the first HSV color continents map:
[Image: hsv continents world map]
So now you might be thinking: if he’s producing a continent map, why can’t he do the countries the same way?
Well, countries are much more complicated, as they tend not to be as well separated as continents. Also, many people just answer “green” for all kinds of green tones, so in the “light green” area there is a lot of “green” noise, but I have figured that one out already.

[Image: plain simple country map]
However, if you look at this map, there are some areas where it is quite clear which country they should belong to (e.g. yellow is very well separated from the others), but several blue, green or purple tones are still troublesome.
My current approach is to find these troublesome areas, so that I can declare them war zones that colors are still fighting over. Doing so produced a map with the color blend for each pixel and, in the alpha channel, the likelihood of that being the right choice. Looks cool, eh?
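
Per pixel, that idea could look roughly like the sketch below. It is an illustration under assumptions, not the code behind the map: the blend is taken as the vote-weighted average of the colors belonging to each name, and the likelihood as the share of answers that went to the most popular name. The function name and the sample data are made up.

```python
def blend_with_likelihood(votes, name_colors):
    """Return (r, g, b, a) for one pixel.

    votes maps a color name to how often it was chosen for this pixel,
    name_colors maps a color name to its representative (r, g, b).
    """
    total = sum(votes.values())
    if total == 0:
        return (0, 0, 0, 0)  # no data: fully transparent
    # Vote-weighted average of the representative colors, per channel
    blend = tuple(
        int(round(sum(name_colors[name][i] * n for name, n in votes.items()) / float(total)))
        for i in range(3)
    )
    # Share of the winning name, stored as 0..255 alpha
    likelihood = max(votes.values()) / float(total)
    return blend + (int(round(255 * likelihood)),)


colors = {"green": (0, 255, 0), "blue": (0, 0, 255)}
print(blend_with_likelihood({"green": 6, "blue": 2}, colors))  # (0, 191, 64, 191)
```

Pixels where two names are nearly tied end up with a low alpha, which is what would mark them as war zones.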

[Image: countries with likelihood alpha]

Getting to the final map shouldn’t take too long, but then again I have already worked way too long on this, so I might only finish it in the more distant future.
