Entries tagged as programming
Friday, June 9. 2006
In the last few days, I was working hard to finish a first working version of my "cwapd" (totally uninspired name, I know, it's supposed to mean "C++ Web Application Platform Daemon"). It is based on my previously-mentioned HTTP stack.
The basic concept of cwapd is that one or several base URLs can be mapped to so-called "contexts". As soon as a URL that begins with one of the registered base URLs is called, the "action" (i.e. the part of the URL after the base URL) is handed over to the registered context. This context then handles the request, and returns the result. As a feature, methods can be easily associated with actions (see my previous posting), which means that no manual dispatching is necessary. Alternatively, actions can also be handled manually, e.g. for cases where the "action" actually represents something like a filename (think of a context that delivers static files like a "normal" web server).
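The mapping from base URLs to contexts can be sketched roughly as follows. This is only an illustration of the idea, not the actual cwapd code: `demo_context`, `url_mapper`, `map_base_url` and `dispatch` are made-up names, and the rule that the longest matching base URL wins is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a context: it only records the action it was handed.
struct demo_context {
    std::string last_action;
    void handle(const std::string& action) { last_action = action; }
};

// Maps base URLs to contexts and hands the rest of the URL (the "action")
// to the context whose base URL matches.
class url_mapper {
public:
    void map_base_url(const std::string& base, demo_context* ctx) {
        assert(ctx != 0 && !base.empty());  // precondition: a real context and a real prefix
        contexts_[base] = ctx;
    }

    // Returns true if some registered base URL is a prefix of url.
    // Assumption of this sketch: the longest matching base URL wins.
    bool dispatch(const std::string& url) const {
        const std::string* best = 0;
        demo_context* target = 0;
        for (std::map<std::string, demo_context*>::const_iterator it = contexts_.begin();
             it != contexts_.end(); ++it) {
            const std::string& base = it->first;
            if (url.compare(0, base.size(), base) == 0 &&
                (best == 0 || base.size() > best->size())) {
                best = &base;
                target = it->second;
            }
        }
        if (target == 0) return false;
        target->handle(url.substr(best->size()));  // the action is everything after the base URL
        return true;
    }

private:
    std::map<std::string, demo_context*> contexts_;
};
```

With "/static/" mapped to one context, a request for "/static/index.html" would reach that context with the action "index.html".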
As a simple example, I also created a "wallpaper_context" class that automatically scales images to the optimal wallpaper size for the respective client, employing Magick++ and WURFL. This could be useful, e.g., for people who want to deliver mobile phone wallpapers via WAP. The code itself is really simple: the wallpaper_context class contains no more than 77 SLOC, which shows that writing custom contexts is easy.
The only "big" thing that is still missing from it is proper handling of configuration. Currently, many things that are actually supposed to be stored in configuration files are hard-coded in the source code. So, if anybody could recommend a flexible configuration engine with a C++ interface, I'd be happy about it.
Anyway, if you're interested in the code, simply have a look at the SVN repository.
Wednesday, June 7. 2006
I just created a pretty neat callback mechanism for C++:

template<typename T>
class callback {
public:
    callback(T* obj, unsigned int (T::*func)(string, ostream&, header_container&, cgiparam_container&))
        : callee(obj), cb(func) { }
    ~callback();
    unsigned int run(string actionstr, ostream& os, header_container& hdrs, cgiparam_container& params) {
        return (callee->*cb)(actionstr, os, hdrs, params);
    }
private:
    T* callee;
    unsigned int (T::*cb)(string, ostream&, header_container&, cgiparam_container&);
};

class context {
public:
    context();
    ~context();
    unsigned int dispatch_action(string actionstr, ostream& os, header_container& hdrs, cgiparam_container& cgiparams);
protected:
    void register_action(string, callback<context>*);
private:
    map<string, callback<context>*> actions;
};

The implementation of register_action() looks like this:

void context::register_action(string action, callback<context>* cb) {
    actions[action] = cb;
}

And this is used in the following way (called from within the context class):

register_action("foobar", new callback<context>(this, &context::some_method));
Now, the problem that I have is that adding callbacks that are not actually within the context class but within a derived class is only possible with lots of ugly casting (but it seems to work afterwards):

#define REGISTER_ACTION(name,action) do { \
    register_action((name), new callback<context>((context*)this, \
        (unsigned int (context::*)(std::string, std::ostream&, net::header_container&, net::cgiparam_container&))&action)); \
} while(0)
Do I see correctly that C++ seems to have certain problems with derived classes and templates, especially when it comes to using pointers to methods? Any input is welcome on how to remove this ugly casting while keeping the same functionality.
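One possible way to get rid of the casts is to hide the template behind a non-template base class, so that the map stores base pointers and a templated register_action() deduces the derived class. The following is a sketch of that idea, not the cwapd code: `callback_base` and `hello_context` are invented names, and the handler signature is shortened to `(const std::string&, std::ostream&)` because header_container and cgiparam_container are not shown here.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Type-independent interface: the context only ever sees this base class.
class callback_base {
public:
    virtual ~callback_base() { }
    virtual unsigned int run(const std::string& action, std::ostream& os) = 0;
};

// One instantiation per concrete class; T may be any class derived from context.
template<typename T>
class callback : public callback_base {
public:
    typedef unsigned int (T::*method)(const std::string&, std::ostream&);
    callback(T* obj, method func) : callee(obj), cb(func) {
        assert(obj != 0 && func != 0);  // precondition: both must be valid
    }
    virtual unsigned int run(const std::string& action, std::ostream& os) {
        return (callee->*cb)(action, os);
    }
private:
    T* callee;
    method cb;
};

// Note: copying a context is not handled in this sketch.
class context {
public:
    virtual ~context() {
        for (iter it = actions.begin(); it != actions.end(); ++it) delete it->second;
    }
    unsigned int dispatch_action(const std::string& action, std::ostream& os) {
        iter it = actions.find(action);
        if (it == actions.end()) return 404;  // assumed "not found" result code
        return it->second->run(action, os);
    }
protected:
    // T is deduced from the arguments, so derived classes need no casts.
    template<typename T>
    void register_action(const std::string& name, T* obj,
                         unsigned int (T::*func)(const std::string&, std::ostream&)) {
        callback_base*& slot = actions[name];
        delete slot;  // replace a previously registered callback, if any
        slot = new callback<T>(obj, func);
    }
private:
    typedef std::map<std::string, callback_base*>::iterator iter;
    std::map<std::string, callback_base*> actions;
};

// A derived context registering one of its own methods without any cast.
class hello_context : public context {
public:
    hello_context() { register_action("hello", this, &hello_context::say_hello); }
private:
    unsigned int say_hello(const std::string&, std::ostream& os) {
        os << "hello";
        return 200;
    }
};
```

The price is one virtual call per dispatch; in exchange, the pointer to member keeps its real type and the compiler checks the signature.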
Tuesday, May 9. 2006
Although it is probably not quite finished yet, I decided to make my HTTP server stack for C++ publicly available (use Subversion to check out). Running "make" first compiles the stack itself and then simplehttpd, a simple example HTTP server that can only serve static files. Implementing simplehttpd was pretty easy.
And that leads me to a topic that I have discussed here before. While implementing simplehttpd, I thought about embedding some interpreter in order to make dynamic content possible in some way. While researching this topic, I came across NJS, an embeddable, LGPL-licensed JavaScript interpreter. According to the examples delivered with it, it seems to be pretty easy to use, and it even features cool stuff like compilation to byte code before execution. When I saw this, it immediately triggered my wildest fantasies of bytecode caching mechanisms.
Now I have two options: I could simply embed NJS into the existing simplehttpd (whenever such a server-side JS file is requested, it is executed by the interpreter and its output is sent back to the client), or I could build a servlet container around NJS. What do you think?
Wednesday, May 3. 2006
What a productive night... after several months, I finally wrote some lines of code again. What did I do? I took the HTTP server stack that I built a few months ago in my effort to create a fast Ruby servlet container (which failed due to technical limitations in the Ruby interpreter), streamlined it in a few places, and used it to build a simple HTTP server that delivers static files only. So far, the HTTP server is really very simple, but the sizes of both the stack and the server are quite nice: the stack consists of about 1000 lines of C++ (employing the C++ standard library, btw), while the server consists of about 200 lines. The latter is still quite buggy in a few places (the prevention of /../../ URLs doesn't work quite correctly yet, nor does the directory index listing), but in its basic functionality, it works pretty well. Expect a release of both in a few days, when everything has been freed of the obvious bugs and cleaned up.
Monday, April 24. 2006
Once in a while, I google for feedback on various projects that I implemented in the past. One of these projects is ContraPolice, which resulted in a brief paper and a prototype implemented as a patch for dietlibc. In fact, I'm really proud of what I produced here, because it is a straightforward and simple solution to a big problem in IT security. While I don't claim that it's perfect (no solution is), it's supposed to be effective in (wild guess) 90 % of all incidents it tries to prevent.
Also, other security researchers came upon ContraPolice, most notably Yves Younan, who mentioned it in a number of papers, including a lecture at 22C3 on memory allocator security.
What I regret most about it is that I didn't put any further work into it. The concept of ContraPolice itself could definitely be improved in some respects (Yves points out some weaknesses in his papers), and the prototype could be made more complete. I could also have tried to present it at some conference myself, as it is definitely something that makes sense and is interesting to a number of people, if it only got out of the prototype stage. Oh, actually it already has: e.g. the Annvix project mentions ContraPolice as one of the hardening technologies they employ.
And on a funny sidenote, I also found a call for help in a Linux forum where ContraPolice seems to have brought a bug to light, with the consequence that the user is unable to install Mandriva 2006 onto his RAID. And according to the changelog found here, this issue only seems to come up on x86_64...
Tuesday, March 7. 2006
So, since 18:00 I've been on my first ChvT duty (Charge vom Tag, a kind of military doorman) at the SteKo Linz. The difference from the one ChvT shift I previously had to pull in the SanAusbZg is that I have Internet and TV available, and (obviously) I'm allowed to use my notebook. And that's how things come about like a program that calculates my current "Lage" (for all outsiders: the number of days until discharge).
By the way, I've wanted for ages to publish "Adjustierungs-pr0n" photos of myself with various pieces of equipment and uniform parts, but so far I've always forgotten my digicam. Well, in the next few days. Promise.
Friday, December 23. 2005
This is probably already well-known to some of you, but nevertheless, I played a little bit with C++ templates, and wrote a piece of C++ template code that lets the compiler compute a number's factorial for you:

template <int N>
struct fact {
    enum { value = N * fact<N-1>::value };
};

template <>
struct fact<1> {
    enum { value = 1 };
};

Not very useful, but nevertheless fun. You use it the following way:

std::cout << "5! = " << fact<5>::value << std::endl;
Wednesday, December 14. 2005
import re

# replace *text* by '''text'''.
p = re.compile(r'\*([^*]+)\*', re.MULTILINE)
text = p.sub(r"""'''\1'''""", text)
So many quotes, just awful, and so absolutely un-intuitive.
Saturday, December 3. 2005
Time for some BSD bashing.
A few days ago, I got a bug report for akpop3d, the POP3 server that I wrote a few years ago. The author of that mail told me that akpop3d on FreeBSD only binds to a tcp6 socket, and thus is not usable from IPv4 networks. Well, that sounded very strange to me, but I did some research on that topic, and that's the reason for this strange behaviour:
In akpop3d, I implemented a mechanism for getting a server socket that tries out all available socket types, and uses the first one that binds successfully. Why? First of all, because Unix Network Programming, IMHO the reference on network-related programming on Unix-like operating systems, says so. The book's rationale is that this keeps the code as independent as possible of the available socket types. In short: with that code, the program works with both IPv6 (if available) and IPv4.
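The "try every socket type and take the first one that binds" approach can be sketched like this. It is not the akpop3d source, just a minimal illustration using getaddrinfo(); the function name `listen_on_first_available` and the backlog of 16 are assumptions.

```cpp
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Walks the address list returned by getaddrinfo() and returns a listening
// socket for the first entry that can be created and bound, or -1 on failure.
int listen_on_first_available(const char* service) {
    assert(service != 0);  // precondition: a port number or service name

    struct addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      // IPv6 or IPv4, whatever the system offers
    hints.ai_socktype = SOCK_STREAM;  // TCP
    hints.ai_flags = AI_PASSIVE;      // wildcard address, suitable for bind()

    struct addrinfo* res = 0;
    if (getaddrinfo(0, service, &hints, &res) != 0) return -1;

    int fd = -1;
    for (struct addrinfo* ai = res; ai != 0; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;  // this socket type is not available, try the next one
        int on = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0 && listen(fd, 16) == 0) break;
        close(fd);  // bind or listen failed, clean up and keep looking
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Which address family comes first in the list is up to the system's resolver, so the same binary may end up on an IPv6 socket on one host and an IPv4 socket on another.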
So, as a consequence, when IPv6 is available as a socket type, akpop3d tries to bind to it. Now, one cool IPv6 feature comes into play, and that is "IPv4-mapped IPv6 addresses", which, according to RFC 2553, are there to provide interoperability between IPv4 and IPv6. Yes, that's right, interoperability. This means that when you bind to an IPv6 socket, programs and other hosts that don't speak IPv6 yet are able to connect to that IPv6 service, with the operating system working as a mediator. For the server it's always IPv6, for the client IPv4, and both sides are happy.
Now, this all sounds pretty good, so what's the reason behind the bug report that I mentioned before? Well, a few years ago, itojun, OpenBSD's KAME hacker, wrote a paper with the title "IPv4-mapped address considered harmful", where he claimed that IPv4-mapped addresses pose a security risk. As a result, OpenBSD decided to disable IPv4-mapped addresses by default, so that IPv6 sockets behave as if the IPV6_V6ONLY socket option were switched on (normally, it's off, which is what enables that interoperability thing). What I found especially interesting about those security risk claims was that nobody really challenged them, except for Felix von Leitner, who writes in his remarks on some BSD scalability tests:

"That's what itojun has said for ages. When I challenged him to point to even one case that demonstrated anyone was ever negatively impacted by the normal behaviour, he posted a message to bugtraq asking for people to step forward. Nobody did."

The executive summary of this whole "IPv4-mapped addresses are insecure" hype is that somebody could send you an IPv6 packet with an IPv4-mapped source address, creating an ambiguity (::ffff:127.0.0.1 could be interpreted as coming from localhost) and thus a security hole, and so the way OpenBSD chose to deal with this problem was to disable IPv4-mapped addresses altogether. Hello?! That mechanism is useful even if you don't use IPv6 networking, simply because of interoperability.
So I did some further research, and found out that not only OpenBSD, but also FreeBSD and NetBSD had switched their behaviour, although it seems that on FreeBSD and NetBSD, the default behaviour is still configurable.
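For reference, the per-socket switch in question is the IPV6_V6ONLY option. The sketch below shows how a program could ask for IPv4-mapped addresses on a single AF_INET6 socket before binding it; `allow_ipv4_mapped` is a made-up helper name, and whether the request is honoured depends on the operating system (a system that has removed the feature entirely will refuse).

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Tries to switch an AF_INET6 socket to dual-stack mode by clearing IPV6_V6ONLY.
// Must be called before bind(). Returns true if the option reads back as off,
// false if the system refuses or the option cannot be queried.
bool allow_ipv4_mapped(int fd) {
    assert(fd >= 0);  // precondition: an open socket descriptor
    int off = 0;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) != 0) return false;
    int value = 1;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &value, &len) != 0) return false;
    return value == 0;
}
```

A server that gets false back would have to open a second, separate AF_INET socket to stay reachable over IPv4.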
To make my point: this just sucks. I'm not willing to write workarounds for operating systems that deliberately chose to be broken, as this only creates additional work with no real benefit.
Monday, November 28. 2005
I recently heard from nion and others that "assert() is crap". I would like to object to this, and back that assertion (harhar!) with a few facts.
So, why is assert() not crap? Well, assert() is an early predecessor of a concept that is now known as Design by contract (DBC). The idea of DBC is that the caller of a function and the callee (i.e. the function) agree on a contract about which function arguments are allowed (the so-called "preconditions") and which return values are allowed (the so-called "postconditions"). In order to check whether both parties meet the contract, these preconditions and postconditions are formulated by the programmer of the callee, and either the compiler (if supported by the language) adds code to check these conditions, or some support library checks them. If one of the conditions fails, the program fails hard, usually by aborting, as one of the parties has violated the contract. As DBC has its roots in the OO area, a third condition has been added, the so-called "invariant", which checks the internal consistency of the object before leaving the function.
So why use DBC? Because, if implemented properly for all involved functions, it provides a specification of how the function may be called and what return values can be expected. During development, more attention is paid to the code and specification, and during testing, a lot of bugs can be detected a lot more easily than without DBC (though not all of them: incorrect specifications and logical errors, for example, remain). Experience with languages like Eiffel has shown that this way of implementing and debugging code is very efficient, helping the programmer to get a program practically bug-free a lot faster. Of course, as with other techniques, DBC doesn't solve all problems, and one must not rely solely on DBC to find and fix bugs.
Now back to the actual topic, assert(). assert() is a simple mechanism that can help C and C++ programmers to implement simplified variants of DBC. Pre- and postconditions in particular are easy to implement with it. So that's why assert() is not "crap", contrary to what nion stated. It is a useful tool that can help with debugging and testing, and it can even be disabled when it is not required anymore.
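As a small illustration of pre- and postconditions written with assert(), consider the following sketch. The function `index_of_max` is an invented example and not taken from any of the projects mentioned here.

```cpp
#include <cassert>
#include <cstddef>

// Returns the index of the largest element in values[0..count).
std::size_t index_of_max(const int* values, std::size_t count) {
    // Preconditions: the caller must pass a non-empty array.
    assert(values != 0);
    assert(count > 0);

    std::size_t best = 0;
    for (std::size_t i = 1; i < count; ++i) {
        if (values[i] > values[best]) best = i;
    }

    // Postconditions: the result is a valid index and no element is larger.
    assert(best < count);
    for (std::size_t i = 0; i < count; ++i) assert(values[i] <= values[best]);
    return best;
}
```

Compiling with NDEBUG defined (e.g. -DNDEBUG) turns every assert() into a no-op, which is the "can be disabled" part: the checks cost nothing in a release build.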
Last, but not least, the humorous side of engaging this argument:
assert(strcmp("assert()","crap")==0); // leads to "assertion failed"
Saturday, October 8. 2005
Currently, I'm searching for the right scripting language for which I would like to write my servlet container. The original idea was to use my beloved Ruby scripting language, as it contains all the features I need and like. For using it in my servlet container, I also need the possibility to embed more than just one VM instance, to keep the applications strictly separated. And as far as I was able to research, this isn't possible with Ruby.
And now I'm searching for some other object-oriented, embeddable scripting language that bears no arbitrary limit on the number of VM instances within a single process. AFAICS, Python isn't an option, either (nor are all the other hardly known scripting languages that I found), so I'm open for hints or recommendations. I'm willing to use any language that bears the above-mentioned attributes (as I was unable to find a language that matches all requirements).
Saturday, September 24. 2005
As the observant readers of my weblog already know, I'm working on an HTTP stack as part of a high-performance Ruby servlet container that I want to implement. This weekend, I finished the work on the HTTP stack, and the tests so far look good, so everything seems quite stable. That means: refactoring. Currently, there's only one piece of code that doesn't look too good in my eyes, as it is too "un-C++-ish", but as I'm not a regular user of the C++ language and especially its standard library, I have no good ideas how to improve it.
That piece of code is a URL decoder, and looks like the following:
static inline int h2c(char hi, char lo) {
    std::string s;
    int value = util::PARSE_FAILED;
    s += hi;
    s += lo;
    std::istringstream is(s);
    is >> std::hex >> value;
    return value;
}

std::string util::url_decode(const std::string& s) {
    std::string result;
    std::istringstream iss(s);
    char c;
    for (iss >> c; !iss.eof(); iss >> c) {
        if ('+' == c) {
            result += ' ';
        } else if ('%' == c) {
            char c1 = 0, c2 = 0;
            iss >> c1;
            iss >> c2;
            if (!iss.eof()) {
                int r = h2c(c1, c2);
                if (r != PARSE_FAILED) {
                    result += static_cast<char>(r);
                }
            }
        } else {
            result += c;
        }
    }
    return result;
}
Any recommendations?
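One possible answer is to drop the string streams and walk the string by index. The sketch below is a suggestion, not the stack's actual code. `hex_value` is an invented helper, and one behaviour deliberately differs from the version above: malformed or truncated escapes such as "%zz" or a trailing "%" are copied through unchanged instead of being dropped, which is an assumption about what is wanted. It also sidesteps the fact that operator>> on a char skips whitespace by default.

```cpp
#include <cassert>
#include <string>

// Returns 0..15 for a hexadecimal digit, -1 for anything else.
static int hex_value(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Decodes '+' to a space and "%XX" to the byte with hex value XX.
std::string url_decode(const std::string& s) {
    std::string result;
    result.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '+') {
            result += ' ';
        } else if (c == '%' && i + 2 < s.size() &&
                   hex_value(s[i + 1]) >= 0 && hex_value(s[i + 2]) >= 0) {
            result += static_cast<char>(hex_value(s[i + 1]) * 16 + hex_value(s[i + 2]));
            i += 2;  // skip the two hex digits that were just consumed
        } else {
            result += c;  // ordinary character, or a malformed escape kept as-is
        }
    }
    assert(result.size() <= s.size());  // postcondition: decoding never lengthens the input
    return result;
}
```

For example, url_decode("a+b%20c") yields "a b c".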
Sunday, September 18. 2005
As I mentioned yesterday, I'm working on a wrapper around Unix sockets. Well, just in case somebody else also wants to use the __gnu_cxx::stdio_filebuf GNU extension for marrying file descriptors or C FILE*s with C++ iostreams, one word of warning: don't try to create one iostream for both directions. Instead, create everything separately, from the FILE* to the stdio_filebuf to the istream and ostream; otherwise one of the two channels (in my case the out channel) gets messed up. Here's a little code snippet to demonstrate that:
FILE * ifh = fdopen(fd,"r");
FILE * ofh = fdopen(fd,"w");
__gnu_cxx::stdio_filebuf<char> * ibuf = new __gnu_cxx::stdio_filebuf<char>(ifh,std::ios_base::in,1);
__gnu_cxx::stdio_filebuf<char> * obuf = new __gnu_cxx::stdio_filebuf<char>(ofh,std::ios_base::out,1);
std::istream * conn_istream = new std::istream(ibuf);
std::ostream * conn_ostream = new std::ostream(obuf);
Why the extra wrapping with FILE *, even though stdio_filebuf provides a constructor that takes a file descriptor directly? Because, in my experience, buffering doesn't quite work as you would expect with that constructor. You need to provide a buffer size, and it only seems to return data to e.g. getline() when the buffer is completely filled. This leads to the weird behaviour that getline(istream,strobj) yields strobj == "ASDF" when the buffer size was 4 and the input was actually "ASDFQWERT\n". So, that's why I'm putting FILE * in between, as it gives me the behaviour that I expect (see principle of least surprise).
Saturday, September 17. 2005
In this blog, I already wrote about developing a new content management system. Well, after doing some research, I concluded that, in order to get reasonable performance, I need to develop a high-performance servlet container for Ruby (think of Tomcat for Ruby).
I evaluated a few available ready-to-use HTTP server stacks, but I liked none of them (I was always forced to use threads), so I decided to develop my own, in C++. Currently, I'm at an early stage: all I've got is a nice and easy-to-use wrapper around Unix sockets, providing only the socket functionality that I need. But a nice feature is that it enables me to do the network communication via C++ iostreams. Yes, that's right, no more low-level read()/write()/send()/recv()/whatever.