Thursday, August 28, 2008

SVG in KDE

"Commitment" is one of the words that have never been used in this blog. Which is pretty impressive given that I've managed to use such words as sheep, llamas, raspberries, ninjas, donkeys, crack or woodchuck quite extensively (especially impressive in a technology centric blog).

That's because commitment implies that whatever it is one is committed to plays an important role in their life. It's a word that goes beyond the paper or the medium on which it was written. It enters the cold reality that surrounds us.

But today is all about commitment. It's about the commitment that KDE made to a technology broadly referred to as Scalable Vector Graphics. I took some time off this week and came to Germany, where I talked about the usage of SVG in KDE.

The paper about what I like to call the Freedom of Beauty is available here:

https://www.svgopen.org/2008/papers/104-SVG_in_KDE/

It talks about the history of SVG in KDE and the rendering model KDE uses, lists the ways in which we use SVG, and finally shows some problems that have been exposed by such diverse usage of SVG in a desktop environment. Please read it if you're interested in KDE or SVG.

Hopefully this paper marks the start of a more proactive role for KDE in shaping the SVG standard.

Tuesday, August 26, 2008

Fixes in Sonnet

As we all know inner beauty is the most important kind of beauty. Especially if you're ugly. Not ugly, don't sue me, I meant to say "easy on the eyes challenged". That's one of the reasons I like working on frameworks and libraries. It's the appeal of improving the inner beauty of certain things. I gave up on trying to improve the inner beauty of myself (when I was about 1) so this is the most I can do.

You can do it too. It's really easy. I took this week off because I'm going to Germany for SVG Open, where I'll talk about SVG in KDE, and today I fixed a few irritating bugs in Sonnet.

One of the things that bugged me for a while was the fact that we kept marking misspelled text as red instead of using the God-given red squiggly underline. Well, I say no more!
Our spelling dialog lists available dictionaries now, and one can change them on the fly. That's good. Raspberries good. And raspberries are pretty darn good. Even sheep like raspberries. Or so I think; the only sheep I've ever seen was from the window of a car, and it looked like an animal that enjoys raspberries. Who doesn't? The only problem was that it liked listing things like "en_GB-ise" or "en_GB-ize-w_accents" as language names, which is really like a nasty bug in the raspberry. And what do you do with bugs? I'm not quite certain myself, but given the way this blog is heading it's surely something disturbing... Anywho, that's also fixed. Now we list proper and readable names. As in:

Working on Sonnet is a lot of fun. A small change in a pretty small library affects all of KDE, which is rather rewarding. So if you want to get into KDE development in an easy and fun way, go to https://bugs.kde.org, search for "kspell" or "sonnet", pick an entry, and simply fix it!
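If you'd like a feel for the library before diving into the bug list, here's a minimal sketch of mine (not from the post or the kdelibs docs) of poking at Sonnet from C++; it assumes kdelibs 4's Sonnet::Speller API with availableDictionaries(), isCorrect(), and suggest():

#include <sonnet/speller.h>
#include <QtCore/QDebug>
#include <QtCore/QMap>
#include <QtCore/QString>

int main()
{
    Sonnet::Speller speller(QLatin1String("en_US"));

    // The human-readable dictionary names mentioned above, mapped to
    // codes such as "en_GB-ise".
    const QMap<QString, QString> dictionaries = speller.availableDictionaries();
    foreach (const QString &name, dictionaries.keys())
        qDebug() << name << "->" << dictionaries.value(name);

    // Basic spell checking with suggestions.
    const QString word = QLatin1String("raspberies");
    if (!speller.isCorrect(word))
        qDebug() << "Suggestions for" << word << ":" << speller.suggest(word);

    return 0;
}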

Wednesday, August 20, 2008

Fast graphics

Instead of the highly popular pictures of llamas, today I'll post a few numbers. Not related to llamas at all. Zero llamas. These will be Qt/KDE related numbers. And there are no llamas in KDE. There's a dragon, but he doesn't hang around with llamas at all. I know what you're thinking: KDE is a multicultural project, surely someone must be chilling with llamas. I said it before and I'll say it again: what an average KDE developer, two llamas, one hamster, and five chickens do in the privacy of their own home is none of your business.

Let's take a simple application called qgears2, based on David Reveman's cairogears, and see how it performs with different rendering backends. Pay attention to the zero relation to llamas or any other animals. The application takes a few options: -image to render using the CPU-based raster engine, -render to render using X11's Xrender, and -gl to render using OpenGL (the -llama option is not accepted). It has three basic tests: GEARSFANCY, which renders a few basic paths with a linear gradient alpha-blended on top; TEXT, which tests some very simple text rendering; and COMPO, which is just composition and scaling of images.



The numbers come from two different machines. One is my laptop, running Xorg server version 1.4.2, EXA 2.2.0, and Intel driver 2.3.2; the GPU is a 965GM and the CPU is a T8300 at 2.4GHz, running Debian unstable's kernel 2.6.26-1.
The second machine is running a GeForce 6600 (NV43 rev a2), NVIDIA proprietary driver version G01-173.14.09, Xorg version 7.3, and kernel 2.6.25.11; the CPU is a Q6600 @ 2.40GHz (thanks to Kevin Ottens for those numbers, as I don't have an NVIDIA machine at the moment).

The results for each test are as follows (higher is better):

GEARSFANCY
              i965        NVIDIA
Xrender       35.37       44.743
Raster        63.41       41.999
OpenGL       131.41      156.250

TEXT
              i965                  NVIDIA
Xrender       13.389                40.683
Raster        (incorrect results)   (incorrect results)
OpenGL        36.496               202.840

COMPO
              i965        NVIDIA
Xrender       67.751      66.313
Raster        81.833      70.472
OpenGL       411.523     436.681


The COMPO test isn't really fair because, as I mentioned, Qt doesn't use server-side picture transformations with Xrender, but it shows that OpenGL is certainly not slow at it.

So what these results show is that the GL backend, which hasn't been optimized at all, is between two and six times faster than anything else out there, and that the pure CPU-based raster engine is faster than the Xrender engine.

So if you're on an Intel or an NVIDIA GPU, rendering using GL will immediately make your application several times faster. If you're running on a system with no capable GPU, then using the raster engine will make your application faster as well.
Switching Qt to use the GL backend by default would result in all applications running several times faster. The quality would suffer, though (unless Qt's HighQualityAntialiasing mode were used, in which case it would be the same). This would certainly fix our graphics performance woes and, as a side effect, allow using GL shaders right on the widgets for some nifty effects.
On systems with no GPU the raster engine is a great choice; on everything else GL is clearly the best option.
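For the curious, here's a minimal sketch of mine (not from the benchmark) of how a Qt 4 application can opt into the GL paint engine today: hand a QGLWidget to a QGraphicsView as its viewport, and QPainter calls for that view get routed through OpenGL instead of Xrender or raster:

#include <QtGui/QApplication>
#include <QtGui/QGraphicsScene>
#include <QtGui/QGraphicsView>
#include <QtGui/QPainter>
#include <QtOpenGL/QGLWidget>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);

    QGraphicsScene scene(0, 0, 400, 300);
    scene.addText("Hello, GL-accelerated painting");

    QGraphicsView view(&scene);
    // Swap the default viewport for a GL one; from now on the view is
    // painted by the OpenGL engine rather than the X11/raster engines.
    view.setViewport(new QGLWidget(QGLFormat(QGL::SampleBuffers)));
    // The quality caveat from above: this hint trades some speed for the
    // same antialiasing quality as the other engines.
    view.setRenderHint(QPainter::HighQualityAntialiasing);
    view.show();

    return app.exec();
}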

Friday, June 27, 2008

Accelerating desktops

In general I'm extremely good at ignoring emails and blog posts. Next to head-butting it is one of the primary skills I've developed while working on Free Software. Today I will respond to a few recent posts (all at once, I'm a mass-market responder) about accelerating graphics.

Some kernel developers released a statement saying that binary blobs are simply not a good idea. I don't think anyone can argue that. But this statement prompted a discussion about graphics acceleration, or more specifically a certain vendor who is, allegedly, doing a terrible job at it.

First of all, the whole discussion is based on a fallacy, rendering even the most elaborate conclusions void. It's assumed that in our graphics stack there's a straightforward path between accelerating an API and fast graphics. That's simply not the case.

I don't think it's a secret that I'm not a fan of XRender. Actually, "not a fan" is an understatement: I flat out don't like it. You'd think that the fact that 8 years after its introduction we still don't have any driver that is actually really good at accelerating that "simple API" would be a sign of something... anything. When we were making Qt use more of the XRender API, the only way we could do that was by having Lars and me go and rewrite the parts of XRender that we were using. So what happened was that instead of depending on XRender being reasonably fast, we rewrote the parts that we really needed (which is realistically just the SourceOver blending) and did everything else client-side (meaning not using XRender).
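To make that concrete, here's a minimal sketch (my illustration, not Qt's actual code) of the one operation in question: a server-side SourceOver blend of one XRender Picture onto another via XRenderComposite():

#include <X11/Xlib.h>
#include <X11/extensions/Xrender.h>

// Blend src over dst at (x, y); PictOpOver is the Porter-Duff
// "source over destination" operator, and no mask Picture is used.
void blendSourceOver(Display *dpy, Picture src, Picture dst,
                     int x, int y, unsigned int width, unsigned int height)
{
    XRenderComposite(dpy, PictOpOver, src, None, dst,
                     0, 0,        /* source origin */
                     0, 0,        /* mask origin (unused) */
                     x, y,        /* destination origin */
                     width, height);
}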

Now, going back to benchmarking XRender. Some people pointed out an application I wrote a while back to benchmark XRender: please do not use it to test the performance of anything. It does not correspond to any real workloads. (Also, if you're taking something I wrote to prove some arbitrary point, it'd likely be a good idea to ping me and ask about it. You know, on account of my having written it, I just might have some insight into it.) The thing about XRender is that there's a large number of permutations for every operation. Each graphics framework which uses XRender uses specific, defined paths. For example, Qt doesn't use server-side transformations (they were just pathetically slow, and we didn't feel it would be in the best interest of our users to make Qt a lot slower); Cairo does. Accelerating server-side transformations would make Cairo a lot faster and would have absolutely no effect on Qt. So whether those tests pass in 20ms or 20 hours has zero effect on Qt performance.

What I wanted to do with the XRender performance benchmarking application was basically to have a list of operations that need to be implemented in a driver to make Qt, Cairo, or anything else using XRender fast. A "to make KDE fast, look at the following results" type of thing. So the bottom line is that if one driver has, for example, a result of 20ms for Source and SourceOver and 26 hours for everything else, and there's a second driver that has 100ms for all operations, it doesn't mean that on average driver two is a lot better for running KDE; in fact, it likely means that running KDE will be five times faster on driver one.

Closed-source drivers are a terrible thing, and there are a lot of reasons why vendors would profit immensely from having open drivers (which is possibly a topic for another post). Unfortunately, I don't think that blaming driver writers for not accelerating a graphics stack which we went out of our way to make as difficult to accelerate as possible is a good way of bringing that point forward.

Monday, June 02, 2008

Animated interfaces

Lately I've been writing a lot about frameworks, today I want to take a step back and talk about a "technique". "Drunken master"/"Praying Mantis" kind of foo pertaining to animations.

Over the years of writing animated user interfaces I've developed a set of rules that I follow when writing animations. It's been a checklist that I've been following almost religiously. Much like my morning list of "1) Open eyes, 2) Check for dead bodies in the bed, 3) Around the bed, 4) if 2 and 3 are negative brush teeth and take a shower, otherwise prepare for a very bad day", which is the main reason why I never had a bad day in my life. Which is another good lesson to learn - very low expectations make for a very fulfilling life.

I've realized that those rules might be useful to others, so I'll write a bit about them today. I guarantee you that if you follow them, the animations you add to any user interface will not make your users want to stab you, which again, following the low-expectations lesson from above, is the making of a great day. In fact, following these rules will make your UI rock, which, even if you have high expectations, is a desirable quality.

So without further ado, here are my rules:

  1. Anger rule:
    Creating animations is a lot of fun. Which in turn makes the act of adding animations to a user interface a happy activity. When we're happy, we're willing to endure a lot more abuse - in particular, to ignore or not even notice something that is very irritating. Unfortunately, computer UIs are usually used by people who are not happy at all (e.g. they're at work), and their perception of what seemed like a neat animation to you when you were in a great mood will be vastly different. So always, always make sure you've experienced all of your animations while angry. If they haven't irritated the hell out of you, congratulations, you are on to something.


  2. Blind interpolator rule
    Find someone who has never seen the animation you're designing, tell them to close their eyes as soon as the animation starts. Ask them how they think it ended. If their brain isn't able to fill in the blanks and figure out how the animation ends then the animation does something unexpected that will force your users to learn it. For a user interface to be intuitive you have to avoid forcing users to learn its behavior. It has to come naturally.


  3. The timing rule
    This one is tricky. Timing your animation correctly is one of the hardest things to do. I use a two step approach to figure this one out:

    • follow physics - so follow timings from the real world, e.g. if something is falling, let it last as long as it would if you had dropped something in the real world,

    • make it fast - if an animation lasts too long, people try to stop it by hitting any random key on the keyboard. From a user-interface perspective, what you definitely want to avoid is having your users hit random keys while the application is running.
    Animations in user interfaces need to be very, very short. The point of them is to give subtle hints as to where things are coming from. They're an aid in understanding computers, not a graphical effect that is meant to impress. I tend to violate this rule because if I spend two hours writing a really neat animation, I'll be damned if everyone isn't forced to look at it. But that is wrong! Very wrong. Subtlety is the key here. If the animation is running on a desktop, a good rule of thumb is the "escape key rule" - if your animation is longer than the time required to move a hand from the mouse and hit the escape key, the animation is too long. (There's a small code sketch of this after the list.)

  4. No sci-fi rule.
    Also known as the 'avoid goofy and crazy things' rule. Effects grounded in reality will make sure your interface is easier to learn and more intuitive. For user interfaces, wacky and cool don't imply "good"; in fact, it's usually just the opposite. These are not games, where "wacky" and "cool" are desirable qualities.

  5. The refresh rule

    Make your animation run at the number of frames per second equal to the refresh rate of the output device and synchronize the updates with vertical retrace.

    Lately I've become obsessed with trying to automatically figure out the optimal number of frames per second for animations in user interfaces. I can obsess with the craziest of them, so last week I added this rule.

    What do you think - how many frames per second should an animation be running at? 15? 24? 30? 40? 60? Coincidentally, those are also this week's winning lottery numbers. The answer is "it depends". It is highly dependent on the refresh rate of the output device. The idea that you need 24fps (or 30fps, or even 60fps) to achieve smoothness is a myth. No one knows how many frames per second humans can actually perceive, but pilots have been able to identify kinds of planes shown for 1/220th of a second. So it seems that we can actually recognize objects at 220fps. How many we would require to not notice any frames at all is a question without an answer right now, but it's likely that you'd need more than 400fps to do it. None of the commercially available display devices can refresh at that speed. So ideally, what you want to do is synchronize the number of frames per second to the refresh rate of your output device. Here's an example you can play with: http://byte.kde.org/~zrusin/animsync.tar.bz2. You'll see a square moving back and forth in a window like this:

    You can specify the number of frames per second on the command line, and passing the "-s" option will sync the animation to the vertical retrace of your output device (assuming your GL driver supports it, which, unless you're running DRM head or the closed NVIDIA driver, is unlikely). Experiment with it a bit.
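For the Qt-inclined, here's a minimal sketch of mine (an example, not part of the rules themselves) of rules 2 and 3 in practice: a short, predictable slide driven by QTimeLine and QGraphicsItemAnimation:

#include <QtGui/QApplication>
#include <QtGui/QGraphicsScene>
#include <QtGui/QGraphicsView>
#include <QtGui/QGraphicsItemAnimation>
#include <QtCore/QTimeLine>
#include <QtCore/QPointF>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);

    QGraphicsScene scene(0, 0, 400, 200);
    QGraphicsItem *item = scene.addText("subtle, short, done");

    // Timing rule: keep it comfortably under the "reach for Escape" window.
    QTimeLine timeLine(150);              // milliseconds
    timeLine.setCurveShape(QTimeLine::EaseOutCurve);

    // Blind-interpolator rule: a straight slide from off-view to its resting
    // spot, so anyone can guess the ending after seeing the first frame.
    QGraphicsItemAnimation animation;
    animation.setItem(item);
    animation.setTimeLine(&timeLine);
    animation.setPosAt(0.0, QPointF(-150, 80));
    animation.setPosAt(1.0, QPointF(150, 80));

    QGraphicsView view(&scene);
    view.show();
    timeLine.start();

    return app.exec();
}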

So, these are my rules. They're not laws, so in certain situations you might need to break one of them, but if you do, you'd better have a very good explanation, one you can back up with some facts, for why you're doing so.

Monday, March 03, 2008

No black here

Sup, y'all. I realized that Free Software is a lot like the wild west used to be. So, partner, I'll be spreading some "west" and a lot of "wild" over this post.
"What?", you say (oh I'll have a conversation with you whether you want it or not). Well, the connection is obvious once you think about it: during the wild west days people used to ride horses, kill each other for no apparent reason and raise cattle, while in Free Software we write software. I rest my case.

I've spent the last week with Aaron. I absolutely love hanging out with him. It's platonic. Or so I think, with all the heavy drinking that I do, it all gets a little blurry. Also, Peyton (Aaron's son) is a wickedly cool kid.

Anyway, I have a lot of Gallium3D things to do which are a priority, but at night Aaron and I hacked on Plasma and KDE. I think I speak on behalf of Aaron when I say that we became computer programmers for the women. Which might seem a little confusing to, well, all of you (especially if you're a woman) until you realize that it came down to being either a computer programmer or a crackhead. The computer programmer job pays, like, way better, and if I had to pick a second reason why I do what I do, it's money.
We got the Dashboard widgets working. It's something that I've wanted to do for the longest time. Obviously not all of them work, because some of them use OS X-specific APIs (like Core Image magic).
I also added interfaces to use Plasma's DataEngines from JavaScript in web applets. So you would do something like

var engine = window.plasma.loadDataEngine("time");
var data = engine.query("Local");
document.getElementById('time').innerHTML = "Time is " + data.value("Time");

to use Plasma's time data engine to display the current time. One could use Plasma's Solid data engine to get the list of all the devices attached to the computer and display it in the web applet, which would be a little more useful than yet another time widget, but you ain't enterprise-ready unless you have 23 clock applets, and we're almost there. There's also a small bug somewhere, which apparently doesn't exist in Qt and is, in fact, a figment of my own imagination, due to which the background looks black. On the Chuck Norris widget (which, like a lot of other Dashboard widgets, just works: you download it, click on it, run it, show it to all your friends, and remove it once they're gone because it's pretty damn useless) it looks like this:



Do you see black? No, you don't! It's simply that Chuck Norris is a black hole that consumes everything around it, including all the color. Deal with it. Chuck Norris has.

Thursday, February 07, 2008

OpenVG and accelerating 2D

I tend to break a lot of keyboards. Not because I release all the aggression that I hold deep within me on them, but because I drool a lot. It's not my fault, I really do love graphical bling. Since I'm one of the people who flourishes not when he's doing well, but when others are doing just as badly I've thought about getting other people on the "excessive drooling" bandwagon.

I've tried it in the past. First with my "you ain't cool, unless you drool" campaign, which was not as successful as I'd seen it be in my head. It made me realize that marketing is unlikely to be one of my superpowers. That was a real blow, especially since it came a day after I'd established that there's, like, a 95% chance that I can't fly, and that if I can, my neighbor will be seriously pissed if I keep landing on his car. You'd think they'd build them stronger, but I digress. After that I went with my strengths and made two technical efforts. The first one led to EXA, the second to Glucose. Both are acceleration architectures that try to accelerate Xrender - the API which we use for 2D on X. What Xrender is very good at is text. What Xrender is not so good at is everything else.

Furthermore, what one really wants to do nowadays is use the results of 2D rendering in a 3D environment as a texture, or simply implement effects on top of the 2D rendering with shaders. Wouldn't it be nice to have a standard and simple API for all of that? You bet your ass it would. In this particular case "you bet your head" would be a more suitable expression, since by the simple act of reading this blog it's clear you've already given up on your ass and staked your future on your head. I endorse that (both the head-more-important-than-ass theory and the better-API idea). Currently, through the magic of DRM TTM and GLX_texture_from_pixmap, one could partially achieve that (we'd need GLX_pixmap_from_texture to finish it), but if you've seen Japanese horror movies, you know they've got nothing on the code one ends up with when doing that.

I already mentioned in my previous post that we can lay any number of APIs on top of Gallium3D. In fact, in the last diagram I already put the two graphics APIs that interest me, OpenVG and OpenGL, on top of it. In my spare time I've started implementing OpenVG on top of Gallium3D. I'm implementing 1.1, which hasn't been officially released yet. While OpenVG 1.0 is essentially useless for our drool-causing desktops because it doesn't even touch the subject of text handling, 1.1 does, and that in itself makes it a great low-level 2D vector graphics API.
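To give a flavor of why I like it as a 2D API, here's a rough sketch of mine (based on the OpenVG 1.x headers, not on the Gallium3D state tracker) of drawing a filled path, assuming a context has already been created and bound to a drawing surface:

#include <VG/openvg.h>

void drawTriangle()
{
    // A closed triangular path: move, two lines, close.
    const VGubyte segments[] = { VG_MOVE_TO_ABS, VG_LINE_TO_ABS,
                                 VG_LINE_TO_ABS, VG_CLOSE_PATH };
    const VGfloat coords[]   = { 10.0f, 10.0f, 110.0f, 10.0f, 60.0f, 90.0f };

    VGPath path = vgCreatePath(VG_PATH_FORMAT_STANDARD, VG_PATH_DATATYPE_F,
                               1.0f, 0.0f, 4, 6, VG_PATH_CAPABILITY_ALL);
    vgAppendPathData(path, 4, segments, coords);

    // A plain red fill paint.
    const VGfloat red[] = { 1.0f, 0.0f, 0.0f, 1.0f };
    VGPaint paint = vgCreatePaint();
    vgSetParameterfv(paint, VG_PAINT_COLOR, 4, red);
    vgSetPaint(paint, VG_FILL_PATH);

    vgDrawPath(path, VG_FILL_PATH);

    vgDestroyPaint(paint);
    vgDestroyPath(path);
}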

We already have OpenVG engines for Qt and Cairo, which should make the switch fairly painless. "Hey", you say, and I swiftly ignore you, because I have a name, you know. "Sunshine", you correct yourself, and I smile and say "Huh?". "I want my KDE 4 fast and smooth! Gallium3D has this 3D thing in the name and I have hardware that only does 2D, what about me?". Nothing. You need to use other great solutions. "But my hardware can accelerate blits and lines!". Awesome, then this will likely rock your world. As long as you don't try to run any new applications, of course. Even embedded GPUs are now programmable, and putting the future of our eye candy on technology that predates two-year-old embedded GPUs is an exercise in silliness which my chiseled pecs refuse to engage in.

The OpenVG standard says "It is possible to provide OpenVG on a platform without supporting EGL. In this case, the host operating system must provide some alternative means of creating a context and binding it to a drawing surface and a rendering thread.", which is exactly what we want. That's because we already have that layer: it's GLX. GLX will do the context creation for us. This also means that we'll be able to seamlessly combine 2D vector graphics and 3D, and manipulate the results of vector rendering the same way we would normal 3D.
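In practice the plumbing would look roughly like this - a sketch of mine, under the assumption that the OpenVG implementation is hooked into GLX as described, with error handling omitted: GLX creates and binds the context, and from then on it's plain OpenVG calls:

#include <X11/Xlib.h>
#include <GL/glx.h>
#include <VG/openvg.h>

int main()
{
    Display *dpy = XOpenDisplay(0);
    int attribs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, GLX_DEPTH_SIZE, 16, None };
    XVisualInfo *visual = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);

    Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, visual->screen),
                                     0, 0, 400, 400, 0, 0, 0);
    XMapWindow(dpy, win);

    // GLX does the context creation and binding for us.
    GLXContext ctx = glXCreateContext(dpy, visual, 0, True);
    glXMakeCurrent(dpy, win, ctx);

    // From here on, plain OpenVG calls hit the current context/surface.
    const VGfloat clearColor[] = { 1.0f, 1.0f, 1.0f, 1.0f };
    vgSetfv(VG_CLEAR_COLOR, 4, clearColor);
    vgClear(0, 0, 400, 400);
    glXSwapBuffers(dpy, win);

    return 0;
}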

Finally, 2D graphics will be accelerated the same way 3D is, and those hours you've spent playing 3D games thinking "how the hell is it possible that putting a clock on my desktop makes it choppy when this runs at 400fps?" will be just a story you'll get to tell your grandkids (while they stare at you with the "please god, let me be adopted" look). As a bonus we get two extremely well-documented APIs (OpenGL and OpenVG) as our foundation, and instead of having two drivers to accelerate 2D and 3D we'll have a single driver.

So what happens with Glucose? Alan and José are still working on it a bit, and in the short term it does provide a pretty enticing solution, but long term the OpenVG/OpenGL combo is the only thing that really makes sense.

With much love,
Drool Coordinator

Wednesday, February 06, 2008

GPGPU

Would you like to buy a vowel? Pick "j", it's a good one. So what if it's not a vowel. My blog, my rules. Lately I've had a major crush on all things "J". Which is why I moved to Japan.

It's part of my "Most expensive places in the world" tour, unlikely coming to a city near you. I lived in New York City, Oslo, London and now Tokyo. I'm going to write a book about all of that entitled "How to see the world while having no money whatsoever". It's really more of a pamphlet. I have one sentence so far "Find good friends" and the rest are just pictures of black (and they capture the very essence of it).

José Fonseca helped me immensely with the move to Japan, which was great. Japan is amazing, even though finding vegetarian food is almost like a puzzle game and trying to read Japanese makes me feel very violated. So if you live in Tokyo your prayers have been answered, I'm here for your pleasure. Depending on your definition of pleasure of course.

Short of that I've been working on this "graphics" thing. You might have heard of it. Apparently it's real popular in some circles. I've been asked about GPGPU a few times and since I'm here to answer all questions (usually in the most sarcastic way possible... don't judge me, bible says not to, I was born this way) I'm going to talk about GPGPU.

To do GPGPU there's ATI's CTM, NVIDIA's CUDA, Brook, and a number of others. One of the issues is that there is no standard API for doing GPGPU across GPUs from different vendors, so people end up using, e.g., OpenGL. So the question is whether Gallium3D could make such things as scatter reads accessible without falling back to using vertex shaders, or a vertex shader/fragment shader combination, to achieve them.

The core purpose of Gallium3D is to model the way graphics hardware actually works. So if the ability to do scatter reads is available in modern hardware, then Gallium3D will have support for it in the API. Now, having said that, it looks like scatter reads are usually done in a few steps, meaning that while some of the GPGPU-specific APIs expose them as one call, internally a number of cycles pass as a few instructions are actually executed to satisfy the request. As such, this functionality is obviously not the best thing to expose in a piece of code which models the way hardware works. That functionality is something one would implement on top of that API.
I do not have docs for the latest GPUs from ATI and AMD, so I can't say what it is that they definitely support. If you have that info, let me know. As I said, the idea is that if we see hardware supporting something natively, then it will be exposed in Gallium3D.

Also, you wouldn't want to use Gallium3D as the GPGPU API. It is too low-level for that and exposes vast parts of the graphics pipeline. What you (or "I", with a vast amount of convincing and promises of eternal love) would do is write a "state tracker". State trackers are pieces of code layered on top of Gallium3D which are used to do state handling for the public API of your choice. Any API layered like this will execute directly on the GPU. I'm not 100% certain whether this will cure all sickness and stop world hunger, but it should do what even Viagra never could for all GPGPU fanatics. The way this looks is a little like this:

This also shows an important aspect of Gallium3D - to accelerate any number of graphical APIs, or to create a GPU-based non-graphics API, one doesn't need N drivers (with N being the number of APIs), as we currently do. One Gallium3D driver (that's singular!) is enough to accelerate 2D, 3D, GPGPU, and my blog-writing skills. What's even better is that, of the aforementioned, only the last one is wishful thinking.

So one would create some nice dedicated GPGPU API and put it on top of Gallium3D. Also, since Gallium3D started using LLVM for shaders, with minimal effort it's perfectly possible to put any language on top of the GPU.
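To make the layering a bit more tangible, here's a purely illustrative sketch; every name in it (pipe_like_context, gpgpu_*) is hypothetical and is not the real Gallium3D interface. It only shows the shape of the idea: a thin public API whose calls a state tracker translates into driver state changes and draws:

// Hypothetical driver-facing hooks a state tracker would call into.
struct pipe_like_context {
    void (*bind_fragment_shader)(pipe_like_context *ctx, const char *shaderIR);
    void (*set_input_buffer)(pipe_like_context *ctx, int slot,
                             const void *data, int size);
    void (*draw_quad)(pipe_like_context *ctx, int width, int height);
};

// The tiny "public" GPGPU API that the state tracker exposes to applications.
struct gpgpu_kernel {
    const char *shaderIR;   // IR generated from the user's kernel source
};

void gpgpu_run(pipe_like_context *ctx, const gpgpu_kernel *kernel,
               const void *input, int inputSize, int width, int height)
{
    // "State tracking": one public API call becomes driver state plus a
    // draw that runs the kernel once per output element.
    ctx->bind_fragment_shader(ctx, kernel->shaderIR);
    ctx->set_input_buffer(ctx, 0, input, inputSize);
    ctx->draw_quad(ctx, width, height);
}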

And they lived happily ever after... "Who" did is a detail, since it's obvious they lived happily ever after thanks to Gallium3D.