Sunday, February 08, 2009

Synchronizing a multimedia graph system

One of the things I've been working on (well, mostly planning so far) is a graph system for Multimedia (similar to DirectShow, but there's a reason why I'm not simply using DirectShow that I'll explain another time).

I've spent some time figuring out how best to synchronize modules in the graph that have latency, such that the output is synchronized. Every Module that has some sort of latency (a sound card output module or an FFT convolution module, for example) reports this latency to the underlying Graph.


Based on this, the Graph determines the downstream latency of each module. The latencies of serially connected modules are added together, and at junction points the maximum latency is used.
Once this is done, the graph strategically inserts delays. At junction points, delays are inserted into the appropriate output connections so that every branch ends up with the same downstream delay as the source module. In other words, the faster branches are padded up to the slowest one.
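
To make that concrete, here is a minimal Python sketch of how I read the scheme. It is not code from the actual graph system: the names `plan_delays`, `latency`, and `outputs` are made up for illustration, and it assumes the graph has no cycles.

```python
def plan_delays(latency, outputs):
    """Work out the delay to insert on every connection.

    latency: module name -> latency the module reports (any unit)
    outputs: module name -> list of modules it feeds
    Both structures are hypothetical stand-ins for the real graph.
    """
    memo = {}

    def downstream(module):
        # Worst-case latency of everything after this module:
        # serial latencies add up, junctions take the maximum.
        if module not in memo:
            memo[module] = max(
                (latency[dst] + downstream(dst) for dst in outputs.get(module, [])),
                default=0,
            )
        return memo[module]

    delays = {}
    for src, dsts in outputs.items():
        for dst in dsts:
            # Pad this branch up to the slowest branch leaving src.
            delays[(src, dst)] = downstream(src) - (latency[dst] + downstream(dst))
    return delays


latency = {"source": 0, "fft": 30, "out_a": 10, "out_b": 10}
outputs = {"source": ["fft", "out_a"], "fft": ["out_b"]}
print(plan_delays(latency, outputs))
```

In this made-up example the direct path to `out_a` gets a delay of 30, so it lines up with the path that goes through the FFT module; the other two connections need no delay.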

As an optimization, the graph could forego delays and instead use different seek timings when upstream modules support seeking. Also, modules marked as free running need not be delayed (the modules themselves should say if they are free running or not).

I welcome any and all comments.

Wednesday, November 19, 2008

Reflection on WDM/KS

So I finally got all the kinks worked out of our WDM Kernel Streaming audio stuff at work, and I've decided to write up a bit of a "Don't forget or you'll die" section (I'm not exaggerating - doing something wrong is likely to cause a bluescreen. I'm not kidding.)

Keep the buffers around that you use to send data to/from the pins. The kernel sees your reads and writes not as a synchronous call, but as an asynchronous task. When rendering, after you issue the write call, it will access your buffer at will to get the data, until it triggers the event that it's done. When capturing, it will write stuff into your buffer after your read call until the event is triggered. It won't copy it away for you like other APIs will. 

Make sure you set the pins to KSSTATE_RUN, and make sure you enable them. Otherwise, you'll be surprised that events simply don't get called. 

Don't rely on _ANYTHING_. Check every HRESULT religiously. 

Make sure your audio thread is elevated to realtime, otherwise expect crackles even at the most extreme latency settings.

Make sure you shut everything down correctly. Even if you set the pins to KSSTATE_STOP, if you don't explicitly shut down the audio pin many buggy sound drivers will then cause a blue screen. I had to debug this problem via a kernel debugger, even - the driver runs in kernel mode and so is free to snoop around your program's address space. Once your process quits, and the sound driver doesn't catch that, it'll trigger a kernel-mode access violation (STOP 0x0000000A, IRQL_NOT_LESS_OR_EQUAL). This also means that if your app crashes, and you have one of those buggy sound drivers, a blue screen is inevitable.

WDM kernel streaming is a mess, truly. Everything that one could consider good is buried beneath a whole lot of stuff that has to be done, is designed weird, and must be remembered because otherwise you'll get no sound, a crash, or a bluescreen. What doesn't help is that there's not really that much strain put on the audio drivers to actually implement things, which means you have to individually check everything. One of the computers I tested on literally didn't support querying for the name of the device! Some sound drivers allow multiple render pins, which allow the kmixer and your app to peacefully coexist. Some only allow one, which means it's either you or kmixer (which means many, many support calls).

I do have to say, however, that the flexibility that the WDM model allows is stunning. I just wish they could have put it in an easier to use package. 

Friday, October 31, 2008

WDM/KS

Wow.

Kernel Streaming is hard.

At work, I have to implement Kernel Streaming support for Windows XP computers. XP, unfortunately, doesn't have Vista's WASAPI model, which is a very, very nice API and gives low-latency audio. DirectShow etc. all go through XP's kmixer, which adds an inherent 30ms latency, and this is unacceptable for what I need to do. So, I get to implement WDM Kernel Streaming.

And it is hard.

I think this is one of the hardest things I've done. You really need to get everything right, otherwise it just won't work. I have much more respect than ever for Michael Tippach, author of ASIO4ALL, which uses Kernel Streaming (among other things) to get things to work.

Tuesday, October 21, 2008

Digsby Widget

I've got a Digsby Widget here now, and I dropped the Meebo widget since I don't use that anymore.

Life

I should blog more.

I went through and deleted some older, silly, irrelevant, blah blog posts, and I intend to hopefully blog more. Hah.

So, how's life?

  • I'm in the process of getting my CE Driver's license (the equivalent of Class A Truck License in the US) for the Fire Department. Fuck yes!
  • Working at rocketscience GmbH, becoming the absolute master C++ dude and gaining a ton of experience I never would have in 20 years of University. Yeah, quitting was worth it.
  • When I'm not busy as fuck, I try to work on Epiar. I'll be reworking the code internally to a client/server system, and also getting multiplayer to work (at least rudimentarily).
Fun stuff. 

Tuesday, November 14, 2006

Cygwin - oh how I loathe it.

I actually wrote this rant about 2 years ago, but I never got around to actually submitting it.

In the last 4 years, I've installed Cygwin probably 8 different times. It never works on the first try, and it's always something different. I was forced to install it again, for some software I was required to use for class. Last night, I downloaded setup.exe, ran it, selected a mirror, and installed the default packages, which included vi and gcc, and I added Tcl/Tk. The first time, it got through about 80% of the install process before it aborted on its own, something about not being able to connect to the mirror. Fair enough, that happens. So I ran setup.exe again and selected a different mirror. I would have expected it to download only what it hadn't already downloaded, but instead it decided to download everything again. Fair game, I thought, I have a flat rate anyway. By that time I was tired, so I went to bed.

Next morning, it's finished installing, though it gives me an error about a missing cygncurses-8.dll while still telling me it installed successfully. I tried to open up the shell, just to find that it doesn't work because cygncurses-8.dll doesn't exist (apparently it's a dependency). I did a search on my computer, and it was nowhere to be found. I load up setup.exe again, look for libncurses8, and it says it's already installed. I tell it to reinstall only libncurses8, and in the process, it installs a bunch of other stuff I don't know about. The problem still persisted, however, and cygncurses-8.dll still didn't exist on my machine. I did a quick Google search and found some University software which had a copy of it in its binary distribution, so I downloaded that and put the .dll where it belongs.

Once that was done, bash started up, but there were no fancy colors like you usually see in the Cygwin shell. I typed ls and was greeted with a friendly "bash: ls: command not found" - the equivalent of a hearty "fuck you, haha" reverberating from C:\cygwin\. I run setup.exe again, tell it to uninstall. Before it uninstalls, it asks me if I want to install certain components which had dependencies - in other words, it asked me if I wanted to install everything that I'm about to remove to fill these dependencies. I said no, and it proceeded to uninstall. "Uninstall", in that it deleted a few files but left the bulk where it was (what a great installer). I deleted most of the cygwin tree and ran setup.exe again and did a clean install. Lo and behold, this time it just happened to work, and bash loaded up with the pretty colors and a working ls command. But wait - I installed the same packages as I did above, yet it decided not to install any of the things I could ever need cygwin for after all. That means no text editor for writing a script (I tried vi, emacs, nano, pico, joe, even ed), and no C or C++ compiler. I had to go back and install those again, with that absolutely godawful setup.exe tool.

So, that's that. I've had a few more troubles since. I installed Cygwin, which just happened to work fine the first time, to be greeted by, again, the friendly "ls: command not found" prompt. After some rather ... "interesting" conversation with two cygwin developers on IRC, I was told that most likely I had some other software installed that was using cygwin. I couldn't think of any, though the problem was right in front of me - I was using irssi for Windows, which used cygwin1.dll internally. Apparently, Cygwin cannot work with more than one copy of cygwin1.dll, so once I quit irssi, I was able to correctly get cygwin to work by ... rerunning setup.exe.

So the above is bad enough, but this whole thing combines with those very interesting cygwin developers that I have been in contact with because of issues. EVERY TIME, it was MY fault that it didn't work - I did something wrong, I did something "weird" to my Computer, I hacked the cygwin install so it didn't work on purpose, I installed irssi knowing it would fuck up the install. Well, I didn't know the irssi binary used cygwin, and I also didn't know cygwin was so fucked up that it's only possible to have one .dll of it otherwise they'll step all over each other with weird, seemingly impossible bugs. Oh yay. If I wrote code like that at work, I'd be fired immediately. And what gives you, dear cygwin developers, the moral right to insult me and my intelligence in this manner? I was thoroughly friendly throughout the entire exchange, I did not get condescending even though you immediately accused me of doing everything wrong, and I stayed calm when I found out a major shortcoming of cygwin was causing this.

It's a free product though, so I can't vote with my wallet. That's fine, I'll vote with something else instead - freedom of speech. I don't intend to use cygwin ever again, and now that I'm out of University, I can't think of a single reason why I would ever need to. I'd rather run Linux in VMware or on another partition or on another computer, and I urge you to do the same.

Saturday, October 28, 2006

Driving for the Fire dept.


I'm finally allowed to drive it! That is one of the three vans our Fire Department has, and I'm allowed to drive that one now. It's basically the "if it don't fit anywhere else, put it here" van - it has a myriad of hoses, pumps, oil repellent, shovels, brooms, some containers for things, an industrial-grade wet/dry vacuum cleaner, and a bunch of other stuff.
I was actually surprised the first time I drove it. It is very comfortable, easy to drive, and very direct - there's no power steering (which isn't a big deal at all, unless you're not moving), and you feel pretty much every movement on the road. Unlike "modern" cars, there's a 1 to 1 translation from your hands to the wheels and vice versa, and I like that. It's slow though - it's loaded to its maximum of 3.5 tons and only has something like 80HP (Diesel), so it takes quite a bit to get up to speed. But I'll be honest - after driving around in E-Class Mercedes and 5-Series Golfs, it's very refreshing to drive something that doesn't immediately jump a million miles forward when you tap the gas pedal.

Thursday, October 19, 2006

Internet Explorer 7 - why the hate?

Internet Explorer 7 is out. I've been using it a bit, and after seeing the usual Slashdot ramblings ("omg evil") I decided to write this rant.
I've used the major browsers for Windows. Internet Explorer 6 and below are just completely insecure and featureless. Firefox and its cousins are huge, slow, and never work exactly right: there's always at least one small thing that isn't working (and more often, something bigger), not to mention the memory leaks and the occasional let's-use-100%-CPU-for-no-reason that nobody wants to admit is there. Opera used to be cool, but it doesn't work with all websites and the recent versions have become very Firefox-like. So I was using Flock for a while, which at least kinda worked except that it ate memory and leaked like a faucet.
Upon Windows reinstall, I decided to try something new. I got the Windows Internet Explorer 7 beta, and the Windows Defender Beta 2 with it. Installing it required a restart, but that was okay, I needed a restart anyway because of Windows Updates.
After the restart I started IE, and it came up almost instantly. Flock and Opera and Firefox all take ages to start up, IE took just a bit. I browsed around a bit and checked my frequent sites to find they all work perfectly fine. I changed a setting regarding tab behavior and changed the default search to Google, but other than that I've left it as it is. I'm fully satisfied. The ClearType stuff makes text look very, very good. It's also very responsive. Firefox/Flock would frequently hang for 1-10 seconds while it did something, like switch tabs. Nothing like that here, everything works nearly instantly. The Quick Tabs feature is nice for previewing the open tabs (it basically shows a graphical, scaled down view of all tabs and you can click one to select).
Most importantly, I have had exactly 0 problems with Spyware, Adware, or Viruses - even after going to pretty suspect sites. The combination Internet Explorer + Defender pretty much let nothing through (and I'm VERY satisfied with Defender as well, but I'll leave that to another rant).
So why all the hate towards it? Web developers say that a few CSS things are missing. But you're already making separate CSS files for every browser anyway, right? It's not like those are things that can't be worked around, and it's not like everybody isn't already using different CSS files for different browsers. And Firefox doesn't pass Acid2 either, now does it?
I'll keep using Internet Explorer 7. Finally, it's a browser that doesn't make my system implode under the weight, while still having the features I want (tabs, favorites) and being secure.

Wednesday, August 16, 2006

It clicked

The .NET Framework is huge, and really very different from the stuff I've used before. My dad's been using it for years and he says once it clicks, it'll click. Well, I think it clicked for me today, or at least the network stuff did.

I realized that you just can't be afraid of threads with .NET. It's just so inherently thread oriented that you can't really avoid it. I'm trying to set up a little server thing (more details later), and with C, I would have had maybe two threads maximum, one of them sending out the data to each client sequentially. I tried this with .NET, and while it's possible, it dawned on me that this is not the way it should be done. And it clicked.

My second attempt is much easier to work with. My Accept loop basically accepts a socket, then spawns a thread with the socket as argument. That thread then initializes everything. And each of my sockets just spawns two threads, one for receiving and one for sending, and they simply use a bunch of WaitHandles to wait on something to happen and then do it. It's so much easier and nicer to work with that it's pathetic I ever thought a different way.
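
For the curious, the same shape can be sketched in Python rather than .NET. This is only an illustration of the pattern, not my actual server code: a blocking `queue.Queue` stands in for the WaitHandles, and the names `handle_client`, `accept_loop`, `serve`, and `on_message` are invented for the example.

```python
import queue
import socket
import threading


def handle_client(conn, on_message):
    """Set up one connection: one thread for receiving, one for sending."""
    outbox = queue.Queue()  # the sender thread sleeps on this until there is work

    def receiver():
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the peer closed the connection
                break
            on_message(data, outbox)
        outbox.put(None)  # wake the sender so it can shut down

    def sender():
        while True:
            data = outbox.get()  # blocks until something is queued
            if data is None:
                break
            conn.sendall(data)
        conn.close()

    threading.Thread(target=receiver, daemon=True).start()
    threading.Thread(target=sender, daemon=True).start()
    return outbox


def accept_loop(server, on_message):
    """Accept sockets forever and hand each one to its own setup thread."""
    while True:
        conn, _addr = server.accept()
        threading.Thread(
            target=handle_client, args=(conn, on_message), daemon=True
        ).start()


def serve(host, port, on_message):
    """Hypothetical entry point: listen on host:port and run the accept loop."""
    with socket.create_server((host, port)) as server:
        accept_loop(server, on_message)
```

Something like `serve("127.0.0.1", 9000, lambda data, outbox: outbox.put(data))` would give a simple echo server. Anything that wants to send to a client just puts bytes on that client's outbox, which is where the locking conventions come in once several threads share state.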

It's not ideal, of course. You still have to deal with locks and stuff, but really, there are just a few conventions which, if you regularly follow them, will make your life so much easier.

Back to coding....


Tuesday, August 01, 2006

Runtime "critical loop" code generation

This is something I definitely want to do. It's something I've thought of for a while, and a concept which isn't that new, actually.

In the Forth programming language, it's not uncommon to see code that is compiled before use at runtime. In fact, most Forth systems work that way .. the application is run by "interpreting" the main source file, which compiles the parts it needs and then executes them. So basically, the whole system is just source code which is compiled on demand. There is no need for relocatable code or dynamic linking or any of that. It's really a very brilliant way of doing things, something that I would like to see implemented in modern languages.

So, what is my idea? I'm sure you can guess by now ... embed source code in the executable and compile it on the fly. This seems pretty useless on its own but my idea has an interesting twist. Suppose this is an Audio/MIDI Sequencer application (such as Sonar or FL Studio or such). Basically, you have several tracks, each with some number of insert effects or such, routing through various bus tracks and finally all arriving at the master track. Probably the most common way of doing it, currently, is tracing from the master track inwards, in a depth-first processing sort of way, probably using a recursive function. This is pretty flexible, as anything you throw at it (except feedback loops and multiple sends) will work without a hitch, and even those things can be easily implemented with 3 or 4 additional lines of code each.
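
The recursive, depth-first approach could look roughly like this in Python. It is a guess at the general shape, not code from any real sequencer; `mix`, `receives`, `sources`, and `effects` are hypothetical names, and a track that sends to two places would be rendered twice here, which is exactly the multiple-sends case mentioned above.

```python
def mix(track, receives, sources, effects, length):
    """Render one block for `track` by pulling from everything that feeds it.

    receives: track -> list of tracks routed into it
    sources:  track -> list of `length` samples the track produces itself
    effects:  track -> list of per-sample functions (insert effects)
    """
    block = list(sources.get(track, [0.0] * length))
    for feeder in receives.get(track, []):
        fed = mix(feeder, receives, sources, effects, length)  # depth first
        block = [a + b for a, b in zip(block, fed)]
    for fx in effects.get(track, []):
        block = [fx(sample) for sample in block]
    return block


receives = {"master": ["drum_bus", "bass"], "drum_bus": ["kick"]}
sources = {"kick": [1.0, 0.5], "bass": [0.25, 0.25]}
effects = {"drum_bus": [lambda s: s * 0.5]}
print(mix("master", receives, sources, effects, length=2))
```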

However, it really isn't the fastest. Further optimizations are possible by simply iterating through a queue of tracks, adding receive tracks at the end as dependencies are found, or even by precomputing the ideal order and then running through that list. It's something no audio software maker that I know of gives details on, and I think it's simply because it's not a great marketing point, since there's not a lot of magic behind it.
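
Precomputing the order is really just a topological sort of the routing. A small Python sketch, assuming a hypothetical `sends` table in which every track appears as a key (the master track with an empty list):

```python
from collections import deque


def processing_order(sends):
    """Return the tracks in an order where every track comes after its feeders.

    sends: track -> list of tracks it is routed to (a made-up structure).
    """
    # Count how many tracks still have to be processed before each track.
    pending = {track: 0 for track in sends}
    for destinations in sends.values():
        for dst in destinations:
            pending[dst] += 1

    ready = deque(track for track, count in pending.items() if count == 0)
    order = []
    while ready:
        track = ready.popleft()
        order.append(track)
        for dst in sends[track]:
            pending[dst] -= 1
            if pending[dst] == 0:  # all of its inputs are done now
                ready.append(dst)

    if len(order) != len(sends):
        raise ValueError("feedback loop in the routing")
    return order


sends = {
    "kick": ["drum_bus"],
    "snare": ["drum_bus"],
    "bass": ["master"],
    "drum_bus": ["master"],
    "master": [],
}
print(processing_order(sends))
```

The order only has to be recomputed when the routing changes, so the per-block work is a plain loop over a list.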

So, my idea? You may know of things like Synthmaker, Synthedit, Max/MSP, CPS, Reaktor, and a few other graphical modular systems (modules interconnected by wires) which can be used to "program" DSP algorithms without touching a line of code. That is a pretty big inspiration for my idea.

How it should work: The audio mixdown engine is, pretty much, dynamically generated. And by that I mean compiled code. A signal path tracing is done (through the bus tracks and regular tracks and effects and volumes and pans and whatever), and an in-memory modular representation is created (like the visual modular systems above, except not visual and dynamically generated). From this, do a depth-first trace, having each module create some sort of bytecode (describing some parallel language, so that SSE/MMX/AltiVec/whatever optimizations can be done), and chain all of this together into one string of bytecode. This bytecode is then, by a hopefully quick optimizing compiler, converted into raw machine code, which

  • Does not contain any function calls
  • Does not have any sort of jumps or conditionals
  • Is SSE/MMX/AltiVec/whatever optimized so more than one sample is processed at a time
  • Processes the entire current signal chain :)
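
A toy version of the idea fits in a few lines of Python, with big caveats: this generates Python source and compiles it to Python bytecode, not to raw machine code, and there is no SIMD anywhere. The module emitters `gain` and `offset` and the function `compile_chain` are made up for the example. What it does show is each module contributing a fragment, and the fragments being chained into one flat expression with no per-module function calls.

```python
def gain(amount):
    """Hypothetical module: emits code that scales the incoming signal."""
    return lambda signal: f"({signal} * {amount!r})"


def offset(amount):
    """Hypothetical module: emits code that adds a constant to the signal."""
    return lambda signal: f"({signal} + {amount!r})"


def compile_chain(modules):
    """Chain the emitted fragments into one function and compile it at runtime."""
    expr = "x"
    for emit in modules:
        expr = emit(expr)  # each module wraps what came before it
    source = (
        "def process(samples):\n"
        f"    return [{expr} for x in samples]\n"
    )
    namespace = {}
    exec(compile(source, "<generated chain>", "exec"), namespace)
    return namespace["process"], source


process, source = compile_chain([gain(0.5), offset(1.0)])
print(source)
print(process([2.0, 4.0]))
```

The generated function would be rebuilt only when the signal chain changes. A real version would still need the loop over samples, but the body of that loop would be straight-line code.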

As I was writing this, I did think of a problem, however, and that is external plugins. However, I think it is not of too much consequence, since there needs to be a prebuffer of a certain size anyway, and the plugins are called to fill the buffer when it needs to be filled. The above audio mixdown loop would then simply extract things from that buffer.

What do you think? Possible? Reasonable? I know it's about as unportable as programming in raw binary, but most Audio stuff is done on x86 these days anyway, and it would only take a different compiler to port that part to a new platform.