Sean Middleditch » 2004 » May

The seminars weren’t quite so exciting as I had hoped. Mostly marketing drool and “introducing Linux to the Windows/Netware folks,” very little technical meat. Although I must say that the ease of use Novell/SuSE has put into their products is amazing. Red Hat seriously needs to get off their ass and get someone to write decent config and management utilities. The system-config-* tools in Fedora are complete and absolute pieces of crap compared to what I saw today.

The Novell Identity Manager talks also gave me the idea to work on an Open Source identity management system. Just having a user account isn’t really enough. Some users have multiple email addresses, and they need to be easily associated with different servers. Different systems may require different login names, as they are developed by different vendors with different ideas of what makes a valid user name. And so on.

A basic but functional identity manager shouldn’t be hard to build at all. You need a store that keeps a truly unique ID for each user (a UUID, probably) and a list of attributes attached to that object. The attributes, however, need to be able to be keyed with a machine and/or service ID. The system needs to be searchable by its attributes. The whole thing could be modelled on LDAP, on SQL with an LDAP interface, or on a custom engine/API with an LDAP interface. The LDAP interface can be used for the many LDAP-aware apps around (we could map a special context to simulate the service lookup on attributes; e.g., look for cn=USER,ou=SERVICE,o=DOMAIN). Other apps could use a cleaner, well-documented, and stable API for accessing the information. A module for NSS can be written that would provide the limited information NSS provides to the system. A similar module for PAM could be written that also passes in the service name and does authentication using the system. We would, in this system, pretty much make PAM and NSS deprecated APIs in preference to the richer identity API.
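A minimal sketch of that data model, in Python for brevity. The class and method names (`Identity`, `IdentityStore`, `set_attr`, `get_attr`, `search`) are made up for illustration, the store is just an in-memory dict, and the fallback from a service-specific value to a global default is an assumption about how the lookup might behave, not a worked-out design.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Identity:
    """One person, identified by a UUID rather than by any login name."""
    uid: uuid.UUID = field(default_factory=uuid.uuid4)
    # Maps (attribute name, service ID or None) -> value.
    # A service ID of None marks the global default for that attribute.
    attrs: dict = field(default_factory=dict)

    def set_attr(self, name: str, value: str, service: Optional[str] = None) -> None:
        self.attrs[(name, service)] = value

    def get_attr(self, name: str, service: Optional[str] = None) -> Optional[str]:
        # Prefer the service-specific value, then fall back to the default.
        if (name, service) in self.attrs:
            return self.attrs[(name, service)]
        return self.attrs.get((name, None))


class IdentityStore:
    """In-memory stand-in for the LDAP/SQL/custom backend."""

    def __init__(self) -> None:
        self._by_uid = {}

    def add(self, identity: Identity) -> Identity:
        self._by_uid[identity.uid] = identity
        return identity

    def search(self, name: str, value: str, service: Optional[str] = None) -> list:
        """Return every identity whose attribute resolves to `value` for `service`."""
        return [i for i in self._by_uid.values() if i.get_attr(name, service) == value]


store = IdentityStore()
alice = store.add(Identity())
alice.set_attr("login", "alice")                          # default login name
alice.set_attr("login", "ASMITH01", service="mainframe")  # vendor-specific login
alice.set_attr("mail", "alice@example.org")
```

Here `alice.get_attr("login", "mainframe")` resolves to the vendor-specific name, while a service with no override gets the default. An LDAP front end could map a cn=USER,ou=SERVICE,o=DOMAIN lookup onto `search("login", USER, SERVICE)`.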

The reasons I’m thinking about this are numerous. For one, the basic UNIX system sucks. Much of UNIX sucks, to be honest, though most of us would probably agree it’s better than the alternatives. The user/authentication model, though, is very simplistic and inflexible. You can authenticate against a user name using arbitrary rules with PAM. You can look up basic user information using no key other than name or UID with rather simplistic rules using NSS. LDAP serves a purpose, but it too is too simplistic in some cases (it requires heavy app customization to be able to pull out different data depending on the service accessing it) and is too obtuse to use efficiently otherwise (there’s no standard simple way of storing identities and looking them up). Novell Identity Manager is a great system, although it’s not Free/Open and is also a little complex from an administration standpoint.

And, unlike so many other ideas I’ve had, I’ve actually *started* on this one. Unfortunately, like so many other ideas I’ve had, I probably won’t ever finish it. ;-)

Back on the original topic of the BrainShare conference, the event was fun other than the sessions themselves. ;-) The food was delicious, in any case, and while there weren’t any Ximian folks there to meet, there were some rather cool Novell folks that I’m afraid I hadn’t ever heard of before, but should’ve, because they’re all very knowledgeable people with a lot of passion for the products they work on. Much like the Ximian folk.

Also, for anyone who needs an enterprise-grade groupware solution, check out GroupWise on Linux. It’s not Free Software, but if you’re the kind that prefers solving problems over arguing theology, it’s probably the best groupware system you’ll find, bar none.

Tomorrow is the Best of BrainShare 2004 Road Show in Detroit, which my boss and I will be attending. Not the big fun extravaganza that the real BrainShare is, but it should still be educational and all. I’m curious as to who the speakers will be for the seminars I signed up for (the Novell technology + Linux ones, of course). I’m guessing most of the luminaries I’d like to see and meet are probably not travelling all over the world with the backwater version of BrainShare, though. ;-)

So yes, I’m doing it for real; setting up Linux for Grandma. She’s used to Windows, MS Office, and AOL, which isn’t too bad. GNOME can be configured to act a lot like Windows, OpenOffice.org does everything she’s used to from MS Office, and AOL is out of the picture anyway so she’s gonna hafta learn “plain ol’ Web” no matter what. (Long story, rather personal, but she can’t afford the price of AOL any more.)

The machine we built, mostly of spare parts, is a real pain. Some/most of the RAM seems bad (should be about 196MiB in there, Linux is only seeing about 45MiB) and the CD-ROM is making the loudest and most ear-piercing shriek I’ve heard in a long time. I’m halfway through the installation of Fedora Core 2, so I figure I’ll just let it finish, then go back and figure out if there’s anything we can do for the RAM and any way to shut the CD-ROM up. We might just have to return the CD-ROM (we bought it today, the cheapest CD-ROM we could find; big surprise it’s a piece of crap, eh?).

Anyways, once the OS is installed, I’ll need to go in and customize things a good deal and install the add-ons she’ll probably want: Java for Yahoo! Games, Flash, probably a proprietary driver for the Conexant modem, maybe the NVIDIA driver for the TNT1 card in there (but she won’t need 3D, so I’m not sure there’s any good reason to mess with the Fedora kernel to get it installed). Then I need to get her printer set up and figure out a good way to dial in to the local cheap ISP (probably KPPP or the GST dialer).

Hopefully this works out. All my previous attempts to convert people I know to Linux have failed completely, mostly because of the absolutely abysmal state of software installation caused by the kernel developers’ lack of caring about driver installation (Free/Open code or not), library and app developers not understanding ABI compatibility and versioning, distro makers not giving a damn about cross-distro compatibility, and the fact that most ISVs still think it’s OK to require users to run a series of shell commands to install their crappy products. I still think the near-term hope lies with AutoPackage, although in the long term some more complete and low-level fixes will be needed for the insanity caused by incompetent and/or uncaring software developers and packagers. My Grandma, at least, likely won’t be installing a whole lot of software after the machine is initially set up, so I think she’ll be fine despite Linux’s severe software installation impairment.

Have been having some good discussions on the awemud-devel mailing list recently. Ideas have been thrown back and forth regarding exit locks, NPC movement control, the room/location system, and a couple other things. Also threw out the beginning of my explanation of the skill system I want to use with AweMUD.

Rather than reposting all the goodness that has been said, it’s probably best to just link to the discussion. So here it is.

Haven’t had much time for actual code, but I plan on fixing that tonight and tomorrow. The basic lock code and NPC movement controlling code should be dead easy to write, so they should be in soon.

I decided to play around with getting a better antispam and antivirus setup here at my house. I first thought I’d try Exim4, which is supposedly much more powerful than Exim3, but that was rather a bust. The Debian packages seem rather broken (they produce a mangled config file) and without something to build off of, the differences between Exim4 and Exim3 left me rather lost. Maybe I’ll try that again.

Meanwhile, though, I made some improvements to my Exim3 configuration. For one, I validate the recipient address immediately, so all those pesky invaliduser@awesomeplay.com type spams won’t even come into my system, thus cleaning up the queue a good deal.

I also wanted to enable sender verification, but ran into a problem. Most mailing lists use an envelope from address of something akin to listname-bounces@server. The listname-bounces local part is not a valid address on the server. Thus, sender verification fails, and all mails from the list are rejected. Yuck. No good way to get around that one, unfortunately, other than to get all mailing lists to correct their servers (yes, correct them - it’s invalid to use an envelope MAIL FROM address that does not exist).

My next order of business was to get away from Comcast’s crappy SMTP proxy, and switch instead to DynDNS.org’s wonderful outbound mailhop service. A single small yearly fee of $24.95 gives me 300 relays a day, which is more than I’ll need. I could have gone with an even cheaper 150 relays per day, but a certain problem made that currently infeasible.

That certain problem is the forwarding I’m doing for a friend. I used to host her email, but since Comcast is so wonderfully craptastic, the server was often unavailable when she needed it the most. She switched off to another service, but needed mail forwarding for her old address. She also wanted automatic mails sent to senders notifying them of the change, similar to vacation emails. The problem is that all those wonderful spams and viruses are generating bounce messages. Given the number she gets, 150 relays seemed a little skimpy. 300 is good enough, and for $10/year it wasn’t expensive at all.

The final bit I’m working on is getting SPF records up for awesomeplay.com and awemud.net. The former address is only used on my local network, so the only IPs that should send mail from it are my Comcast IP (thanks DynDNS.org for making it possible to track that IP in near real time in DNS!) and DynDNS.org’s mailhop service. The latter address, awemud.net, is a little trickier, since a couple of remote hosts send mail from that domain, mostly for things like the bug mails and login verifications from SourceForge. I think I have that nailed; should be easy. Although the test service is giving me a little pain. I think I just need to wait for their DNS caches to flush. Which shouldn’t take too long, of course.
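For the curious, an SPF record is just a TXT record that starts with `v=spf1`, lists the mechanisms allowed to send for the domain, and ends with a default such as `-all`. A tiny Python sketch of assembling one follows. The helper name `spf_record` is made up, and both host names are placeholders: `home.example.org` stands in for a dynamic-DNS name tracking the home IP, and `outbound.example.net` for whatever name the relay service publishes for its own SPF include.

```python
def spf_record(mechanisms, default="-all"):
    """Join SPF mechanisms into the text of a DNS TXT record.

    `mechanisms` are strings such as "a:host" (allow that host's A records)
    or "include:domain" (also accept whatever that domain's SPF allows).
    `default` says what to do with everyone else; "-all" means hard fail.
    """
    return " ".join(["v=spf1", *mechanisms, default])


# Placeholder names only; substitute the real host and relay domain.
record = spf_record(["a:home.example.org", "include:outbound.example.net"])
print(record)
```

A domain with extra remote senders would simply get more mechanisms in the list, one per allowed source.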

I also (after testing a bit) set up my domains to use SPF Proxy. That’s a nice service that gives you the protection of SPF, for those of us who can’t/won’t set up modern MTAs like Exim4. ;-) You need to publish another special TXT record for your domain to configure SPF Proxy (how’s that for a configuration file, eh?) but it works very well. I have of course set it to “loose” mode, which means that domains which have no SPF records will be allowed to mail me. I would like to eventually switch off of SPF Proxy’s service, however, merely to reduce strain on that most excellent free service so it can serve those who don’t have any option but to use it.

I’ve posted all my old Advogato diary entries to my blog here. They’re all a little funky given how Advogato didn’t support entry titles and the like, but at least they’re all here. They’re all under the Advogato category, strangely enough. ;-)

I’ve been working on the basic AweMUD entity system for the last few days, in addition to GC problems.

Recent changes have made it necessary to know rather immediately when an Entity is no longer anywhere in the system. For example, the tagging feature. When an entity disappears (say, an NPC was killed) it should no longer appear in the entity set for that tag.

The problem reminds me of the strong/weak/shared pointer system that graydon was discussing recently. Certainly, in a simpler application, that method would solve this problem perfectly. Unfortunately, in a complex app like AweMUD, that solution is still completely unusable.

The biggest blocker is, as always, scripts. Scriptix is dynamically typed and there is no way for it to keep track of which pointers are owning, weak, etc. Additionally, there’s a real need for objects to stay around after their owner releases them; for example, a system may be in place to detect when an object becomes unowned and reactivate it. The only way to use a language that enforced graydon’s system would be to add another “object manager” that held all the owning pointers and required calls to request release or creation of new objects. And with that, we’re right back to square one - the developer is forced to manually manage object lifetime. Not very useful.

The system I’m devising for AweMUD does have some similarities to graydon’s proposal. It is more ad-hoc and more loosely defined, which is in this case a good thing. AweMUD already has a concept of ownership (well, parentalhood) for entities. Each entity has a direct parent, with the exception of zones. Thus, with any live entity, you can recursively iterate over the parents until you hit the top-level zone. Any entity for which this is not true is not part of the live system.
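The parent-chain rule can be sketched in a few lines. AweMUD itself is C++; this is Python purely for illustration, and `Entity`, `is_zone`, and `is_live` are hypothetical names, not AweMUD’s actual API.

```python
class Entity:
    def __init__(self, name, is_zone=False):
        self.name = name
        self.is_zone = is_zone   # zones are the only entities without a parent
        self.parent = None


def is_live(entity):
    """Walk up the parent chain; the entity is live only if the top is a zone."""
    node = entity
    while node.parent is not None:
        node = node.parent
    return node.is_zone


zone = Entity("zone", is_zone=True)
room = Entity("room")
sword = Entity("sword")
orphan = Entity("orphan")      # never attached to anything

room.parent = zone
sword.parent = room
```

With this setup `is_live(sword)` holds because the chain ends at the zone, while `is_live(orphan)` does not. The walk costs one step per level of nesting, which is the cost a cached flag would avoid.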

My recent modifications add to this concept by creating the “activated” flag for each entity. An entity may only be marked activated when it is in the live system, and must be marked deactivated when it is out of the system. I’m using a separate flag for this instead of just checking the parent chain because the flag is much faster to query. The code will help in ensuring this flag stays accurate, however.

This has some interesting effects. An entity may not be made the child of another entity unless both are activated (live) or both are deactivated. This means that if you add a deactivated entity to an activated entity, the former must then become activated. Additionally, when an entity’s active flag changes state it must update all of its children as well. The active flag may also never be set on an entity being collected by the GC, and we check for this and raise an error when it occurs, as it signifies a bug.

The whole end purpose of this is, again, to allow us to run some code whenever an entity’s active flag changes. When it becomes active the entity must be added to the update list and map entries added for each of its tags for quick lookup. The reverse must be done when it becomes deactivated. This way, objects which exist but which are not part of the live system will never have their update method called and will not show up in tag queries. This is exactly what we want.
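Roughly, the flag handling might look like the following. This is a Python sketch under assumptions, not the AweMUD code (which is C++ and, as noted, doesn’t work this way yet); `World`, `update_list`, `by_tag`, `set_active`, and `add_child` are all invented names.

```python
from collections import defaultdict


class World:
    """Holds the two structures that must track the active flag."""

    def __init__(self):
        self.update_list = set()          # entities whose update method gets called
        self.by_tag = defaultdict(set)    # tag -> active entities carrying that tag


class Entity:
    def __init__(self, world, name, tags=()):
        self.world = world
        self.name = name
        self.tags = set(tags)
        self.children = []
        self.active = False

    def set_active(self, active):
        if self.active == active:
            return                        # no state change, nothing to update
        self.active = active
        if active:
            self.world.update_list.add(self)
            for tag in self.tags:
                self.world.by_tag[tag].add(self)
        else:
            self.world.update_list.discard(self)
            for tag in self.tags:
                self.world.by_tag[tag].discard(self)
        # A flag change always propagates to every child.
        for child in self.children:
            child.set_active(active)

    def add_child(self, child):
        self.children.append(child)
        # The child takes on the parent's state, so both always agree.
        child.set_active(self.active)


world = World()
zone = Entity(world, "zone")
zone.set_active(True)                     # zones are the root of the live system

orc = Entity(world, "orc", tags={"npc"})
club = Entity(world, "club", tags={"weapon"})
orc.add_child(club)                       # both still deactivated
zone.add_child(orc)                       # activates the orc, and the club with it
```

After the last line both the orc and its club are in the update list and the tag index; calling `orc.set_active(False)` pulls them both back out.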

The actual current code does not yet reflect the above design. My original plan was much stricter and required the coder to explicitly activate and deactivate an entity. For example, instead of automatically activating an entity when it’s made the child of an active object, we instead have an assertion in the add method that checks that the entity being added is already activated. This system falls apart quickly, however, for several reasons.

First off, back to scripting, it’s not possible to enforce. Nothing stops a coder from creating an object and adding it to a room without activating it first. We’d need to add non-debug run-time checks to pretty much every method to check if an entity is active. Secondly, error checking becomes much more of a pain, especially given C++’s lack of a finally clause in exception handlers. (And the fact that I avoid exceptions, a lot more than I probably should.) If you have an entity you plan on adding to, say, a room, you must check the result of the room’s add method to see if it was indeed successfully added, and deactivate it if not. This is awkward and easy to get wrong.

The only problem with the new system is potential efficiency issues or bugs from misused methods. While the error checking problem as above isn’t there, and the new problem isn’t nearly so easy to get wrong, it does still require care. Basically, any entity which is removed from its parent, using the remove() method or something similar, must be deactivated. Thus, current code which calls remove and then adds the entity to a new parent will be grossly inefficient.

The fix for this problem is to add a new method to each entity, perhaps called release(). This method will do the exact same thing as remove(), except that it will not deactivate the entity. The tricky part is that if release() is called, the entity *must* be added to a new parent to ensure that the entity is live. This isn’t too hard to manage, however. Only the add methods of parent entities should ever call release(), and they are rare enough that it should be easy to make sure they all use it properly. Scripts will not have (and will not need) the ability to call release(), so all in all everything should work quite nicely.
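To make the difference concrete, here is a Python sketch (again, AweMUD is C++ and none of this is its real API). The `flag_changes` counter is a hypothetical stand-in for the expensive work done on each flag change, and `_release` is underscore-prefixed only to signal that nothing but `add` should call it.

```python
class Entity:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.active = False
        self.flag_changes = 0     # stand-in for the costly index/update-list work

    def _set_active(self, active):
        if self.active != active:
            self.active = active
            self.flag_changes += 1
            for child in self.children:
                child._set_active(active)

    def _release(self):
        """Detach from the current parent but leave the active flag alone."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

    def remove(self):
        """Detach and deactivate: the entity leaves the live system."""
        self._release()
        self._set_active(False)

    def add(self, child):
        """Take ownership of child; only add methods ever call _release()."""
        child._release()
        self.children.append(child)
        child.parent = self
        child._set_active(self.active)


zone = Entity("zone")
zone._set_active(True)
room_a, room_b, sword = Entity("room a"), Entity("room b"), Entity("sword")
zone.add(room_a)
zone.add(room_b)
room_a.add(sword)                 # first activation: one flag change

room_b.add(sword)                 # move via release: no further flag change
moves_with_release = sword.flag_changes

sword.remove()                    # old style: deactivate ...
room_a.add(sword)                 # ... then reactivate on the way back in
moves_with_remove = sword.flag_changes
```

Moving the sword through `add` alone leaves its counter where it was, while the remove-then-add route costs a deactivation and a reactivation for the sword and for everything it contains.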

Now I just need to get to actually updating the code to work this way. ;-)

I think I finally found the cause of the odd crashes and parser errors that occurred when parsing large source files.

The collector had a habit of crashing, or scripts would error out with bogus “unexpected $eof” messages in the middle of the source, when a script was large enough. This happened especially with AweMUD’s character creation script.

After several days of near constant hacking, debugging, guessing, and pure frustration (Valgrind isn’t working on Fedora Core 2) I believe I finally found the problem. It’s an insanely simple fix, I don’t particularly understand why it works in this case (though I have theories), and it’s completely possible that it’s just masking the problem again.

That possible fix is to simply use GC_MALLOC_UNCOLLECTABLE in the grammar.yy and lexer.ll files. I think Flex/Bison are somehow hiding or mangling certain pointers, which causes the GC to not find them and thus end up freeing the necessary memory. Since both Flex and Bison already clean up any memory they allocate, there’s no reason to need the GC to collect that memory, so the uncollectable allocation should suffice.

The lexer.ll file was even using GC_MALLOC_ATOMIC, which was likely twice as bad, since any pointers Lex stored wouldn’t be scanned by the GC. (Of course, I don’t know why Lex would be storing any such pointers… it shouldn’t be.) Turning that into plain GC_MALLOC made the bug disappear for a few runs, but another source change made it reappear. I’m hoping the uncollectable change won’t behave in a similar fashion and show up again later. I’m rather sick of this bug and want it *gone*.

This would have been *so* much easier if Valgrind was working…

Just got back from a very good friend’s wedding. All in all, boring as I expected. I mean, it’s a wedding; how much fun could it be? ;-) Especially given that drinking was probably the most exciting thing going on, and I don’t drink.

On the upside, however, I got to buy a very snazzy suit. So, on future occasions when I feel like looking incredibly good, I have that option. ;-)

Actually, I shouldn’t lead one to believe the wedding was completely boring. It was a good deal of fun aside from the wedding part. I mean, hanging out with friends, meeting quite a few new people, numerous opportunities to ogle the bride’s delicious sister, etc. All in all, it beat sitting around here coding.

The usually wonderful Boehm GC (Garbage Collector) is giving me a headache at the moment. For some reason, finalizers (destructors) don’t seem to be running for certain objects. This means that non-GCd resources (including non-GCd memory) aren’t being released.

Can you say resource leak?

These sorts of problems are such a pain to track down.